Re: [R-sig-eco] anova.cca question / missing data in constraining matrix
Hi Jari,

I have a final (hopefully) question. I have narrowed down the number of environmental variables and removed some of the community rows (and the associated environmental data) where too many environmental variables were missing. Now, when I run anova.cca by terms, I get the following error message:

anova(m2,by='terms',perm=999)
Error in La.svd(x, nu, nv) : error code 1 from Lapack routine 'dgesdd'

Any thoughts about why this is happening, or how I can avoid it so that I can run the anova by terms?

Thank you so much for all of your help with this!

colleen

On Tue, Jun 4, 2013 at 1:59 AM, Jari Oksanen [via r-sig-ecology] <ml-node+s471788n7578184...@n2.nabble.com> wrote:

> Dear Colleen,
>
> On 03/06/2013, at 22:32 PM, ckellogg wrote:
>
> > The CCA runs if I use na.action=na.omit, but then when I run the anovas,
> > there is apparently no residual component. For example,
> >
> > No residual component
> >
> >          Df   Chisq F N.Perm Pr(>F)
> > Model    12 5.30030
> > Residual  0 0.
>
> Yes, probably too many holes. You have no residual variation, which
> indicates that the number of predictor variables (constraints) is higher
> than the number of remaining observations.
>
> Cheers, Jari Oksanen
>
> --
> Jari Oksanen, Dept Biology, Univ Oulu, 90014 Finland
> Ph. +358 400 408593, http://cc.oulu.fi/~jarioksa

--
colleen t.e. kellogg
(c) 206-714-2441

--
View this message in context: http://r-sig-ecology.471788.n2.nabble.com/Re-anova-cca-question-missing-data-in-constraining-matrix-tp7578186.html
Sent from the r-sig-ecology mailing list archive at Nabble.com.
___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
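Jari's point about constraints outnumbering the remaining observations can be checked before fitting anything. The following is a minimal base-R sketch on a made-up table (the data frame `env` and its values are hypothetical, not the Toolik data): it counts the rows that survive listwise deletion and compares them with the rank of the constraints. Whether the same shortage also explains the `La.svd` error above is not established in the thread.

```r
# Hypothetical toy data: 6 samples, 3 environmental variables with gaps
env <- data.frame(
  pH   = c(7.1, NA, 6.8, 7.4, NA, 7.0),
  temp = c(4.2, 5.1, NA, 3.9, 4.4, 4.8),
  DOC  = c(310, 295, 330, NA, 300, 320)
)

# Rows that would survive na.omit (only rows 1 and 6 are complete)
keep  <- complete.cases(env)
n_obs <- sum(keep)

# Rank of the constraints after removing the intercept (i.e. after centring)
X <- model.matrix(~ pH + temp + DOC, data = env[keep, ])
n_constraints <- qr(X)$rank - 1

# Degrees of freedom left for the residual (unconstrained) component
resid_df <- n_obs - 1 - n_constraints

cat("complete rows:", n_obs, "\n")
cat("constraint rank:", n_constraints, "\n")
cat("residual df:", resid_df, "\n")
```

If the residual degrees of freedom come out at zero, the constrained model uses up all the variation and there is nothing left to build a permutation test on, which matches the "No residual component" output quoted above.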
Re: [R-sig-eco] anova.cca question / missing data in constraining matrix
Hello Jari,

Thank you for your help with this. The solution you suggested in your second post worked quite well. However, I think another subset of my data is too 'holey', because when I run CCA on this set of environmental variables (or a CCA with the previous environmental variables and the additional ones), I get an error:

toolik250.cca2 <- cca(toolikotus250.ra~logtemp+conductivity+pH+logBacProd+DIC+logDCO2+sqrtDCH4+logDOC+sqrtPhosphate+sqrtNitrate+sqrtTDN+sqrtTDP+logPC+logPN+Ca+Mg+logNa+logK+SO4+logChloride+Silica, toolikenv.s, na.action=na.exclude)
Error in predict.cca(x, newdata = excluded, type = "wa", model = "CA") :
  model “CA” has rank 0

The CCA runs if I use na.action=na.omit, but then when I run the anovas, there is apparently no residual component. For example:

No residual component

Model: cca(formula = toolikotus250.ra ~ logtemp + conductivity + pH + logBacProd + DIC + logDCO2 + sqrtDCH4 + logDOC + sqrtPhosphate + sqrtNitrate + sqrtTDN + sqrtTDP + logPC + logPN + Ca + Mg + logNa + logK + SO4 + logChloride + Silica, data = toolikenv.s, na.action = na.omit, subset = -toolik250.cca2$na.action)
         Df   Chisq F N.Perm Pr(>F)
Model    12 5.30030
Residual  0 0.

So, I am thinking that examining the relationship between the microbial community and this subset of environmental variables might not be possible without my first manually curating which samples and variables should be included, correct?

Thank you,
Colleen
[R-sig-eco] anova.cca question / missing data in constraining matrix
Hello,

I am using the cca function in Vegan to examine the relationship between microbial community structure and a (large) suite of environmental variables. My constraining/environmental data matrix has a lot of holes in it, so I have been exploring the na.action argument. This is the command I am currently using:

toolik250.cca<-cca(toolikotus250.ra~julianday+logwindspd_max_1dayprior+lograin_3dayprior+sqrtwindspd_1dayprior+windspd_3dayprior+days_since_thaw+days_since_iceout+days_btw_iceoutandthaw+toolikepitemp_24h+logtemp+conductivity+pH, toolikenv.s, na.action=na.omit)

The CCA seems to run just fine, but when I attempt the post hoc tests with anova.cca,

anova(toolik250.cca,by='terms',perm=999)

I get an error message:

"Error in anova.ccabyterm(object, step = step, ...) : number of rows has changed: remove missing values?"

What exactly is causing this error? I suspect it is related to the fact that the environmental data matrix has a lot of missing data, but I don't quite understand why it states that the number of rows has changed... changed from what? Is there any way to work around missing data when running the anovas, as there is when running the CCA itself?

My other question concerns running the CCA on all the data in my environmental data matrix (toolikenv.s), not just the subset of variables above. With this command:

toolik250.cca <-cca(toolikotus250.ra~., toolikenv.s, na.action=na.omit)

I get the following error message:

"Error in svd(Xbar, nu = 0, nv = 0) : a dimension is zero"

What might be causing this error?

Thank you so much for your help. Maybe I will just have to filter out the samples with missing environmental data (or filter out some of the variables themselves if they have too much missing data), but I was hoping to avoid having to do this.

Colleen
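The filtering Colleen mentions at the end can be sketched as follows. This is only an illustration on vegan's bundled `dune` data with artificially inserted gaps, not the Toolik data, and it is not necessarily the solution Jari proposed. It assumes (as a likely, not confirmed, explanation) that the "number of rows has changed" error arises because the sequential models fitted by `by = "terms"` drop different rows when different variables carry the missing values. Removing incomplete rows from both tables up front guarantees that every refit sees the same samples. The names `env`, `comm` and `env_ok` are made up for the example, and `permutations` is the argument name in current vegan releases (the posts use the older `perm`).

```r
library(vegan)

data(dune)      # example community matrix shipped with vegan
data(dune.env)  # matching environmental data

# Punch a few holes into one variable to mimic missing measurements
env <- dune.env[, c("A1", "Management")]
env$A1[c(2, 7, 15)] <- NA

# How many values is each variable missing, and how many rows are complete?
print(colSums(is.na(env)))
print(sum(complete.cases(env)))

# Keep only samples with complete environmental data, in both tables
keep   <- complete.cases(env)
comm   <- dune[keep, ]
comm   <- comm[, colSums(comm) > 0]  # drop species absent from the kept samples
env_ok <- env[keep, ]

# Fit and test on tables that are already aligned and free of NA
mod <- cca(comm ~ A1 + Management, data = env_ok)
aov_terms <- anova(mod, by = "terms", permutations = 999)
print(aov_terms)
```

The same `complete.cases()` count is a quick way to check the `~ .` case: if every sample is missing at least one variable, listwise deletion leaves zero rows, which would be consistent with the "a dimension is zero" message.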
Re: [R-sig-eco] problems with loading packfor package
Well, after much looking around, I think this may be a problem with the R 2.14.0 GUI. I can load this package if I run R through X11 or RStudio. So, crisis averted for the most part.

Colleen
[R-sig-eco] problems with loading packfor package
Hello,

I want to use the packfor package (http://r-forge.r-project.org/R/?group_id=195). I installed it using the R package installer on the Mac version of R, using the "Other Repository" option and putting in the URL http://R-Forge.R-project.org. It seems to install fine, but when I load the package using library(packfor) I get a series of errors:

--
packfor: R Package for Forward Selection (Canoco Manual p.49)
version 0.0-7
Error in dyn.load(file, DLLpath = DLLpath, ...) :
  unable to load shared object '/Library/Frameworks/R.framework/Versions/2.14/Resources/library/packfor/libs/x86_64/packfor.so':
  dlopen(/Library/Frameworks/R.framework/Versions/2.14/Resources/library/packfor/libs/x86_64/packfor.so, 6): Library not loaded: @rpath/R.framework/Versions/2.14/Resources/lib/libRlapack.dylib
  Referenced from: /Library/Frameworks/R.framework/Versions/2.14/Resources/library/packfor/libs/x86_64/packfor.so
  Reason: image not found
Error: package/namespace load failed for ‘packfor’
--

Has anyone else experienced this problem and figured out how to fix it? I have R version 2.14. I would really appreciate any help.

Thank you so much!
Colleen Kellogg
[R-sig-eco] variance partitioning using 'varpart' in vegan
Hello,

I have been using the 'varpart' function in vegan. For one of my datasets, I get many "negative" variances. I have looked at the equations provided in partitioning.pdf in vegandocs, and I see how this can arise arithmetically, but I do not fully understand why it happens or what exactly it means. If anyone can provide any insight or point me in the direction of a helpful resource, it would be much appreciated.

One other item I have been struggling with is in the output for 'varpart'. Also, in testing the significance of the variances explained, I have been following the code shown in the 'varpart' documentation. If I am using 4 explanatory variables (A, B, C, D) and want to figure out the significance of the variance shared by all variables, would this be the correct implementation of this code:

rda.result <- rda(ANEEAbulk.z2 ~ A + B + C + D)
anova(rda.result, step=999, perm.max=999)

But isn't this the total variance explained rather than the shared variance? How would I code a test of the significance of the variance shared among all variables? Is this significance even testable?

And if I want to figure out the significance of the shared variance explained by just A and D, would this be the correct code:

rda.result <- rda(ANEEAbulk.z2 ~ A + D + Condition(B) + Condition(C))
anova(rda.result, step=999, perm.max=999)

I hope my questions make sense. I just want to make sure I am stepping through all of these procedures correctly.

Thank you very much,
Colleen

Colleen Kellogg
Postdoctoral Scientist
UMCES - Horn Point Laboratory
Cambridge, MD 21613
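A short sketch may help separate the three things being asked about here. It uses vegan's bundled `mite` data and a two-set grouping modelled on the example in the `varpart` help page, not the ANEEAbulk data, so the variable sets are purely illustrative. Two points it relies on are documented vegan behaviour: `varpart` reports adjusted R-squared, so a fraction that explains less than random predictors would is reported as negative, and shared fractions are obtained by subtraction rather than by fitting a model, so the printed table marks them as not testable. A model with several terms plus `Condition()` tests the combined effect of those terms after partialling out the conditioning variables, which is not the same as their shared fraction.

```r
library(vegan)

data(mite)
data(mite.env)

# Hellinger-transform the community data before RDA
mite.hel <- decostand(mite, "hellinger")

# Partition variation between two explanatory sets (illustrative grouping)
vp <- varpart(mite.hel,
              ~ SubsDens + WatrCont,
              ~ Substrate + Shrub + Topo,
              data = mite.env)
print(vp)  # the "Testable" column shows which fractions can be tested

# Total explained variation: all variables together
rda_all <- rda(mite.hel ~ SubsDens + WatrCont + Substrate + Shrub + Topo,
               data = mite.env)
test_all <- anova(rda_all, permutations = 999)
print(test_all)

# Unique fraction of the first set, after partialling out the second set
rda_x1 <- rda(mite.hel ~ SubsDens + WatrCont +
                Condition(Substrate + Shrub + Topo),
              data = mite.env)
test_x1 <- anova(rda_x1, permutations = 999)
print(test_x1)
```

In current vegan the number of permutations is set with `permutations`; the `step` and `perm.max` arguments in the post belong to the older interface.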
Re: [R-sig-eco] when to use PCNM and other questions about varpart in Vegan
Thank you very much for this reference. It is very interesting. The more I read, the more difficult it becomes to decide what to do. I was initially considering using Mantel tests and partial Mantel tests to compare my bacterial community with environmental and spatial distances, but then I found this paper:

LEGENDRE, P. and FORTIN, M.-J. (2010), Comparison of the Mantel test and alternative approaches for detecting complex multivariate relationships in the spatial analysis of genetic data. Molecular Ecology Resources, 10: 831–844. doi: 10.1111/j.1755-0998.2010.02866.x

and am now questioning that too. I guess the hunt continues.

Colleen
Re: [R-sig-eco] when to use PCNM and other questions about varpart in Vegan
Oh, and forgot to mention that my sites span a distance of >2000 km.

colleen
[R-sig-eco] when to use PCNM and other questions about varpart in Vegan
Hi All,

I am planning to use varpart in {Vegan} to partition the variance in my bacterial community composition among 14 sites in the Arctic Ocean between space and environmental variables. I am noticing that in many instances, a PCNM is employed to remove spatial autocorrelation. But I think I am not quite understanding what this does exactly and how I would use it prior to doing the variance partitioning.

Maybe I can try to explain what I am trying to do. Basically, I have 14 sites, and at each site 3 samples were taken at 2 to 3 depths for bacterial community composition analysis. I also took samples for environmental data at each depth. I have distances between sites calculated over water (not over land or 'as the crow flies'), referenced to a point on land (I also have the latitude and longitude of each site, but I wasn't sure that these coordinates themselves were the best measure of distance between sites). Ultimately, I am trying to figure out how much of the variation in the bacterial community between sites is attributable to environmental variability as well as to geographic distance between the sites. Surely, the environmental variables also have some spatial component to them as well.

So, my plan was to use varpart as follows:

BACvsEnvvsGeo<-varpart(ANbac,ANenv.z,ANdist_MR.z)
plot(BACvsEnvvsGeo)

...where
ANbac = bacterial community matrix (site x species),
ANenv.z = log-transformed and standardized environmental data (site x env parameters),
ANdist_MR.z = log-transformed and standardized distance between sites, referenced to a point on land to the west of all sites (site x distance, 1 column).

I am planning to do this in conjunction with an RDA for space and environmental data separately. This would allow me to calculate how much variation is due to environmental variation alone, space alone, and environment+space.

So, my questions are:

(1) Do I need to do a PCNM beforehand? And do I do this just on the distance-between-sites data, or also on the environmental data?
(2) Ultimately, what does doing the PCNM get me?
(3) I am wondering if I really should use Lat/Long data instead of distance between sites. I am not sure whether it matters for the analysis.
(4) My bacterial community data is standardized to total abundance at each site (so, percent abundance), and I am wondering if it is necessary to do further transformations of the data. I see that this standardization is an option in the decostand algorithm, so I am thinking that it is okay not to transform my data further.
(5) Alternatively, can varpart be used with db-RDA, so that I can put in my distance matrix instead of the transformed species data? If so, how can I do this?

Sorry for the onslaught of questions. There are not really any people in my department that I can ask these questions to...

Thank you very VERY much for any advice. I am really having a hard time figuring out which is the best way to proceed.

Colleen

Colleen Kellogg
PhD Candidate
School of Oceanography
University of Washington
Box 357940
Seattle, WA 98195
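For questions (1) and (2), a minimal sketch of how PCNM variables usually enter a variance partitioning may be useful. It is an illustration on vegan's bundled `mite` data, not the Arctic data, and it is not a recommendation from the thread. `pcnm()` takes a between-site distance matrix (here Euclidean distances from `mite.xy`; a matrix of over-water distances could in principle be passed through `as.dist()` instead) and returns orthogonal spatial variables that are then used as the "space" table in `varpart`. The environmental table is left as it is. Keeping the first five PCNM axes is an arbitrary choice made only to keep the example small.

```r
library(vegan)

data(mite)
data(mite.env)
data(mite.xy)   # site coordinates

# Hellinger-transform the community data
mite.hel <- decostand(mite, "hellinger")

# Derive spatial variables from the between-site distance matrix
pc <- pcnm(dist(mite.xy))
spatial <- scores(pc)[, 1:5]   # first five PCNM axes (arbitrary, for illustration)

# Partition variation between environment and space
vp <- varpart(mite.hel, ~ SubsDens + WatrCont, spatial, data = mite.env)
print(vp)
```

A single column of distances to a reference point describes only one spatial gradient, whereas the PCNM axes describe spatial structure at several scales, which is the usual motivation for using them as the spatial table.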
Re: [R-sig-eco] plotting capscale results using ordiplot in Vegan
Right. The code. Sorry about that. This is what I used:

plot(ANareabac_red.cap,choices=c(1,2),type='n',scaling=2)
points(ANareabac_red.cap,choices=c(1,2),display='wa',pch=AN_info[,9],col=AN_info[,6],cex=1.5,scaling=2)

I don't really have any outliers, so I don't think that is the problem. I will give what Andres suggested (thanks!) a shot.

Colleen
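One thing that may be worth trying with the code above, offered as a guess rather than as the suggestion Andres made: an empty frame drawn with `type = 'n'` and no `display` argument is sized to hold every score type the plot would show by default, so species scores or constraint centroids that extend far from the origin can leave the site points crowded in the middle. Restricting the empty frame to site scores lets the axes fit the sites. The sketch below uses vegan's bundled `varespec` and `varechem` data and an arbitrary three-variable model, not the ANareabac data.

```r
library(vegan)

data(varespec)
data(varechem)

# Distance-based constrained ordination (illustrative model)
mod <- capscale(varespec ~ N + P + K, data = varechem, distance = "bray")

# Size the empty frame from the site scores only, then add the points
plot(mod, choices = c(1, 2), display = "wa", type = "n", scaling = 2)
points(mod, choices = c(1, 2), display = "wa", pch = 16, cex = 1.5, scaling = 2)

# The plotted coordinates can be inspected directly
site_scores <- scores(mod, choices = c(1, 2), display = "wa", scaling = 2)
print(range(site_scores))
```

If the sites still look compressed, explicit `xlim` and `ylim` values based on `site_scores` can be passed to the `plot` call.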
[R-sig-eco] plotting capscale results using ordiplot in Vegan
Hi,

I used capscale to examine relationships between relative species abundances at different sites and environmental variables. When I attempt to plot the results using ordiplot, all of my site points are squished into the center of the biplot. I have tried changing the scaling, and that only makes things worse. Is there any way I can get the sites to spread out a bit in the ordination?

Thanks so much!
Colleen