Re: [R-sig-eco] Vegan-Adonis-NMDS-SIMPER
For SIMPER you can read this thread: https://www.researchgate.net/post/What_do_you_think_about_Similarity_Percentage_analysis_SIMPER_to_measure_species_indicator_value Moreover read the function help(simper) carefully, because: The results of 'simper' can be very difficult to interpret. The method very badly confounds the mean between group differences and within group variation, and seems to single out variable species instead of distinctive species (Warton et al. 2012). Even if you make groups that are copies of each other, the method will single out species with high contribution, but these are not contributions to non-existing between-group differences but to within-group variation in species abundance. Warton, D.I., Wright, T.W., Wang, Y. 2012. Distance-based multivariate analyses confound location and dispersion effects. *Methods in Ecology and Evolution*, 3, 89-101. hope that helped, Gian On 24 March 2014 22:08, Brandon Gerig bge...@nd.edu wrote: I am assessing the level of similarity between PCB congener profiles in spawning salmon and resident stream in stream reaches with and without salmon to determine if salmon are a significant vector for PCBs in tributary foodwebs of the Great Lakes. My data set is arranged in a matrix where the columns represent the congener of interest and the rows represent either a salmon (migratory) or resident fish (non migratory) from different sites. You can think of this in a manner analogous to columns representing species composition and rows representing site. Currently, I am using the function Adonis to test for dissimilarity between fish species, stream reaches (with and without salmon) and lake basin (Superior, Huron, Michigan). The model statement is: m1adonis(congener~FISH*REACH*BASIN,data=pcbcov,method=bray,permutations=999) The output indicates significant main effects of FISH, REACH, and BASIN and significant interactions between FISH and BASIN, and BASIN and REACH. Is it best to then interpret this output via an NMDS ordination plot or use something like the betadiver function to examine variances between main effect levels or both? Also, can anyone recommend a procedure to identify the congeners that contribute most to the dissimilarity between fish, reaches, and basins?. I was thinking the SIMPER procedure but am not yet sold. Any advice is appreciated! -- Brandon Gerig PhD Student Department of Biological Sciences University of Notre Dame [[alternative HTML version deleted]] ___ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology *- Do not print this email unless you really need to. Save paper and protect the environment! -* [[alternative HTML version deleted]] ___ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Re: [R-sig-eco] Vegan-Adonis-NMDS-SIMPER
Further to the Warton paper, he and colleagues developed an R package that I've used in place of SIMPER called ' mvabund'. This runs multivariate GLMs and gives effects sizes and adjusted significance values of species that can be used to determine those having the greatest effect. I haven't looked at it recently but when I used the package they had it working for count and binary data. It's very simple to use and there are excellent references for both the methods and theory. David even uploaded possibly the most entertaining stats video about it on you tube which explains reasons for the approach very clearly. Good luck, Liz Liz Pryde On 25 Mar 2014, at 9:19 pm, Gian Maria Niccolò Benucci gian.benu...@gmail.com wrote: For SIMPER you can read this thread: https://www.researchgate.net/post/What_do_you_think_about_Similarity_Percentage_analysis_SIMPER_to_measure_species_indicator_value Moreover read the function help(simper) carefully, because: The results of 'simper' can be very difficult to interpret. The method very badly confounds the mean between group differences and within group variation, and seems to single out variable species instead of distinctive species (Warton et al. 2012). Even if you make groups that are copies of each other, the method will single out species with high contribution, but these are not contributions to non-existing between-group differences but to within-group variation in species abundance. Warton, D.I., Wright, T.W., Wang, Y. 2012. Distance-based multivariate analyses confound location and dispersion effects. *Methods in Ecology and Evolution*, 3, 89-101. hope that helped, Gian On 24 March 2014 22:08, Brandon Gerig bge...@nd.edu wrote: I am assessing the level of similarity between PCB congener profiles in spawning salmon and resident stream in stream reaches with and without salmon to determine if salmon are a significant vector for PCBs in tributary foodwebs of the Great Lakes. My data set is arranged in a matrix where the columns represent the congener of interest and the rows represent either a salmon (migratory) or resident fish (non migratory) from different sites. You can think of this in a manner analogous to columns representing species composition and rows representing site. Currently, I am using the function Adonis to test for dissimilarity between fish species, stream reaches (with and without salmon) and lake basin (Superior, Huron, Michigan). The model statement is: m1adonis(congener~FISH*REACH*BASIN,data=pcbcov,method=bray,permutations=999) The output indicates significant main effects of FISH, REACH, and BASIN and significant interactions between FISH and BASIN, and BASIN and REACH. Is it best to then interpret this output via an NMDS ordination plot or use something like the betadiver function to examine variances between main effect levels or both? Also, can anyone recommend a procedure to identify the congeners that contribute most to the dissimilarity between fish, reaches, and basins?. I was thinking the SIMPER procedure but am not yet sold. Any advice is appreciated! -- Brandon Gerig PhD Student Department of Biological Sciences University of Notre Dame [[alternative HTML version deleted]] ___ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology *- Do not print this email unless you really need to. Save paper and protect the environment! -* [[alternative HTML version deleted]] ___ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology ___ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Re: [R-sig-eco] Estimating HR overlap by adehabitatHR
Augusto: I'm also looking to calculate hr overlap from MCP; however, I run into an error in the code you provided: Augusto Ribas wrote #calc sobreposition area for(i in 1:4){ for(j in 1:4){ p1 - as(cp@'polygons'[[i]]@'Polygons'[[1]]@'coords','gpc.poly') p2 - as(cp@'polygons'[[j]]@'Polygons'[[1]]@'coords','gpc.poly') res[i,j] - area.poly(p1)+area.poly(p2)-area.poly(union(p1,p2)) }} res/1 When I get to this point, the code produces an error, saying: Error in as(hrmcp@polygons[[i]]@Polygons[[1]]@coords, gpc.poly) : no method or default for coercing “matrix” to “gpc.poly” Any ideas as to getting around this error? Any feedback would be greatly appreciated! Cheers, Ellen -- View this message in context: http://r-sig-ecology.471788.n2.nabble.com/Estimating-HR-overlap-by-adehabitatHR-tp7409589p7578782.html Sent from the r-sig-ecology mailing list archive at Nabble.com. ___ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
[R-sig-eco] Cosinor with data that trend over time
Hello all, I am thinking about applying season::cosinor() analysis to some irregularely spaced time series data. The data are unevenly spaced, so usual time series methods, as well as the nscosinor() function are out. My data do however trend over time and I am wondering if I can feed date as a variable into my cosinor analyis. In the example below, then I'd conclude then that the abundances are seasonal, with maximal abundance in mid June and furthermore, they are generally decreasing over time. Can I use both time variables together like this? If not, is there some better approach I should take? Thanks in advance, -Jacob For context lAbundance is logg-odds transformed abundance data of a microbial species in a given location over time. RDate is the date the sample was collected in the r date format. res - cosinor(lAbundance ~ RDate, date = RDate, data = lldata) summary(res) Cosinor test Number of observations = 62 Amplitude = 0.58 Phase: Month = June , day = 14 Low point: Month = December , day = 14 Significant seasonality based on adjusted significance level of 0.025 = TRUE summary(res$glm) Call: glm(formula = f, family = family, data = data, offset = offset) Deviance Residuals: Min 1Q Median 3Q Max -2.3476 -0.6463 0.1519 0.6574 1.9618 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 0.0115693 1.8963070 0.006 0.99515 RDate -0.0003203 0.0001393 -2.299 0.02514 * cosw-0.5516458 0.1837344 -3.002 0.00395 ** sinw 0.1762904 0.1700670 1.037 0.30423 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for gaussian family taken to be 0.9458339) Null deviance: 70.759 on 61 degrees of freedom Residual deviance: 54.858 on 58 degrees of freedom AIC: 178.36 Number of Fisher Scoring iterations: 2 llldata as a csv below RDate,lAbundance 2003-03-12,-3.3330699059335 2003-05-21,-3.04104625745886 2003-06-17,-3.04734680029566 2003-07-02,-4.18791034708572 2003-09-18,-3.04419201802053 2003-10-22,-3.13805060873929 2004-02-19,-3.80688269144794 2004-03-17,-4.50755507726145 2004-04-22,-4.38846502542992 2004-05-19,-3.06618649442674 2004-06-17,-5.20518774876304 2004-07-14,-3.75041853151097 2004-08-25,-3.67882486716196 2004-09-22,-5.22205827512234 2004-10-14,-3.99297508670535 2004-11-17,-4.68793287601157 2004-12-15,-4.31712380781011 2005-02-16,-4.30893550479904 2005-03-16,-4.05781773988454 2005-05-11,-3.94746237402035 2005-07-19,-4.91195185391358 2005-08-17,-4.93590576323119 2005-09-15,-4.85820800095518 2005-10-20,-5.22956391101343 2005-12-13,-5.12244047315448 2006-01-18,-3.04854660925046 2006-02-22,-6.77145858348375 2006-03-29,-4.33151493849021 2006-04-19,-3.36152357710535 2006-06-20,-3.09071584142593 2006-07-25,-3.31430484483825 2006-08-24,-3.09974933041469 2006-09-13,-3.33288992218458 2007-12-17,-4.19942661980677 2008-03-19,-3.86146499633625 2008-04-22,-3.36161599919095 2008-05-14,-4.30878307213324 2008-06-18,-3.74372448768828 2008-07-09,-4.65951429661651 2008-08-20,-5.35984647704619 2008-09-22,-4.78481898261137 2008-10-20,-3.58588161980965 2008-11-20,-3.10625125552057 2009-02-18,-6.90675477864855 2009-03-11,-3.43446932013368 2009-04-23,-3.82688066341466 2009-05-13,-4.44885332005661 2009-06-18,-3.97671552612412 2009-07-09,-3.40185954003936 2009-08-19,-3.44958231694091 2009-09-24,-3.86508161094726 2010-01-28,-4.95281587967569 2010-02-11,-3.78064756876257 2010-03-24,-3.5823501176064 2010-04-27,-4.3363571587 2010-05-17,-3.90545735473055 2010-07-21,-3.3147176517321 2010-08-11,-4.53218360860017 2010-10-21,-6.90675477864855 2010-11-23,-6.90675477864855 2010-12-16,-6.75158176094352 2011-01-11,-6.90675477864855 [[alternative HTML version deleted]] ___ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Re: [R-sig-eco] Cosinor with data that trend over time
I would probably attack this using a GAM modified to model the residuals as a stochastic time series process. For example require(mgcv) mod - gamm(y ~ s(DoY, bs = cc) + s(time), data = foo, correlation = corCAR1(form = ~ time)) where `foo` is your data frame, `DoY` is a variable in the data frame computed as `as.numeric(strftime(RDate, format = %j))` and `time` is a variable for the passage of time - you could do `as.numeric(RDate)` but the number of days is probably large as we might encounter more problems fitting the model. Instead you might do `as.numeric(RDate) / 1000` say to produce values on a more manageable scale. The `bs = cc` bit specifies a cyclic spline applicable to data measured throughout a year. You may want to fix the start and end knots to be days 1 and days 366 respectively, say via `knots = list(DoY = c(0,366))` as an argument to `gam()` [I think I have this right, specifying the boundary knots, but let me know if you get an error about the number of knots]. The residuals are said to follow a continuois time AR(1), the irregular-spaced counter part to the AR(1), plus random noise. There may be identifiability issues as the `s(time)` and `corCAR1()` compete to explain the fine-scale variation. If you hit such a case, you can make an educated guess as to the wiggliness (degrees of freedom) for the smooth terms based on a plot of the data and fix the splines at those values via argument `k = x` and `fx = TRUE`, where `x` in `k = x` is some integer value. Both these go in as arguments to the `s()` functions. If the trend is not very non-linear you can use a low value 1-3 here for x and for the DoY term say 3-4 might be applicable. There are other ways to approach this problem of identifiability, but that would require more time/space here, which I can go into via a follow-up if needed. You can interrogate the fitted splines to see when the peak value of the `DoY` term is in the year. You can also allow the seasonal signal to vary in time with the trend by allowing the splines to interact in a 2d-tensor product spline. Using `te(DoY, time, bs = c(cc,cr))` instead of the two `s()` terms (or using `ti()` terms for the two marginal splines and the 2-d spline). Again you can add in the `k` = c(x,y), fx = TRUE)` to the `te()` term where `x` and `y` are the dfs for each dimension in the `te()` term. It is a bit more complex to do this for `ti()` terms. Part of the reason to prefer a spline for DoY for the seasonal term is that one might not expect the seasonal cycle to be a symmetric cycle as a cos/sin terms would imply. A recent ecological paper describing a similar approach (though using different package in R) is that of Claire Ferguson and colleagues in J Applied Ecology (2008) http://doi.org/10./j.1365-2664.2007.01428.x (freely available). HTH G On 25 March 2014 19:14, Jacob Cram cramj...@gmail.com wrote: Hello all, I am thinking about applying season::cosinor() analysis to some irregularely spaced time series data. The data are unevenly spaced, so usual time series methods, as well as the nscosinor() function are out. My data do however trend over time and I am wondering if I can feed date as a variable into my cosinor analyis. In the example below, then I'd conclude then that the abundances are seasonal, with maximal abundance in mid June and furthermore, they are generally decreasing over time. Can I use both time variables together like this? If not, is there some better approach I should take? Thanks in advance, -Jacob For context lAbundance is logg-odds transformed abundance data of a microbial species in a given location over time. RDate is the date the sample was collected in the r date format. res - cosinor(lAbundance ~ RDate, date = RDate, data = lldata) summary(res) Cosinor test Number of observations = 62 Amplitude = 0.58 Phase: Month = June , day = 14 Low point: Month = December , day = 14 Significant seasonality based on adjusted significance level of 0.025 = TRUE summary(res$glm) Call: glm(formula = f, family = family, data = data, offset = offset) Deviance Residuals: Min 1Q Median 3Q Max -2.3476 -0.6463 0.1519 0.6574 1.9618 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 0.0115693 1.8963070 0.006 0.99515 RDate -0.0003203 0.0001393 -2.299 0.02514 * cosw-0.5516458 0.1837344 -3.002 0.00395 ** sinw 0.1762904 0.1700670 1.037 0.30423 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for gaussian family taken to be 0.9458339) Null deviance: 70.759 on 61 degrees of freedom Residual deviance: 54.858 on 58 degrees of freedom AIC: 178.36 Number of Fisher Scoring iterations: 2 llldata as a csv below RDate,lAbundance 2003-03-12,-3.3330699059335 2003-05-21,-3.04104625745886 2003-06-17,-3.04734680029566 2003-07-02,-4.18791034708572