Re: [R-sig-eco] Vegan-Adonis-NMDS-SIMPER

2014-03-25 Thread Gian Maria Niccolò Benucci
For SIMPER you can read this thread:
https://www.researchgate.net/post/What_do_you_think_about_Similarity_Percentage_analysis_SIMPER_to_measure_species_indicator_value

Moreover read the function help(simper) carefully, because: The results of
'simper' can be very difficult to interpret. The method very badly
confounds the mean between group differences and within group variation,
and seems to single out variable species instead of distinctive species
(Warton et al. 2012). Even if you make groups that are copies of each
other, the method will single out species with high contribution, but these
are not contributions to non-existing between-group differences but to
within-group variation in species abundance.

Warton, D.I., Wright, T.W., Wang, Y. 2012. Distance-based multivariate
analyses confound location and dispersion effects. *Methods in Ecology and
Evolution*, 3, 89-101.

hope that helped,
Gian


On 24 March 2014 22:08, Brandon Gerig bge...@nd.edu wrote:

 I am assessing the level of similarity between PCB congener profiles in
 spawning salmon and resident stream in stream reaches with and without
 salmon to determine if salmon are a significant vector for PCBs in
 tributary foodwebs of the Great Lakes.

 My data set is arranged in a matrix where the columns represent the
 congener of interest and the rows represent either a salmon (migratory) or
 resident fish (non migratory) from different sites.  You can think of this
 in a manner analogous to columns representing species composition and rows
 representing site.

 Currently, I am using the function Adonis to test for dissimilarity between
 fish species, stream reaches (with and without salmon) and lake basin
 (Superior, Huron, Michigan).
 The model statement is:

 m1adonis(congener~FISH*REACH*BASIN,data=pcbcov,method=bray,permutations=999)

 The output indicates significant main effects of FISH, REACH, and BASIN and
 significant interactions between FISH and BASIN, and BASIN and REACH.

 Is it best to then interpret this output via an NMDS ordination plot or use
 something like the betadiver function to examine variances between main
 effect levels or both?

 Also,  can anyone recommend a procedure to identify the congeners that
 contribute most to the dissimilarity between fish, reaches, and basins?. I
 was thinking the SIMPER procedure but am not yet sold.

 Any advice is appreciated!
 --
 Brandon Gerig
 PhD Student
 Department of Biological Sciences
 University of Notre Dame

 [[alternative HTML version deleted]]

 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




*- Do not print this email unless you really need to. Save paper and
protect the environment! -*

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Vegan-Adonis-NMDS-SIMPER

2014-03-25 Thread Liz Pryde
Further to the Warton paper, he and colleagues developed an R package that I've 
used in place of SIMPER called ' mvabund'. This runs multivariate GLMs and 
gives effects sizes and adjusted significance values of species that can be 
used to determine those having the greatest effect.

I haven't looked at it recently but when I used the package they had it working 
for count and binary data. It's very simple to use and there are excellent 
references for both the methods and theory. David even uploaded possibly the 
most entertaining stats video about it on you tube which explains reasons for 
the approach very clearly.

Good luck,
Liz

Liz Pryde


 On 25 Mar 2014, at 9:19 pm, Gian Maria Niccolò Benucci 
 gian.benu...@gmail.com wrote:
 
 For SIMPER you can read this thread:
 https://www.researchgate.net/post/What_do_you_think_about_Similarity_Percentage_analysis_SIMPER_to_measure_species_indicator_value
 
 Moreover read the function help(simper) carefully, because: The results of
 'simper' can be very difficult to interpret. The method very badly
 confounds the mean between group differences and within group variation,
 and seems to single out variable species instead of distinctive species
 (Warton et al. 2012). Even if you make groups that are copies of each
 other, the method will single out species with high contribution, but these
 are not contributions to non-existing between-group differences but to
 within-group variation in species abundance.
 
 Warton, D.I., Wright, T.W., Wang, Y. 2012. Distance-based multivariate
 analyses confound location and dispersion effects. *Methods in Ecology and
 Evolution*, 3, 89-101.
 
 hope that helped,
 Gian
 
 
 On 24 March 2014 22:08, Brandon Gerig bge...@nd.edu wrote:
 
 I am assessing the level of similarity between PCB congener profiles in
 spawning salmon and resident stream in stream reaches with and without
 salmon to determine if salmon are a significant vector for PCBs in
 tributary foodwebs of the Great Lakes.
 
 My data set is arranged in a matrix where the columns represent the
 congener of interest and the rows represent either a salmon (migratory) or
 resident fish (non migratory) from different sites.  You can think of this
 in a manner analogous to columns representing species composition and rows
 representing site.
 
 Currently, I am using the function Adonis to test for dissimilarity between
 fish species, stream reaches (with and without salmon) and lake basin
 (Superior, Huron, Michigan).
 The model statement is:
 
 m1adonis(congener~FISH*REACH*BASIN,data=pcbcov,method=bray,permutations=999)
 
 The output indicates significant main effects of FISH, REACH, and BASIN and
 significant interactions between FISH and BASIN, and BASIN and REACH.
 
 Is it best to then interpret this output via an NMDS ordination plot or use
 something like the betadiver function to examine variances between main
 effect levels or both?
 
 Also,  can anyone recommend a procedure to identify the congeners that
 contribute most to the dissimilarity between fish, reaches, and basins?. I
 was thinking the SIMPER procedure but am not yet sold.
 
 Any advice is appreciated!
 --
 Brandon Gerig
 PhD Student
 Department of Biological Sciences
 University of Notre Dame
 
[[alternative HTML version deleted]]
 
 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
 
 
 
 *- Do not print this email unless you really need to. Save paper and
 protect the environment! -*
 
[[alternative HTML version deleted]]
 
 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Estimating HR overlap by adehabitatHR

2014-03-25 Thread ellepero
Augusto:

I'm also looking to calculate hr overlap from MCP; however, I run into an
error in the code you provided:


Augusto Ribas wrote
 #calc sobreposition area
 for(i in 1:4){
  for(j in 1:4){
p1 - as(cp@'polygons'[[i]]@'Polygons'[[1]]@'coords','gpc.poly')
p2 - as(cp@'polygons'[[j]]@'Polygons'[[1]]@'coords','gpc.poly')
res[i,j] - area.poly(p1)+area.poly(p2)-area.poly(union(p1,p2))
  }}
 res/1

When I get to this point, the code produces an error, saying:
Error in as(hrmcp@polygons[[i]]@Polygons[[1]]@coords, gpc.poly) :
  no method or default for coercing “matrix” to “gpc.poly”

Any ideas as to getting around this error?
Any feedback would be greatly appreciated!
Cheers,
Ellen 




--
View this message in context: 
http://r-sig-ecology.471788.n2.nabble.com/Estimating-HR-overlap-by-adehabitatHR-tp7409589p7578782.html
Sent from the r-sig-ecology mailing list archive at Nabble.com.

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[R-sig-eco] Cosinor with data that trend over time

2014-03-25 Thread Jacob Cram
Hello all,
 I am thinking about applying season::cosinor() analysis to some
irregularely spaced time series data. The data are unevenly spaced, so
usual time series methods, as well as the nscosinor() function are out. My
data do however trend over time and I am wondering if I can feed date as a
variable into my cosinor analyis.  In the example below, then I'd conclude
then that the abundances are seasonal, with maximal abundance in mid June
and furthermore, they are generally decreasing over time.
Can I use both time variables together like this? If not, is there some
better approach I should take?
Thanks in advance,
-Jacob

For context lAbundance is logg-odds transformed abundance data of a
microbial species in a given location over time.  RDate is the date the
sample was collected in the r date format.


 res - cosinor(lAbundance ~ RDate, date = RDate, data = lldata)

summary(res)
Cosinor test
Number of observations = 62
Amplitude = 0.58
Phase: Month = June , day = 14
Low point: Month = December , day = 14
Significant seasonality based on adjusted significance level of 0.025  =
TRUE

summary(res$glm)


Call:
glm(formula = f, family = family, data = data, offset = offset)

Deviance Residuals:
Min   1Q   Median   3Q  Max
-2.3476  -0.6463   0.1519   0.6574   1.9618

Coefficients:
  Estimate Std. Error t value Pr(|t|)
(Intercept)  0.0115693  1.8963070   0.006  0.99515
RDate   -0.0003203  0.0001393  -2.299  0.02514 *
cosw-0.5516458  0.1837344  -3.002  0.00395 **
sinw 0.1762904  0.1700670   1.037  0.30423
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 0.9458339)

Null deviance: 70.759  on 61  degrees of freedom
Residual deviance: 54.858  on 58  degrees of freedom
AIC: 178.36

Number of Fisher Scoring iterations: 2


llldata as a csv below
RDate,lAbundance
2003-03-12,-3.3330699059335
2003-05-21,-3.04104625745886
2003-06-17,-3.04734680029566
2003-07-02,-4.18791034708572
2003-09-18,-3.04419201802053
2003-10-22,-3.13805060873929
2004-02-19,-3.80688269144794
2004-03-17,-4.50755507726145
2004-04-22,-4.38846502542992
2004-05-19,-3.06618649442674
2004-06-17,-5.20518774876304
2004-07-14,-3.75041853151097
2004-08-25,-3.67882486716196
2004-09-22,-5.22205827512234
2004-10-14,-3.99297508670535
2004-11-17,-4.68793287601157
2004-12-15,-4.31712380781011
2005-02-16,-4.30893550479904
2005-03-16,-4.05781773988454
2005-05-11,-3.94746237402035
2005-07-19,-4.91195185391358
2005-08-17,-4.93590576323119
2005-09-15,-4.85820800095518
2005-10-20,-5.22956391101343
2005-12-13,-5.12244047315448
2006-01-18,-3.04854660925046
2006-02-22,-6.77145858348375
2006-03-29,-4.33151493849021
2006-04-19,-3.36152357710535
2006-06-20,-3.09071584142593
2006-07-25,-3.31430484483825
2006-08-24,-3.09974933041469
2006-09-13,-3.33288992218458
2007-12-17,-4.19942661980677
2008-03-19,-3.86146499633625
2008-04-22,-3.36161599919095
2008-05-14,-4.30878307213324
2008-06-18,-3.74372448768828
2008-07-09,-4.65951429661651
2008-08-20,-5.35984647704619
2008-09-22,-4.78481898261137
2008-10-20,-3.58588161980965
2008-11-20,-3.10625125552057
2009-02-18,-6.90675477864855
2009-03-11,-3.43446932013368
2009-04-23,-3.82688066341466
2009-05-13,-4.44885332005661
2009-06-18,-3.97671552612412
2009-07-09,-3.40185954003936
2009-08-19,-3.44958231694091
2009-09-24,-3.86508161094726
2010-01-28,-4.95281587967569
2010-02-11,-3.78064756876257
2010-03-24,-3.5823501176064
2010-04-27,-4.3363571587
2010-05-17,-3.90545735473055
2010-07-21,-3.3147176517321
2010-08-11,-4.53218360860017
2010-10-21,-6.90675477864855
2010-11-23,-6.90675477864855
2010-12-16,-6.75158176094352
2011-01-11,-6.90675477864855

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Cosinor with data that trend over time

2014-03-25 Thread Gavin Simpson
I would probably attack this using a GAM modified to model the
residuals as a stochastic time series process.

For example

require(mgcv)
mod - gamm(y ~ s(DoY, bs = cc) + s(time), data = foo,
 correlation = corCAR1(form = ~ time))

where `foo` is your data frame, `DoY` is a variable in the data frame
computed as `as.numeric(strftime(RDate, format = %j))` and `time` is
a variable for the passage of time - you could do `as.numeric(RDate)`
but the number of days is probably large as we might encounter more
problems fitting the model. Instead you might do `as.numeric(RDate) /
1000` say to produce values on a more manageable scale. The `bs =
cc` bit specifies a cyclic spline applicable to data measured
throughout a year. You may want to fix the start and end knots to be
days 1 and days 366 respectively, say via `knots = list(DoY =
c(0,366))` as an argument to `gam()` [I think I have this right,
specifying the boundary knots, but let me know if you get an error
about the number of knots]. The residuals are said to follow a
continuois time AR(1), the irregular-spaced counter part to the AR(1),
plus random noise.

There may be identifiability issues as the `s(time)` and `corCAR1()`
compete to explain the fine-scale variation. If you hit such a case,
you can make an educated guess as to the wiggliness (degrees of
freedom) for the smooth terms based on a plot of the data and fix the
splines at those values via argument `k = x` and `fx = TRUE`, where
`x` in `k = x` is some integer value. Both these go in as arguments to
the `s()` functions. If the trend is not very non-linear you can use a
low value 1-3 here for x and for the DoY term say 3-4 might be
applicable.

There are other ways to approach this problem of identifiability, but
that would require more time/space here, which I can go into via a
follow-up if needed.

You can interrogate the fitted splines to see when the peak value of
the `DoY` term is in the year.

You can also allow the seasonal signal to vary in time with the trend
by allowing the splines to interact in a 2d-tensor product spline.
Using `te(DoY, time, bs = c(cc,cr))` instead of the two `s()`
terms (or using `ti()` terms for the two marginal splines and the
2-d spline). Again you can add in the `k` = c(x,y), fx = TRUE)` to the
`te()` term where `x` and `y` are the dfs for each dimension in the
`te()` term. It is a bit more complex to do this for `ti()` terms.

Part of the reason to prefer a spline for DoY for the seasonal term is
that one might not expect the seasonal cycle to be a symmetric cycle
as a cos/sin terms would imply.

A recent ecological paper describing a similar approach (though using
different package in R) is that of Claire Ferguson and colleagues in J
Applied Ecology (2008) http://doi.org/10./j.1365-2664.2007.01428.x
(freely available).

HTH

G

On 25 March 2014 19:14, Jacob Cram cramj...@gmail.com wrote:
 Hello all,
  I am thinking about applying season::cosinor() analysis to some
 irregularely spaced time series data. The data are unevenly spaced, so
 usual time series methods, as well as the nscosinor() function are out. My
 data do however trend over time and I am wondering if I can feed date as a
 variable into my cosinor analyis.  In the example below, then I'd conclude
 then that the abundances are seasonal, with maximal abundance in mid June
 and furthermore, they are generally decreasing over time.
 Can I use both time variables together like this? If not, is there some
 better approach I should take?
 Thanks in advance,
 -Jacob

 For context lAbundance is logg-odds transformed abundance data of a
 microbial species in a given location over time.  RDate is the date the
 sample was collected in the r date format.


 res - cosinor(lAbundance ~ RDate, date = RDate, data = lldata)

summary(res)
 Cosinor test
 Number of observations = 62
 Amplitude = 0.58
 Phase: Month = June , day = 14
 Low point: Month = December , day = 14
 Significant seasonality based on adjusted significance level of 0.025  =
 TRUE

summary(res$glm)


 Call:
 glm(formula = f, family = family, data = data, offset = offset)

 Deviance Residuals:
 Min   1Q   Median   3Q  Max
 -2.3476  -0.6463   0.1519   0.6574   1.9618

 Coefficients:
   Estimate Std. Error t value Pr(|t|)
 (Intercept)  0.0115693  1.8963070   0.006  0.99515
 RDate   -0.0003203  0.0001393  -2.299  0.02514 *
 cosw-0.5516458  0.1837344  -3.002  0.00395 **
 sinw 0.1762904  0.1700670   1.037  0.30423
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 (Dispersion parameter for gaussian family taken to be 0.9458339)

 Null deviance: 70.759  on 61  degrees of freedom
 Residual deviance: 54.858  on 58  degrees of freedom
 AIC: 178.36

 Number of Fisher Scoring iterations: 2


 llldata as a csv below
 RDate,lAbundance
 2003-03-12,-3.3330699059335
 2003-05-21,-3.04104625745886
 2003-06-17,-3.04734680029566
 2003-07-02,-4.18791034708572