[R-sig-eco] Comparison of gam and gamm fits

2012-02-08 Thread Thackeray, Stephen J.
Dear list members,

I apologise in advance for the large-ish email, but I thought it was important 
to paste in some plots for what follows.

I am using generalised additive models to capture patterns of seasonal and 
interannual variation in the abundance of zooplankton, in a lake ecosystem. I 
am trying to fit models with smoothers for year and day of year to capture the 
average pattern in each of these temporal dimensions, and then have added a 
two-dimensional (tensor product) smoother to try to model any changes in the 
seasonal pattern among years. I am mindful that I may need to deal with 
correlated errors in these models and so would like to fit error structures to 
see if they improve model fit, judged by AIC. Therefore, as a first step I 
re-fitted the gam model using gamm, to allow later inclusion of a correlation 
structure:

Daph_gam4 <- gam((DAPHG+0.1)~s(Year,bs="cr")+s(DOY,bs="cr")+te(Year,DOY),family=Gamma(link=log),data=ZooDat2)
Daph_gam4_no_ac <- gamm((DAPHG+0.1)~s(Year,bs="cr")+s(DOY,bs="cr")+te(Year,DOY),family=Gamma(link=log),data=ZooDat2)

...where DAPHG is the abundance of a particular species of interest and DOY= 
day of year. I am using a Gamma distribution as the data are heavily skewed and 
on a continuous scale (numbers per litre lake water).

The problem I am having is that these two models produce dramatically different 
fits, see the image plots below. In this case the result of the gam model 
(Daph_gam4, labelled gam in the plot) bears a much greater resemblance to the 
original data. Could anyone help me to understand why these two model fits are 
so very different, when they are fitting the same smoothers?

Any help much appreciated!

Steve





Dr Stephen  Thackeray
Lake Ecosystem Group
Centre for Ecology and Hydrology
Lancaster Environment Centre
Library Avenue
Bailrigg
Lancaster
LA1 4AP
s...@ceh.ac.uk



-- 
This message (and any attachments) is for the recipient only. NERC
is subject to the Freedom of Information Act 2000 and the contents
of this email and any reply you make may be disclosed by NERC unless
it is exempt from release under the Act. Any material supplied to
NERC may be stored in an electronic records management system.
___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Comparison of gam and gamm fits

2012-02-08 Thread Simpson, Gavin
How do the fits compare if you add 'method = "REML"' to the gam() call? Simon 
Wood has shown that GCV can overfit in some circumstances. You might need 
'method = "ML"' as I forget what the default in gamm() is.

Some other points: you probably want bs = "cc" for the DOY smooth as it will 
stop there being a discontinuity between December and January.

You will therefore also need to add bs = c("cr", "cc") in the te() smooth.

HTH

Gavin
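
A minimal sketch of the suggestions above, for anyone who wants to try them. The data frame is simulated and purely hypothetical (the real ZooDat2 was not posted), the object name Daph_gam_reml is made up for illustration, and the fit is skipped if mgcv is not installed.

```r
# Hypothetical stand-in for ZooDat2: fortnightly samples over 20 years.
set.seed(1)
ZooDat2 <- expand.grid(Year = 1990:2009, DOY = seq(7, 357, by = 14))
mu <- exp(1 + sin(2 * pi * ZooDat2$DOY / 365) + 0.02 * (ZooDat2$Year - 1990))
ZooDat2$DAPHG <- rgamma(nrow(ZooDat2), shape = 2, rate = 2 / mu)

if (requireNamespace("mgcv", quietly = TRUE)) {
  library(mgcv)
  # REML smoothness selection instead of GCV; cyclic cubic spline ("cc")
  # for day of year, both in s() and in the DOY margin of te().
  Daph_gam_reml <- gam((DAPHG + 0.1) ~ s(Year, bs = "cr") + s(DOY, bs = "cc") +
                         te(Year, DOY, bs = c("cr", "cc")),
                       family = Gamma(link = "log"), data = ZooDat2,
                       method = "REML")
  print(summary(Daph_gam_reml))
}
```

Comparing this fit with the original GCV-based one should show whether the smoothness selection criterion accounts for part of the difference.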



Re: [R-sig-eco] Comparison of gam and gamm fits

2012-02-08 Thread Thackeray, Stephen J.
Dear all, 

My apologies, the figure seems to have been removed from the mail! Hope this 
attachment makes it instead...

Steve







[R-sig-eco] R-Sig Eco- Looking for help with R code to calculate fish metrics using 2 tables

2012-02-08 Thread Alison Anderson
Hello All,

I am looking for help finding R code to generate fish indices of stream 
quality.  A brief overview of what I currently have:  I am using two different 
data sets to generate these metrics.  The first one consists of site data, 
where each row is a sampling event and each column is a species, with the 
corresponding cells containing abundances for that site.  The second data set 
is a fish traits database where each row is a fish species (the same species as 
in the sampling events) and each column is a specific trait, with each cell 
indicating whether the fish species has that trait or not.  Specifically, I'm 
looking for code that will help me read from both tables at once, without 
having to write a bunch of if-then statements.

Any help, or ideas on a starting point, would be greatly appreciated.

Thanks,

Alison Anderson

--
Alison M. Anderson
Graduate Research Assistant
West Virginia University
aande...@mix.wvu.edu
(419)305-4167
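
One possible starting point for the two-table problem described above is a matrix product of the abundance table and the trait table, which avoids if-then statements entirely. This is only a sketch: the site, species and trait names below are hypothetical, and it assumes traits are coded 1 (has the trait) or 0 (does not).

```r
# Hypothetical site-by-species abundance table (rows = sampling events).
abund <- matrix(c(10, 0, 5,
                   2, 8, 0),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("site1", "site2"),
                                c("sp_a", "sp_b", "sp_c")))

# Hypothetical species-by-trait table (1 = species has the trait).
traits <- matrix(c(1, 0,
                   0, 1,
                   1, 1),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("sp_a", "sp_b", "sp_c"),
                                 c("tolerant", "benthic")))

# Align species by name, then sum abundances of trait-bearing species per site.
traits <- traits[colnames(abund), , drop = FALSE]
trait_abund <- abund %*% traits

# Proportion of individuals at each site that carry each trait.
trait_prop <- trait_abund / rowSums(abund)

# Number of species present at each site that carry each trait.
trait_rich <- ((abund > 0) * 1) %*% traits

print(trait_abund)
print(trait_prop)
print(trait_rich)
```

If the tables are read in as data frames, as.matrix() on the numeric columns would be needed before the product.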



[R-sig-eco] adonis: error in rowSums

2012-02-08 Thread Caroline Wallis
I am trying to use the 'adonis' function in the 'vegan' package to assess 
differences in water depth and water velocity between areas of a river channel 
categorised by surface flow type (6 types in total, unequal sample sizes). 

Sample Data (LB):

SFT     Depth   Vel
BSW     0.18    1.2
BSW     0.16    1.03
BSW     0.16    0.98
BSW     0.22    0.53
BSW     0.11    0.668
BSW     0.14    0.432
BSW     0.12    0.391
BSW     0.16    0.647
BSW     0.2     0.903
BSW     0.3     0.594
BSW     0.37    0.429


The dependent data was used in data frame format, rather than a dissimilarity 
matrix.

Using the call 
'adonis(formula=SFT~Depth*Vel,data=LB,permutations=999,method="canberra",strata=NULL)'
I get the following error:

Error in rowSums (x, na.rm=TRUE)
'x' must be an array of at least two dimensions

I examined the adonis code to find 'x'. It first appears at the permutation 
stage:

    if (missing(strata))
        strata <- NULL
    p <- sapply(1:permutations, function(x) permuted.index(n,
        strata = strata))
    tH.s <- lapply(H.s, t)
    tIH.snterm <- t(I - H.snterm)
    f.perms <- sapply(1:nterms, function(i) {
        sapply(1:permutations, function(j) {
            f.test(tH.s[[i]], G[p[, j], p[, j]], df.Exp[i], df.Res,
                tIH.snterm)

However I'm no closer to understanding what 'x' is or how to correct the error. 
If anyone could offer any advice or help I'd be very grateful.

I also tried transposing the data but this generated a different error! 

Regards,

Caroline Wallis

PhD student
University of Worcester

Tel: 01905 542441
Mobile: 07811 384641



Re: [R-sig-eco] Comparison of gam and gamm fits

2012-02-08 Thread Ivailo
On Wed, Feb 8, 2012 at 11:17 AM, Thackeray, Stephen J. s...@ceh.ac.uk wrote:
> Dear all,
>
> My apologies, the figure seems to have been removed from the mail! Hope this 
> attachment makes it instead...
>
> Steve

Dear Steve,

attachments are generally stripped off the message, so try hosting the
figure somewhere and post the link instead.

Cheers,
Ivailo
-- 
UBUNTU: a person is a person through other persons.



Re: [R-sig-eco] adonis: error in rowSums

2012-02-08 Thread Ivailo

Dear Caroline,

you need to provide a community table (i.e. species x samples data
frame) to adonis, but you have provided a single categorical variable.
Therefore I am not sure if the adonis() function would be appropriate
for the analysis you're trying to perform.

Cheers,
Ivailo
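
To illustrate the point about the left-hand side of the formula, here is a hedged sketch. The data are simulated and hypothetical (they are not the poster's measurements, and the second flow type "RP" is a made-up label), and the object names resp and err are for illustration only. adonis2() is used because, to my knowledge, recent vegan releases favour it over adonis(); the call is skipped if vegan is not installed.

```r
# Simulated, purely hypothetical measurements for two surface flow types;
# these are NOT the poster's data (only one flow type was shown in the post).
set.seed(42)
LB <- data.frame(
  SFT   = factor(rep(c("BSW", "RP"), times = c(11, 8))),
  Depth = c(runif(11, 0.10, 0.40), runif(8, 0.20, 0.60)),
  Vel   = c(runif(11, 0.40, 1.20), runif(8, 0.10, 0.50))
)

# A single factor has no dimensions, so rowSums() fails in the same way
# as in the reported error.
err <- tryCatch(rowSums(LB$SFT), error = function(e) conditionMessage(e))
print(err)

# The two measured variables form a proper two-dimensional response.
resp <- LB[, c("Depth", "Vel")]
print(dim(resp))

# With vegan installed, the response goes on the left of the formula and
# the grouping factor on the right.
if (requireNamespace("vegan", quietly = TRUE)) {
  print(vegan::adonis2(resp ~ SFT, data = LB, method = "canberra",
                       permutations = 999))
}
```

Whether a permutational test on two environmental variables is the right analysis for the question is a separate matter, as noted above.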


[R-sig-eco] Using function adonis for unbalanced designs?

2012-02-08 Thread Sandra Bibiana Correa

Hi all,

I am trying to use PERMANOVA in a one-factor design to test for 
differences in diet among species (species are in rows and foods in 
columns). However, I have different sample sizes for each species (group).


In Anderson (2001) I found that the PERMANOVA method is for balanced 
designs but could be modified for unbalanced designs. Does adonis 
account for differences in sample size among groups? I couldn’t find a 
direct reference to this in the vegan manual or tutorial. I read that 
MRPP can be used for unbalanced designs, but I'm not sure about ANOSIM. I 
would greatly appreciate any suggestions.


Thanks,

Bibiana
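
For reference, a one-way pseudo-F statistic can be computed directly from a distance matrix with each group's sum of squares divided by its own sample size, which is how I understand the partitioning to extend to unequal group sizes. The sketch below is written in base R under that assumption; the function name pseudo_F and the simulated diet data are hypothetical, and this is not a statement about what adonis does internally.

```r
# Pseudo-F for a one-way design computed from a distance matrix, with
# group-specific sample sizes n_g (so groups need not be equal in size).
pseudo_F <- function(d, groups) {
  d2 <- as.matrix(d)^2
  groups <- factor(groups)
  N <- nrow(d2)
  a <- nlevels(groups)
  # Total sum of squares: squared distances over all pairs, divided by N.
  ss_total <- sum(d2[upper.tri(d2)]) / N
  # Within-group sum of squares: each group's pairs divided by its own n_g.
  ss_within <- sum(vapply(levels(groups), function(g) {
    idx <- which(groups == g)
    sub <- d2[idx, idx, drop = FALSE]
    sum(sub[upper.tri(sub)]) / length(idx)
  }, numeric(1)))
  ss_among <- ss_total - ss_within
  (ss_among / (a - 1)) / (ss_within / (N - a))
}

# Hypothetical, simulated diet data: 3 species with 12, 7 and 4 individuals.
set.seed(7)
species <- rep(c("sp1", "sp2", "sp3"), times = c(12, 7, 4))
diet <- matrix(runif(length(species) * 5), ncol = 5)
d <- dist(diet)
obs_F <- pseudo_F(d, species)

# Permutation p-value: shuffle group labels and recompute the statistic.
perm_F <- replicate(999, pseudo_F(d, sample(species)))
p_value <- (sum(perm_F >= obs_F) + 1) / (999 + 1)
print(c(F = obs_F, p = p_value))
```

With a single variable and Euclidean distances this statistic reduces to the classical ANOVA F for an unbalanced one-way layout, which is a convenient sanity check.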


[R-sig-eco] lda on groups of species

2012-02-08 Thread Vincenzo Ellis
Hi everyone,

I am trying to perform a linear discriminant analysis (lda) using the MASS
package on a community dataset (sites by species, similar to the dune
dataset in vegan). I have defined groups of species based on an a priori
ecological hypothesis. Does anyone know how to perform the lda on species
based on their abundances at each site?  I want to see if the groups of
species are significantly different from one another, and how they are
realized in the community space.  So far I can only find examples of lda's
of sites where they have been previously grouped based on clustering
algorithms.

Thanks so much,
Vincenzo
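
One way to set this up, sketched under assumptions, is to transpose the sites-by-species table so that species become the observations and sites the variables, then pass the a priori grouping to lda(). The community table, group labels and object names below are simulated and hypothetical, and the lda() call is skipped if MASS is not installed. The sketch does not address the significance question.

```r
# Hypothetical, simulated sites-by-species table: 6 sites, 30 species.
set.seed(3)
n_sites <- 6
sp_group <- factor(rep(c("g1", "g2", "g3"), each = 10))  # a priori groups
lambda <- c(g1 = 5, g2 = 10, g3 = 20)[as.character(sp_group)]
comm <- matrix(rpois(n_sites * 30, lambda = rep(lambda, each = n_sites)),
               nrow = n_sites,
               dimnames = list(paste0("site", 1:n_sites), paste0("sp", 1:30)))

# Transpose so that species are the observations and sites the variables.
species_by_site <- as.data.frame(t(comm))
species_by_site$group <- sp_group

if (requireNamespace("MASS", quietly = TRUE)) {
  fit <- MASS::lda(group ~ ., data = species_by_site)
  pred <- predict(fit)
  # Cross-tabulate a priori groups against the groups predicted by the LDA.
  print(table(observed = species_by_site$group, predicted = pred$class))
}
```

The discriminant scores in pred$x could then be plotted to see how the species groups separate along the discriminant axes.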



[R-sig-eco] Residual Deviance in Binomial GLM

2012-02-08 Thread Josh Jahner

Hello,
I am attempting to calculate the variation in many butterfly population sizes 
across 35 years of monitoring. The data are in fractions, so I am using a 
binomial glm with a logit link function to look for population trends across 
years. I was wondering if I could use the residual deviance output from the 
GLMs as a proxy for population variability? What are the issues associated with 
doing this? Thanks!
Josh
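
For concreteness, here is a sketch of how the residual deviance could be extracted from such a model. The data are simulated and hypothetical, and the number of visits per year (40) is an assumption made only so that the fractions have a binomial denominator.

```r
# Hypothetical, simulated monitoring data for one species over 35 years.
set.seed(11)
years <- 1:35
visits <- rep(40, 35)                       # assumed number of visits per year
p_true <- plogis(-0.5 + 0.03 * years)
present <- rbinom(35, size = visits, prob = p_true)
frac <- present / visits                    # response expressed as a fraction

# Binomial GLM with logit link; weights carry the binomial denominators.
fit <- glm(frac ~ years, family = binomial(link = "logit"), weights = visits)

res_dev <- deviance(fit)                    # residual deviance
res_df  <- df.residual(fit)                 # residual degrees of freedom
print(c(deviance = res_dev, df = res_df, ratio = res_dev / res_df))
```

One issue to keep in mind is that the residual deviance grows with the number of years and with the binomial denominators, so raw values may not be comparable across series that differ in length or sampling effort; the ratio to the residual degrees of freedom is one rough way of putting them on a common footing.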


  