[R-sig-eco] Comparison of gam and gamm fits
Dear list members,

I apologise in advance for the large-ish email, but I thought it was important to paste in some plots for what follows.

I am using generalised additive models to capture patterns of seasonal and interannual variation in the abundance of zooplankton, in a lake ecosystem. I am trying to fit models with smoothers for year and day of year to capture the average pattern in each of these temporal dimensions, and then have added a two-dimensional (tensor product) smoother to try to model any changes in the seasonal pattern among years.

I am mindful that I may need to deal with correlated errors in these models and so would like to fit error structures to see if they improve model fit, judged by AIC. Therefore, as a first step I re-fitted the gam model using gamm, to allow later inclusion of a correlation structure:

Daph_gam4 <- gam((DAPHG+0.1) ~ s(Year, bs="cr") + s(DOY, bs="cr") + te(Year, DOY), family=Gamma(link="log"), data=ZooDat2)

Daph_gam4_no_ac <- gamm((DAPHG+0.1) ~ s(Year, bs="cr") + s(DOY, bs="cr") + te(Year, DOY), family=Gamma(link="log"), data=ZooDat2)

...where DAPHG is the abundance of a particular species of interest and DOY = day of year. I am using a Gamma distribution as the data are heavily skewed and on a continuous scale (numbers per litre lake water).

The problem I am having is that these two models produce dramatically different fits, see the image plots below. In this case the result of the gam model (Daph_gam4, labelled gam in the plot) bears a much greater resemblance to the original data.

Could anyone help me to understand why these two model fits are so very different, when they are fitting the same smoothers?

Any help much appreciated!

Steve

Dr Stephen Thackeray
Lake Ecosystem Group
Centre for Ecology and Hydrology
Lancaster Environment Centre
Library Avenue
Bailrigg
Lancaster LA1 4AP
s...@ceh.ac.uk

-- This message (and any attachments) is for the recipient only.
NERC is subject to the Freedom of Information Act 2000 and the contents of this email and any reply you make may be disclosed by NERC unless it is exempt from release under the Act. Any material supplied to NERC may be stored in an electronic records management system.___ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Re: [R-sig-eco] Comparison of gam and gamm fits
How do the fits compare if you add method = "REML" to the gam() call? Simon Wood has shown that GCV can overfit in some circumstances. You might need method = "ML" as I forget what the default in gamm() is.

Some other points: you probably want bs = "cc" for the DOY smooth, as it will stop there being a discontinuity between December and January. You will therefore also need to add bs = c("cr", "cc") in the te() smooth.

HTH

Gavin
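Gavin's two suggestions (REML smoothness selection and a cyclic basis for day of year) can be sketched as follows. This is only an illustration under stated assumptions: the data frame ZooSim and the column DAPHG_adj are simulated stand-ins for the poster's ZooDat2 and DAPHG+0.1, the end knots at 0.5 and 366.5 are one possible choice for where the year wraps, and the sketch covers the gam() side only, not gamm().

```r
library(mgcv)  # recommended package, normally installed alongside R

set.seed(1)

# Simulated stand-in for ZooDat2 (hypothetical values, fortnightly sampling)
ZooSim <- expand.grid(Year = 1991:2010, DOY = seq(7, 357, by = 14))
mu <- exp(1 + 0.8 * sin(2 * pi * ZooSim$DOY / 366) + 0.03 * (ZooSim$Year - 2000))
ZooSim$DAPHG <- rgamma(nrow(ZooSim), shape = 2, rate = 2 / mu)
ZooSim$DAPHG_adj <- ZooSim$DAPHG + 0.1  # same offset as in the original call

# Cyclic ("cc") basis for DOY, both in the main effect and in the te() margin
form <- DAPHG_adj ~ s(Year, bs = "cr") + s(DOY, bs = "cc") +
  te(Year, DOY, bs = c("cr", "cc"))

# Assumed end points of the cycle, so that late December joins early January
doy_knots <- list(DOY = c(0.5, 366.5))

# Same model fitted with the default criterion and with REML
fit_gcv <- gam(form, family = Gamma(link = "log"), data = ZooSim,
               knots = doy_knots)
fit_reml <- gam(form, family = Gamma(link = "log"), data = ZooSim,
                knots = doy_knots, method = "REML")

# Total effective degrees of freedom under each criterion
edf_compare <- c(default = sum(fit_gcv$edf), REML = sum(fit_reml$edf))
print(edf_compare)

# Linear predictor at the two ends of the cycle for one year
ends <- predict(fit_reml, newdata = data.frame(Year = 2000, DOY = c(0.5, 366.5)))
print(ends)
```

Comparing the total effective degrees of freedom is one quick way to see whether the default criterion is producing a wigglier fit than REML on a given data set; the two predictions at the ends of the cycle should agree when the cyclic basis is in use.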
Re: [R-sig-eco] Comparison of gam and gamm fits
Dear all,

My apologies, the figure seems to have been removed from the mail! Hope this attachment makes it instead...

Steve

-----Original Message-----
From: r-sig-ecology-boun...@r-project.org [mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of Thackeray, Stephen J.
Sent: 08 February 2012 09:13
To: 'R-sig-ecology@r-project.org'
Subject: [R-sig-eco] Comparison of gam and gamm fits
[R-sig-eco] R-Sig Eco- Looking for help with R code to calculate fish metrics using 2 tables
Hello All,

I am looking for help finding R code to help me generate fish indices of stream quality. A brief overview of what I currently have: I am using two different data sets to help generate these metrics. The first one consists of site data, where each row is a sampling event and each column is a species, with the corresponding cells containing abundances for that site. The second data set is a fish traits database where each row is a fish species (the same species as in the sampling events) and each column is a specific trait, with each cell indicating whether the fish species has that trait or not.

Specifically, I'm looking for code that will help me read from both tables at once, without me having to write a bunch of if-then statements. Any help, or ideas on a starting point, would be greatly appreciated.

Thanks,
Alison Anderson

--
Alison M. Anderson
Graduate Research Assistant
West Virginia University
aande...@mix.wvu.edu
(419)305-4167
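One possible starting point for combining the two tables without if-then statements is a matrix product of the abundance table and the trait table. The sketch below is illustrative only: the species codes (sppA to sppD), sample names and trait names (benthic, tolerant, invertivore) are invented, and it assumes the trait table is coded 0/1 and has one row for every species column in the abundance table.

```r
# Hypothetical abundance table: rows are sampling events, columns are species
abund <- matrix(c(10, 0, 5, 2,
                   0, 3, 0, 7,
                   4, 4, 4, 0),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("s1", "s2", "s3"),
                                c("sppA", "sppB", "sppC", "sppD")))

# Hypothetical trait table: rows are species, columns are 0/1 traits
traits <- matrix(c(1, 0, 1,
                   0, 1, 1,
                   1, 1, 0,
                   0, 0, 1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("sppA", "sppB", "sppC", "sppD"),
                                 c("benthic", "tolerant", "invertivore")))

# Put the trait rows in the same order as the abundance columns
traits <- traits[colnames(abund), , drop = FALSE]

# Number of individuals carrying each trait in each sampling event
trait_abund <- abund %*% traits

# The same as a proportion of all individuals in the sampling event
trait_prop <- trait_abund / rowSums(abund)

# Number of species present that carry each trait
trait_rich <- ((abund > 0) * 1) %*% traits

print(trait_abund)
print(round(trait_prop, 2))
print(trait_rich)
```

If the tables are read in as data frames, as.matrix() on the numeric columns gives the matrices used here. Matching the trait rows to the abundance columns by name, rather than by position, guards against the two tables listing species in different orders.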
[R-sig-eco] adonis: error in rowSums
I am trying to use the 'adonis' function in the 'vegan' package to assess differences in water depth and water velocity between areas of a river channel categorised by surface flow type (6 types in total, unequal sample sizes).

Sample Data (LB):

SFT   Depth   Vel
BSW   0.18    1.2
BSW   0.16    1.03
BSW   0.16    0.98
BSW   0.22    0.53
BSW   0.11    0.668
BSW   0.14    0.432
BSW   0.12    0.391
BSW   0.16    0.647
BSW   0.2     0.903
BSW   0.3     0.594
BSW   0.37    0.429

The dependent data was used in data frame format, rather than a dissimilarity matrix. Using the call

adonis(formula=SFT~Depth*Vel, data=LB, permutations=999, method="canberra", strata=NULL)

I get the following error:

Error in rowSums(x, na.rm=TRUE) : 'x' must be an array of at least two dimensions

I examined the adonis code to find 'x'. It first appears at the permutation stage:

if (missing(strata)) strata <- NULL
p <- sapply(1:permutations, function(x) permuted.index(n, strata = strata))
tH.s <- lapply(H.s, t)
tIH.snterm <- t(I - H.snterm)
f.perms <- sapply(1:nterms, function(i) {
    sapply(1:permutations, function(j) {
        f.test(tH.s[[i]], G[p[, j], p[, j]], df.Exp[i], df.Res, tIH.snterm)

However I'm no closer to understanding what 'x' is or how to correct the error. If anyone could offer any advice or help I'd be very grateful. I also tried transposing the data but this generated a different error!

Regards,
Caroline Wallis
PhD student
University of Worcester
Tel: 01905 542441
Mobile: 07811 384641
Re: [R-sig-eco] Comparison of gam and gamm fits
On Wed, Feb 8, 2012 at 11:17 AM, Thackeray, Stephen J. s...@ceh.ac.uk wrote:

Dear all, My apologies, the figure seems to have been removed from the mail! Hope this attachment makes it instead... Steve

Dear Steve,

attachments are generally stripped off the message, so try hosting the figure somewhere and post the link instead.

Cheers,
Ivailo

--
UBUNTU: a person is a person through other persons.
Re: [R-sig-eco] adonis: error in rowSums
On Wed, Feb 8, 2012 at 4:10 PM, Caroline Wallis c.wal...@worc.ac.uk wrote:

Using the call 'adonis(formula=SFT~Depth*Vel, data=LB, permutations=999, method="canberra", strata=NULL)' I get the following error: Error in rowSums(x, na.rm=TRUE) : 'x' must be an array of at least two dimensions

Dear Caroline,

you need to provide a community table (i.e. species x samples data frame) to adonis, but you have provided a single categorical variable. Therefore I am not sure if the adonis() function would be appropriate for the analysis you're trying to perform.

Cheers,
Ivailo
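Ivailo's point can be illustrated in base R. The sketch below rebuilds the posted sample rows, shows that rowSums() rejects a single categorical variable, and then builds a two-column numeric response (one row per sample) from Depth and Vel. It assumes, without having checked the vegan source, that the reported error arises because the left-hand side of the formula is passed on as a plain vector; the final vegan call is left as a comment and is an untested suggestion rather than a confirmed fix.

```r
# Sample rows as posted (only one flow type appears in the excerpt; the
# full data set has six, which any real test between types would need)
LB <- data.frame(
  SFT   = factor(rep("BSW", 11)),
  Depth = c(0.18, 0.16, 0.16, 0.22, 0.11, 0.14, 0.12, 0.16, 0.2, 0.3, 0.37),
  Vel   = c(1.2, 1.03, 0.98, 0.53, 0.668, 0.432, 0.391, 0.647, 0.903,
            0.594, 0.429)
)

# A factor has no dimensions, so rowSums() stops with an error;
# tryCatch() captures the message instead of halting the script
err <- tryCatch(rowSums(LB$SFT, na.rm = TRUE),
                error = function(e) conditionMessage(e))
print(err)

# A numeric response with one row per sample and one column per variable
resp <- as.matrix(LB[, c("Depth", "Vel")])
rs <- rowSums(resp, na.rm = TRUE)   # works: resp is two-dimensional

# Canberra dissimilarities between samples, computed with stats::dist()
d <- dist(resp, method = "canberra")
print(attr(d, "Size"))

# Possible vegan call with the roles of the variables swapped (not run here;
# requires the vegan package and the full six-type data set):
# adonis(resp ~ SFT, data = LB, permutations = 999, method = "canberra")
```

Putting the measured variables on the left and the grouping factor on the right matches the formula layout that adonis() documents, with the response as a matrix or dissimilarity object. Whether a permutational test on two physical variables is the right analysis for this question is a separate matter, as the reply above notes.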
[R-sig-eco] Using function adonis for unbalanced designs?
Hi all,

I am trying to use PERMANOVA in a one-factor design to test for differences in diet among species (species are in rows and foods in columns). However, I have different sample sizes for each species (groups). In Anderson (2001) I found that the PERMANOVA method is for balanced designs but that it could be modified for unbalanced designs. Does adonis account for differences in sample size among groups? I couldn't find a direct reference to this in the vegan manual or tutorial. I read that MRPP can be used for unbalanced designs. I'm not sure about ANOSIM. I would greatly appreciate any suggestions.

Thanks,
Bibiana
[R-sig-eco] lda on groups of species
Hi everyone,

I am trying to perform a linear discriminant analysis (lda) using the MASS package on a community dataset (sites by species, similar to the dune dataset in vegan). I have defined groups of species based on an a priori ecological hypothesis. Does anyone know how to perform the lda on species based on their abundances at each site? I want to see if the groups of species are significantly different from one another, and how they are realized in the community space. So far I can only find examples of LDAs of sites that have been previously grouped based on clustering algorithms.

Thanks so much,
Vincenzo
[R-sig-eco] Residual Deviance in Binomial GLM
Hello,

I am attempting to calculate the variation in many butterfly population sizes across 35 years of monitoring. The data are in fractions, so I am using a binomial GLM with a logit link function to look for population trends across years. I was wondering if I could use the residual deviance output from the GLMs as a proxy for population variability? What are the issues associated with doing this?

Thanks!
Josh
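For reference, the quantities in question can be extracted as in the sketch below. It is a minimal illustration on simulated data and makes assumptions the post does not state: that each yearly fraction is a count out of a known total (here a hypothetical 200 units per year) and that year enters as a linear term. It shows how to obtain the residual deviance, not whether it is a sound measure of variability.

```r
set.seed(42)

# Simulated stand-in for one population: 35 years, fraction = occupied / trials
years    <- 1:35
trials   <- rep(200, length(years))      # hypothetical denominator per year
p_true   <- plogis(-0.5 + 0.03 * years)  # assumed underlying trend
occupied <- rbinom(length(years), size = trials, prob = p_true)
frac     <- occupied / trials

# Binomial GLM with logit link, response given as successes and failures
fit <- glm(cbind(occupied, trials - occupied) ~ years,
           family = binomial(link = "logit"))

# Equivalent form using the fractions directly, with the totals as weights
fit_frac <- glm(frac ~ years, weights = trials,
                family = binomial(link = "logit"))

res_dev    <- deviance(fit)      # residual deviance
res_df     <- df.residual(fit)   # residual degrees of freedom
dispersion <- res_dev / res_df   # rough overdispersion ratio

print(c(deviance = res_dev, df = res_df, ratio = dispersion))
```

One thing the sketch makes visible is that the binomial deviance is computed from the yearly totals as well as the fractions, so populations monitored with different totals would not give directly comparable deviances. Dividing by the residual degrees of freedom gives a ratio that is sometimes used as a rough indication of extra-binomial variation.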