Re: [R-sig-eco] Cosinor with data that trend over time
Thanks Gavin, Your suggestion seems promising. I don't think I'll do the derivatives analysis at this time so no hurry on those codes. New question. I am wondering if there are additional considerations when making multiple comparasons with a gamm model. I have been testing a large number of parameters to see which of say 20 environmental variables are seasonal. I would expect a number of these variables to be seasonal but based on this conversation http://stats.stackexchange.com/questions/49052/are-splines-overfitting-the-data I am nervous about the variety of possible spline fits somehow compounding the problem of testing multiple variables for seasonality. Any thoughts? -Jacob On Wed, Mar 26, 2014 at 2:34 PM, Gavin Simpson wrote: > Sorry about the errors (typos, not syntax errors) - I was forgetting > that you'd need to use `gamm()` and hence access the `$gam` component > > I don't follow the point about a factor trending up or down. You > shouldn't try to use the `$lme` part of the model for this. > `summary(mod$gam)` should be sufficient, but as it relates to a > spline, this is more a test of whether the spline is different from a > horizontal, flat, null line. The problem with splines is that the > trend need not just be "trending up or down". In the past, to convey > where change in the trend occurs I have used the first derivative of > the fitted spline and looked for where in time the 95% confidence > interval on the first derivative of the spline doesn't include zero; > that shows the regions in time where the trend is significantly > increasing or decreasing. I cover how to do this in a blog post I > wrote: > > > http://www.fromthebottomoftheheap.net/2011/06/12/additive-modelling-and-the-hadcrut3v-global-mean-temperature-series/ > > the post contains links to the R code used for the derivatives etc, > though it is a little more complex in the case of a model with a trend > spline and seasonal spline. > > I'm supposed to have updated those codes and the post because several > people have asked me how I do the analysis for models with multiple > spline terms. If you can't get the code to work for your models, ping > me back and I'll try to move that to the top of my TO DO list. > > Note the `Xs(time)Fx1` entries in the `summary(mod$lme)` table refer > to the basis functions that represent the spline or at least to some > part of those basis functions. You can't really make much practical > use out of those values are they relate specifically to way the > penalised regression spline model has been converted into an > equivalent linear mixed effect form. > > HTH > > G > > On 26 March 2014 12:10, Jacob Cram wrote: > > Thanks again Gavin, this works. > > gamm() also models the long term trend with a spline s(Time), which is > > great. I would still like though, to be able to say whether the factor is > > trending up or down over time. Would it be fair to query > > summary(mod$lme)$tTable > > and to look at the p-value and "Value" corresponding Xs(time)Fx1 value to > > identify such a trend? > > > > Also, here are a few syntax corrections on the code provided in the last > > email: > > 1)visual appoach > > plot(mod$gam, pages = 1) > > > > 2) quantitative approach > > pred <- predict(mod$gam, newdata = newdat, type = "terms") > > take <- which(pred[,"s(DoY)"] == max(pred[,"s(DoY)"])) > > or > > take <- as.numeric(which.max(pred[,"s(DoY)"])) > > > > Cheers, > > -Jacob > > > > > > On Wed, Mar 26, 2014 at 9:46 AM, Gavin Simpson > wrote: > >> > >> 1) Visually - unless it actually matters exactly on which day in the > >> year the peak is observed? If visually is OK, just do `plot(mod, pages > >> = 1)` to see the fitted splines on a single page. See `?plot.gam` for > >> more details on the plot method. > >> > >> 2) You could generate some new data to predict upon as follows: > >> > >> newdat <- data.frame(DoY = seq_len(366), time = mean(foo$time)) > >> > >> Then predict for those new data but collect the individual > >> contributions of the spline terms to the predicted value rather than > >> just the final prediction > >> > >> pred <- predict(mod, newdata = newdat, type = "terms") > >> > >> Then find the maximal value of the DoY contribution > >> > >> take <- which(pred$DoY == max(pred$DoY)) > >> newdat[take, , drop = FALSE] > >> > >> You could use > >> > >> take <- which.max(pred$DoY) > >> > >> instead of the `which()` I used, but only if there is a single maximal > >> value. > >> > >> This works because the spline terms in the additive model are just > >> that; additive. Hence because you haven't let the DoY and time splines > >> interact (in the simple model I mentioned, it is more complex if you > >> allow these to interact as you then need to predict DoY for each years > >> worth of time points), you can separate DoY from the other terms. > >> > >> None of the above code has been tested, but was written off top of my > >> head, but should work or at least get you pretty
Re: [R-sig-eco] Cosinor with data that trend over time
Sorry about the errors (typos, not syntax errors) - I was forgetting that you'd need to use `gamm()` and hence access the `$gam` component I don't follow the point about a factor trending up or down. You shouldn't try to use the `$lme` part of the model for this. `summary(mod$gam)` should be sufficient, but as it relates to a spline, this is more a test of whether the spline is different from a horizontal, flat, null line. The problem with splines is that the trend need not just be "trending up or down". In the past, to convey where change in the trend occurs I have used the first derivative of the fitted spline and looked for where in time the 95% confidence interval on the first derivative of the spline doesn't include zero; that shows the regions in time where the trend is significantly increasing or decreasing. I cover how to do this in a blog post I wrote: http://www.fromthebottomoftheheap.net/2011/06/12/additive-modelling-and-the-hadcrut3v-global-mean-temperature-series/ the post contains links to the R code used for the derivatives etc, though it is a little more complex in the case of a model with a trend spline and seasonal spline. I'm supposed to have updated those codes and the post because several people have asked me how I do the analysis for models with multiple spline terms. If you can't get the code to work for your models, ping me back and I'll try to move that to the top of my TO DO list. Note the `Xs(time)Fx1` entries in the `summary(mod$lme)` table refer to the basis functions that represent the spline or at least to some part of those basis functions. You can't really make much practical use out of those values are they relate specifically to way the penalised regression spline model has been converted into an equivalent linear mixed effect form. HTH G On 26 March 2014 12:10, Jacob Cram wrote: > Thanks again Gavin, this works. > gamm() also models the long term trend with a spline s(Time), which is > great. I would still like though, to be able to say whether the factor is > trending up or down over time. Would it be fair to query > summary(mod$lme)$tTable > and to look at the p-value and "Value" corresponding Xs(time)Fx1 value to > identify such a trend? > > Also, here are a few syntax corrections on the code provided in the last > email: > 1)visual appoach > plot(mod$gam, pages = 1) > > 2) quantitative approach > pred <- predict(mod$gam, newdata = newdat, type = "terms") > take <- which(pred[,"s(DoY)"] == max(pred[,"s(DoY)"])) > or > take <- as.numeric(which.max(pred[,"s(DoY)"])) > > Cheers, > -Jacob > > > On Wed, Mar 26, 2014 at 9:46 AM, Gavin Simpson wrote: >> >> 1) Visually - unless it actually matters exactly on which day in the >> year the peak is observed? If visually is OK, just do `plot(mod, pages >> = 1)` to see the fitted splines on a single page. See `?plot.gam` for >> more details on the plot method. >> >> 2) You could generate some new data to predict upon as follows: >> >> newdat <- data.frame(DoY = seq_len(366), time = mean(foo$time)) >> >> Then predict for those new data but collect the individual >> contributions of the spline terms to the predicted value rather than >> just the final prediction >> >> pred <- predict(mod, newdata = newdat, type = "terms") >> >> Then find the maximal value of the DoY contribution >> >> take <- which(pred$DoY == max(pred$DoY)) >> newdat[take, , drop = FALSE] >> >> You could use >> >> take <- which.max(pred$DoY) >> >> instead of the `which()` I used, but only if there is a single maximal >> value. >> >> This works because the spline terms in the additive model are just >> that; additive. Hence because you haven't let the DoY and time splines >> interact (in the simple model I mentioned, it is more complex if you >> allow these to interact as you then need to predict DoY for each years >> worth of time points), you can separate DoY from the other terms. >> >> None of the above code has been tested, but was written off top of my >> head, but should work or at least get you pretty close to something >> that works. >> >> HTH >> >> G >> >> On 26 March 2014 10:02, Jacob Cram wrote: >> > Thanks Gavin, >> > This seems like a promising approach and a first pass suggests it >> > works >> > with this data. I can't quite figure out how I would go about >> > interrogating >> > the fitted spline to deterine when the peak value happens with respect >> > to >> > DoY. Any suggestions? >> > -Jacob >> > >> > >> > On Tue, Mar 25, 2014 at 9:06 PM, Gavin Simpson >> > wrote: >> >> >> >> I would probably attack this using a GAM modified to model the >> >> residuals as a stochastic time series process. >> >> >> >> For example >> >> >> >> require("mgcv") >> >> mod <- gamm(y ~ s(DoY, bs = "cc") + s(time), data = foo, >> >> correlation = corCAR1(form = ~ time)) >> >> >> >> where `foo` is your data frame, `DoY` is a variable in the data frame >> >> computed as `as.numeric(strftime(RDate, format = "%j")
Re: [R-sig-eco] Cosinor with data that trend over time
Thanks again Gavin, this works. gamm() also models the long term trend with a spline s(Time), which is great. I would still like though, to be able to say whether the factor is trending up or down over time. Would it be fair to query summary(mod$lme)$tTable and to look at the p-value and "Value" corresponding Xs(time)Fx1 value to identify such a trend? Also, here are a few syntax corrections on the code provided in the last email: 1)visual appoach plot(mod$gam, pages = 1) 2) quantitative approach pred <- predict(mod$gam, newdata = newdat, type = "terms") take <- which(pred[,"s(DoY)"] == max(pred[,"s(DoY)"])) or take <- as.numeric(which.max(pred[,"s(DoY)"])) Cheers, -Jacob On Wed, Mar 26, 2014 at 9:46 AM, Gavin Simpson wrote: > 1) Visually - unless it actually matters exactly on which day in the > year the peak is observed? If visually is OK, just do `plot(mod, pages > = 1)` to see the fitted splines on a single page. See `?plot.gam` for > more details on the plot method. > > 2) You could generate some new data to predict upon as follows: > > newdat <- data.frame(DoY = seq_len(366), time = mean(foo$time)) > > Then predict for those new data but collect the individual > contributions of the spline terms to the predicted value rather than > just the final prediction > > pred <- predict(mod, newdata = newdat, type = "terms") > > Then find the maximal value of the DoY contribution > > take <- which(pred$DoY == max(pred$DoY)) > newdat[take, , drop = FALSE] > > You could use > > take <- which.max(pred$DoY) > > instead of the `which()` I used, but only if there is a single maximal > value. > > This works because the spline terms in the additive model are just > that; additive. Hence because you haven't let the DoY and time splines > interact (in the simple model I mentioned, it is more complex if you > allow these to interact as you then need to predict DoY for each years > worth of time points), you can separate DoY from the other terms. > > None of the above code has been tested, but was written off top of my > head, but should work or at least get you pretty close to something > that works. > > HTH > > G > > On 26 March 2014 10:02, Jacob Cram wrote: > > Thanks Gavin, > > This seems like a promising approach and a first pass suggests it > works > > with this data. I can't quite figure out how I would go about > interrogating > > the fitted spline to deterine when the peak value happens with respect to > > DoY. Any suggestions? > > -Jacob > > > > > > On Tue, Mar 25, 2014 at 9:06 PM, Gavin Simpson > wrote: > >> > >> I would probably attack this using a GAM modified to model the > >> residuals as a stochastic time series process. > >> > >> For example > >> > >> require("mgcv") > >> mod <- gamm(y ~ s(DoY, bs = "cc") + s(time), data = foo, > >> correlation = corCAR1(form = ~ time)) > >> > >> where `foo` is your data frame, `DoY` is a variable in the data frame > >> computed as `as.numeric(strftime(RDate, format = "%j"))` and `time` is > >> a variable for the passage of time - you could do `as.numeric(RDate)` > >> but the number of days is probably large as we might encounter more > >> problems fitting the model. Instead you might do `as.numeric(RDate) / > >> 1000` say to produce values on a more manageable scale. The `bs = > >> "cc"` bit specifies a cyclic spline applicable to data measured > >> throughout a year. You may want to fix the start and end knots to be > >> days 1 and days 366 respectively, say via `knots = list(DoY = > >> c(0,366))` as an argument to `gam()` [I think I have this right, > >> specifying the boundary knots, but let me know if you get an error > >> about the number of knots]. The residuals are said to follow a > >> continuois time AR(1), the irregular-spaced counter part to the AR(1), > >> plus random noise. > >> > >> There may be identifiability issues as the `s(time)` and `corCAR1()` > >> compete to explain the fine-scale variation. If you hit such a case, > >> you can make an educated guess as to the wiggliness (degrees of > >> freedom) for the smooth terms based on a plot of the data and fix the > >> splines at those values via argument `k = x` and `fx = TRUE`, where > >> `x` in `k = x` is some integer value. Both these go in as arguments to > >> the `s()` functions. If the trend is not very non-linear you can use a > >> low value 1-3 here for x and for the DoY term say 3-4 might be > >> applicable. > >> > >> There are other ways to approach this problem of identifiability, but > >> that would require more time/space here, which I can go into via a > >> follow-up if needed. > >> > >> You can interrogate the fitted splines to see when the peak value of > >> the `DoY` term is in the year. > >> > >> You can also allow the seasonal signal to vary in time with the trend > >> by allowing the splines to "interact" in a 2d-tensor product spline. > >> Using `te(DoY, time, bs = c("cc","cr"))` instead of the two `s()` > >> terms (or using
Re: [R-sig-eco] Cosinor with data that trend over time
1) Visually - unless it actually matters exactly on which day in the year the peak is observed? If visually is OK, just do `plot(mod, pages = 1)` to see the fitted splines on a single page. See `?plot.gam` for more details on the plot method. 2) You could generate some new data to predict upon as follows: newdat <- data.frame(DoY = seq_len(366), time = mean(foo$time)) Then predict for those new data but collect the individual contributions of the spline terms to the predicted value rather than just the final prediction pred <- predict(mod, newdata = newdat, type = "terms") Then find the maximal value of the DoY contribution take <- which(pred$DoY == max(pred$DoY)) newdat[take, , drop = FALSE] You could use take <- which.max(pred$DoY) instead of the `which()` I used, but only if there is a single maximal value. This works because the spline terms in the additive model are just that; additive. Hence because you haven't let the DoY and time splines interact (in the simple model I mentioned, it is more complex if you allow these to interact as you then need to predict DoY for each years worth of time points), you can separate DoY from the other terms. None of the above code has been tested, but was written off top of my head, but should work or at least get you pretty close to something that works. HTH G On 26 March 2014 10:02, Jacob Cram wrote: > Thanks Gavin, > This seems like a promising approach and a first pass suggests it works > with this data. I can't quite figure out how I would go about interrogating > the fitted spline to deterine when the peak value happens with respect to > DoY. Any suggestions? > -Jacob > > > On Tue, Mar 25, 2014 at 9:06 PM, Gavin Simpson wrote: >> >> I would probably attack this using a GAM modified to model the >> residuals as a stochastic time series process. >> >> For example >> >> require("mgcv") >> mod <- gamm(y ~ s(DoY, bs = "cc") + s(time), data = foo, >> correlation = corCAR1(form = ~ time)) >> >> where `foo` is your data frame, `DoY` is a variable in the data frame >> computed as `as.numeric(strftime(RDate, format = "%j"))` and `time` is >> a variable for the passage of time - you could do `as.numeric(RDate)` >> but the number of days is probably large as we might encounter more >> problems fitting the model. Instead you might do `as.numeric(RDate) / >> 1000` say to produce values on a more manageable scale. The `bs = >> "cc"` bit specifies a cyclic spline applicable to data measured >> throughout a year. You may want to fix the start and end knots to be >> days 1 and days 366 respectively, say via `knots = list(DoY = >> c(0,366))` as an argument to `gam()` [I think I have this right, >> specifying the boundary knots, but let me know if you get an error >> about the number of knots]. The residuals are said to follow a >> continuois time AR(1), the irregular-spaced counter part to the AR(1), >> plus random noise. >> >> There may be identifiability issues as the `s(time)` and `corCAR1()` >> compete to explain the fine-scale variation. If you hit such a case, >> you can make an educated guess as to the wiggliness (degrees of >> freedom) for the smooth terms based on a plot of the data and fix the >> splines at those values via argument `k = x` and `fx = TRUE`, where >> `x` in `k = x` is some integer value. Both these go in as arguments to >> the `s()` functions. If the trend is not very non-linear you can use a >> low value 1-3 here for x and for the DoY term say 3-4 might be >> applicable. >> >> There are other ways to approach this problem of identifiability, but >> that would require more time/space here, which I can go into via a >> follow-up if needed. >> >> You can interrogate the fitted splines to see when the peak value of >> the `DoY` term is in the year. >> >> You can also allow the seasonal signal to vary in time with the trend >> by allowing the splines to "interact" in a 2d-tensor product spline. >> Using `te(DoY, time, bs = c("cc","cr"))` instead of the two `s()` >> terms (or using `ti()` terms for the two "marginal" splines and the >> 2-d spline). Again you can add in the `k` = c(x,y), fx = TRUE)` to the >> `te()` term where `x` and `y` are the dfs for each dimension in the >> `te()` term. It is a bit more complex to do this for `ti()` terms. >> >> Part of the reason to prefer a spline for DoY for the seasonal term is >> that one might not expect the seasonal cycle to be a symmetric cycle >> as a cos/sin terms would imply. >> >> A recent ecological paper describing a similar approach (though using >> different package in R) is that of Claire Ferguson and colleagues in J >> Applied Ecology (2008) http://doi.org/10./j.1365-2664.2007.01428.x >> (freely available). >> >> HTH >> >> G >> >> On 25 March 2014 19:14, Jacob Cram wrote: >> > Hello all, >> > I am thinking about applying season::cosinor() analysis to some >> > irregularely spaced time series data. The data are unevenly spaced, s
Re: [R-sig-eco] Cosinor with data that trend over time
Thanks Gavin, This seems like a promising approach and a first pass suggests it works with this data. I can't quite figure out how I would go about interrogating the fitted spline to deterine when the peak value happens with respect to DoY. Any suggestions? -Jacob On Tue, Mar 25, 2014 at 9:06 PM, Gavin Simpson wrote: > I would probably attack this using a GAM modified to model the > residuals as a stochastic time series process. > > For example > > require("mgcv") > mod <- gamm(y ~ s(DoY, bs = "cc") + s(time), data = foo, > correlation = corCAR1(form = ~ time)) > > where `foo` is your data frame, `DoY` is a variable in the data frame > computed as `as.numeric(strftime(RDate, format = "%j"))` and `time` is > a variable for the passage of time - you could do `as.numeric(RDate)` > but the number of days is probably large as we might encounter more > problems fitting the model. Instead you might do `as.numeric(RDate) / > 1000` say to produce values on a more manageable scale. The `bs = > "cc"` bit specifies a cyclic spline applicable to data measured > throughout a year. You may want to fix the start and end knots to be > days 1 and days 366 respectively, say via `knots = list(DoY = > c(0,366))` as an argument to `gam()` [I think I have this right, > specifying the boundary knots, but let me know if you get an error > about the number of knots]. The residuals are said to follow a > continuois time AR(1), the irregular-spaced counter part to the AR(1), > plus random noise. > > There may be identifiability issues as the `s(time)` and `corCAR1()` > compete to explain the fine-scale variation. If you hit such a case, > you can make an educated guess as to the wiggliness (degrees of > freedom) for the smooth terms based on a plot of the data and fix the > splines at those values via argument `k = x` and `fx = TRUE`, where > `x` in `k = x` is some integer value. Both these go in as arguments to > the `s()` functions. If the trend is not very non-linear you can use a > low value 1-3 here for x and for the DoY term say 3-4 might be > applicable. > > There are other ways to approach this problem of identifiability, but > that would require more time/space here, which I can go into via a > follow-up if needed. > > You can interrogate the fitted splines to see when the peak value of > the `DoY` term is in the year. > > You can also allow the seasonal signal to vary in time with the trend > by allowing the splines to "interact" in a 2d-tensor product spline. > Using `te(DoY, time, bs = c("cc","cr"))` instead of the two `s()` > terms (or using `ti()` terms for the two "marginal" splines and the > 2-d spline). Again you can add in the `k` = c(x,y), fx = TRUE)` to the > `te()` term where `x` and `y` are the dfs for each dimension in the > `te()` term. It is a bit more complex to do this for `ti()` terms. > > Part of the reason to prefer a spline for DoY for the seasonal term is > that one might not expect the seasonal cycle to be a symmetric cycle > as a cos/sin terms would imply. > > A recent ecological paper describing a similar approach (though using > different package in R) is that of Claire Ferguson and colleagues in J > Applied Ecology (2008) http://doi.org/10./j.1365-2664.2007.01428.x > (freely available). > > HTH > > G > > On 25 March 2014 19:14, Jacob Cram wrote: > > Hello all, > > I am thinking about applying season::cosinor() analysis to some > > irregularely spaced time series data. The data are unevenly spaced, so > > usual time series methods, as well as the nscosinor() function are out. > My > > data do however trend over time and I am wondering if I can feed date as > a > > variable into my cosinor analyis. In the example below, then I'd > conclude > > then that the abundances are seasonal, with maximal abundance in mid June > > and furthermore, they are generally decreasing over time. > > Can I use both time variables together like this? If not, is there some > > better approach I should take? > > Thanks in advance, > > -Jacob > > > > For context lAbundance is logg-odds transformed abundance data of a > > microbial species in a given location over time. RDate is the date the > > sample was collected in the r date format. > > > > > >> res <- cosinor(lAbundance ~ RDate, date = "RDate", data = lldata) > > > >>summary(res) > > Cosinor test > > Number of observations = 62 > > Amplitude = 0.58 > > Phase: Month = June , day = 14 > > Low point: Month = December , day = 14 > > Significant seasonality based on adjusted significance level of 0.025 = > > TRUE > > > >>summary(res$glm) > > > > > > Call: > > glm(formula = f, family = family, data = data, offset = offset) > > > > Deviance Residuals: > > Min 1Q Median 3Q Max > > -2.3476 -0.6463 0.1519 0.6574 1.9618 > > > > Coefficients: > > Estimate Std. Error t value Pr(>|t|) > > (Intercept) 0.0115693 1.8963070 0.006 0.99515 > > RDate -0.0003203 0.0001393 -2.299 0.0251
Re: [R-sig-eco] Cosinor with data that trend over time
I would probably attack this using a GAM modified to model the residuals as a stochastic time series process. For example require("mgcv") mod <- gamm(y ~ s(DoY, bs = "cc") + s(time), data = foo, correlation = corCAR1(form = ~ time)) where `foo` is your data frame, `DoY` is a variable in the data frame computed as `as.numeric(strftime(RDate, format = "%j"))` and `time` is a variable for the passage of time - you could do `as.numeric(RDate)` but the number of days is probably large as we might encounter more problems fitting the model. Instead you might do `as.numeric(RDate) / 1000` say to produce values on a more manageable scale. The `bs = "cc"` bit specifies a cyclic spline applicable to data measured throughout a year. You may want to fix the start and end knots to be days 1 and days 366 respectively, say via `knots = list(DoY = c(0,366))` as an argument to `gam()` [I think I have this right, specifying the boundary knots, but let me know if you get an error about the number of knots]. The residuals are said to follow a continuois time AR(1), the irregular-spaced counter part to the AR(1), plus random noise. There may be identifiability issues as the `s(time)` and `corCAR1()` compete to explain the fine-scale variation. If you hit such a case, you can make an educated guess as to the wiggliness (degrees of freedom) for the smooth terms based on a plot of the data and fix the splines at those values via argument `k = x` and `fx = TRUE`, where `x` in `k = x` is some integer value. Both these go in as arguments to the `s()` functions. If the trend is not very non-linear you can use a low value 1-3 here for x and for the DoY term say 3-4 might be applicable. There are other ways to approach this problem of identifiability, but that would require more time/space here, which I can go into via a follow-up if needed. You can interrogate the fitted splines to see when the peak value of the `DoY` term is in the year. You can also allow the seasonal signal to vary in time with the trend by allowing the splines to "interact" in a 2d-tensor product spline. Using `te(DoY, time, bs = c("cc","cr"))` instead of the two `s()` terms (or using `ti()` terms for the two "marginal" splines and the 2-d spline). Again you can add in the `k` = c(x,y), fx = TRUE)` to the `te()` term where `x` and `y` are the dfs for each dimension in the `te()` term. It is a bit more complex to do this for `ti()` terms. Part of the reason to prefer a spline for DoY for the seasonal term is that one might not expect the seasonal cycle to be a symmetric cycle as a cos/sin terms would imply. A recent ecological paper describing a similar approach (though using different package in R) is that of Claire Ferguson and colleagues in J Applied Ecology (2008) http://doi.org/10./j.1365-2664.2007.01428.x (freely available). HTH G On 25 March 2014 19:14, Jacob Cram wrote: > Hello all, > I am thinking about applying season::cosinor() analysis to some > irregularely spaced time series data. The data are unevenly spaced, so > usual time series methods, as well as the nscosinor() function are out. My > data do however trend over time and I am wondering if I can feed date as a > variable into my cosinor analyis. In the example below, then I'd conclude > then that the abundances are seasonal, with maximal abundance in mid June > and furthermore, they are generally decreasing over time. > Can I use both time variables together like this? If not, is there some > better approach I should take? > Thanks in advance, > -Jacob > > For context lAbundance is logg-odds transformed abundance data of a > microbial species in a given location over time. RDate is the date the > sample was collected in the r date format. > > >> res <- cosinor(lAbundance ~ RDate, date = "RDate", data = lldata) > >>summary(res) > Cosinor test > Number of observations = 62 > Amplitude = 0.58 > Phase: Month = June , day = 14 > Low point: Month = December , day = 14 > Significant seasonality based on adjusted significance level of 0.025 = > TRUE > >>summary(res$glm) > > > Call: > glm(formula = f, family = family, data = data, offset = offset) > > Deviance Residuals: > Min 1Q Median 3Q Max > -2.3476 -0.6463 0.1519 0.6574 1.9618 > > Coefficients: > Estimate Std. Error t value Pr(>|t|) > (Intercept) 0.0115693 1.8963070 0.006 0.99515 > RDate -0.0003203 0.0001393 -2.299 0.02514 * > cosw-0.5516458 0.1837344 -3.002 0.00395 ** > sinw 0.1762904 0.1700670 1.037 0.30423 > --- > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > (Dispersion parameter for gaussian family taken to be 0.9458339) > > Null deviance: 70.759 on 61 degrees of freedom > Residual deviance: 54.858 on 58 degrees of freedom > AIC: 178.36 > > Number of Fisher Scoring iterations: 2 > > > llldata as a csv below > "RDate","lAbundance" > 2003-03-12,-3.3330699059335 > 2003-05-21,-3.04104625745886
[R-sig-eco] Cosinor with data that trend over time
Hello all, I am thinking about applying season::cosinor() analysis to some irregularely spaced time series data. The data are unevenly spaced, so usual time series methods, as well as the nscosinor() function are out. My data do however trend over time and I am wondering if I can feed date as a variable into my cosinor analyis. In the example below, then I'd conclude then that the abundances are seasonal, with maximal abundance in mid June and furthermore, they are generally decreasing over time. Can I use both time variables together like this? If not, is there some better approach I should take? Thanks in advance, -Jacob For context lAbundance is logg-odds transformed abundance data of a microbial species in a given location over time. RDate is the date the sample was collected in the r date format. > res <- cosinor(lAbundance ~ RDate, date = "RDate", data = lldata) >summary(res) Cosinor test Number of observations = 62 Amplitude = 0.58 Phase: Month = June , day = 14 Low point: Month = December , day = 14 Significant seasonality based on adjusted significance level of 0.025 = TRUE >summary(res$glm) Call: glm(formula = f, family = family, data = data, offset = offset) Deviance Residuals: Min 1Q Median 3Q Max -2.3476 -0.6463 0.1519 0.6574 1.9618 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0.0115693 1.8963070 0.006 0.99515 RDate -0.0003203 0.0001393 -2.299 0.02514 * cosw-0.5516458 0.1837344 -3.002 0.00395 ** sinw 0.1762904 0.1700670 1.037 0.30423 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 (Dispersion parameter for gaussian family taken to be 0.9458339) Null deviance: 70.759 on 61 degrees of freedom Residual deviance: 54.858 on 58 degrees of freedom AIC: 178.36 Number of Fisher Scoring iterations: 2 llldata as a csv below "RDate","lAbundance" 2003-03-12,-3.3330699059335 2003-05-21,-3.04104625745886 2003-06-17,-3.04734680029566 2003-07-02,-4.18791034708572 2003-09-18,-3.04419201802053 2003-10-22,-3.13805060873929 2004-02-19,-3.80688269144794 2004-03-17,-4.50755507726145 2004-04-22,-4.38846502542992 2004-05-19,-3.06618649442674 2004-06-17,-5.20518774876304 2004-07-14,-3.75041853151097 2004-08-25,-3.67882486716196 2004-09-22,-5.22205827512234 2004-10-14,-3.99297508670535 2004-11-17,-4.68793287601157 2004-12-15,-4.31712380781011 2005-02-16,-4.30893550479904 2005-03-16,-4.05781773988454 2005-05-11,-3.94746237402035 2005-07-19,-4.91195185391358 2005-08-17,-4.93590576323119 2005-09-15,-4.85820800095518 2005-10-20,-5.22956391101343 2005-12-13,-5.12244047315448 2006-01-18,-3.04854660925046 2006-02-22,-6.77145858348375 2006-03-29,-4.33151493849021 2006-04-19,-3.36152357710535 2006-06-20,-3.09071584142593 2006-07-25,-3.31430484483825 2006-08-24,-3.09974933041469 2006-09-13,-3.33288992218458 2007-12-17,-4.19942661980677 2008-03-19,-3.86146499633625 2008-04-22,-3.36161599919095 2008-05-14,-4.30878307213324 2008-06-18,-3.74372448768828 2008-07-09,-4.65951429661651 2008-08-20,-5.35984647704619 2008-09-22,-4.78481898261137 2008-10-20,-3.58588161980965 2008-11-20,-3.10625125552057 2009-02-18,-6.90675477864855 2009-03-11,-3.43446932013368 2009-04-23,-3.82688066341466 2009-05-13,-4.44885332005661 2009-06-18,-3.97671552612412 2009-07-09,-3.40185954003936 2009-08-19,-3.44958231694091 2009-09-24,-3.86508161094726 2010-01-28,-4.95281587967569 2010-02-11,-3.78064756876257 2010-03-24,-3.5823501176064 2010-04-27,-4.3363571587 2010-05-17,-3.90545735473055 2010-07-21,-3.3147176517321 2010-08-11,-4.53218360860017 2010-10-21,-6.90675477864855 2010-11-23,-6.90675477864855 2010-12-16,-6.75158176094352 2011-01-11,-6.90675477864855 [[alternative HTML version deleted]] ___ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology