Re: [R] Question on CCA and RDA analysis
You didn't look that hard... Start with the Environmetrics Task View: http://cran.r-project.org/web/views/Environmetrics.html That might point you to the **vegan** pkg: http://cran.r-project.org/web/packages/vegan/index.html And that has a vignette on ordination which you might find useful. I have been wondering if a blog post on such basic topics might be worthwhile. Your email suggests perhaps it is. Once I'm done teaching I will try to find some time to write some posts on these basic analyses. HTH G On 10 April 2015 at 20:38, Luis Fernando García luysgar...@gmail.com wrote: Yeah, The most useful example I found was this. https://gist.github.com/perrygeo/7572735. I always had the idea of this kind of forums was to provide sources not so obvious in the web. If you have something better it would be great. 2015-04-10 18:36 GMT-03:00 Ben Bolker bbol...@gmail.com: Luis Fernando García luysgarcia at gmail.com writes: Dear R experts, I wanted to know if you can suggest me any website or tutorial just to learn about how to make a RDA or CDA in R Thanks in advance! I hate to ask, but did you try Googling canonical correspondence analysis R ... ? __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gavin Simpson, PhD [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] package mgcv - predict with bam: Error in X[ind, ] :, subscript out of bounds
The two distributions are different. The random effect is assumed to be a Gaussian random variable, just as it is with the GLMMs in the lme4 package. It is fine to use such a random effect within a GAM with a non-Gaussian error distribution, like the ones you describe using. HTH Gavin On 3 February 2014 15:00, William Shadish wshad...@ucmerced.edu wrote: Dear Simon, your note below says bs=re specifies a Gaussian random effect . I have been using bs = re for data modeled with Poisson and binomial distributions, or variants thereof (e.g., quasi-Poisson). Have I erred in assuming bs =re can be used to obtain random effects for such data? Will Shadish - Actually this is ok. mgcv exploits the duality between quadratically penalized smooths and Gaussian random effects to allow random effects to be specified this way. bs=re specifies a Gaussian random effect with corresponding model matrix given by model.matrix(~site-1). (More generally s(x,y,z,bs=re) specifies a gaussian random effect with model matrix given by model.matrix(~x:y:z-1), with obvious generalization to more or fewer variables). See mgcv help file ?random.effects for more. best, Simon -- William R. Shadish Distinguished Professor Founding Faculty Mailing Address: William R. Shadish University of California School of Social Sciences, Humanities and Arts 5200 North Lake Rd Merced CA 95343 Physical/Delivery Address: University of California Merced ATTN: William Shadish School of Social Sciences, Humanities and Arts Facilities Services Building A 5200 North Lake Rd. Merced, CA 95343 209-228-4372 voice 209-228-4007 fax (communal fax: be sure to include cover sheet) wshad...@ucmerced.edu http://faculty.ucmerced.edu/wshadish/index.htm http://psychology.ucmerced.edu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gavin Simpson, PhD __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What computer power for GAMM models
1) The model is far too complex. I have trouble imagining a system where the trend potentially uses 49 degrees of freedom and yet the unexplained variance within years follows an AR(7). It looks like you have time points within years over many years; is that why you are trying to use such a complex trend, to capture the repeated seasonal patterns? If so, break that down and fit a spline using `bs = cc` (or one of the other cyclic splines in mgcv) and a separate spline for the trend. If you want to allow the seasonal pattern to vary with time, then you can do that with multivariate splines, eg, this way you can compare a model 1 m1 - gamm(yi ~ ti(DayOfYear, bs = cc) + ti(Time, bs = cr), ) with model 2 m2 - gamm(yi ~ ti(DayOfYear, bs = cc) + ti(Time, bs = cr) + ti(DayOfYear, Time, bs = c(cc,cr)), ) Of course, you'll need to fix the the ARAM structure at know coefficient (currently you are estimating it) or you aren't really just comparing the fixed part of the models. 2) You aren't heeding Simon's warnings about fiddling with lmeControl. You don't want EM iterations, so you need at least to have `lmeControl(niterEM = 0)`, alongside other options A general word of advice with using gamm(), start simply and make sure you have lots of RAM. These models are very complex so if you start with simple models you will be less likely to run into convergence issues and you get results more quickly. Get lots of RAM because models with `correlation` arguments involve inverting potentially big covariance matrices and that can quite quickly chew through RAM. I've had pretty good success fitting models with non-linear trends etc to environmental time series using gamm(), but the only time things went a bit awry was when too complex a model was required (where I had data covering hundreds of thousands of years from an ocean core and the trend really was very wiggly, and then the correlation structure and the trend fought for control as there was an identifiability issue). HTH Gavin On 25 October 2013 07:19, Sylvie Martin smartin_se...@orange.fr wrote: Hello, I use GAMM function (MGCV package) in R software to study relationship between pollen and pollinosis. My models include autoregressive terms. Here is an exemple: CupAR7_2lt;-gamm(nb_rca_2~s(Trend,bs=ps,k=49) +Semaine nbsp;nbsp;nbsp; nbsp;nbsp;nbsp; nbsp;nbsp;nbsp; +s(TX03,bs=ps,k=8) +s(UN03,bs=ps,k=4)nbsp; ..+pollen01 ,correlation=corARMA(p=7,q=0,form=~Trend|Annees), niter=40,control=lmeControl(maxIter=10,msMaxIter=10) nbsp;nbsp;nbsp; nbsp;nbsp;nbsp; ,family=quasipoisson, data=donnees, na=na.omit) Each model need 1/2 hour to 1 hour to get a result and I frequently get the following error messages: Erreur dans lme.formula(fixed = fixed, random = random, data = data, correlation = correlation,nbsp; : nbsp; nlminb problem, convergence error code = 1 nbsp; message = singular convergence (7) Erreur dans na.fail.default(list(Xr.4 = c(0.00374377951791214, 0.00438373736920182,nbsp; : nbsp; missing values in object Does my computer lack power,(if so,what is required?) or is Rnbsp; limited in speed of execution or is my syntax not ok. Here are the features of my computer: processor: 3.10 GHznbsp; i5-2400 CPU RAM: 4 Go (3.87 Go) OS: 64 bits Windows 7 professionnel (Pack 1) Another question: how can we add a smoothing parameter in the model GAMM to help convergence (sp = is not accepted)? Thanks for your help Best regards Sylvie Martin SEPIA-Santé Bureau d'études en épidémiologie, biostatistiques, environnement 31 rue de Pontivy 56 150 BAUD FRANCE tél : 02 97 28 80 38 fax: 02 97 28 81 10 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gavin Simpson, PhD __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Assessing temporal correlation in GAM with irregular time steps
On 3 September 2013 16:10, Worthington, Thomas A thomas.worthing...@okstate.edu wrote: Dear Gavin Thank you for the very detailed response. I had started to go down the route of fitting a correlation structure via gamm. I tried applying your code to my data but returned the error Error in corCAR1(~ID | SiteCode1971) : parameter in CAR(1) structure must be between 0 and 1 Sorry, that is my fault, I keep forgetting that you need to specify the formula argument, the first argument of corCAR1() is the value of the correlation parameter if you want to specify it. So try: corCAR1(form = ~ID | SiteCode1971) I do this (get that error) all the time myself. I set the 'bar' in your code to the sample ID (basically a number between 1 and 192) but I wasn't sure if this was what you meant in relation to 'ordering of the samples' That is not that useful as you need to give the software something about when the samples occur in time, otherwise it doesn't have the information needed to properly model the decay in correlation with time. You need to give it the observation time, however you measured it. HTH G Best wishes Tom -Original Message- From: Gavin Simpson [mailto:ucfa...@gmail.com] Sent: Tuesday, September 03, 2013 3:17 PM To: Worthington, Thomas A Cc: r-help@r-project.org Subject: Re: [R] Assessing temporal correlation in GAM with irregular time steps It is possible, but you can't use the discrete time or classical stochastic trend models (or evaluate using the ACF). Also, why do you care to do this with regard to DoY? The assumption of the model relates to the residuals, so you should check those for residual autocorrelation. As you are using `mgcv::gam` you could also use `mgcv::gamm` which can then leverage the correlation structures from the nlme package, which has spatial correlation structures (and you can think of time as a 1-d spatial direction). The package also has a `corCAR1()` correlation structure which is the continuous-time analogue of the AR(1). Fitting via `gamm()` will also allow you to use the `Variogram()` function from the nlme package to assess the model residuals for residual autocorrelation. For example you could compare the two fits m0 - gamm(Length ~ s(DOY, by = SiteCode) + SiteCode, data = foo, method = REML) m1 - gamm(Length ~ s(DOY, by = SiteCode) + SiteCode, data = foo, method = REML, correlation = corCAR1( ~ bar | SiteCode)) where `foo` is the object that contains the variables mentioned in the call, and `bar` is the variable (in `foo)` that indicates the ordering of the samples. Notice that I nest the CAR(1) within the two respective Sites, but do note IIRC that this fits the same residual correlation structure to both sites' residuals (i.e. there is 1 CAR(1) process, not two separate ones). require(nlme) anova(m0$lme, m1$lme) will perform a likelihood ratio test on the two models. If you have residual autocorrelation, do note that the smooth for DoY may be chosen to be more complex than is appropriate (it might be fitting the autocorrleated noise), so you may want to fix the degrees of freedom for the smoother at some a priori chosen value and use this same value when fitting both m0 and m1, or at the very least set an upper limit on the complexity of the DoY smooth, say via s(DoY, by = SiteCode, k = 5). Finally, as a length = 0 insect makes no sense, the assumption of Gaussian (Normal) errors may be in trouble with your data; apart from their strictly positive nature, the mean-variance relationship of the data may not follow that of the assumptions for the errors. You can move to a GLM (GAM) to account for this but things get very tricky with the correlation structures (you can use gamm() still but fitting then goes via glmmPQL() in the MASS package a thence to lme()). If you just want to fit a variogram to something, there are a large number of spatial packages available for R, several of which can fit variograms to data, though you will need to study their respective help files for how to use them. As for the input data, often the time/date of sampling encoded as a numeric will be sufficient input, but you will need to check individual functions for what they require. I would check out the Spatial Task View on CRAN. HTH G On 28 August 2013 14:26, Worthington, Thomas A thomas.worthing...@okstate.edu wrote: I have constructed a GAM using the package mgcv to test whether the lengths of an emerging insect (Length) varies with day of the year (DOY) and between two sites (SiteCode). The data are collected at irregular time steps ranging from 2 days to 20 days between samples. The GAM takes the form M3 - gam(Length ~s(DOY, by = SiteCode) + SiteCode) As the data are a time series I would like to test for temporal autocorrelation. I have read that it is not possible to use the autocorrelation function (ACF) due
Re: [R] Assessing temporal correlation in GAM with irregular time steps
It is possible, but you can't use the discrete time or classical stochastic trend models (or evaluate using the ACF). Also, why do you care to do this with regard to DoY? The assumption of the model relates to the residuals, so you should check those for residual autocorrelation. As you are using `mgcv::gam` you could also use `mgcv::gamm` which can then leverage the correlation structures from the nlme package, which has spatial correlation structures (and you can think of time as a 1-d spatial direction). The package also has a `corCAR1()` correlation structure which is the continuous-time analogue of the AR(1). Fitting via `gamm()` will also allow you to use the `Variogram()` function from the nlme package to assess the model residuals for residual autocorrelation. For example you could compare the two fits m0 - gamm(Length ~ s(DOY, by = SiteCode) + SiteCode, data = foo, method = REML) m1 - gamm(Length ~ s(DOY, by = SiteCode) + SiteCode, data = foo, method = REML, correlation = corCAR1( ~ bar | SiteCode)) where `foo` is the object that contains the variables mentioned in the call, and `bar` is the variable (in `foo)` that indicates the ordering of the samples. Notice that I nest the CAR(1) within the two respective Sites, but do note IIRC that this fits the same residual correlation structure to both sites' residuals (i.e. there is 1 CAR(1) process, not two separate ones). require(nlme) anova(m0$lme, m1$lme) will perform a likelihood ratio test on the two models. If you have residual autocorrelation, do note that the smooth for DoY may be chosen to be more complex than is appropriate (it might be fitting the autocorrleated noise), so you may want to fix the degrees of freedom for the smoother at some a priori chosen value and use this same value when fitting both m0 and m1, or at the very least set an upper limit on the complexity of the DoY smooth, say via s(DoY, by = SiteCode, k = 5). Finally, as a length = 0 insect makes no sense, the assumption of Gaussian (Normal) errors may be in trouble with your data; apart from their strictly positive nature, the mean-variance relationship of the data may not follow that of the assumptions for the errors. You can move to a GLM (GAM) to account for this but things get very tricky with the correlation structures (you can use gamm() still but fitting then goes via glmmPQL() in the MASS package a thence to lme()). If you just want to fit a variogram to something, there are a large number of spatial packages available for R, several of which can fit variograms to data, though you will need to study their respective help files for how to use them. As for the input data, often the time/date of sampling encoded as a numeric will be sufficient input, but you will need to check individual functions for what they require. I would check out the Spatial Task View on CRAN. HTH G On 28 August 2013 14:26, Worthington, Thomas A thomas.worthing...@okstate.edu wrote: I have constructed a GAM using the package mgcv to test whether the lengths of an emerging insect (Length) varies with day of the year (DOY) and between two sites (SiteCode). The data are collected at irregular time steps ranging from 2 days to 20 days between samples. The GAM takes the form M3 - gam(Length ~s(DOY, by = SiteCode) + SiteCode) As the data are a time series I would like to test for temporal autocorrelation. I have read that it is not possible to use the autocorrelation function (ACF) due to the irregular spacing and that producing a variogram in relation to DOY would be an option. Is this a correct method to test for temporal autocorrelation? And could someone suggest the code to produce the variogram as I'm getting an error related to the 'distance' argument Best wishes Tom [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gavin Simpson, PhD __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Setting up 3D tensor product interactions in mgcv
From a reading of `?ti` It is sometimes useful to investigate smooth models with a main-effects + interactions structure, for example f_1(x) + f_2(z) + f_3(x,z) This functional ANOVA decomposition is supported by ‘ti’ terms, which produce tensor product interactions from which the main effects have been excluded, under the assumption that they will be included separately. For example the ‘~ ti(x) + ti(z) + ti(x,z)’ would produce the above main effects + interaction structure. This is much better than attempting the same thing with ‘s’or ‘te’ terms representing the interactions (although mgcv does not forbid it). I think the following will do what you want: mdl - gam(PA ~ ti(x) + ti(y) + ti(z) + ti(x,y) + ti(x,z) + ti(y,z) + ti(x,y,z), ) HTH G On 23 August 2013 02:05, Mark Payne markpayneatw...@gmail.com wrote: Hi, I am trying to fit a smoothing model where there are three dimensions over which I can smooth (x,y,z). I expect interactions between some, or all, of these terms, and so I have set up my model as mdl - gam(PA ~ s(x) + s(y) + s(z) + te(x,y) + te(x,z) + te(y,z) + te(x,y,z),...) I have recently read about the ti(), tensor product interaction smoother, which takes care of these interaction terms elegantly and does the nesting properly. The help file says This is much better than attempting the same thing with ‘s’or ‘te’ terms representing the interactions (although mgcv does not forbid it). There is a 2D example there also. But I don't understand how I should set this up for my 3D example. Do I simply replace the te's above with ti? Or is there more to it than that? Does anyone have experience with this, and can explain how I should do it properly? Mark __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gavin Simpson, PhD __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multiple interaction terms in GAMM-model
On Thu, 2013-07-25 at 04:56 -0700, Jeroen wrote: Dear all, I am trying to correlate a variable tau1 to a set of other variables (x1, x2, x3, x4), taking into account an interaction with time ('doy') and place ('region'), and taking into account dependency of data in time per object ID. My dataset looks like: snip / Since the data is dependent in time per objectID, I use a GAMM model with an autocorrelation function. Since each variable x1, x2, etc. is dependent on time and place, I should incorporate this as well. Therefore I am wondering if the following gamm-model is correct for my situation: model - gamm( tau1 ~ te( x1, by= doy ) + te( x1, by= factor( region ) ) + ... + te( x4, by= doy ) + te( x4, by= factor( region ) ) + factor( region ), correlation= corAR1(form= ~ doy|objectID ), na.action= na.omit ). Does anyone know if this is ok? The model looks a little odd - the way you've written it you don't need `te()` smooths. For the interaction with doy, I would try te(x1, doy, bs = c(cr, cc)) assuming x1 and doy are continuous. This specifies cubic splines and cyclic cubic splines respectively for x1 and doy. I use a cyclic spline for doy as we might not expect Jan 1st and Dec 31st to be much different. By separating the interaction between x1 and doy and the one for x1 and region you are saying that the way the effect of x1 varies through the year does not change between regions. Is that reasonable? I think you need something for the objectID - a random effect or a fixed effect will account for differences in mean values of taul between objects. A random effect seems most useful otherwise you'll eat into your degrees of freedom, but you have plenty of those. The correlation part assumes that the residuals follow an AR(1) applied within the objects, with each AR(1) having the same coefficient. Autocorrelation decreases exponentially with lag/separation. It is a reasonable start. Fitting will be much quicker without this. Could you try fitting without and then extract the normalised residuals to check for residual correlation? I hope you have a lot of memory and a lot of spare time available; my experience of fitting similarly sized (though not as complex - I had one *very* long series) was that gamm() used large amounts of RAM (I had 16GB and most was used for a single model) and took many days to fit. You may find the random effect for object fights with the AR(1) so you could end up with an odd fit if it fits at all. You are going to want to turn on verbosity when fitting so you can see how the optimisation is proceeding. Something like require(nlme) ## need a control object ctrl - lmeControl(msVerbose = TRUE, maxIter = 400, msMaxIter = 400, niterEM = 0, ## this is VERY important for gamm()! tolerance = 1e-8, msTol = 1e-8, msMaxEval = 400) then add `control = ctrl` to the `gamm()` call. I would approach the expecting the model to fail to fit so be prepared to simplify it etc. Good luck, G Or should I use a model which also includes terms like te( x1 ) + ... + te( x4 ). And is the correlation function correct? Thanks so much!! Jeroen -- View this message in context: http://r.789695.n4.nabble.com/Multiple-interaction-terms-in-GAMM-model-tp4672297.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gavin Simpson, PhD [t] +1 306 337 8863 Adjunct Professor, Department of Biology[f] +1 306 337 2410 Institute of Environmental Change Society [e] gavin.simp...@uregina.ca 523 Research and Innovation Centre [tw] @ucfagls University of Regina Regina, SK S4S 0A2, Canada __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] MGCV: Degrees of freedom of smooth terms
On Tue, 2013-07-23 at 11:16 +0200, Christoph Scherber wrote: Dear all, This is just a quick question regarding degrees of freedom in GAM models fit by MGCV (using select=T): Can I roughly interpret them as: 1 df: linear effect of x on y 2 df: approximately quadratic of x on y 3 df: approximately cubic effect of x on y? Yes, approximately 1 df for a spatial term s(x,y): bilinear effect (?) or how would I call this? A bilinear effect would have two df, no? In a linear regression z ~ x + y would define a plane just like s(x, y) can and would use 2 df. 1 df for the entire s(x,y) implies an additional penalty such that less than 1df is spent in the `x` or `y` dimensions of the spline. And what does ref.df in the summary output mean; is this the unpenalized degrees of freedom for each term? IIRC, these are the dfs that are used in the tests reported. I am not familiar enough with the details to comment more. If you turn on Select = TRUE for example which adds an additional penalty then the ref.df can be much larger than the edf values. You might send an email to Simon Wood (or see if he picks up on this) for a (far) more authoritative answer on this part of your question. HTH G Thank you very much for answering! Best wishes, Christoph -- Gavin Simpson, PhD [t] +1 306 337 8863 Adjunct Professor, Department of Biology[f] +1 306 337 2410 Institute of Environmental Change Society [e] gavin.simp...@uregina.ca 523 Research and Innovation Centre [tw] @ucfagls University of Regina Regina, SK S4S 0A2, Canada __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error with metaMDS
On Mon, 2013-06-24 at 19:33 +0800, Suparna Mitra wrote: H ello R-experts, I want to do ordination plots using vegan metaMDS. I have a where many cells have zero values. snip / I am having two different kind of errors for these two data... Error 1 ord1 - metaMDS( X =bray) That is fundamentally wrong - you are setting argument `X` to the character vector `bray`, that is not how you call metaMDS(). Hence I suspect you didn't bother to include the full code...! Let's assume you cal actually call metaMDS() correctly... The first set of warnings (not the error) come `vegdist()`. The first is from : if (method 2 any(rowSums(x, na.rm = TRUE) == 0)) and hence if you do which(rowSums(X, na.rm = TRUE) == 0) you'll see which rows (samples) have no counts at all. The second warning in the first set comes from if (any(is.na(d))) warning(missing values in results) hence by the time vegdist has computed the dissimilarity, those computation ended up generating one or more `NA` values. Things go downhill from there as the error generated from within metaMDS() is because we are comparing the distances with the smallest representable number and any comparison with `NA` yields `NA` and hence the first Error you see. We should probably catch that but I don't have a reproducible example to see why we don't... Square root transformation Wisconsin double standardization Error in if (any(dist -sqrt(.Machine$double.eps))) warning(some dissimilarities are negative -- is this intentional?) : missing value where TRUE/FALSE needed In addition: Warning messages: 1: In distfun(comm, method = distance, ...) : you have empty rows: their dissimilarities may be meaningless in method bray 2: In distfun(comm, method = distance, ...) : missing values in results Error 2 ord.data= metaMDS(data, distance=bray) Error in if (any(autotransform, noshare 0, wascores) any(comm 0)) { : missing value where TRUE/FALSE needed In addition: Warning message: In Ops.factor(left, right) : not meaningful for factors Look at `str(data)` and if any of the columns are factors, change them to be numeric or find out why R thinks they are factors - it wouldn't normally convert numeric data to a factor when reading the data in. What is `data`? You only refer to `Genus_data`. Please try to be specific about exactly what you did - i.e. include the actual code. I suspect the things you try below are pointless as it is not zero distances that are the problem but sites for which no observations were recorded. G I searched all the details of metaMDS where it is suggested to avail the argument 'zerodist' So I tried both X.dist1 - metaMDSdist(X, method=bray,zerodist = ignore) X.dist2 - metaMDSdist(X, method=bray,zerodist = add) Can Please help me with this. Thanks, Mitra [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gavin Simpson, PhD [t] +1 306 337 8863 Adjunct Professor, Department of Biology[f] +1 306 337 2410 Institute of Environmental Change Society [e] gavin.simp...@uregina.ca 523 Research and Innovation Centre [tw] @ucfagls University of Regina Regina, SK S4S 0A2, Canada __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gamm in mgcv random effect significance
On Tue, 2013-06-11 at 10:08 -0700, William Shadish wrote: Gavin et al., Thanks so much for the help. Unfortunately, the command anova(g1$lme, g2$lme) gives Error in eval(expr, envir, enclos) : object 'fixed' not found This is with mgcv:::gamm yes? Strange - did you load nlme first? I think mgcv now imports from the nlme package so to use its functions/methods explicitly you need to load nlme - before loading mgcv also loaded nlme, but it no longer does so. That *should* work. HTH G and for bam (which is the one that can use a known ar1 term), the error is AR1 parameter rho unused with generalized model Apparently it cannot run for binomial distributions, and presumably also Poisson. I did find a Frequently Asked Questions for package mgcv page that said How can I compare gamm models? In the identity link normal errors case, then AIC and hypotheis testing based methods are fine. Otherwise it is best to work out a strategy based on the summary.gam So putting all this together, I take it that my binomial example will not support a direct model comparison to test the significance of the random effects. I'm guessing the best strategy based on the summary.gam is probably just to compare fit indices like Log Likelihoods. If anyone has any other suggestions, though, please do let me know. Thanks so much. Will Shadish On 6/7/2013 3:02 PM, Gavin Simpson wrote: On Fri, 2013-06-07 at 13:12 -0700, William Shadish wrote: Dear R-helpers, I'd like to understand how to test the statistical significance of a random effect in gamm. I am using gamm because I want to test a model with an AR(1) error structure, and it is my understanding neither gam nor gamm4 will do the latter. gamm4() can't yes and out of the box mgcv::gam can't either but see ?magic for an example of correlated errors and how the fits can be manipulated to take the AR(1) (or any structure really as far as I can tell) into account. You might like to look at mgcv::bam() which allows an known AR(1) term but do check that it does what you think; with a random effect spline I'm not at all certain that it will nest the AR(1) in the random effect level. snip / Consider, for example, two models, both with AR(1) but one allowing a random effect on xc: g1 - gamm(y ~ s(xc) +z+ int,family=binomial, weights=trial, correlation=corAR1()) g2 - gamm(y ~ s(xc) +z+ int,family=binomial, weights=trial, random = list(xc=~1),correlation=corAR1()) Shouldn't you specify how the AR(1) is nested in the hierarchy here, i.e. AR(1) within xc? maybe I'm not following your data structure correctly. I include the output for g1 and g2 below, but the question is how to test the significance of the random effect on xc. I considered a test comparing the Log-Likelihoods, but have no idea what the degrees of freedom would be given that s(xc) is smoothed. I also tried: anova(g1$gam, g2$gam) gamm() fits via the lme() function of package nlme. To do what you want, you need the anova() method for objects of class lme, e.g. anova(g1$lme, g2$lme) Then I think you should check if the fits were done via REML and also be aware of the issue of testing wether a variance term is 0. that did not seem to return anything useful for this question. A related question is how to test the significance of adding a second random effect to a model that already has a random effect, such as: g3 - gamm(y ~ xc +z+ s(int),family=binomial, weights=trial, random = list(Case=~1, z=~1),correlation=corAR1()) g4 - gamm(y ~ xc +z+ s(int),family=binomial, weights=trial, random = list(Case=~1, z=~1, int=~1),correlation=corAR1()) Again, I think you need anova() on the $lme components. HTH G Any help would be appreciated. Thanks. Will Shadish g1 $lme Linear mixed-effects model fit by maximum likelihood Data: data Log-likelihood: -437.696 Fixed: fixed X(Intercept) Xz XintXs(xc)Fx1 0.6738466 -2.56883170.0137415 -0.1801294 Random effects: Formula: ~Xr - 1 | g Structure: pdIdnot Xr1 Xr2 Xr3 Xr4 Xr5 Xr6 Xr7 Xr8 Residual StdDev: 0.0004377781 0.0004377781 0.0004377781 0.0004377781 0.0004377781 0.0004377781 0.0004377781 0.0004377781 1.693177 Correlation Structure: AR(1) Formula: ~1 | g Parameter estimate(s): Phi 0.3110725 Variance function: Structure: fixed weights Formula: ~invwt Number of Observations: 264 Number of Groups: 1 $gam Family: binomial Link function: logit Formula: y ~ s(xc) + z + int Estimated degrees of freedom: 1 total = 4 attr(,class) [1] gamm list g2 $lme Linear mixed-effects model fit by maximum likelihood Data: data
Re: [R] gamm in mgcv random effect significance
On Fri, 2013-06-07 at 13:12 -0700, William Shadish wrote: Dear R-helpers, I'd like to understand how to test the statistical significance of a random effect in gamm. I am using gamm because I want to test a model with an AR(1) error structure, and it is my understanding neither gam nor gamm4 will do the latter. gamm4() can't yes and out of the box mgcv::gam can't either but see ?magic for an example of correlated errors and how the fits can be manipulated to take the AR(1) (or any structure really as far as I can tell) into account. You might like to look at mgcv::bam() which allows an known AR(1) term but do check that it does what you think; with a random effect spline I'm not at all certain that it will nest the AR(1) in the random effect level. snip / Consider, for example, two models, both with AR(1) but one allowing a random effect on xc: g1 - gamm(y ~ s(xc) +z+ int,family=binomial, weights=trial, correlation=corAR1()) g2 - gamm(y ~ s(xc) +z+ int,family=binomial, weights=trial, random = list(xc=~1),correlation=corAR1()) Shouldn't you specify how the AR(1) is nested in the hierarchy here, i.e. AR(1) within xc? maybe I'm not following your data structure correctly. I include the output for g1 and g2 below, but the question is how to test the significance of the random effect on xc. I considered a test comparing the Log-Likelihoods, but have no idea what the degrees of freedom would be given that s(xc) is smoothed. I also tried: anova(g1$gam, g2$gam) gamm() fits via the lme() function of package nlme. To do what you want, you need the anova() method for objects of class lme, e.g. anova(g1$lme, g2$lme) Then I think you should check if the fits were done via REML and also be aware of the issue of testing wether a variance term is 0. that did not seem to return anything useful for this question. A related question is how to test the significance of adding a second random effect to a model that already has a random effect, such as: g3 - gamm(y ~ xc +z+ s(int),family=binomial, weights=trial, random = list(Case=~1, z=~1),correlation=corAR1()) g4 - gamm(y ~ xc +z+ s(int),family=binomial, weights=trial, random = list(Case=~1, z=~1, int=~1),correlation=corAR1()) Again, I think you need anova() on the $lme components. HTH G Any help would be appreciated. Thanks. Will Shadish g1 $lme Linear mixed-effects model fit by maximum likelihood Data: data Log-likelihood: -437.696 Fixed: fixed X(Intercept) Xz XintXs(xc)Fx1 0.6738466 -2.56883170.0137415 -0.1801294 Random effects: Formula: ~Xr - 1 | g Structure: pdIdnot Xr1 Xr2 Xr3 Xr4 Xr5 Xr6 Xr7 Xr8 Residual StdDev: 0.0004377781 0.0004377781 0.0004377781 0.0004377781 0.0004377781 0.0004377781 0.0004377781 0.0004377781 1.693177 Correlation Structure: AR(1) Formula: ~1 | g Parameter estimate(s): Phi 0.3110725 Variance function: Structure: fixed weights Formula: ~invwt Number of Observations: 264 Number of Groups: 1 $gam Family: binomial Link function: logit Formula: y ~ s(xc) + z + int Estimated degrees of freedom: 1 total = 4 attr(,class) [1] gamm list g2 $lme Linear mixed-effects model fit by maximum likelihood Data: data Log-likelihood: -443.9495 Fixed: fixed X(Intercept) Xz XintXs(xc)Fx1 0.720018143 -2.562155820 0.003457463 -0.045821030 Random effects: Formula: ~Xr - 1 | g Structure: pdIdnot Xr1 Xr2 Xr3 Xr4 Xr5 Xr6 Xr7 Xr8 StdDev: 7.056078e-06 7.056078e-06 7.056078e-06 7.056078e-06 7.056078e-06 7.056078e-06 7.056078e-06 7.056078e-06 Formula: ~1 | xc %in% g (Intercept) Residual StdDev: 6.277279e-05 1.683007 Correlation Structure: AR(1) Formula: ~1 | g/xc Parameter estimate(s): Phi 0.1809409 Variance function: Structure: fixed weights Formula: ~invwt Number of Observations: 264 Number of Groups: g xc %in% g 134 $gam Family: binomial Link function: logit Formula: y ~ s(xc) + z + int Estimated degrees of freedom: 1 total = 4 attr(,class) [1] gamm list -- Gavin Simpson, PhD [t] +1 306 337 8863 Adjunct Professor, Department of Biology[f] +1 306 337 2410 Institute of Environmental Change Society [e] gavin.simp...@uregina.ca 523 Research and Innovation Centre [tw] @ucfagls University of Regina Regina, SK S4S 0A2, Canada __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NMDS using Vegan
On Wed, 2013-05-15 at 12:06 +0800, Suparna Mitra wrote: Hello R experts, I am new to Vegan and use trying to follow the tutorial to perform NMDS for my data. But after performing the metaMDS, when I plotted my results the default plot shows MDS1 vs MDS2. Thought according to the tutorial Default ordination plot should display NMDS1vs NMDS2. Why is this difference? Accourding to tutorial it says: Function metaMDS is a wrapper to perform NMDS. Can anybody please help me to understand this? Thanks, Mitra They are just labels and metaMDS **has** performed an NMDS. Not sure why Jari labelled these as MDSx. If this bothers you so, add your own labels: require(vegan) data(dune) sol - metaMDS(dune) plot(sol, xlab = NMDS1, ylab = NMDS2) HTH G -- Gavin Simpson, PhD [t] +1 306 337 8863 Adjunct Professor, Department of Biology[f] +1 306 337 2410 Institute of Environmental Change Society [e] gavin.simp...@uregina.ca 523 Research and Innovation Centre [tw] @ucfagls University of Regina Regina, SK S4S 0A2, Canada __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Vegan package treatment of zeros
On Wed, 2013-05-15 at 09:04 -0500, Michael Rentz wrote: Hello: I have a good size database of plant cover measurements I was running a Shannon Diversity on. I have every cell as a default of 0 if it is not present. The tutorials I have looked at all had blanks in cases of absence. I just wanted some kind of verification that VEGAN will recognize 0s as absent I just wanted to make sure I was not inadvertently setting myself up for problems. Blanks would be interpreted as `NA` in R when you read the data in. Vegan would respect that and most likely fail because of the missing data as not all methods are appropriate when there are `NA` values. Hence a common first step is to replace `NA` with 0. As you have already have this indicated in your file external to R then you won;t need to do this extra step *in* R. How zeros are then treated will depend largely on the method employed and/or the dissimilarity coefficient used, but that is not an R question. A recent posting to the R-Sig-Ecology list by Brian Cade presents some interesting observations on the double zero issue: https://stat.ethz.ch/pipermail/r-sig-ecology/attachments/20130419/114d1e5f/attachment.pl HTH G -- Gavin Simpson, PhD [t] +1 306 337 8863 Adjunct Professor, Department of Biology[f] +1 306 337 2410 Institute of Environmental Change Society [e] gavin.simp...@uregina.ca 523 Research and Innovation Centre [tw] @ucfagls University of Regina Regina, SK S4S 0A2, Canada __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] NMDS in Vegan: problems in stressplot, best solution
On Fri, 2013-04-26 at 12:42 -0500, Kumar Mainali wrote: Hello, I can draw a basic stress plot for NMDS with the following code in package Vegan. stressplot(parth.mds, parth.dis) When I try to specify the line and point types, it gives me error message. stressplot(parth.mds, parth.dis, pch=1, p.col=gray, lwd=2, l.col=red) Error in plot.xy(xy, type, ...) : invalid plot type In the above code, if I removed line type, it does give me the plot only of points with my choice of type. stressplot(parth.mds, parth.dis, pch=1, p.col=gray) Why cannot I define both line and point at the same time? You can. What you can't do is use argument `dis` with an metaMDS object. If you use: stressplot(parth.mds, pch=1, p.col=gray, lwd=2, l.col=red) you'll see it works just fine. We'll see about providing a better error message if you do what the documentation asks you not to. If I have 100 iterations for metaMDS, then when I plot the result, does it give me result from best solution? The best solution it encountered in the 100 random starts, yes. How do I know that. It is implied in point 4. of the Details section of ?metaMDS. Can you plot the Stress by Iteration number? Not in a graphical plot. The stresses for each iteration are printed to the console at each iteration. Note these iterations are random starts, each of which has iterations of the algorithm. HTH G parth.mds - metaMDS(WorldPRSenv, distance = bray, k = 2, trymax = 100, engine = c(monoMDS, isoMDS), autotransform =TRUE, wascores = TRUE, expand = TRUE, trace = 2) plot(parth.mds, type = p) Thanks in advance, Kumar -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RDA permutest envfit
On Wed, 2013-04-24 at 10:38 -0700, Rémi Lesmerises wrote: Dear all, I did a RDA and when I looked to the signification of the test with permutest, the output was non-significant. But when I used the envfit function, some of the vectors are significant. All the test's conditions are respected. What it means? Is it an error in the script? The two functions are testing two fundamentally different things. `permutest` (as you invoked it --- `first = TRUE`) is testing the first axis of the RDA. That axis is a linear combination of the constraints. The `envfit` procedure is testing the individual correlations between the 2-d configuration of samples in the RDA space and the direction of maximal variance of the environmental data. Each variable is considered separately. I can easily imagine a case, where the variance explained on the first axis is not significant but variance over 2 axes is significant, as one where the vectors do not point solely along the first RDA axis but also point along the 2nd axis. By looking only at their contributions to the first axis and summing them you don't explain a whole lot, but when you look at the directions in 2D space each variable explains a significant amount of variance. HTH G Commands and output: permutest(rda.ind, perm=999, first=TRUE) Permutation test for rda Call: rda(formula = x ~ ARH_frs + CON_frs + PRP_frs + RUI_frs + VAM_frs, data = z) Permutation test for first constrained eigenvalue Pseudo-F:4.093568 (with 1, 10 Degrees of Freedom) Significance:0.413 Based on 999 permutations under reduced model. fit - envfit(rda.ind, z, perm = 999, display = lc) fit ***VECTORS RDA1 RDA2 r2 Pr(r) VAM_frs 0.145281 -0.989390 0.2388 0.147 ARH_frs -0.876494 -0.481413 0.6127 0.002 ** CON_frs 0.904278 0.426944 0.4846 0.013 * PRP_frs -0.997634 0.068755 0.9433 0.001 *** RUI_frs -0.648512 -0.761204 0.6243 0.004 ** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 P values based on 999 permutations. Rémi Lesmerises, biol. M.Sc., Candidat Ph.D. en Biologie Université du Québec à Rimouski remilesmeri...@yahoo.ca [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RDA permutest envfit
On Wed, 2013-04-24 at 13:13 -0700, Rémi Lesmerises wrote: Thank you! A last question: Is it still the same explanation if I remove the condition first=TRUE (and then testing for all axis) and permutest gives the same result? No; again `permutest(, first = FALSE)` and `envfit()` are testing different models/fits but how they are different is not the same as with your previous question. `permutest(, first = FALSE)` is testing whether the variance explained by the k environmental variable (k = 4 I believe in your case) is sufficiently large as to be unusual when compared to the variance explained under a null model (achieved by permuting residuals, propagating those residuals plus fitted values to give new data and refitting the model on those new data). `envfit()` is doing what it always did; assessing whether the correlation between the vector and the 2-d configuration is unusually large. In the `permutest(, first = FALSE)` case, you have a model with 4df. In the `envfit()` case you have 4 models each with 1 degree of freedom. The two methods also differ in terms of the role played by the environmental data. In the `permutest` case, the environmental data are the predictors. In the `envfit` case the env data are the responses and the axis scores in the 2 dimensions chosen are the predictors. HTH G Rémi Lesmerises, biol. M.Sc., Candidat Ph.D. en Biologie Université du Québec à Rimouski remilesmeri...@yahoo.ca On Wed, 2013-04-24 at 10:38 -0700, Rémi Lesmerises wrote: Dear all, I did a RDA and when I looked to the signification of the test with permutest, the output was non-significant. But when I used the envfit function, some of the vectors are significant. All the test's conditions are respected. What it means? Is it an error in the script? The two functions are testing two fundamentally different things. `permutest` (as you invoked it --- `first = TRUE`) is testing the first axis of the RDA. That axis is a linear combination of the constraints. The `envfit` procedure is testing the individual correlations between the 2-d configuration of samples in the RDA space and the direction of maximal variance of the environmental data. Each variable is considered separately. I can easily imagine a case, where the variance explained on the first axis is not significant but variance over 2 axes is significant, as one where the vectors do not point solely along the first RDA axis but also point along the 2nd axis. By looking only at their contributions to the first axis and summing them you don't explain a whole lot, but when you look at the directions in 2D space each variable explains a significant amount of variance. HTH G Commands and output: permutest(rda.ind, perm=999, first=TRUE) Permutation test for rda Call: rda(formula = x ~ ARH_frs + CON_frs + PRP_frs + RUI_frs + VAM_frs, data = z) Permutation test for first constrained eigenvalue Pseudo-F:4.093568 (with 1, 10 Degrees of Freedom) Significance:0.413 Based on 999 permutations under reduced model. fit - envfit(rda.ind, z, perm = 999, display = lc) fit ***VECTORS RDA1 RDA2 r2 Pr(r) VAM_frs 0.145281 -0.989390 0.2388 0.147 ARH_frs -0.876494 -0.481413 0.6127 0.002 ** CON_frs 0.904278 0.426944 0.4846 0.013 * PRP_frs -0.997634 0.068755 0.9433 0.001 *** RUI_frs -0.648512 -0.761204 0.6243 0.004 ** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 P values based on 999 permutations. Rémi Lesmerises, biol. M.Sc., Candidat Ph.D. en Biologie Université du Québec à Rimouski remilesmeri...@yahoo.ca [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] GAM Penalised Splines - Intercept
On Tue, 2013-04-23 at 17:51 +0200, Lucas Holland wrote: Hey all, I'm using the gam() function inside the mgcv package to fit a penalised spline to some data. However, I don't quite understand what exactly the intercept it includes by default is / how to interpret it. Ideally I'd like to understand what the intercept is in terms of the B-Spline and/or truncated power series basis representation. It is the mean of the response: library(mgcv) set.seed(2) ## simulate some data... dat - gamSim(1, n=400, dist=normal, scale=2) #Gu Wahba 4 term additive model b - gam(y ~ s(x2), data = dat) coef(b)[1] (Intercept) 7.833279 with(dat, mean(y)) [1] 7.833279 b2 - gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat) coef(b2)[1] (Intercept) 7.833279 In models with parametric factor terms what the intercept is depends on the contrasts used - the default in R would represent the mean of the response for the reference level(s) of factor terms in the model. The intercept is separates from the penalised spline terms in the model. HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] CCA report species environment correlation
On Fri, 2013-04-12 at 11:00 -0700, SRuhl wrote: Hi everyone, I did a CCA with R in the vegan package and got all the outputs I need/want to report except for one, that is the species environment correlation values. I only know of them because of one of the sources I read up on about CCAs reported it and I have the suspicion that it´s a value reported when you do a CCA in SPSS (which I have never done) and that just doesnßt come up in R. Can anyone tell me how to find it or calculate it myself? Do I need to run posthoc stats or additional tests on my results? Thanks in advance! See `?spenvcor` with vegan loaded. Pass it your CCA model object. G -- View this message in context: http://r.789695.n4.nabble.com/CCA-report-species-environment-correlation-tp4664089.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in plot.envfit(ef, p.max = 0.1) : (subscript) logical subscript too long
Hi Laura, Following a bit of further digging, I noticed that the bug was in not handling the levels of the factor constraints appropriately. In your data set 12 rows (observations) were deleted because of missing data. These observations meant that one or more of the factor variables had more levels() than actually used in the subset of data analysed. I have checked in some fixes for this which removes the extraneous levels: https://r-forge.r-project.org/scm/viewvc.php?view=revroot=veganrevision=2342 We are currently planning a release for some time in the new year, but in the meantime, you can get a copy of the development version of Vegan from R-forge once the new binary has been built (needs to be Rev 2342 or later to include this fix but only Rev 2341 has been built thus far): https://r-forge.r-project.org/R/?group_id=68 I've CC'd this back to R-Help for the record. Thanks for taking the time to post about this and for sending me your data so I could track this down quickly. All the best and Happy New Year. G On Mon, 2012-12-31 at 14:17 +, Laura Martínez Suz wrote: Hi Gavin, Thank you! It works now, that's great, many thanks. I have spent hours trying to see what I was doing wrong, I'm not very good in R.. That was my first email to the R community, should I officialy reply to you? All the best for you as well and happy New Year! Laura Subject: RE: [R] Error in plot.envfit(ef, p.max = 0.1) : (subscript) logical subscript too long From: gavin.simp...@ucl.ac.uk To: lauram...@hotmail.com CC: lauram...@um.es Date: Mon, 31 Dec 2012 12:02:09 +0100 Hi Laura, Thanks for this. There is a bug which has lead me to discover another bug in that same section of code. The issue you uncovered is that you have factor variables but none are significant at the level you requested and we don't handle the no significant factors case explicitly. That led me to find another bug in that I don't think this code to select significant factors ever worked correctly. I'll need to speak to Jari about this as he wrote the code, but he is away at the moment. I'll see about coming up with a fix for this. In the meantime, I presume this will work for you as the factors are not significant, just drop them before doing the envfit() call: facs - sapply(enviromentaldata_file, is.factor) env - enviromentaldata_file[, !facs] ef - envfit(species, env, permutations = 999, na.rm = TRUE) plot(species, display = sites) plot(ef, p.max = 0.1) HTH and thanks for providing the data. I'll delete it now. All the best, Gavin On Sun, 2012-12-30 at 15:25 +, Laura Martínez Suz wrote: Hi Gavin, Many thanks! Here they go, named as in the scripts. species_matrix is the file with the relative abundance of each species (394) related to the total number of species in each plot (22). environmentaldata_file corresponds to data from the 22 plots in the same order than in the species file. I wanted to see which factors are affecting the dissimilarities in community composition among plots and plot the significant ones as vectors at p0.1. Sorry if it becomes a silly thing and thanks again for your offer.. and yes, please, delete them! :) Regards, Laura Subject: Re: [R] Error in plot.envfit(ef, p.max = 0.1) : (subscript) logical subscript too long From: gavin.simp...@ucl.ac.uk To: lauram...@hotmail.com CC: r-help@r-project.org Date: Sun, 30 Dec 2012 13:16:56 +0100 On Sat, 2012-12-29 at 20:40 +, Laura Martínez Suz wrote: Hello there, I'm trying to plot vectors with p0.1 in a NMDS ordination plot using p.max. Below the scripts I'm using. I guess I'm missing something! could you please give me a hand? species-metaMDS(species_matrix)ef-envfit(species,environmentaldata_file,permu=999,na.rm=TRUE)efplot(species, dis=sites)plot(ef,p.max=0.1) Error in plot.envfit(ef, p.max = 0.1) : (subscript) logical subscript too long Many thanks! Laura Hi Laura, Can you send the data to me **off list** so I can run through the code under the debugger and see what is causing the error? I can see where the subscripting occurs related to p.max but can't immediately see under what circumstances the subscript could ever get longer than the object being subscripted. I promise to delete the data once I have tracked the problem down. Cheers, G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~ %~%~ % Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
Re: [R] Error in plot.envfit(ef, p.max = 0.1) : (subscript) logical subscript too long
On Sat, 2012-12-29 at 20:40 +, Laura Martínez Suz wrote: Hello there, I'm trying to plot vectors with p0.1 in a NMDS ordination plot using p.max. Below the scripts I'm using. I guess I'm missing something! could you please give me a hand? species-metaMDS(species_matrix)ef-envfit(species,environmentaldata_file,permu=999,na.rm=TRUE)efplot(species, dis=sites)plot(ef,p.max=0.1) Error in plot.envfit(ef, p.max = 0.1) : (subscript) logical subscript too long Many thanks! Laura Hi Laura, Can you send the data to me **off list** so I can run through the code under the debugger and see what is causing the error? I can see where the subscripting occurs related to p.max but can't immediately see under what circumstances the subscript could ever get longer than the object being subscripted. I promise to delete the data once I have tracked the problem down. Cheers, G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] CCA plot
On Tue, 2012-11-27 at 15:35 -0500, Maria Kernecker wrote: Hi, I have a couple questions about fitting environmental (land use factors, plant species presence-absence, and soil variables) constraints to my CCA biplot. 1. After successfully plotting species and site scores in my CCA, I have been trying to insert the biplot arrows of the environmental constraints in my data set using the text() function. When I do that, the plot changes completely. Is there some code or a sample script you could let me know about? I'm not sure what you tried but this works for me: require(vegan) data(varespec) data(varechem) ## Common but bad way: use all variables you happen to have in your ## environmental data matrix vare.cca - cca(varespec ~ ., data = varechem) ## build up plot plot(vare.cca, display = c(sites,species), scaling = 3) ## add the biplot arrows (though this could be done in the plot() call text(vare.cca, scaling = 3, display = bp) 2. I would like to include only the environment constraints that are significant at conf=0.95, but am not sure that I can do that in a CCA biplot. I was hoping that this way I could make my plot less crowded. One way would be: sig - anova(vare.cca, by = term) ## get ones with p values less than or equal to some threshold ind - with(sig, head(`Pr(F)` = 0.05, -1)) ## now get the bp scores scrs - scores(vare.cca, display = bp, scaling = 3) ## and take the ones that are signif scrs - scrs[ind, ] ## draw plot and add them plot(vare.cca, display = c(sites,species), scaling = 3) ## scale then as per vegan mul - vegan:::ordiArrowMul(scrs) arrows(0, 0, scrs[,1] * mul, scrs[,2] * mul, length = 0.05, col = blue) text(scrs * mul * 1.1, labels = rownames(scrs), cex = 0.9, col = blue) Not that I recommend this pruning a CCA... HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] CCA with Vegan - Plot problem
On Fri, 2012-11-09 at 00:51 -0800, rumble14 wrote: Hi, I've just started using R and am having some problems with CCA using vegan. I'm looking at abundance p/m2 (hence decimals) vs environmental variables and have been using http://ecology.msu.montana.edu/labdsv/R/labs/lab12/lab12.html to guide me through. My organism data looks like this: Sample Species_1 Species_2 Species_3 etc Sample_1 0 0.12432 0.3456 etc and is in a .csv file My environment data (.csv again) looks like this: TempWater etc 1833.3 etc I can get vegan to run the CCA with no problems but when I enter cca1.plot - plot(cca.1,choices=c(1,2)) I get the following error message: Error in function (formula, data = NULL, subset = NULL, na.action = na.fail, : invalid type (list) for variable 'org' ('org' is the name of my organism file once attached). Somewhat frustratingly you have covered everything *but* how you created `cca.1` and whilst you might presume that because an error was not thrown everything worked out OK with that procedure, the failing `plot()` call suggests otherwise. Can you show the exact code you used please? Also, not sure why you are attaching anything - it is not required for most R usage and certainly not for using cca() in vegan. G I'd appreciate some help. If I haven't given enough detail, please let me know. I've tested the example data given on the website and it works so am obviously missing something I shouldn't be. Cheers, Heather -- View this message in context: http://r.789695.n4.nabble.com/CCA-with-Vegan-Plot-problem-tp4649012.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] CCA with Vegan - Plot problem
On Fri, 2012-11-09 at 05:16 -0800, rumble14 wrote: Thanks for replying. Apologies for not putting the code, I have not deviated from that used on the website I linked to. Can you show the exact code you used please? I am using: #below is Species_1 etc org - read.table (Organisms.S, header=TRUE, row.names=1) #below is environmental variables env-read.table (Environment.S, header=TRUE) attach(env) library(vegan) cca.1-(org~Temp+Water+etc) cca.1 Don't do that - it is an abuse of the formula interface and an abomination to attach such objects to the search path. This isn't actually doing anything (well, it is, but it is not a CCA): cca.1-(org~Temp+Water+etc) There is no function call. And your use of `+ etc` suggests this isn't the **actual** code you ran and I did ask for that. You should use: cca.1 - cca(org ~ ., data = env) assuming you want all variables in `env` in the model - not that that is a good thing but that is a statistical issue not an R one! Can you show me - probably *off list* including the actual data files (which I promise to delete after we solve the problem) - the code you use and I can take a look but before you send anything to me in an email make sure you can paste the exact same code into R and have it run G At this point I have noticed (this morning) that I should be getting the CCA come up here, and this is working with test data but not my actual data which is formatted in exactly the same way. If I type: cca(formula = org ~ Temp + Water+etc, data = org) at this point, I get what looks like but (you're right) might not be an output. For the graph I was then trying to use: cca1.plot - plot(cca.1,choices=c(1,2)) at which point I get error messages. Also, not sure why you are attaching anything As I said, I'm quite new to R. It does so on the tutorial linked so had assumed it was to do with being able to access the parameters Temp etc. H -- View this message in context: http://r.789695.n4.nabble.com/CCA-with-Vegan-Plot-problem-tp4649012p4649045.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] labeling loading vectors in vegan
On Fri, 2012-07-27 at 11:52 -0700, Gordon Holtgrieve wrote: Hello, I am using vegan to do an NMDS plot and I would like to suppress the labels for the loading vectors. Is this possible? Alternatively, how can I avoid overlap? Hi Gordon, You seem to be trying to fit the species scores as a weighted sum a la PCA. Usually we add species scores as weighted averages of the site scores in NMDS. metaMDS() handles the latter for you: require(vegan) data(varespec) data(varechem) mod - metaMDS(varespec) plot(mod) and you can get the species scores directly via: scrs - scores(mod, display = species) head(scrs) NMDS1 NMDS2 Cal.vul -0.16696628 -0.07418638 Emp.nig 0.05870236 0.10673232 Led.pal 0.88640043 -0.10135959 Vac.myr 0.71139281 -0.10964309 Vac.vit 0.04389918 0.09996126 Pin.syl -0.02588978 0.2963255 There isn't a nice way to fiddle with the plotting of envfit objects that will stop overlap as most of the potential functions don't work with envfit objects. This would be easier if you worked with the WA species scores as we could do plot(mod) ordipointlabel(mod, display = species, add = TRUE, col = forestgreen) where the latter function will do its best to avoid overlap. Making envfit objects work in the same way will require a good amount of rewriting various functions which is not something I will have time to do for a few months, but I will add it to the TODO list and discuss with Jari about how to proceed. Feel free to contact me off-list if you require further help with this. All the best, Gavin Many thanks for the help. Example code: #perform NMDS using metaMDS() function spe.nmds-metaMDS(data, distance='bray',k=2 , engine = isoMDS, autotransform=F, trymax=1000) #calculate the loading (i.e., variable weights) on each NMDS axis vec.sp-envfit(spe.nmds$points, data, perm=1000, choices=c(1:2)) #plot data in ordination ordiplot(spe.nmds, choices=c(1:2), type='text', display=sites, xlab=Axis 1, ylab=Axis 2) #plot loading vectors plot(vec.sp, p.max=0.05, col='blue') [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reception of (Vegan) envfit analysis by manuscript reviewers
On Wed, 2012-05-09 at 15:51 -0600, Matt Bakker wrote: I'm getting lots of grief from reviewers about figures generated with the envfit function in the Vegan package. Has anyone else struggled to effectively explain this analysis? If so, can you share any helpful tips? The most recent comment I've gotten back: What this shows is which NMDS axis separates the communities, not the relationship between the edaphic factor and the Bray-Curtis distance. Without further context for that quote and your manuscript to see how you are using the method it is difficult to say whether you are doing something silly or the reviewer is bone-headed. I've had similar comments from reviewers about my use of the ordisurf() function. In each case it was the reviewers' failure to understand the methods applied that was the cause of the confusion. As you provide little or no context I'll explain what envfit() does etc. The idea goes back a long way (!) and is in my 1995 edition of Jongman et al Data Analysis in Community and Landscape Ecology (Cambridge University Press) though most likely was in 1987 version too. See Section 5.4 of the Ordination chapter by Ter Braak in that book. The idea is to find the direction (in the k-dimensional ordination space) that has maximal correlation with an external variable. Essentially, we have: E(z_j) = b_0 + b_1x_1 + b_2x_2 where E(z_j) is the expectation (or mean, or fitted values) of the jth external (environmental) variable, x_1 and x_2 are the axis scores in ordination dimensions 1 and 2, and b_y are unknown regression coefficients. This generalises to more than 2 dimensions or axes. The biplot arrow drawn goes from (0,0) to (b_1, b_2). You can see that the aim is to model or predict the values of the jth environmental variable (z_j) as a linear combination of the axis or site scores of the samples in the ordination space. Exactly the same idea underlies the ordisurf() function except that we use a GAM and for the right hand side of the equation multivariate splines are used which allow a non-linear surface instead of a plane. When applied to nMDS, if the nMDS provides a reasonable approximation to the original dissimilarities, then envfit() will estimate and show the strengths of the correlation and direction of maximal correlation between the nMDS configuration and the jth enviromental variable. This technique can be used to indicate if one or more environmental variables are associated with differences between sites/samples as represented in the nMDS ordination. The big caveat is the implication that the correlation or relationship between z_j and the ordination space is linear. ordisurf() allows you to relax this assumption as we fit a potentially non-linear surface to the ordination space instead of the plane that envfit() effectively produces (though we show only the direction of change with the arrow). So without seeing your manuscript or more context (and I'm not promising to read it or comment more if you provide it) I would suggest that, *if* you have applied nMDS and used envfit() correctly the combined analysis *does* reflect the *linear* relationship between the edaphic factor and the Bray-Curtis distance, assuming of course that the nMDS has low stress (i.e fits the original dissimilarities well). In future, you should consider posting similar questions (ecological/environmental) to the R-SIG-Ecology list instead of the main R-Help list. I know Jari (lead developer of vegan and author of envfit() ) has stopped regularly reading the main R-Help list and you will get far more eyes familiar with these techniques on the R-SIG-Ecology list. I have taken the liberty of cc'ing this to the R-Sig-Ecology list so others can comment. HTH G Thanks for any suggestions! Matt __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] conducting GAM-GEE within gamm4?
) = 0.317glmer.ML score = 10504 Scale est. = 1 n = 10792 summary(b$mer) Generalized linear mixed model fit by the Laplace approximation AIC BIC logLik deviance 10514 10551 -525210504 Random effects: Groups Name Variance Std.Dev. Xr s(dist_slag) 1611344 1269.39 Xr.0 s(Depth) 98622 314.04 Number of obs: 10792, groups: Xr, 8; Xr.0, 8 Fixed effects: Estimate Std. Error z value Pr(|z|) X(Intercept) -13.968 5.145 -2.715 0.00663 ** Xs(dist_slag)Fx1 -35.871 33.944 -1.057 0.29063 Xs(Depth)Fx13.971 3.740 1.062 0.28823 --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 Correlation of Fixed Effects: X(Int) X(_)F1 Xs(dst_s)F1 0.654 Xs(Dpth)Fx1 -0.030 0.000 I just want to make sure it is correctly allowing for correlation within each value for the block variable. How do I formulate the model to say that autocorrelation can exist within each single value for block, but assume independence among blocks? On another note, I am also receiving the following warning message after model completion for larger models (with many more variables than 2): Warning message: In mer_finalize(ans) : false convergence (8) Thank you for your time, effort, and patience dealing with a novice. Sincerely, Nathan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Clustering analysis with ordination plots
Please read the posting guide for future questions. I presume you mean using the vegan package? If so, then see this blog post of mine which shows how to do something similar: http://wp.me/pZRQ9-73 If you post more details and an example I will help further if the blog post is not sufficient for you to get the solution you want. G On Mon, 2012-04-30 at 09:44 -0700, borinot wrote: Hello to all, I'm new to R so I have a lot of problems with it, but I'll only ask the main one. I have clustered an environmental matrix with 2 different methods, and I'd like to plot them in a PCA and a db-RDA. I mean, I want see these clusters in the plots like points of differents colours, together with the rest information of the plot, but I don't know how to do this. I've checked a lot of bibliography and forums, and I haven't found the solution... it can't be so hard! Well, thanks in advance! :) -- View this message in context: http://r.789695.n4.nabble.com/Clustering-analysis-with-ordination-plots-tp4598695.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Long command in Sweave
On Fri, 2012-04-13 at 17:46 +0800, Wincent wrote: Dear useRs, I am writing a vignette for a package, which contains long command like this, reduce(Lipset_cs,SURVIVAL,c(GNPCAP, URBANIZA, LITERACY, INDLAB, GOVSTAB),explain=positive,remainder=exclude,case=CASEID) It is longer than the width a page and part of it will become missing. Currently, I have to manually break the command into multiple lines. Is there a better way to handle such issue? Not that I am aware of. It seems that others have raised similar question which seems to remain unsolved in a satisfactory fashion. Thanks for your kind attention in advance. 1) use some spacing and format the code over multiple lines reduce(Lipset_cs, SURVIVAL, c(GNPCAP, URBANIZA, LITERACY, INDLAB, GOVSTAB), explain=positive, remainder=exclude, case=CASEID) Isn't that more readable?! Any good R-aware editor should be able to handle appropriate formatting of the code. I *never* write long lines in my editor; I always break the code down to fit roughly into a 72 column editor window. 2) if you want to force Sweave to respect your new formatting, use argument `keep.source=TRUE` for the code chunk. Or set it document wide using \SweaveOpts{option1=value1, option2=value2} etc in the preamble (where optionX is one of the arguments and valueX what you want to set that argument too. Thought IIRC, `keep.source=TRUE` is the default now and as such Sweave will respect your formatting by default now - before it broke lines where it could. In short get out of the habit of writing long lines of R code; you'll be better in the long run laying your code out logically. HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Long command in Sweave
I use Emacs and ESS, with the coding standards in one of the R manuals. I have to insert the carriage returns where I want them, but Emacs/ESS indents the code correctly G On Fri, 2012-04-13 at 22:17 +0800, Wincent wrote: Thanks, Gavin and Duncan. In that case, what I need is a suitable editor which can break the command properly. All the best On 13 April 2012 19:33, Gavin Simpson gavin.simp...@ucl.ac.uk wrote: On Fri, 2012-04-13 at 17:46 +0800, Wincent wrote: Dear useRs, I am writing a vignette for a package, which contains long command like this, reduce(Lipset_cs,SURVIVAL,c(GNPCAP, URBANIZA, LITERACY, INDLAB, GOVSTAB),explain=positive,remainder=exclude,case=CASEID) It is longer than the width a page and part of it will become missing. Currently, I have to manually break the command into multiple lines. Is there a better way to handle such issue? Not that I am aware of. It seems that others have raised similar question which seems to remain unsolved in a satisfactory fashion. Thanks for your kind attention in advance. 1) use some spacing and format the code over multiple lines reduce(Lipset_cs, SURVIVAL, c(GNPCAP, URBANIZA, LITERACY, INDLAB, GOVSTAB), explain=positive, remainder=exclude, case=CASEID) Isn't that more readable?! Any good R-aware editor should be able to handle appropriate formatting of the code. I *never* write long lines in my editor; I always break the code down to fit roughly into a 72 column editor window. 2) if you want to force Sweave to respect your new formatting, use argument `keep.source=TRUE` for the code chunk. Or set it document wide using \SweaveOpts{option1=value1, option2=value2} etc in the preamble (where optionX is one of the arguments and valueX what you want to set that argument too. Thought IIRC, `keep.source=TRUE` is the default now and as such Sweave will respect your formatting by default now - before it broke lines where it could. In short get out of the habit of writing long lines of R code; you'll be better in the long run laying your code out logically. HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] customizing plot, Vegan CCA
On Wed, 2012-03-28 at 01:43 -0700, millerjeremya wrote: I’m working on a canonical correspondence analysis using Vegan. I’m new to this. I would like to control the shapes and colors for each of my individual display points and can’t figure out the syntax. This comes up so often it is /the/ FAQ (well one of them anyway). So much so I wrote a post on my blog showing how to do this. It changes the colour of the points but you can use the same ideas to change the plotting characters (via argument `pch`). http://ucfagls.wordpress.com/2012/04/11/customising-vegans-ordination-plots/ See if that helps you. G Here is the script I am using. It comes from Numerical Ecology in R by Bocard et al. windows(title=CCA biplot - scaling 1, 9, 9) plot(spe.cca, scaling=1, display=c(lc, cn), main=Biplot CCA spe ~ env - scaling 1) I think I should be able to add points() and text() lines to change the appearance of particular elements. I also will need to refer to one or more specific points (e.g., make rows 1-4 a red circle, 5-8 a yellow triangle, etc.; make V1 say “tree cover”, V2 say “altitude”, etc.; see attached file CCA.jpeg). Many thanks! http://r.789695.n4.nabble.com/file/n4511704/CCA.jpeg -- View this message in context: http://r.789695.n4.nabble.com/customizing-plot-Vegan-CCA-tp4511704p4511704.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Gamm and post comparison
On Wed, 2012-02-22 at 10:55 -0800, RHam wrote: My data set consist of number of calls (lcin) across Day. I am looking for activity differences between three features (4 sites per feature). I am also looking for peaks of activity across time (Day). I am using a gamm since I believe these are nonlinear trends with nested data. You don't need gamm() for this as you don't appear to be using any random effects. gam() will be fine for the model you show below. I don't know enough about the workings of mgcv to know whether you can leverage functions in the multcomp package; gam objects inherit from glm and lm classes and multcomp can work with these, though do not the point in ?gamObject that gam objects lack the details of the fitting expected in glm or lm objects. IIRC, that model could be fitted as a linear mixed model if the smooth term is set up correctly. multcomp can work with nlme and lme4 so you could continue to use gamm() as below but use the multcomp package on the $lme component. The gamm4 package provides a version of gamm() that uses lme4 instead of nlme for the underlying fitting. Again you might look into using the lmer representation of the model with multcomp. None of the above is tested or even based on personal experience of doing this. G gammdata-gamm(lcin~Temp+s(Day)+fType+wind+fFeature+Forest+Water+Built, list=fSite,data=data, family=gaussian) summary(gammdata$gam) summary(gammdata$lme) anova(gammdata$gam) I can see which variables are significant but I was wondering if there was a way to do a post hoc to see differences between features? Or is there a way to compare different models (feature) for significance (not best fit) against each other? Thanks, Rham -- View this message in context: http://r.789695.n4.nabble.com/Gamm-and-post-comparison-tp4411366p4411366.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] z-transform each column of a data.frame
On Fri, 2012-01-20 at 18:04 +0100, Martin Batholdy wrote: Hi, I am currently trying to z-transform (that is subtracting the mean and divide by the standard deviation) multiple columns of a data.frame at the same time. My first approach was: x - data.frame(c(0:10), c(10:20)) (x - colMeans(x)) / apply(x, 2, sd) This is obviously not working. Is there a convenient way to z-transform each column separately (so in this case, each column represents an independent variable that should be z-transformed) ?scale scale(x) c.0.10. c.10.20. [1,] -1.5075567 -1.5075567 [2,] -1.2060454 -1.2060454 [3,] -0.9045340 -0.9045340 [4,] -0.6030227 -0.6030227 [5,] -0.3015113 -0.3015113 [6,] 0.000 0.000 [7,] 0.3015113 0.3015113 [8,] 0.6030227 0.6030227 [9,] 0.9045340 0.9045340 [10,] 1.2060454 1.2060454 [11,] 1.5075567 1.5075567 attr(,scaled:center) c.0.10. c.10.20. 5 15 attr(,scaled:scale) c.0.10. c.10.20. 3.316625 3.316625 thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] any DCCA function in R?
On Tue, 2011-12-20 at 09:31 +0100, Juan Santos wrote: Dear members, I am performing multivariate analysis on marine benthic populations using R. At first glance I found ca and VEGANO packages to be the That would be the `vegan` package suitable for the task, but neither has incorporated Detrended Canonical Correspondence Analysis (DCCA), which is just the method I want to apply on my data. I've looked for alternative packages containing the method, but my suspicion is that there is not DCCA availability for R users. Does anyone have better news for me? Nope, sorry. I'm not aware of any such function and as I maintain the Environmetric Task View on CRAN I come across most ecologically-related packages one way or another. I can think of two reasons to use a DCCA; i) to estimate a gradient length in environmental space on axis 1, and ii) to remove an arch in a CCA. ii) can more easily be addressed by reducing the number of terms in the CCA model or a different ordination method, and i) could no doubt be done in a number of other ways should you really want it. As far as I am aware DCCA was only ever implemented in CANOCO (possibly also in one of the other DOS applications from the good old days of quantitative ecology). Jari ported the original DECORANA code for DCA to vegan as it was available for us to do so. I am not aware of a free/unencumbered source code for DCCA, and given that and the limited number of applications for the method, it hasn't been something we have been motivated to do. What do you want to do with DCCA? Perhaps people on the list can suggest alternatives? You might also consider posting that message (what you want to do with DCCA) on the R-SIG-Ecology list as there'll be more people there who will be familiar with you specific area of interest. HTH G Javier Murillo PHD Candidate IEO-CO VIGO __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Vegan: Diversity Plot, label points
On Wed, 2011-11-23 at 16:02 -0300, Alejo C.S. wrote: Dear List, I can'f figure how to add point labels in the next plot (example from ?taxondive help page): library(vegan) data(dune) data(dune.taxon) taxdis - taxa2dist(dune.taxon, varstep=TRUE) mod - taxondive(dune, taxdis) plot(mod) The points in this plot are diversity values of single sites, and I'd like to add a label to each one. The plot command don't accept a label argument. Any tip? A couple of options: with(mod, text(Species, Dplus, label = rownames(dune), pos = 2, cex = 0.7)) fiddle with pos - see ?text for details. Another option is with(mod, identify(Species, Dplus, label = rownames(dune), cex = 0.9)) where you now click around the points you want to label. See ?identify for further details. HTH G Thanks in advance. Alejo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] I cannot get species scores to plot with site scores in MDS when I use a distance matrix as input. Problems with NA's?
On Thu, 2011-11-24 at 07:57 -0800, B77S wrote: Try the daisy() function from the package cluster, it seems to be able to handle NAs and non-dummy coded character variables metaMDS(daisy(df, metric=gower)) That won't help the OP as the species scores (the species data, i.e. the traits in this case) can not be computed from a site x site dissimilarity matrix. This is has the same problem as the OPs existing problem. G Edwin Lebrija Trejos wrote Hi, First I should note I am relatively new to R so I would appreciate answers that take this into account. I am trying to perform an MDS ordination using the function “metaMDS” of the “vegan” package. I want to ordinate species according to a set of functional traits. “Species” here refers to “sites” in traditional vegetation analyses while “traits” here correspond to “species” in such analyses. My data looks like this: Trait1 Trait2 Trait3 Trait4 Trait5 Trait… Species1 228.44 16.56 1.66 13.22 1 short Species2 150.55 28.07 0.41 0.60 1 mid Species3 NA 25.89 NA 0.55 0 large Species4 147.70 17.65 0.42 1.12 NA large Species… 132.68 NA 1.28 2.75 0 short Because the traits have different variable types, different measurement scales, and also missing values for some species, I have calculated the matrix of species distances using the Gower coefficient of similarity available in Package “FD” (which allows missing values). My problem comes when I create a bi-plot of species and traits. As I have used a distance matrix in function “metaMDS” there are no species scores available. This is given as a warning in R: NMDSgowdis- metaMDS(SpeciesGowdis) plot(NMDSgowdis, type = t) Warning message:In ordiplot(x, choices = choices, type = type, display = display, :Species scores not available” I have read from internet resources that in principle I could obtain the trait (species) scores to plot them in the ordination but my attempts have been unsuccessful. I have tried using the function “wascores” in package vegan and “add.spec.scores” in package BiodiversityR. For this purpuse I have created a new species x traits table where factor traits were coded into dummy variables and all integer variables (including binary) were coerced to numeric variables. Here are the codes used and the error messages I have got: “ NMDSgowdis- metaMDS(SpeciesGowdis) NMDSpoints-postMDS(NMDSgowdis$points,SpeciesGowdis) NMDSwasc-wascores(NMDSpoints,TraitsNMDSdummies) Error in if (any(w 0) || sum(w) == 0) stop(weights must be non-negative and not all zero) : missing value where TRUE/FALSE needed” I imagine the problem is with the NA’s in the data. Alternatively, I have used the “add.spec.scores” function, method=”cor.scores”, found in package BiodiversityR. This seemed to work, as I got no error message, but the species scores were not returned. Here the R code and results: “ A-add.spec.scores(ordi=NMDSgowdis,comm=TraitsNMDSdummies,method=cor.scores,multi=1, Rscale=F,scaling=1) plot(A) Warning message:In ordiplot(x, choices = choices, type = type, display = display, :Species scores not available“ Can anyone guide me to get the trait (“species”) scores to plot together with my species (“site”) scores? Thanks in advance, Edwin __ R-help@ mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://r.789695.n4.nabble.com/I-cannot-get-species-scores-to-plot-with-site-scores-in-MDS-when-I-use-a-distance-matrix-as-input-Pr-tp4103699p4104406.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained
Re: [R] I cannot get species scores to plot with site scores in MDS when I use a distance matrix as input. Problems with NA's?
On Thu, 2011-11-24 at 12:16 +, Edwin Lebrija Trejos wrote: Hi, First I should note I am relatively new to R so I would appreciate answers that take this into account. I am trying to perform an MDS ordination using the function “metaMDS” of the “vegan” package. I want to ordinate species according to a set of functional traits. “Species” here refers to “sites” in traditional vegetation analyses while “traits” here correspond to “species” in such analyses. My data looks like this: Trait1 Trait2 Trait3 Trait4 Trait5 Trait… Species1 228.44 16.56 1.66 13.22 1 short Species2 150.55 28.07 0.41 0.60 1 mid Species3 NA 25.89 NA 0.55 0 large Species4 147.70 17.65 0.42 1.12 NA large Species… 132.68 NA 1.28 2.75 0 short snip / Can anyone guide me to get the trait (“species”) scores to plot together with my species (“site”) scores? Thanks in advance, Edwin I think you should pass metaMDS the species trait matrix and then get it to use FD to compute the dissimilarities. Note from ?metaMDS there is a distfun argument for metaMDSdit. metaMDS passes the community data to metaMDSdist to compute the dissimilarities. The only snag is that gowdis doesn't accept a `method` argument so we need a wrapper: wrapper - function(x, method, ...) { gowdis(x, ...) } then do metaMDS(SpeciesGowdis, distfun = wrapper, X) where represents any other arguments you want to pass to gowdis via our wrapper. Essentially the wrapper ignores the method argument, we just need it as metaMDSdist wants to call the dissimilarity function with a method argument. This is not tested as there wasn't reproducible code, but hopefully you'll be able to work it out from the above. HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cca with repeated measures
On Fri, 2011-11-18 at 10:25 +0100, René Mayer wrote: Dear all, How can I run a constrained correspondence analysis with the following data: 15 animals were measured repeatedly month-wise (over to 2 years) according to ther diet composition (8 food categories). our data.frame looks like this: food 1 2 ... 8 sex season year animal freq 12 8 ... 1 0 summer 2011 1 freq 0 7 ... 0 1 winter 2011 1 ... freq 0 7 ... 0 1 spring 2011 15 We want to find out if season and sex influences diet composition. My experience with CCA is limited, but in repeated measures ANOVA, e.g. with aov() on has to define the between (animal) error term in order to deal with the pseudoreplication. Do I have to restructure or reshape the data in order to deal with pseudoreplication the data? Or do I have to define an error strata? I suspect I cannot simply run: library(vegan) model=cca(food ~ season*sex+year+animal, data) You could do that although the analysis would be i) focussed on those particular animals in those years, and ii) you could only test the terms season, sex and season:sex in a sequential manner (i.e. dependent upon how the terms enter the model), so season, then sex after season is in the model, then their interaction after both main terms are included in the model. ii) is done by adding `by = terms` to the call to the `anova method for cca objects; examples are in `?anova.cca` That corresponds to a fixed effects formulation of the ANOVA (assuming I have my terminology right). The alternative is to adjust the permutation scheme used to reflect the clustering in your data. In that case, using `strata = animal` would be OK. Ideally one would want to control for temporal dependence so you would want cyclic shifts *within* `strata = animal` but vegan can not yet do this sort of permutation. It is coming; the actual code to generate those permutations is available in the permute package (upon which vegan depends), but as yet we have not hooked this into the vegan ordination functions (it is on the TODO list). That said, season should be accounting for much of the temporal dependence, so I think you might get away with just specifying strata (the permutation test itself is permuting residuals from the model so if the model is capturing the seasonal variation then the simple permuting within animal should be OK, but you can check by extracting the residuals using the resid() method and plotting them out by time and animal). HTH G I would be grateful for any help. Thanks in advance, René __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cca with repeated measures
On Fri, 2011-11-18 at 15:17 +0100, René Mayer wrote: Thanks a lot Gavin!, this was what I was looking for. Have I got this right that with no 'cyclic shifts *within* strata' you mean that I cannot define a nesting within animal, e.g., animal/year/season (speaking in regression-terms random-effects for the animal-specific season and year variation). In the permutation framework, you condition the permutation on animal (`strata = animal` in the anova() method), but that alone would say that observations are freely exchangeable within each animal, but not exchangeable between animals. As you have time series data then the observations are not really freely exchangeable within animal either. We can try to maintain the within-animal temporal dependence structure by using cyclic shift permutations: require(permute) shuffleSeries(1:10) [1] 4 5 6 7 8 9 10 1 2 3 t(replicate(10, shuffleSeries(1:10))) [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [1,] 1012345678 9 [2,]12345678910 [3,]3456789 101 2 [4,] 1012345678 9 [5,]789 1012345 6 [6,]89 10123456 7 [7,]9 101234567 8 [8,]3456789 101 2 [9,]456789 1012 3 [10,]56789 10123 4 cyclic shifts basically take a random starting point for the permuted timeseries and then keep sample in the order they appear in the data, bending the ends of the series round together into a circle. The above example: shuffleSeries(1:10) [1] 4 5 6 7 8 9 10 1 2 3 shows a single cyclic shift of the observations 1:10 in an equally spaced time series. Here the random starting point was observation 4 and note that we join the ends of the timeseries so after observation 10, we have observation 1. There are problems with this if there is a trend because you'll get a discontinuity when you join the two ends. As you only have two years of data, and include season as a fixed effect (main term in the model), you could assume that the residuals within animal might be free of temporal dependence. You can check that as I said by fitting the model and looking at the residuals within animal over time. Unless you are prepared to do some coding, the above discussion is moot as vegan doesn't possess the code to use these restricted permutations yet. If you want to do it yourself by hacking the anova.ccabyterm() function, take a look at the vignette that comes with the permute package for ideas on how to specify the permutation design you want and then generate the permutations using shuffle() or shuffleSet() instead of via permuted.index() that is used in vegan. Basically, with: mod - cca(food ~ season*sex, data) anova(mod, by = term, strata = data$animal) I guess you are allowing for a random intercept in animal only. For animal specific season and year effects you'll need to include animal and year in the model with appropriate interactions - we don't, as far as I know, have ways of dealing with more complex dependencies in the residuals via permutations built into vegan other than permuting within animal, or maintaining temporal structure within animal (if you hack the code yourself using functions from permute). HTH G thanks, René -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Permutations
On Fri, 2011-11-18 at 11:03 -0500, Gyanendra Pokharel wrote: Hi all, why factorial(150) shows the error out of range in 'gammafn'? I have to calculate the number of subset formed by 150 samples taking 10 at a time. How is this possible? best Do you mean: choose(150, 10) [1] 1.169554e+15 ?? G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] permutation within rows of a matrix
On Wed, 2011-11-16 at 14:55 -0800, Peter Ehlers wrote: I must be missing something. What's wrong with t(apply(mat, 1, sample)) ? Only missing that I am either[*] i) stupid, ii) being too clever, iii) down on my coffee intake for the day. G [*] delete as applicable any that don't apply. ;-) Peter Ehlers On 2011-11-16 12:12, Gavin Simpson wrote: On Wed, 2011-11-16 at 14:29 -0500, R. Michael Weylandt wrote: Suppose your matrix is called X. ? sample X[sample(nrow(X)),] That will shuffle the rows at random, not permute within the rows. Here is an alternative, first using one of my packages (permute - shameful promotion ;-) !: mat- matrix(sample(0:1, 100, replace = TRUE), ncol = 10) require(permute) perms- shuffleSet(10, nset = 10) ## permute mat t(sapply(seq_len(nrow(perms)), function(i, perms, mat) mat[i, perms[i,]], mat = mat, perms = perms)) If you don't want to use permute, then you can do this via standard R functions perms- t(replicate(nrow(mat), sample(ncol(mat ## permute mat t(sapply(seq_len(nrow(perms)), function(i, perms, mat) mat[i, perms[i,]], mat = mat, perms = perms)) HTH G Michael On Wed, Nov 16, 2011 at 11:45 AM, Juan Antonio Balbuenabalbu...@uv.es wrote: Hello This is probably a basic question but I am quite new to R. I need to permute elements within rows of a binary matrix, such as [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [1,]000010000 0 [2,]001100011 0 [3,]010000100 0 [4,]000000110 0 [5,]000100001 0 [6,]001100000 1 [7,]000000000 0 [8,]110100010 1 [9,]100101010 0 [10,]000000010 1 That is, elements within each row are permuted freely and independently from the other rows. I see that is is workable by creating a array for each row, performing sample and binding the arrays again, but I wonder whether there is a more efficient way of doing the trick. Any help will be much appreciated. -- View this message in context: http://r.789695.n4.nabble.com/permutation-within-rows-of-a-matrix-tp4076989p4076989.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] permutation within rows of a matrix
On Wed, 2011-11-16 at 14:29 -0500, R. Michael Weylandt wrote: Suppose your matrix is called X. ? sample X[sample(nrow(X)),] That will shuffle the rows at random, not permute within the rows. Here is an alternative, first using one of my packages (permute - shameful promotion ;-) !: mat - matrix(sample(0:1, 100, replace = TRUE), ncol = 10) require(permute) perms - shuffleSet(10, nset = 10) ## permute mat t(sapply(seq_len(nrow(perms)), function(i, perms, mat) mat[i, perms[i,]], mat = mat, perms = perms)) If you don't want to use permute, then you can do this via standard R functions perms - t(replicate(nrow(mat), sample(ncol(mat ## permute mat t(sapply(seq_len(nrow(perms)), function(i, perms, mat) mat[i, perms[i,]], mat = mat, perms = perms)) HTH G Michael On Wed, Nov 16, 2011 at 11:45 AM, Juan Antonio Balbuena balbu...@uv.es wrote: Hello This is probably a basic question but I am quite new to R. I need to permute elements within rows of a binary matrix, such as [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [1,]000010000 0 [2,]001100011 0 [3,]010000100 0 [4,]000000110 0 [5,]000100001 0 [6,]001100000 1 [7,]000000000 0 [8,]110100010 1 [9,]100101010 0 [10,]000000010 1 That is, elements within each row are permuted freely and independently from the other rows. I see that is is workable by creating a array for each row, performing sample and binding the arrays again, but I wonder whether there is a more efficient way of doing the trick. Any help will be much appreciated. -- View this message in context: http://r.789695.n4.nabble.com/permutation-within-rows-of-a-matrix-tp4076989p4076989.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ordination in vegan: what does downweight() do?
On Mon, 2011-11-07 at 10:24 -0800, kelsmith wrote: Can anyone point me in the right direction of figuring out what downweight() is doing? I am using vegan to perform CCA on diatom assemblage data. I have a lot of rare species, so I want to reduce the influence of rare species in my CCA. I have read that some authors reduce rare species by only including species with an abundance of at least 1% in at least one sample (other authors use 5% as a rule, but this removes at least half my species). That is not what downweight() is for. If you want this sort of selection, see chooseTaxa() in my analogue package diat.sel - chooseTaxa(diatoms, max.abun = 1, type = AND) or, if proportions not percent diat.sel - chooseTaxa(diatoms, max.abun = 0.01, type = AND) This sort of indexing is trivial (but I made a typo in the current version so type = OR won't work) so you can study the code of analogue:::chooseTaxa.default once I fix the CRAN version or on R-Forge now: https://r-forge.r-project.org/scm/viewvc.php/pkg/R/chooseTaxa.R?view=markuproot=analogue Jari has addressed the other part of your question. Jari also mentioned the issue about whether you should be removing or downweighting rare species. Many people, especially diatomists, do this for practical purposes in a routine fashion because their data sets are especially speciose and have a large proportion of low abundance taxa. As a general matter of routine practice, I don't think this is a very good way of working, especially as we have no good ecological grounds for doing so and who knows what information these species could be telling us if we just listened to them instead of discarding them. HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] pdIdent in smoothing regression model
On Sun, 2011-10-09 at 10:25 -0400, Lei Liu wrote: Hi there, I am reading the 2004 paper Smoothing with mixed model software in Journal of Statistical Software, by Ngo and Wand. I tried to run their first example in Section 2.1 using R but I had some problems. Here is the code: I'm not going to try to fix this nor work out how to do it via lme(), but just wanted to note that you can use mgcv:::gam() for example to fit something very similar. require(SemiPar) require(mgcv) data(fossil) mod - gam(strontium.ratio ~ s(age), data = fossil, method = REML) That is treating the smoothness terms as random effects. Also note that Matt Wand wrote the SemiPar package as support software for the book Semiparametric Regression, where these example data are taken from: library(nlme) fossil - read.table(fossil.dat,header=T) x - fossil$age y - 10*fossil$strontium.ratio knots - seq(94,121,length=25) n - length(x) X - cbind(rep(1,n),x) Z - outer(x,knots,-) Z - Z*(Z0) fit - lme(y~-1+X,random=pdIdent(~-1+Z)) When I ran the code fit - lme(y~-1+X,random=pdIdent(~-1+Z)) I got an error message: Error in getGroups.data.frame(dataMix, groups) : Invalid formula for groups I was really puzzled. I asked Dr. Ngo and he said they did it in S-plus but not R. Does anyone knows how to do it in R? Thanks! Lei Liu Associate Professor Division of Biostatistics Department of Public Health Sciences University of Virginia School of Medicine http://people.virginia.edu/~ll9f/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] pdIdent in smoothing regression model
On Sun, 2011-10-09 at 10:25 -0400, Lei Liu wrote: Hi there, I am reading the 2004 paper Smoothing with mixed model software in Journal of Statistical Software, by Ngo and Wand. I tried to run their first example in Section 2.1 using R but I had some problems. Here is the code: [whoops - fingers slipped and sent the message prematurely] I'm not going to try to fix this nor work out how to do it via lme(), but just wanted to note that you can use mgcv:::gam() for example to fit something very similar. require(SemiPar) require(mgcv) data(fossil) mod - gam(strontium.ratio ~ s(age), data = fossil, method = REML) That is treating the smoothness terms as random effects. Also note that Matt Wand wrote the SemiPar package as support software for the book Semiparametric Regression, where these example data are taken from: http://cran.r-project.org/web/packages/SemiPar/index.html You could use the function contained therein. The above presumes you weren't wanting to make the example work for some pedagogic reason. HTH G library(nlme) fossil - read.table(fossil.dat,header=T) x - fossil$age y - 10*fossil$strontium.ratio knots - seq(94,121,length=25) n - length(x) X - cbind(rep(1,n),x) Z - outer(x,knots,-) Z - Z*(Z0) fit - lme(y~-1+X,random=pdIdent(~-1+Z)) When I ran the code fit - lme(y~-1+X,random=pdIdent(~-1+Z)) I got an error message: Error in getGroups.data.frame(dataMix, groups) : Invalid formula for groups I was really puzzled. I asked Dr. Ngo and he said they did it in S-plus but not R. Does anyone knows how to do it in R? Thanks! Lei Liu Associate Professor Division of Biostatistics Department of Public Health Sciences University of Virginia School of Medicine http://people.virginia.edu/~ll9f/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting issues with PCA
On Mon, 2011-10-17 at 21:02 +1000, Andrew Halford wrote: Hi Listers, This has a simple answer but it has been eluding me nonetheless. ordiellipse() doesn't work the way you think it does - it can only take a single colour at a time. Therefore you need to do /n/ calls to ordiellipse() to draw /n/ ellipses. Here is a *reproducible* example, taken from ?ordiellipse: require(vegan) data(dune) data(dune.env) mod - cca(dune ~ Management, dune.env) plot(mod, type=n, display = sites) text(mod, display=sites, labels = as.character(Management)) ## vector of colours cols - c(blue,red,darkgreen,grey70) ## add ellipses with(dune.env, ordiellipse(mod, Management, kind=se, conf=0.95, lwd=2, col=cols[1], show.groups = BF)) with(dune.env, ordiellipse(mod, Management, kind=se, conf=0.95, lwd=2, col=cols[2], show.groups = HF)) with(dune.env, ordiellipse(mod, Management, kind=se, conf=0.95, lwd=2, col=cols[3], show.groups = NM)) with(dune.env, ordiellipse(mod, Management, kind=se, conf=0.95, lwd=2, col=cols[4], show.groups = SF)) We could automate this a bit: ## set up plotting region plot(mod, type=n, display = sites) text(mod, display=sites, labels = as.character(Management)) ## get the levels of the factor for plotting groups lev - with(dune.env, levels(Management)) ## vector of colours cols - c(blue,red,darkgreen,grey70) ## loop to draw each group for (i in seq_along(lev)) { with(dune.env, ordiellipse(mod, Management, kind=se, conf=0.95, lwd=2, col=cols[i], ## ith colour show.groups = lev[i])) ## for ith group } This works with rda() too. HTH G I have been building a PCA plot from scratch with the ability to plot predefined groups in different colors. This has worked fine but when I try to get a polygon drawn around each of the groups it is not recognising my colour file correctly and is only printing the first colour in the filecode is below site.codings - c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,5,4,4,4,4,4,3,3,3,6,6,6,6,6,6,6,6,5,5,5,5) names(site.codings) - c( WM1, WM2, WM3, NM1, NM2, NM3, KH1, KH2, KH3, LM1 ,LM2 ,LM3, DB1 ,DB2 ,DB3, DM1 , DM2 , DM3 , FI1, FI2, BKI1, BKI2, BKO1, BKO2, BKO3, SUR1,MI1,MI2,MI3,BHE1,BHE2,BHE3,BHW1,BHW2,BHW3,HAL1,HAL2,HAL3,HAL4,HAL5,HAL6,HAL7,DOH1,DOH2,DOH3,DOH4,DOH5) fish.pca -rda(fish.sqrt.h) fish.site - scores(fish.pca,display=sites,scaling=3) fish.spp - scores(fish.pca,display=species,scaling=3)[omanfish.mrt.indval$pval=0.05,] graph - plot(fish.pca,display=c(sites,species),type=n,scaling=3) plotcolor - c(red,green,blue,aquamarine,magenta,yellow)[site.codings] points(fish.site,pch=21,bg=plotcolor,cex=1.2) #up to this point all works well but when I try to draw the polygons I cant get the lines to colour code the same way as the points did ordiellipse(graph,site.codings,kind=sd,conf=0.90,draw=polygon) I see there is a command called show.groups but I cant work out how to use it to access the plotcolor file. Any help appreciated. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] binomial GLM quasi separation
On Sat, 2011-10-15 at 09:11 -0700, lincoln wrote: #Uwe: snip / #Gavin: I have read carefully your thread but I am not sure to understand what you are suggesting (my gaps in statistics!). You say that it should be due to the /Hauck Donner/ effect and that it is not a quasi separation or separation issue. Even though, I am still unsure to understand why I found such a high asymptotic standard error. I don't believe this is a separation issue - the sorts of things we'd expect to see if this were due to separation do not show up. Given the large estimate for the coefficient for the term it is not that surprising that the associated uncertainty is also high: set.seed(2) var(runif(100)) [1] 0.08911998 set.seed(2) var(runif(100)*1) [1] 8911998 All I did there was increase the units of the data in the second example and the variance is huge, but only because the data were expressed in units 1 times bigger than the first example. In the same way, the coefficient estimate is large so it's standard error is also large; the question one needs to ask is, is the estimate of the coefficient for hcp bounded away from zero, given the uncertainty in the estimate. If you were to produce a profile confidence interval it too would be large. So you have a large estimate, which is somewhat uncertain. Given that the slope of the log likelihood is low at the estimate and quite different from the slope at \beta == 0, it is not unreasonable to assume that the Hauck Donner effect might be present... Anyway, how should I consider this result? Should I find another way to analyze this process or I could consider this as correct? ...however, in the case of the snippet of data you showed, it doesn't affect the result - on the basis of the Wald test you would still accept that hcp is significant/important. The Hauck Donner effect might be leading to a lower value of the test statistic, but it hasn't affected the outcome of the test. To check, fit the model with and without hcp and then use the anova() function to compare the two models. This will do a likelihood ratio test. If I am understanding this enough, this warning message Possibly, but it could just be that the fitted probabilities really are 0 or 1. and the very high estimates should be due to /Hauck-Donner/. Regarding that reference to Venables and Ripley (2002) on this issue, I have found this ( http://kups.ku.edu/maillist/classes/ps707/2005/msg00023.html Hauck-Donner ) where it is said that The practical advice, then, is to run the model with all of the variables, and then run again with the questionable one removed, and conduct a likelihood ratio test./ and I suppose that the p-values for hcp should be the LRT p-value, isn't it? Yes. Well it is the result of applying a likelihood ratio test. I don't think there is such a thing as *the* p-value for a term in a model, just different ways of computing *a* p-value. In this case, what does it matter? If the Wald test is *under*estimating z but the term *is* still significant, the LRT should only confirm this and give an even lower p-value than the already very low one. Thanks for taking your time to help me in this. Would it hurt you to reply via an email? Regardless of what Nabble thinks, R-help is a mailing list and your *posts* keep on removing all the context - I have to keep on hunting for the thread in the archives just to keep track of what you have told us. G Simone -- View this message in context: http://r.789695.n4.nabble.com/binomial-GLM-quasi-separation-tp3901687p3907716.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What does \Sexpr[results=rd]{} exactly mean in Rd?
On Sun, 2011-10-16 at 19:36 -0500, Yihui Xie wrote: Hi, I have spent a few hours on the R-exts manual and the documentation of parse_Rd() (as well as the PDF document in the references), but I still have not figured out what results=rd means. I thought I could use an R code fragment to create an Rd fragment dynamically. Here is an example, in which I was expected the output to be a describe list DL in HTML, but it turns out not to be true. Perhaps best not to cross post to several internet resources at once. I replied to the same Q on StackOverflow: http://stackoverflow.com/q/7788628/429846 Suffice it to say that your example works for me with 2.13.1 (still need to compile 2.13.2 on my workstation). I left some additional comments and examples, which might help understand this. I had trouble when I first started playing this and didn't pursue further, but I think I am starting to understand how to use this now after taking a look when I tried to answer your Q. G (I was actually building a package with Rd's containing \Sexpr{} instead of really using Rd2HTML(); the content was not rendered after I run R CMD build.) des - \\describe{\\item{def}{ghi}} con - textConnection(c(\\title{abc}\\name{abc}, \\details{\\Sexpr[results=rd,stage=build]{des}})) z - parse_Rd(con) Rd2HTML(z, stages = build) close(con) !DOCTYPE html PUBLIC -//W3C//DTD HTML 4.01 Transitional//EN htmlheadtitleR: abc/title meta http-equiv=Content-Type content=text/html; charset=utf-8 link rel=stylesheet type=text/css href=R.css /headbody table width=100% summary=page for abctrtdabc/tdtd align=rightR Documentation/td/tr/table h2abc/h2 h3Details/h3 pdefghi/p /body/html sessionInfo() R version 2.13.2 (2011-09-30) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] tools stats graphics grDevices utils datasets methods [8] base other attached packages: [1] devtools_0.4 loaded via a namespace (and not attached): [1] RCurl_1.6-10 Thanks! Regards, Yihui -- Yihui Xie xieyi...@gmail.com Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] binomial GLM quasi separation
(sex~twp+hwp+hcp+hnp,binomial,data=dati) Mensajes de aviso perdidos glm.fit: fitted probabilities numerically 0 or 1 occurred pp-profileModel(glm.sat,quantile=qchisq(0.95,1),objective=ordinaryDeviance,which=4) Preliminary iteration . Done Profiling for parameter hcp ... Done pp1-profileModel(glm.sat,quantile=qchisq(0.95,1),objective=ordinaryDeviance,which=4,stepsize=20) Preliminary iteration . Done Profiling for parameter hcp ... Done pp2-profileModel(glm.sat,quantile=qchisq(0.95,1),objective=ordinaryDeviance,which=4,stepsize=100) Preliminary iteration . Done Profiling for parameter hcp ... Done plot(pp) mtext(Default stepsize,adj=0,cex=2,line=1) dev.new() plot(pp) mtext(Stepsize=20,adj=0,cex=2,line=1) dev.new() plot(pp) mtext(Stepsize=100,adj=0,cex=2,line=1) And these are the plots as they look like: http://r.789695.n4.nabble.com/file/n3904261/plot1.png http://r.789695.n4.nabble.com/file/n3904261/plot2.png http://r.789695.n4.nabble.com/file/n3904261/plot3.png I have tried to understand what is going on but I don't know how to interpret this. It's quite a long time that I am trying to solve this but I have not been able to. Here ( http://r.789695.n4.nabble.com/file/n3904261/simone.txt simone.txt ) I attach a subset of the data I am working with that comprises the variables specified in the above glm model and by the way the funky variable called hcp. Thank you for take your time to help me in this. -- View this message in context: http://r.789695.n4.nabble.com/binomial-GLM-quasi-separation-tp3901687p3904261.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Question about GAMs
On Thu, 2011-10-13 at 08:20 -0700, pigpigmeow wrote: I'm confused... I type .. predict.gam(ozonea,type=s(ratio,bs=cr)) That is not a valid 'type'; normally you'd use `type = terms` or `type = iterms`, depending on whether you want (any) standard errors to include the uncertainty about the intercept. pressure maxtemp s(avetemp) s(ratio) 1 -0.0459102290 -0.185178463 0.263358446 -0.164558673 2 -0.0286464652 -0.194731320 0.199315027 0.727823293 30.0478073459 -0.013227033 0.002228896 0.342373202 4 -0.0089164494 0.082301539 -0.037331159 -0.067260889 50.0675373617 0.024984396 -0.047067786 -0.357569389 60.0823348735 0.101407254 -0.075884852 -1.485036738 7 -0.0977015204 0.177830112 -0.094755158 0.236575309 8 -0.0903027645 0.225594398 -0.113346667 0.435141242 90.0206785742 0.187382969 -0.066346157 -0.256133513 10 -0.1371615520 0.101407254 -0.131656887 0.145057584 11 -0.0477674066 -0.181001505 0.260279546 0.180513043 12 -0.0921599421 -0.009050075 -0.020511366 0.281470433 13 0.0681464361 -0.219212934 0.335348247 0.270813178 .. I want to show s(ratio,bs=cr) term, and show the warning message Warning message: In predict.gam(ozonea, type = s(ratio, bs = cr)) : Unknown type, reset to terms. I don't understand what does it mean. So if the above comments I made are not sufficient to explain this, read ?predict.gam and see what the allowed options for the 'type' argument are. G By the way, i use log-link function, should I convert log-link function of newozone to fitted value of newozone? 1. log(newozone) - s(ratio,bs=cr) = x then exp(x) 2. exp(newozone) - s(ratio,bs=cr) =X then x which one is correct? so confused -- View this message in context: http://r.789695.n4.nabble.com/Question-about-GAMs-tp3900848p3901842.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Vegan: Anova.CCA accessing original data using option by=margin
On Mon, 2011-10-10 at 23:51 -0700, Steve Pawson wrote: Hello, I am attempting to use the ANOVA.CCA function with the by=margin option. The process works fine using the by=terms option and I note in the Vegan manual that Jari suggests that an error may occur if the anova does not have access to the data on the original constraints. This is the error that I get: Error in dimnames(x) - dn : length of 'dimnames' [2] not equal to array extent My question is, does anyone know if this error relates to what Jari is referring to (or is it a different problem), and if it is, how do I link the anova to the original constraints? It is almost impossible to answer that without a lot more information. For starters, what does traceback() say when run immediately *after* you get the error? G Many thanks for any help provided. Regards Steve -- View this message in context: http://r.789695.n4.nabble.com/Vegan-Anova-CCA-accessing-original-data-using-option-by-margin-tp3893005p3893005.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] About stepwise regression problem
On Thu, 2011-10-06 at 20:12 -0700, pigpigmeow wrote: chris, I'm not using lmer, i just use gam mixed with smoothing function and linear function snip / I have the following question, 1.if I reject the variable term which has greater the p-value no matter the variable term is smoothing term or linear term, is it correct to perform stepwise regression. 2. In my model noxd-gam(newNOX~pressure+maxtemp+s(avetemp,bs=cr)+s(mintemp,bs=cr)+s(RH,bs=cr)+s(solar,bs=cr)+s(windspeed,bs=cr)+s(transport,bs=cr),family=gaussian (link=log),groupD,methods=REML) , is it generalized additive mixed model? That is not a GAMM. It is a GAM. 3. what the different if I use other criteria such as AIC or BIC? Don't do what you are doing. Heed Frank's advice that stepwise selection without shrinkage is invalid. For feature selection in GAMs, use the select = TRUE argument to gam() to turn on an addition penalty that can shrink terms out of the model, hence doing feature selection for you. HTH G Anyway, thank all of you! -- View this message in context: http://r.789695.n4.nabble.com/About-stepwise-regression-problem-tp3870217p3880835.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adonis and nmds help and questions for a novice.
On Tue, 2011-10-04 at 08:45 +, Ashley Houlden wrote: Hi, forgive me if someone has already posted about this but I have had a look and cannot find the answer, also I am very new to R and been getting the grips with this. I have been trying to use Adonis to find out if there are significant difference between groups on data that I have analyses with NMDS, and have been struggling with getting this to work and understanding what is going on. I am looking at diversity in different soils with either woodland or grassland habitats. I have run the scripts library(vegan) library(ecodist) library(MASS) mydata - read.table(ash_data.csv, header=TRUE, sep=,, row.names=Site) envdata_fit - read.table(ash_env.csv, header=TRUE, sep=,, row.names=Site) #distance matrix of samples using bray curtis d= bcdist(mydata, rmzero=FALSE) And then using the distance matrix from this to use for adoins? Is this correct. With this I have then run Adonis results = adonis(d ~ wood, envdata_fit, permutations = 1000) and get significant values to see if sig diff in diversity between wood and grass habitat. However I have been reading about combining the variables, but there seems to be different ways for example results = adonis(d ~ wood+soil, envdata_fit, permutations = 1000) so get sig values for Wood and soil or results = adonis(d ~ wood*soil, envdata_fit, permutations = 1000) And I get sig values for wood, soil, and wood soil interaction. This seems to make sense, however for both if I put the variable the other way around (soil+wood or soil*wood) I get very different sig values, even accounting for the fact they vary slightly due to the permutations. So whats is going on and why to the the values change so much? You can isolate the effects due to different permutations being used by setting a seed via set.seed(). As ?adonis says, sequential sums of squares are used. If there is imbalance in your design it isn't surprising that the results are not invariant to the ordering of terms in the formula. I was also wondering in Adonis, can you nest treatments, so see effect of soil removing the effect of woodland as you can with anova? Not 100% sure what you mean by nested, but adonis() uses the full functionality of R's formula interface. See The R manual for details or ?formula. ?adonis also has details of how you might test a nested design in the Details section - this might not be what you want but it does allow you to test for an effect of one variable by conditioning the permutations on another. Another general questions as well, if I have more than two groups in a treatment, say for soil, clay, sand, loam and do the stats, and I get a significant value, what does it actually mean, is it that soil generally has an effect, with each group separate, or there are general differences between soils which may be one group is very different to the other two? The permutation test, test at the level of the factor, not pairwise comparisons of the levels within the factor. So you get information on Soil, not on Clay, Sand, Loam levels. This is the same as you would get if you did anova(mod) where mod was a linear model with a factor predictor. betadisper() the sister function to adonis() which tests for differences of multivariate dispersions, not differences of multivariate means, does allow the sorts of pairwise tests you are thinking of, but we haven't implemented this in adonis yet I'm afraid. HTH G Many many thanks to anyone who can help me as I have asked people who use R near me and no-one is sure and uses Adonis.. Ash __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adonis and nmds help and questions for a novice.
On Tue, 2011-10-04 at 08:45 +, Ashley Houlden wrote: Hi, snip / #distance matrix of samples using bray curtis d= bcdist(mydata, rmzero=FALSE) In addition, you don't necessarily need ecodist for the bray curtis distance. vegdist() in vegan will compute this for you. Not that there is anything wrong with ecodist I hasten to add - just that you can do this all in vegan if you wanted. G And then using the distance matrix from this to use for adoins? Is this correct. With this I have then run Adonis results = adonis(d ~ wood, envdata_fit, permutations = 1000) and get significant values to see if sig diff in diversity between wood and grass habitat. However I have been reading about combining the variables, but there seems to be different ways for example results = adonis(d ~ wood+soil, envdata_fit, permutations = 1000) so get sig values for Wood and soil or results = adonis(d ~ wood*soil, envdata_fit, permutations = 1000) And I get sig values for wood, soil, and wood soil interaction. This seems to make sense, however for both if I put the variable the other way around (soil+wood or soil*wood) I get very different sig values, even accounting for the fact they vary slightly due to the permutations. So whats is going on and why to the the values change so much? I was also wondering in Adonis, can you nest treatments, so see effect of soil removing the effect of woodland as you can with anova? Another general questions as well, if I have more than two groups in a treatment, say for soil, clay, sand, loam and do the stats, and I get a significant value, what does it actually mean, is it that soil generally has an effect, with each group separate, or there are general differences between soils which may be one group is very different to the other two? Many many thanks to anyone who can help me as I have asked people who use R near me and no-one is sure and uses Adonis.. Ash No virus found in this message. Checked by AVG - www.avg.comhttp://www.avg.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interaction plot type=o
On Fri, 2011-09-30 at 12:33 -0400, David Winsemius wrote: On Sep 30, 2011, at 2:16 AM, Petr PIKAL wrote: David, thank you for your reply I tried this attach(mtcars) interaction.plot(cyl, gear, mpg, type=o, pch=5:8, lty=1 ) but I got this error: Error in match.arg(type) : 'arg' should be one of l, p, b and in ?interaction.plot, o it is not listed in type arguments. Is there any other way to force it to take the argument? As David suggested you need to change code for interaction.plot. If you write interaction.plot you will get the code interaction.plot function (x.factor, trace.factor, response, fun = mean, type = c(l, p, b), legend = TRUE, trace.label = deparse(substitute(trace.factor)), fixed = FALSE, xlab = deparse(substitute(x.factor)), ylab = ylabel, ylim = range(cells, na.rm = TRUE), lty = nc:1, col = 1, pch = c(1L:9, 0, letters), xpd = NULL, leg.bg = par(bg), leg.bty = n, xtick = FALSE, xaxt = par(xaxt), axes = TRUE, ...) {... Copy it into some suitable text editor (not Word please) and change it according to your wish. Or use `fix()` or `fixInNamespace()` if this is just a one off. G You can do it at the console on both the OS versions of R I have used. just copy the code and paste. Add the - and the ,o and hit enter. Piece of cake. I would start with adding o to type in function definition and see how it behaves. I tested it. Worked as expected. Then you can copy the whole code to your modified function e.g. my.int.plot - function(x, and call my.int.plot(cyl, gear, mpg, type=o, pch=5:8, lty=1 ) Regards Petr Thanks Claudio On Thu, Sep 29, 2011 at 9:00 PM, David Winsemius dwinsem...@comcast.netwrote: On Sep 29, 2011, at 7:22 PM, Heverkuhn Heverkuhn wrote: Hello, I was wondering if there is any equivalent of interaction.plot that allow you to set type=o I tried to use interaction.plot and I have a gap between the symbols of the points and the line. If it's OK to have the lines going right though the symbols, then go ahead, hack the code. All you need to do is add ,o to the type arguments in the argument list. The code's not hidden or anything that gets in your way. David Winsemius, MD West Hartford, CT David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] GAMs in R : How to put the new data into the model?
On Wed, 2011-09-28 at 02:10 -0700, pigpigmeow wrote: For example: GAMs and after stepwise regression: You probably don't want to be doing stepwise model/feature selection in any regression model. Marra Wood (2011, Computational Statistics and Data Analysis 55; 2372-2387) show results that suggest adding an extra penalty term during model fitting, that allows terms to be penalised out of the model, performs well for feature selection in a series of GAM exercises. You can turn this on in mgcv::gam() via the `select` argument by setting it to `TRUE`. cod-gam(newCO~RH+s(solar,bs=cr)+windspeed+s(transport,bs=cr),family=gaussian (link=log),groupD,methods=REML) I used 10 year meterorology data (2000-2010) to form equation of concentration of carbon monoxide. NOW, I have 2011 meteorology data, I want to use the above GAMs to get the predict value of concentration of carbon monoxide. So put your 2011 data into a data frame with (at the minimum) the column names: `RH`, `solar`, `windspeed`, `transport` If that data frame were called, say, `dat2011` then you provide the `predict()` method for gam objects with both the fitted model **and** the new data (`dat2011`), e.g.: pred - predict(cod, newdata = dat2011) See ?predict.gam for more details. You are probably going to need type = response so you get values back on the scale of the response not the the link function. HTH G How can I put the 2011 data into this gam and get the expected value of concentration of Carbon monoxide?? -- View this message in context: http://r.789695.n4.nabble.com/GAMs-in-R-How-to-put-the-new-data-into-the-model-tp3849851p3850473.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] metaMDS
On Fri, 2011-09-23 at 12:43 -0500, Jean V Adams wrote: Lineth Contreras wrote on 09/23/2011 11:35:10 AM: Hello R-user community, I am applying the function metaMDS. However, I would like to know if there is any option to export the data I got from the axis as a data frame. I have tried as.data.frame.list but is not working. Any suggestion? Thank you in advance for your help, Lineth When you say the data I got from the axis do you mean the coordinates contained in the $points of the resulting object? If so, something like this should work (using the example provide in ?metaMDS): You would be better off with the scores() method for metaMDS objects: data(dune) sol - metaMDS(dune) scrs - scores(sol) `scrs` is a matrix: class(scrs) [1] matrix which can be exported via say `write.csv()`: write.csv(scrs, filenames.csv) If you want a data frame in R, then SCRS - as.data.frame(scrs) will work, but there is little reason to convert to a data frame just to export it out of R. HTH G data(dune) library(MASS) sol - metaMDS(dune) df - as.data.frame(sol$points) Jean [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with predict.gam in package gam
On Wed, 2011-09-14 at 18:06 -0500, Drew Tyre wrote: I am trying to reproduce plots in Chapter 3 of Zuur et al Mixed Effects models and extensions in Ecology. For pedagogical reasons, they make a series of plots with gam(...) in package gam. I encounter errors that trace back to the predict.gam method. Below is a repeatable example using one of the example datasets: The code works for me with: sessionInfo() R version 2.13.1 Patched (2011-07-08 r56332) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_GB.UTF-8LC_COLLATE=en_GB.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_GB.UTF-8 [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] splines grid stats graphics grDevices utils datasets [8] methods base other attached packages: [1] gam_1.04 analogue_0.7-4 princurve_1.1-10 MASS_7.3-13 [5] lattice_0.19-23 vegan_1.17-12 loaded via a namespace (and not attached): [1] tools_2.13.1 G library(gam) data(kyphosis) test = gam(Kyphosis ~ s(Age,4) + Number, family = binomial, data=kyphosis) predict(test) # works predict(test,se.fit=TRUE) # fails #Error in dim(data) - dim : attempt to set an attribute on NULL # after debugging, it wants a model frame ... test = gam(Kyphosis ~ s(Age,4) + Number, family = binomial, data=kyphosis, model=TRUE) predict(test,se.fit=TRUE) # fails, but with a different error # Error in terms.default(object, data = data) : # no terms component nor attribute I am using R 2.13.1 on a PC inside RStudio. I realize I can do this in package mgcv, but just curious as to whether this is something known and fixable? Thanks -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mvpart analyses with covariables
On Tue, 2011-09-13 at 13:46 +0800, 'Ben Ford' wrote: Hi all, I am fairly new to R and I am trying to run mvpart and create a MRT using explanatory variables and covariables. I've been following the procedures in Numerical Ecoogy with R. The command (no covariables) which works fine - ABUNDTMRT - mvpart(abundance ~ .,factors,margin=0.08,cp=0,xv=1se,xval=nrow(abundance),xvmult=100,which=4) where abundance is 4th root transformed fish abundance (103 species x 168 samples), and factors is the relief (high, medium, low profile, sand inundated reef, flat), benthos (coral, sessile inverts, kelp, macroalgae, seagrass, sand), depth (continuous in meters), latitude, and longitude of each sample. To try and incorporate spatial autocorrelation (as a covariate) into this I have been trying the command - ABUNDTMRT - mvpart(abundance ~ environ + spatial, data.frame,margin=0.08,cp=0,xv=1se,xval=nrow(abundance),xvmult=100,which=4) I really don't think you can do that, and even if you could it is not a good thing to do. Arrange your variables (explanatory or covariables) into a single data frame, then, if this data frame contains only variable that go into the model (explanatory or covariables), you can use the `.` notation: ABUNDTMRT - mvpart(abundance ~ ., data = my.data.frame, ...) However, if you are using covariable in the sense of Canoco, then I think you are out of luck. There isn't a way to partial out the spatial effects and fit a model using the non-spatial components of the environmental/explanatory variables. I wonder where you got the impression that this could be done? Of course, you can do this (sort of) if you are prepared to do a bit of work; i) ordinate the abundance data with spatial covariables as constraints and take the residuals (all site scores in unconstrained axes), ii) do the the same thing with the environment so that you residualise the explanatory variables, iii) take the residualised species data and the residualised explanatory matrix and use them as inputs to mvpart(). There may be infelicities here if you need to use different ordination techniques to residualise the species and the explanatory variables matrices... HTH G where abundance is as above, environ is the environmental factors (from above) and spatial is the eigenfunctions from a PCNM analysis. data.frame is the environ and spatial factors as a data.frame. This gives the error - Error in `[[-.data.frame`(`*tmp*`, preds, value = c(72L, 72L, 80L, 72L, : replacement has 504 rows, data has 168 As I am new to this, I am not sure if I am entering an incorrect formula when trying to include the covariables, or if this is just something which mvpart cannot do. Thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] On-line machine learning packages?
guidehttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ r-h...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] envfit vector labels with ordiplot3d
On Tue, 2011-09-13 at 05:18 -0700, Briony wrote: It *really* does help to read the documentation for functions. Like most of the functions in vegan that extract information from ordinations, ordilabel() extracts information for axes 1 and 2 only, by default. It also helps if I read the docs too! The reason this was in 2d and will always be in 2d is that that is how scatterplot3d works - the arrow head in 2d on the device has been placed at the 2d translation of the original 3d coordinates. It might help to push the label away from the arrow head a little bit: ordilabel(pl$arrows * 1.075) will push the label 7.5% away from the arrow head. It seems that that is the best we can currently do with vegan. When you say that this is not what you wanted, what did you want to achieve? G Does ordilabel(pl$arrows, choices = 1:3) work for you? G Thanks for the suggestion. Unfortunately that didn't work - it returned an error (Error in scores.default(x, choices = choices, display = display, ...) : subscript out of bounds). The ordilabel help says ordilabel(x, display, labels, choices = c(1, 2)... which seems to suggest the function might not allow a third axis at the moment? Cheers, Briony On Tue, Sep 13, 2011 at 2:53 AM, Gavin Simpson [via R] ml-node+s789695n3807984...@n4.nabble.com wrote: On Mon, 2011-09-12 at 03:24 -0700, Briony wrote: Thank you very much for the suggestion. And while I'm here, thank you for vegan and the documentation that goes with it. Function ordiplot3d uses scatterplot3d, and it returns also all scatterplot3d items, like functions xyz.converter and points3d that can be used for tuning labels. I tried ordilabel(pl$arrows) but the labels only seem to be in two dimensions. It *really* does help to read the documentation for functions. Like most of the functions in vegan that extract information from ordinations, ordilabel() extracts information for axes 1 and 2 only, by default. Does ordilabel(pl$arrows, choices = 1:3) work for you? G With ordixyplot I can see no other choice than that you edit the function and preferably contribute your edited function to vegan (and will be credited with the function help). I'm not a skilled enough user of R to edit the ordixyplot function - so I'll pass on that invitation to anyone else who reads this thread? Thanks again, Briony Briony brionynorton at gmail.com writes: Hi R experts, I'm looking for some help with plotting vectors from envfit in vegan, onto a 3d plot using ordiplot3d. So far I have data.mds - metaMDS(data, k=3,trace = FALSE) vect_data-envfit(data.mds,vegdata[,3:21],choices=1:3,permu=) ordiplot3d(data.mds,envfit=vect_data) ordixyplot(data.mds,pch=pts,envfit=vect_data) (my data's not really called data, I thought it might be easier to communicate this way) These display the vectors as arrows, but what I would really like is for the arrows to be labelled, like what comes up automatically in ordirgl or with a 2D ordiplot. I've gone through the help and tried everything I can work out, but I must be missing something important, because nothing's worked so far. I would be happy to use ordixyplot and show a series of 2D plots, but I can't get labels on those arrows either. Any pointers in the right direction would be gratefully received. Briony Briony, There really is no way to do this automatically, but if someone fixes the functions, we are happy to incorporate those changes in vegan. You may be able to achieve something like that with ordiplot3d, but I am not sure it looks completely satisfactory. Function ordiplot3d returns invisibly the plotting object which contains, among other items. the coordinates of arrow heads in the flattened graph. So this could work: pl - ordiplot3d(data.mds,envfit=vect_data) ordilabel(pl$arrows) Function ordiplot3d uses scatterplot3d, and it returns also all scatterplot3d items, like functions xyz.converter and points3d that can be used for tuning labels. With ordixyplot I can see no other choice than that you edit the function and preferably contribute your edited function to vegan (and will be credited with the function help). Cheers, Jari Oksanen __ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://r.789695.n4.nabble.com/envfit-vector-labels-with-ordiplot3d-tp3800669p3807015.html Sent from the R help mailing list archive at Nabble.com. __ [hidden email
Re: [R] envfit vector labels with ordiplot3d
On Mon, 2011-09-12 at 03:24 -0700, Briony wrote: Thank you very much for the suggestion. And while I'm here, thank you for vegan and the documentation that goes with it. Function ordiplot3d uses scatterplot3d, and it returns also all scatterplot3d items, like functions xyz.converter and points3d that can be used for tuning labels. I tried ordilabel(pl$arrows) but the labels only seem to be in two dimensions. It *really* does help to read the documentation for functions. Like most of the functions in vegan that extract information from ordinations, ordilabel() extracts information for axes 1 and 2 only, by default. Does ordilabel(pl$arrows, choices = 1:3) work for you? G With ordixyplot I can see no other choice than that you edit the function and preferably contribute your edited function to vegan (and will be credited with the function help). I'm not a skilled enough user of R to edit the ordixyplot function - so I'll pass on that invitation to anyone else who reads this thread? Thanks again, Briony Briony brionynorton at gmail.com writes: Hi R experts, I'm looking for some help with plotting vectors from envfit in vegan, onto a 3d plot using ordiplot3d. So far I have data.mds - metaMDS(data, k=3,trace = FALSE) vect_data-envfit(data.mds,vegdata[,3:21],choices=1:3,permu=) ordiplot3d(data.mds,envfit=vect_data) ordixyplot(data.mds,pch=pts,envfit=vect_data) (my data's not really called data, I thought it might be easier to communicate this way) These display the vectors as arrows, but what I would really like is for the arrows to be labelled, like what comes up automatically in ordirgl or with a 2D ordiplot. I've gone through the help and tried everything I can work out, but I must be missing something important, because nothing's worked so far. I would be happy to use ordixyplot and show a series of 2D plots, but I can't get labels on those arrows either. Any pointers in the right direction would be gratefully received. Briony Briony, There really is no way to do this automatically, but if someone fixes the functions, we are happy to incorporate those changes in vegan. You may be able to achieve something like that with ordiplot3d, but I am not sure it looks completely satisfactory. Function ordiplot3d returns invisibly the plotting object which contains, among other items. the coordinates of arrow heads in the flattened graph. So this could work: pl - ordiplot3d(data.mds,envfit=vect_data) ordilabel(pl$arrows) Function ordiplot3d uses scatterplot3d, and it returns also all scatterplot3d items, like functions xyz.converter and points3d that can be used for tuning labels. With ordixyplot I can see no other choice than that you edit the function and preferably contribute your edited function to vegan (and will be credited with the function help). Cheers, Jari Oksanen __ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://r.789695.n4.nabble.com/envfit-vector-labels-with-ordiplot3d-tp3800669p3807015.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] packfor
On Wed, 2011-08-17 at 09:49 +0200, COCCIA , CRISTINA wrote: Good morning all, I'm trying to find the package packfor to install on my library in R, but I'm not available to find it online, so I would ask to you if you please let e know how I could find it and if I need a special R version to do it. packfor is part of the *sedar* project on R-forge. You can get a binary or sources from: http://r-forge.r-project.org/R/?group_id=195 HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Symbol Font Baseline, Cairo, Card Symbols
On Tue, 2011-08-16 at 17:46 -0700, ivo welch wrote: I think I found a bug in the Cairo library, plus weird behavior in both the Cairo and the normal pdf device. The baseline of the spades symbol seems to be off. This is easier to show than it is to explain. The problem does not appear in the normal pdf device, which is why I am guessing this is a Cairo bug. moreover, I cannot figure out why three of the card symbols seem to be transparent, but the fourth is not. this is the case for both the Cairo and the ordinary pdf devices. I thought I would report it to the R wizards on this group... Why? One is expressly asked to address such problems with the package maintainer(s), especially if you feel there is a bug in the package. And whilst Martin M is potentially indisposed at UseR! 2011, I will mention that it is a *package* not a library when you refer to the R package Cario. If you had identified a bug in the Cairo *library* you should be reporting it upstream to the Cairo people: http://www.cairographics.org/ G library(Cairo) clubs - expression(symbol('\247')) hearts - expression(symbol('\250')) diamonds - expression(symbol('\251')) spades - expression(symbol('\252')) csymbols - c(clubs, diamonds, hearts, spades) CairoPDF(file = cardsymbols.pdf) plot( 0, xlim=c(0,5), ylim=c(0,2), type=n ) clr - c(black, red, red, black) for (i in 1:4) { hline - function( yloc, ... ) for (i in 1:length(yloc)) lines( c(-1,6), c(yloc[i],yloc[i]), col=gray) hline(0.9); hline(1.0); hline(1.1); hline(1.2) text( i, 1, csymbols[i], col=clr[i], cex=5 ) text( i, 0.5, csymbols[i], col=clr[i] ) } dev.off() Ivo Welch (ivo.we...@gmail.com) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Symbol Font Baseline, Cairo, Card Symbols
On Wed, 2011-08-17 at 08:06 -0700, ivo welch wrote: hi gavin---I am not even sure that it is a cairo bug, much less do I know about the details where it sits. for all I know, it could be an Apple problem. the possible bug report was not only for the cairo package (what's the difference between a package and a library? in my user R code, I invoke it as library(cairo)), but also for others who may use it and be surprised when they run into the same issue, wondering if it is their code, or a more general issue.. In the real world, libraries contain books. In the R world, libraries contain packages. In the same way that in the real world it would be silly and wrong to call a book a library, it is wrong to call an R package a library. The thing that resides on CRAN is the R *package* named Cairo. It works by making use of a general purpose *software* *library* called Cairo, details of which can be found here http://www.cairographics.org/ . I appreciate that the name of the `library()` function used to load packages for use is somewhat counter-intuitive, but there you go. There is talk of moving to a `use()` function to replace `library()`; when/if that happens, perhaps it will lessen the confusion. Even if you are not sure, the general advice is to email the maintainers. If there is nothing wrong and this is a quirk of fonts/system/OS etc then your archived email lives on via Google and may confuse others in the future. Giving package authors a chance to look at the problem is also common courtesy. The Posting Guide is clear in this regard: If the question relates to a contributed package , e.g., one downloaded from CRAN, try contacting the package maintainer first. Many people ignore this however. G /iaw Ivo Welch (ivo.we...@gmail.com) On Wed, Aug 17, 2011 at 5:18 AM, Gavin Simpson gavin.simp...@ucl.ac.uk wrote: On Tue, 2011-08-16 at 17:46 -0700, ivo welch wrote: I think I found a bug in the Cairo library, plus weird behavior in both the Cairo and the normal pdf device. The baseline of the spades symbol seems to be off. This is easier to show than it is to explain. The problem does not appear in the normal pdf device, which is why I am guessing this is a Cairo bug. moreover, I cannot figure out why three of the card symbols seem to be transparent, but the fourth is not. this is the case for both the Cairo and the ordinary pdf devices. I thought I would report it to the R wizards on this group... Why? One is expressly asked to address such problems with the package maintainer(s), especially if you feel there is a bug in the package. And whilst Martin M is potentially indisposed at UseR! 2011, I will mention that it is a *package* not a library when you refer to the R package Cario. If you had identified a bug in the Cairo *library* you should be reporting it upstream to the Cairo people: http://www.cairographics.org/ G library(Cairo) clubs - expression(symbol('\247')) hearts - expression(symbol('\250')) diamonds - expression(symbol('\251')) spades - expression(symbol('\252')) csymbols - c(clubs, diamonds, hearts, spades) CairoPDF(file = cardsymbols.pdf) plot( 0, xlim=c(0,5), ylim=c(0,2), type=n ) clr - c(black, red, red, black) for (i in 1:4) { hline - function( yloc, ... ) for (i in 1:length(yloc)) lines( c(-1,6), c(yloc[i],yloc[i]), col=gray) hline(0.9); hline(1.0); hline(1.1); hline(1.2) text( i, 1, csymbols[i], col=clr[i], cex=5 ) text( i, 0.5, csymbols[i], col=clr[i] ) } dev.off() Ivo Welch (ivo.we...@gmail.com) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read
Re: [R] reflecting a PCA biplot
On Tue, 2011-08-09 at 22:57 +1000, Andrew Halford wrote: Hi Listers, I am trying to reflect a PCA biplot in the x-axis (i.e. PC1) but am not having much success. In theory I believe all I need to do is multiply the site and species scores for the PC1 by -1, which would effectively flip the biplot. I am creating a blank plot using the plot command and accessing the results from a call to rda. I then use the calls to scores to obtain separate site and species coordinates and I have worked out how to multiply the appropriate PC1 scores by -1 to create the site and species scores I want. However I am not sure how to change the call to plot which accesses the results of the call to rda to draw the blank plot. The coordinates it is accessing are for the unreflected ordination and this does not match the new site and species scores that I have. fish.pca - rda(fish.hel) fish.site - scores(fish.pca,display=sites,scaling=3) fish.spp - scores(fish.pca,display=species,scaling=3) fish.site[,PC1] - -1*(fish.site[,PC1]) fish.spp[,PC1] - -1*(fish.spp[,PC1]) graph - plot(fish.pca,display=c(sites,species),type=n,scaling=3) # how do I get the plot to draw up the blank display based on the reversed site and species scores? Do you mean something like...? require(vegan) data(dune) mod - rda(dune) si.scrs - scores(mod, display = sites, scaling = 3, choices = 1:2) sp.scrs - scores(mod, display = species, scaling = 3, choices = 1:2) si.scrs[, PC1] - -1 * si.scrs[, PC1] sp.scrs[, PC1] - -1 * sp.scrs[, PC1] xlim - range(0, si.scrs[, 1], sp.scrs[, 1]) ylim - range(0, si.scrs[, 2], sp.scrs[, 2]) plot(si.scrs[,1], si.scrs[,2], ylim = ylim, xlim = xlim, ylab = PC2, xlab = PC1, cex = 0.7, asp = 1) abline(h = 0, lty = dotted) abline(v = 0, lty = dotted) points(sp.scrs[,1], sp.scrs[,2], col = red, pch = 3, cex = 0.7) box() For non-standard plotting of ordination objects, our advice has always been to build the plot up from lower-level plotting functions rather than the specific methods supplied with vegan. A comparison: (not quite the same, I know, but close enough) layout(matrix(1:2, ncol = 2)) plot(mod, display=c(sites,species), type = p, scaling=3, main = Original) plot(si.scrs[,1], si.scrs[,2], ylim = ylim, xlim = xlim, ylab = PC2, xlab = PC1, cex = 0.7, asp = 1, main = Flipped PC1) abline(h = 0, lty = dotted) abline(v = 0, lty = dotted) points(sp.scrs[,1], sp.scrs[,2], col = red, pch = 3, cex = 0.7) box() layout(1) If you want type = t, the default for plot.cca, then use the same coordinates but extract the relevant labels from the two scores objects: plot(si.scrs[,1], si.scrs[,2], ylim = ylim, xlim = xlim, ylab = PC2, xlab = PC1, cex = 0.7, asp = 1, type = n) ## no plotting this time first abline(h = 0, lty = dotted) abline(v = 0, lty = dotted) text(si.scrs[,1], si.scrs[,2], labels = rownames(si.scrs), cex = 0.7) ## add site and species scores as labels text(sp.scrs[,1], sp.scrs[,2], labels = rownames(sp.scrs), col = red, cex = 0.7) box() HTH G Any help appreciated. cheers Andy -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Running R in a sandbox
On Wed, 2011-08-03 at 11:04 +0300, Antonio Rodriges wrote: Hello, The idea is to grant access of remote users to R running on Linux. Users must have ability to run their R scripts but avoid corrupting the operating system. How one can restrict/limit access of remote users to certain R functions? For example, dealing with IO (file system), graphical tools, etc. We've been here before, IIRC. But I'm too lazy to check the archives - that's your job ;-) Try a search on http://finzi.psych.upenn.edu/search.html for relevant terms and make sure you turn on the email lists and off the functions/vignettes. Thank you. G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with R
On Thu, 2011-07-28 at 11:58 -0400, Sarah Goslee wrote: Hi Mark, On Thu, Jul 28, 2011 at 10:44 AM, m...@statcourse.com wrote: 1. How can I plot the entire tree produced by rpart? What does plot() not do that you are expecting? Not do any labelling... ;-) text(tree) where `tree` is your fitted tree will add the labels after using `plot()` as per Sarah's reply. 2. How can I submit a vector of values to a tree produced by rpart and have it make an assignment? What does predict() not do that you are expecting? Indeed. G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Elegant way to subtract matrix from array
On Wed, 2011-07-27 at 01:06 -0700, steven mosher wrote: there are really two related problems here I have a 2D matrix A - matrix(1:100,nrow=20,ncol =5) S - matrix(1:10,nrow=2,ncol =5) #I want to subtract S from A. so that S would be subtracted from the first 2 rows of #A, then the next two rows and so on. For this one, I have used the following trick to replication a matrix do.call(rbind, rep(list(mat), N) where we convert the matrix, `mat`, to a list and repeat that list `N` times, and arrange for the resulting list to be rbind-ed. For your example matrices, the following does what you want: A - do.call(rbind, rep(list(S), nrow(A)/nrow(S))) Whether this is useful will depend on the dimension of A and S - from your posts on R-Bloggers, I can well imagine you are dealing with large matrices. #I have a the same problem with a 3D array # where I want to subtract Q for every layer (1-10) in Z # I thought I solved this one with array(mapply(-,Z,Q),dim=dim(Z)) # but got the wrong answers Z - array(1:100,dim=c(2,5,10)) Q - matrix(1:10,nrow=2,ncol =5) For this one, consider the often overlooked function `sweep()`: sweep(Z, c(1,2), Q, -) does what you wanted. c(1,2) is the `MARGIN` argument over the dimensions that Q will be swept from. HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] function lm, get back the coefficient
On Tue, 2011-07-26 at 16:43 +0100, Barry Rowlingson wrote: On Tue, Jul 26, 2011 at 4:21 PM, ascoquel ascoq...@yahoo.fr wrote: Hi, I've done a linear fit on my data and I would like to get back the a (time) coefficient ... mod-lm(res_sql2$Lx0x~0+time) result-data.frame() result-coef(mod) print(result) print(result) [1] result time 0.02530191 But I would like just the value 0.02530191 ... I tried result$time but it doesn't work ... It is 'just the value'. It happens to be in a named vector with a length of 1. You can do anything to it that you want to do with any other number. Try result * 2, or sqrt(result). If it really annoys you, try names(result)=NULL to get rid of the name. Barry `unname()` exists for this purpose: foo - 1 names(foo) - bar foo bar 1 unname(foo) [1] 1 G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] function lm, get back the coefficient
On Tue, 2011-07-26 at 13:42 -0700, Bert Gunter wrote: Not quite, Gavin. You have to assign the value of unname back: Yes; thought that would be a given. The point was to give a simple example to show that `unname()` removes names/dimnames. G z - structure(2, names=a) unname(z) [1] 2 z a 2 zz - unname(z) zz [1] 2 names(z) - NULL z [1] 2 Cheers, Bert On Tue, Jul 26, 2011 at 1:18 PM, Gavin Simpson gavin.simp...@ucl.ac.uk wrote: On Tue, 2011-07-26 at 16:43 +0100, Barry Rowlingson wrote: On Tue, Jul 26, 2011 at 4:21 PM, ascoquel ascoq...@yahoo.fr wrote: Hi, I've done a linear fit on my data and I would like to get back the a (time) coefficient ... mod-lm(res_sql2$Lx0x~0+time) result-data.frame() result-coef(mod) print(result) print(result) [1] result time 0.02530191 But I would like just the value 0.02530191 ... I tried result$time but it doesn't work ... It is 'just the value'. It happens to be in a named vector with a length of 1. You can do anything to it that you want to do with any other number. Try result * 2, or sqrt(result). If it really annoys you, try names(result)=NULL to get rid of the name. Barry `unname()` exists for this purpose: foo - 1 names(foo) - bar foo bar 1 unname(foo) [1] 1 G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] multiple plots in single frame: 2 upper, 1 lower
On Wed, 2011-07-20 at 23:38 +1200, Rolf Turner wrote: On 20/07/11 11:07, DrCJones wrote: Hi, par(mfrow = c(2,2)) will create a 2x2 window that I can use to plot 4 diferent figures in: [plot1 plot2] [plot3 plot4] But how can do 3 so that the bottom spans the width of the upper two: [plot1 plot1] [p l o t 3] Is this possible in R? In R ***anything*** is possible. :-) Your requirement is no only possible, but easy! See ?layout You may have to expend a bit of effort to understand the syntax, but that will be good for your karma. :-) It may help the OP to think of the layout as a 2*2 matrix: 1 2 3 4 say, with plot 3 using regions 3 and 4. If we fill the same matrix with the plot number we want to draw in it, we have 1 2 3 3 From there it is easy to specify the layout by directly building that matrix in R: (m - matrix(c(1:3,3), ncol = 2, byrow = TRUE)) [,1] [,2] [1,]12 [2,]33 and pass that to `layout()` layout(m) ## invisible() just to stop replicate returning something visible invisible(replicate(3, plot(1:10))) layout(1) HTH G It ***will*** do exactly what you want, if you ask it nicely. cheers, Rolf Turner __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] PERMANOVA+ and adonis in vegan package
On Sat, 2011-07-09 at 09:35 -0700, VG wrote: Hi, I was wondering if someone can tell me what is the difference between strata argument (function adonis in vegan package) and using random effects in PERMANOVA+ add-on package to PRIMER6 when doing permutational MANOVA-s? Is the way permutations are done the same? Thank you very much in advance, Vesna By the looks of the Primer page on this new add-on, there are substantial differences between adonis() and PERMANOVA+. The current way that adonis() permutes data is either to shuffle the data totally at random, or at random *within* the levels of `strata`. This conditions the permutations on the clusterings of samples, and reflects a null hypothesis where samples are not fully freely exchangeable (independent), but exchangeable only within the groups/clusters of samples. It appears PERMANOVA+ has facilities for more complex permutations that this. I have implemented some of these restricted permutations in my package `permute` - on r.forge within the vegan stable of packages - that will hopefully be used within vegan for all it's permutations, perhaps by as soon as the end of the summer break. As PERMANOVA+ is closed source and I have not seen any works from Marti Anderson describing the newer features of her software, which are included in the new PRIMER add-on, I would suggest you enquire with the PRIMER people as to the similarities/differences between adonis() and PERMANOVA+ and report back your findings? HTH G -- View this message in context: http://r.789695.n4.nabble.com/PERMANOVA-and-adonis-in-vegan-package-tp3656397p3656397.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] FW: lasso regression
On Tue, 2011-07-12 at 10:12 -0400, Heiman, Thomas J. wrote: Hi, Hopefully I got the formatting down.. I am trying to do a lasso regression using the lars package with the following data (the data files is in .csv format): V1 V2 V3 V4 V5 V6 V7 V8 V9 1 FastestTime WinPercentage PlacePercentage ShowPercentage BreakAverageFinishAverage Time7AverageTime3AverageFinish 2 116.9 0.14285715 0.14285715 0.2857143 4.4285713.2857144 117.557144 117.76667 5.0 3 116.22857 0.2857143 0.42857143 0.14285715 6.1428572.142857116.84286 116.8 2.0 4 116.41428 0.0 0.14285715 0.2857143 5.7142863.7142856 117.24286 117.14 4.0 5 115.8 0.5714286 0.0 0.2857143 2.1428572.5714285 116.21429 116.5 6.0 #load Data crs- read.csv(file:///C:/temp/Horse//horseracing.csvfile:///C:\temp\Horse\horseracing.csv, na.strings=c(,, NA, , ?), encoding=UTF-8) ## define x and y x= x-crs[,9]#predictor variables y= y-crs[1:8,] #response variable library(lars) cv.lars(x, y, K=10, trace=TRUE, plot.it = TRUE,se = TRUE, type=lasso) and I get: LASSO sequence Error in one %*% x : requires numeric/complex matrix/vector arguments Any idea on what I am doing wrong? Thank you!! Row 1 contains character data, the variable names. Are you missing a `header = TRUE` (this is the default in `read.csv()`), or do you have several header lines? I also think you have the response/predictors back to front there; otherwise, why would you need to shrink the coefficient and select from a model with a single predictor? HTH G Sincerely, tom __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] relative euclidean distance
On Wed, 2011-07-06 at 14:50 +0100, Sarah Leclaire wrote: Hi, I would like to calculate the RELATIVE euclidean distance. Is there a function in R which does it ? (I calculated the abundance of 94 chemical compounds in secretion of several individuals, and I would like to have the chemical distance between 2 individuals as expressed by the relative euclidean distance. Some compounds are in very low abundance whereas others are in high abundance, that's why I would like to correct for the abundance) Thanks, Sarah A simple solution to this is to transform the data and then compute the Euclidean distance using dist(). decostand(foo, method = normalize) and disttransform(foo, method = chord) in package BiodiversityR can do this for you without you having to write a function yourself. Pass the returned object to dist() to get the distances you want. HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] permil symbol linux
On Wed, 2011-07-06 at 12:24 +0200, mathew brown wrote: To all in R land, Here is what I would like to do. x = c(1:10) y = c(1:10) plot(x,y) mtext(side=2, line=1.5, expression(*delta*^18*O [permill]), cex=1, adj=0.5) That's it. Except I would like to replace permill with the symbol. Thanks for the help Are you trying to be difficult? Have you read the posting guide so you can provide the at minimum information Prof. Ripley asked you for? Simply repeating the question ad nauseam isn't what was requested. On my linux box, this will produce a d^{18}O permille **on screen** - not sure why you want this in square brackets? plot(1:10, ylab = expression(delta^{18}*O ~ (\u2030))) Now that glyph doesn't appear to be in the pdf device fonts for my encoding so when plotting to pdf() I have to add `encoding=WinAnsi.enc` to my `pdf()` call - IIRC this advice at the time was supplied by Prof. Ripley. The advice may well be different on your **unstated** OS - hence the requests for more info. This is with: sessionInfo() R version 2.13.0 Patched (2011-04-19 r55527) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_GB.UTF-8LC_COLLATE=en_GB.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_GB.UTF-8 [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] cocorresp_0.1-9 vegan_1.17-9 which would be part of the at minimum info Prof. Ripley asked for. And also which device (windows(), x11(), quartz(), pdf(), postscript() etc, etc.) do you *want* to print on? Please do read the posting guide - you might not be aware what information is required to answer your question, so what seems superfluous may well be, and in this case probably is, *essential*. G On Wed, 6 Jul 2011 11:07:55 +0100 (BST) Prof Brian Ripley rip...@stats.ox.ac.uk wrote: You still have not sent the information requested in the posting guide, and I do not know your target device nor locale, nor is that a reproducible example. Please learn some respect for the time of the helpers here (and for all the work that went into making this possible in R). On Wed, 6 Jul 2011, mathew brown wrote: Good point. Here is the code plot(dat$timestamp,dat$delta_18_16, ylab=, xlab=(min), tck=0.05, col=blue) mtext(side=2, line=1.5, expression(*delta*^18*O [\u0089]), cex=1, adj=0.5) I'm still not sure how to get your code to work. Unsurprising, as you have not seen *my* code. thanks On Wed, 6 Jul 2011 09:45:49 +0100 (BST) Prof Brian Ripley rip...@stats.ox.ac.uk wrote: On Wed, 6 Jul 2011, mathew brown wrote: Hi, I'm trying to figure out how to make a plot with ylab showing the permil symbol. Anyone know how to do this? Yes. Now, it you would follow the posting guide and give the 'at a minimum' information you were asked for, and the graphics device you want to use, we might be able to tell you more precisely how. Hint: that symbol is \u0089, and also in latin-1. Thanks -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 -- Mathew Brown Institute of Bioclimatology University of Göttingen Büsgenweg 2 37077 Göttingen, Germany t: +49 551 39 9359 mathew.br...@forst.uni-goettingen.de -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R
Re: [R] Standards for delivery of GPL software in CRAN packages
On Mon, 2011-06-27 at 11:14 -0400, Galkowski, Jan wrote: Fine. Attached. It's waved. All it has is *.Rd files. Apparently the functions are collected in functionINIT.R. But 00Index and DESCRIPTION are not helpful. - j The Rd files are the help or manual pages for the functions defined in the package. In this particular case, the package author has decided to put all the R code for their package into a single R source file - functionINIT.R. The other two files are R-specific files, the latter of which is used to describe the package; which, incidentally, points you to a peer-reviewed paper that the package is support for. I don't recall the GPL mentioning anything requiring that the source code be helpful. The authors have most certainly fulfilled their requirements under GPL, as has CRAN in distributing the package sources. Or am I being obtuse and completely missing your point? G -Original Message- From: b.rowling...@googlemail.com [mailto:b.rowling...@googlemail.com] On Behalf Of Barry Rowlingson Sent: Monday, June 27, 2011 10:18 AM To: Galkowski, Jan Cc: r-help@r-project.org Subject: Re: [R] Standards for delivery of GPL software in CRAN packages On Mon, Jun 27, 2011 at 1:24 PM, Galkowski, Jan jgalk...@akamai.com wrote: I wondered if there were standard practices in CRAN for delivery of R source implementing functions in R packages. I has encountered a couple of packages where the gzipped version of source contains very little, primarily the Help files describing the functions in the package. In some cases I can find the source as the value of the function name. Given that these packages are released as GPL, oughtn't the unoptimized source be freely available, hopefully with comments? Am I missing something? Is there a central place other than mirrors where such source is retained? Sourceforge? The 'package source' link on CRAN should point you to a tar.gz file that contains the source code. For example, for splancs off the heanet mirror it is: http://ftp.heanet.ie/mirrors/cran.r-project.org/src/contrib/splancs_2.01-27.tar.gz .tar.gz files from those links should have full R, C and Fortran source code. I think we need counter-examples... Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Setting up a State Space Model in dlm
On Tue, 2011-06-07 at 17:24 +0100, Michael Ash wrote: This question pertains to setting up a model in the package dlm (dynamic linear models, http://cran.r-project.org/web/packages/dlm/index.html The author of the dlm package has just published a paper on state space models in R including details on setting up dlm: http://www.jstatsoft.org/v41/i04 That might help with your question - I haven't seen a reply on list, but am unable to help answer it either. HTH G I have read both the vignette and An R Package for Dynamic Linear Models (http://www.jstatsoft.org/v36/i12/paper), both of which are very helpful. There is also some discussion at https://stat.ethz.ch/pipermail/r-help/2009-May/198463.html I have what I think is a relatively straightforward state-space model but am unable to translate it into the terms of dlm. It would be very helpful to get a basic dlm setup for the problem and I would guess that I can then modify it with more lags, etc., etc. The main equation is pi[t] = a * pi[t-1] + b*(U[t] - UN[t]) + e[t] (see http://chart.apis.google.com/chart?cht=txchl=%5Cpi_t=a%5Cpi_{t-1}%2bb%28U_t-U^N_{t}%29%2Be_t for a pretty version) with pi and U observed, a and b fixed coefficients, and e a well-behaved error term (gaussian, say, variance unknown). The object of interest is the unobserved and time-varying component UN which evolves according to UN[t] = UN[t-1] + w[t] (see http://chart.apis.google.com/chart?cht=txchl=U%5EN_%7Bt%7D%20=%20U%5EN_%7Bt-1%7D%20%2B%20%5Cepsilon_t for a pretty version) that is, a random walk with well-behaved error term with var(w) known (or assumed). I'm interested in the estimates of a and b and also in estimating the time series of UN. Note that the term b*(U[t] - UN[t]) makes this a nonlinear model. Below is code that does not work as expected. I see the model as having four parameters, a, b, var(e), and UN. (Or do I have a parameter UN[t] for every period?) I do not fully understand the dlm syntax. Is FF specified properly? What should X look like? How does m0 relate to parm()? I would be grateful if someone would be willing to glance at the code. Thanks. Michael library(quantmod) library(dlm) ## Get and organize the data getSymbols(UNRATE,src=FRED) ## Unemployment rate getSymbols(GDPDEF,src=FRED) ## Quarterly GDP Implicit Price Deflator u - aggregate(UNRATE,as.yearqtr,mean) gdpdef - aggregate(GDPDEF,as.yearqtr,mean) pi - diff(log(gdpdef))*400 pilag - lag(pi,-1) tvnairu - cbind(pi,pilag,u) tvnairu.df - subset(data.frame(tvnairu), !is.na(pi) !is.na(u) !is.na(pilag)) ## First attempt buildNAIRU - function(x) { modNAIRU - dlm(FF=t(matrix(c(1,1,1,0))), GG=diag(4), W=matrix(c(0,0,0,0, 0,0,0,0, 0,0,0.04,0, 0,0,0,0),4,4), V=exp(x[4]), m0=rep(0,4), C0=diag(1e07,4), JFF = t(matrix(c(1,1,0,0))), X=cbind( tvnairu.df$pilag, tvnairu.df$u)) return(modNAIRU) } (fitNAIRU - dlmMLE(tvnairu.df$pi, parm=c(0,0,0,0) , build=buildNAIRU, hessian=TRUE, control=list(maxit=500))) (dlmNAIRU - buildNAIRU(fitNAIRU$par)) ## Second attempt buildNAIRU - function(x) { modNAIRU - dlm(FF=t(matrix(c(1,1,0,0))), GG=diag(4), W=matrix(c(0,0,0,0, 0,0,0,0, 0,0,0.04,0, 0,0,0,0 ),4,4), V=exp(x[4]), m0=c(0.7,-0.3,5,1), C0=diag(100,4), JFF = t(matrix(c(1,1,0,0))), X=cbind( tvnairu.df$pilag, tvnairu.df$u - x[3] )) return(modNAIRU) } (fitNAIRU - dlmMLE(tvnairu.df$pi, parm=c(0,0,0,0) , build=buildNAIRU, hessian=TRUE, control=list(maxit=500))) (dlmNAIRU - buildNAIRU(fitNAIRU$par)) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] metaMDS and envfit: Help reading output
On Wed, 2011-05-11 at 07:07 -0400, Michael Denslow wrote: Hi Katie, On Tue, May 10, 2011 at 4:51 PM, Songer, Katherine B - DNR katherine.son...@wisconsin.gov wrote: Hello R experts, I've used metaMDS to run NMDS on some fish abundance data, and am also working on correlating environmental data to the NMDS coordinates. I'm fairly new to metaMDS and NMDS in general, so I have what are probably some very basic questions. My fish abundance data consists of 66 sites for which up to 20 species of fish were identified and counted. I ran metaMDS on this data in 3 dimensions (after using a scree plot to check for stress levels in the different dimensions). I then used envfit to correlate a predictor dataset of environmental variables with the NMDS results, using the following code. Fish-as.data.frame(read.csv(Fish.csv,header=TRUE, sep = ,)) Fish.mds-metaMDS(Fish,zerodist = add,k=3,trymax=20) Predictors-as.data.frame(read.csv(Predictors.csv,header=TRUE, sep = ,)) Fish.fit - envfit(Fish.mds$points, Predictors, k=3, 1000, na.rm = TRUE) Have a look at the choices argument in envfit, etc. This is how you specify which axes you want to plot. ord - metaMDS(varespec, k=3) fit - envfit(ord, varechem, perm = 999, choices=c(1:3)) Indeed, this is an essential step because we probably shouldn't interpret the axes of nMDS as separate components a la PCA/CA... fit plot(ord, choices=c(1,3)) plot(fit, choices=c(1,3)) ...as such, a 3-d plot might be better (for some definition of better). See ?ordirgl G Fish.fit The output of Fish.fit was as follows (table truncated): Dim1Dim2r2 Pr(r) DrainArea -0.5923233 -0.8057004 0.7674 0.000999 *** Flow-0.5283236 -0.8490431 0.7847 0.000999 *** StrmWidth -0.6993457 -0.7147836 0.6759 0.000999 *** Gradient0.4541225 0.8909392 0.2085 0.010989 * I'd like to better understand how to read this table. I understand that Dim1 and Dim2 refer to the dimensions of the vectors produced by envfit, and r2 is the r-squared of those vectors. But how do I visualize these vectors in a 3-d plot? To which of the 3 NMDS dimensions are these vectors being correlated? Is there code to produce the x, y, and z coordinates of each of the sites in Fish.mds? Thanks very much. Katie [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with Princurve
On Thu, 2011-05-19 at 06:43 -0700, guy33 wrote: Hey all, I can't seem to get the princurve package to produce correct results, even in the simplest cases. For example, if you just generate a 1 period noiseless sine wave, and ask for the principal curve and plot, the returned curve is clearly wrong (doesn't follow the sine wave). Here's my code: library(princurve) x - runif(1000,0,2*pi); x - cbind(x/(2*pi), sin(x)) fit1 - principal.curve(x, plot = TRUE) Anyone have any suggestions? If you run this code, do you get the correct principal curve? How about specifying some useful starting points? fit1 - principal.curve(x, plot = TRUE, trace = TRUE, maxit = 100, start = cbind(sort(x[,1]), rep(1, nrow(x And we need a few more iterations before convergence here. Starting from the first principal component for example might give useful starting points. HTH G Any help would be really appreciated! -guy33 -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-Princurve-tp3535721p3535721.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with Princurve
On Mon, 2011-05-23 at 09:58 -0400, Ravi Varadhan wrote: Why does this not find a better solution? x - seq(0,2*pi, length=1000) x - cbind(x/(2*pi), sin(x)) fit1 - principal.curve(x, plot = TRUE, trace = TRUE, maxit = 100, + start = cbind(sort(x[,1]), rep(1, nrow(x Starting curve---distance^2: 1499.5 Iteration 1---distance^2: 3.114789 Iteration 2---distance^2: 10.04492 Iteration 3---distance^2: 11.89215 Iteration 4---distance^2: 12.43235 Iteration 5---distance^2: 12.68524 Iteration 6---distance^2: 12.84443 Iteration 7---distance^2: 12.93624 Iteration 8---distance^2: 12.99118 Iteration 9---distance^2: 13.01280 Iteration 10---distance^2: 13.02867 Iteration 11---distance^2: 13.03867 You see that the projection distance is minimal at iteration 1, but the algorithm settles for an inferior projection (i..e. a greater projection distance). I hadn't quite noticed that - I was too busy watching the algorithm converge to some solution instead. Hastie and Stuetzle make no claims about the algorithm converging or indeed even that each step leads to a decrease in the criterion (sum of squared distances to the curve). It would appear that a smoother curve than the one identified in iter 1 is preferable when one considers the self-consistency check; smoothing in each dimension using the distance along the current curve as a predictor performs local averaging and the new principal curve identified starts cutting the corners off the sine wave. gradually the new curve derived by local averaging and the current curve are sufficiently similar that the curve is declared self-consistent and the algorithm stops. I can achieve a better fit using my own wrappers around principal.curve() and the smooth.spline() function used in the local averaging, but only if I supply some good starting values. For my wrapper and the starting configuration I suggested, I get a very close solution to the sine wave original but it is a bit smoother - the rapid change around the peak/troughs is smoothed out a little bit. GCV is being used at each iteration in my wrappers to find the optimal smoothness for smooths in both dimensions that describe the curve; the solution fits the data almost as well as the best curve, but uses fewer degrees of freedom in doing so. As Henrik suggests, playing with the parameters of the smooth.spline fitting engine helps - as that is the main thing my wrapper function makes easier. These data seem quite difficult for the algorithm to get good answers - starting from the first or second principal component gives poor results and a highly complex fitted curve that traverses back and forth across the gaps between the peaks/troughs in the sine wave several times. G If I do not provide any starting values, I get this: fit1 - principal.curve(x, plot = TRUE, trace = TRUE, maxit = 100) Starting curve---distance^2: 29692.03 Iteration 1---distance^2: 20.31220 Iteration 2---distance^2: 19.45939 Iteration 3---distance^2: 19.26387 Iteration 4---distance^2: 19.20626 Iteration 5---distance^2: 19.18666 Iteration 6---distance^2: 19.18059 This is even worse. It seems like the algorithm is quite sensitive to starting values. Is this behavior expected or is there some flaw in the algorithm? Ravi. From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Gavin Simpson [gavin.simp...@ucl.ac.uk] Sent: Monday, May 23, 2011 8:27 AM To: guy33 Cc: r-help@r-project.org Subject: Re: [R] Problem with Princurve On Thu, 2011-05-19 at 06:43 -0700, guy33 wrote: Hey all, I can't seem to get the princurve package to produce correct results, even in the simplest cases. For example, if you just generate a 1 period noiseless sine wave, and ask for the principal curve and plot, the returned curve is clearly wrong (doesn't follow the sine wave). Here's my code: library(princurve) x - runif(1000,0,2*pi); x - cbind(x/(2*pi), sin(x)) fit1 - principal.curve(x, plot = TRUE) Anyone have any suggestions? If you run this code, do you get the correct principal curve? How about specifying some useful starting points? fit1 - principal.curve(x, plot = TRUE, trace = TRUE, maxit = 100, start = cbind(sort(x[,1]), rep(1, nrow(x And we need a few more iterations before convergence here. Starting from the first principal component for example might give useful starting points. HTH G Any help would be really appreciated! -guy33 -- View this message in context: http://r.789695.n4.nabble.com/Problem-with-Princurve-tp3535721p3535721.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained
Re: [R] R forum for only Statistics
On Thu, 2011-04-28 at 05:41 -0700, Jeremy Hetzel wrote: Vincy, In addition to the R-help mailing list, other forums for R and statistics that I go to include: Stackoverflow's R tagged questions http://stackoverflow.com/questions/tagged/r SO is *not* meant for statistical Q's - R or otherwise. People posting such questions there will usually get short shrift, and find their Qs closed or migrated to the Crossvalidated sister site that you list below stats.stackexchange.com. SO is for programming. HTH G StackExchange's Stats site: http://stats.stackexchange.com/ A Google search for R statistics forum shows a few other forums that I'm not familiar with. It also shows a previous post to this list on a similar topic: http://r.789695.n4.nabble.com/Statistics-forums-td873818.html , where, at least in 2008, David Winsemius recommended the three sci.stat.* on USENET, which are now also on Google Groups ( https://groups.google.com/forum/#!forumsearch/sci.stat.*). Perhaps others have a better feel for which forums are of higher quality. The sci.stat.* groups look a bit overrun with spam these days. Jeremy On Thursday, April 28, 2011 7:26:06 AM UTC-4, Vincy Pyne wrote: Hi! I wish to know if there is any R forum which is meant only for Statistics? I mean where we can clarify our statistics doubts and seek knowledge. I know there are lot many books and internet sites, but 'R forum' has altogether different standard and very high level and one can learn a lot from them. Regards Vincy [[alternative HTML version deleted]] __ r-h...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using $ accessor in GAM formula
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] principal components
On Fri, 2011-04-01 at 11:52 +0530, nuncio m wrote: HI all, I am trying to compute the EOF of a matrix using prcomp but unable to get the expansion co-efficients. is it possible using prcomp or are there any other methods thanks nuncio *sigh* RSiteSearch(EOF) It is at times like this that an R equivalent of: http://lmgtfy.com/ would be handy ;-) Or it slightly cruder cousin... G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Quick recode of -999 to NA in R
On Wed, 2011-03-30 at 08:15 -0500, Christopher Desjardins wrote: Hi, I am trying to write a loop to recode my data from -999 to NA in R. What's the most efficient way to do this? Below is what I'm presently doing, which is inefficient. Thanks, Chris dat0 - read.table(time1.dat) dat0 - read.table(time1.dat, na.strings = c(NA,-999)) would seem the easiest option. Do read ?read.table for details. HTH G colnames(dat0) - c(e1dq, e1arcp, e1dev, s1prcp, s1nrcp, s1ints, a1gpar, a1pias, a1devt) dat0[dat0$e1dq==-999.,e1dq] - NA dat0[dat0$e1arcp==-999.,e1arcp] - NA dat0[dat0$e1dev==-999.,e1dev] - NA dat0[dat0$s1prcp==-999.,s1prcp] - NA dat0[dat0$s1nrcp==-999.,s1nrcp] - NA dat0[dat0$s1ints==-999.,s1ints] - NA dat0[dat0$a1gpar==-999.,a1gpar] - NA dat0[dat0$a1pias==-999.,a1pias] - NA dat0[dat0$a1devt==-999.,a1devt] - NA [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Dirichlet surface
On Wed, 2011-03-30 at 09:55 -0700, Kehl Dániel wrote: Dear David, I think that is a small bug too, maybe because the function is constant? is there a nice way to put the c(0,2.1) argument optionally, only if all the parameters are 1? Should I post the problem somewhere else (developers maybe?) I don't think this is a bug really; the code is having to compute limits of the z axis and you supplied it one bit of information: a 2. If your data are so degenerate then it is not unreasonable to expect some user intervention. Admittedly, persp doesn't seem to work like other R plotting functions. You could do something like: persp(x1, x2, z, zlim = if(length(na.omit(unique(as.vector(z 2){ c(0,2.1) } else { NULL}) in your call to persp so it only uses user-defined limits if the number of numeric values in z is less than 2. HTH G thanks: Daniel 2011-03-30 04:42 keltezéssel, David Winsemius írta: On Mar 29, 2011, at 4:45 PM, Kehl Dániel wrote: Dear list members, I want to draw surfaces of Dirichlet distributions with different parameter settings. My code is the following: #begin code a1 - a2 - a3 - 2 #a2 - .5 #a3 - .5 x1 - x2 - seq(0.01, .99, by=.01) f - function(x1, x2){ term1 - gamma(a1+a2+a3)/(gamma(a1)*gamma(a2)*gamma(a3)) term2 - x1^(a1-1)*x2^(a2-1)*(1-x1-x2)^(a3-1) term3 - (x1 + x2 1) term1*term2*term3 } z - outer(x1, x2, f) z[z=0] - NA persp(x1, x2, z, main = Dirichlet Distribution, col = lightblue, theta = 50, phi = 20, r = 50, d = 0.1, expand = 0.5, ltheta = 90, lphi = 180, shade = 0.75, ticktype = detailed, nticks = 5) #end code It works fine (I guess), except for a1=a2=a3=1. In that case I get the error: Error in persp.default... invalid 'z' limits. The z matrix has only elements 2 and NA. Might be a buglet in persp. If you set the zlim argument to c(0,2.1), you get the expected constant plane at z=2 for those arguments. Any ideas are appreciated. Thank you: Daniel University of Pécs __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Dirichlet surface
On Wed, 2011-03-30 at 11:12 -0700, Kehl Dániel wrote: Actually, it works for the a=1 case, not for the others. It still gives the invalid 'zlim argument' error. I'll try to work it out maybe instead of NULL giving a c which is dependent on the max(z). Sorry, I misread the helpfile - the other lims can be NULL but not zlim. This works: persp(x1, x2, z, zlim = if(length(na.omit(unique(as.vector(z 2) { c(0,2.1) } else { range(z, na.rm = TRUE) }) HTH G Daniel 2011-03-30 10:42 keltezéssel, Kehl Dániel írta: It helped a lot indeed, thank you very much! Now I understand why it was a problem for persp! Daniel 2011-03-30 10:31 keltezéssel, Gavin Simpson írta: On Wed, 2011-03-30 at 09:55 -0700, Kehl Dániel wrote: Dear David, I think that is a small bug too, maybe because the function is constant? is there a nice way to put the c(0,2.1) argument optionally, only if all the parameters are 1? Should I post the problem somewhere else (developers maybe?) I don't think this is a bug really; the code is having to compute limits of the z axis and you supplied it one bit of information: a 2. If your data are so degenerate then it is not unreasonable to expect some user intervention. Admittedly, persp doesn't seem to work like other R plotting functions. You could do something like: persp(x1, x2, z, zlim = if(length(na.omit(unique(as.vector(z 2){ c(0,2.1) } else { NULL}) in your call to persp so it only uses user-defined limits if the number of numeric values in z is less than 2. HTH G thanks: Daniel 2011-03-30 04:42 keltezéssel, David Winsemius írta: On Mar 29, 2011, at 4:45 PM, Kehl Dániel wrote: Dear list members, I want to draw surfaces of Dirichlet distributions with different parameter settings. My code is the following: #begin code a1- a2- a3- 2 #a2- .5 #a3- .5 x1- x2- seq(0.01, .99, by=.01) f- function(x1, x2){ term1- gamma(a1+a2+a3)/(gamma(a1)*gamma(a2)*gamma(a3)) term2- x1^(a1-1)*x2^(a2-1)*(1-x1-x2)^(a3-1) term3- (x1 + x2 1) term1*term2*term3 } z- outer(x1, x2, f) z[z=0]- NA persp(x1, x2, z, main = Dirichlet Distribution, col = lightblue, theta = 50, phi = 20, r = 50, d = 0.1, expand = 0.5, ltheta = 90, lphi = 180, shade = 0.75, ticktype = detailed, nticks = 5) #end code It works fine (I guess), except for a1=a2=a3=1. In that case I get the error: Error in persp.default... invalid 'z' limits. The z matrix has only elements 2 and NA. Might be a buglet in persp. If you set the zlim argument to c(0,2.1), you get the expected constant plane at z=2 for those arguments. Any ideas are appreciated. Thank you: Daniel University of Pécs __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ordination in vegan
On Sun, 2011-03-27 at 21:40 -0700, Akane Nishimura wrote: Hi all, Hi Akane, I answered this on the R-forge forum where you first posted it. I have posted a follow-up there too. More below... I have site data with plant species cover and am looking for trends. I'm kind of new to this, but have done lots of reading and can't find an answer. I tried decorana (I know it's been replaced by ca.) and see a trend, but I'm not sure what it means. Is there a way to get the loadings/eigenvectors of the axes (like in PCA)? Is there a way to do this with rda() too? How about with cca()? I know I can do scores() or summary() but that's not exactly the info I need. I basically want to interpret the axes. Also after doing envfit() and plotting, is there a way to label sites based on certain environmental factors? For instance, I want to highlight plots that have a certain treatment (or have different colored points for the different treatments. I don't know if it matters but I used downweight() before applying decorana(). Here is a copy of the response on the Vegan forum: There are *two* sets of scores, species and sites (samples). The terminology in ordination in ecology is a little different to that of PCA with scores and loadings. Think of the site (sample) scores as the scores and the species scores as the loadings in a DCA. Also, the species scores represent the fitted optima of each species in the ordination space. If the above doesn't make any sense, consider a good text on ordination in ecology, such as Legendre and Legendre (1998) Numerical Ecology 2nd English Edition, or Jongman et al (1995) Data Analysis in Community and Landscape Ecology. As for envfit(), like I said, you need to build the plot up by hand. Here is an example: data(varespec) data(varechem) library(MASS) ord - metaMDS(varespec) fit - envfit(ord, varechem, perm = 999) ## create two groups to represent two treatments ## you'd use your actual Treatment variable, preferably as a factor Treatment - gl(2, NROW(varespec) / 2) ## build up plot plot(ord, type = n) points(ord, display = species) ## Treatment 1 in blue, Treatment 2 in red points(ord, display = sites, col = c(blue,red)[Treatment]) plot(fit, add = TRUE) Does that help? As for downweight(), it isn't relevant really - you need to know what it has done, but that is really just to reduce to weight of rare species in the data set. For the definition of rare see the help page or the code for the function - I can't remember what it is off the top of my head. HTH G down -downweight(spp2009bttl, fraction = 5) down.dca - decorana(down) scores(down.dca) summary(down.dca) Thanks so much! -Akane [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] exploring dist()
On Fri, 2011-03-18 at 06:21 -0700, bra86 wrote: Hello, everybody, I hope somebody could help me with a dist() function. I have a data frame of size 2*4087 (col*row), where col corresponds to the treatment and rows are So you have 4087 species? If yes, normally, you'd have the species in the columns and the samples/treatments in the row. species, values are Hellinger distances, I should reconstruct a distance matrix This doesn't make sense - distances would mean you have a square symmetric matrix but 2 * 4087 isn't square. Do you mean you have Hellinger **transformed** the data such that when you take the Euclidean distances of this transformed data you get the Hellinger distance rather than the Euclidean distance? If yes - and you sort the rows/columns issue - R wants the samples in rows - then it is reasonably simple. Here is a much simplified example with 5 species and 4 samples: dat - data.frame(runif(4, 1, 10), runif(4, 2, 10), runif(4, 4, 20), runif(4, 1, 4), runif(4, 0, 5)) names(dat) - paste(spp, LETTERS[1:5]) rownames(dat) - paste(samp, 1:4) So we have data that looks like this: dat spp Aspp B spp Cspp D spp E samp 1 6.974237 7.933403 5.460453 3.975219 4.6818142 samp 2 1.049801 6.751013 14.143798 1.777532 4.0261914 samp 3 5.742314 2.243850 15.613524 3.476935 0.4144043 samp 4 5.985012 9.576440 8.722579 3.411262 1.8126338 Then I apply a Hellinger transformation: require(vegan) datH - decostand(dat, method = hellinger) So at this point we have something that I think you are telling us you have: datH spp A spp B spp C spp D spp E samp 1 0.4901864 0.5228086 0.4337378 0.3700782 0.4016244 samp 2 0.1945069 0.4932488 0.7139447 0.2530989 0.3809156 samp 3 0.4570334 0.2856942 0.7536245 0.3556336 0.1227769 samp 4 0.4503635 0.5696823 0.5436922 0.3400073 0.2478481 We can use dist() on this data frame via: dij - dist(datH) If we look at the object created, we see the **printed** representation of the dissimilarity matrix, which is a 4*4 matrix in this example: dij samp 1samp 2samp 3 samp 2 0.4253576 samp 3 0.4874570 0.4367179 samp 4 0.2010581 0.3543312 0.3750363 Note that the diagonal and the upper triangle of the matrix are not printed, or stored even, because they are trivial (0 for all diagonals and the upper triangle is the same as the lower triangle). dist() actually creates a vector of numbers that will fill the lower triangle of the dissimilarity matrix. This saves on storage space. If you want the add the diagonal and upper triangle, we can get it one of two ways: 1) dist(datH, diag = TRUE, upper = TRUE) 2) as.matrix(dij) However only the second actually returns a matrix with 16 numbers, the former still only computes the 6 pair-wise distances, but when **printed** it shows the full matrix. If you really have species in rows and smaples in columns, then you can transpose your matrix, e.g. datH.t - t(dat.H) and then compute the dissimilarity matrix as above. Does this help? G with a dist() function. I know that euclidean method should be used. When I type: dist(dframe,euclidean) it gives me a truncated table, where values are missing. I suppose that I have to define something for the values, but I have no idea what exactly, because I am not familiar with r at all. I would be very appreciated for every kind of suggestions or tips. -- View this message in context: http://r.789695.n4.nabble.com/exploring-dist-tp3387187p3387187.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] persuade tabulate function to count NAs in a data frame
and any attached files are confidential and/...{{dropped:19}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with vegan package instalation on linux
On Wed, 2011-03-09 at 11:41 -0300, Ronaldo Reis-Jr. wrote: Hi, I try to install vegan package on debian linux and I give this error: Have you got a fortran compiler installed on your system? gcc-gfortran is the package on my Fedora laptop. G = * installing *source* package 'vegan' ... ** libs make: *** No rule to make target `cepin.o', needed by `vegan.so'. Stop. *** arch - Rgnome Warning in file(con, r) : cannot open file '/usr/lib/R/etc/Rgnome/Makeconf': No such file or directory Error in file(con, r) : cannot open the connection * removing '/usr/local/lib/R/site-library/vegan' The downloaded packages are in '/tmp/RtmpYgJz5c/downloaded_packages' Warning message: In install.packages(vegan) : installation of package 'vegan' had non-zero exit status == I try to find the problem without success. Anybody can help-me? Thanks Ronaldo [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Rioja package, creating transfer function, WA, Error in FUN
On Thu, 2011-02-10 at 09:40 -0800, mdc wrote: Hi, I am a new R user and am trying to construct a palaeoenvironmental transfer function (weighted averaging method) using the package rioja. I've managed to insert the two matrices (the species abundance and the environmental data) and have assigned them to the y and x values respectively. When I try and enter the 'WA' function though, I get an 'Error in FUN' message (see below for full values). Alas, I do not know what this means and have struggled to find similar problems to this online. Is there a step I've missed out between assigning the matrices and the WA function? SWED=odbcConnectExcel(file.choose()) (SWED is the environmental data file) sqlTables(SWED) Env=sqlFetch(SWED, Sheet1) odbcClose(SWED) Env SampleId WTD Moisture pH EC 1 N1_1 20 91.72700 3.496674 85.02688 2 N1_22 93.88913 3.550794 85.69465 3 N1_3 26 90.30269 3.948559 113.19206 4 N1_45 94.14427 3.697213 48.56375 5 N1_5 30 90.04269 3.745020 108.57278 90 GAL_15 70 94.07849 3.777932 66.77673 That's your problem, the odbc stuff has read the data in as characters. CSV would be a lot simpler, just save your excel sheets as CSV files and read them in with: Env - read.csv(my_excel_sheet.csv, row.names = 1) etc... where my_excel_sheet.csv is the name of your saved csv file or just use: Env - read.csv(file.choose(), row.names = 1) if finding files via the GUI is helpful to you. It is odd that the species data set has been read in OK though - I say OK, but you still need to get the F1 column out of the species data and set it as the row names of your data. Sorry I'm coming to this late; I've been away and not really following the list for a few weeks. If you can't get things working, contact me off list and send the Excel files and I'll send back a script that will load the files and do the WA for you to look at. HTH G STEST=odbcConnectExcel(file.choose()) sqlTables(STEST) (STEST is the species abundance file) Spe=sqlFetch(STEST, Sheet8) odbcClose(STEST) Spe (The species data contains the abundance of 32 species over 90 sites, set out like this) F1AmpFlavAmpWri ArcCat ArcDis 1N1_1 22.2929936 0.000 0.000 0.000 2N1_2 30.9677419 0.000 0.000 3.2258065 library(rioja) y -as.matrix(Spe) x -as.matrix(Env) WA(y, x, tolDW = FALSE, use.N2=TRUE, check.data=TRUE, lean=FALSE)(the command from the WA section of the rioja booklet) Error in FUN(newX[, i], ...) : invalid 'type' (character) of argument Any help would be most appreciated, Best wishes, Matthew -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] different results in MASS's mca and SAS's corresp
On Sat, 2011-02-05 at 23:39 -0600, Paul Johnson wrote: On Sat, Feb 5, 2011 at 9:19 AM, David Winsemius dwinsem...@comcast.net wrote: snip / cbind(scalermca[,1] * 0.827094, scalermca[,2] * -0.7644828) [,1][,2] 1 1.06070017 -0.8154 2 0.77057891 0.63456780 3 1.07031764 -1.30675217 4 1.07031764 -1.30675217 5 0.23075886 0.90002547 6 0.6943 0.60993995 7 0.10530240 0.78445402 8 -0.27026650 0.44225049 9 0.13426089 1.15670532 10 0.11861965 0.64778456 11 0.23807570 1.21775202 12 1.01156703 -0.01927226 13 0.28051938 -0.59805897 14 -1.17343686 -0.27122981 15 -0.83838041 -0.64003061 16 -0.05453708 -0.22925816 17 -0.91732401 -0.49899374 18 -0.92694148 -0.00774156 19 -1.30251038 -0.34994509 20 -1.30251038 -0.34994509 So, that does reproduce SAS exactly. And I'm a little frustrated I can't remember the matrix command to get that multiplication done without cbinding the 2 columns together that way. You might have been thinking of sweep(): sweep(scalermca[,1:2], 2, c(0.827094,-0.7644828), *) snip/ HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] GAM quasipoisson in MuMIn
On Fri, 2011-02-04 at 12:31 +, Karen Moore wrote: Hi, snip / My formula is GAM-gam(Species richness (count) ~ Categorical + Continous + Continous + * s*(Continous ) + Continous : Continous + Continous : Continous, family=quasipoisson, data =) Ok, I'm reasonably certain that that is *not* your gam() call. Why not post the *actual* code you would enter for your GAM model? dd - dredge(GAM) should evaluate all possible models from the global model you specified. The example you quote is for a global model with both smooth and linear terms for the same variables, hence in dredge() they need to select models that include only the smooth for a particular variable or the linear term for that variable, but not both. By the looks of it, you don't have this problem. G Thanks for any advice on script Karen [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Please stop all e-mail
On Fri, 2011-02-04 at 13:06 +0800, cahyo kristiono wrote: Dear r-help I don't want to receive againevery e-mail about [R] in my address e-mail (cahyo_kristi...@yahoo.com), because it is cause my inbox so full quickly. So I need your help to stop every e-mail about [R] in my address. Remove yourself or change your subscription options using the web interface: https://stat.ethz.ch/mailman/listinfo/r-help G Thank you so much Regards CK [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] agnes clustering and NAs
On Fri, 2011-01-28 at 10:00 +1100, Dario Strbenac wrote: Hello, Yes, that's right, it is a values matrix. Not a dissimilarity matrix. i.e. str(iMatrix) num [1:23371, 1:56] -0.407 0.198 NA -0.133 NA ... - attr(*, dimnames)=List of 2 ..$ : NULL ..$ : chr [1:56] -8100 -7900 -7700 -7500 ... For the snippet of checking for NAs, I get all TRUEs, so I have at least one NA in each column. Sorry, my bad. Try this: apply(iMatrix, 1, function(x) all(is.na(x))) will check that you have no fully `NA` rows. Also look at str(iMatrix) for potential problems. Finally, try: out - dist(iMatrix) any(is.na(out)) should repeat what agnes is doing to compute the dissimilarity matrix. If that returns TRUE, go and find which samples are giving NA dissimilarity and why. The issue is not NA in the input data, but that your input data is leading to NA in the computed dissimilarities. This might be due to NA's in your input data, where a pair of samples has no common set of data for example. HTH G The part of the agnes documentation I was referring to is : In case of a matrix or data frame, each row corresponds to an observation, and each column corresponds to a variable. All variables must be numeric. Missing values (NAs) are allowed. So, I'm under the impression it handles NAs on its own ? - Dario. Original message Date: Thu, 27 Jan 2011 12:53:27 + From: Gavin Simpson gavin.simp...@ucl.ac.uk Subject: Re: [R] agnes clustering and NAs To: Uwe Ligges lig...@statistik.tu-dortmund.de Cc: d.strbe...@garvan.org.au, r-help@r-project.org On Thu, 2011-01-27 at 10:45 +0100, Uwe Ligges wrote: On 27.01.2011 05:00, Dario Strbenac wrote: Hello, In the documentation for agnes in the package 'cluster', it says that NAs are allowed, and sure enough it works for a small example like : m- matrix(c( 1, 1, 1, 2, 1, NA, 1, 1, 1, 2, 2, 2), nrow = 3, byrow = TRUE) agnes(m) Call:agnes(x = m) Agglomerative coefficient: 0.1614168 Order of objects: [1] 1 2 3 Height (summary): Min. 1st Qu. MedianMean 3rd Qu.Max. 1.155 1.247 1.339 1.339 1.431 1.524 Available components: [1] order height ac merge diss call method data But I have a large matrix (23371 rows, 50 columns) with some NAs in it and it runs for about a minute, then gives an error : agnes(iMatrix) Error in agnes(iMatrix) : No clustering performed, NA-values in the dissimilarity matrix. I've also tried getting rid of rows with all NAs in them, and it still gave me the same error. Is this a bug in agnes() ? It doesn't seem to fulfil the claim made by its documentation. I haven't looked in the file, but you need to get rid of all NA, or in other words, all rows that contain *any* NA values. If one believes the documentation, then that only applies to the case where `x` is a dissimilarity matrix. `NA`s are allowed if x is the raw data matrix or data frame. The only way the OP could have gotten that error with the call shown is if iMatrix were not a dissimilarity matrix inheriting from class dist, so `NA`s should be allowed. My guess would be that the OP didn't get rid of all the `NA`s. Dario: what does: sapply(iMatrix, function(x) any(is.na(x))) or if iMatrix is a matrix: apply(iMatrix, 2, function(x) any(is.na(x))) say? G Uwe Ligges The matrix I'm using can be obtained here : http://129.94.136.7/file_dump/dario/iMatrix.obj -- Dario Strbenac Research Assistant Cancer Epigenetics Garvan Institute of Medical Research Darlinghurst NSW 2010 Australia __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% -- Dario Strbenac Research Assistant Cancer Epigenetics Garvan Institute of Medical Research Darlinghurst NSW 2010 Australia
Re: [R] Quasi-poisson glm and calculating a qAIC and qAICc...trying to modilfy Bolker et al. 2009 function to work for a glm model
On Thu, 2011-01-27 at 08:20 -0500, Jason Nelson wrote: Sorry about re-posting this, it never went out to the mailing list when I posted this to r-help forum on Nabble and was pending for a few days, now that I am subscribe to the mailing list I hope that this goes out: I've been a viewer of this forum for a while and it has helped out a lot, but this is my first time posting something. I am running glm models for richness and abundances. For example, my beetle richness is overdispersed: qcc.overdispersion.test(beetle.richness) Overdispersion test Obs.Var/Theor.Var Statistic p-value poisson data 2.628131 23.65318 0.0048847 So, I am running a simple glm with my distribution as quasipoisson glm.richness1-glm(beetle.richness~log.area, family = quasipoisson) Now I want to calculate a qAIC and qAICc. I was trying to modify the equation that I found in Bolker et al 2009 supplemental material: QAICc - function(mod, scale, QAICc=TRUE){ LL - logLik(mod) You are out of luck there then; logLik is not defined for the quasi families. ll - as.numeric(LL) df - attr(LL, df) n - length(mod$y) #used $ to replace @ to make a S3 object if(QAICc) qaic = as.numeric( -2*ll/scale + 2*df + 2*df*(df+1)/(n-df-1)) else qaic =as.numeric( -2*ll/scale + 2*df) qaic } The only problem is that I have no idea how to scale it. In Bolker at al. 2009 it is scaled to phi: phi = lme4:::sigma(model) phi, IIRC, is the dispersion parameter. You can get the estimated value from `summary(model)$dispersion` where model is the result of a call to glm(). But I am not running a mixed model and I cannot run the qAICc function without scaling it. I am comparing models to each other trying to find the best model for both landscape land use land cover data and patch variables. How would I set the scale if I run this function? QAICc(glm.richness1, scale = ?) Should I set the scale to the square root of the deviance? phi = sqrt(glm.richness1$deviance) Your help is much appreciated. Instead of resorting to these ad-hoc approaches to correct for overdispersion, you would be better off fitting models that model the overdispersion. Try a negative binomial (glm.nb() in MASS) or the zeroinfl() and hurdle() functions in the pscl package. Those all have proper log likelihoods and you can compute/extract AIC simply using them. Regards, Jason HTH G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help needed
On Thu, 2011-01-27 at 09:50 -0500, Thomas Stewart wrote: For question 2, TTT - rt(1000,3) mean(TTT[rank(TTT) = 975 rank(TTT) 25]) mean(TTT, trim = 0.05) What you are doing is only removing 5% of observations in total, whilst the question asks for 5% removed off *each* end. G On Thu, Jan 27, 2011 at 8:16 AM, Ben Boyadjian benjy_cy...@hotmail.comwrote: Hello I am trying to solve these problems and I am not allowed to use loops or ifs. 1st Question My first question is that I have generated 100 random numbers from the uniform distribution then A)add only the negative integers. B)add elements until the first appearance of a negative element. I know how to choose the negative elements for A but how to find integers? And I dont know what to do for B. 2nd Question Simulate 1000 observations from the student-t distribution with 3 degrees of freedom and then calculate the truncated mean by excluding bottom 5% and top 5%. Thank yoou [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extrapolating values from a glm fit
On Wed, 2011-01-26 at 19:25 -1000, Ahnate Lim wrote: Even when I try to predict y values from x, let's say I want to predict y at x=0. Looking at the graph from the provided syntax, I would expect y to be about 0.85. Is this right: predict(mylogit,newdata=as.data.frame(0),type=response) Your original problem was the use of `newdata = as.data.frame(0.5)`. There are two problems here: i) if you don't name the input (x = 0.5, say) then you get a data frame with the name(s) 0.5: as.data.frame(0.5) 0.5 1 0.5 and ii) if you do name it, you still get a data frame with name(s) 0.5 as.data.frame(x = 0.5) 0.5 1 0.5 In both cases, predict wants to find a variable with the name `x` in the object supplied to `newdata`. It finds `x` but your `x` in the global workspace, but warns because it knows that `newdata` was a data frame with a single row - so there was a mismatch and you likely made a mistake. In these cases, `data.frame()` is preferred to `as.data.frame()`: predict(mylogit, newdata = data.frame(x = 0), type = response) or we can use a list, to save a few characters: predict(mylogit, newdata = list(x = 0), type = response) which give: predict(mylogit, newdata = list(x = 0), type = response) 1 0.813069 predict(mylogit, newdata = data.frame(x = 0), type = response) 1 0.813069 In summary, use `data.frame()` or `list()` to create the object passed as `newdata` and make sure you give the component containing the new values a *name* that matches the predictor in the model formula. HTH G # I get: Warning message: 'newdata' had 1 rows but variable(s) found have 34 rows # Why do I need 34 rows? So I try: v=vector() v[1:34]=0 predict(mylogit,newdata=as.data.frame(v),type=response) # And I get a matrix of 34 values that appear to fit a logistic function, instead of 0.85.. On Wed, Jan 26, 2011 at 6:59 PM, David Winsemius dwinsem...@comcast.netwrote: On Jan 26, 2011, at 10:52 PM, Ahnate Lim wrote: Dear R-help, I have fitted a glm logistic function to dichotomous forced choices responses varying according to time interval between two stimulus. x values are time separation in miliseconds, and the y values are proportion responses for one of the stimulus. Now I am trying to extrapolate x values for the y value (proportion) at .25, .5, and .75. I have tried several predict parameters, and they don't appear to be working. Is this correct use and understanding of the predict function? It would be nice to know the parameters for the glm best fit, but all I really need are the extrapolated x values for those proportions. Thank you for your help. Here is the code: x - c(-283.9, -267.2, -250.5, -233.8, -217.1, -200.4, -183.7, -167, -150.3, -133.6, -116.9, -100.2, -83.5, -66.8, -50.1, -33.4, -16.7, 16.7, 33.4, 50.1, 66.8, 83.5, 100.2, 116.9, 133.6, 150.3, 167, 183.7, 200.4, 217.1, 233.8, 250.5, 267.2, 283.9) y - c(0, 0.333, 0, 0, 0, 0, 0, 0, 0, 0.333, 0, 0.133, 0.238095238095238, 0.528, 0.567, 0.845238095238095, 0.55, 1, 0.889, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.5) weight - c(1, 3, 2, 5, 4, 4, 3, 5, 5, 4, 5, 11, 22, 11, 15, 16, 11, 7, 14, 10, 16, 19, 11, 5, 4, 5, 6, 9, 4, 2, 5, 5, 2, 2) mylogit - glm(y~x,weights=weight, family = binomial) # now I try plotting the predicted value, and it looks like a good fit, hopefully I can access what the glm is doing ypred - predict(mylogit,newdata=as.data.frame(x),type=response) plot(x, ypred,type=l) points(x,y) # so I try to predict the x value when y (proportion) is at .5, but something is wrong.. Right. Predict goes in the other direction ... x predicts y. Perhaps if you created a function of y w.r.t. x and then inverted it. ?approxfun # and see the posting earlier this week Inverse Prediction with splines where it was demonstrated how to reverse the roles of x and y. predict(mylogit,newdata=as.data.frame(0.5)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e
Re: [R] Extrapolating values from a glm fit
On Thu, 2011-01-27 at 00:10 -1000, Ahnate Lim wrote: Thank you for the clarification, this makes sense now! dose.p from MASS also does the job perfectly, in returning x values for specified proportions. I'm curious, if I use the other parameter in predict.glm type=link (instead of type=response), in the case of a logistic binomial, what does this predict? This is the predicted value(s) on the scale of the linear predictor, or the link function, depending on terminology, hence link. Recall that in the GLM the response and the linear predictor are related through a link function: g(y) = eta so for your model logit(y) = eta where eta is the linear predictor: beta_0 + beta_1 * x (in your example). beta_0 + beta_1 * x gives us the fitted value but on the untransformed scale. This is the value given by predict() when type = link is used. To get the predicted value on the response scale, we apply the inverse of the link function g(): y = g(eta)^-1 The inverse of the logit function is the inverse-logit: logit(eta)^-1 = exp(eta) / (1 + exp(eta)) So in R, we can see the relationship through a few simple commands. First, the prediction for x = 0.5 on the response scale: predict(mylogit, newdata = list(x = 0.5), type = response) 1 0.8149848 Then we compute the same prediction but on the scale of the link function: (p1 - predict(mylogit, newdata = list(x = 0.5), type = link)) 1 1.482732 to which we apply the inverse-logit function giving us the same value we got earlier for type = response: exp(p1) / (1 + exp(p1)) 1 0.8149848 There is a function to do this for us, however, called plogis() plogis(p1) 1 0.8149848 One reason why you might want predictions on the scale of the link function is for computation of confidence intervals using normal theory (e.g. 2-sigma ~95% confidence intervals). On the response scale these should be asymmetric and respect the scale of the response (bounding etc), so you compute them on the link scale and apply the inverse of the link function to get them on to the response scale. HTH G On Wed, Jan 26, 2011 at 11:41 PM, Gavin Simpson gavin.simp...@ucl.ac.uk wrote: On Wed, 2011-01-26 at 19:25 -1000, Ahnate Lim wrote: Even when I try to predict y values from x, let's say I want to predict y at x=0. Looking at the graph from the provided syntax, I would expect y to be about 0.85. Is this right: predict(mylogit,newdata=as.data.frame(0),type=response) Your original problem was the use of `newdata = as.data.frame(0.5)`. There are two problems here: i) if you don't name the input (x = 0.5, say) then you get a data frame with the name(s) 0.5: as.data.frame(0.5) 0.5 1 0.5 and ii) if you do name it, you still get a data frame with name(s) 0.5 as.data.frame(x = 0.5) 0.5 1 0.5 In both cases, predict wants to find a variable with the name `x` in the object supplied to `newdata`. It finds `x` but your `x` in the global workspace, but warns because it knows that `newdata` was a data frame with a single row - so there was a mismatch and you likely made a mistake. In these cases, `data.frame()` is preferred to `as.data.frame()`: predict(mylogit, newdata = data.frame(x = 0), type = response) or we can use a list, to save a few characters: predict(mylogit, newdata = list(x = 0), type = response) which give: predict(mylogit, newdata = list(x = 0), type = response) 1 0.813069 predict(mylogit, newdata = data.frame(x = 0), type = response) 1 0.813069 In summary, use `data.frame()` or `list()` to create the object passed as `newdata` and make sure you give the component containing the new values a *name* that matches the predictor in the model formula. HTH G # I get: Warning message: 'newdata' had 1 rows but variable(s) found have 34 rows # Why do I need 34 rows? So I try: v=vector() v[1:34]=0 predict(mylogit,newdata=as.data.frame(v),type=response) # And I get a matrix of 34 values that appear to fit a logistic function, instead of 0.85.. On Wed, Jan 26, 2011 at 6:59 PM, David Winsemius dwinsem...@comcast.netwrote: On Jan 26, 2011, at 10:52 PM
Re: [R] agnes clustering and NAs
On Thu, 2011-01-27 at 10:45 +0100, Uwe Ligges wrote: On 27.01.2011 05:00, Dario Strbenac wrote: Hello, In the documentation for agnes in the package 'cluster', it says that NAs are allowed, and sure enough it works for a small example like : m- matrix(c( 1, 1, 1, 2, 1, NA, 1, 1, 1, 2, 2, 2), nrow = 3, byrow = TRUE) agnes(m) Call:agnes(x = m) Agglomerative coefficient: 0.1614168 Order of objects: [1] 1 2 3 Height (summary): Min. 1st Qu. MedianMean 3rd Qu.Max. 1.155 1.247 1.339 1.339 1.431 1.524 Available components: [1] order height ac merge diss call method data But I have a large matrix (23371 rows, 50 columns) with some NAs in it and it runs for about a minute, then gives an error : agnes(iMatrix) Error in agnes(iMatrix) : No clustering performed, NA-values in the dissimilarity matrix. I've also tried getting rid of rows with all NAs in them, and it still gave me the same error. Is this a bug in agnes() ? It doesn't seem to fulfil the claim made by its documentation. I haven't looked in the file, but you need to get rid of all NA, or in other words, all rows that contain *any* NA values. If one believes the documentation, then that only applies to the case where `x` is a dissimilarity matrix. `NA`s are allowed if x is the raw data matrix or data frame. The only way the OP could have gotten that error with the call shown is if iMatrix were not a dissimilarity matrix inheriting from class dist, so `NA`s should be allowed. My guess would be that the OP didn't get rid of all the `NA`s. Dario: what does: sapply(iMatrix, function(x) any(is.na(x))) or if iMatrix is a matrix: apply(iMatrix, 2, function(x) any(is.na(x))) say? G Uwe Ligges The matrix I'm using can be obtained here : http://129.94.136.7/file_dump/dario/iMatrix.obj -- Dario Strbenac Research Assistant Cancer Epigenetics Garvan Institute of Medical Research Darlinghurst NSW 2010 Australia __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.