Re: [R-sig-eco] AIC in R: back-transforming standardized model parameters (slopes)

2016-01-11 Thread Drew Tyre
Hi Matt,

This isn't going to be a complete answer, but it might help. 

I wasn't 100% sure what standardize() was doing, or how it was doing it, so I 
did

getMethod("standardize","glm") 

to see the source code. That function calls standardize.default() which is a 
bit hard to get but 

getFromNamespace("standardize.default","arm")

pulls it out. From that code you can see that standardize() extracts the data 
from the model object, centers and scales it, and then refits the model to the 
centered and scaled data. So the formula you're looking for is

z.time = (time - mean(time))/(2*sd(time))

similar to a standard Z transformation but using 2 times the sd in the 
denominator. 
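
A minimal base-R sketch of that rescaling and its inverse, assuming a plain lm() with a single numeric predictor. The data are simulated and the sketch does not call arm at all; the names slope.back and int.back are just illustrative.

```r
# Simulated data: variable names mirror the thread, the values are made up.
set.seed(1)
time <- runif(50, 0, 100)
N <- 12.5 + 0.008 * time + rnorm(50, sd = 0.1)

# Rescale by hand: centre, then divide by two standard deviations.
z.time <- (time - mean(time)) / (2 * sd(time))

fit.raw <- lm(N ~ time)    # model on the original scale
fit.std <- lm(N ~ z.time)  # model on the rescaled predictor

# Back-transform the slope: divide the standardized slope by 2 * sd(time).
slope.back <- unname(coef(fit.std)["z.time"] / (2 * sd(time)))
# Back-transform the intercept: shift from time = mean(time) back to time = 0.
int.back <- unname(coef(fit.std)["(Intercept)"] - slope.back * mean(time))

# Both comparisons should agree up to numerical tolerance.
all.equal(slope.back, unname(coef(fit.raw)["time"]))
all.equal(int.back, unname(coef(fit.raw)["(Intercept)"]))
```

With an interaction in the model, the main-effect slopes also depend on where the other input was centred, so the one-line rule above is exact only for additive models.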

> Therefore I wished to know (preferably) the calculation being made, and more
> importantly the function/code to back-transform my slope estimates to
> reportable 'real' slopes.

Hmmm, but the main reason to do the transformation in the first place is to 
make it easier to do comparisons between the effect sizes of different 
variables. If you want to report "real" slopes, just use the ones from your 
model1, which should be near identical to the backtransformed versions of the 
ones from model2. 

Another reason for centering and scaling is to improve numerical stability of 
your estimates, but if you were able to fit model1 and it didn't complain, not 
sure that you need to bother with the standardization. 

There are other reasons too, but I don't think any of them apply here. There's 
a great discussion of when to scale and why here:
http://stats.stackexchange.com/questions/29781/when-conducting-multiple-regression-when-should-you-center-your-predictor-varia

-- 
Drew Tyre

School of Natural Resources
University of Nebraska-Lincoln
416 Hardin Hall, East Campus
3310 Holdrege Street
Lincoln, NE 68583-0974

phone: +1 402 472 4054 
fax: +1 402 472 2946
email: aty...@unl.edu
http://snr.unl.edu/tyre
http://aminpractice.blogspot.com
http://www.flickr.com/photos/atiretoo

> -Original Message-
> From: R-sig-ecology [mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of
> Matt Perkins
> Sent: Monday, January 11, 2016 1:40 AM
> To: r-sig-ecology@r-project.org
> Subject: [R-sig-eco] AIC in R: back-transforming standardized model parameters
> (slopes)
> 
> 
> Hi All,
> 
> 
> I have a simple Q that I'm having some difficulty finding an answer for. I'm
> conducting AIC analyses in R, and I would like to be able to report 'real'
> slope values from my model summary output (i.e. take the slopes and report
> them within simple y=mx+c linear equations in my paper). However, following
> Grueber et al. 2011, I have standardised the explanatory variables in my
> model by centering them to a mean of zero and an SD of 0.5, using the
> following code and the R package "arm". My model has a normal error
> distribution.
> 
> 
> stdz.model1<-standardize(model1, standardize.y=FALSE)
> 
> 
> 
> I do not yet know the sums behind this code in order to know how and what
> change has been made to my explanatory variables, in order that I could
> manually make back-transformations.
> 
> 
> Therefore I wished to know (preferably) the calculation being made, and more
> importantly the function/code to back-transform my slope estimates to
> reportable 'real' slopes.
> 
> 
> Additionally, is it correct (or does it even matter) that I should be
> focusing my back-transformation on the slope estimate taken from the model
> summary, as opposed to instead using the model summary standardised slope
> estimate to calculate a y value in my linear equation (y=mx+c), and then
> back-transforming that y value?
> 
> 
> 
> ##
> 
> 
> If it is useful, my model and summary tables are below.
> 
> 
> I would like to test if treatment (kept in air or ice) affects nitrogen (N)
> within shrimps over time. I have repeated measures per shrimp (unique.id)
> that I use as a random factor to account for non-independence within an
> individual.
> 
> 
> My model is a linear mixed model of the form:
> 
> 
> model1<-lmer( N ~ time* air.ice + (1|unique.id), data=shrimp, REML=FALSE)
> 
> 
> 
> The two model summary tables below show the 1) un-standardised model with
> ready-to-use slope value ("time" = 0.008156)
> 
> and 2) standardised model with much-larger slope value ("z.time" = 0.50782)
> 
> 
> 1)
> 
> Fixed effects:
>                Estimate   Std. Error  t value
> (Intercept)   12.522535   0.197024     63.56
> time           0.008156   0.001134      7.19
> air.ice       -0.801936   0.278634     -2.88
> time:air.ice  -0.004442   0.001604     -2.77
> 
> 
> 2)
> 
>              Estimate  Std. Error  t value
> (Intercept)  12.48242  0.13051     95.65
> z.time        0.50782  0.06861

Re: [R-sig-eco] Number of Groups in SpeciesMix

2016-01-11 Thread Scott Foster

Dear Alexandre,

I'm glad that you are using species archetypes models (SAMs).  I hope that SAMs 
can answer your questions succinctly.

I think that a lot will be clarified if you look at the example in the help 
file for clusterSelect().  There you will see that:

1) obs is just a dummy -- you can leave it out (~1+x) or insert whatever you
like (anything~1+x).  This is unfortunate nomenclature, but bearable I think.
2) dat1$pa contains all the species observations.  Note that all the columns
(species) in this data.frame are used in the analysis.  So, make sure that you
remove any unwanted species prior to passing it as an argument.
3) dat is the data.frame containing all the environmental covariates.  Note
that the number of rows in dat should match dat1$pa (you should get an error
if not).  The model fitting function will extract the right terms, and
functions thereof (just like lm or glm will do).


So, in terms of your specific questions:

1) obs doesn't stand for the species data.  It doesn't even stand for anything.
Ignore it or just type anything at all.
2) You should put your species data in the sp.data argument (all species to be 
included in the analysis and no more)
3) You should put your environmental data in the covar.data, and the right bits 
of it will be extracted according to the right hand side of your formula.
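
As a rough illustration of why the left-hand side can be a dummy, here is a base-R sketch of building a design matrix from the right-hand side of a formula only. It shows R's general formula machinery, not SpeciesMix internals: whether clusterSelect() does exactly this is an assumption, and the data frame and the helper rhs.matrix() are made up.

```r
# Hypothetical covariate table; only x appears on the right-hand side.
dat <- data.frame(x = c(0.1, 0.5, 0.9, 1.3), depth = c(10, 20, 30, 40))

# Drop the response and build the design matrix from the right-hand side only.
rhs.matrix <- function(formula, data) {
  model.matrix(delete.response(terms(formula)), data = data)
}

# 'obs' and 'anything' do not exist anywhere; they are never evaluated.
X.obs      <- rhs.matrix(obs ~ 1 + x, dat)
X.anything <- rhs.matrix(anything ~ 1 + x, dat)
X.none     <- rhs.matrix(~ 1 + x, dat)

colnames(X.obs)                        # intercept and x; depth is not used
all.equal(X.obs, X.anything, check.attributes = FALSE)
all.equal(X.obs, X.none, check.attributes = FALSE)
```

The unused column depth never enters the matrix, which is the sense in which "the right bits" of the covariate data are picked out by the formula.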

I would encourage you to look at the example in ?clusterSelect.  See how the
(simulated) data set is arranged into species data and environmental data, and
how they are passed to clusterSelect().


I'm happy to help as much as I can, either on list or off (but preferably not
both).  I'm also happy to take suggestions about how the package/method can be
improved.


Regards,

Scott (contributor to, but not author of, SpeciesMix)

On 12/01/16 06:29, Alexandre F. Souza wrote:

> Dear friends,
>
> I would like to apply the SAM analytical framework to a dataset of plant
> species in coastal Brazil using the SpeciesMix package. The SpeciesMix
> package fits Species Archetype Models, a special type of finite mixture of
> regression model motivated by the analysis of multi-species data.
>
> In applying function clusterSelect, which helps in defining the best number
> of species groups G, I would like to confirm if my understanding is
> correct: the formula reported there as "obs ~ 1 + x" in
>
> clusters <- clusterSelect(obs~1+x,dat1$pa,dat,G=2:5,em.refit=2)
>
> is a generic formulation not directly related to the species (dat1$pa) or
> environmental (dat) data, isn't it? So in principle I should use this same
> formulation as well, understanding that obs stands for the whole species
> data matrix, 1 for the presence of a constant, and x for the whole
> explanatory dataset?
>
> However, when I tried to apply obs~1+x it returned an error message.
>
> I am rather stuck here, so any thoughts could help...
>
> Sincerely,
>
> Alexandre



--
Scott Foster
CSIRO
E scott.fos...@csiro.au T +61 3 6232 5178
Postal address: CSIRO Marine Laboratories, GPO Box 1538, Hobart TAS 7001
Street Address: CSIRO, Castray Esplanade, Hobart Tas 7001, Australia
www.csiro.au

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] NMDS axes scores

2016-01-11 Thread Jari Oksanen
Contrary to common misbelief, NMDS ordination space is **metric**. In vegan, 
the ordination space (= the ordination result) is even guaranteed to be 
Euclidean (in isoMDS it can be Minkowski, but this is not allowed with vegan). 
What is non-metric is the regression from observed dissimilarities to the 
Euclidean distances in ordination space. The reason why we do not recommend 
using NMDS axes as independent beasts is that NMDS tries to preserve the 
*distances* among points. Any orthogonal rotation (= turning of ordination 
space) will change scores along rotated axes, but retain the distances among 
points. The vegan NMDS result is rotated to principal components, but still you 
should avoid thinking that this makes dimensions independent from each other, 
although the first maximizes the dispersion of points and axes are orthogonal 
(non-correlated).
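
A small base-R sketch of the rotation argument, using made-up points rather than a real NMDS result (no vegan call is involved, and the rotation angle is arbitrary): an orthogonal rotation changes the axis scores but leaves the inter-point distances alone, and a rotation to principal components gives uncorrelated axes without changing the distances either.

```r
set.seed(7)
pts <- matrix(rnorm(20), ncol = 2)          # 10 made-up points in 2 dimensions

theta <- pi / 5                             # arbitrary rotation angle
rot.mat <- matrix(c(cos(theta), sin(theta),
                    -sin(theta), cos(theta)), nrow = 2)
turned <- pts %*% rot.mat                   # orthogonal rotation of the space

# Inter-point distances are retained ...
dist.same <- isTRUE(all.equal(as.vector(dist(pts)), as.vector(dist(turned))))
# ... but the scores along the first axis are not.
axis1.same <- isTRUE(all.equal(pts[, 1], turned[, 1]))

# Rotating to principal components also keeps the distances, and the
# resulting axes are uncorrelated.
pc <- prcomp(pts)$x
pc.dist.same <- isTRUE(all.equal(as.vector(dist(pts)), as.vector(dist(pc))))
pc.cor <- cor(pc[, 1], pc[, 2])

c(dist.same = dist.same, axis1.same = axis1.same, pc.dist.same = pc.dist.same)
```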

PCA ordination is Euclidean in the same way as NMDS. The differences from NMDS are 
that (1) only Euclidean distances among sampling units can be used in PCA (in 
NMDS you can use any adequate dissimilarity), and (2) the mapping is linear 
(instead of non-metric) from observed dissimilarities to Euclidean 
dissimilarities. Try function stressplot() in vegan to see what this means — it 
is available both for NMDS and rda (PCA) results.  CA is similar to PCA except 
that it is based on weighted Euclidean distances. I won’t go into mathematical 
details, but you can see ?wcmdscale in vegan to see how to get CA as a weighted 
Euclidean ordination of Chi-square transformed data. 
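
One way to see point (1) in base R, as a sketch on simulated data: metric (classical) scaling of Euclidean distances with cmdscale() should reproduce the PCA scores from prcomp() up to the sign of each axis. The matrix X is made up, and this does not touch the vegan functions mentioned above.

```r
set.seed(11)
X <- matrix(rnorm(60), ncol = 3)            # 20 made-up sampling units, 3 variables

pca.scores <- prcomp(X)$x[, 1:2]            # PCA on the centred data
pco.scores <- cmdscale(dist(X), k = 2)      # metric scaling of Euclidean distances

# Axes are defined only up to sign, so compare absolute values.
same.config <- isTRUE(all.equal(abs(unname(pca.scores)),
                                abs(unname(pco.scores))))
same.config
```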

PCA and CA have some ordering criteria for their axes and therefore some people 
have used axes from those as independent beasts. I think this is dubious, too, 
but people do it all the time. The PCA/CA also define a multivariate space, and 
taking only one axis as an independent object sounds strange, in particular if 
you take something other than the first axes. 

So what to do with NMDS axes? If you take all NMDS axes and their interactions 
in a regression of type ~ axis1 + axis2 + axis1:axis2 then this is equal to 
fitting a linear trend surface, and the interaction term axis1:axis2 takes care 
that the result is invariant under rotation of NMDS space. Function ordisurf() 
in vegan gives further ideas on how to fit surfaces to NMDS *space* (instead of 
single axes). Also, if you think that some direction in NMDS (not necessarily 
parallel to the axes) is good and you have an indicator variable for that, you 
can use MDSrotate() function in vegan to rotate your solution to that direction 
and then take that rotated axis as your explanatory variable. 

HTH, Jari Oksanen

> On 11 Jan 2016, at 10:38 am, Martin Weiser  wrote:
> 
> Hi Conny,
> 
> AFAIK NMDS is *non-metric* and represents distances among objects, not
> gradients along axes (known or unknown): distances along axes are
> stretched as needed locally (NMDS works with rank order), even order of
> the elements along axes does not tell anything. NMDS is great if you
> want to say: Object A resembles object C more than it resembles object
> B, even though C and B are quite similar.
> Try this: run NMDS several times, aim for different number of axes (e.g.
> 1,2,3,5,10) and note the scores of the objects along the first one.  You
> *may* get the same thing.
> 
> If you need scores of the objects in the ordination, use something with
> well defined metrics and axes, e.g. PCA, CA.
> 
> HTH,
> Martin
> 
> On 9.1.2016 05:41, Conny wrote:
>> Hi all,
>> 
>> 
>> 
>> it has been frequently pointed out in this group, that NMDS axes scores
>> shouldn't be used individually for further analysis.  
>> 
>> I therefore would like to include both of my NMDS site scores as a response
>> into a GLM model simultaneously.  Unfortunately, I couldn't find any advice
>> on how to actually do this. I found a  couple of papers using NMDS scores in
>> GLMs, but they all seem to use them individually, fitting separate models to
>> each of the ordination axes.
>> 
>> 
>> 
>> I'm a bit at a loss here and any advice is very much appreciated,
>> 
>> Conny
>> 
>> 
> 
> 
> -- 
> If this e-mail is part of a business negotiation, the Faculty of Science of
> Charles University in Prague:
> a) reserves the right to end the negotiation at any time, without giving a
> reason,
> b) stipulates that the contract must be in writing,
> c) rules out acceptance of an offer with an addition or deviation,
> d) stipulates that the contract is concluded only once agreement has been
> expressly reached on all of its terms.
> 

Re: [R-sig-eco] NMDS axes scores

2016-01-11 Thread Roman Luštrik
Thank you Jari for an, as always, insightful email. It has been a gut
feeling of mine for quite some time that using PCA scores as independent
variables is at least a little wrong, but I never found any reference to
substantiate it. I would like to use this opportunity to ask you or other
readers if there are any (critical) references available regarding this
usage of PCA scores?

Cheers,
Roman


Re: [R-sig-eco] NMDS axes scores

2016-01-11 Thread Martin Weiser
Hi Conny,

AFAIK NMDS is *non-metric* and represents distances among objects, not
gradients along axes (known or unknown): distances along axes are
stretched as needed locally (NMDS works with rank order), even order of
the elements along axes does not tell anything. NMDS is great if you
want to say: Object A resembles object C more than it resembles object
B, even though C and B are quite similar.
Try this: run NMDS several times, aim for different number of axes (e.g.
1,2,3,5,10) and note the scores of the objects along the first one.  You
*may* get the same thing.

If you need scores of the objects in the ordination, use something with
well defined metrics and axes, e.g. PCA, CA.
 
HTH,
Martin



Re: [R-sig-eco] NMDS axes scores

2016-01-11 Thread Bob O'Hara

On 11/01/16 12:08, Roman Luštrik wrote:

> Thank you Jari for an, as always, insightful email. It has been a gut
> feeling of mine for quite some time that using PCA scores as independent
> variables is at least a little wrong, but I never found any reference to
> substantiate it. I would like to use this opportunity to ask you or other
> readers if there are any (critical) references available regarding this
> usage of PCA scores?

Hadi & Ling is a good reference:

Hadi A, Ling R. Some cautionary notes on the use of principal components 
regression. Am Stat [Internet]. 1998 [cited 2012 Mar 21];52(1):15–9. 
Available from: http://www.jstor.org/stable/10.2307/2685559


Basically, if you have all your PCs in the model, you're OK (because 
that's just a rotation of the covariates anyway: in fact regression 
algorithms do their own rotation anyway). Similarly, with MDS it only 
makes sense if you include all of the axes.
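
A quick base-R check of the "all PCs is just a rotation" point, as a sketch on simulated covariates (the data and coefficients are made up): regressing on every principal component should give the same fitted values as regressing on the raw covariates, while dropping a component generally does not.

```r
set.seed(3)
X <- matrix(rnorm(90), ncol = 3, dimnames = list(NULL, c("x1", "x2", "x3")))
y <- drop(1 + X %*% c(0.5, -1, 2) + rnorm(30, sd = 0.5))

pcs <- prcomp(X)$x                  # all three principal components

fit.raw  <- lm(y ~ X)               # raw covariates
fit.all  <- lm(y ~ pcs)             # every PC: same column space as X
fit.some <- lm(y ~ pcs[, 1:2])      # first two PCs only

all.same  <- isTRUE(all.equal(unname(fitted(fit.raw)), unname(fitted(fit.all))))
some.same <- isTRUE(all.equal(unname(fitted(fit.raw)), unname(fitted(fit.some))))
c(all.same = all.same, some.same = some.same)
```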


Whilst I'm filling bandwidth, I'm not sure Jari's suggestion that you 
need the interaction term is correct. If a model is linear in axis1 and 
axis2, then any rotation is also linear, i.e. the transformation is c_1 
axis1 + c_2 axis2. So you only need Y ~ axis1 + axis2. Basically, it's 
still a plane. The interaction adds a curve: if you want that you also 
need to include quadratic terms, i.e. Y ~ axis1 + axis2 + axis1^2 + 
axis2^2 + axis1:axis2.
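
The rotation argument can be checked numerically. The sketch below uses made-up scores in place of a real ordination and an arbitrary rotation angle; fit_fitted() is just a helper for this example. Note that in R formula syntax the squared terms need I(), since a bare axis1^2 is read as formula crossing rather than arithmetic.

```r
set.seed(42)
n <- 40
scores <- cbind(axis1 = rnorm(n), axis2 = rnorm(n))   # made-up axis scores
y <- 1 + 2 * scores[, 1] - scores[, 2] + rnorm(n, sd = 0.3)

theta <- pi / 6                                       # arbitrary rotation angle
rot.mat <- matrix(c(cos(theta), sin(theta),
                    -sin(theta), cos(theta)), nrow = 2)
rotated <- scores %*% rot.mat
colnames(rotated) <- c("axis1", "axis2")

d0 <- data.frame(y = y, scores)    # original axes
d1 <- data.frame(y = y, rotated)   # rotated axes

# Fitted values of a given model formula on a given set of axes.
fit_fitted <- function(formula, data) unname(fitted(lm(formula, data = data)))
same_fit <- function(formula) {
  isTRUE(all.equal(fit_fitted(formula, d0), fit_fitted(formula, d1)))
}

plane.same <- same_fit(y ~ axis1 + axis2)             # plane
inter.same <- same_fit(y ~ axis1 + axis2 + axis1:axis2)  # plane + interaction
quad.same  <- same_fit(y ~ axis1 + axis2 + I(axis1^2) + I(axis2^2) + axis1:axis2)

c(plane = plane.same, interaction.only = inter.same, full.quadratic = quad.same)
```

The plane and the full quadratic surface give the same fit before and after rotation; the model with the interaction but no squared terms does not.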


Bob




Re: [R-sig-eco] NMDS axes scores

2016-01-11 Thread Jari Oksanen

> On 11 Jan 2016, at 14:13 pm, Bob O'Hara  wrote:
> 
> 
> Whilst I'm filling bandwidth, I'm not sure Jari's suggestion that you need 
> the interaction term is correct. If a model is linear in axis1 and axis2, 
> then any rotation is also linear, i.e. the transformation is c_1 axis1 + c_2 
> axis2. So you only need Y ~ axis1 + axis2. Basically, it's still a plane. The 
> interaction adds a curve: if you want that you also need to include quadratic 
> terms, i.e. Y ~ axis1 + axis2 + axis1^2 + axis2^2 + axis1:axis2.
> 
Me neither: I think my suggestion to include interaction term was wrong. This 
belief seems to be shared by vegan authors who define linear trend surface 
without interaction in ordisurf().

Cheers, Jari O.



Re: [R-sig-eco] NMDS axes scores

2016-01-11 Thread Jari Oksanen
Zoltan,

You’d better ask Bob…

If you really want to get a synthetic (latent) variable with reduced noise, I 
think you really should be doing Factor Analysis. In particular, you should 
use confirmatory factor analysis, a.k.a. the measurement model in latent linear 
structural models. Often taking the first axes of PCA can do about the same, 
except that you rarely have sound model construction in PCA. I’m getting more 
liberal minded with age, and I can accept many kinds of analyses, though.

Cheers, Jari O.
> On 11 Jan 2016, at 17:29 pm, Zoltan Botta-Dukat 
>  wrote:
> 
> Dear Jari,
> 
> What is your opinion about using the first few axes of a metric ordination?
> I'm aware that it is meaningless to use the first two axes of an NMDS
> ordination that was calculated for three dimensions. But in my experience,
> it is often useful to use only the first few axes of a metric ordination
> instead of the raw data: no ecologically relevant information is lost, only
> the 'noise' is reduced.
> 
> Best wishes
> 
> Zoltan
> 
Re: [R-sig-eco] NMDS axes scores

2016-01-11 Thread Zoltan Botta-Dukat

Dear Jari,

What is your opinion about using the first few axes of a metric ordination? 
I'm aware that it is meaningless to use the first two axes of an NMDS 
ordination that was calculated for three dimensions. But in my experience, it 
is often useful to use only the first few axes of a metric ordination instead 
of the raw data: no ecologically relevant information is lost, only the 
'noise' is reduced.


Best wishes

Zoltan

2016.01.11. 11:11 keltezéssel, Jari Oksanen írta:

Contrary to common misbelief, NMDS ordination space is **metric**. In vegan, 
the ordination space (= the ordination result) is even guaranteed to be 
Euclidean (in isoMDS it can be Minkowski, but this is not allowed with vegan). 
What is non-metric is the regression from observed dissimilarities to the 
Euclidean distances in ordination space. The reason why we do not recommend 
using NMDS axes as independent beasts is that NMDS tries to preserve the 
*distances* among points. Any orthogonal rotation (= turning of ordination 
space) will change scores along rotated axes, but retain the distances among 
points. The vegan NMDS result is rotated to principal components, but still you 
should avoid thinking that this makes dimensions independent from each other, 
although the first maximizes the dispersion of points and axes are orthogonal 
(non-correlated).
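
The rotation argument above can be checked in a few lines of base R. This is only a sketch: the scores are simulated random numbers standing in for a two-dimensional ordination result (not real NMDS output), and the 30 degree angle is arbitrary.

```r
set.seed(1)
# Hypothetical 2-D ordination scores for 10 sites (simulated, not NMDS output)
scores <- matrix(rnorm(20), ncol = 2)

# Orthogonal rotation of the ordination space by an arbitrary 30 degrees
theta <- pi / 6
rot <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), ncol = 2)
rotated <- scores %*% rot

# Scores along the first axis change under rotation ...
scores_changed <- !isTRUE(all.equal(scores[, 1], rotated[, 1]))
# ... but the distances among points are retained
dist_kept <- isTRUE(all.equal(as.vector(dist(scores)), as.vector(dist(rotated))))

scores_changed
dist_kept
```

Both flags should come out TRUE: the configuration is the same, only the coordinates used to describe it differ.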

PCA ordination is Euclidean in the same way as NMDS. The differences from NMDS are 
that (1) only Euclidean distances among sampling units can be used in PCA (in 
NMDS you can use any adequate dissimilarity), and (2) the mapping is linear 
(instead of non-metric) from observed dissimilarities to Euclidean 
dissimilarities. Try function stressplot() in vegan to see what this means; it 
is available both for NMDS and rda (PCA) results.  CA is similar to PCA except 
that it is based on weighted Euclidean distances. I won’t go into mathematical 
details, but you can see ?wcmdscale in vegan to see how to get CA as a weighted 
Euclidean ordination of Chi-square transformed data.
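
A minimal base R sketch of the PCA side of this comparison, assuming simulated data and using prcomp() rather than vegan's rda(): with all axes retained, distances in PCA space equal the Euclidean distances among sampling units, and dropping axes can only shorten them.

```r
set.seed(2)
# Hypothetical data: 8 sampling units, 4 variables (simulated)
x <- matrix(rnorm(32), nrow = 8)

pca   <- prcomp(x)             # centred, unscaled PCA
d_obs <- dist(x)               # Euclidean distances among sampling units
d_all <- dist(pca$x)           # distances in the full PCA space
d_two <- dist(pca$x[, 1:2])    # distances using the first two axes only

# The full PCA space reproduces the observed Euclidean distances
full_space_exact <- isTRUE(all.equal(as.vector(d_obs), as.vector(d_all)))
# Projecting onto fewer axes never lengthens a distance
two_axes_shrink <- all(as.vector(d_two) <= as.vector(d_obs) + 1e-12)

full_space_exact
two_axes_shrink
```

Plotting as.vector(d_two) against as.vector(d_obs) gives roughly the kind of picture that stressplot() draws for a PCA result.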

PCA and CA have some ordering criteria for their axes and therefore some people 
have used axes from those as independent beasts. I think this is dubious, too, 
but people do it all the time. The PCA/CA also define a multivariate space, and 
taking only one axis as an independent object sounds strange, in particular if 
you take something other than the first axis.

So what to do with NMDS axes? If you take all NMDS axes and their interactions 
in a regression of type ~ axis1 + axis2 + axis1:axis2 then this is equal to 
fitting a linear trend surface, and the interaction term axis1:axis2 takes care 
that the result is invariant under rotation of NMDS space. Function ordisurf() 
in vegan gives further ideas how to fit surfaces to NMDS *space* (instead of 
single axes). Also, if you think that some direction in NMDS (not necessarily 
parallel to the axes) is good and you have an indicator variable for that, you 
can use MDSrotate() function in vegan to rotate your solution to that direction 
and then take that rotated axis as your explanatory variable.
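
The idea of rotating towards an external variable can be sketched by hand in base R. This is an illustration only, with simulated scores and a made-up variable env; it is not the code of MDSrotate(), which should be preferred with real vegan results.

```r
set.seed(3)
# Hypothetical 2-D ordination scores for 20 sites and a hypothetical variable
scores <- matrix(rnorm(40), ncol = 2)
env <- 2 * scores[, 1] + 1 * scores[, 2] + rnorm(20, sd = 0.1)

# Direction of steepest increase of env in the ordination plane
b <- unname(coef(lm(env ~ scores))[-1])
u <- b / sqrt(sum(b^2))              # unit vector along that direction
rot <- cbind(u, c(-u[2], u[1]))      # orthogonal matrix, first column = u
rotated <- scores %*% rot

# After rotation env changes along axis 1 only: the slope on axis 2 is ~0
b_rot <- unname(coef(lm(env ~ rotated))[-1])
# The rotation leaves the distances among sites untouched
dist_kept <- isTRUE(all.equal(as.vector(dist(scores)), as.vector(dist(rotated))))

b_rot
dist_kept
```

The first column of rotated is then the axis one would carry forward as an explanatory variable.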

HTH, Jari Oksanen


On 11 Jan 2016, at 10:38 am, Martin Weiser wrote:

Hi Conny,

AFAIK NMDS is *non-metric* and represents distances among objects, not
gradients along axes (known or unknown): distances along axes are
stretched as needed locally (NMDS works with rank order), even order of
the elements along axes does not tell anything. NMDS is great if you
want to say: Object A resembles object C more than it resembles object
B, even though C and B are quite similar.
Try this: run NMDS several times, aim for different number of axes (e.g.
1,2,3,5,10) and note the scores of the objects along the first one.  You
*may* get the same thing.

If you need scores of the objects in the ordination, use something with
well defined metrics and axes, e.g. PCA, CA.

HTH,
Martin

On 9.1.2016 05:41, Conny wrote:

Hi all,



it has been frequently pointed out in this group, that NMDS axes scores
shouldn't be used individually for further analysis.

I therefore would like to include both of my NMDS site scores as a response
into a GLM model simultaneously.  Unfortunately, I couldn't find any advice
on how to actually do this. I found a couple of papers using NMDS scores in
GLMs, but they all seem to use them individually, fitting separate models to
each of the ordination axes.



I'm a bit at a loss here and any advice is very much appreciated,

Conny



___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology



[R-sig-eco] Number of Groups in SpeciesMix

2016-01-11 Thread Alexandre F. Souza
Dear friends,

I would like to apply the SAM analytical framework to a dataset of plant
species in coastal Brazil using the SpeciesMix package. The SpeciesMix
package fits Species Archetype Models, a special type of finite mixture of
regression models motivated by the analysis of multi-species data.

In applying the function clusterSelect, which helps in defining the best number
of species groups G, I would like to confirm whether my understanding is
correct: the formula reported there as "obs ~ 1 + x" in

clusters <- clusterSelect(obs~1+x,dat1$pa,dat,G=2:5,em.refit=2)

is a generic formulation not directly related to the species (dat1$pa) or
environmental (dat) data, isn't it? So in principle I should use this same
formulation as well, understanding that obs stands for the whole species
data matrix, 1 for the presence of a constant, and x for the whole
explanatory dataset?

However, when I tried to apply obs~1+x, it returned an error message.

I am rather stuck here, so any thoughts would help...

Sincerely,

Alexandre

-- 
Dr. Alexandre F. Souza
Professor Adjunto III
Universidade Federal do Rio Grande do Norte
CB, Departamento de Ecologia
Campus Universitário - Lagoa Nova
59072-970 - Natal, RN - Brasil
lattes: lattes.cnpq.br/7844758818522706
http://www.docente.ufrn.br/alexsouza
