Re: [R-sig-eco] Vegan metaMDS: unusual first run stress values with large data set

2012-12-12 Thread Jari Oksanen
Hello R-Community,

First my thanks to Ewan Isherwood who turned our attention to this issue and 
sent his data file to us for analysing the situation. 

It seems that the default convergence criteria are too slack in monoMDS(), 
which was the ordination engine of metaMDS() in this case. The good news is 
that you can change those criteria by adding the argument 'sfgrmin' to the 
metaMDS() call (this is documented in ?monoMDS). The following command seems 
to work:

 PSU.NMDS <- metaMDS(PSU.sp, k = 2, sfgrmin = 1e-7, distance = "jaccard")


The default was 'sfgrmin = 1e-5', which was so slack that the iteration stopped 
early and did not really converge close to the solution. With this option you 
can find that the correct stress is about 0.029, which is much lower than 
reported below. Moreover, the stresses of the one-dimensional and 
two-dimensional solutions are very close to each other. (There was one outlier 
(P1763E) which had only one species (CHICRA) that occurred in only four other 
sites and distorted the results.)
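
To see how much the gradient criterion matters on a given data set, a 
comparison along the following lines may help. This is only a sketch under 
stated assumptions: it uses vegan's built-in 'dune' data instead of the PSU 
data, calls monoMDS() directly rather than through metaMDS(), and the shared 
random start 'init' is an illustrative choice. On a small data set the two 
runs may well end at the same stress.

```r
library(vegan)

data(dune)                               # stand-in data, not the PSU data
d <- vegdist(dune, method = "jaccard")

# Use the same random starting configuration for both runs, so that any
# difference comes from the stopping rule only (assumption for illustration)
set.seed(42)
init <- matrix(runif(2 * nrow(dune)), ncol = 2)

loose <- monoMDS(d, y = init, k = 2, sfgrmin = 1e-5)  # slack criterion
tight <- monoMDS(d, y = init, k = 2, sfgrmin = 1e-7)  # stricter criterion

# Final stress and number of iterations for each run
print(c(loose = loose$stress, tight = tight$stress))
print(c(loose = loose$iters,  tight = tight$iters))
```

Printing the 'loose' and 'tight' objects also shows why each iteration 
stopped, which indicates whether the gradient criterion was the one that 
ended the run.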

I advise *against* using 'zerodist = "add"': it is not needed with monoMDS(). 
Identical (distance = 0) sites will have identical scores if you do not use 
this argument. Using 'zerodist = "add"' is only necessary with MASS::isoMDS(), 
which is unable to handle zero distances.
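
A quick way to look at zero dissimilarities is sketched below. It is a 
hedged illustration only: the duplicated site is artificial (a copy of the 
first row of vegan's 'dune' data), and the names 'comm', 'n_zero' and 'gap' 
are made up for the example.

```r
library(vegan)

data(dune)
# Append a copy of the first site so that one pair of sites is identical
comm <- dune[c(seq_len(nrow(dune)), 1), ]
d <- vegdist(comm, method = "jaccard")

# Number of site pairs with zero dissimilarity
n_zero <- sum(d == 0)

# monoMDS() is run on the dissimilarities as they are, zeros included
set.seed(1)
m <- monoMDS(d, k = 2)

# Euclidean distance in the ordination between the two identical sites
n <- nrow(comm)
gap <- sqrt(sum((m$points[1, ] - m$points[n, ])^2))

print(n_zero)
print(gap)
```

If the identical sites end up with the same scores, 'gap' should be zero or 
very close to it.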

We have changed the default of 'sfgrmin' on R-Forge 
(http://www.r-forge.r-project.org/) so that you should not see this problem 
in the next vegan releases.

Cheers, Jari Oksanen

-- 
Jari Oksanen, Dept Biology, Univ Oulu, 90014 Finland
jari.oksa...@oulu.fi, Ph. +358 400 408593, http://cc.oulu.fi/~jarioksa

[R-sig-eco] Vegan metaMDS: unusual first run stress values with large data set

2012-12-05 Thread Ewan Isherwood
Hello, R-Community! This is the first time writing to this group and
indeed the first time using a mailing list, so please bear with me if
I’ve done something wrong.

I have a large species x site matrix (89 x 4831) that I want to
ordinate using metaMDS in the Vegan (2.0-5) package in R (2.15.2). If
I run this data frame using the Jaccard index in two or more
dimensions (k > 1), the first run (run=0) has a relatively low stress
value and the other 20 runs are much higher and have very low
deviation. However, k=1 seems to work fine. Furthermore, a
stress/scree plot reveals a pyramid-like shape, where the k=1 lowest
stress value is low, increases rapidly for k=2 then decreases slowly
as k increases.

Dimensions  Stress
1   0.1382185
2   0.1939509
3   0.1695375
4   0.155221
5   0.1406408
6   0.1294149
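
A stress-by-dimension table like the one above can be produced with a short 
loop. The following is a minimal sketch, assuming vegan's built-in 'dune' 
data as a stand-in for the PSU data; the names 'ks' and 'scree' are 
illustrative only.

```r
library(vegan)

data(dune)                     # stand-in data, not the PSU data
ks <- 1:4                      # numbers of dimensions to try

# Run metaMDS() once per k and keep the best stress found for each
set.seed(123)
stress <- sapply(ks, function(k) {
  metaMDS(dune, k = k, distance = "jaccard", trace = 0)$stress
})

scree <- data.frame(Dimensions = ks, Stress = stress)
print(scree)
```

With a well-converged solution one would normally expect the stress not to 
increase as k grows, so a rise from k = 1 to k = 2 is worth investigating.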

I’ve tried this with a small iteration of this data and this issue
arises at k > 2 rather than at k > 1 as it is here. Anyway, this is the
input and output:

library(vegan)
library(MASS)
PSU <- read.table("PSU.txt", header = TRUE, sep = "")
PSU.sp <- PSU[, 22:110]
PSU.NMDS <- metaMDS(PSU.sp, k = 4, zerodist = "add", distance = "jaccard")

Square root transformation
Wisconsin double standardization
Zero dissimilarities changed into  0.0006657301
Run 0 stress 0.155221
Run 1 stress 0.2548103
Run 2 stress 0.255434
Run 3 stress 0.2551382
… (Up to run 20 where run 1 through run 20 have all very similar stress values.)

Call:
metaMDS(comm = PSU.sp, distance = "jaccard", k = 4, zerodist = "add")

global Multidimensional Scaling using monoMDS

Data: wisconsin(sqrt(PSU.sp))
Distance: jaccard

Dimensions: 4
Stress: 0.155221
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on ‘wisconsin(sqrt(PSU.sp))’

Now, again, with k=1 this does not happen – the solution looks like
any other regular NMDS run. There are no blank values in the data as
they are all numbers between 0 and 100 corresponding to % cover, and
every row and column sum is greater than 0. There are many sites with
the same species configurations, hence the zerodist, but omitting this
makes no difference to the problem at hand. The NMDS works fine if I
use a subset of the data, but I have not subsetted and tested all of
it. Other metric (Euclidean) and nonmetric (Bray) dissimilarity
indices result in the same effect. I’ve chosen k=4 here because of the
(marginal) elbow in the stress plot, but the data itself actually
looks pretty good at any k value. Even though the output is
reasonable, I am concerned that hitting the best solution by a large
amount on the first run means something is messing up, and this
concern is amplified by the strange pyramid-shaped stress plot.
Because metaMDS uses random starts, I don't see how this output is
possible. I've scoured the help files and archives of this list and I
am really now at a loss to explain this.
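
The data checks described in this paragraph (no blank values, all row and 
column sums greater than 0, sites with the same species configuration) can 
be run in a few lines. This is a sketch only, again assuming vegan's 
built-in 'dune' data in place of PSU.sp; 'checks' is a hypothetical name.

```r
library(vegan)

data(dune)                     # stand-in for PSU.sp

checks <- c(
  missing          = sum(is.na(dune)),          # blank (NA) cells
  empty_rows       = sum(rowSums(dune) == 0),   # sites with no species
  empty_cols       = sum(colSums(dune) == 0),   # species never recorded
  duplicated_sites = sum(duplicated(dune))      # repeats of an earlier site
)
print(checks)

# Smallest and largest value in the community matrix
print(range(as.matrix(dune)))
```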

Thank you in advance for your time and consideration!

Ewan

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology