Re: [R] question about capscale (vegan)

2006-11-27 Thread Alicia Amadoz
Hi Gavin,

I have been analyzing real data (sorry but I am not allowed to post
these data here) and what I got was this,

mydistmat_f.cap <- capscale(distmat_f ~ F + L + F:L, mfactors_frame)

Warning messages:
1: some of the first 30 eigenvalues are < 0 in: cmdscale(X, k = k, eig =
TRUE, add = add)
2: NaNs produced in: sqrt(ev)

> mydistmat_f.cap

Call:
capscale(formula = distmat_f ~ F + L + F:L, data = mfactors_frame)

              Inertia Rank
Total          0.3758
Constrained    0.2110    4
Unconstrained  0.1648    4
Inertia is squared  distance
Some constraints were aliased because they were collinear (redundant)

Eigenvalues for constrained axes:
     CAP1      CAP2      CAP3      CAP4
1.679e-01 2.954e-02 1.349e-02 1.233e-05

Eigenvalues for unconstrained axes:
     MDS1      MDS2      MDS3      MDS4
1.388e-01 2.601e-02 4.076e-05 2.064e-07

So, by these results I can tell that there are 4 axes that explain
0.1648 of the total variance and another 4 axes that explain 0.2110 of
the total variance. But I don't understand the difference between
constrained and unconstrained.

> anova(mydistmat_f.cap)

Permutation test for capscale under direct model

Model: capscale(formula = distmat_f ~ F + L + F:L, data = mfactors_frame)
         Df  Var      F N.Perm  Pr(F)
Model     4 0.21 1.2798 400.00 0.0875 .
Residual  4 0.16
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> summary(anova(mydistmat_f.cap))
       Df          Var               F             N.Perm        Pr(F)
 Min.   :4   Min.   :0.1648   Min.   :1.280   Min.   :200   Min.   :0.12
 1st Qu.:4   1st Qu.:0.1764   1st Qu.:1.280   1st Qu.:200   1st Qu.:0.12
 Median :4   Median :0.1879   Median :1.280   Median :200   Median :0.12
 Mean   :4   Mean   :0.1879   Mean   :1.280   Mean   :200   Mean   :0.12
 3rd Qu.:4   3rd Qu.:0.1994   3rd Qu.:1.280   3rd Qu.:200   3rd Qu.:0.12
 Max.   :4   Max.   :0.2110   Max.   :1.280   Max.   :200   Max.   :0.12
                              NA's   :1.000   NA's   :  1   NA's   :1.00

I would also like to see the sums of squares from the anova, to compare
them with another analysis that we performed, but I can't find them in
the anova output. I am also wondering whether there is any way to
separate the main effects and the interaction in this anova. I would
be very grateful if you could help me to understand these results.

Thank you very much,
Alicia

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] question about capscale (vegan)

2006-11-27 Thread Gavin Simpson
On Mon, 2006-11-27 at 15:37 +0100, Alicia Amadoz wrote:
> Hi Gavin,
>
> I have been analyzing real data (sorry but I am not allowed to post
> these data here) and what I got was this,
>
> mydistmat_f.cap <- capscale(distmat_f ~ F + L + F:L, mfactors_frame)

I believe you can write that formula as: distmat_f ~ F * L
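
A quick way to check this in base R, as a sketch: compare the term labels that each formula expands to. The response name `y` is only a placeholder here, not an object from this thread.

```r
# Compare how R expands the two formulas; `y` is a placeholder response
# and is never evaluated by terms().
long  <- attr(terms(y ~ F + L + F:L), "term.labels")
short <- attr(terms(y ~ F * L), "term.labels")

print(long)
print(short)

# TRUE when both formulas describe the same set of model terms
same_terms <- identical(long, short)
print(same_terms)
```

If the two sets of labels match, the shorter `F * L` form should fit the same model.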

 
> Warning messages:
> 1: some of the first 30 eigenvalues are < 0 in: cmdscale(X, k = k, eig =
> TRUE, add = add)
> 2: NaNs produced in: sqrt(ev)

Sorry, I don't know enough about this method to say whether this is a
problem or not; you should read up on the method some more to decide
whether the first warning matters. IIRC, negative eigenvalues are to be
expected with this method and are handled explicitly by capscale. As
the warning comes from cmdscale(), I suspect it is just that function
being helpful, and not something you need to worry about when it is
called from capscale().
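
To see where such eigenvalues can come from, here is a minimal base R sketch. The 4 x 4 dissimilarity matrix is made up for illustration (it is not Alicia's data): three points are 2 apart from each other, and a fourth is only 1 away from each of them, which cannot be drawn exactly in Euclidean space.

```r
# Made-up dissimilarities that satisfy the triangle inequality
# but have no exact Euclidean representation.
d <- matrix(c(0, 1, 1, 1,
              1, 0, 2, 2,
              1, 2, 0, 2,
              1, 2, 2, 0), nrow = 4, byrow = TRUE)

# Metric scaling (principal coordinates); eig = TRUE also returns eigenvalues
fit <- cmdscale(as.dist(d), k = 2, eig = TRUE)
eig <- fit$eig
print(round(eig, 4))

# The smallest eigenvalue is negative, so its square root is not a number,
# which is the kind of thing the sqrt(ev) warning is about.
min_eig <- min(eig)
root <- suppressWarnings(sqrt(min_eig))
print(is.nan(root))
```

Axes with negative eigenvalues have no real coordinates, which is why a method built on cmdscale() has to deal with them one way or another.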

 
> > mydistmat_f.cap
>
> Call:
> capscale(formula = distmat_f ~ F + L + F:L, data = mfactors_frame)
>
>               Inertia Rank
> Total          0.3758
> Constrained    0.2110    4
> Unconstrained  0.1648    4
> Inertia is squared  distance
> Some constraints were aliased because they were collinear (redundant)
>
> Eigenvalues for constrained axes:
>      CAP1      CAP2      CAP3      CAP4
> 1.679e-01 2.954e-02 1.349e-02 1.233e-05
>
> Eigenvalues for unconstrained axes:
>      MDS1      MDS2      MDS3      MDS4
> 1.388e-01 2.601e-02 4.076e-05 2.064e-07
>
> So, by these results I can tell that there are 4 axes that explain
> 0.1648 of the total variance and another 4 axes that explain 0.2110 of
> the total variance. But I don't understand the difference between
> constrained and unconstrained.

The constrained axes are linear combinations of your explanatory
variables (F, L and F:L), so this is the part of your genomic data that
is explained by those factors. The unconstrained part is the remaining
variance that is not explained; its axes are MDS (principal coordinates)
axes.

So you can explain c. 56% of the variance in your genomic data with F,
L, and F:L.
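
The 56% is just the constrained inertia divided by the total inertia. As a small sketch, using the rounded values printed in the output above (the variable names are mine):

```r
# Inertia values copied (rounded) from the printed capscale output
total         <- 0.3758
constrained   <- 0.2110
unconstrained <- 0.1648

# Share of the total inertia explained by the constraints, and the remainder
prop_constrained   <- constrained / total
prop_unconstrained <- unconstrained / total

print(round(c(constrained = prop_constrained,
              unconstrained = prop_unconstrained), 3))
```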

Note the warning about aliased constraints - this means that at least
one term in the model (including interactions) is completely correlated
with another variable (or a combination of variables?) and is therefore
redundant.

Type alias(mydistmat_f.cap) to see which coefficients are aliased
and ?alias to see what this means.

 
> > anova(mydistmat_f.cap)
>
> Permutation test for capscale under direct model
>
> Model: capscale(formula = distmat_f ~ F + L + F:L, data = mfactors_frame)
>          Df  Var      F N.Perm  Pr(F)
> Model     4 0.21 1.2798 400.00 0.0875 .
> Residual  4 0.16
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> > summary(anova(mydistmat_f.cap))
>        Df          Var               F             N.Perm        Pr(F)
>  Min.   :4   Min.   :0.1648   Min.   :1.280   Min.   :200   Min.   :0.12
>  1st Qu.:4   1st Qu.:0.1764   1st Qu.:1.280   1st Qu.:200   1st Qu.:0.12
>  Median :4   Median :0.1879   Median :1.280   Median :200   Median :0.12
>  Mean   :4   Mean   :0.1879   Mean   :1.280   Mean   :200   Mean   :0.12
>  3rd Qu.:4   3rd Qu.:0.1994   3rd Qu.:1.280   3rd Qu.:200   3rd Qu.:0.12
>  Max.   :4   Max.   :0.2110   Max.   :1.280   Max.   :200   Max.   :0.12
>                               NA's   :1.000   NA's   :  1   NA's   :1.00
>
> I would also like to see the sums of squares from the anova, to compare
> them with another analysis that we performed, but I can't find them in
> the anova output. I am also wondering whether there is any way to
> separate the main effects and the interaction in this anova. I would
> be very grateful if you could help me to understand these results.

There isn't a summary method for anova.cca, and anyway, this anova isn't
working on sums of squares, but on other measures of variance. It is a
permutation test, and simply works out with brute force how likely you
are to have a model explaining 56% of the total variance given your
sample size and model complexity, under a null/random model.
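
For what it is worth, the F in the table can be reproduced approximately from the printed inertias. The sketch below assumes the statistic is the usual pseudo-F (constrained inertia per model df over unconstrained inertia per residual df), which is my understanding of what the permutation test uses. The inputs are the rounded values from the output above, so the match is only approximate.

```r
# Rounded values from the printed capscale and anova output
constrained   <- 0.2110
unconstrained <- 0.1648
df_model      <- 4
df_resid      <- 4

# Pseudo-F: explained inertia per df over residual inertia per df (assumed form)
pseudo_F <- (constrained / df_model) / (unconstrained / df_resid)
print(round(pseudo_F, 3))
```

The permutation test then asks how often a value at least this large turns up when the data are shuffled, rather than comparing it with a tabulated F distribution.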

It sounds like you haven't grasped fully the fundamentals of the methods
you are employing, and I would strongly advise you to do some more
reading up on these methods. I can, at best, only guide you as I am not
that familiar with the technique myself.

A good start would be the refs in ?capscale and then search for papers
that cite Anderson & Willis and that use the methodology.

 
> Thank you very much,
> Alicia

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC & ENSIS, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


Re: [R] question about capscale (vegan)

2006-11-17 Thread Alicia Amadoz
Hello,

Thank you for your help. I have tried to perform the analysis I wanted
with data of example, I mean not real data because I can't provide it
here. So, what I have tried is this,

> matrix
     [,1] [,2] [,3]
[1,] 0.00 0.13 0.59
[2,] 0.13 0.00 0.55
[3,] 0.59 0.55 0.00

> dist_mat
     1    2
2 0.13
3 0.59 0.55

# Here, the distance matrix is calculated from the percentage of differing
# nucleic acids between two sequences; R was not used to compute it. The
# original data would look like this:

    n1  n2  n3  n4  n5  n6  n7  n8  n9  n10 n11 n12
m1  A   C   G   T   A   G   C   T   A   C   T   A
m2  G   C   T   A   T   G   C   T   A   C   T   A
m3  G   A   G   T   A   G   C   T   A   C   T   A

> factors_frame
  time region    city
1 2006 europe  london
2 2005 africa nairobi
3 2005 europe   paris

> my.cap <- capscale(dist_mat ~ time + region + time:region +
region:city + time:region:city, factors_frame)

> my.cap

Call:
capscale(formula = dist_mat ~ time + region + time:region + region:city
+  time:region:city, data = factors_frame)

            Inertia Rank
Total         0.445
Constrained   0.445    2
Inertia is squared  distance
Some constraints were aliased because they were collinear (redundant)

Eigenvalues for constrained axes:
   CAP1    CAP2
0.42978 0.01522

> anova(my.cap)
Error in `names<-.default`(`*tmp*`, value = "Residual") :
attempt to set an attribute on NULL

I am still unsure about the 'comm' argument: I don't understand how
important it is for my type of data, or what it would refer to in my
data. Also, what I am really interested in is a factorial anova with
another nested factor (the model given above), and as you can see R
gives an error that I don't understand either.

Thank you for your help in advance. 
Regards,
Alicia



Re: [R] question about capscale (vegan)

2006-11-17 Thread Gavin Simpson
On Fri, 2006-11-17 at 12:18 +0100, Alicia Amadoz wrote:
> Hello,
>
> Thank you for your help. I have tried to perform the analysis I wanted
> with data of example, I mean not real data because I can't provide it
> here. So, what I have tried is this,

Hi Alicia,

It would have been more helpful if you'd included the actual commands to
generate each object, but thanks for including an example.

dat <- matrix(c(0.00,0.13,0.59,0.13,0.00,0.55,0.59,0.55,0.00), ncol = 3)
dist.mat <- as.dist(dat)
dist.mat
     1    2
2 0.13
3 0.59 0.55
time <- as.factor(c(2006, 2005, 2005))
region <- as.factor(c("europe", "africa", "europe"))
city <- as.factor(c("london", "nairobi", "paris"))
factors.frame <- data.frame(time, region, city)

my.cap <- capscale(dist.mat ~ time + region + time:region +
                   region:city + time:region:city, factors.frame)

my.cap

So, stop here. Look at the output. You can extract 2 constrained axes
that explain 100% of the variance in your data. This causes my.cap$CA to
be NULL, which is why when you do:

anova(my.cap)

You get this error message:

Error in `names<-.default`(`*tmp*`, value = "Residual") :
attempt to set an attribute on NULL

The error has nothing to do with providing comm or not (I think) as I
don't see how this would alter my.cap$CA, and anyway, comm is used to
generate species scores and if you look at summary(my.cap) you will
see that you have species scores (though their meaning may be hard to
understand if no comm provided - see ?capscale)

I hesitate to call this a bug in capscale() or permutest.cca() (this is
where the error comes from by the way:

> traceback()
5: `names<-.default`(`*tmp*`, value = "Residual")
4: `names<-`(`*tmp*`, value = "Residual")
3: permutest.cca(object, step, ...)
2: anova.cca(my.cap)
1: anova(my.cap)

), but anova.cca doesn't seem to handle situations where there isn't an
unconstrained component. I've CC'd Jari Oksanen, the author of vegan, to
ensure he sees this.

This error is related to the specific dummy problem you sent - do you
get this error when you run the analysis on your full data set? If so,
you might want to consider removing some constraints as your model isn't
really constrained anymore. As the number of constraints approaches the
number of sites, the constraint on the ordination drops away and you are
back to a Principal Coordinates Analysis (IIRC) of your dissimilarity
matrix.
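
One way to see this for the toy example, sketched in base R only: build the design matrix for the same right-hand side and compare its rank with the number of samples. The names `X`, `n_sites` and `rank_X` are just for illustration.

```r
# The three-sample toy factors from the example above
time   <- as.factor(c(2006, 2005, 2005))
region <- as.factor(c("europe", "africa", "europe"))
city   <- as.factor(c("london", "nairobi", "paris"))
factors.frame <- data.frame(time, region, city)

# Design matrix for the constraints (right-hand side of the capscale formula)
X <- model.matrix(~ time + region + time:region + region:city +
                    time:region:city, data = factors.frame)

n_sites <- nrow(X)       # number of samples
n_cols  <- ncol(X)       # number of dummy columns generated
rank_X  <- qr(X)$rank    # number of linearly independent columns

cat("sites:", n_sites, " columns:", n_cols, " rank:", rank_X, "\n")

# When the rank (which includes the intercept) equals the number of samples,
# the constraints can fit the samples exactly and nothing is left over.
saturated <- rank_X == n_sites
print(saturated)
```

With three samples, the rank can be at most three, so after the intercept only two independent constraints remain. That would account for both the aliased terms and the two constrained axes with no unconstrained part.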

> > anova(my.cap)
> Error in `names<-.default`(`*tmp*`, value = "Residual") :
> attempt to set an attribute on NULL
>
> I am still unsure about the 'comm' argument: I don't understand how
> important it is for my type of data, or what it would refer to in my
> data. Also, what I am really interested in is a factorial anova with
> another nested factor (the model given above), and as you can see R
> gives an error that I don't understand either.

As for your original data - by the looks of it, you wouldn't be able to
use that as the argument to comm. It would need to be numeric and
recoded etc. before you could use it, and how to do that in the best way
I'm not sure.

But in this instance, if you are interested in the samples and how they
relate to one another, constrained by your factors_frame, then you don't
need comm and you can proceed without it, and not bother displaying
species scores.

If you are interested in how the samples relate to one another and how
the nucleic acids relate to one another and the samples, constrained by
your factors_frame, then you will need to recode that example matrix
into something numeric, and even then it may not be possible with the
way capscale is written.

Hope this helps,

G

 

Re: [R] question about capscale (vegan)

2006-11-17 Thread Alicia Amadoz
Hello Gavin,

Thank you very much for your help. I'm sorry I forgot to include all
commands that I used but next time I will try to write all of them. I
will try with my real data and see how it goes. I think I finally have
understood how capscale works with this kind of data. Thank you.

Regards,
Alicia


Re: [R] question about capscale (vegan)

2006-11-17 Thread Jari Oksanen
On Fri, 2006-11-17 at 12:26 +, Gavin Simpson wrote:
> I hesitate to call this a bug in capscale() or permutest.cca() [...],
> but anova.cca doesn't seem to handle situations where there isn't an
> unconstrained component. I've CC'd Jari Oksanen, the author of vegan, to
> ensure he sees this.

Dear y'all,

I agree with this analysis: you have no residual (unconstrained)
variation and this means that you cannot have a significance test. I
have always known this, but I haven't cared about this issue: you ask
for an impossible analysis and get an error message. The only thing that
could be called a bug is the text of the error message, and I may
change that. Even then you still cannot perform anova when there is no
residual variation; only the error message would change.

You have two roads to go if you still want to have an analysis like
this:
1. As Gavin suggested, just reduce the number of constraints so that
your model has an unconstrained component, and you will be able to run
the tests.
2. Perform an unconstrained analysis (cmdscale, prcomp, princomp, or rda
in this case), fit the environmental variables to this solution and
analyse the significance of the fitted vectors. All of this is doable
using the envfit() function in vegan.
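
A rough sketch of the second road, assuming vegan is installed. It uses the varespec/varechem example data that ship with vegan rather than the three-sample toy problem (which is too small for any test), and the choice of N, P and K is only for illustration.

```r
library(vegan)

# Example data from vegan: 24 sites, species cover and soil chemistry
data(varespec)
data(varechem)

# Unconstrained analysis: principal coordinates of Bray-Curtis dissimilarities
dist.mat <- vegdist(varespec, method = "bray")
pcoa <- cmdscale(dist.mat, k = 2)

# Fit a few environmental variables onto the ordination and test them
# by permutation; the seed only makes the permutations repeatable.
set.seed(1)
fit <- envfit(pcoa, varechem[, c("N", "P", "K")], permutations = 999)
fit
```

The printed result should list each fitted vector with its direction, squared correlation and a permutation-based significance. Note that this asks a different question from a constrained ordination, since the axes are found before the variables are looked at.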

Cheers, jari oksanen


[R] question about capscale (vegan)

2006-11-16 Thread Alicia Amadoz
Hello,

I am interested in using the capscale function of the vegan package for
R. I already have a dissimilarity matrix and I intend to use it as the
'distance' argument. But then I don't know what kind of data should go
in the 'comm' argument. I don't understand what type of data is meant by
'species scores' and 'community data frame', since my data are
nucleic distances between different sequences.

I would be very grateful if you could help me with this in any way.
Thank you in advance for your help.

Regards,
Alicia



Re: [R] question about capscale (vegan)

2006-11-16 Thread Sarah Goslee
Hi Alicia,


On 11/16/06, Alicia Amadoz wrote:

> 'comm' argument. I don't understand what type of data is meant by
> 'species scores' and 'community data frame', since my data are
> nucleic distances between different sequences.

comm would be the original data from which you calculated the
dissimilarity matrix, so that scores can be calculated for the
individual variables. These analyses were designed for use
with vegetation data in the form of a matrix with sites as rows
and species as columns, and containing some measure of
abundance for each species at each site.

If you don't have an original data frame, that is, your data come only
in the form of distances, you will need a different implementation
of constrained ordination. Alternately, you could possibly modify
the function to skip the species scores step.

Sarah


-- 
Sarah Goslee
http://www.stringpage.com
http://www.astronomicum.com
http://www.functionaldiversity.org



Re: [R] question about capscale (vegan)

2006-11-16 Thread Sarah Goslee
Sorry, one additional note:

You don't need to specify comm to use capscale. Ignore what I said about
modifying the function.

Sarah
-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] question about capscale (vegan)

2006-11-16 Thread Gavin Simpson
On Thu, 2006-11-16 at 17:25 +0100, Alicia Amadoz wrote:
> Hello,
>
> I am interested in using the capscale function of the vegan package for
> R. I already have a dissimilarity matrix and I intend to use it as the
> 'distance' argument. But then I don't know what kind of data should go
> in the 'comm' argument. I don't understand what type of data is meant by
> 'species scores' and 'community data frame', since my data are
> nucleic distances between different sequences.

No, that is all wrong. Read ?capscale more closely! It says that you
need to use the formula to describe the model. distance is used to
tell capscale which distance coefficient to use if the LHS of the model
formula is a community matrix.

Argument comm is used to tell capscale where to find the species
matrix that will be used to determine species scores in the analysis,
*if* the LHS of the formula is a distance matrix. comm isn't used if
the LHS is a data frame, and distance is ignored if the LHS is a
distance matrix.

As you don't provide a reproducible example of your problem, I will use
the inbuilt example from ?capscale

## load the package and some data
library(vegan)
data(varespec)
data(varechem)

Now if you want to fit a capscale model using the raw species data, then
you would describe the model as so:

vare.cap <- capscale(varespec ~ N + P + K + Condition(Al),
                     data = varechem,
                     distance = "bray")
vare.cap

In the above, LHS of formula is a data frame so capscale looks to
argument distance for the name of the coefficient to turn it into a
distance matrix. The terms on the RHS of the formula are variables
looked up in the object assigned to the data argument.

Now let's alter this to start with a dissimilarity/distance matrix
instead. The exact complement of the above would be:

dist.mat <- vegdist(varespec, method = "bray")
vare.cap2 <- capscale(dist.mat ~ N + P + K + Condition(Al),
                      data = varechem,
                      comm = varespec)
vare.cap2

To explain the above example: first create the Bray-Curtis distance
matrix (dist.mat), then use this on the LHS of the formula. When
capscale now wants to calculate the species scores of the analysis it
will look to argument comm for the data to use in the calculation, which
in this case we specify as the original species matrix varespec.

As for what species scores are: this is a throwback to the origins
of the package and the methods included - all of this is related to
ecology and mainly vegetation analysis (hence vegan).

For species scores, read variable scores. The distance matrix (however
calculated) describes how similar your individual sites (read samples)
are to one another. You can also display information about the variables
used to determine those distances/similarities, and this is what is
meant by species scores. Whatever you used to generate the distance
matrix, the columns represent the info used to generate the species
scores.

If some of this still isn't clear, email the list with the commands used
to generate your distance matrix in R and I'll have a go at explaining
this with reference to your data/example.

 
> I would be very grateful if you could help me with this in any way.
> Thank you in advance for your help.
>
> Regards,
> Alicia

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC & ENSIS, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] question about capscale (vegan)

2006-11-16 Thread Sarah Goslee
Nice catch, Gavin - I missed that part of the original post. The
nucleic distances need to be included as the left-hand-side of
the formula, not as the distance argument.

comm is still optional, though, but it's not a good idea to omit
it if there's any way you can provide the original data. From the
help:
  If this
  is not supplied, the ``species scores'' are the axes of
  initial metric scaling ('cmdscale') and may be confusing.

I don't know if it's true in this case, but there are applications
where there is no data matrix - the distances themselves are the
original data. I don't know offhand of any other constrained ordination
functions in R that will easily accommodate a precalculated distance
matrix, but I expect there are some somewhere.

My usual approach is to use metric or nonmetric multidimensional
scaling plus vector fitting, but the assumptions behind that are
different than those underlying a constrained ordination.

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org
