Re: [R] Abundance data ordination in R

2007-04-02 Thread Gavin Simpson
On Sun, 2007-04-01 at 09:20 -0700, Milton Cezar Ribeiro wrote:
> Dear R-gurus
>
> I have a data.frame with abundance data for species and sites, which
> looks like:
> mydf <- data.frame(
>   sp1 = sample(0:10, 5, replace = TRUE),
>   sp2 = sample(0:20, 5, replace = TRUE),
>   sp3 = sample(0:4, 5, replace = TRUE),
>   sp4 = sample(0:2, 5, replace = TRUE))
> rownames(mydf) <- paste("sites", 1:5, sep = "")
>
> I would like to run an ordination analysis on these data, and my worry
> is about the zeros (absences of species) in the matrix. As far as I
> have read (Gotelli, A Primer of Ecological Statistics, 2004), with
> abundance data I can't compute Euclidean distances, because the zeros
> mean absence of the species rather than a count of zero. Gotelli
> suggests principal coordinates analysis instead. I would like to hear
> what you think about this, and which packages and functions are best
> for computing my distance matrices and doing the ordination. Can I
> treat zeros as NA in my data.frame? Is there a good PDF book about
> multivariate analysis for abundance data available on the web?

In addition to the other suggestions, there is a Task View on CRAN for
the topic of Environmetrics. This has a section describing the various
ordination techniques available in R as well as functions to calculate
distance/dissimilarity matrices:

http://cran.r-project.org/src/contrib/Views/Environmetrics.html

G

 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                 [t] +44 (0)20 7679 0522
ECRC                          [f] +44 (0)20 7679 0565
UCL Department of Geography
Pearson Building              [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street
London, UK                    [w] http://www.ucl.ac.uk/~ucfagls/
WC1E 6BT                      [w] http://www.freshwaters.org.uk/
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Abundance data ordination in R

2007-04-02 Thread Philippe Grosjean


Here are a few more suggestions:

1) Use a distance that does not treat double zeros (a species absent
from both sites) as information. The Bray-Curtis distance is one of the
most commonly used in such a case.

2) Possibly transform your data first, depending on the relative
importance you want to give to rare species (typically, a log or a
double square root transformation increases the importance of rare
species relative to abundant ones).

3) One approach is to use multidimensional scaling (see the MASS
package) on the distance matrix to produce the ordination in two or
three dimensions. See Venables & Ripley's MASS book for details.

4) Another alternative is correspondence analysis, which uses the
chi-square distance and is suited to abundances (it is designed to
analyze contingency tables, but a table of abundances, stations versus
species, can be regarded as a two-way contingency table).
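
Suggestions 1 to 3 can be sketched in a few lines of base R. This is only an illustration, not code from the thread: the abundance table `abund` is made up, `bray_curtis()` is a hypothetical helper written here so that the example needs no extra packages (vegan's `vegdist()` offers the same index), and the metric scaling is done with `cmdscale()` from the stats package rather than with MASS.

```r
# Made-up abundance table: 5 sites (rows) x 4 species (columns).
abund <- matrix(c(10,  0, 3, 0,
                   8,  2, 0, 1,
                   0, 15, 4, 0,
                   1, 12, 0, 2,
                   5,  5, 2, 1),
                nrow = 5, byrow = TRUE,
                dimnames = list(paste("site", 1:5, sep = ""),
                                paste("sp", 1:4, sep = "")))

# Suggestion 2: a double square root (fourth root) transformation
# reduces the weight of the most abundant species.
abund_t <- sqrt(sqrt(abund))

# Suggestion 1: Bray-Curtis dissimilarity between every pair of sites,
# sum(|x_i - x_j|) / sum(x_i + x_j). Hypothetical helper.
bray_curtis <- function(x) {
  n <- nrow(x)
  d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
    }
  }
  as.dist(d)
}
d_bc <- bray_curtis(abund_t)

# Suggestion 3: metric multidimensional scaling in two dimensions.
ord <- cmdscale(d_bc, k = 2)
print(round(ord, 3))
```

The rows of `ord` are the site coordinates, so `plot(ord)` followed by `text(ord, labels = rownames(ord))` would draw the ordination diagram.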

Best,

Philippe Grosjean



Re: [R] Abundance data ordination in R

2007-04-02 Thread Jari Oksanen
Other people have already suggested what to do with these data and where
to find PDF texts, so I will only comment on some points raised in the
original question. Firstly, Euclidean distance is quite OK with zeros, or
at least as good as any other normal dissimilarity index is with zeros.
Euclidean distance on non-transformed data is poor for other reasons: it
takes squared differences, which emphasizes large abundances, and even
when two sites have nothing in common, the Euclidean distance varies with
their total abundances. Using principal coordinates analysis does not
change this, since it can also be run with Euclidean distances. However,
there are many packages that provide better dissimilarity indices, or
transformations that make Euclidean distances more useful (such as the
Hellinger transformation).
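
The two points about Euclidean distance can be checked with a small made-up example. The data and the helper `hellinger()` below are illustrative assumptions, not part of the original post; vegan's `decostand()` provides the same transformation.

```r
# Three made-up sites. A_small and B share no species; A_large has the
# same composition as A_small, but ten times the abundance.
x <- rbind(A_small = c(sp1 = 2,  sp2 = 1,  sp3 = 0, sp4 = 0),
           A_large = c(sp1 = 20, sp2 = 10, sp3 = 0, sp4 = 0),
           B       = c(sp1 = 0,  sp2 = 0,  sp3 = 3, sp4 = 1))

# Raw Euclidean distance: the distance from B depends on the total count
# of the other site, although in both cases nothing is shared.
d_raw <- as.matrix(dist(x))
print(d_raw["B", c("A_small", "A_large")])

# Hellinger transformation: square root of the relative abundances.
# (Illustrative helper; each row is divided by its own total.)
hellinger <- function(x) sqrt(x / rowSums(x))

# After the transformation, sites with no shared species are always
# sqrt(2) apart, and sites with identical composition are 0 apart.
d_hel <- as.matrix(dist(hellinger(x)))
print(round(d_hel, 3))
```

The distance between sites with nothing in common is fixed at sqrt(2) because every transformed row has unit length, so the result no longer depends on how many individuals were counted.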

Another question is more abstract: indeed, you may regard most zeros as
missing data. The species probably could occur at your sample site, more
or less, but was too scarce to be observed. How to do this in practice is
the tricky issue. You cannot simply change zeros to NA, since then the
dissimilarities (if they don't fail) will give a special significance to
these cells. Regarding them as zeros certainly makes more sense than
removing *pairs* of data where a species is NA in one site and present in
another. There are ways of handling zeros as something like missing
values of various degrees(!), but my decency prohibits me from writing
about these methods.
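
What goes wrong when zeros are simply replaced by NA can be seen with base R's `dist()`, which drops any species that is NA in either site of a pair and rescales the remaining sum. The two sites below are made up for illustration.

```r
# Two made-up sites that differ in sp1 and sp2 and agree only on sp3.
x <- rbind(site1 = c(sp1 = 5, sp2 = 0, sp3 = 3),
           site2 = c(sp1 = 0, sp2 = 4, sp3 = 3))

# Zeros kept as zeros: the sites are clearly different.
d_zero <- dist(x)

# Zeros recoded as NA: sp1 and sp2 are dropped from the comparison,
# so only sp3 is left and the two sites appear identical.
x_na <- x
x_na[x_na == 0] <- NA
d_na <- dist(x_na)

print(c(zeros = as.numeric(d_zero), na = as.numeric(d_na)))
```

With zeros the distance is sqrt(41); with NA it collapses to 0, which is exactly the kind of special significance given to those cells.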

cheers, jari oksanen



[R] Abundance data ordination in R

2007-04-01 Thread Milton Cezar Ribeiro
Dear R-gurus

I have a data.frame with abundance data for species and sites, which looks like:
mydf <- data.frame(
  sp1 = sample(0:10, 5, replace = TRUE),
  sp2 = sample(0:20, 5, replace = TRUE),
  sp3 = sample(0:4, 5, replace = TRUE),
  sp4 = sample(0:2, 5, replace = TRUE))
rownames(mydf) <- paste("sites", 1:5, sep = "")

I would like to run an ordination analysis on these data, and my worry is about
the zeros (absences of species) in the matrix. As far as I have read (Gotelli,
A Primer of Ecological Statistics, 2004), with abundance data I can't compute
Euclidean distances, because the zeros mean absence of the species rather than
a count of zero. Gotelli suggests principal coordinates analysis instead. I
would like to hear what you think about this, and which packages and functions
are best for computing my distance matrices and doing the ordination. Can I
treat zeros as NA in my data.frame? Is there a good PDF book about multivariate
analysis for abundance data available on the web?

Kind regards

Miltinho
Brazil



Re: [R] Abundance data ordination in R

2007-04-01 Thread Sarah Goslee
Hi,

There's a very good ordination web page by Mike Palmer aimed at
ecologists (and since you have a species x site matrix, I'm assuming
that describes you) at http://ordination.okstate.edu/

My recommendation is generally nonmetric multidimensional scaling
(principal coordinates analysis is a metric scaling ordination), with a
dissimilarity metric that doesn't consider joint absences, for example
Bray-Curtis/Sorensen. Treating absent species as missing data is
not a good idea, because while it may not be possible to say
that they are truly missing from that site (depending on taxa and
sampling methods), you at least know they aren't common at that
site. Ecological data are messy enough without discarding information!
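
A minimal sketch of this recommendation, assuming the MASS package (distributed with R) for the non-metric scaling. The abundance table is made up, and the Bray-Curtis computation is written out by hand only to keep the example free of extra packages.

```r
library(MASS)  # for isoMDS()

# Made-up abundance table: 6 sites (rows) x 4 species (columns).
abund <- matrix(c(10,  0, 3, 0,
                   8,  2, 0, 1,
                   0, 15, 4, 0,
                   1, 12, 0, 2,
                   5,  5, 2, 1,
                   0,  1, 6, 3),
                nrow = 6, byrow = TRUE,
                dimnames = list(paste("site", 1:6, sep = ""),
                                paste("sp", 1:4, sep = "")))

# Bray-Curtis dissimilarity, written out by hand.
# A species absent from both sites adds nothing to either sum.
n <- nrow(abund)
bc <- matrix(0, n, n, dimnames = list(rownames(abund), rownames(abund)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    bc[i, j] <- sum(abs(abund[i, ] - abund[j, ])) /
      sum(abund[i, ] + abund[j, ])
  }
}
d_bc <- as.dist(bc)

# Non-metric multidimensional scaling in two dimensions.
# By default isoMDS() starts from the metric (cmdscale) solution.
fit <- isoMDS(d_bc, k = 2, trace = FALSE)
print(round(fit$points, 3))  # site coordinates
print(fit$stress)            # final stress, in percent
```

Note that `isoMDS()` stops with an error if two different sites have a dissimilarity of exactly zero, so identical rows would need to be merged or removed first.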

There are several R packages that may be helpful, including
ecodist and vegan.

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] Abundance data ordination in R

2007-04-01 Thread José Rafael Ferrer Paris
There are really many ways to do this. For example, if you use
constrained (canonical) correspondence analysis, the distance measure
between sites is chi-square, and absences are not informative to the
analysis. Or you can use an ecological distance measure (similarity
indices such as Soerensen, Bray-Curtis, Jaccard and others) and perform
principal coordinates analysis (= metric multidimensional scaling), etc.
Read the documentation and tutorials for the packages vegan, ade4 and
labdsv.
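
The link between correspondence analysis and the chi-square distance can be sketched in base R through a singular value decomposition. This is an unconstrained correspondence analysis on a made-up table, written out for illustration only; it is not how any of the packages above implement it, and they supply ready-made functions for real work.

```r
# Made-up abundance table: 5 sites (rows) x 4 species (columns).
abund <- matrix(c(10,  0, 3, 0,
                   8,  2, 0, 1,
                   0, 15, 4, 0,
                   1, 12, 0, 2,
                   5,  5, 2, 1),
                nrow = 5, byrow = TRUE,
                dimnames = list(paste("site", 1:5, sep = ""),
                                paste("sp", 1:4, sep = "")))

P  <- abund / sum(abund)  # relative frequencies
rw <- rowSums(P)          # site (row) weights
cw <- colSums(P)          # species (column) weights

# Standardised residuals from the independence model rw %o% cw.
S <- (P - rw %o% cw) / sqrt(rw %o% cw)

# The SVD of the residuals gives the ordination axes.
sv <- svd(S)
site_scores <- (sv$u / sqrt(rw)) %*% diag(sv$d)  # principal coordinates
rownames(site_scores) <- rownames(abund)

# Total inertia equals the chi-square statistic over the grand total.
inertia  <- sum(sv$d^2)
expected <- sum(abund) * (rw %o% cw)
chi2     <- sum((abund - expected)^2 / expected)
print(c(inertia = inertia, chi2_over_n = chi2 / sum(abund)))

# Euclidean distances between the site scores reproduce the
# chi-square distances between the site profiles.
profiles <- P / rw
chi_d <- dist(sweep(profiles, 2, sqrt(cw), "/"))
print(all.equal(as.numeric(dist(site_scores)), as.numeric(chi_d)))

print(round(site_scores[, 1:2], 3))  # first two axes
```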

You might start your search at Jari Oksanen's page:
http://cc.oulu.fi/~jarioksa/softhelp/vegan.html
or Dave Roberts's:
http://ecology.msu.montana.edu/labdsv/R/
The vegan tutorial was useful for me in learning to use vegan:
http://cc.oulu.fi/~jarioksa/opetus/metodi/vegantutor.pdf
If you need more in-depth mathematical details, take a look at
Daniel Chessel's site:
http://pbil.univ-lyon1.fr/R/perso/pagechessel.html
There are plenty of PDFs available for download (some are suited to
beginners, others require more background knowledge). Be warned: there
is a large variety of techniques for multivariate analysis, with
different properties and weaknesses, and sometimes the most popular
analyses are not the most appropriate. Be sure of what you want and what
you are doing before you perform the analysis, because the
interpretation will depend on the techniques applied.

I personally find that ade4 implements many different techniques but is
poorly documented, and some of its features are somewhat hidden, while
vegan provides more information about its functions and is perfect for
getting started. I haven't used labdsv yet.

Hope this helps

JR
