[R] indicator value in labdsv

2005-09-19 Thread astrzelczak


Hi,

I'm trying to find out what threshold of indicator value in labdsv should be
used to accept a species as an indicator species. So far I have assumed that
indval=0.5 is high enough to avoid any mistakes, but that was based only on my
intuition.

I'd be grateful for any advice.

best regards

Agnieszka

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] indicator value in labdsv

2005-09-19 Thread Dave Roberts
Agnieszka,

 As Jari indicated, it depends on which function you meant in your
inquiry.  The duleg() function implements the Dufrene-Legendre
algorithm, where indicator species are indicative of a priori
communities.  It thus requires a classification, and is biased toward
finding species which occur in the dataset approximately as often as the
mean cluster size.
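
A minimal base-R sketch of the Dufrene-Legendre calculation described above, for readers who want to see the arithmetic. It is not the labdsv implementation: the function name `indval_sketch`, the toy matrix `veg`, and the grouping vector `clusters` are all hypothetical. The indicator value of a species for a group is taken as specificity (the group's share of the summed group means) times fidelity (the fraction of the group's sites where the species occurs).

```r
# Hypothetical toy data: 6 sites (rows) x 3 species (columns), presence/absence.
veg <- matrix(c(1, 1, 0,
                1, 0, 0,
                1, 1, 0,
                0, 1, 1,
                0, 0, 1,
                0, 1, 1),
              nrow = 6, byrow = TRUE,
              dimnames = list(paste0("site", 1:6), c("spA", "spB", "spC")))
clusters <- c(1, 1, 1, 2, 2, 2)   # assumed a priori classification of the sites

# Illustrative helper (not a labdsv function).
# Assumes every species occurs at least once; otherwise 0/0 gives NaN.
indval_sketch <- function(x, groups) {
  groups <- as.factor(groups)
  # Mean abundance of each species within each group (groups x species).
  mean_abund <- apply(x, 2, function(sp) tapply(sp, groups, mean))
  # Fraction of sites in each group where the species is present.
  rel_freq <- apply(x > 0, 2, function(sp) tapply(sp, groups, mean))
  # Specificity: each group's share of the species' summed group means.
  specificity <- sweep(mean_abund, 2, colSums(mean_abund), "/")
  t(specificity * rel_freq)       # species x groups
}

iv <- indval_sketch(veg, clusters)
print(round(iv, 2))
```

In this toy example spA occurs in exactly the three sites of cluster 1 and reaches the maximum value of 1 there, while spB, spread evenly over both clusters, gets 1/3 in each. That is consistent with the remark that the method favours species whose frequency is close to the cluster size.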

 The indspc() function calculates the mean similarity of all samples
a species occurs in.  This is slightly biased because we know that the
samples being used to calculate the mean share at least the species that
defines them, but it is still possible to compare those values to the
mean similarity of the whole matrix, or to an expectation of maximum
similarity.  Obviously, the more frequently a species occurs, the harder
it is for it to have a really high similarity (indicator value), with the
extreme case that a species that occurs in every sample must have the
same value as the mean of the whole matrix.
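
The idea in the paragraph above can be sketched in base R as follows. This is an illustration only, not the indspc() code: the Jaccard index is an assumed choice of similarity, and `jaccard_sim`, `mean_sim_per_species`, and the toy matrix `veg` are hypothetical names.

```r
# Hypothetical toy data: 5 sites x 4 species, presence/absence.
veg <- matrix(c(1, 1, 0, 1,
                1, 1, 0, 0,
                1, 0, 1, 0,
                1, 0, 1, 1,
                1, 1, 1, 0),
              nrow = 5, byrow = TRUE,
              dimnames = list(paste0("site", 1:5),
                              c("everywhere", "spB", "spC", "spD")))

# Jaccard similarity between all pairs of sites (assumed index).
jaccard_sim <- function(x) {
  x <- (x > 0) * 1                        # numeric 0/1 matrix
  shared <- x %*% t(x)                    # species shared by each pair of sites
  richness <- rowSums(x)
  union <- outer(richness, richness, "+") - shared
  sim <- shared / union
  sim[union == 0] <- 0                    # two empty sites: define similarity as 0
  sim
}

# Mean pairwise similarity among the sites in which each species occurs.
mean_sim_per_species <- function(x) {
  sim <- jaccard_sim(x)
  apply(x > 0, 2, function(present) {
    if (sum(present) < 2) return(NA_real_)   # no pairs to average
    s <- sim[present, present]
    mean(s[upper.tri(s)])
  })
}

sim <- jaccard_sim(veg)
overall <- mean(sim[upper.tri(sim)])      # mean similarity of the whole matrix
vals <- mean_sim_per_species(veg)
print(round(c(overall = overall, vals), 3))
```

The species named `everywhere` occurs in every site, so its value equals the mean similarity of the whole matrix, which is the extreme case mentioned above.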

 To tell the truth, I forgot that indspc() was included in the 
current version of labdsv.  In the new version (due to be released any 
day), I have included a permutation test that estimates quantiles of 
expected values for different numbers of occurrences.  It works, but is 
pretty slow.  Jari has created a version that uses parametric statistics 
to estimate the same envelope, but I haven't had a chance to try it yet.
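
One way such a permutation envelope could be computed is sketched below in base R. This is a guess at the general approach, not the code in the forthcoming labdsv release: for each number of occurrences k, it draws k sites at random many times, records the mean pairwise similarity, and reports quantiles. The names `null_envelope` and `jaccard_sim`, the Jaccard index, the random data, and the number of permutations are all assumptions.

```r
set.seed(1)
# Hypothetical random presence/absence data: 20 sites x 15 species.
veg <- matrix(rbinom(20 * 15, 1, 0.4), nrow = 20)

# Jaccard similarity between all pairs of sites (assumed index).
jaccard_sim <- function(x) {
  x <- (x > 0) * 1
  shared <- x %*% t(x)
  richness <- rowSums(x)
  union <- outer(richness, richness, "+") - shared
  sim <- shared / union
  sim[union == 0] <- 0
  sim
}

# Quantiles of the mean similarity expected for a species occurring in
# k randomly chosen sites, for each k in `occurrences` (each k >= 2).
null_envelope <- function(sim, occurrences, nperm = 199,
                          probs = c(0.05, 0.5, 0.95)) {
  n <- nrow(sim)
  out <- sapply(occurrences, function(k) {
    null_vals <- replicate(nperm, {
      pick <- sample(n, k)                # k sites drawn without replacement
      s <- sim[pick, pick]
      mean(s[upper.tri(s)])
    })
    quantile(null_vals, probs)
  })
  colnames(out) <- paste0("k=", occurrences)
  out
}

sim <- jaccard_sim(veg)
overall <- mean(sim[upper.tri(sim)])
env <- null_envelope(sim, occurrences = c(2, 5, 10, 20))
print(round(env, 3))
```

An observed value for a species with k occurrences would then be compared with the column for that k. With k equal to the number of sites, every draw uses all sites, so the envelope collapses onto the overall mean similarity.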

 What research are you doing, and what are you really trying to 
determine?  Perhaps something altogether different will work better.

Thanks, Dave Roberts

 Jari Oksanen wrote, in reply to the original post of Mon, 2005-09-19 at
 09:41 +0200:

 Agnieszka,
 
 R mailing list software appends the following to your message:
 
 
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 
 
 Then about indicator value analysis. You should be more specific: there
 seem to be three alternative functions for indicator species in
 labdsv. Which did you mean? At least two of these return an item called
 indval, and these two alternative indvals are very different. For
 the Dufrêne-Legendre indvals, you should check the original paper (see
 references in the help page), and there you even have an associated P
 value. In indspc, the variance of the indval clearly is dependent on
 species frequency. Moreover, in indspc the expected indval (and its
 variance) are dependent on the whole set of sites you have: these
 reflect the general homogeneity of your data set. Therefore you cannot
 say that any particular value means that a species is a good
 indicator. However, it would be easy to work out standard errors for
 indspc indvals.
 
 I think it would be more useful to post to some other mailing group
 where people are more concerned about indicator species, or to contact
 the package author directly (I CC this message to him).
 
 cheers, jari oksanen


-- 

David W. Roberts office 406-994-4548
Professor and Head  FAX 406-994-3190
Department of Ecology email [EMAIL PROTECTED]
Montana State University
Bozeman, MT 59717-3460



Re: [R] indicator value in labdsv

2005-09-19 Thread astrzelczak

Hello,

I was unclear before, I'm sorry about that. I forgot to add that I'm using
duleg...

I used mvpart for multivariate regression trees. My input variables are
environmental parameters, output variables are macrophyte species
(presence=1, absence=0 in consecutive cases=lakes). For the classes obtained I
used duleg to find indicator species for every class. I checked the article
Dufrene, M. and Legendre, P. 1997. Species assemblages and indicator species:
the need for a flexible asymmetrical approach. Ecol. Monogr. 67(3):345-366. The
authors used a threshold of indval=0.25 (25%) and that's the only hint I've
found in the literature. This threshold seems reasonable, but I still have the
impression that it's too low...

best regards
Agnieszka


--
Best regards,
  mailto:[EMAIL PROTECTED]

Agnieszka Strzelczak, Research Assistant
mailto:[EMAIL PROTECTED]

Institute of Chemistry and Environmental Protection
Faculty of Chemical Engineering
Szczecin University of Technology
Aleja Piastow 42
71-065 Szczecin
Poland



Re: [R] indicator value in labdsv

2005-09-19 Thread Dave Roberts
Wow!  That was fast!

 Unfortunately, Agnieszka, I don't think you will find an objective 
criterion for this.  Clearly, species which do not have a statistically 
significant value are probably less useful, but of the many that are 
significant, many may be marginal.
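
For reference, the significance of a Dufrene-Legendre indicator value is usually assessed by randomly reallocating sites among the groups. A minimal base-R sketch follows, under stated assumptions: it is not the duleg() code, and `max_indval`, the toy data, and the choice of 999 permutations are all hypothetical.

```r
set.seed(42)
# Hypothetical data: 12 sites in two classes, presence/absence of two species.
groups <- rep(1:2, each = 6)
veg <- cbind(spA = c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0),   # only in class 1
             spB = c(1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1))   # evenly spread

# Largest indicator value of each species over all groups
# (specificity x fidelity, as in the Dufrene-Legendre definition).
max_indval <- function(x, groups) {
  groups <- as.factor(groups)
  mean_abund <- apply(x, 2, function(sp) tapply(sp, groups, mean))
  rel_freq <- apply(x > 0, 2, function(sp) tapply(sp, groups, mean))
  iv <- sweep(mean_abund, 2, colSums(mean_abund), "/") * rel_freq
  apply(iv, 2, max)
}

observed <- max_indval(veg, groups)

# Null distribution: shuffle the class labels and recompute (species x nperm).
nperm <- 999
perm <- replicate(nperm, max_indval(veg, sample(groups)))

# Share of permutations at least as extreme as the observed value;
# the small tolerance guards against floating-point ties.
p_value <- (rowSums(perm >= observed - 1e-12) + 1) / (nperm + 1)
print(rbind(indval = observed, p = p_value))
```

In this toy data spB has a maximum indicator value of exactly 0.25 and a permutation p-value of 1, because an even split across the classes is the least indicative arrangement possible. A fixed cutoff on the indicator value therefore says little without the accompanying test.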

 Without knowing fully what you are hoping to achieve, I think I
would rank the species by indicator value, and establish the highest
threshold for indicator value that gives you a suitable number of
species for each type.  That way, if you are looking to write a field
key, for example, you would, I suspect, have enough indicator species
to identify every type.
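
The ranking approach above could be sketched like this in base R. The matrix `iv` of indicator values, the function `highest_threshold`, and the target of two species per type are hypothetical and chosen only for illustration.

```r
# Hypothetical indicator values (species x types), e.g. from an IndVal analysis.
iv <- matrix(c(0.90, 0.05,
               0.60, 0.10,
               0.30, 0.20,
               0.10, 0.80,
               0.05, 0.45,
               0.02, 0.40),
             ncol = 2, byrow = TRUE,
             dimnames = list(paste0("sp", 1:6), c("type1", "type2")))

# Highest cutoff that still leaves at least `n_wanted` species in every type.
# Returns NA if some type has fewer than `n_wanted` candidate species.
highest_threshold <- function(iv, n_wanted) {
  # Assign each species to the type where its indicator value is largest.
  best_type <- factor(colnames(iv)[apply(iv, 1, which.max)],
                      levels = colnames(iv))
  best_val <- apply(iv, 1, max)
  # Within each type, the n_wanted-th largest value is the highest usable cutoff.
  per_type <- tapply(best_val, best_type, function(v) {
    if (length(v) < n_wanted) return(NA_real_)
    sort(v, decreasing = TRUE)[n_wanted]
  })
  min(per_type)
}

thr <- highest_threshold(iv, n_wanted = 2)

# Species retained at that threshold, listed by type.
best_type <- factor(colnames(iv)[apply(iv, 1, which.max)], levels = colnames(iv))
best_val <- apply(iv, 1, max)
keep <- best_val >= thr
selected <- split(names(best_val)[keep], best_type[keep])
print(thr)
print(selected)
```

Here the threshold comes out at 0.45, which keeps sp1 and sp2 for type1 and sp4 and sp5 for type2. In practice one would probably first drop species whose indicator value is not statistically significant.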

Good luck, Dave


Re: [R] indicator value in labdsv

2005-09-19 Thread astrzelczak

That's what I was afraid of... I'm not a biologist, so I have to involve my
co-workers who are well-versed in biological issues. Thank you very, very much
for your help!!!

best regards
Agnieszka

