[R] indicator value in labdsv
Hi,

I'm trying to find out what threshold of indicator value in labdsv should be used to accept a species as an indicator species. So far I have assumed that indval=0.5 is high enough to avoid any mistakes, but that was based only on my intuition. I'd be grateful for any advice.

best regards
Agnieszka

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] indicator value in labdsv
Agnieszka,

As Jari indicated, it depends on which function you meant in your inquiry.

The duleg() function implements the Dufrene-Legendre algorithm, where indicator species are indicative of a priori communities. It thus requires a classification, and is biased toward finding species which occur in the dataset approximately as often as the mean cluster size.

The indspc() function calculates the mean similarity of all samples a species occurs in. This is slightly biased because we know that the samples being used to calculate the mean share at least the species that defines them, but it is still possible to compare those values to the mean similarity of the whole matrix, or to an expectation of maximum similarity. Obviously, the more frequently a species occurs, the harder it is to have a really high similarity (indicator value), with the extreme case that a species that occurs in every sample must have the same value as the mean of the whole matrix.

To tell the truth, I forgot that indspc() was included in the current version of labdsv. In the new version (due to be released any day), I have included a permutation test that estimates quantiles of expected values for different numbers of occurrences. It works, but is pretty slow. Jari has created a version that uses parametric statistics to estimate the same envelope, but I haven't had a chance to try it yet.

What research are you doing, and what are you really trying to determine? Perhaps something altogether different will work better.

Thanks, Dave Roberts

> On Mon, 2005-09-19 at 09:41 +0200, [EMAIL PROTECTED] wrote:
> > Hi, I'm trying to find out what threshold of indicator value in labdsv should be used to accept a species as an indicator species. So far I have assumed that indval=0.5 is high enough to avoid any mistakes, but that was based only on my intuition. I'd be grateful for any advice.
> > best regards
> > Agnieszka
>
> Agnieszka,
>
> R mailing list software appends the following to your message:
>
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
> Then about indicator value analysis. You should be more specific: there seem to be three alternative functions for indicator species in labdsv. Which did you mean? At least two of these return an item called indval, and these two alternative indvals are very different.
>
> For the Dufrêne-Legendre indvals, you should check the original paper (see references in the help page), and there you even have an associated P value.
>
> In indspc, the variance of the indval clearly depends on species frequency. Moreover, in indspc the expected indval (and its variance) depend on the whole set of sites you have: these reflect the general homogeneity of your data set. Therefore you cannot say that any certain value would mean that a species is a good indicator. However, it would be easy to work out standard errors for indspc indvals.
>
> I think it would be more useful to post to some other mailing group where people are more concerned about indicator species, or to contact the package author directly (I CC this message to him).
>
> cheers, jari oksanen

-- 
David W. Roberts                office 406-994-4548
Professor and Head              FAX 406-994-3190
Department of Ecology           email [EMAIL PROTECTED]
Montana State University
Bozeman, MT 59717-3460
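The indspc() idea described above (the mean similarity of the samples a species occurs in, compared with the mean similarity of the whole matrix) can be illustrated with a small sketch. This is not labdsv's code: it is plain Python, the Jaccard index is an assumed choice of similarity, the function names are hypothetical, and the four samples are invented for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of species (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_pairwise_similarity(samples):
    """Mean similarity over all pairs of samples."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        return float("nan")  # fewer than two samples: no pairs to average
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def species_mean_similarity(samples, species):
    """Mean similarity among only those samples that contain `species`."""
    return mean_pairwise_similarity([s for s in samples if species in s])


# Invented presence data: each sample is the set of species recorded in it.
samples = [
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "e", "f"},
    {"a", "e", "g"},
]

overall = mean_pairwise_similarity(samples)
for sp in ("a", "b", "e"):
    value = species_mean_similarity(samples, sp)
    print(sp, round(value, 3), "overall:", round(overall, 3))
```

Species "a" occurs in every sample, so its value is necessarily the overall mean, which is the extreme case mentioned in the message. The rarer species "b" and "e" can score higher simply because fewer, more similar samples enter their average, which is why a single fixed cutoff is hard to justify for this statistic.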
Re: [R] indicator value in labdsv
Hello,

I was unclear before, I'm sorry about that. I forgot to add that I'm using duleg...

I used mvpart for multivariate regression trees. My input variables are environmental parameters; the output variables are macrophyte species (presence=1, absence=0 in consecutive cases=lakes). For the classes obtained I used duleg to find indicator species for every class.

I checked the article

Dufrene, M. and Legendre, P. 1997. Species assemblages and indicator species: the need for a flexible asymmetrical approach. Ecol. Monogr. 67(3):345-366.

The authors used a threshold of indval=0.25 (25%), and that's the only hint I've found in the literature. This threshold seems reasonable, but I still have the impression that it's too low...

best regards
Agnieszka

-- 
Best regards,                              mailto:[EMAIL PROTECTED]
Agnieszka Strzelczak, Research Assistant   mailto:[EMAIL PROTECTED]
Institute of Chemistry and Environmental Protection
Faculty of Chemical Engineering
Szczecin University of Technology
Aleja Piastow 42
71-065 Szczecin
Poland
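To make values such as 0.25 or 0.5 concrete, here is a minimal sketch of the Dufrene-Legendre indicator value for presence/absence data: specificity (a class's share of the species' summed relative frequencies) times fidelity (the fraction of the class's sites where the species occurs), with a simple permutation p-value. It is plain Python rather than the duleg() implementation; the function names, the lake data and the number of permutations are assumptions made for illustration.

```python
import random


def indval(presence, groups):
    """Return {species: {group: IndVal}} on a 0-1 scale.

    presence: {site: set of species present}
    groups:   {site: group label}
    """
    labels = sorted(set(groups.values()))
    sites_in = {g: [s for s in presence if groups[s] == g] for g in labels}
    species = sorted(set().union(*presence.values()))
    result = {}
    for sp in species:
        # Fidelity B: fraction of the group's sites where the species occurs.
        freq = {g: sum(sp in presence[s] for s in sites_in[g]) / len(sites_in[g])
                for g in labels}
        total = sum(freq.values())
        # Specificity A: the group's share of the summed relative frequencies.
        result[sp] = {g: (freq[g] / total) * freq[g] if total else 0.0
                      for g in labels}
    return result


def max_indval_pvalue(presence, groups, sp, n_perm=999, seed=1):
    """Permutation p-value for the largest IndVal of species `sp`."""
    rng = random.Random(seed)
    observed = max(indval(presence, groups)[sp].values())
    sites = list(groups)
    labels = [groups[s] for s in sites]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # reallocate sites among groups, keeping group sizes
        permuted = dict(zip(sites, labels))
        if max(indval(presence, permuted)[sp].values()) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Invented example: eight lakes in two classes, three macrophyte species.
presence = {
    "L1": {"chara", "nuphar"}, "L2": {"chara"},
    "L3": {"chara", "elodea"}, "L4": {"chara", "nuphar"},
    "L5": {"elodea", "nuphar"}, "L6": {"elodea"},
    "L7": {"nuphar"}, "L8": {"elodea", "nuphar"},
}
groups = {"L1": "A", "L2": "A", "L3": "A", "L4": "A",
          "L5": "B", "L6": "B", "L7": "B", "L8": "B"}

vals = indval(presence, groups)
for sp, by_group in vals.items():
    print(sp, {g: round(v, 4) for g, v in by_group.items()})

obs, p = max_indval_pvalue(presence, groups, "chara")
print("chara: max IndVal", obs, "permutation p", p)
```

In this invented data, "nuphar" occurs in 3 of 4 class-B lakes but also in 2 of 4 class-A lakes, so its class-B value is 0.6 x 0.75 = 0.45: clearly above 0.25, yet the species is far from exclusive to the class. That kind of case may be why a 0.25 cutoff can feel low.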
Re: [R] indicator value in labdsv
Wow! That was fast!

Unfortunately, Agnieszka, I don't think you will find an objective criterion for this. Clearly, species which do not have a statistically significant value are probably less useful, but of the many that are significant, many may be marginal.

Without knowing fully what you are hoping to achieve, I think I would rank the species by indicator value, and establish the highest threshold for indicator value that gives you a suitable number of species for each type. That way, if you are looking to write a field key, for example, you would have sufficient values to identify every type, I suspect.

Good luck, Dave
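The ranking procedure suggested above can be sketched as follows: drop the non-significant species, rank the rest by indicator value within each type, and take the highest cutoff that still leaves at least k species for every type. This is one possible reading of the advice, not code from labdsv; the function name, the significance level of 0.05, the value of k and the records are all hypothetical.

```python
def highest_common_threshold(records, k, alpha=0.05):
    """Highest IndVal cutoff that keeps at least k significant species per type.

    records: iterable of (species, type, indval, pvalue) tuples.
    Returns (threshold, {type: [species ranked by decreasing indval]}).
    """
    records = list(records)
    by_type = {t: [] for t in sorted({r[1] for r in records})}
    for species, type_, value, pval in records:
        if pval <= alpha:  # discard species without a significant value
            by_type[type_].append((value, species))

    kth_values = []
    for type_, entries in by_type.items():
        entries.sort(reverse=True)  # rank by indicator value, highest first
        if len(entries) < k:
            raise ValueError(f"type {type_!r} has fewer than {k} significant species")
        kth_values.append(entries[k - 1][0])

    # The type with the weakest k-th species determines the common cutoff.
    threshold = min(kth_values)
    selected = {t: [sp for v, sp in e if v >= threshold] for t, e in by_type.items()}
    return threshold, selected


# Hypothetical results: (species, type, indval, permutation p-value).
records = [
    ("sp1", "A", 0.82, 0.001), ("sp2", "A", 0.55, 0.010), ("sp3", "A", 0.31, 0.200),
    ("sp4", "B", 0.64, 0.002), ("sp5", "B", 0.41, 0.030), ("sp6", "B", 0.27, 0.040),
]

threshold, selected = highest_common_threshold(records, k=2)
print("threshold:", threshold)
print("selected:", selected)
```

The cutoff that comes out depends on the data and on how many species per type are wanted, which matches the point that no objective universal threshold exists.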
Re: [R] indicator value in labdsv
That's what I was afraid of... I'm not a biologist, so I have to involve my co-workers, who are well-versed in biological issues. Thank you very, very much for your help!!!

best regards
Agnieszka

Monday, September 19, 2005, 6:38:35 PM, you wrote:
> Unfortunately, Agnieszka, I don't think you will find an objective criterion for this.