Um, panic? :-)

Start with summary statistics: calculate the mean and plot a histogram for
every species. I'd guess you'll find quite a few occur in only 1 or 2 plots,
so they're not much use for estimating correlations. Dropping them could
reduce the number of species considerably. You'll also be able to spot the
species with outliers, which are also going to screw up the pairwise plots.
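
A rough sketch of that first pass, in plain Python. The toy data, the species names and the MIN_PLOTS cut-off are all made up for illustration, so pick your own threshold. It covers the means and the "how many plots is it in" count; the histograms would need whatever plotting tool you prefer.

```python
from statistics import mean

# Hypothetical toy data: species name -> abundance in each of 6 plots.
abundance = {
    "sp_a": [3, 5, 4, 6, 2, 5],
    "sp_b": [0, 0, 1, 0, 0, 0],
    "sp_c": [0, 2, 0, 0, 0, 3],
    "sp_d": [1, 2, 1, 40, 2, 1],
}

MIN_PLOTS = 3  # assumed cut-off: drop species present in fewer plots than this


def summarise(counts):
    """Return mean abundance, number of occupied plots, and the maximum."""
    occupied = sum(1 for c in counts if c > 0)
    return {"mean": mean(counts), "occupied": occupied, "max": max(counts)}


summary = {sp: summarise(counts) for sp, counts in abundance.items()}

# Species that occur in enough plots to be worth correlating.
kept = [sp for sp, s in summary.items() if s["occupied"] >= MIN_PLOTS]

for sp, s in summary.items():
    flag = "" if sp in kept else "  <- rare, candidate to drop"
    print(f"{sp}: mean={s['mean']:.2f} plots={s['occupied']} max={s['max']}{flag}")
```

A maximum far above the mean (as for the made-up sp_d here) is a cheap hint that a species has an outlier worth looking at in its histogram.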

Then I'd also just draw the pairwise plots for (say) the 10 most abundant
species, and have a look. If they're OK, perhaps plot the next 10 most
abundant. I suspect that'll help you see most strange patterns.
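
To show how much that cuts the job down, here is a small sketch that ranks species by total abundance and lists the pairs to plot. The data are randomly generated stand-ins (150 species, 20 plots, both numbers assumed), and top_species is just a helper name I made up.

```python
import random
from itertools import combinations
from math import comb


def top_species(abundance, n=10):
    """Return the n species with the largest total abundance."""
    totals = {sp: sum(counts) for sp, counts in abundance.items()}
    return sorted(totals, key=totals.get, reverse=True)[:n]


# Hypothetical stand-in data: 150 species in 20 plots.
rng = random.Random(0)
abundance = {
    f"sp{i:03d}": [rng.randint(0, 20) for _ in range(20)] for i in range(150)
}

top = top_species(abundance, n=10)
pairs = list(combinations(top, 2))  # each unordered pair once

print(f"all pairs: {comb(len(abundance), 2)}, pairs among top 10: {len(pairs)}")
```

Each entry of pairs is a (species, species) tuple, so you can loop over it and hand the two abundance vectors to your scatter plot routine.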

BTW, odd to decide to use Spearman's before you've even seen the data. You're 
dissing your data by not even asking it whether it wants a Spearman's - it 
might be quite happy with Pearson.
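
One way of asking the data, sketched under my own assumptions: compute both coefficients for a pair and look at the pairs where they disagree most, since a single outlier or a curved relationship will pull Pearson away from Spearman. The functions and the two toy vectors below are illustrative only; Spearman is computed here as Pearson on average ranks.

```python
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical pair of species with one wild value in the second.
x = [1, 2, 3, 4, 5, 6]
y = [1, 2, 3, 4, 5, 100]

r_p, r_s = pearson(x, y), spearman(x, y)
print(f"Pearson={r_p:.3f} Spearman={r_s:.3f} difference={abs(r_p - r_s):.3f}")
```

Sorting all species pairs by that difference gives a short list of scatter plots that are worth opening first.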

Bob

Bob O'Hara

Tel: +49 69 798 40216 (in Germany)
Mobile: +49 1515 888 5440
WWW: http://www.bik-f.de/root/index.php?page_id=219
Blog: http://blogs.nature.com/boboh/
Journal of Negative Results - EEB: www.jnr-eeb.org
>>> Jane Shevtsov  06/14/10 7:08 AM >>>
Dear list members,

Stats books (and common sense) typically insist that you need to
examine scatter plots of your data before computing a correlation
coefficient. However, I have a species-plot matrix with 150 species,
for which I plan to generate a correlation matrix as a start for
further analysis. (I'm using the Spearman rank-order correlation to be
on the safe side.) That works out to 11,175 pairwise scatter plots!
What do you recommend I do in order to get a feel for the data and
diagnose potential problems without looking at all of them?

Thanks in advance, and I'll post a summary of responses.

Jane Shevtsov

-- 
-------------
Jane Shevtsov
Ecology Ph.D. candidate, University of Georgia
co-founder, 
Check out my blog, Perceiving Wholes

"The whole person must have both the humility to nurture the
Earth and the pride to go to Mars." --Wyn Wachhorst, The Dream
of Spaceflight
