James Rohlf

Thank you for getting those comments from Richard Reyment.

I am unsure what alterations are referred to in the sentence 'any alteration introduced into a matrix of compositions changes the sum in a manner that is beyond control'. I obviously need to look at the Geol Soc London book to understand the implications of simplex space.

Please pass my thanks to Richard Reyment. I have long admired the clarity and practicality of his book Multivariate Morphometrics.
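One possible reading of the constant-sum remark can be sketched numerically. The Python below is only an illustration under stated assumptions, not anything taken from Reyment or Aitchison: the three-part data are simulated, and the helper names pearson, close and clr are hypothetical. It shows that parts which are uncorrelated as raw ("open") measurements become negatively correlated once each row is rescaled to sum to 1, and that changing one raw part changes the closed values of the parts that were not touched. The clr helper is the centred log-ratio transform commonly associated with Aitchison's approach.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def close(row):
    """Rescale a row of positive parts so that it sums to 1 (closure)."""
    total = sum(row)
    return [v / total for v in row]


def clr(row):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in row]
    mean_log = sum(logs) / len(logs)
    return [lv - mean_log for lv in logs]


# Assumption: simulated data, three independent positive parts per row.
rng = random.Random(0)  # arbitrary fixed seed so the run is repeatable
raw = [[rng.uniform(1.0, 2.0) for _ in range(3)] for _ in range(2000)]
closed = [close(r) for r in raw]

r_raw = pearson([r[0] for r in raw], [r[1] for r in raw])
r_closed = pearson([r[0] for r in closed], [r[1] for r in closed])
print(f"open data   r(part1, part2) = {r_raw:+.3f}")
print(f"closed data r(part1, part2) = {r_closed:+.3f}")

# Altering only the first raw part shifts every closed value in the row.
before = close([2.0, 3.0, 5.0])
after = close([4.0, 3.0, 5.0])
print("before:", before)
print("after: ", after)
print("clr of 'before':", clr(before))
```

With three parts that are independent and identically distributed, the constant sum forces the closed parts towards a correlation of about -0.5 whatever the raw data were doing, which may be what "beyond control" refers to.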
Richard Wright

>Subject: Re: using ratios in MV correlational analysis
> From: "F. James Rohlf" <[EMAIL PROTECTED]>
> Date: Mon, 29 Sep 2008 21:08:43 -0400
> To: [email protected]
>
>The following are some comments by Richard Reyment who has worked on
>problems in this area:
>
>"Is this useful. Generally speaking, biologists know absolutely nothing
>about the geometry of the simplex, and this is also true of a great many
>statisticians. For the geomathematical fraternity, however, the subject is
>of great importance because it is often connected to analyses involving
>large-scale economic aspects where an inappropriate analysis can waste great
>sums of money.
>
> G G Simpson was among the first biologists to point out that ratios
>cannot be used in correlation exercises such as indicated in the Course 1
>agenda.
>Originally, it was Karl Pearson who in 1898 proved that ratios induce
>spurious correlations. This was in relation to so-called standardised
>data-vectors.
>
> Of recent years geomathematicians have taken the subject much further,
>following the results of the statistician John Aitchison, who proved that
>correlation coefficients are not defined in simplex space, that is the space
>in which percentages, frequencies etc lie. This is the outcome of the fact
>that such data have a constant sum and any alteration introduced into a
>matrix of compositions changes the sum in a manner that is beyond control.
>This is not a problem for open-space data of course.
>
> Ref. John Aitchison: The Statistical analysis of Compositional Data;
>Chapman and Hall (1986), slightly revised version reprinted in 2003.
>
> Hence, multivariate analyses involving compositional data must be made
>using the appropriate algebra for distributions on the simplex.
>Applying the "open-space" standard version can only lead to incorrect
>results.
>
> Since the original work was published by Aitchison, the Applied
>Mathematicians Professors Vera Pawlowsky-Glahn and Juan José Egozcue have
>raised the bar several levels in that they introduced the concept of a
>finite dimensional Hilbert Space into the analysis of simplicial geometry.
>This leads to very elegant solutions.
>
> An indispensable reference is the recently published volume edited by A.
>Buccianti, G. Mateu-Figueras and V. Pawlowsky-Glahn
>
>COMPOSITIONAL DATA-ANALYSIS IN THE GEOSCIENCES: FROM THEORY TO PRACTICE
>
>Published by the Geological Society of London, Special Publication No 264,
>2006 (212 pp.)
>
> http://www.geolsoc.org.uk/bookshop
>
>
> Best wishes
>
>Richard A. Reyment"
>
>------------------------
>F. James Rohlf, Distinguished Professor
>Ecology & Evolution, Stony Brook University
>www: http://life.bio.sunysb.edu/ee/rohlf
>
>
>> -----Original Message-----
>> From: Classification, clustering, and phylogeny estimation
>> [mailto:[EMAIL PROTECTED] On Behalf Of Richard Wright
>> Sent: Saturday, September 27, 2008 2:05 AM
>> To: [email protected]
>> Subject: using ratios in MV correlational analysis
>>
>> There is a scattered literature on the dangers, or otherwise, of using
>> ratios in correlational analyses.
>>
>> I have read what looks like a non-obfuscatory paper on this topic by
>> Firebaugh and Gibbs "User's Guide to Ratio Variables" from American
>> Sociological Review, Vol.50, No.5 (1985) pp.713-722.
>>
>> On page 721 the authors state: "Avoid mixed methods (part ratio, part
>> component). If Z is controlled by division rather than by
>> residualization, all of the other variables should be divided by Z.
>> Should only some of the variables be divided by Z, the effect of Z is
>> 'controlled' for some variables and not for others, and a defensible
>> interpretation of the results is difficult."
>>
>> The reason for my interest is that I am trying to evaluate a
>> morphometric paper that does linear discriminant analysis on a mixture
>> of measurements and ratios derived from those same measurements. For
>> example the analysis includes (A) Length as well as Height/Length and
>> (B) Height and Breadth as well as Height/Breadth and Height/Length.
>>
>> This paper seems to be an example of the 'mixed method' that Firebaugh
>> and Gibbs warn against, where data are part ratio, part measurement,
>> and spurious correlations are introduced into the data.
>>
>> So my first question is whether I am correct in this interpretation.
>>
>> My second question also concerns ratios.
>>
>> In his Multivariate Statistical Methods, 2nd ed. 1994, B.F.J. Manly
>> suggests controlling for the effects of absolute size difference in a
>> PCA of pots (goblets) by expressing the measurements as "a proportion
>> of the sum of all measurements on that goblet."
>>
>> Given that each variable is divided by the same sum, this example of
>> the use of ratios seems to be a case that Firebaugh and Gibbs would
>> not frown on.
>>
>> I shall welcome any comments on these questions and any pointers to
>> relevant literature.
>>
>> Richard
>>
>> ----------------------------------------------
>> CLASS-L list.
>> Instructions: http://www.classification-society.org/csna/lists.html#class-l
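The first question in the original post, about mixing measurements with ratios of those same measurements, can be illustrated with a small simulation. This is a hedged sketch only: the height, breadth and length values are simulated independent uniform draws rather than data from the paper under review, and pearson is a hypothetical helper. Because the three raw measurements are generated independently, any correlation that appears among the derived variables is produced by the ratio construction alone.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Assumption: simulated, mutually independent measurements (arbitrary units).
rng = random.Random(1)  # arbitrary fixed seed so the run is repeatable
n = 2000
height = [rng.uniform(5.0, 15.0) for _ in range(n)]
breadth = [rng.uniform(5.0, 15.0) for _ in range(n)]
length = [rng.uniform(5.0, 15.0) for _ in range(n)]

# Two ratios that share the same denominator.
h_over_l = [h / l for h, l in zip(height, length)]
b_over_l = [b / l for b, l in zip(breadth, length)]

r_open = pearson(height, breadth)      # raw measurements
r_shared = pearson(h_over_l, b_over_l)  # ratios with a common denominator
r_mixed = pearson(length, h_over_l)     # a measurement with its own ratio

print(f"r(Height, Breadth)               = {r_open:+.3f}")
print(f"r(Height/Length, Breadth/Length) = {r_shared:+.3f}")
print(f"r(Length, Height/Length)         = {r_mixed:+.3f}")
```

Under these assumptions the raw measurements show essentially no correlation, the two ratios sharing Length as denominator are clearly positively correlated, and Length is clearly negatively correlated with Height/Length. The last case is the 'part ratio, part measurement' mixture described in the post.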
