On Thu, 24 Aug 2006 11:04:42 -0300, James J. Roper <[EMAIL PROTECTED]> wrote:
>Steve,
>

Dear Jim,

I'm not sure I fully agree with this.

>Even PCA should require multivariate normality, because the method is
>based on using either the covariance or the correlation matrix. Both of
>these require multivariate normality to some extent to be meaningful.

Why is that? The normality assumption is only required for the
hypothesis testing procedure. Suppose you just want to use the
correlation or covariance as a measure of association?

>The variance is only a good estimate of the "true" variance if the
>distribution is normal or transformable to normality, and so, normality
>is required. Correlations, to be meaningful, also require normality, as
>the statistic program is not using a covariance matrix based on ranks
>(Spearman).

In some situations perhaps yes, but I can also imagine situations in
which this does not hold. Suppose you are interested in the correlation
between a species abundance and temperature. Assuming bivariate
normality means that each of the variables should be normally
distributed. So most of your temperature values should be clumped
around a certain value. If all the fun happens in this specific
temperature regime, then that is fine. But if you have long gradients,
it is perhaps better to take an equal number of samples along the
temperature gradient (this is also one of the assumptions in methods
like canonical correspondence analysis and redundancy analysis; see
Ter Braak 1986).

>Finally, if the new components (reduced number of variables)
>are being used in a new analysis (hypothesis testing) - for them to make
>sense, their distributions should also fit the assumptions of the test
>(if ANOVA, normality of the residuals and equality of the variances, for
>example).

If you mean that you want to use the PCA axes as new explanatory
variables in a linear regression, then keep in mind that we do not need
normality of the explanatory variables in linear regression. What we
actually assume is that at each X value the Y data are normally
distributed. So in a linear regression with one X, you can draw a line
(fitted values) with a tunnel on top of it showing the probability of
possible realisations. The more I think about it, the more I start to
believe that a nicely uniformly distributed X is better (approximately
the same number of observations along your X gradient). In a GAM, this
is even more important, or else you get these very wide confidence
bands at the ends of the gradients.

>If the original variables were far from normal, then the
>reduced number of variables based on the correlation or covariance
>matrix are problematic as well. And of course, we all know that the
>normality assumption is fairly robust, more so than the equality of
>variance assumption.

That is true. There are even good textbooks that argue that normality
is not needed at all, and cite the central limit theorem.

Alain
www.highstat.com

>
>Cheers,
>
>Jim
>
>Steve Brewer wrote:
>> Please allow me to clarify one comment I made regarding multivariate
>> normality. When I was talking about the multivariate normality
>> requirement, it was in relation to doing discriminant analysis and
>> MANOVA, not PCA. I believe that multivariate normality is required for
>> testing significance using these techniques. If I am wrong, then
>> several multivariate textbooks are wrong also.
>>
>> Indeed, multivariate normality is not required for PCA. PCA does not
>> involve hypothesis testing. Having said that, several have shown using
>> simulations that, when certain aspects of multivariate normality do
>> not hold (e.g., when there are lots of zero values), other exploratory
>> techniques (e.g., non-metric multidimensional scaling) perform better.
>> I have seen some use Principal Coordinates Analysis (using distance
>> measures other than correlation) to examine morphometric differences
>> among taxa.
>> Presumably, this performs better than PCA under certain
>> circumstances.
>>
>> One problem I have seen is that some investigators become attached to
>> a particular technique. When I ask them why, many respond that it is
>> the most commonly used analysis in their particular field of study.
>> Hopefully, we can all agree that *that* is not an adequate
>> justification for using a particular technique. Personally, I prefer
>> to analyze multivariate data using several different techniques
>> (including PCA). When they provide different results, I become
>> suspicious and am encouraged to find out why.
>>
>> Steve Brewer
>>
>>
>> At 6:05 AM -0400 8/24/06, Highland Statistics Ltd. wrote:
>>>>
>>>> Hope this helps some. Let me know if you want information about
>>>> SuperAnova or PC-Ord.
>>>>
>>>> Steve
>>>>
>>>> At 7:31 PM -0500 8/21/06, Chris Taylor wrote:
>>>>> Hey Steve. What do you run those nested discriminant analyses with?
>>>>> Hope all is well!
>>>>>
>>>>> Chris
>>>>>
>>>>> At 11:18 AM 8/21/2006, you wrote:
>>>>>> Matthew,
>>>>>>
>>>>>> You may also want to do a nested discriminant analysis to determine
>>>>>> whether the mean morphology differs among populations, while
>>>>>> controlling for species. The nesting of populations within species
>>>>>> should "correct for phylogeny", unless there is something I'm missing
>>>>>> here (e.g., phylogenetic relationships among populations within
>>>>>> species). Don't really see the need for PICs. Make sure the
>>>>>> assumptions of multivariate normality are met.
>>>>>>
>>>>>> Steve
>>>>>>
>>>
>>> Matthew,
>>>
>>>>>> At 10:30 AM -0400 8/18/06, Matthew Gifford wrote:
>>>>>>> I am looking for advice regarding principal components analysis. My
>>>>>>> situation is as follows: I have a
>>>>>>> data set of morphological measurements for 6 "taxa" (4 populations
>>>>>>> of one species and 2
>>>>>>> populations of another).
>>>>>>> I read somewhere that in order to do a PCA
>>>>>>> appropriately, one needs to
>>>>>>> have more "taxa" (i.e., rows) than measurement variables (i.e.,
>>>>>>> columns).
>>>
>>> This is to avoid negative eigenvalues. But if you only focus on the
>>> first few eigenvalues, this should be no problem.
>>>
>>>>>>> If I use mean values for
>>>>>>> each "taxon" then I violate this assumption. To circumvent this,
>>>>>>> is it valid to do a PCA on all data
>>>>>>> and use mean PC scores?
>>>
>>> No need to do this. And if you do, it doesn't solve the negative
>>> eigenvalue problem.
>>>
>>> No need for multivariate normality either.
>>>
>>>>>>> I will be using this information in
>>>>>>> phylogenetically independent contrasts
>>>>>>> analysis looking at ecomorphological relationships.
>>>
>>> The real problem with morphometric data is that the first axes become
>>> size and shape axes. See:
>>>
>>> Jolliffe IT (2002) Principal Component Analysis. Springer: New York
>>>
>>> and:
>>>
>>> Claude, J., Jolliffe, I.T., Zuur, A.F., Ieno, E.N. and Smith, G.M.
>>> Multivariate analyses of morphometric turtle data size and shape.
>>> Chapter 30 in Zuur, A.F., Ieno, E.N. and Smith, G.M. (Expected
>>> publication date: March 2007). Springer
>>>
>>> Kind regards,
>>>
>>> Alain Zuur
>>> www.highstat.com
>>>
>>>>>>> Any
>>>>>>> thoughts/opinions are most appreciated.
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Matthew E. Gifford
>>>>>>> Ph.D. Candidate
>>>>>>> Washington University, St. Louis, MO
>>>>>>> http://www.biology.wustl.edu/larsonlab/people/Gifford/Matt's_webpage.html
>>>>>>
>>>>>> --
>>>>>> Department of Biology
>>>>>> PO Box 1848
>>>>>> University of Mississippi
>>>>>> University, Mississippi 38677-1848
>>>>>>
>>>>>> Brewer web page - http://home.olemiss.edu/~jbrewer/
>>>>>>
>>>>>> FAX - 662-915-5144
>>>>>> Phone - 662-915-1077
>>>>>
>>>>> ***************************************************************
>>>>> Christopher M. Taylor
>>>>> Associate Professor of Biological Sciences
>>>>> Dept. of Biological Sciences
>>>>> Mississippi State University
>>>>> Mississippi State, MS 39762
>>>>> Phone: 662-325-8591
>>>>> Fax: 662-325-7939
>>>>> Email: [EMAIL PROTECTED]
>>>>> http://www2.msstate.edu/~ctaylor/ctaylor.htm
>
>--
>-------------------------------------
>James J. Roper, Ph.D.
>Universidade Federal do Paraná
>Depto. de Zoologia
>Caixa Postal 19020
>81531-990 Curitiba, Paraná, Brasil
>=====================================
>E-mail: [EMAIL PROTECTED]
>Phone/Fone/Teléfono: 55 41 33611764
>celular: 55 41 99870543
>=====================================
>Zoologia na UFPR
>http://zoo.bio.ufpr.br/zoologia/
>Ecologia e Conservação na UFPR
>http://www.bio.ufpr.br/ecologia/
>-------------------------------------
>http://jjroper.sites.uol.com.br
>Currículo Lattes
>http://lattes.cnpq.br/2553295738925812
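
A note on the rows-versus-columns point in the quoted exchange: with
fewer rows than columns the sample covariance matrix is rank-deficient,
so at most n - 1 eigenvalues are non-zero and the rest are zero up to
rounding error (which is presumably where tiny negative values can show
up). The following is only a minimal sketch in Python with NumPy, not
anything from the thread; the 6 x 10 random matrix and the function
name covariance_eigenvalues are made up for illustration.

```python
import numpy as np


def covariance_eigenvalues(data):
    """Eigenvalues of the sample covariance matrix, largest first."""
    cov = np.cov(data, rowvar=False)      # columns are the variables
    return np.linalg.eigvalsh(cov)[::-1]  # eigvalsh returns ascending order


# Hypothetical data: 6 rows ("taxa") and 10 measurement variables.
rng = np.random.default_rng(0)
n_rows, n_cols = 6, 10
data = rng.normal(size=(n_rows, n_cols))

eigenvalues = covariance_eigenvalues(data)

# Count eigenvalues that are clearly non-zero relative to the largest one.
n_nonzero = int(np.sum(eigenvalues > 1e-10 * eigenvalues[0]))

print("non-zero eigenvalues:", n_nonzero, "(n_rows - 1 =", n_rows - 1, ")")
print("trailing eigenvalues:", eigenvalues[n_nonzero:])
```

The leading eigenvalues are still well defined in this situation, which
is consistent with the advice above to focus on the first few axes.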
