Re: size correction discriminant functions analyses
In my understanding of PCA, its main goal is to reduce the dimensionality of a problem without losing too much information. In other words, according to Prof. Rohlf, the purpose of PCA is to give you a low-dimensional space that accounts for as much variation as possible. However, I agree with Oyvind that many scientists use PCA as a visualization device, projecting a multivariate data set onto a sheet of paper.

On the other hand, testing for multivariate normality before applying any multivariate data analysis technique is one of the most serious problems, because in most cases no one does it, and anyone who tries may choose the wrong method. Actually, we (biologists and paleontologists) need a definite guide to follow when we face such a problem.

Best regards
---
Dr. Ashraf M. T. Elewa
Associate Professor
Geology Department
Faculty of Science
Minia University
Egypt
[EMAIL PROTECTED]
http://myprofile.cos.com/aelewa

- Original Message -
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, May 19, 2004 04:29
Subject: Re: size correction discriminant functions analyses

Just a comment on this one, from a pragmatic point of view.

It is of course true that PCA is only *guaranteed* to produce components maximizing variance if you have multivariate normality. The theory of PCA is based on this assumption. But in many cases, PCA is used purely as a visualization device, projecting a multivariate data set onto a sheet of paper so we can see it. For visualization of non-normal data, one could play around with different techniques, such as PCA, PCO, NMDS, projection pursuit etc., and then find that PCA does (or does not) perform well for the given data set. There is no law against making any linear combination you want of your variates, if it reveals information. For example, PCA may be perfectly adequate for resolving two well-separated groups, if the within-group variance is relatively small.

Of course, when using PCA for non-normal data one must be a little careful and not over-interpret the results (especially not the component loadings), but I think it's too harsh to dismiss its use totally.

I'm sure the hard-liners will flame me to pieces for this email, but I hope they will at least give me credit for my courage :-)

Dr. Oyvind Hammer
Geological Museum
University of Oslo

PCA Analysis assumes multivariate normality.

Kathleen M. Robinette, Ph.D.
Principal Research Anthropologist
Air Force Research Laboratory

== Replies will be sent to list. For more information see http://life.bio.sunysb.edu/morph/morphmet.html.
Re: size correction discriminant functions analyses
Don't know what happened to cause the earlier message to be largely void of content, but I think the original communication was to correct the Red Book reference. The date is 1985, not 1982.

-ds

On Tue, 2004-05-18 at 14:12, [EMAIL PROTECTED] wrote:

--
Dennis E. Slice, Ph.D.
Department of Biomedical Engineering
Division of Radiologic Sciences
Wake Forest University School of Medicine
Winston-Salem, North Carolina, USA 27157-1022
Phone: 336-716-5384
Fax: 336-716-2870
Re: size correction discriminant functions analyses
Dear colleagues,

About the above discussion on linear measurement data for multivariate analysis, I should state that most of the time my problem (and, I expect, the problem of many people who work with such data) is not the number of rows/columns (which is usually OK, at least in the cases I have seen), nor multivariate normality (I use the R-project program, which has a test of multivariate normality, so it is easy to test), nor lack of homogeneity of variances (this is a bit more dodgy, but the references I have seen state that testing univariate homogeneity of variances, e.g. with the Bartlett test, should give a good indication for the data as a whole). The problem that (I suppose) most biologists encounter is collinearity between variables, which strongly influences the representation given by the PCA. I think this also happens in NMDS, discriminant and canonical analysis.

I probably did not make myself clear in my email. I am sorry. For me, it is very interesting that these things are debated on the list, and that different people offer different solutions and bibliography; it is really nice.

Regarding the article from Biometrika, does anyone have the PDF? We don't have the journal at this college.

Regarding the robustness of the techniques to lack of normality, I agree with our colleague (so I share your feeling of daring to state it... hee hee ;-))

Thank you all,
Cheers,
Marta
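The collinearity concern raised in the message above can be checked before running a PCA. The following is a minimal sketch in plain Python, assuming that pairwise Pearson correlation is an acceptable first screen; the variable names, the toy measurements, and the 0.95 threshold are hypothetical choices made for illustration and do not come from the thread.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def collinear_pairs(data, threshold=0.95):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    `data` maps a variable name to its list of measurements.
    The threshold is an arbitrary illustrative value, not a standard.
    """
    flagged = []
    for name_a, name_b in combinations(data, 2):
        r = pearson(data[name_a], data[name_b])
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical measurements on five specimens.
measurements = {
    "length": [10.0, 12.0, 14.0, 16.0, 18.0],
    "width": [5.1, 6.0, 7.1, 7.9, 9.0],
    "spine_count": [3.0, 1.0, 4.0, 1.0, 5.0],
}

pairs = collinear_pairs(measurements)
for name_a, name_b, r in pairs:
    print(f"{name_a} and {name_b} are nearly collinear (r = {r:.3f})")
```

With these toy numbers only length and width are flagged. A pairwise screen of this kind will not catch collinearity that involves three or more variables at once, so it is a starting point rather than a full diagnosis.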
Re: size correction discriminant functions analyses
I applaud your courage, Dr. Hammer. I hope everyone appreciates how intimidating this list of experts can be. I also agree with your point that PCA can be used when the data are not multivariate normal if you are just using it to visualize information, or if you just know what it is doing, for that matter. I am a fan of using any and all analyses that help in figuring out what is happening.

However, in order to understand the results and what you are visualizing, you have to understand both the data input and what the statistical analysis is doing. Sometimes the information that seems to be revealed is an artifact of violation of the assumptions, and if the observer doesn't realize this it is very easy to come to the wrong conclusion.

I thought that what the analysis was doing and how to interpret it were the original questions we were discussing, although I admit to reading the e-mails quickly. The original e-mail indicated that perhaps size and shape confounding was causing their odd-looking results. If the shapes are the same but the sizes are different, then the source of the non-normality would be multiple modes only. This may not be a serious enough violation to cause interpretability problems. However, it sounded to me from the description of the problem and the results that, in addition to multiple modes, there are multiple variance/covariance matrices. That was making it difficult to interpret the results and, since PCA is based upon the variance/covariance matrix, will result in difficult-to-interpret or even invalid components. Separating the analysis into subgroups will allow them to visualize and test the differences in the modes and in the variance/covariance matrices, and in that way understand the source of the differences in the groups. Maybe the common PCA analysis someone else mentioned might do this as well. I am not familiar with that method.

Thanx all again for your attention and patience,
Kath

Kathleen M. Robinette, Ph.D.
Principal Research Anthropologist
Air Force Research Laboratory
AFRL/HEPA
2800 Q Street
Wright-Patterson AFB, OH 45433-7947
(937) 255-8810
DSN 785-8810
FAX (937) 255-8752
e-mail: [EMAIL PROTECTED]
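The point in the message above about multiple variance/covariance matrices can be illustrated with a minimal sketch in plain Python. It assumes two variables and two known subgroups; the group data and the function name are hypothetical and serve only to show why a covariance matrix computed on the combined sample can differ sharply from the within-group matrices.

```python
def covariance_2d(points):
    """Sample variance/covariance matrix [[sxx, sxy], [sxy, syy]] of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]


# Hypothetical subgroups: different means (two modes) AND different
# covariance structure (positive vs. negative association).
group_a = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.5), (4.0, 4.0)]
group_b = [(11.0, 4.0), (12.0, 3.0), (13.0, 2.5), (14.0, 1.0)]

cov_a = covariance_2d(group_a)
cov_b = covariance_2d(group_b)
cov_all = covariance_2d(group_a + group_b)

print("within group A:", cov_a)
print("within group B:", cov_b)
print("combined sample:", cov_all)
```

In this toy case the covariance is positive within one group and negative within the other, while the x variance of the combined sample is dominated by the difference between the group means. A PCA of the combined sample would therefore mostly describe the separation of the modes, which is one reason for examining the subgroups separately.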
Re: size correction discriminant functions analyses
Dr. Hammer,

Please consider your courage credited.

-ds

A couple of points about PCA in general:

1) PCA makes no assumptions about the distribution (multivariate normal or otherwise) of your data. It is a procedure that simply produces the linear combinations of variables with maximum variance subject to orthogonality to other such axes. Distribution assumptions only come into play for (some) significance testing procedures.

2) PC1 will only identify size variation if size variation is the source of the greatest variation in your sample. Sex, species, habitat, etc. could all be determinants (not in the matrix sense 8-) ) of PC1, or some combination of these. In general, if you have data with some extreme outlier (e.g., a transcription error), then PC1 will (probably) just point to (or pi radians away from) the direction of that outlier relative to the main sample, which will still be the linear combination of maximum variance.

What people often want PCA to do is either a) identify iso/allometry due to size variation in a sample or b) separate out sexes, species, or other groups. PCA is optimal for neither of these and could be quite misleading in both cases. If you are interested in size relationships, regress variables on some meaningful measure of size. If you are interested in group differences, look into CVA. If you have many more variables than specimens, you might do either of the above in a reduced PCA space, if you check carefully to see whether your limited data suggest you are capturing salient aspects of a space of reduced dimension resulting from the tight correlations amongst your variables. Otherwise, you must wave your hands vigorously before proceeding.

See the Marcus 1990 Blue Book chapter for a nice discussion of PCA and related methods. Books by Jackson, Jolliffe, and other authors specifically on principal components are available.

-ds

--
Dennis E. Slice, Ph.D.
Department of Biomedical Engineering
Division of Radiologic Sciences
Wake Forest University School of Medicine
Winston-Salem, North Carolina, USA 27157-1022
Phone: 336-716-5384
Fax: 336-716-2870
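Point 1 and the outlier remark in the message above can be illustrated with a minimal sketch in plain Python. It assumes only two variables, so that the leading eigenvector of the 2x2 variance/covariance matrix has a closed form; the function name and the toy measurements are hypothetical and come from no package mentioned in this thread.

```python
import math


def first_pc_2d(points):
    """Return (unit direction, variance) of PC1 for a list of (x, y) pairs.

    PC1 is the leading eigenvector of the sample covariance matrix,
    i.e. the linear combination of x and y with maximum variance.
    No distributional assumption is used anywhere.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Closed-form major-axis angle and largest eigenvalue of [[sxx, sxy], [sxy, syy]].
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    variance = (sxx + syy) / 2.0 + math.hypot((sxx - syy) / 2.0, sxy)
    return (math.cos(theta), math.sin(theta)), variance


# Hypothetical sample that varies mostly along x ...
main_sample = [(-2.0, 0.1), (-1.0, -0.1), (0.0, 0.0), (1.0, 0.1), (2.0, -0.1)]
# ... plus one extreme outlier along y (e.g. a transcription error).
with_outlier = main_sample + [(0.0, 50.0)]

direction_clean, variance_clean = first_pc_2d(main_sample)
direction_outlier, variance_outlier = first_pc_2d(with_outlier)

print("PC1 without outlier:", direction_clean)
print("PC1 with outlier:   ", direction_outlier)
```

Without the outlier, PC1 lies almost along the x axis; with it, PC1 swings to lie almost along the y axis, toward (or pi radians away from) the outlier. The sign of an eigenvector is arbitrary, so only the axis, not its orientation, is meaningful.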
RE: morphologika now available for free download
Dear Colleagues,

Nicholas Jones and I are pleased to announce that morphologika, a Windows-based program for 3D geometric morphometrics, is available at:

http://www.york.ac.uk/res/fme/index.htm

It can be downloaded from the resources page. We ask that you complete details of your name, institution and e-mail address for our records before you download.

In order to download and install, you will need the following installed on your Windows PC:

1. web browser
2. email client
3. software to unzip .zip files

Currently I am afraid that I can offer little support, having no funding in respect of this. Extensive help pages are also available for download.

Best wishes
Paul O'Higgins