Re: size correction discriminant functions analyses

2004-05-20 Thread morphmet
As I understand PCA, its main goal is to reduce the dimensionality of
a problem without losing too much information.  In other words,
according to Prof. Rohlf, the purpose of PCA is to give you a
low-dimensional space that accounts for as much variation as possible.
However, I agree with Oyvind that many scientists use PCA as a
visualization device, projecting a multivariate data set onto a sheet of paper.

On the other hand, testing for multivariate normality before applying any
multivariate data analysis technique is one of the most serious problems,
because in most cases nobody does it, and those who try may choose the
wrong test. We (biologists and paleontologists) really need a definite
guide to follow when we face this problem.

Best regards

---
Dr. Ashraf M. T. Elewa
Associate Professor
Geology Department
Faculty of Science
Minia University
Egypt
[EMAIL PROTECTED]
http://myprofile.cos.com/aelewa
- Original Message -
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, May 19, 2004 04:29 ?
Subject: Re: size correction  discriminant functions analyses


 Just a comment on this one, from a pragmatic point of view.

 It is of course true that PCA is only *guaranteed* to
 produce components maximizing variance if you have
 multivariate normality. The theory of PCA is based on this
 assumption. But in many cases, PCA is used purely as a
 visualization device, projecting a multivariate data set
 onto a sheet of paper so we can see it. For visualization
 of non-normal data, one could play around with different
 techniques, such as PCA, PCO, NMDS, projection pursuit etc.,
 and then find that PCA does (or does not) perform well
 for the given data set. There is no law against making
 any linear combination you want of your variates, if it
 reveals information. For example, PCA may be perfectly
 adequate for resolving two well-separated groups, if
 the within-group variance is relatively small.

 Of course, when using PCA for non-normal data one must
 be a little careful and not over-interpret the results
 (especially not the component loadings), but I think
 it's too harsh to dismiss its use totally.

 I'm sure the hard-liners will flame me to pieces for
 this email, but I hope they will at least give me
 credit for my courage  :-)


 Dr. Oyvind Hammer
 Geological Museum
 University of Oslo



  PCA Analysis assumes multivariate normality.
 
  Kathleen M. Robinette, Ph.D.
  Principal Research Anthropologist
  Air Force Research Laboratory



 ==
 Replies will be sent to list.
 For more information see http://life.bio.sunysb.edu/morph/morphmet.html.






Re: size correction discriminant functions analyses

2004-05-20 Thread morphmet
Don't know what happened to cause the earlier message largely void of content, but I 
think the original communication was to correct the Red Book reference.
The date is 1985, not 1982. -ds

-- 
Dennis E. Slice, Ph.D.
Department of Biomedical Engineering
Division of Radiologic Sciences
Wake Forest University School of Medicine
Winston-Salem, North Carolina, USA 
27157-1022
Phone: 336-716-5384
Fax: 336-716-2870





Re: size correction discriminant functions analyses

2004-05-20 Thread morphmet
Dear colleagues,

Regarding the discussion above on linear measurement data for multivariate 
analysis: most of the time my problem (and, I expect, the problem of many people 
who work with such data) is not the number of rows and columns (which is usually 
fine, at least in the cases I have seen), nor multivariate normality (I use the 
R program, which has a test of multivariate normality, so it is easy to check), 
nor lack of homogeneity of variances (this is a bit more dodgy, but the 
references I have seen state that testing univariate homogeneity of variances, 
e.g. with Bartlett's test, should give a good indication of the variances in the data). 
The problem that, I suppose, most biologists encounter is collinearity between 
variables, which strongly influences the representation given by PCA. I think 
this also happens in NMDS, discriminant analysis and canonical analysis.
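
A minimal sketch of one way to screen for the collinearity described above, in plain Python: compute pairwise Pearson correlations and flag variable pairs whose absolute correlation is high. The measurements, the helper names `pearson` and `collinear_pairs`, and the 0.9 cutoff are all made up for illustration and are not taken from this thread or from any package.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def collinear_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    `columns` maps a variable name to its list of values; the threshold
    is an arbitrary illustrative cutoff, not a recommended standard.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged

# Hypothetical measurements on five specimens.
measurements = {
    "length": [10.0, 12.0, 15.0, 20.0, 26.0],
    "width": [5.1, 5.9, 7.6, 10.1, 12.8],  # nearly proportional to length
    "spots": [3.0, 7.0, 2.0, 6.0, 4.0],    # unrelated to size
}
for name_a, name_b, r in collinear_pairs(measurements):
    print(name_a, name_b, round(r, 3))
```

With linear measurements, pairs flagged this way often reflect shared size variation, so a screen like this mostly shows where the dominant component is likely to come from.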

I probably did not make myself clear in my earlier email. I am sorry...
I find it very interesting that these things are debated on the list, and that 
different people offer different solutions and references. It is really nice.

Regarding the article from Biometrika, does anyone have the PDF? We don't 
have the journal at this college.
As for the robustness of the techniques to lack of normality, I agree 
with our colleague (so... I share your feeling of daring to state it... 
hehehe ;-))

Thank you all,
Cheers,
Marta




Re: size correction discriminant functions analyses

2004-05-20 Thread morphmet
I applaud your courage, Dr. Hammer.  I hope everyone appreciates how intimidating this 
list of experts can be. 

I also agree with your point that PCA can be used when the data are not multivariate 
normal if you are just using it to visualize information, or if you just know what it 
is doing for that matter.  I am a fan of using any and all analyses that help in 
figuring out what is happening.  However, in order to understand the results and what 
you are visualizing you have to understand both the data input and what the 
statistical analysis is doing.  Sometimes the information that seems to be revealed is 
an artifact of violation of the assumptions and if the observer doesn't realize this 
it is very easy to come to the wrong conclusion.   

I thought that what the analysis was doing and how to interpret it were the original 
questions we were discussing, although I admit to reading the e-mails quickly. The 
original e-mail indicated that perhaps size and shape confounding was causing their 
odd-looking results.  If the shapes are the same but the sizes are different, then the 
source of the non-normality would be multiple modes only.  This may not be a serious 
enough violation to cause interpretability problems.  However, it sounded to me from 
the description of the problem and the results that, in addition to multiple modes, 
there are multiple variance/covariance matrices. That was making the results difficult 
to interpret and, since PCA is based upon the variance/covariance matrix, it will 
produce components that are difficult to interpret or even invalid.  Separating the 
analysis into subgroups will allow them to visualize and test the differences in the 
modes and in the variance/covariance matrices, and in that way understand the source 
of the differences in the groups.  

Maybe the common PCA analysis someone else mentioned might do this as well.  I am 
not familiar with that method.

Thanx all again for your attention and patience,
Kath



Kathleen M. Robinette, Ph.D.
Principal Research Anthropologist
Air Force Research Laboratory
AFRL/HEPA
2800 Q Street
Wright-Patterson AFB, OH 45433-7947
(937) 255-8810
DSN 785-8810
FAX (937) 255-8752
e-mail:[EMAIL PROTECTED] 



Re: size correction discriminant functions analyses

2004-05-20 Thread morphmet
Dr. Hammer, Please consider your courage credited. -ds

A couple of points about PCA in general:

1) PCA makes no assumptions about the distribution (multivariate normal
or otherwise) of your data. It is a procedure that simply produces the
linear combinations of variables with maximum variance subject to
orthogonality to other such axes. Distribution assumptions only come
into play for (some) significance testing procedures.

2) PC1 will only identify size variation if size variation is the source
of the greatest variation in your sample. Sex, species, habitat, etc.
could all be determinants (not in the matrix sense 8-) ) of PC1 or some
combination of these.

In general, if you have data with some extreme outlier (e.g., a
transcription error), then PC1 will (probably) just point to (or pi
radians away from) the direction of that outlier relative to the main
sample, which will still be the linear combination of maximum variance.
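
A minimal sketch of the points above, in plain Python with made-up data (the helper names `covariance_matrix`, `first_pc` and `projected_variance` are illustrative and not from any package mentioned in this thread). PC1 is computed as the leading eigenvector of the sample covariance matrix by power iteration, with no distributional assumption, on deliberately non-normal (uniform) data. Adding a single extreme value should then swing PC1 toward that outlier.

```python
import math
import random

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_pc(cov, iterations=200):
    """Leading eigenvector (unit length) of a covariance matrix.

    Uses power iteration from an all-equal start vector, which is assumed
    not to be orthogonal to the leading eigenvector.
    """
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def projected_variance(rows, direction):
    """Sample variance of the observations projected onto a unit vector."""
    scores = [sum(r[j] * direction[j] for j in range(len(direction)))
              for r in rows]
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores) / (len(scores) - 1)

# Hypothetical, deliberately non-normal data: uniform, elongated along x.
random.seed(1)
sample = [(random.uniform(-5, 5), random.uniform(-1, 1)) for _ in range(50)]
pc1 = first_pc(covariance_matrix(sample))
print("PC1 without outlier:", [round(c, 3) for c in pc1])

# One extreme value (e.g. a transcription error) far out along y.
with_outlier = sample + [(0.0, 100.0)]
pc1_outlier = first_pc(covariance_matrix(with_outlier))
print("PC1 with outlier:   ", [round(c, 3) for c in pc1_outlier])
```

In this toy case PC1 lies close to the x axis before the outlier is added and close to the y axis afterwards. In both cases it is the unit direction of greatest projected variance, which can be checked by comparing `projected_variance` along PC1 with other directions.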

What people often want PCA to do is either a) identify iso/allometry
due to size variation in a sample or b) separate out sexes, species, or
other groups. PCA is optimal for neither of these and could be quite
misleading in both cases.

If you are interested in size relationships, regress variables on some
meaningful measure of size. If you are interested in group differences,
look into CVA. 
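
A minimal sketch of the size-regression suggestion only (CVA is not shown), in plain Python. It assumes a log-log regression of one measurement on a single size measure; the log transform, the choice of size variable, the function names and the numbers are illustrative assumptions, not something prescribed in this thread.

```python
import math

def ols_fit(x, y):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def size_regression(size, trait):
    """Regress log(trait) on log(size); return the slope and the residuals."""
    log_size = [math.log(s) for s in size]
    log_trait = [math.log(t) for t in trait]
    intercept, slope = ols_fit(log_size, log_trait)
    residuals = [lt - (intercept + slope * ls)
                 for ls, lt in zip(log_size, log_trait)]
    return slope, residuals

# Made-up, noise-free data generated as trait = 2 * size ** 1.3.
size = [10.0, 12.0, 15.0, 20.0, 26.0]
trait = [2.0 * s ** 1.3 for s in size]
slope, residuals = size_regression(size, trait)
print("slope on log-log scale:", round(slope, 3))
```

When both variables are linear measurements on a log-log scale, a slope near 1 is usually read as isometry and a slope away from 1 as allometry. The residuals are the part of the trait not predicted by size under this simple model.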

If you have many more variables than specimens, you might do either of
the above in a reduced PCA space if you check carefully to see if your
limited data suggest you are capturing salient aspects of a space of
reduced dimension resulting from the tight correlations amongst your
variables. Otherwise, you must wave your hands vigorously before
proceeding.

See Marcus 1990 Blue Book chapter for a nice discussion of PCA and
related methods. 

Books by Jackson and Jolliffe and other authors specifically on Principal
Components are available.

-ds


-- 
Dennis E. Slice, Ph.D.
Department of Biomedical Engineering
Division of Radiologic Sciences
Wake Forest University School of Medicine
Winston-Salem, North Carolina, USA 
27157-1022
Phone: 336-716-5384
Fax: 336-716-2870





RE: morphologika now available for free download

2004-05-20 Thread morphmet
Dear Colleagues,

Nicholas Jones and I are pleased to announce that morphologika, a
Windows-based program for 3D geometric morphometrics, is available
at:

http://www.york.ac.uk/res/fme/index.htm

It can be downloaded from the resources page.

We ask that you complete details of your name, institution and e-mail
address for our records before you download.

To download and install it, you will need the following installed on
your Windows PC:

1. web browser
2. email client
3. software to unzip .zip files

I am afraid that I can currently offer little support, as I have no
funding for this. Extensive help pages are, however, also available for
download.


Best wishes

Paul O'Higgins