Re: [MORPHMET] Mahalanobis distance in cluster analysis of shape variables

Joseph Kunkel Sat, 30 Jan 2016 05:43:03 -0800

I cannot speak directly to why it is frequently used in GM cluster analysis, 
but I would like to mention how I look at Mahalanobis distance based on its 
calculation.

Mahalanobis distance is not a pure distance metric like Euclidean or Manhattan 
distance; as you have stated, it is ‘standardized’.  What does that really mean?  
It sounds superficially good.
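
As a point of reference for what ‘standardized’ means, the two squared distances can be written side by side. The symbols are notation assumed here for illustration and do not come from the original question: x and y are two vectors of shape variables, and S is the covariance matrix of those variables (commonly the pooled within-group covariance).

```latex
D_E^2(\mathbf{x}, \mathbf{y}) = (\mathbf{x} - \mathbf{y})^{\mathsf{T}} (\mathbf{x} - \mathbf{y}),
\qquad
D_M^2(\mathbf{x}, \mathbf{y}) = (\mathbf{x} - \mathbf{y})^{\mathsf{T}} \mathbf{S}^{-1} (\mathbf{x} - \mathbf{y})
```

When S is the identity matrix the two coincide; otherwise S^{-1} rescales each direction by its variance and discounts directions that are correlated.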

One way of computing it is to rotate the k-landmark data set to simplest form, 
treating the landmarks as factors.  This way would consider all landmarks to 
have a common covariance structure in XY, or XYZ in three dimensions.  That is 
already a stretch, since not all landmarks can be assumed to have the same 
covariance structure.  In addition, the landmarks have all already been centered 
about their centroid and rotated to coincide, which has eliminated a degree of 
freedom of variability, and that can have consequences.

Furthermore, not all species' landmarks can be expected to have the same 
covariance structure, which is an assumption made in the ordinary Mahalanobis 
distance application to strut analysis between populations or species.  The 
assumption of similar data structure of course applies to the null hypothesis, 
where there is no difference.  The typical statistical test explodes when the 
null hypothesis is falsified, so just when you want the Mahalanobis distance 
metric to be accurate it starts misbehaving.
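
The sensitivity to the common-covariance assumption can be illustrated with a small simulation. This is only a sketch with made-up two-dimensional data, not landmark data; the group means, covariances, sample sizes and the helper names `mahalanobis_sq` and `pooled_cov` are all assumptions chosen for illustration.

```python
import numpy as np


def mahalanobis_sq(diff, cov):
    """Squared Mahalanobis length of diff under covariance cov."""
    # Solving the linear system avoids forming the explicit inverse.
    return float(diff @ np.linalg.solve(cov, diff))


def pooled_cov(a, b):
    """Pooled within-group covariance of two samples (rows = specimens)."""
    na, nb = len(a), len(b)
    cov_a = np.cov(a, rowvar=False)
    cov_b = np.cov(b, rowvar=False)
    return ((na - 1) * cov_a + (nb - 1) * cov_b) / (na + nb - 2)


rng = np.random.default_rng(0)
# Hypothetical groups: same shape of scatter in y, very different in x.
group_a = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], size=200)
group_b = rng.multivariate_normal([2.0, 0.0], [[9.0, 0.0], [0.0, 1.0]], size=200)

diff = group_b.mean(axis=0) - group_a.mean(axis=0)

d2_pooled = mahalanobis_sq(diff, pooled_cov(group_a, group_b))
d2_a = mahalanobis_sq(diff, np.cov(group_a, rowvar=False))
d2_b = mahalanobis_sq(diff, np.cov(group_b, rowvar=False))

print("D^2 with pooled covariance: ", d2_pooled)
print("D^2 with group A covariance:", d2_a)
print("D^2 with group B covariance:", d2_b)
```

With these settings the distance between the same two means should come out several times larger when scaled by the tight group's covariance than by the broad group's, with the pooled value in between, so the answer depends on which covariance one takes to be the common one.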

After rotation to simplest axes one does a 1 df F-test between each of the 
landmarks.  These tests are all independent, so they can be summed together to 
produce a k df F-test, which is Mahalanobis D squared.  So Mahalanobis D is 
the square root of the sum of independent F-tests, but those F-tests are based 
on all sorts of assumptions about the variance of the landmarks.  I imagine one 
could modify the calculation of D by limiting the sum to the principal 
components that account for the top 95 or 99% of the variance.
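
The decomposition described above, and the suggested truncation, can be sketched in NumPy. This is a minimal illustration under stated assumptions, not a GM package's implementation: the function name `mahalanobis_sq_by_component`, the `var_fraction` parameter and the random data are hypothetical. Each term is the squared difference along one principal axis divided by that axis's variance, which plays the role of the per-axis statistic up to sample-size factors.

```python
import numpy as np


def mahalanobis_sq_by_component(diff, cov, var_fraction=1.0):
    """Return (D squared, per-axis terms) using the leading principal axes.

    diff         : difference between two mean vectors
    cov          : covariance matrix assumed common to both groups
    var_fraction : keep the leading axes that reach this share of variance
    """
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder: largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    cum_share = np.cumsum(eigvals) / eigvals.sum()
    # Smallest number of axes whose cumulative share reaches var_fraction.
    n_keep = int(np.searchsorted(cum_share, var_fraction - 1e-12)) + 1
    n_keep = min(n_keep, len(eigvals))

    scores = eigvecs[:, :n_keep].T @ diff       # difference along each kept axis
    terms = scores**2 / eigvals[:n_keep]        # one standardized term per axis
    return float(terms.sum()), terms


rng = np.random.default_rng(1)
# Made-up correlated variables standing in for shape variables.
data = rng.normal(size=(100, 6)) @ rng.normal(size=(6, 6))
cov = np.cov(data, rowvar=False)
diff = data[:50].mean(axis=0) - data[50:].mean(axis=0)

d2_full, _ = mahalanobis_sq_by_component(diff, cov)
d2_direct = float(diff @ np.linalg.solve(cov, diff))   # usual quadratic form
d2_top95, kept = mahalanobis_sq_by_component(diff, cov, var_fraction=0.95)

print("sum over all axes:      ", d2_full)
print("direct quadratic form:  ", d2_direct)
print("sum over top-95% axes:  ", d2_top95, "using", len(kept), "axes")
```

Summing over all axes reproduces the usual quadratic form, while the truncated sum drops the low-variance axes, whose terms are divided by small eigenvalues and so are the least stable.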

Many times applications of analytical techniques are judged by whether they 
‘work’ or not.   If a clustering method works for you, use it(?).  I am of the 
opinion that I use statistics to convince myself rather than the audience.   A 
confluence of many arguments is used to make a case.

Joe

-·.  .· ·.  .><((((º>·.  .· ·.  .><((((º>·.  .· ·.  .><((((º> .··.· >=-       
=º}}}}}><
Joseph G. Kunkel, Research Professor
UNE Biddeford ME 04005
http://www.bio.umass.edu/biology/kunkel/

> On Jan 30, 2016, at 7:11 AM, Elahep <ellie.parv...@gmail.com> wrote:
> 
> 
> Hello all,
> 
> 
> 
> I have seen in many GM articles people use Mahalanobis distance for cluster 
> analysis. What is the advantage of using Mahalanobis distance over Euclidean 
> distance as a similarity measure in cluster analysis of shape variables?
> 
> As far as I know, Mahalanobis distance is the standardized form of Euclidean 
> distance, which standardizes the data, with adjustments made for correlation 
> between variables, and weights all variables equally.
> 
> Why is this distance measure frequently used in GM cluster analysis?
> 
> 
> 
> Thanks in advance
> 
> Elahe
> 
> 
> -- 
> MORPHMET may be accessed via its webpage at http://www.morphometrics.org
> --- 
> You received this message because you are subscribed to the Google Groups 
> "MORPHMET" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to morphmet+unsubscr...@morphometrics.org.
