-------- Original Message --------
Date:   Fri, 29 Feb 2008 11:05:27 -0800 (PST)
From:   Elsa et Stéphane BOUEE <[EMAIL PROTECTED]>
To:     <morphmet@morphometrics.org>



Speaking of the Mahalanobis distance (D), I have a question/remark.

Due to random fluctuations in a finite number of observations, D is not
zero even when the groups do not differ, and it increases with the number
of variables.

Markus has proposed a formula that takes this into account (I could
not find the mathematical derivation of this formula):



Corrected(D) = [(n1+n2-p-3)*D/(n1+n2-2)] - [(n1+n2)*p/(n1*n2)]

With: D = Mahalanobis distance

      n1 and n2: number of observations in the 2 groups

      p: number of variables
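
A minimal Python sketch of the formula above may help when checking values by hand. The function name corrected_d is hypothetical, and the second term is read as (n1+n2)*p divided by the product n1*n2. Corrections of this form are usually stated for the squared distance, so whether D here is squared or not is an assumption the message does not settle.

```python
def corrected_d(d, n1, n2, p):
    """Apply the quoted correction to a Mahalanobis distance d.

    d  : distance computed on the data (possibly the squared distance,
         which is an assumption; the message only says "D")
    n1 : number of observations in group 1
    n2 : number of observations in group 2
    p  : number of variables
    """
    if n1 < 1 or n2 < 1 or n1 + n2 <= 2:
        raise ValueError("need at least one observation per group and n1 + n2 > 2")
    shrink = (n1 + n2 - p - 3) / (n1 + n2 - 2)   # multiplicative term
    offset = (n1 + n2) * p / (n1 * n2)           # subtractive term
    return shrink * d - offset


# With 20 observations per group and 5 variables, the subtractive term is 0.5,
# so any d at or below 0.5 is pushed below zero.
print(round(corrected_d(0.5, 20, 20, 5), 4))  # -0.0789
```

The example shows how negative values arise: the subtractive term does not depend on D, so a small observed distance can fall below it.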



I applied this formula to a dataset and obtained negative values (even
with as few as 5 variables), which is awkward for a distance.



Therefore, I used another method to account for this bias. I randomly
permuted the variables across the observations (I cannot use my hands
either, but I hope everyone can understand) and calculated 10000 random
values of D with this method. Then, I subtracted the mean of those random
D values from the true D calculated on my dataset.
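
The procedure described above can be sketched in Python roughly as follows. This is only one possible reading: it assumes that the permutation reassigns observations to the two groups at random (shuffling group labels), that D is the squared Mahalanobis distance with a pooled covariance matrix, and that the pooled covariance is non-singular. All function names are hypothetical, and only the standard library is used to keep the sketch dependency-free.

```python
import random


def mean_vector(rows):
    """Column means of a list of observations."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter(rows, mean):
    """Sum of outer products of the centred observations (p x p)."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for row in rows:
        c = [x - m for x, m in zip(row, mean)]
        for i in range(p):
            for j in range(p):
                s[i][j] += c[i] * c[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes a is non-singular (otherwise a ZeroDivisionError is raised).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mahalanobis_sq(g1, g2):
    """Squared Mahalanobis distance between two group means."""
    m1, m2 = mean_vector(g1), mean_vector(g2)
    s1, s2 = scatter(g1, m1), scatter(g2, m2)
    dof = len(g1) + len(g2) - 2
    pooled = [[(a + b) / dof for a, b in zip(r1, r2)] for r1, r2 in zip(s1, s2)]
    diff = [a - b for a, b in zip(m1, m2)]
    return sum(d * w for d, w in zip(diff, solve(pooled, diff)))


def permutation_baseline(g1, g2, n_perm=1000, seed=0):
    """Mean distance over random reassignments of observations to groups."""
    rng = random.Random(seed)
    rows = list(g1) + list(g2)   # copy, so the inputs are not modified
    n1 = len(g1)
    total = 0.0
    for _ in range(n_perm):
        rng.shuffle(rows)
        total += mahalanobis_sq(rows[:n1], rows[n1:])
    return total / n_perm


# Simulated data (not from the original dataset): 2 groups, 15 observations, 3 variables.
demo_rng = random.Random(1)
group1 = [[demo_rng.gauss(0, 1) for _ in range(3)] for _ in range(15)]
group2 = [[demo_rng.gauss(1, 1) for _ in range(3)] for _ in range(15)]
observed = mahalanobis_sq(group1, group2)
baseline = permutation_baseline(group1, group2, n_perm=200)
print(observed, baseline, observed - baseline)
```

Note that the difference observed - baseline can still be negative when the observed distance happens to fall below the permutation mean, so this approach does not by itself guarantee a non-negative result.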



Am I correct in doing so?

Does anyone know of a better (exact, mathematical) way to correct D
without getting negative values?



Thank you for your answers



Stéphane BOUEE




--
Replies will be sent to the list.
For more information visit http://www.morphometrics.org
