-------- Original Message --------
Date: Fri, 29 Feb 2008 11:05:27 -0800 (PST)
From: Elsa et Stéphane BOUEE <[EMAIL PROTECTED]>
To: <morphmet@morphometrics.org>

Speaking of the Mahalanobis distance (D), I have a question/remark. Because of random fluctuation in a finite number of observations, D is not zero even when the two groups do not truly differ, and it increases with the number of variables. Markus has proposed a formula that takes this into account (I did not find the mathematical derivation of this formula):

Corrected(D) = [(n1 + n2 - p - 3) * D / (n1 + n2 - 2)] - [(n1 + n2) * p / (n1 * n2)]

With:
D = Mahalanobis distance
n1 and n2 = number of observations in the 2 groups
p = number of variables

I applied this formula to a dataset and obtained negative results (even with a small number of variables (5)), which is embarrassing for a distance…

Therefore, I used another method to account for this bias. I randomly permuted the variables among the observations (I cannot use my hands either, but I hope everyone can understand) and calculated 10000 random D values this way. Then I subtracted the mean of those random D values from the true D calculated on my dataset.

Am I correct in doing so? Does anyone have an idea of a better (mathematically exact) way to correct D without getting negative values?

Thank you for your answers,
Stéphane BOUEE

--
Replies will be sent to the list. For more information visit http://www.morphometrics.org
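
The two corrections discussed in the message can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the poster's actual code: it assumes D denotes the squared Mahalanobis distance between group means computed with a pooled covariance matrix, and it reads the permutation step as a random reshuffling of observations between the two groups. The function names (`mahalanobis_d2`, `corrected_d2`, `permutation_baseline`) and the simulated data are hypothetical.

```python
import numpy as np


def mahalanobis_d2(x1, x2):
    """Squared Mahalanobis distance between two group means.

    Assumption: the pooled within-group covariance is used.
    x1, x2 are arrays of shape (n_observations, n_variables).
    """
    n1, n2 = len(x1), len(x2)
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    cov1 = np.atleast_2d(np.cov(x1, rowvar=False))
    cov2 = np.atleast_2d(np.cov(x2, rowvar=False))
    pooled = ((n1 - 1) * cov1 + (n2 - 1) * cov2) / (n1 + n2 - 2)
    return float(diff @ np.linalg.solve(pooled, diff))


def corrected_d2(d2, n1, n2, p):
    """Correction formula quoted in the message, with explicit parentheses."""
    return (n1 + n2 - p - 3) * d2 / (n1 + n2 - 2) - (n1 + n2) * p / (n1 * n2)


def permutation_baseline(x1, x2, n_perm=1000, seed=0):
    """Mean D over random reassignments of observations to the two groups.

    Assumption: this is one reading of the permutation step in the message.
    """
    rng = np.random.default_rng(seed)
    stacked = np.vstack([x1, x2])
    n1 = len(x1)
    values = np.empty(n_perm)
    for i in range(n_perm):
        idx = rng.permutation(len(stacked))
        values[i] = mahalanobis_d2(stacked[idx[:n1]], stacked[idx[n1:]])
    return float(values.mean())


# Hypothetical data: two groups drawn from the same distribution, 5 variables.
rng = np.random.default_rng(42)
group1 = rng.normal(size=(10, 5))
group2 = rng.normal(size=(12, 5))

d2 = mahalanobis_d2(group1, group2)
print("observed D:", d2)
print("formula-corrected D:", corrected_d2(d2, len(group1), len(group2), 5))
print("permutation-corrected D:", d2 - permutation_baseline(group1, group2))
```

Both corrections subtract an estimate of the expected chance-level value from the observed one, so either can come out negative when the observed D falls below that expected value. For example, with n1 = n2 = 10, p = 5 and D = 1, the formula gives 12/18 - 1, which is about -0.33.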