Re: [R] mahalanobis

2007-06-02 Thread Michael Friendly
Yianni

You probably would have gotten more helpful replies if you had indicated
the substantive problem you were trying to solve.

From your description, it seems you want to calculate the
leverage of the predictors (X1, X2) in lm(y ~ X1 + X2).
My crystal ball says you may be an SPSS user, for whom
Mahalanobis D^2 of the predictors is what you have to beg
for to get leverages.  In R, you will get the most happiness
from
?leverage.plot
in the car package.

Mahalanobis D^2 values are linearly related to leverage
(h = 1/n + D^2/(n - 1) for a model with an intercept).
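
To make the link between D^2 and leverage concrete, here is a minimal sketch
on simulated data (the data, sample size and coefficients are made up for
illustration; only base R functions are used). For a model with an
intercept, the hat values should equal 1/n + D^2/(n - 1), where D^2 is
computed from the predictors alone:

```r
# Simulated data; Y, X1, X2 are the placeholder names used in this thread.
set.seed(42)
n  <- 50
X1 <- rnorm(n)
X2 <- rnorm(n)
Y  <- 1 + 2 * X1 - X2 + rnorm(n)

# Squared Mahalanobis distances of the predictors only (Y is not involved)
predictors <- cbind(X1, X2)
D2 <- mahalanobis(predictors, colMeans(predictors), cov(predictors))

# Leverages (hat values) from the fitted regression
h <- hatvalues(lm(Y ~ X1 + X2))

# Leverage as a linear function of D^2
same <- all.equal(unname(h), 1 / n + D2 / (n - 1))
same
```

If the check holds, ranking observations by D^2 of the predictors and
ranking them by leverage give the same order.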

-Michael


[EMAIL PROTECTED] wrote:
 Hi, I am not sure I am using the mahalanobis distance method correctly...
 Suppose I have a response variable Y and predictor variables X1 and X2
 
 all <- cbind(Y, X1, X2)
 mahalanobis(all, colMeans(all), cov(all));
 
 However, my results from this are different from the ones I am getting
 using another statistical software.
 
 I was reading that the comparison is with the means of the predictor
 variables which led me to think that the above should be transformed
 into:
 
 predictors <- cbind(X1, X2)
 mahalanobis(all, colMeans(predictors), cov(all))
 
 But still the results are different
 
 Am I doing something wrong or have I misunderstood something in the
 use of the function mahalanobis? Thanks.
 

-- 
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    http://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT  M3J 1P3 CANADA

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mahalanobis

2007-06-01 Thread gatemaze
On 31/05/07, Anup Nandialath [EMAIL PROTECTED] wrote:
 oops forgot the example

 try this line

 sqrt(mahalanobis(all, colMeans(predictors), cov(all), FALSE))
Hi and thanks for the reply, Anup. Unfortunately, I had a look at the
example before posting, but it was not much help... I did some further
tests, and in order to get the same results I must run mahalanobis
on the predictors-only dataset, i.e.
mahalanobis(predictors, colMeans(predictors), cov(predictors)).

Now, at first glance it seems a bit strange to me that the influence
of these points on a regression is measured without taking the
response variable into account (provided that the other stat software
calculates the Mahalanobis distances correctly), but I guess this
is something that I have to resolve by doing some studying of my own
on the Mahalanobis distance...
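
One way to look at the puzzle above: D^2 of the predictors (like leverage)
describes only how unusual a point is in predictor space, while a measure
such as Cook's distance also uses the residual and therefore the response.
The sketch below uses simulated data and an arbitrary shift of 10 in one
response value, both chosen purely for illustration:

```r
set.seed(1)
n  <- 30
X1 <- rnorm(n)
X2 <- rnorm(n)
Y  <- 1 + X1 + X2 + rnorm(n)

fit <- lm(Y ~ X1 + X2)

# Perturb only the response of the first observation
Y2    <- Y
Y2[1] <- Y2[1] + 10
fit2  <- lm(Y2 ~ X1 + X2)

# Leverage depends on the predictors only, so it does not change
same_leverage <- all.equal(hatvalues(fit), hatvalues(fit2))

# Cook's distance does depend on the response
cook_before <- cooks.distance(fit)[1]
cook_after  <- cooks.distance(fit2)[1]
c(before = unname(cook_before), after = unname(cook_after))
```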

thanks again.


 now cross check with other software

 best

 Anup


  




-- 
yianni



[R] mahalanobis

2007-05-31 Thread gatemaze
Hi, I am not sure I am using the mahalanobis distance method correctly...
Suppose I have a response variable Y and predictor variables X1 and X2

all <- cbind(Y, X1, X2)
mahalanobis(all, colMeans(all), cov(all));

However, my results from this are different from the ones I am getting
using another statistical software.

I was reading that the comparison is with the means of the predictor
variables which led me to think that the above should be transformed
into:

predictors <- cbind(X1, X2)
mahalanobis(all, colMeans(predictors), cov(all))

But still the results are different

Am I doing something wrong or have I misunderstood something in the
use of the function mahalanobis? Thanks.

-- 
yianni



[R] Mahalanobis distance and probability of group membership using Hotelling's T2 distribution

2007-02-20 Thread Mike White
I want to calculate the probability that a group will include a particular
point using the squared Mahalanobis distance to the centroid. I understand
that the squared Mahalanobis distance is distributed as chi-squared but that
for a small number of random samples from a multivariate normal population
the Hotelling's T2 (T squared) distribution should be used.
I cannot find a function for Hotelling's T2 distribution in R (although from
a previous post I have been provided with functions for the Hotelling Test).
My understanding is that the Hotelling's T2 distribution is related to the F
distribution using the equation:
 T2(u,v) = F(u, v-u+1)*vu/(v-u+1)
where u is the number of variables and v the number of group members.

I have written the R code below to compare the results from the chi-squared
distribution with the Hotelling's T2 distribution for probability of a
member being included within a group.
Please can anyone confirm whether or not this is the correct way to use
Hotelling's T2 distribution for probability of group membership. Also, when
testing a particular group member, is it preferable to leave that member out
when calculating the centre and covariance of the group for the Mahalanobis
distances?

Thanks
Mike White



## Hotelling T^2 distribution function
ph <- function(q, u, v, ...) {
  # q: vector of quantiles, as in function pf
  # u: number of independent variables
  # v: number of observations
  if (!(v > u + 1)) stop("v must be greater than u + 1")
  df1 <- u
  df2 <- v - u + 1
  pf(q * df2 / (v * u), df1, df2, ...)
}

# compare Chi-squared and Hotelling T^2 distributions for a group member
u <- 3
v <- 10
set.seed(1)
mat <- matrix(rnorm(v * u), nrow = v, ncol = u)
MD2 <- mahalanobis(mat, center = colMeans(mat), cov = cov(mat))
d <- MD2[order(MD2)]
# select a point midway between nearest and furthest from centroid
dm <- d[length(d) / 2]
1 - ph(dm, u, v)   # probability using Hotelling T^2 distribution
# [1] 0.6577069
1 - pchisq(dm, u)  # probability using Chi-squared distribution
# [1] 0.5538466
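
On the leave-one-out question, the following is only a sketch of the
mechanics, not a recommendation either way: each member's squared distance
is computed from a centre and covariance estimated on the remaining members.
The data reuse the simulated setup above; the name MD2_loo is made up here.

```r
set.seed(1)
u <- 3
v <- 10
mat <- matrix(rnorm(v * u), nrow = v, ncol = u)

# Squared distance of each row to the centroid of the other v - 1 rows,
# using the covariance of those other rows only
MD2_loo <- sapply(seq_len(v), function(i) {
  rest <- mat[-i, , drop = FALSE]
  mahalanobis(mat[i, ], center = colMeans(rest), cov = cov(rest))
})

# In-sample distances for comparison
MD2_all <- mahalanobis(mat, center = colMeans(mat), cov = cov(mat))
cbind(MD2_all, MD2_loo)
```

Note that leaving a member out requires the remaining v - 1 rows to still
give a non-singular covariance matrix, i.e. v - 1 must exceed u.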




[R] Mahalanobis distances

2005-06-24 Thread Karen Kotschy
Dear R community

Have just recently got back into R after a long break and have been amazed at 
how much it has grown, and how active the list is! Thank you so much to all 
those who contribute to this amazing project.

My question:
I am trying to calculate Mahalanobis distances for a matrix called fgmatrix

dim(fgmatrix)
[1] 76 15

fg.cov <- cov.wt(fgmatrix)
mahalanobis(fgmatrix, center = fg.cov$center, cov = fg.cov$cov)

Then I get the error message "Covariance matrix is apparently singular".

What does this mean? I can't see anything strange about the covariance matrix, 
and am not getting anywhere with the help files.

dim(fg.cov$cov)
[1] 15 15
length(fg.cov$center)
[1] 15


Thanks
-- 
Karen Kotschy
Centre for Water in the Environment
University of the Witwatersrand
Johannesburg
South Africa

P/Bag X3, Wits, 2050
Tel: +2711 717-6425



Re: [R] Mahalanobis distances

2005-06-24 Thread Spencer Graves
	  The first thing I'd try is scale, as that should not affect the 
Mahalanobis distances:

	  Fgmat <- scale(fgmatrix)
	  Fg.cov <- cov.wt(Fgmat)
	  mahalanobis(Fgmat, center = Fg.cov$center, cov = Fg.cov$cov)

	  Does this give you the same error?  If not, the problem was that 
fgmatrix was not sufficiently well conditioned to support this 
computation.
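
	  As a quick check of the claim that scaling should not change the
distances, here is a small sketch on simulated data with deliberately very
different column scales (the data and the scale factors are invented for
illustration):

```r
set.seed(1)
# Four columns whose standard deviations differ by factors of 10
X  <- matrix(rnorm(200), 50, 4) %*% diag(c(1, 10, 100, 1000))
Xs <- scale(X)   # centre each column and divide by its standard deviation

d_raw    <- mahalanobis(X,  colMeans(X),  cov(X))
d_scaled <- mahalanobis(Xs, colMeans(Xs), cov(Xs))

# The two sets of squared distances should agree up to rounding error
same <- all.equal(d_raw, d_scaled)
same
```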

	  If this does NOT solve the problem, I'd manually construct a ginverse 
of Fg.cov$cov, proceeding roughly as outlined in the following example:

set.seed(1)
X10 <- array(rnorm(760), dim = c(76, 10))
X15.10 <- cbind(X10, X10[, 1:5])   # 5 duplicated columns, so rank 10

fg.cov <- cov.wt(X15.10)
# Expected to fail, since the covariance matrix is singular
try(mahalanobis(X15.10, center = fg.cov$center, cov = fg.cov$cov))

(S15.10 <- eigen(fg.cov$cov, symmetric = TRUE))
# Only 10 non-zero eigenvalues
# Generalized inverse V D^-1 V' built from those 10 eigenpairs
fg.Info <- tcrossprod(S15.10$vectors[, 1:10] /
                      rep(sqrt(S15.10$values[1:10]), each = 15))
mahalanobis(X15.10, center = fg.cov$center,
            cov = fg.Info, inverted = TRUE)

  The key is computing your own generalized inverse and using that with 
inverted=TRUE.
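
	  The same idea can be sketched with a ready-made generalized inverse.
The block below assumes the MASS package (shipped with standard R
installations) and its ginv() function, checks the rank of the covariance
matrix with qr(), and compares the result with the distances computed from
the 10 non-redundant columns alone. The data are simulated as in the example
above.

```r
set.seed(1)
X10    <- matrix(rnorm(760), 76, 10)
X15.10 <- cbind(X10, X10[, 1:5])     # 15 columns, but only 10 are distinct

S <- cov(X15.10)
rank_S <- qr(S)$rank                 # rank below ncol(S) means S is singular
rank_S

# Moore-Penrose generalized inverse, passed in place of the inverse
S_ginv <- MASS::ginv(S)
D2 <- mahalanobis(X15.10, colMeans(X15.10), S_ginv, inverted = TRUE)

# Reference: distances from the 10 non-redundant columns
D2_ref <- mahalanobis(X10, colMeans(X10), cov(X10))
same <- all.equal(D2, D2_ref)
same
```

A rank smaller than the number of columns is what "apparently singular"
means in practice: some columns are exact (or nearly exact) linear
combinations of others.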

  spencer graves

-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]
www.pdf.com http://www.pdf.com
Tel:  408-938-4420
Fax: 408-280-7915



Re: [R] Mahalanobis distances

2005-06-24 Thread Christian Hennig
On Fri, 24 Jun 2005, Spencer Graves wrote:

(...)
 The key is computing your own generalized inverse and using that with
 inverted=TRUE.
(...)

One method to do this is function solvecov in package fpc.

Christian



*** NEW ADDRESS! ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
[EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche



[R] mahalanobis distance

2004-09-12 Thread Murli Nair
Is there a function that calculates the Mahalanobis distance in R?
The dist function calculates 'euclidean', 'maximum', 'manhattan', 
'canberra', 'binary' or 'minkowski'.
Thanks ../Murli



RE: [R] mahalanobis distance

2004-09-12 Thread Liaw, Andy
See (surprisingly enough) ?mahalanobis...

Andy



RE: [R] mahalanobis distance

2004-09-12 Thread John Fox
Dear Murli,

Try ?mahalanobis, which, by the way, is turned up by
help.search("mahalanobis").
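
For what it is worth, mahalanobis() returns squared distances from each row
to a single centre, whereas dist() returns pairwise distances. A minimal
sketch of both follows, on simulated data; the pairwise version is one
possible approach (transform the data with the Cholesky factor of the
covariance matrix, then call dist()), not a built-in dist() method.

```r
set.seed(1)
X <- matrix(rnorm(40), 10, 4)
S <- cov(X)

# Squared Mahalanobis distance of each row to the centroid
d2_center <- mahalanobis(X, colMeans(X), S)

# Pairwise Mahalanobis distances: with S = R'R (R from chol()),
# Euclidean distances between rows of X %*% solve(R) are Mahalanobis
# distances between rows of X
Z <- X %*% solve(chol(S))
D <- dist(Z)

# Check one pair against mahalanobis(), which returns the squared value
same <- all.equal(as.matrix(D)[1, 2]^2, mahalanobis(X[1, ], X[2, ], S))
same
```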

I hope this helps,
 John



[R] Mahalanobis

2004-03-26 Thread Alberto Murta
Dear all

Why isn't it possible to calculate Mahalanobis distances with R for a matrix 
with 1 row (observation) more than the number of columns (variables)?

> mydata <- matrix(runif(12,-5,5), 4, 3)
> mahalanobis(x=mydata, center=apply(mydata,2,mean), cov=var(mydata))
[1] 2.25 2.25 2.25 2.25

> mydata <- matrix(runif(420,-5,5), 21, 20)
> mahalanobis(x=mydata, center=apply(mydata,2,mean), cov=var(mydata))
 [1] 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762
[13] 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762

> mydata <- matrix(runif(132,-5,5), 12, 11)
> mahalanobis(x=mydata, center=apply(mydata,2,mean), cov=var(mydata))
 [1] 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333

Thanks in advance

Alberto Murta

> version
         _
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    1
minor    8.1
year     2003
month    11
day      21
language R

-- 
 Alberto G. Murta
Institute for Agriculture and Fisheries Research (INIAP-IPIMAR) 
Av. Brasilia, 1449-006 Lisboa, Portugal | Phone: +351 213027062
Fax:+351 213015948 | http://ipimar-iniap.ipimar.pt/pelagicos/



RE: [R] Mahalanobis

2004-03-26 Thread Liaw, Andy
If I'm not mistaken, the data you generated form a simplex in the
p-dimensional space.  Mahalanobis distance for such data, using the sample
mean and covariance, just gives the distance to the centroid after
normalization. The normalization step makes all the points equidistant from
the centroid.

To see this, try generating 3 points in 2D, and plot the principal component
scores:  You'll see the points on the vertices of a regular triangle.
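
A small sketch of this, on simulated data: with n = p + 1 points in general
position, every squared distance should come out as (n - 1)^2 / n, which
matches the values reported in the question (2.25 for n = 4, 19.04762 for
n = 21, 10.08333 for n = 12). The seed and the choice p = 3 are arbitrary.

```r
set.seed(123)
p <- 3
n <- p + 1
mydata <- matrix(runif(n * p, -5, 5), n, p)

# Squared Mahalanobis distances using the sample mean and covariance
D2 <- mahalanobis(mydata, colMeans(mydata), var(mydata))
D2

# Common value expected when n = p + 1
expected <- (n - 1)^2 / n
expected

same <- all.equal(D2, rep(expected, n))
same
```

So the function does compute the distances in this case; they are simply
all equal, and therefore carry no information about which points are
unusual.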

Andy
