Re: [R] mahalanobis

2007-06-02 Thread Michael Friendly
Yianni

You probably would have gotten more helpful replies if you indicated
the substantive problem you were trying to solve.

From your description, it seems you want to calculate the
leverage of the predictors (X1, X2) in lm(y ~ X1 + X2).
My crystal ball says you may be an SPSS user, for whom
mahalanobis D^2 of the predictors is what you have to beg
for to get leverages.  In R, you will get the most happiness
from
?leverage.plot
in the car package.

Mahalanobis D^2 values are linearly related to leverage:
h = 1/n + D^2/(n - 1) for a model with an intercept.
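
A minimal sketch of that relationship, using simulated data (the variables X1, X2 and Y below are made up for illustration). It assumes a linear model with an intercept, in which case the hat values should equal 1/n + D^2/(n - 1), where D^2 is computed from the predictors only. The response does not enter the calculation at all, which is why the predictors-only call is the one that matches leverage-based output from other software.

```r
set.seed(1)
n  <- 50
X1 <- rnorm(n)                    # simulated predictors (illustration only)
X2 <- 0.5 * X1 + rnorm(n)
Y  <- 1 + X1 - X2 + rnorm(n)      # simulated response

predictors <- cbind(X1, X2)

# squared Mahalanobis distance of each row from the predictor centroid
D2 <- mahalanobis(predictors, colMeans(predictors), cov(predictors))

# leverages (diagonal of the hat matrix) from the fitted model
h <- unname(hatvalues(lm(Y ~ X1 + X2)))

# leverage is a linear function of D^2
leverage_from_D2 <- 1 / n + D2 / (n - 1)
all.equal(h, leverage_from_D2)
```

Replacing Y by any other response leaves both h and D2 unchanged.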

-Michael


[EMAIL PROTECTED] wrote:
> Hi, I am not sure I am using the mahalanobis distance method correctly...
> Suppose I have a response variable Y and predictor variables X1 and X2
> 
> all <- cbind(Y, X1, X2)
> mahalanobis(all, colMeans(all), cov(all));
> 
> However, my results from this are different from the ones I am getting
> using another statistical software.
> 
> I was reading that the comparison is with the means of the predictor
> variables which led me to think that the above should be transformed
> into:
> 
> predictors <- cbind(X1, X2)
> mahalanobis(all, colMeans(predictors), cov(all))
> 
> But still the results are different
> 
> Am I doing something wrong or have I misunderstood something in the
> use of the function mahalanobis? Thanks.
> 

-- 
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    http://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT  M3J 1P3 CANADA

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mahalanobis

2007-06-01 Thread gatemaze
On 31/05/07, Anup Nandialath <[EMAIL PROTECTED]> wrote:
> oops forgot the example
>
> try this line
>
> sqrt(mahalanobis(all, colMeans(predictors), cov(all), FALSE))
Hi, and thanks for the reply, Anup. Unfortunately, I had looked at the
example before posting, but it was not much help... I did some further
tests, and to get the same results I must run mahalanobis
on the predictors-only dataset, i.e.
mahalanobis(predictors, colMeans(predictors), cov(predictors)).

Now, at first glance it seems a bit strange to me that the influence
of these points on a regression is measured without taking the
response variable into account (provided that the other stat software
calculates the mahalanobis distances correctly), but I guess this
is something that I have to resolve by doing some studying on my own
on the mahalanobis distance...

thanks again.

>
> now cross check with other software
>
> best
>
> Anup
>
>
>  
>
>


-- 
yianni



[R] mahalanobis

2007-05-31 Thread gatemaze
Hi, I am not sure I am using the mahalanobis distance method correctly...
Suppose I have a response variable Y and predictor variables X1 and X2

all <- cbind(Y, X1, X2)
mahalanobis(all, colMeans(all), cov(all));

However, my results from this are different from the ones I am getting
using another statistical software.

I was reading that the comparison is with the means of the predictor
variables which led me to think that the above should be transformed
into:

predictors <- cbind(X1, X2)
mahalanobis(all, colMeans(predictors), cov(all))

But still the results are different

Am I doing something wrong or have I misunderstood something in the
use of the function mahalanobis? Thanks.

-- 
yianni



[R] Mahalanobis distance and probability of group membership using Hotelling's T2 distribution

2007-02-20 Thread Mike White
I want to calculate the probability that a group will include a particular
point using the squared Mahalanobis distance to the centroid. I understand
that the squared Mahalanobis distance is distributed as chi-squared but that
for a small number of random samples from a multivariate normal population
Hotelling's T2 (T squared) distribution should be used.
I cannot find a function for Hotelling's T2 distribution in R (although from
a previous post I have been provided with functions for the Hotelling Test).
My understanding is that the Hotelling's T2 distribution is related to the F
distribution using the equation:
 T2(u,v) = F(u, v-u+1) * v*u/(v-u+1)
where u is the number of variables and v the number of group members.

I have written the R code below to compare the results from the chi-squared
distribution with the Hotelling's T2 distribution for probability of a
member being included within a group.
Please can anyone confirm whether or not this is the correct way to use
Hotelling's T2 distribution for probability of group membership. Also, when
testing a particular group member, is it preferable to leave that member out
when calculating the centre and covariance of the group for the Mahalanobis
distances?

Thanks
Mike White



## Hotelling T^2 distribution function
ph <- function(q, u, v, ...) {
  # q: vector of quantiles, as in pf()
  # u: number of variables
  # v: number of observations (group members)
  # ...: further arguments passed on to pf()
  if (v <= u + 1) stop("v must be greater than u+1")
  df1 <- u
  df2 <- v - u + 1
  pf(q * df2 / (v * u), df1, df2, ...)
}

# compare Chi-squared and Hotelling T^2 distributions for a group member
u<-3
v<-10
set.seed(1)
mat<-matrix(rnorm(v*u), nrow=v, ncol=u)
MD2<-mahalanobis(mat, center=colMeans(mat), cov=cov(mat))
d<-MD2[order(MD2)]
# select a point midway between nearest and furthest from centroid
dm<-d[length(d)/2]
1-ph(dm,u,v)    # probability using Hotelling T^2 distribution
# [1] 0.6577069
1-pchisq(dm, u) # probability using Chi-squared distribution
# [1] 0.5538466
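
For comparison only, and not as a confirmation of the code above, here is a hedged sketch of the reference distributions usually quoted for squared Mahalanobis distances computed with an estimated centre and covariance, assuming multivariate normal data. For a point that was itself used in the estimates, n*D^2/(n-1)^2 is usually taken to follow Beta(p/2, (n-p-1)/2). For a new, independent point, D^2 * n*(n-p) / (p*(n-1)*(n+1)) is usually taken to follow F(p, n-p). The function name md2_tail and its arguments are hypothetical, introduced only for this illustration; please check the results against a multivariate analysis text before relying on them.

```r
# Upper-tail probability of a squared Mahalanobis distance d2
# p: number of variables, n: observations used for centre and covariance
# in_sample = TRUE : the point was one of the n observations (scaled Beta)
# in_sample = FALSE: the point is new and independent (scaled F)
md2_tail <- function(d2, p, n, in_sample = TRUE) {
  stopifnot(n > p + 1)
  if (in_sample) {
    pbeta(n * d2 / (n - 1)^2, p / 2, (n - p - 1) / 2, lower.tail = FALSE)
  } else {
    pf(d2 * n * (n - p) / (p * (n - 1) * (n + 1)), p, n - p,
       lower.tail = FALSE)
  }
}

set.seed(1)
p <- 3
n <- 10
mat <- matrix(rnorm(n * p), nrow = n, ncol = p)
MD2 <- mahalanobis(mat, center = colMeans(mat), cov = cov(mat))

p_in  <- md2_tail(MD2, p, n, in_sample = TRUE)   # members of the group
p_new <- md2_tail(MD2, p, n, in_sample = FALSE)  # same distances, read as new points
cbind(MD2, p_in, p_new, p_chisq = pchisq(MD2, p, lower.tail = FALSE))
```

The two cases differ because a group member pulls the estimated centre and covariance towards itself, which bears on the leave-one-out question asked above.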



Re: [R] Mahalanobis distances

2005-06-24 Thread Christian Hennig
On Fri, 24 Jun 2005, Spencer Graves wrote:

(...)
> The key is computing your own generalized inverse and using that with
> "inverted=TRUE".
(...)

One method to do this is function solvecov in package fpc.
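
Another option, sketched here under assumptions, is the Moore-Penrose inverse from ginv() in the MASS package, passed to mahalanobis() with inverted = TRUE. The rank-deficient matrix below is simulated for illustration (10 independent columns plus 5 exact copies); it is not the poster's fgmatrix. For this construction the distances should agree, up to rounding, with those computed from the 10 independent columns alone.

```r
library(MASS)   # for ginv()

set.seed(1)
X10    <- array(rnorm(760), dim = c(76, 10))
X15.10 <- cbind(X10, X10[, 1:5])   # 15 columns, but only rank 10

fg.cov  <- cov.wt(X15.10)
fg.ginv <- ginv(fg.cov$cov)        # generalized inverse of the singular covariance

d2.ginv <- mahalanobis(X15.10, center = fg.cov$center,
                       cov = fg.ginv, inverted = TRUE)

# reference: distances from the 10 independent columns only
d2.full <- mahalanobis(X10, center = colMeans(X10), cov = cov(X10))

all.equal(d2.ginv, d2.full)
```

With real data the redundancy is rarely exact, so the tol argument of ginv() may need attention.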

Christian

>
> spencer graves
>
> Karen Kotschy wrote:
> > Dear R community
> >
> > Have just recently got back into R after a long break and have been
> > amazed at
> > how much it has grown, and how active the list is! Thank you so much to all
> > those who contribute to this amazing project.
> >
> > My question:
> > I am trying to calculate Mahalanobis distances for a matrix called "fgmatrix"
> >
> >
> >>dim(fgmatrix)
> >
> > [1] 76 15
> >
> >
> >>fg.cov <- cov.wt(fgmatrix)
> >>mahalanobis(fgmatrix, center = fg.cov$center, cov = fg.cov$cov)
> >
> >
> > Then I get an error message "Covariance matrix is apparently singular"
> >
> > What does this mean? I can't see anything strange about the covariance
> > matrix, and am not getting anywhere with the help files.
> >
> >
> >>dim(fg.cov$cov)
> >
> > [1] 15 15
> >
> >>length(fg.cov$center)
> >
> > [1] 15
> >
> >
> > Thanks
>
> --
> Spencer Graves, PhD
> Senior Development Engineer
> PDF Solutions, Inc.
> 333 West San Carlos Street Suite 700
> San Jose, CA 95110, USA
>
> [EMAIL PROTECTED]
> www.pdf.com 
> Tel:  408-938-4420
> Fax: 408-280-7915
>
>

*** NEW ADDRESS! ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
[EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche



Re: [R] Mahalanobis distances

2005-06-24 Thread Spencer Graves
  The first thing I'd try is "scale", as that should not affect the
Mahalanobis distances:

  Fgmat <- scale(fgmatrix)
  Fg.cov <- cov.wt(Fgmat)
  mahalanobis(Fgmat, center = Fg.cov$center, cov = Fg.cov$cov)

  Does this give you the same error?  If not, the problem was that
fgmatrix was not sufficiently well conditioned to support this
computation.

  If this does NOT solve the problem, I'd manually construct a generalized
inverse of Fg.cov$cov, proceeding roughly as outlined in the following example:

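set.seed(1)
X10 <- array(rnorm(760), dim=c(76, 10))
X15.10 <- cbind(X10, X10[,1:5])   # 15 columns, but only rank 10

fg.cov <- cov.wt(X15.10)
# the covariance is singular, so this call is expected to stop with an error
try(mahalanobis(X15.10, center = fg.cov$center, cov = fg.cov$cov))

(S15.10 <- eigen(fg.cov$cov, symmetric=TRUE))
# Only 10 non-zero eigenvalues
V10 <- S15.10$vectors[,1:10]
# generalized inverse V D^-1 V' (15 x 15), built from the 10 non-zero eigenpairs
fg.Info <- tcrossprod(V10 / rep(sqrt(S15.10$values[1:10]), each=15))
mahalanobis(X15.10, center = fg.cov$center,
            cov = fg.Info, inverted=TRUE)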

  The key is computing your own generalized inverse and using that with 
"inverted=TRUE".

  spencer graves

Karen Kotschy wrote:
> Dear R community
> 
> Have just recently got back into R after a long break and have been amazed at 
> how much it has grown, and how active the list is! Thank you so much to all 
> those who contribute to this amazing project.
> 
> My question:
> I am trying to calculate Mahalanobis distances for a matrix called "fgmatrix"
> 
> 
>>dim(fgmatrix)
> 
> [1] 76 15
> 
> 
>>fg.cov <- cov.wt(fgmatrix)
>>mahalanobis(fgmatrix, center = fg.cov$center, cov = fg.cov$cov)
> 
> 
> Then I get an error message "Covariance matrix is apparently singular"
> 
> What does this mean? I can't see anything strange about the covariance
> matrix, and am not getting anywhere with the help files.
> 
> 
>>dim(fg.cov$cov)
> 
> [1] 15 15
> 
>>length(fg.cov$center)
> 
> [1] 15
> 
> 
> Thanks

-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]
www.pdf.com 
Tel:  408-938-4420
Fax: 408-280-7915



[R] Mahalanobis distances

2005-06-24 Thread Karen Kotschy
Dear R community

Have just recently got back into R after a long break and have been amazed at 
how much it has grown, and how active the list is! Thank you so much to all 
those who contribute to this amazing project.

My question:
I am trying to calculate Mahalanobis distances for a matrix called "fgmatrix"

>dim(fgmatrix)
[1] 76 15

>fg.cov <- cov.wt(fgmatrix)
>mahalanobis(fgmatrix, center = fg.cov$center, cov = fg.cov$cov)

Then I get an error message "Covariance matrix is apparently singular"

What does this mean? I can't see anything strange about the covariance matrix, 
and am not getting anywhere with the help files.

>dim(fg.cov$cov)
[1] 15 15
>length(fg.cov$center)
[1] 15


Thanks
-- 
Karen Kotschy
Centre for Water in the Environment
University of the Witwatersrand
Johannesburg
South Africa

P/Bag X3, Wits, 2050
Tel: +2711 717-6425



RE: [R] mahalanobis distance

2004-09-12 Thread John Fox
Dear Murli,

Try ?mahalanobis, which, by the way, is turned up by
help.search("mahalanobis").
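
A short usage sketch, with the built-in iris measurements standing in for real data. mahalanobis() returns squared distances from each row to a single centre. It does not produce a pairwise matrix the way dist() does, so the second half shows one common workaround: transform the data with the inverse Cholesky factor of the covariance matrix, then call dist() on the result. The variable names are made up for this example.

```r
x <- as.matrix(iris[, 1:4])   # built-in example data
S <- cov(x)

# squared Mahalanobis distance of every row from the centroid
d2 <- mahalanobis(x, center = colMeans(x), cov = S)
d  <- sqrt(d2)                # take the root for distances on the original scale

# pairwise Mahalanobis distances: Euclidean distance after whitening,
# since chol(S) = R with t(R) %*% R = S
z <- x %*% solve(chol(S))
D <- dist(z)

# check one pair against mahalanobis() directly
pair_direct <- sqrt(mahalanobis(x[1, ], center = x[2, ], cov = S))
all.equal(as.numeric(as.matrix(D)[1, 2]), as.numeric(pair_direct))
```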

I hope this helps,
 John

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Murli Nair
> Sent: Sunday, September 12, 2004 3:17 PM
> To: [EMAIL PROTECTED]
> Subject: [R] mahalanobis distance
> 
> Is there a function that calculates the Mahalanobis distance in R?
> The dist function calculates "euclidean", "maximum",
> "manhattan", "canberra", "binary" or "minkowski".
> Thanks ../Murli



RE: [R] mahalanobis distance

2004-09-12 Thread Liaw, Andy
See (surprisingly enough) ?mahalanobis...

Andy

> From: Murli Nair
> 
> Is there a function that calculates the Mahalanobis distance in R?
> The dist function calculates "euclidean", "maximum",
> "manhattan", "canberra", "binary" or "minkowski".
> Thanks ../Murli
> 
> 
>



[R] mahalanobis distance

2004-09-12 Thread Murli Nair
Is there a function that calculates the Mahalanobis distance in R?
The dist function calculates "euclidean", "maximum", "manhattan",
"canberra", "binary" or "minkowski".
Thanks ../Murli



RE: [R] Mahalanobis

2004-03-26 Thread Liaw, Andy
If I'm not mistaken, the data you generated form a simplex in the
p-dimensional space.  Mahalanobis distance for such data, using the sample mean
and covariance, just gives the distance to the centroid after normalization.
The normalization step makes all the points equidistant from the centroid.

To see this, try generating 3 points in 2D, and plot the standardized principal
component scores:  You'll see the points on the vertices of an equilateral triangle.
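
A small sketch of both points, using freshly simulated data rather than the poster's. With n = p + 1 observations in general position, every squared distance should equal (n - 1)^2 / n, which matches the values in the question below (2.25 for n = 4, 19.04762 for n = 21, 10.08333 for n = 12). Dividing the principal component scores by their standard deviations is my reading of the "normalization" step.

```r
set.seed(42)

# n = p + 1 points: all squared Mahalanobis distances coincide
n <- 4
p <- 3
mydata <- matrix(runif(n * p, -5, 5), n, p)
d2 <- mahalanobis(mydata, colMeans(mydata), var(mydata))
d2
(n - 1)^2 / n                        # 2.25 for n = 4

# 3 points in 2D: standardized principal component scores
tri <- matrix(runif(6, -5, 5), 3, 2)
pc  <- prcomp(tri)
z   <- sweep(pc$x, 2, pc$sdev, "/")  # unit variance on each component
side <- as.numeric(dist(z))          # the three side lengths of the triangle
side

plot(z, asp = 1, xlab = "PC1 (standardized)", ylab = "PC2 (standardized)")
polygon(z)
```

All three side lengths come out equal, so the standardized points sit on the vertices of an equilateral triangle around the origin.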

Andy

> From: Alberto Murta
> 
> Dear all
> 
> Why isn't it possible to calculate Mahalanobis distances with R
> for a matrix with 1 more row (observations) than the number of
> columns (variables)?
> 
> > mydata <- matrix(runif(12,-5,5), 4, 3)
> > mahalanobis(x=mydata, center=apply(mydata,2,mean), cov=var(mydata))
> [1] 2.25 2.25 2.25 2.25
> 
> > mydata <- matrix(runif(420,-5,5), 21, 20)
> > mahalanobis(x=mydata, center=apply(mydata,2,mean), cov=var(mydata))
>  [1] 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762
> [13] 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762
> 
> > mydata <- matrix(runif(132,-5,5), 12, 11)
> > mahalanobis(x=mydata, center=apply(mydata,2,mean), cov=var(mydata))
>  [1] 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333
> 
> Thanks in advance
> 
> Alberto Murta
> 
> > version
>  _
> platform i686-pc-linux-gnu
> arch     i686
> os       linux-gnu
> system   i686, linux-gnu
> status
> major    1
> minor    8.1
> year     2003
> month    11
> day      21
> language R
> 
> -- 
>  Alberto G. Murta
> Institute for Agriculture and Fisheries Research (INIAP-IPIMAR) 
> Av. Brasilia, 1449-006 Lisboa, Portugal | Phone: +351 213027062
> Fax:+351 213015948 | http://ipimar-iniap.ipimar.pt/pelagicos/
> 
> 
>



[R] Mahalanobis

2004-03-26 Thread Alberto Murta
Dear all

Why isn't it possible to calculate Mahalanobis distances with R for a matrix
with 1 more row (observations) than the number of columns (variables)?

> mydata <- matrix(runif(12,-5,5), 4, 3)
> mahalanobis(x=mydata, center=apply(mydata,2,mean), cov=var(mydata))
[1] 2.25 2.25 2.25 2.25

> mydata <- matrix(runif(420,-5,5), 21, 20)
> mahalanobis(x=mydata, center=apply(mydata,2,mean), cov=var(mydata))
 [1] 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762
[13] 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762 19.04762

> mydata <- matrix(runif(132,-5,5), 12, 11)
> mahalanobis(x=mydata, center=apply(mydata,2,mean), cov=var(mydata))
 [1] 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333 10.08333

Thanks in advance

Alberto Murta

> version
 _
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    1
minor    8.1
year     2003
month    11
day      21
language R

-- 
 Alberto G. Murta
Institute for Agriculture and Fisheries Research (INIAP-IPIMAR) 
Av. Brasilia, 1449-006 Lisboa, Portugal | Phone: +351 213027062
Fax:+351 213015948 | http://ipimar-iniap.ipimar.pt/pelagicos/
