On Oct 25, 2012, at 4:41 PM, Bert Gunter wrote:
> 1. I don't know what StatMatch is. Try using stats::mahalanobis.
>
> 2. It's the covariance matrix that is **numerically** singular and
> can't be inverted. Why do you claim that there's "no way" this could
> be true when there are hundreds of variables (= dimensions)?
>
> 3. Try calculating the svd of your matrix and see what you get if you
> haven't already done so.
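For point 3, a minimal sketch of the check, run on made-up data rather than the posted file (the matrix X, the seed, and the tolerance are my assumptions): compare the singular values of the covariance matrix with the largest one and count how many are effectively zero. Note also that a covariance matrix estimated from n rows has rank at most n - 1, so with more variables than rows it is singular by construction.

```r
# Simulated stand-in for the poster's data (an assumption, not the real file):
# 50 rows, 6 columns, where column 6 is an exact linear combination of
# columns 1 and 2.
set.seed(1)
X <- matrix(rnorm(50 * 5), nrow = 50, ncol = 5)
X <- cbind(X, X[, 1] + 2 * X[, 2])

S <- cov(X)
d <- svd(S)$d                       # singular values, largest first

# Numerical rank: singular values above a relative cutoff.
# The cutoff sqrt(machine epsilon) is an illustrative choice.
tol      <- sqrt(.Machine$double.eps) * max(d)
num_rank <- sum(d > tol)

cat("dimension:", ncol(S), " numerical rank:", num_rank, "\n")
cat("reciprocal condition number:", rcond(S), "\n")
```

If the numerical rank is smaller than the dimension, solve() will stop with the "computationally singular" error no matter how the rows are subset.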
This was cross-posted to StackOverflow, where Josh O'Brien has responded that
his code using svd() shows the matrix to be highly collinear. This is the upper
left corner of the correlation matrix:
           V1         V2         V3          V4         V5
V1 1.00000000 0.97250825 0.93390424 0.918813118 0.89705917
V2 0.97250825 1.00000000 0.97118079 0.954020724 0.93992361
V3 0.93390424 0.97118079 1.00000000 0.991508026 0.97602188
V4 0.91881312 0.95402072 0.99150803 1.000000000 0.98837387
V5 0.89705917 0.93992361 0.97602188 0.988373865 1.00000000
> length( which(cor(mat)==1) )
[1] 374
Just looking at the data should give a good idea why: I can see bands of
columns that are identically zero.
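One possible workaround, sketched on simulated data (X, the cutoff, and the names S_pinv, D2 and D are illustrative assumptions, not anything taken from the posted file): build a Moore-Penrose pseudo-inverse of the covariance matrix from its SVD and hand it to stats::mahalanobis() with inverted = TRUE. Whether that is statistically appropriate for these data is a separate question; dropping the redundant columns first may well be the better choice.

```r
# Toy data with one exactly redundant column, so cov(X) is singular.
set.seed(1)
X <- matrix(rnorm(20 * 4), nrow = 20)
X <- cbind(X, X[, 1] + X[, 2])

S <- cov(X)

# Pseudo-inverse from the SVD, dropping near-zero singular values.
sv     <- svd(S)
keep   <- sv$d > sqrt(.Machine$double.eps) * max(sv$d)
S_pinv <- sv$v[, keep, drop = FALSE] %*%
  diag(1 / sv$d[keep], nrow = sum(keep)) %*%
  t(sv$u[, keep, drop = FALSE])

# Column i holds the squared distances from every row to row i.
n  <- nrow(X)
D2 <- sapply(seq_len(n), function(i)
  mahalanobis(X, center = X[i, ], cov = S_pinv, inverted = TRUE))

# mahalanobis() returns squared distances; pmax() guards against
# tiny negative values from round-off before taking the square root.
D <- sqrt(pmax(D2, 0))
```

When the redundancy is exact, as in this toy example, the result should agree with the distances computed from the non-redundant columns alone.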
--
david.
> Cheers,
> Bert
>
> On Thu, Oct 25, 2012 at 4:14 PM, langvince <[email protected]> wrote:
>> Hi folks,
>>
>> I know this is a fairly common question, and I am really disappointed that I
>> could not find a solution.
>> I am trying to calculate Mahalanobis distances in a data frame, where I have
>> several hundred groups and several hundred variables.
>>
>> Whatever I do, however I subset it I get the "system is computationally
>> singular: reciprocal condition number" error.
>> I know what it means and I know what should be the problem, but there is no
>> way this is a singular matrix.
>>
>> I have uploaded the input file to my ftp:
>> http://mkk.szie.hu/dep/talt/lv/CentInpDuplNoHeader.txt
>> It is a tab delimited txt file with no headers.
>>
>> I tried the StatMatch Mahalanobis function and also this function:
>>
>> mahal_dist <- function(data, nclass, nvariable) {
>>   dist <- matrix(0, nclass, nclass)
>>   n <- 0
>>   w <- cov(data)
>>   print(w)
>>   # Invert once, outside the loops; this is the call that stops with
>>   # "system is computationally singular" when w cannot be inverted.
>>   w_inv <- solve(w)
>>   for (i in 1:nclass) {
>>     for (c in 1:nclass) {
>>       # Signed difference between rows i and c
>>       diffl <- unlist(data[i, 1:nvariable]) - unlist(data[c, 1:nvariable])
>>       dist[i, c] <- t(diffl) %*% w_inv %*% diffl
>>     }
>>     n <- n + 1
>>     print(n)  # progress counter
>>   }
>>   # Return the distances rather than the squared distances
>>   sqrt(dist)
>> }
>>
>>
>> I have a deadline for this project (not homework :)), and I have always been
>> able to use this code, so I thought I would get through the calculations
>> quickly, but now I am just lost.
>>
>> I would really appreciate any help.
>>
>> Thanks for any help
>>
>>
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/system-is-computationally-singular-reciprocal-condition-number-tp4647472.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> [email protected] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
David Winsemius, MD
Alameda, CA, USA