cov.mcd in MASS and covMcd in robustbase both calculate the Minimum Covariance 
Determinant (MCD) multivariate location and scale estimator.

For a 2-variable problem, the two functions return identical correlation 
matrices, but the covariance matrices differ. In that case the difference 
appears to be a single scaling factor, equal to the product of the correction 
factors returned as 'cnp2' in the 'mcd' object from robustbase. In the modified 
?covMcd example below, the robustbase covariances are appreciably larger as a 
result. With more than two variables (p > 2), things aren't so simple.

Is there a rationale for the difference? And of the two implementations, which 
might be the more defensible estimate of covariance for modestly 
outlier-contaminated data?


S Ellison

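# Example (modified from ?covMcd, reduced to 2 variables)
library(robustbase)  # covMcd() and the hbk data
library(MASS)        # cov.mcd()
data(hbk)
hbk.x <- data.matrix(hbk[, 1:2])
set.seed(17)  # fix the RNG seed for reproducibility

# robustbase estimate; the outer parentheses print the covariance matrix
(cH <- covMcd(hbk.x, cor=TRUE))$cov

# MASS estimate; again prints the covariance matrix
(cH.M <- cov.mcd(hbk.x, cor=TRUE))$cov

# Elementwise ratio of the two covariance matrices
cH$cov/cH.M$cov

# Product of the correction factors reported by robustbase
prod(cH$cnp2)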
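
The case with more than two variables can be probed in the same way. What 
follows is only a minimal sketch, not part of the ?covMcd example: it reruns 
the comparison on all three hbk predictors and checks whether the elementwise 
ratio is still a single constant. The names hbk.x3, cH3, cH3.M and ratio3 are 
illustrative, and no output is shown because the result depends on the subsets 
each function happens to select.

```r
library(robustbase)  # covMcd() and the hbk data
library(MASS)        # cov.mcd()

data(hbk)
hbk.x3 <- data.matrix(hbk[, 1:3])  # all three predictors, p = 3

set.seed(17)
cH3   <- covMcd(hbk.x3, cor = TRUE)   # robustbase estimate
cH3.M <- cov.mcd(hbk.x3, cor = TRUE)  # MASS estimate

# Elementwise ratio of the two covariance estimates
ratio3 <- cH3$cov / cH3.M$cov
ratio3

# A single common scaling factor would make min and max coincide
range(ratio3)

# Product of robustbase's correction factors, for comparison
prod(cH3$cnp2)
```

If range(ratio3) shows a visible spread, the two estimates differ by more than 
a scalar multiple, which would suggest that the two functions ended up with 
different sets of observations rather than only different correction factors.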



_______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-robust
