cov.mcd in MASS and covMcd in robustbase both calculate the Minimum Covariance
Determinant (MCD) multivariate location and scale estimator.
For a 2-variable problem, the two return identical correlation matrices but the
covariance matrices differ. For the 2-variable case the difference seems to be
a scaling factor, equal to the product of scaling factors returned as 'cnp2' in
the 'mcd' object returned by robustbase. In the modified ?covMcd example below,
the robustbase covariances are appreciably larger as a result. For problems
with more than two variables things aren't so simple.
Is there a rationale for the difference? And of the two implementations, which
might be the more defensible estimate of covariance for modestly
outlier-contaminated data?
S Ellison
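# Example (modified from ?covMcd, reduced to 2 variables)
library(robustbase)
library(MASS)
data(hbk)
hbk.x <- data.matrix(hbk[, 1:2])
set.seed(17)
# MCD covariance from robustbase, then from MASS
(cH <- covMcd(hbk.x, cor=TRUE))$cov
(cH.M <- cov.mcd(hbk.x, cor=TRUE))$cov
# Element-wise ratio of the two covariance matrices ...
cH$cov/cH.M$cov
# ... compared with the product of the robustbase correction factors
prod(cH$cnp2)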
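
As a side note on why the correlations can agree while the covariances do not: if one covariance matrix is just a scalar multiple of the other, the scalar cancels when converting to correlations. The base-R sketch below illustrates only that arithmetic. The matrix S and the factor k are made-up values for illustration; they are not taken from hbk or from either package, and the sketch says nothing about how either package actually computes its correction.

```r
# Illustrative only: S is a made-up 2 x 2 covariance matrix and k a made-up
# scalar; neither comes from hbk, MASS or robustbase.
S <- matrix(c(4.0, 1.2,
              1.2, 2.5), nrow = 2, byrow = TRUE)
k <- 1.3

# Rescale the whole covariance matrix by the single factor k
S.scaled <- k * S

# Every element of the covariance matrix changes by the same factor ...
ratio <- S.scaled / S
print(ratio)

# ... but the correlation matrix is unchanged, because k cancels in
# cov[i, j] / sqrt(cov[i, i] * cov[j, j]).
same.cor <- isTRUE(all.equal(cov2cor(S.scaled), cov2cor(S)))
print(same.cor)
```

The same check (element-wise ratio, then all.equal on cov2cor of each estimate) could be applied to cH$cov and cH.M$cov from the example above to see whether a single factor accounts for the whole difference.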