On Thu, Feb 20, 2003 at 06:54:21PM +0100, Christian Hennig wrote:
...
> However, a simple straight forward method for outlier identification is
> median +/- 5.2*mad as suggested by Hampel, Technometrics 27 (1985) 95-107.
...
> x <- data vector
> medx <- median(x)
> madx <- mad(x)
> outliers <- (x<medx-5.2*madx) | (x>medx+5.2*madx)
> selected <- x[!outliers]

I haven't read the paper cited above, but I suspect the author was
talking about the true (unscaled) MAD. By default, R's mad() rescales
the MAD so that it estimates the standard deviation in the normal case
(i.e. it multiplies the raw MAD by about 1.48). If that's correct (and
I'm quite happy to be wrong), the 5.2 in the example above becomes
5.2/1.48, or about 3.5, when used with the default mad(x).

Cheers,
Jason
-- 
Indigo Industrial Controls Ltd.
64-21-343-545
[EMAIL PROTECTED]

______________________________________________
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help
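
To make the rescaling concrete, here is a minimal sketch in R. It assumes the 5.2 cutoff really does refer to the unscaled MAD, which is only a suspicion; the data vector is made up for illustration and the variable names are arbitrary. It applies the rule once with mad(x, constant = 1) and once with the default mad(x) and a cutoff of 5.2/1.4826, and the two should flag the same points.

```r
# Made-up data for illustration: nine ordinary values and one gross outlier.
x <- c(9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 25.0)

medx <- median(x)

# Unscaled MAD: constant = 1 switches off R's consistency factor.
mad_raw <- mad(x, constant = 1)

# Default MAD: R multiplies the raw MAD by constant = 1.4826.
mad_scaled <- mad(x)

# Rule as quoted, applied to the unscaled MAD (assumed reading of the 5.2).
out_raw <- abs(x - medx) > 5.2 * mad_raw

# Same rule on the default (scaled) MAD: the cutoff shrinks to about 3.5.
k <- 5.2 / 1.4826
out_scaled <- abs(x - medx) > k * mad_scaled

# Keep the points that were not flagged.
selected <- x[!out_raw]
```

Either form works; passing constant = 1 keeps the 5.2 visible in the code, while dividing the cutoff keeps the default mad() call.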
