On Thu, Feb 20, 2003 at 06:54:21PM +0100, Christian Hennig wrote:
... 
> However, a simple straightforward method for outlier identification is
> median +/- 5.2*mad as suggested by Hampel, Technometrics 27 (1985) 95-107.
...
> # x: numeric data vector
> medx <- median(x)
> madx <- mad(x)
> outliers <- (x<medx-5.2*madx) | (x>medx+5.2*madx)
> selected <- x[!outliers]

I haven't read the paper cited above, but I suspect the author was
talking about the raw (unscaled) mad.  By default, R's mad() rescales
the result to be consistent with the standard deviation in the normal
case (i.e. it multiplies by the constant 1.4826).  If that's correct
(and I'm quite happy to be wrong), the 5.2 in the example above
becomes 5.2/1.4826, or about 3.5.
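
A minimal sketch of the adjustment, assuming the suspicion above is
right and the 5.2 was meant for the unscaled mad.  The data vector is
made up for illustration, and the names mad_raw, mad_scaled and
k_scaled are mine, not from the original post or the paper.

```r
# Illustrative data (invented for this sketch): ten typical values, one gross outlier
x <- c(9.8, 10.1, 10.0, 9.9, 10.2, 10.3, 9.7, 10.0, 10.1, 9.9, 25.0)

medx <- median(x)

# mad() multiplies by constant = 1.4826 by default;
# constant = 1 returns the raw median absolute deviation
mad_scaled <- mad(x)
mad_raw    <- mad(x, constant = 1)

# The two versions differ only by that scale factor
stopifnot(isTRUE(all.equal(mad_scaled, 1.4826 * mad_raw)))

# Rule applied to the raw mad, with the factor 5.2 (assumed intent of the paper)
out_raw <- abs(x - medx) > 5.2 * mad_raw

# Same rule on R's default (scaled) mad: the factor becomes 5.2 / 1.4826, about 3.5
k_scaled   <- 5.2 / 1.4826
out_scaled <- abs(x - medx) > k_scaled * mad_scaled

# Keep the points not flagged as outliers
selected <- x[!out_raw]
```

Either form gives the same cut-off; the only thing to avoid is
combining 5.2 with the default (already rescaled) mad(x) if the raw
mad was intended, since that widens the limits by a factor of about 1.48.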

Cheers

Jason
-- 
Indigo Industrial Controls Ltd.
64-21-343-545
[EMAIL PROTECTED]

______________________________________________
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help
