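Hi,

sorry, I was wrong and Jason is right. With R's default mad(), which is
rescaled by a constant of about 1.4826 for the normal case, the Hampel
suggestion is

  outliers <- (x < medx - 3.5*madx) | (x > medx + 3.5*madx)

Alternatively, keep the multiplier 5.2 and compute the unscaled MAD with
madx <- mad(x, constant=1).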
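
The two variants can be put side by side. The following is only a minimal
sketch: the data vector x is made up for illustration, and the names
madx_scaled, madx_raw, outliers_scaled and outliers_raw are not from the
thread. Since 3.5 * 1.4826 is about 5.19 rather than exactly 5.2, the two
rules agree only approximately, but they flag the same point here.

```r
# Made-up data vector with one gross outlier (assumption, for illustration only)
x <- c(9.8, 10.1, 10.0, 9.9, 10.2, 10.3, 9.7, 25.0)

medx <- median(x)

# Default mad(): rescaled by constant = 1.4826, so use the multiplier 3.5
madx_scaled <- mad(x)
outliers_scaled <- (x < medx - 3.5 * madx_scaled) | (x > medx + 3.5 * madx_scaled)

# Unscaled MAD (constant = 1): use the multiplier 5.2
madx_raw <- mad(x, constant = 1)
outliers_raw <- (x < medx - 5.2 * madx_raw) | (x > medx + 5.2 * madx_raw)

# Keep only the observations not flagged as outliers
selected <- x[!outliers_scaled]
```

Whichever variant is used, the multiplier has to match the scaling of the
MAD; combining the default mad() with 5.2 makes the rule considerably more
lenient than intended.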
Christian

On Fri, 21 Feb 2003, Jason Turner wrote:

> On Thu, Feb 20, 2003 at 06:54:21PM +0100, Christian Hennig wrote:
> ...
> > However, a simple straight forward method for outlier identification is
> > median +/- 5.2*mad as suggested by Hampel, Technometrics 27 (1985) 95-107.
> ...
> > x <- data vector
> > medx <- median(x)
> > madx <- mad(x)
> > outliers <- (x<medx-5.2*madx) | (x>medx+5.2*madx)
> > selected <- x[!outliers]
>
> I haven't read the paper cited above, but I suspect the authors were
> talking about the true mad.  By default, R re-scales the mad to adjust
> for the normal case (ie multiplies by about 1.48).  If that's correct
> (and I'm quite happy to be wrong), this changes 5.2 to 3.5 in the
> example above.
>
> Cheers
>
> Jason
>

-- 
***********************************************************************
Christian Hennig
Seminar fuer Statistik, ETH-Zentrum (LEO), CH-8092 Zuerich (currently)
and Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
[EMAIL PROTECTED], http://stat.ethz.ch/~hennig/
[EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag.de
