Thanks. What's a reasonable multiple to use?
On Mon, Jan 9, 2012 at 7:41 PM, Brian Schott <[email protected]> wrote:

> John Tukey has studied outliers extensively in his interactive data
> analysis. He computes a box plot by measuring the IQR, that's
> interquartile range, of the data set. He adds and subtracts a multiple
> of the IQR to the upper and lower quartiles of the box in the boxplot.
> Data values outside the "hinges" (in Tukey speak) are outliers.
>
> The code below is from Donald R. McNeil's IDA, A Practical Primer.
>
> http://www.pixentral.com/show.php?picture=1Fnz2FOWX9nuYzndC9GbDbi2z1yz50
>
>
> ---
> (B=)
>
> On Jan 9, 2012, at 7:49 PM, Roger Hui <[email protected]> wrote:
>
> > I wonder if there are well-known techniques in statistics for dealing with
> > the following problem.
> >
> > t
> > 11 10 10 10 10 11 10 10 10 10 9 11 10 11 10 10 11 10 11 10 11 10 10
> > 11 10 11 10 10 10 11 10 74 11 11 14 11 11 10 12 11 15 14 12 11
> > 11 11 11 11 10 12 11 11 11 10 11 11 11 10 11 11 10 11 161241 49
> > 32 12 11 11 12 10 11 10 12 11 12 11 11 12 11 11 12 11 11 11 12
> > 11 11 12 11 11 11 11 11 11 11 10 11 11 12 12
> >
> > t is a set of samples from a noisy source which is supposed to give the
> > same integer answer. Obviously, 161241 is an "outlier", and it is likely
> > that 74, 49, or even 32 are outliers too. Are there standard techniques
> > for discarding outliers to clean up the data, before the application of
> > statistical tests such as the means test or large sample test?
> >
> > ----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
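
The IQR rule described in the quoted message can be sketched roughly as follows. This is an illustration only, not the McNeil code behind the link, and it is written in Python rather than J. The function names `tukey_fences` and `split_outliers` are made up for this sketch, and the quartiles come from `statistics.quantiles` with the "inclusive" method, which is an assumption (Tukey's own hinges are computed slightly differently). The cutoffs are usually called "fences" in Tukey's terminology, and 1.5 is the multiple conventionally used for them, with 3 for "far out" values.

```python
from statistics import quantiles

def tukey_fences(data, k=1.5):
    """Return (low, high) cutoffs placed k * IQR beyond the quartiles."""
    # Assumption: "inclusive" quartiles; other quartile conventions exist.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    spread = k * (q3 - q1)
    return q1 - spread, q3 + spread

def split_outliers(data, k=1.5):
    """Split data into (kept, outliers) using the fences for multiple k."""
    low, high = tukey_fences(data, k)
    kept = [x for x in data if low <= x <= high]
    outliers = [x for x in data if x < low or x > high]
    return kept, outliers

# The sample t from the original question, row by row.
t = [
    11, 10, 10, 10, 10, 11, 10, 10, 10, 10, 9, 11, 10, 11, 10, 10, 11, 10, 11, 10, 11, 10, 10,
    11, 10, 11, 10, 10, 10, 11, 10, 74, 11, 11, 14, 11, 11, 10, 12, 11, 15, 14, 12, 11,
    11, 11, 11, 11, 10, 12, 11, 11, 11, 10, 11, 11, 11, 10, 11, 11, 10, 11, 161241, 49,
    32, 12, 11, 11, 12, 10, 11, 10, 12, 11, 12, 11, 11, 12, 11, 11, 12, 11, 11, 11, 12,
    11, 11, 12, 11, 11, 11, 11, 11, 11, 11, 10, 11, 11, 12, 12,
]

kept, outliers = split_outliers(t)          # conventional multiple, k = 1.5
_, far_out = split_outliers(t, k=3)         # stricter multiple, k = 3
print("fences:", tukey_fences(t))
print("outliers (k=1.5):", sorted(outliers))
print("outliers (k=3):  ", sorted(far_out))
```

On this sample the quartiles are 10 and 11, so the IQR is 1 and the rule is quite strict: with k = 1.5 it discards 14, 14 and 15 along with 32, 49, 74 and 161241, while k = 3 keeps the two 14s. Which multiple is reasonable depends on how much of the tail you are willing to treat as noise.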
