There are many possible techniques for discarding outliers. The important thing is to know the reason for their occurrence. Are they caused by some type of error in the generation or collection of the data, or are they actually important information? You might instead want to concentrate on the outliers, precisely because they deviate from the norm and may indicate behavior quite different from normal.
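
As one illustration only, here is a minimal Python sketch of Tukey's fences, a common rule of thumb that marks samples lying more than k interquartile ranges outside the quartiles. The function name split_outliers and the default k=1.5 are assumptions made for this example, not something taken from the original question. The sketch reports the flagged values instead of silently dropping them, so they can be inspected first.

```python
from statistics import quantiles

# Samples copied from the quoted message below.
t = [
    11, 10, 10, 10, 10, 11, 10, 10, 10, 10, 9, 11, 10, 11, 10, 10, 11, 10, 11, 10, 11, 10, 10,
    11, 10, 11, 10, 10, 10, 11, 10, 74, 11, 11, 14, 11, 11, 10, 12, 11, 15, 14, 12, 11,
    11, 11, 11, 11, 10, 12, 11, 11, 11, 10, 11, 11, 11, 10, 11, 11, 10, 11, 161241, 49,
    32, 12, 11, 11, 12, 10, 11, 10, 12, 11, 12, 11, 11, 12, 11, 11, 12, 11, 11, 11, 12,
    11, 11, 12, 11, 11, 11, 11, 11, 11, 11, 10, 11, 11, 12, 12,
]


def split_outliers(samples, k=1.5):
    """Split samples into (kept, flagged) using Tukey's fences.

    k is a judgment call: larger values flag fewer samples.
    """
    q1, _, q3 = quantiles(samples, n=4)  # first, second, third quartile
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [x for x in samples if low <= x <= high]
    flagged = [x for x in samples if not low <= x <= high]
    return kept, flagged


kept, flagged = split_outliers(t)
print("flagged for inspection:", flagged)
print("samples kept:", len(kept))
```

With k=1.5 this flags 74, 161241, 49 and 32, but also the milder 14, 14 and 15. Raising k (3 is sometimes used for "far out" points) makes the rule more lenient. Whether any of the flagged values should really be thrown away still depends on why they occurred.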
Donna
[email protected]

On 2012-01-09, at 7:49 PM, Roger Hui <[email protected]> wrote:

> I wonder if there are well-known techniques in statistics for dealing with
> the following problem.
>
> t
> 11 10 10 10 10 11 10 10 10 10 9 11 10 11 10 10 11 10 11 10 11 10 10
> 11 10 11 10 10 10 11 10 74 11 11 14 11 11 10 12 11 15 14 12 11
> 11 11 11 11 10 12 11 11 11 10 11 11 11 10 11 11 10 11 161241 49
> 32 12 11 11 12 10 11 10 12 11 12 11 11 12 11 11 12 11 11 11 12
> 11 11 12 11 11 11 11 11 11 11 10 11 11 12 12
>
> t is a set of samples from a noisy source which is supposed to give the
> same integer answer. Obviously, 161241 is an "outlier", and it is likely
> that 74, 49, or even 32 are outliers too. Are there standard techniques
> for discarding outliers to clean up the data, before the application of
> statistical tests such as the means test or large sample test?
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
