On 19 Oct 2003 23:00:42 -0700, [EMAIL PROTECTED] (Michael) wrote:

> Hi,
>
> I am performing Kolmogorov-Smirnov (K-S) testing and have a query
> about how best to deal with duplicates. The dataset I am working with
> has a lot of duplicates - after their removal its size is almost half.
> Which leads to my question,
>
> When deleting duplicates does it matter which ones I delete?

The K-S test that I am familiar with does not "remove duplicates" for any reason. Do you have a reference? What do you mean by "removing duplicates", and what is this K-S test about?

> I have done the K-S test deleting all after the first duplicate and
> all before the last duplicate, both approaches giving different
> D-values (as I expected). However in the three cases I have tested
> this has had no significant impact upon the resultant p-values.

"D-values" sounds like the usual K-S test, which deals with the cumulative distribution. That uses the whole distribution. The p-value will be less accurate (especially for small N) if there are ties.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization."
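
To make the point about using the whole distribution concrete, here is a minimal sketch of the one-sample K-S distance D computed with tied values left in place, and again after dropping duplicates. Everything in it is an assumption made for illustration: the function name `ks_statistic`, the toy data, and the Uniform(0, 1) reference distribution are all hypothetical, and the thread does not say whether the original analysis was one-sample or two-sample.

```python
from collections import Counter


def ks_statistic(sample, cdf):
    """One-sample K-S distance D between the empirical CDF of `sample`
    and a reference CDF. Tied values are kept: each distinct value makes
    the empirical CDF jump by (number of ties) / n."""
    n = len(sample)
    counts = Counter(sample)
    d = 0.0
    below = 0  # observations strictly less than the current value
    for x in sorted(counts):
        f = cdf(x)
        ecdf_left = below / n    # empirical CDF just before x
        below += counts[x]
        ecdf_right = below / n   # empirical CDF at x
        d = max(d, f - ecdf_left, ecdf_right - f)
    return d


def uniform_cdf(x):
    """CDF of Uniform(0, 1); an assumed reference distribution."""
    return min(1.0, max(0.0, x))


# Hypothetical toy data with heavy ties.
data = [0.1, 0.1, 0.1, 0.2, 0.5, 0.5, 0.9]
deduped = sorted(set(data))

d_full = ks_statistic(data, uniform_cdf)
d_dedup = ks_statistic(deduped, uniform_cdf)

print("D with ties kept:      ", d_full)
print("D after de-duplication:", d_dedup)
```

Dropping the duplicates changes both D and the sample size that enters the p-value, so the de-duplicated result describes a different empirical distribution from the one that was observed. The sketch stops at D on purpose: with ties present, the usual p-value formulas are only approximate.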
