On 19 Oct 2003 23:00:42 -0700, [EMAIL PROTECTED] (Michael)
wrote:

> Hi,
> 
> I am performing Kolmogorov-Smirnov (K-S) testing and have a query
> about how best to deal with duplicates. The dataset I am working with
> has a lot of duplicates - after their removal its size is almost half.
> Which leads to my question,
> 
> When deleting duplicates, does it matter which ones I delete?

The K-S test that I am familiar with
does not "remove duplicates" for any reason.

Reference?  What do you mean by "removing
duplicates", and what is this K-S test about?

> 
> I have run the K-S test two ways: deleting all duplicates after the
> first occurrence, and deleting all before the last occurrence. The
> two approaches give different D-values (as I expected). However, in
> the three cases I have tested, this has had no significant impact on
> the resulting p-values.
> 

"D-values" sounds like the usual K-S, which deals
with the cumulative distribution.  That uses the whole
distribution, duplicates included.  The p-value will be
less accurate (for small N, especially) if there are ties.
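
To make the point about "the whole distribution" concrete, here is a
minimal sketch. It assumes the two-sample form of the test (the
original post does not say which form is in use), and the data and
function names are made up for illustration. It computes D from the
empirical cumulative distributions, once with the ties kept and once
after dropping duplicates, to show that dropping them changes the
distribution being tested.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return F(x) = fraction of sample values <= x, with ties kept."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_two_sample_d(a, b):
    """Two-sample K-S statistic: the largest |F_a(x) - F_b(x)|.

    Both empirical CDFs are step functions that jump only at observed
    values, so checking the observed values is enough.
    """
    fa, fb = ecdf(a), ecdf(b)
    return max(abs(fa(x) - fb(x)) for x in set(a) | set(b))


# Hypothetical samples with many ties (illustration only).
a = [1, 1, 1, 2, 3, 3, 4, 5]
b = [2, 3, 4, 4, 5, 6, 6, 7]

d_full = ks_two_sample_d(a, b)                              # ties kept
d_dedup = ks_two_sample_d(sorted(set(a)), sorted(set(b)))   # ties dropped

print("D with ties kept:   ", d_full)
print("D after dedup:      ", round(d_dedup, 4))
```

Only D is computed here, not a p-value. With ties in the data, the
usual p-value for D is itself only approximate, as noted above.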

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization." 