On 20 Oct 2003 20:16:52 -0700, [EMAIL PROTECTED] (Michael)
wrote:

> > Reference?  What do you mean by "removing
> > duplicates"  and what is this K-S  test about?
> 
> I am testing goodness-of-fit for continuous distributions.
> 
> I have been taught to delete any duplicate data points. For example, if
> we have 10,33,44,44,44,44,55,56, I would calculate my sample and
> theoretical probabilities, then delete all but ONE of the 44's (the
> duplicate data points), then calculate the differences, take the
> absolute values of the differences, and my D-value is the
> largest of those.
> 
> The question is which ONE do I leave behind? It's a choice that has an
> impact on the resulting D-value.

No, no, no.  
You have these items ranked, and then you compare each rank,
as a fraction of n, to the theoretical CDF, and you want to
find the maximum difference.

It is not *necessary* to compute a D for any rank in
the middle of ties, because it can't possibly give the
maximum D: the tied points share one CDF value, so the
extreme differences fall at the lowest and highest tied ranks.
In that sense, because it can't be useful, you can 'delete'
the act of computing the D.

But you certainly do not delete the data.
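
Here is a minimal sketch in Python of that point, using the data
from the question. The Normal(mean 40, SD 15) reference distribution
and all the function names are made up for illustration; they are
assumptions, not anything from the original question. One function
computes D at every rank, the other only once per distinct value
while still counting every observation.

```python
import math
from collections import Counter


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution (hypothetical reference distribution)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_d_all_ranks(data, cdf):
    """One-sample K-S D, computing a difference at every rank."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Empirical CDF just after x_i is i/n, just before it is (i-1)/n.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_d_distinct(data, cdf):
    """Same D, computed once per distinct value; n still counts all points."""
    n = len(data)
    counts = Counter(data)
    below = 0  # number of observations strictly below the current value
    d = 0.0
    for x in sorted(counts):
        f = cdf(x)
        at_or_below = below + counts[x]
        # Only the lowest and highest tied ranks can give the extremes.
        d = max(d, at_or_below / n - f, f - below / n)
        below = at_or_below
    return d


data = [10, 33, 44, 44, 44, 44, 55, 56]


def reference_cdf(x):
    # Assumed parameters, chosen only to have something to compare against.
    return normal_cdf(x, 40.0, 15.0)


d_all = ks_d_all_ranks(data, reference_cdf)
d_distinct = ks_d_distinct(data, reference_cdf)
print(d_all, d_distinct)
```

The two values agree because the middle 44's are skipped only in
the arithmetic; all eight observations still determine n and the
ranks. Dropping three of the 44's from the data would change n to 5
and give a different D.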

[ ... ]
> 
> What method do you use to test goodness-of-fit for continuous
> distributions?

K-S is designed for continuous distributions.
Shapiro-Wilk is popular for testing normality, and its
principle is one of correlation.

Personally, I most often look at a scatterplot of a couple
of interesting variables.  Outliers matter, but  'homogeneity
of variance'  and  'linearity of regression'  matter, too.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization." 