Hi,

This is more of a statistical question: I'm trying to understand the formulation of the one-sample, two-sided Kolmogorov-Smirnov statistic in stats::ks.test() when testing against a uniform distribution.

Basically, it boils down to:

x <- rnorm(100)

n <- length(x)
z <- punif(sort(x)) - (0:(n - 1)) / n
max(z, 1 / n - z)

which is equivalent to the textbook definition:

n <- length(x)
z <- punif(sort(x))
Dplus <- max(sapply(1:n, function(i) i / n - z[i]))
Dminus <- max(sapply(1:n, function(i) z[i] - (i - 1) / n))
max(Dplus, Dminus)
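
As a sanity check on the claimed equivalence, here is a minimal sketch that computes both forms on the same sample and compares them with the statistic reported by ks.test(). The seed, the use of runif() for the sample, and the names d_compact, d_textbook and d_ks are my own choices for illustration, not anything taken from the ks.test() source.

```r
set.seed(1)                      # fixed seed so the check is reproducible
x <- runif(100)                  # hypothetical sample; assumed to have no ties
n <- length(x)

# Compact form, as in the first snippet above
z <- punif(sort(x)) - (0:(n - 1)) / n
d_compact <- max(z, 1 / n - z)

# Textbook form, vectorised instead of using sapply()
u <- punif(sort(x))
d_plus  <- max(seq_len(n) / n - u)
d_minus <- max(u - (seq_len(n) - 1) / n)
d_textbook <- max(d_plus, d_minus)

# Statistic reported by ks.test() for the same sample
d_ks <- unname(ks.test(x, "punif")$statistic)

all.equal(d_compact, d_textbook)
all.equal(d_compact, d_ks)
```

The compact form works because z holds the Dminus terms, and 1 / n - z rearranges to i / n - punif(sort(x))[i], which are the Dplus terms.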

(See, e.g., http://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm, and Durbin (1971) "Distribution theory for tests based on the sample distribution function", p. 6)

Why does the definition of Dminus have i-1 in the numerator instead of i? My hunch is that it has to do with the right-continuity of the ecdf, but perhaps someone can shed some light on it.
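
To make the hunch concrete, here is a small sketch of my own, using a made-up three-point sample and an arbitrary small offset eps to approximate the left limit. It only illustrates that the ecdf equals i/n at the i-th order statistic and (i-1)/n just to its left, so if the largest gap with the hypothesised CDF above the ecdf is approached from the left, (i-1)/n would be the value to compare against. I am not claiming this is the full explanation.

```r
x <- c(0.2, 0.5, 0.9)            # hypothetical sorted sample, n = 3
n <- length(x)
Fn <- ecdf(x)                    # right-continuous step function

eps <- 1e-8                      # small offset, assumed smaller than any gap in x
at_point   <- Fn(x)              # ecdf at each order statistic: i / n
just_below <- Fn(x - eps)        # ecdf just to the left: (i - 1) / n

# punif() is continuous and non-decreasing, so on each flat step of the
# ecdf the gap is largest at one of the step's endpoints.
d_minus <- max(punif(x) - just_below)   # CDF above ecdf: use the left limit
d_plus  <- max(at_point - punif(x))     # ecdf above CDF: use the value at x
max(d_plus, d_minus)
```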

Thanks,
Gad


--
Gad Abraham
MEng Student, Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabra...@csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham
