Hi,
This is more of a statistical question: I'm trying to understand the
formulation of the one-sample two-sided Kolmogorov-Smirnov statistic in
stats::ks.test(), testing against a uniform distribution.
Basically, it boils down to:
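x <- rnorm(100)
n <- length(x)
# z[i] = F(x_(i)) - (i - 1)/n, with F = punif and x_(i) the i-th order statistic
z <- punif(sort(x)) - (0:(n - 1)) / n
# 1/n - z[i] = i/n - F(x_(i)), so this is the largest deviation in either direction
max(z, 1 / n - z)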
which is equivalent to the textbook definition:
n <- length(x)
z <- punif(sort(x))
Dplus <- max(sapply(1:n, function(i) i / n - z[i]))
Dminus <- max(sapply(1:n, function(i) z[i] - (i - 1) / n))
max(Dplus, Dminus)
(See, e.g.,
http://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm, and
Durbin (1971) ``Distribution theory for tests based on the sample
distribution function'', p. 6)
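
A quick numerical check of the equivalence claimed above, as a sketch only.
The seed and the use of runif() instead of rnorm() are my own choices for
illustration (a uniform sample seemed more natural against punif), and the
names D_compact and D_textbook are made up here:

```r
set.seed(1)              # arbitrary seed, only for reproducibility
x <- runif(100)          # assumption: uniform sample instead of rnorm(100)
n <- length(x)
u <- punif(sort(x))      # hypothesized CDF at the order statistics

# Compact form, as in the snippet boiled down from ks.test()
z <- u - (0:(n - 1)) / n
D_compact <- max(z, 1 / n - z)

# Textbook form, vectorised instead of sapply()
i <- seq_len(n)
Dplus  <- max(i / n - u)
Dminus <- max(u - (i - 1) / n)
D_textbook <- max(Dplus, Dminus)

# Both comparisons should agree up to floating-point tolerance
all.equal(D_compact, D_textbook)
all.equal(D_compact, unname(ks.test(x, "punif")$statistic))
```

The two forms agree because 1/n - z[i] is just i/n - punif(x_(i)), i.e.
the Dplus term.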
Why does the definition of Dminus have an i-1 in the numerator instead
of i? I have a hunch it's got to do with right-continuity of the ecdf,
but perhaps someone can shed some light on it.
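
To make the hunch concrete, here is a small sketch of how the ecdf behaves
at and just to the left of each data point. It is only an illustration, not
a proof; the sample size, the seed and the offset eps are arbitrary choices
of mine:

```r
set.seed(1)                 # arbitrary seed, only for reproducibility
x <- sort(runif(10))        # small sample so the steps are easy to inspect
n <- length(x)
Fn <- ecdf(x)               # right-continuous step function

# Offset smaller than any gap between neighbouring points, so that
# x[i] - eps lies strictly between x[i - 1] and x[i]
eps <- min(diff(x)) / 2

at_jump   <- Fn(x)          # value at each order statistic
just_left <- Fn(x - eps)    # value immediately to the left of it

all.equal(at_jump,   (1:n) / n)         # i / n at the jump
all.equal(just_left, (0:(n - 1)) / n)   # (i - 1) / n just before it
```

So the ecdf is i/n at x_(i) and (i - 1)/n immediately before it, which
would be consistent with Dminus comparing F(x_(i)) against the value of
the ecdf just to the left of the jump.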
Thanks,
Gad
--
Gad Abraham
MEng Student, Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabra...@csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help