On 11/7/10 9:17 AM, Mikkel Meyer Andersen wrote:
2010/11/7 Phil Steitz<[email protected]>:
On 11/6/10 12:44 PM, Mikkel Meyer Andersen wrote:
2010/11/6 Phil Steitz (JIRA)<[email protected]>:
[
https://issues.apache.org/jira/browse/MATH-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929054#action_12929054
]
Phil Steitz commented on MATH-431:
----------------------------------
+1 for including both of these tests. Then on to MATH-228
Anything I should do in regard to that?
What we need there is a good algorithm for approximating the KS
distribution. I have been corresponding with the author of a very good one
with a Java implementation but have thus far failed in getting consent to
release under ASL. So at this point, I am looking for an alternative good
algorithm to implement. All suggestions / unencumbered patches welcome!
See the comments on MATH-431 for other questions.
Just to be sure of what you mean:
Do you want to have a two-sample Kolmogorov-Smirnov test for equality
of distributions in addition to the Mann-Whitney? Or do you need the
Kolmogorov-Smirnov distribution (as stated for example at
http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Kolmogorov_distribution
) with regard to MATH-428? Sorry, but I'm a bit confused :-).
The goal is to implement the KS test for equality of distributions
(or homogeneity against a reference distribution). To do that we
need at least critical values of the Kolmogorov distribution. The
natural way for us to do that would be to implement the full
distribution which would be nice to have in the distributions package.
Phil
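For illustration only, here is a minimal sketch of how critical values could be checked against the limiting Kolmogorov distribution, using the standard series K(x) = 1 - 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 x^2). The class and method names are hypothetical and not part of Commons Math, and this large-sample form is an assumption: it is not the finite-n algorithm discussed above and is not accurate for small samples.

```java
/** Hypothetical sketch; not a Commons Math class. */
final class KolmogorovSketch {

    /**
     * Limiting (large-sample) Kolmogorov CDF:
     * K(x) = 1 - 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 x^2).
     */
    static double cdf(double x) {
        // Below this point the CDF underflows to zero in double precision.
        if (x < 0.04) {
            return 0.0;
        }
        double sum = 0.0;
        double sign = 1.0;
        for (int k = 1; k <= 1000; k++) {
            double term = Math.exp(-2.0 * k * k * x * x);
            sum += sign * term;
            if (term < 1e-17) {
                break; // remaining terms are negligible
            }
            sign = -sign;
        }
        // Clamp to [0, 1] to absorb rounding in the alternating sum.
        return Math.max(0.0, Math.min(1.0, 1.0 - 2.0 * sum));
    }
}
```

As a sanity check, cdf(1.358) is close to 0.95, which matches the commonly tabulated asymptotic 5% critical value.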
Interesting approach for the exact algorithm for Wilcoxon. If we stay
with this, we should ack the original author of the algorithm in the
javadoc. Looks OK to use.
Agreed, both on the approach and the legal part! Does the author need to
sign anything, or is writing a mail enough?
Regarding the difference from R, what I usually do in this case is look
at the R sources to try to explain the difference. Most likely in this
case, what is going on is they are using a different estimation algorithm
for small n or treating ties differently. The ranking options that we use
were largely adapted from R, so if that is the problem, it should be easy to
test. We need to convince ourselves that ours is better or at least a
legitimate alternative. I will take a close look this evening, but it looks
like the algorithm you are using should be exact. If we can't reconcile the
difference with R, it would be good to find a way to validate correct
functioning of the algorithm by manufacturing reference data with known p.
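One way to manufacture reference data with known p is to enumerate the exact null distribution of the signed-rank statistic W+ for small n. The sketch below is an assumption about how that could be done, not necessarily the algorithm in the attached patch: it counts, by dynamic programming, the subsets of the ranks {1..n} that sum to each value, assuming no ties and no zero differences. The class and method names are hypothetical.

```java
/** Hypothetical sketch for generating reference p-values; not the attached implementation. */
final class SignedRankExactSketch {

    /** counts[s] = number of subsets of {1..n} whose elements sum to s. */
    static long[] subsetSumCounts(int n) {
        int max = n * (n + 1) / 2;
        long[] counts = new long[max + 1];
        counts[0] = 1; // the empty subset
        for (int r = 1; r <= n; r++) {
            // Iterate downwards so each rank is used at most once.
            for (int s = max; s >= r; s--) {
                counts[s] += counts[s - r];
            }
        }
        return counts;
    }

    /** P(W+ <= w) under H0; intended for small n only (say n <= 30). */
    static double lowerTail(int n, int w) {
        long[] counts = subsetSumCounts(n);
        double cum = 0.0;
        for (int s = 0; s <= Math.min(w, counts.length - 1); s++) {
            cum += counts[s];
        }
        return cum / Math.pow(2.0, n); // 2^n equally likely sign patterns
    }

    /** Two-sided p-value: twice the smaller tail, capped at 1. */
    static double twoSidedP(int n, int w) {
        int max = n * (n + 1) / 2;
        int wMin = Math.min(w, max - w);
        return Math.min(1.0, 2.0 * lowerTail(n, wMin));
    }
}
```

For example, with n = 5 and all differences negative (W+ = 0), only one of the 32 sign patterns is as extreme in that tail, so the two-sided p-value is 2/32 = 0.0625.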
I'll try to investigate the difference, hopefully tomorrow, so that
formal tests can be written and included.
New tests: Wilcoxon signed-rank test and Mann-Whitney U
-------------------------------------------------------
Key: MATH-431
URL: https://issues.apache.org/jira/browse/MATH-431
Project: Commons Math
Issue Type: New Feature
Reporter: Mikkel Meyer Andersen
Assignee: Mikkel Meyer Andersen
Priority: Minor
Attachments: MannWhitneyUTest.java, MannWhitneyUTestImpl.java,
WilcoxonSignedRankTest.java, WilcoxonSignedRankTestImpl.java
Original Estimate: 4h
Remaining Estimate: 4h
Wilcoxon signed-rank test and Mann-Whitney U are commonly used
non-parametric statistical hypothesis tests (e.g. instead of various t-tests
when normality is not present).
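As a rough illustration of what the Mann-Whitney U statistic measures, the sketch below counts, over all pairs, how often a value from the first sample exceeds a value from the second, with ties counted as one half. This is the textbook pairwise definition and is an assumption for illustration; the attached MannWhitneyUTestImpl.java may compute U differently (for example via rank sums). The class name is hypothetical.

```java
/** Hypothetical sketch of the pairwise definition of U; not the attached implementation. */
final class MannWhitneySketch {

    /** U for sample x: number of pairs with x[i] > y[j], ties counted as 0.5. */
    static double uStatistic(double[] x, double[] y) {
        double u = 0.0;
        for (double xi : x) {
            for (double yj : y) {
                if (xi > yj) {
                    u += 1.0;
                } else if (xi == yj) {
                    u += 0.5;
                }
            }
        }
        return u;
    }
}
```

The two statistics obtained by swapping the samples always add up to x.length * y.length, which gives a cheap consistency check.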