[ https://issues.apache.org/jira/browse/MATH-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293633#comment-13293633 ]
Mikkel Meyer Andersen commented on MATH-790:
--------------------------------------------

Thanks for the details. Why not use a double immediately as below? Is it to avoid precision loss?

{noformat}
final double n1n2prod = (double) n1 * n2;

// http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U#Normal_approximation
final double EU = n1n2prod / 2.0;
final double VarU = n1n2prod * (n1 + n2 + 1) / 12.0;

final double z = (Umin - EU) / FastMath.sqrt(VarU);
{noformat}

> Mann-Whitney U Test Suffers From Integer Overflow With Large Data Sets
> ----------------------------------------------------------------------
>
>                 Key: MATH-790
>                 URL: https://issues.apache.org/jira/browse/MATH-790
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.0, Nightly Builds
>         Environment: Ubuntu Linux x64, Sun Java 6
>            Reporter: James Pickering
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>              Labels: newbie, patch
>             Fix For: 3.1
>
>         Attachments: MannWhitnetUOVerflowPatch.diff
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When performing a Mann-Whitney U Test on large data sets (the attached test
> uses two 1500 element sets), intermediate integer values used in
> calculateAsymptoticPValue can overflow, leading to invalid results, such as
> p-values of NaN, or incorrect calculations.
> Attached is a patch, including a test, and a fix, which modifies the affected
> code to use doubles.
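
For reference, a minimal sketch of why the 1500-element case from the report can end in NaN. It assumes the affected code formed n1 * n2 * (n1 + n2 + 1) in int arithmetic before converting to double; the class and method names are hypothetical and are not taken from Commons Math or from the attached patch. Math.sqrt stands in for FastMath.sqrt so the example has no dependencies.

```java
public class MannWhitneyOverflowDemo {

    // Assumed overflow-prone form: the whole product is evaluated as int,
    // and only the (already wrapped) result is converted to double.
    public static double varUInt(int n1, int n2) {
        final int n1n2prod = n1 * n2;
        return (double) (n1n2prod * (n1 + n2 + 1)) / 12.0;
    }

    // Form suggested in the comment above: promote to double before multiplying,
    // so the product is never held in an int.
    public static double varUDouble(int n1, int n2) {
        final double n1n2prod = (double) n1 * n2;
        return n1n2prod * (n1 + n2 + 1) / 12.0;
    }

    public static void main(String[] args) {
        final int n1 = 1500;
        final int n2 = 1500;

        // 1500 * 1500 * 3001 = 6,752,250,000, which exceeds Integer.MAX_VALUE
        // (2,147,483,647), so the int product wraps around to a negative value.
        final double wrapped = varUInt(n1, n2);
        final double exact = varUDouble(n1, n2);

        System.out.println("int path VarU:    " + wrapped);
        System.out.println("double path VarU: " + exact);

        // The square root of a negative variance is NaN, which then
        // propagates into z and the p-value.
        System.out.println("sqrt(int path):   " + Math.sqrt(wrapped));
    }
}
```

With n1 = n2 = 1500 the double path gives a variance of 562,687,500, while the int path gives a negative value whose square root is NaN. Every intermediate here is an integer far below 2^53, so promoting to double early loses no precision at these sizes.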