[ https://issues.apache.org/jira/browse/MATH-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alex Herbert resolved MATH-1627.
--------------------------------
Fix Version/s: 4.0
Resolution: Fixed
Throw an exception if a column or row contains only zeros.
Updated in commit:
21f80081082ce3b31a1bcd8ecae0e3ae9ac70c05
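
As an illustration of the kind of check described above, here is a minimal sketch. It is not the code from the commit. The class name ZeroMarginCheck and the method checkNonZeroMargins are hypothetical, and IllegalArgumentException is only a stand-in for whatever exception type the library actually throws. The sketch assumes a rectangular table of non-negative counts.

```java
public final class ZeroMarginCheck {

    private ZeroMarginCheck() {}

    /**
     * Rejects a two-way table in which any row or any column sums to zero.
     * Hypothetical helper: assumes a rectangular, non-empty table of
     * non-negative counts (other validation is left out of this sketch).
     */
    public static void checkNonZeroMargins(long[][] counts) {
        final int nCols = counts[0].length;
        final long[] colSum = new long[nCols];

        for (int r = 0; r < counts.length; r++) {
            long rowSum = 0;
            for (int c = 0; c < nCols; c++) {
                rowSum += counts[r][c];
                colSum[c] += counts[r][c];
            }
            if (rowSum == 0) {
                // Stand-in exception type, chosen only for this sketch.
                throw new IllegalArgumentException("row " + r + " contains only zeros");
            }
        }
        for (int c = 0; c < nCols; c++) {
            if (colSum[c] == 0) {
                throw new IllegalArgumentException("column " + c + " contains only zeros");
            }
        }
    }

    public static void main(String[] args) {
        checkNonZeroMargins(new long[][] {{10, 20}, {30, 40}}); // passes silently
        try {
            checkNonZeroMargins(new long[2][2]);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A table with a zero row or column always has a zero expected count in some cell, so rejecting it up front also covers the all-zero table from the report.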
> ChiSquareTest computes NaN with zero observations
> -------------------------------------------------
>
> Key: MATH-1627
> URL: https://issues.apache.org/jira/browse/MATH-1627
> Project: Commons Math
> Issue Type: Bug
> Affects Versions: 4.0
> Reporter: Alex Herbert
> Priority: Trivial
> Fix For: 4.0
>
>
> Zero observations input to the ChiSquareTest will compute NaN:
> {code:java}
> ChiSquareTest chi2Test = new ChiSquareTest();
> final long[][] counts = new long[2][2];
> // NaN
> double chi2 = chi2Test.chiSquare(counts);
> {code}
> This is due to a divide-by-zero error. This bug was identified by SonarCloud
> analysis.
> The unit tests use R as a reference. In R this case will raise an error that
> at least one entry must be positive. Setting a value to 1 allows R to compute
> a Chi-square test value but the value is not valid:
> {code:r}
> > m <- array(c(1,0,0,0), dim = c(2,2))
> > chisq.test(m)
> Pearson's Chi-squared test
> data: m
> X-squared = NaN, df = 1, p-value = NA
> Warning message:
> In chisq.test(m) : Chi-squared approximation may be incorrect
> {code}
> Other methods in the ChiSquareTest will raise a ZeroException if the
> observations are zero for an entire array of observations or if a pair of
> observations in a bin are both zero.
> The Chi-square test has assumptions that do not hold when the number of
> observations is small. The limit for the number of observations per category
> is variable. The document referenced in the code javadoc recommends an
> expected level of 5 per bin. To avoid setting limits on the sample size, a
> suggested fix is to raise a ZeroException if the sum of all counts is zero.
> This will avoid a NaN computation. Use of a suitable number of observations
> is left to the caller.
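
To make the NaN in the quoted report concrete, the following is a minimal sketch of the textbook Pearson statistic for a two-way table, with no input validation. It is not the Commons Math source; the class name ChiSquareNaNDemo is hypothetical. Each expected count is rowSum * colSum / total, so an all-zero table gives 0/0 for every expected count, and a table with one zero row or column gives a (0 - 0)^2 / 0 term.

```java
public final class ChiSquareNaNDemo {

    private ChiSquareNaNDemo() {}

    /**
     * Pearson chi-square statistic for a rectangular two-way table.
     * Deliberately unvalidated, to show how zero margins produce NaN.
     */
    public static double chiSquare(long[][] counts) {
        final int nRows = counts.length;
        final int nCols = counts[0].length;
        final double[] rowSum = new double[nRows];
        final double[] colSum = new double[nCols];
        double total = 0;

        for (int r = 0; r < nRows; r++) {
            for (int c = 0; c < nCols; c++) {
                rowSum[r] += counts[r][c];
                colSum[c] += counts[r][c];
                total += counts[r][c];
            }
        }

        double sumSq = 0;
        for (int r = 0; r < nRows; r++) {
            for (int c = 0; c < nCols; c++) {
                // Floating-point 0.0 / 0.0 is NaN, not an exception.
                final double expected = rowSum[r] * colSum[c] / total;
                final double diff = counts[r][c] - expected;
                sumSq += diff * diff / expected;
            }
        }
        return sumSq;
    }

    public static void main(String[] args) {
        // All-zero table from the report: total is 0, every expected count is NaN.
        System.out.println(chiSquare(new long[2][2]));
        // One positive entry, as in the R example: a zero expected count gives 0/0.
        System.out.println(chiSquare(new long[][] {{1, 0}, {0, 0}}));
    }
}
```

Both calls in main print NaN. The second case shows that checking only the grand total would still let a NaN through when a single row or column is empty.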