Source: scipy, dask
Control: found -1 scipy/1.7.1-1
Control: found -1 dask/2021.01.0+dfsg-1
Severity: serious
Tags: sid bookworm
X-Debbugs-CC: [email protected]
User: [email protected]
Usertags: breaks needs-update
Dear maintainer(s),
With a recent upload of scipy the autopkgtest of dask fails in testing
when that autopkgtest is run with the binary packages of scipy from
unstable. It passes when run with only packages from testing. In tabular
form:
                 pass            fail
scipy            from testing    1.7.1-1
dask             from testing    2021.01.0+dfsg-1
all others       from testing    from testing
I copied some of the output at the bottom of this report.
Currently this regression is blocking the migration of scipy to testing
[1]. Due to the nature of this issue, I filed this bug report against
both packages. Can you please investigate the situation and reassign the
bug to the right package?
More information about this bug and the reason for filing it can be found on
https://wiki.debian.org/ContinuousIntegration/RegressionEmailInformation
Paul
[1] https://qa.debian.org/excuses.php?package=scipy
https://ci.debian.net/data/autopkgtest/testing/amd64/d/dask/14750989/log.gz
=================================== FAILURES ===================================
_________________________ test_two[chisquare-kwargs4] __________________________
kind = 'chisquare', kwargs = {}
@pytest.mark.parametrize(
"kind, kwargs",
[
("ttest_ind", {}),
("ttest_ind", {"equal_var": False}),
("ttest_1samp", {}),
("ttest_rel", {}),
("chisquare", {}),
("power_divergence", {}),
("power_divergence", {"lambda_": 0}),
("power_divergence", {"lambda_": -1}),
("power_divergence", {"lambda_": "neyman"}),
],
)
def test_two(kind, kwargs):
a = np.random.random(size=30)
b = np.random.random(size=30)
a_ = da.from_array(a, 3)
b_ = da.from_array(b, 3)
dask_test = getattr(dask.array.stats, kind)
scipy_test = getattr(scipy.stats, kind)
with pytest.warns(None):  # maybe overflow warning (powrer_divergence)
result = dask_test(a_, b_, **kwargs)
> expected = scipy_test(a, b, **kwargs)
/usr/lib/python3/dist-packages/dask/array/tests/test_stats.py:88:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/lib/python3/dist-packages/scipy/stats/stats.py:6852: in chisquare
return power_divergence(f_obs, f_exp=f_exp, ddof=ddof, axis=axis,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
f_obs = array([0.50616202, 0.24047559, 0.47385697, 0.2617551 , 0.39114576,
0.88770032, 0.34302868, 0.14302513, 0.725594...42, 0.59473997,
0.6900566 , 0.53797374, 0.57797282,
0.03873117, 0.25516211, 0.61014154, 0.07151085, 0.45349848])
f_exp = array([0.74205809, 0.02842818, 0.8787515 , 0.73989809, 0.62863789,
0.29149379, 0.35539664, 0.14194145, 0.256345...17, 0.30981448,
0.79533776, 0.38182638, 0.07065248,
0.82259669, 0.67148202, 0.52457137, 0.24092418, 0.9464332 ])
ddof = 0, axis = 0, lambda_ = 1
def power_divergence(f_obs, f_exp=None, ddof=0, axis=0, lambda_=None):
"""Cressie-Read power divergence statistic and goodness of fit test.
This function tests the null hypothesis that the categorical data
has the given frequencies, using the Cressie-Read power divergence
statistic.
Parameters
----------
f_obs : array_like
Observed frequencies in each category.
f_exp : array_like, optional
Expected frequencies in each category. By default the categories are
assumed to be equally likely.
ddof : int, optional
"Delta degrees of freedom": adjustment to the degrees of freedom
for the p-value. The p-value is computed using a chi-squared
distribution with ``k - 1 - ddof`` degrees of freedom, where `k`
is the number of observed frequencies. The default value of `ddof` is 0.
axis : int or None, optional
The axis of the broadcast result of `f_obs` and `f_exp` along which to
apply the test. If axis is None, all values in `f_obs` are treated
as a single data set. Default is 0.
lambda_ : float or str, optional
The power in the Cressie-Read power divergence statistic. The default
is 1. For convenience, `lambda_` may be assigned one of the following
strings, in which case the corresponding numerical value is used::

String               Value  Description
"pearson"              1    Pearson's chi-squared statistic.
                            In this case, the function is
                            equivalent to `stats.chisquare`.
"log-likelihood"       0    Log-likelihood ratio. Also known as
                            the G-test [3]_.
"freeman-tukey"      -1/2   Freeman-Tukey statistic.
"mod-log-likelihood"  -1    Modified log-likelihood ratio.
"neyman"              -2    Neyman's statistic.
"cressie-read"        2/3   The power recommended in [5]_.
Returns
-------
statistic : float or ndarray
The Cressie-Read power divergence test statistic. The value is
a float if `axis` is None or if `f_obs` and `f_exp` are 1-D.
pvalue : float or ndarray
The p-value of the test. The value is a float if `ddof` and the
return value `stat` are scalars.
See Also
--------
chisquare
Notes
-----
This test is invalid when the observed or expected frequencies in each
category are too small. A typical rule is that all of the observed
and expected frequencies should be at least 5.
Also, the sum of the observed and expected frequencies must be the same
for the test to be valid; `power_divergence` raises an error if the sums
do not agree within a relative tolerance of ``1e-8``.
When `lambda_` is less than zero, the formula for the statistic involves
dividing by `f_obs`, so a warning or error may be generated if any value
in `f_obs` is 0.
Similarly, a warning or error may be generated if any value in `f_exp` is
zero when `lambda_` >= 0.
The default degrees of freedom, k-1, are for the case when no parameters
of the distribution are estimated. If p parameters are estimated by
efficient maximum likelihood then the correct degrees of freedom are
k-1-p. If the parameters are estimated in a different way, then the
dof can be between k-1-p and k-1. However, it is also possible that
the asymptotic distribution is not a chisquare, in which case this
test is not appropriate.
This function handles masked arrays. If an element of `f_obs` or `f_exp`
is masked, then data at that position is ignored, and does not count
towards the size of the data set.
.. versionadded:: 0.13.0
References
----------
.. [1] Lowry, Richard. "Concepts and Applications of Inferential
Statistics". Chapter 8.
https://web.archive.org/web/20171015035606/http://faculty.vassar.edu/lowry/ch8pt1.html
.. [2] "Chi-squared test",
https://en.wikipedia.org/wiki/Chi-squared_test
.. [3] "G-test", https://en.wikipedia.org/wiki/G-test
.. [4] Sokal, R. R. and Rohlf, F. J. "Biometry: the principles and
practice of statistics in biological research", New York: Freeman
(1981)
.. [5] Cressie, N. and Read, T. R. C., "Multinomial Goodness-of-Fit
Tests", J. Royal Stat. Soc. Series B, Vol. 46, No. 3 (1984),
pp. 440-464.
Examples
--------
(See `chisquare` for more examples.)
When just `f_obs` is given, it is assumed that the expected frequencies
are uniform and given by the mean of the observed frequencies. Here we
perform a G-test (i.e. use the log-likelihood ratio statistic):
>>> from scipy.stats import power_divergence
>>> power_divergence([16, 18, 16, 14, 12, 12],
...                  lambda_='log-likelihood')
(2.006573162632538, 0.84823476779463769)
The expected frequencies can be given with the `f_exp` argument:
>>> power_divergence([16, 18, 16, 14, 12, 12],
... f_exp=[16, 16, 16, 16, 16, 8],
... lambda_='log-likelihood')
(3.3281031458963746, 0.6495419288047497)
When `f_obs` is 2-D, by default the test is applied to each column.
>>> obs = np.array([[16, 18, 16, 14, 12, 12], [32, 24, 16, 28, 20, 24]]).T
>>> obs.shape
(6, 2)
>>> power_divergence(obs, lambda_="log-likelihood")
(array([ 2.00657316,  6.77634498]), array([ 0.84823477,  0.23781225]))
By setting ``axis=None``, the test is applied to all data in the array,
which is equivalent to applying the test to the flattened array.
>>> power_divergence(obs, axis=None)
(23.31034482758621, 0.015975692534127565)
>>> power_divergence(obs.ravel())
(23.31034482758621, 0.015975692534127565)
`ddof` is the change to make to the default degrees of freedom.
>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=1)
(2.0, 0.73575888234288467)
The calculation of the p-values is done by broadcasting the
test statistic with `ddof`.
>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=[0,1,2])
(2.0, array([ 0.84914504, 0.73575888, 0.5724067 ]))
`f_obs` and `f_exp` are also broadcast. In the following, `f_obs` has
shape (6,) and `f_exp` has shape (2, 6), so the result of broadcasting
`f_obs` and `f_exp` has shape (2, 6). To compute the desired chi-squared
statistics, we must use ``axis=1``:
>>> power_divergence([16, 18, 16, 14, 12, 12],
... f_exp=[[16, 16, 16, 16, 16, 8],
... [8, 20, 20, 16, 12, 12]],
... axis=1)
(array([ 3.5 , 9.25]), array([ 0.62338763, 0.09949846]))
"""
# Convert the input argument `lambda_` to a numerical value.
if isinstance(lambda_, str):
if lambda_ not in _power_div_lambda_names:
names = repr(list(_power_div_lambda_names.keys()))[1:-1]
raise ValueError("invalid string for lambda_: {0!r}. "
"Valid strings are {1}".format(lambda_,
names))
lambda_ = _power_div_lambda_names[lambda_]
elif lambda_ is None:
lambda_ = 1
f_obs = np.asanyarray(f_obs)
f_obs_float = f_obs.astype(np.float64)
if f_exp is not None:
f_exp = np.asanyarray(f_exp)
bshape = _broadcast_shapes(f_obs_float.shape, f_exp.shape)
f_obs_float = _m_broadcast_to(f_obs_float, bshape)
f_exp = _m_broadcast_to(f_exp, bshape)
rtol = 1e-8 # to pass existing tests
with np.errstate(invalid='ignore'):
f_obs_sum = f_obs_float.sum(axis=axis)
f_exp_sum = f_exp.sum(axis=axis)
relative_diff = (np.abs(f_obs_sum - f_exp_sum) /
np.minimum(f_obs_sum, f_exp_sum))
diff_gt_tol = (relative_diff > rtol).any()
if diff_gt_tol:
msg = (f"For each axis slice, the sum of the observed "
f"frequencies must agree with the sum of the "
f"expected frequencies to a relative tolerance "
f"of {rtol}, but the percent differences are:\n"
f"{relative_diff}")
> raise ValueError(msg)
E ValueError: For each axis slice, the sum of the observed
frequencies must agree with the sum of the expected frequencies to a
relative tolerance of 1e-08, but the percent differences are:
E 0.025894044619658885
/usr/lib/python3/dist-packages/scipy/stats/stats.py:6694: ValueError
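The check that trips here appears to be new in the scipy 1.7 series (the tests pass with the scipy currently in testing): `power_divergence`, and therefore `chisquare`, now requires the totals of `f_obs` and `f_exp` to agree within a relative tolerance of 1e-8. A minimal sketch of that check (mirroring the quoted traceback, not scipy's actual code):

```python
import numpy as np

def sums_agree(f_obs, f_exp, rtol=1e-8):
    # Sketch of the consistency check in the traceback above: scipy
    # compares the observed and expected totals to a relative tolerance.
    f_obs = np.asarray(f_obs, dtype=np.float64)
    f_exp = np.asarray(f_exp, dtype=np.float64)
    rel_diff = abs(f_obs.sum() - f_exp.sum()) / min(f_obs.sum(), f_exp.sum())
    return rel_diff <= rtol

rng = np.random.default_rng(0)
a = rng.random(30)
b = rng.random(30)
print(sums_agree(a, a))  # identical totals pass the check
print(sums_agree(a, b))  # independent random draws almost surely fail it
```

Since the dask test feeds two independent uniform samples as `f_obs` and `f_exp`, their totals essentially never match, which is exactly what the ValueError above reports.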
______________________ test_two[power_divergence-kwargs5] ______________________
kind = 'power_divergence', kwargs = {}
f_obs = array([0.48154045, 0.91770608, 0.79433273, 0.677548 , 0.5172343 ,
0.97963439, 0.05970396, 0.87044663, 0.047389...74, 0.51919702,
0.17786529, 0.30872526, 0.16799736,
0.3470901 , 0.60993241, 0.97491444, 0.6667405 , 0.40817497])
f_exp = array([0.34841225, 0.36222138, 0.90096367, 0.34009505, 0.06369394,
0.50103548, 0.08646673, 0.08211717, 0.962662...77, 0.17116813,
0.62746241, 0.11182426, 0.35465983,
0.40134001, 0.23779315, 0.98321459, 0.06142563, 0.88909906])
ddof = 0, axis = 0, lambda_ = 1
E ValueError: For each axis slice, the sum of the observed
frequencies must agree with the sum of the expected frequencies to a
relative tolerance of 1e-08, but the percent differences are:
E 0.0255236948989553
/usr/lib/python3/dist-packages/scipy/stats/stats.py:6694: ValueError
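One way the dask test data could be made compatible with the new scipy check is to rescale the second sample so both totals agree (an illustration only, with hypothetical sample names `a` and `b` standing in for the test's random arrays; the dask maintainers may well choose a different fix):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(30)
b = rng.random(30)

# Rescale the second sample so both totals match; the pair then
# satisfies the observed/expected sum check raised in the log above.
b_rescaled = b * (a.sum() / b.sum())
print(np.isclose(a.sum(), b_rescaled.sum()))  # True
```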
______________________ test_two[power_divergence-kwargs6] ______________________
kind = 'power_divergence', kwargs = {'lambda_': 0}
f_obs = array([0.63467182, 0.96048488, 0.58221072, 0.71074857, 0.85604392,
0.80233744, 0.19603734, 0.51923681, 0.334861...59, 0.96676095,
0.6747112 , 0.15166506, 0.64653236,
0.11540545, 0.16450541, 0.11795982, 0.41130776, 0.97301235])
f_exp = array([0.13534693, 0.12792919, 0.16689546, 0.63389176, 0.68782358,
0.6177741 , 0.026473 , 0.58817575, 0.044479...02, 0.1481515 ,
0.12261963, 0.84576878, 0.93908736,
0.44969247, 0.14905221, 0.54827366, 0.15204124, 0.30575429])
ddof = 0, axis = 0, lambda_ = 0
E ValueError: For each axis slice, the sum of the observed
frequencies must agree with the sum of the expected frequencies to a
relative tolerance of 1e-08, but the percent differences are:
E 0.11828134694186175
/usr/lib/python3/dist-packages/scipy/stats/stats.py:6694: ValueError
______________________ test_two[power_divergence-kwargs7] ______________________
kind = 'power_divergence', kwargs = {'lambda_': -1}
f_obs = array([0.93323015, 0.06847719, 0.04883692, 0.11047738, 0.46259841,
0.20786193, 0.9073499 , 0.35431746, 0.389861...12, 0.75042707,
0.04320992, 0.58871239, 0.60498304,
0.21119014, 0.07340626, 0.64619613, 0.78444948, 0.57526428])
f_exp = array([0.24077443, 0.18040326, 0.14252158, 0.50056972, 0.11418123,
0.06264705, 0.12162502, 0.08841781, 0.660959...98, 0.81831997,
0.46537992, 0.0474171 , 0.87974671,
0.88955421, 0.94621617, 0.81090708, 0.89923667, 0.95013257])
ddof = 0, axis = 0, lambda_ = -1
def power_divergence(f_obs, f_exp=None, ddof=0, axis=0, lambda_=None):
"""Cressie-Read power divergence statistic and goodness of fit test.
This function tests the null hypothesis that the categorical data
has the given frequencies, using the Cressie-Read power divergence
statistic.
Parameters
----------
f_obs : array_like
Observed frequencies in each category.
f_exp : array_like, optional
Expected frequencies in each category. By default the
categories are
assumed to be equally likely.
ddof : int, optional
"Delta degrees of freedom": adjustment to the degrees of freedom
for the p-value. The p-value is computed using a chi-squared
distribution with ``k - 1 - ddof`` degrees of freedom, where `k`
is the number of observed frequencies. The default value of
`ddof`
is 0.
axis : int or None, optional
The axis of the broadcast result of `f_obs` and `f_exp`
along which to
apply the test. If axis is None, all values in `f_obs` are
treated
as a single data set. Default is 0.
lambda_ : float or str, optional
The power in the Cressie-Read power divergence statistic.
The default
is 1. For convenience, `lambda_` may be assigned one of the
following
strings, in which case the corresponding numerical value is
used::
String Value Description
"pearson" 1 Pearson's chi-squared statistic.
In this case, the function is
equivalent to `stats.chisquare`.
"log-likelihood" 0 Log-likelihood ratio. Also
known as
the G-test [3]_.
"freeman-tukey" -1/2 Freeman-Tukey statistic.
"mod-log-likelihood" -1 Modified log-likelihood ratio.
"neyman" -2 Neyman's statistic.
"cressie-read" 2/3 The power recommended in [5]_.
Returns
-------
statistic : float or ndarray
The Cressie-Read power divergence test statistic. The value is
a float if `axis` is None or if` `f_obs` and `f_exp` are 1-D.
pvalue : float or ndarray
The p-value of the test. The value is a float if `ddof` and the
return value `stat` are scalars.
See Also
--------
chisquare
Notes
-----
This test is invalid when the observed or expected frequencies in each
category are too small. A typical rule is that all of the observed
and expected frequencies should be at least 5.
Also, the sum of the observed and expected frequencies must be the same
for the test to be valid; `power_divergence` raises an error if the sums
do not agree within a relative tolerance of ``1e-8``.
When `lambda_` is less than zero, the formula for the statistic involves
dividing by `f_obs`, so a warning or error may be generated if any value
in `f_obs` is 0.
Similarly, a warning or error may be generated if any value in `f_exp` is
zero when `lambda_` >= 0.
The default degrees of freedom, k-1, are for the case when no parameters
of the distribution are estimated. If p parameters are estimated by
efficient maximum likelihood then the correct degrees of freedom are
k-1-p. If the parameters are estimated in a different way, then the
dof can be between k-1-p and k-1. However, it is also possible that
the asymptotic distribution is not a chisquare, in which case this
test is not appropriate.
This function handles masked arrays. If an element of `f_obs` or `f_exp`
is masked, then data at that position is ignored, and does not count
towards the size of the data set.
.. versionadded:: 0.13.0
References
----------
.. [1] Lowry, Richard. "Concepts and Applications of Inferential
Statistics". Chapter 8.
https://web.archive.org/web/20171015035606/http://faculty.vassar.edu/lowry/ch8pt1.html
.. [2] "Chi-squared test",
https://en.wikipedia.org/wiki/Chi-squared_test
.. [3] "G-test", https://en.wikipedia.org/wiki/G-test
.. [4] Sokal, R. R. and Rohlf, F. J. "Biometry: the principles and
       practice of statistics in biological research", New York: Freeman
       (1981)
.. [5] Cressie, N. and Read, T. R. C., "Multinomial Goodness-of-Fit
Tests", J. Royal Stat. Soc. Series B, Vol. 46, No. 3 (1984),
pp. 440-464.
Examples
--------
(See `chisquare` for more examples.)
When just `f_obs` is given, it is assumed that the expected frequencies
are uniform and given by the mean of the observed frequencies. Here we
perform a G-test (i.e. use the log-likelihood ratio statistic):
>>> from scipy.stats import power_divergence
>>> power_divergence([16, 18, 16, 14, 12, 12],
...                  lambda_='log-likelihood')
(2.006573162632538, 0.84823476779463769)
The expected frequencies can be given with the `f_exp` argument:
>>> power_divergence([16, 18, 16, 14, 12, 12],
... f_exp=[16, 16, 16, 16, 16, 8],
... lambda_='log-likelihood')
(3.3281031458963746, 0.6495419288047497)
When `f_obs` is 2-D, by default the test is applied to each column.
>>> obs = np.array([[16, 18, 16, 14, 12, 12], [32, 24, 16, 28, 20, 24]]).T
>>> obs.shape
(6, 2)
>>> power_divergence(obs, lambda_="log-likelihood")
(array([ 2.00657316,  6.77634498]), array([ 0.84823477,  0.23781225]))
By setting ``axis=None``, the test is applied to all data in the array,
which is equivalent to applying the test to the flattened array.
>>> power_divergence(obs, axis=None)
(23.31034482758621, 0.015975692534127565)
>>> power_divergence(obs.ravel())
(23.31034482758621, 0.015975692534127565)
`ddof` is the change to make to the default degrees of freedom.
>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=1)
(2.0, 0.73575888234288467)
The calculation of the p-values is done by broadcasting the
test statistic with `ddof`.
>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=[0,1,2])
(2.0, array([ 0.84914504, 0.73575888, 0.5724067 ]))
`f_obs` and `f_exp` are also broadcast. In the following, `f_obs` has
shape (6,) and `f_exp` has shape (2, 6), so the result of broadcasting
`f_obs` and `f_exp` has shape (2, 6). To compute the desired chi-squared
statistics, we must use ``axis=1``:
>>> power_divergence([16, 18, 16, 14, 12, 12],
... f_exp=[[16, 16, 16, 16, 16, 8],
... [8, 20, 20, 16, 12, 12]],
... axis=1)
(array([ 3.5 , 9.25]), array([ 0.62338763, 0.09949846]))
"""
# Convert the input argument `lambda_` to a numerical value.
if isinstance(lambda_, str):
if lambda_ not in _power_div_lambda_names:
names = repr(list(_power_div_lambda_names.keys()))[1:-1]
raise ValueError("invalid string for lambda_: {0!r}. "
"Valid strings are {1}".format(lambda_,
names))
lambda_ = _power_div_lambda_names[lambda_]
elif lambda_ is None:
lambda_ = 1
f_obs = np.asanyarray(f_obs)
f_obs_float = f_obs.astype(np.float64)
if f_exp is not None:
f_exp = np.asanyarray(f_exp)
bshape = _broadcast_shapes(f_obs_float.shape, f_exp.shape)
f_obs_float = _m_broadcast_to(f_obs_float, bshape)
f_exp = _m_broadcast_to(f_exp, bshape)
rtol = 1e-8 # to pass existing tests
with np.errstate(invalid='ignore'):
f_obs_sum = f_obs_float.sum(axis=axis)
f_exp_sum = f_exp.sum(axis=axis)
relative_diff = (np.abs(f_obs_sum - f_exp_sum) /
np.minimum(f_obs_sum, f_exp_sum))
diff_gt_tol = (relative_diff > rtol).any()
if diff_gt_tol:
msg = (f"For each axis slice, the sum of the observed "
f"frequencies must agree with the sum of the "
f"expected frequencies to a relative tolerance "
f"of {rtol}, but the percent differences are:\n"
f"{relative_diff}")
> raise ValueError(msg)
E ValueError: For each axis slice, the sum of the observed
frequencies must agree with the sum of the expected frequencies to a
relative tolerance of 1e-08, but the percent differences are:
E 0.11063329565866321
/usr/lib/python3/dist-packages/scipy/stats/stats.py:6694: ValueError
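The ValueError comes from the sum-agreement check in the scipy source quoted above. A minimal standalone sketch of that logic (pure NumPy; `check_sums` is a hypothetical helper written here for illustration, not a scipy API) shows why two independent random samples, as generated by the dask test, almost surely trip it:

```python
import numpy as np

def check_sums(f_obs, f_exp, rtol=1e-8):
    # Mirrors the check quoted from scipy.stats.power_divergence: the sums
    # of observed and expected frequencies must agree to a relative
    # tolerance of rtol, otherwise a ValueError is raised.
    f_obs_sum = np.asarray(f_obs, dtype=np.float64).sum()
    f_exp_sum = np.asarray(f_exp, dtype=np.float64).sum()
    relative_diff = abs(f_obs_sum - f_exp_sum) / min(f_obs_sum, f_exp_sum)
    if relative_diff > rtol:
        raise ValueError(f"sums differ: relative difference {relative_diff}")

rng = np.random.default_rng(0)
a = rng.random(30)  # plays the role of f_obs in the dask test
b = rng.random(30)  # plays the role of f_exp

try:
    check_sums(a, b)
    raised = False
except ValueError:
    raised = True
print(raised)  # True: independent uniform samples have unequal sums
```

This mirrors the dask test setup, where both arguments are independent draws of `np.random.random(size=30)`, so their sums practically never agree to within ``1e-8``.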
______________________ test_two[power_divergence-kwargs8]
______________________
kind = 'power_divergence', kwargs = {'lambda_': 'neyman'}
@pytest.mark.parametrize(
"kind, kwargs",
[
("ttest_ind", {}),
("ttest_ind", {"equal_var": False}),
("ttest_1samp", {}),
("ttest_rel", {}),
("chisquare", {}),
("power_divergence", {}),
("power_divergence", {"lambda_": 0}),
("power_divergence", {"lambda_": -1}),
("power_divergence", {"lambda_": "neyman"}),
],
)
def test_two(kind, kwargs):
a = np.random.random(size=30)
b = np.random.random(size=30)
a_ = da.from_array(a, 3)
b_ = da.from_array(b, 3)
dask_test = getattr(dask.array.stats, kind)
scipy_test = getattr(scipy.stats, kind)
with pytest.warns(None):  # maybe overflow warning (power_divergence)
result = dask_test(a_, b_, **kwargs)
> expected = scipy_test(a, b, **kwargs)
/usr/lib/python3/dist-packages/dask/array/tests/test_stats.py:88:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
f_obs = array([8.77388479e-01, 9.60149501e-01, 4.47046846e-01,
8.40131926e-01,
2.53213006e-01, 7.75601859e-01, 9.242306...546e-01,
9.89797308e-01, 3.58514314e-01, 1.74643187e-01, 8.59181200e-01,
8.11582172e-01, 8.30851765e-04])
f_exp = array([0.38556923, 0.78153729, 0.45783246, 0.6889513 , 0.85721604,
0.6226383 , 0.76431552, 0.6329189 , 0.037484...69, 0.16427409,
0.00692481, 0.91507232, 0.41494886,
0.90695644, 0.57371155, 0.36052197, 0.61507666, 0.87718468])
ddof = 0, axis = 0, lambda_ = -2
def power_divergence(f_obs, f_exp=None, ddof=0, axis=0, lambda_=None):
> raise ValueError(msg)
E ValueError: For each axis slice, the sum of the observed
frequencies must agree with the sum of the expected frequencies to a
relative tolerance of 1e-08, but the percent differences are:
E 0.1911078274656955
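Since the failure lies in the test's inputs rather than in scipy's computation, one possible adjustment on the dask side would be to rescale the second sample so the sums agree before calling the test. This is a sketch of the idea only, under the assumption that the check is the only obstacle; it is not the patch actually applied to either package:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(30)  # observed frequencies
b = rng.random(30)  # expected frequencies, independent of a

# Rescale the expected frequencies so their sum matches the observed sum,
# satisfying the relative-tolerance check (rtol=1e-8) in newer scipy.
b_scaled = b * (a.sum() / b.sum())

relative_diff = abs(a.sum() - b_scaled.sum()) / min(a.sum(), b_scaled.sum())
print(relative_diff <= 1e-8)  # True
```

With the sums matched, `scipy.stats.power_divergence(a, f_exp=b_scaled, ...)` would no longer raise the ValueError shown above.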