Source: scipy, dask
Control: found -1 scipy/1.7.1-1
Control: found -1 dask/2021.01.0+dfsg-1
Severity: serious
Tags: sid bookworm
X-Debbugs-CC: [email protected]
User: [email protected]
Usertags: breaks needs-update
Dear maintainer(s),
With a recent upload of scipy the autopkgtest of dask fails in testing
when that autopkgtest is run with the binary packages of scipy from
unstable. It passes when run with only packages from testing. In tabular
form:
                 pass            fail
scipy            from testing    1.7.1-1
dask             from testing    2021.01.0+dfsg-1
all others       from testing    from testing
I copied some of the output at the bottom of this report.
Currently this regression is blocking the migration of scipy to testing
[1]. Due to the nature of this issue, I filed this bug report against
both packages. Can you please investigate the situation and reassign the
bug to the right package?
More information about this bug and the reason for filing it can be found on
https://wiki.debian.org/ContinuousIntegration/RegressionEmailInformation
Paul
[1] https://qa.debian.org/excuses.php?package=scipy
https://ci.debian.net/data/autopkgtest/testing/amd64/d/dask/14750989/log.gz
=================================== FAILURES ===================================
_________________________ test_two[chisquare-kwargs4] __________________________
kind = 'chisquare', kwargs = {}
@pytest.mark.parametrize(
"kind, kwargs",
[
("ttest_ind", {}),
("ttest_ind", {"equal_var": False}),
("ttest_1samp", {}),
("ttest_rel", {}),
("chisquare", {}),
("power_divergence", {}),
("power_divergence", {"lambda_": 0}),
("power_divergence", {"lambda_": -1}),
("power_divergence", {"lambda_": "neyman"}),
],
)
def test_two(kind, kwargs):
a = np.random.random(size=30)
b = np.random.random(size=30)
a_ = da.from_array(a, 3)
b_ = da.from_array(b, 3)
dask_test = getattr(dask.array.stats, kind)
scipy_test = getattr(scipy.stats, kind)
with pytest.warns(None):  # maybe overflow warning (powrer_divergence)
result = dask_test(a_, b_, **kwargs)
> expected = scipy_test(a, b, **kwargs)
/usr/lib/python3/dist-packages/dask/array/tests/test_stats.py:88:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/lib/python3/dist-packages/scipy/stats/stats.py:6852: in chisquare
return power_divergence(f_obs, f_exp=f_exp, ddof=ddof, axis=axis,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
f_obs = array([0.50616202, 0.24047559, 0.47385697, 0.2617551 , 0.39114576,
0.88770032, 0.34302868, 0.14302513, 0.725594...42, 0.59473997,
0.6900566 , 0.53797374, 0.57797282,
0.03873117, 0.25516211, 0.61014154, 0.07151085, 0.45349848])
f_exp = array([0.74205809, 0.02842818, 0.8787515 , 0.73989809, 0.62863789,
0.29149379, 0.35539664, 0.14194145, 0.256345...17, 0.30981448,
0.79533776, 0.38182638, 0.07065248,
0.82259669, 0.67148202, 0.52457137, 0.24092418, 0.9464332 ])
ddof = 0, axis = 0, lambda_ = 1
def power_divergence(f_obs, f_exp=None, ddof=0, axis=0, lambda_=None):
"""Cressie-Read power divergence statistic and goodness of fit test.
This function tests the null hypothesis that the categorical data
has the given frequencies, using the Cressie-Read power divergence
statistic.
Parameters
----------
f_obs : array_like
Observed frequencies in each category.
f_exp : array_like, optional
Expected frequencies in each category. By default the categories are
assumed to be equally likely.
ddof : int, optional
"Delta degrees of freedom": adjustment to the degrees of freedom
for the p-value. The p-value is computed using a chi-squared
distribution with ``k - 1 - ddof`` degrees of freedom, where `k`
is the number of observed frequencies. The default value of `ddof` is 0.
axis : int or None, optional
The axis of the broadcast result of `f_obs` and `f_exp` along which to
apply the test. If axis is None, all values in `f_obs` are treated
as a single data set. Default is 0.
lambda_ : float or str, optional
The power in the Cressie-Read power divergence statistic. The default
is 1. For convenience, `lambda_` may be assigned one of the following
strings, in which case the corresponding numerical value is used::

String               Value  Description
"pearson"              1    Pearson's chi-squared statistic.
                            In this case, the function is
                            equivalent to `stats.chisquare`.
"log-likelihood"       0    Log-likelihood ratio. Also known as
                            the G-test [3]_.
"freeman-tukey"      -1/2   Freeman-Tukey statistic.
"mod-log-likelihood"  -1    Modified log-likelihood ratio.
"neyman"              -2    Neyman's statistic.
"cressie-read"        2/3   The power recommended in [5]_.
Returns
-------
statistic : float or ndarray
The Cressie-Read power divergence test statistic. The value is
a float if `axis` is None or if `f_obs` and `f_exp` are 1-D.
pvalue : float or ndarray
The p-value of the test. The value is a float if `ddof` and the
return value `stat` are scalars.
See Also
--------
chisquare
Notes
-----
This test is invalid when the observed or expected frequencies in each
category are too small. A typical rule is that all of the observed
and expected frequencies should be at least 5.
Also, the sum of the observed and expected frequencies must be the same
for the test to be valid; `power_divergence` raises an error if the sums
do not agree within a relative tolerance of ``1e-8``.
When `lambda_` is less than zero, the formula for the statistic involves
dividing by `f_obs`, so a warning or error may be generated if any value
in `f_obs` is 0.
Similarly, a warning or error may be generated if any value in `f_exp` is
zero when `lambda_` >= 0.
The default degrees of freedom, k-1, are for the case when no parameters
of the distribution are estimated. If p parameters are estimated by
efficient maximum likelihood then the correct degrees of freedom are
k-1-p. If the parameters are estimated in a different way, then the
dof can be between k-1-p and k-1. However, it is also possible that
the asymptotic distribution is not a chisquare, in which case this
test is not appropriate.
This function handles masked arrays. If an element of `f_obs` or `f_exp`
is masked, then data at that position is ignored, and does not count
towards the size of the data set.
.. versionadded:: 0.13.0
References
----------
.. [1] Lowry, Richard. "Concepts and Applications of Inferential
Statistics". Chapter 8.
https://web.archive.org/web/20171015035606/http://faculty.vassar.edu/lowry/ch8pt1.html
.. [2] "Chi-squared test",
https://en.wikipedia.org/wiki/Chi-squared_test
.. [3] "G-test", https://en.wikipedia.org/wiki/G-test
.. [4] Sokal, R. R. and Rohlf, F. J. "Biometry: the principles and
practice of statistics in biological research", New York: Freeman
(1981)
.. [5] Cressie, N. and Read, T. R. C., "Multinomial Goodness-of-Fit
Tests", J. Royal Stat. Soc. Series B, Vol. 46, No. 3 (1984),
pp. 440-464.
Examples
--------
(See `chisquare` for more examples.)
When just `f_obs` is given, it is assumed that the expected frequencies
are uniform and given by the mean of the observed frequencies. Here we
perform a G-test (i.e. use the log-likelihood ratio statistic):
>>> from scipy.stats import power_divergence
>>> power_divergence([16, 18, 16, 14, 12, 12],
...                  lambda_='log-likelihood')
(2.006573162632538, 0.84823476779463769)
The expected frequencies can be given with the `f_exp` argument:
>>> power_divergence([16, 18, 16, 14, 12, 12],
... f_exp=[16, 16, 16, 16, 16, 8],
... lambda_='log-likelihood')
(3.3281031458963746, 0.6495419288047497)
When `f_obs` is 2-D, by default the test is applied to each column.
>>> obs = np.array([[16, 18, 16, 14, 12, 12], [32, 24, 16, 28, 20, 24]]).T
>>> obs.shape
(6, 2)
>>> power_divergence(obs, lambda_="log-likelihood")
(array([ 2.00657316,  6.77634498]), array([ 0.84823477,  0.23781225]))
By setting ``axis=None``, the test is applied to all data in the array,
which is equivalent to applying the test to the flattened array.
>>> power_divergence(obs, axis=None)
(23.31034482758621, 0.015975692534127565)
>>> power_divergence(obs.ravel())
(23.31034482758621, 0.015975692534127565)
`ddof` is the change to make to the default degrees of freedom.
>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=1)
(2.0, 0.73575888234288467)
The calculation of the p-values is done by broadcasting the
test statistic with `ddof`.
>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=[0,1,2])
(2.0, array([ 0.84914504, 0.73575888, 0.5724067 ]))
`f_obs` and `f_exp` are also broadcast. In the following, `f_obs` has
shape (6,) and `f_exp` has shape (2, 6), so the result of broadcasting
`f_obs` and `f_exp` has shape (2, 6). To compute the desired chi-squared
statistics, we must use ``axis=1``:
>>> power_divergence([16, 18, 16, 14, 12, 12],
... f_exp=[[16, 16, 16, 16, 16, 8],
... [8, 20, 20, 16, 12, 12]],
... axis=1)
(array([ 3.5 , 9.25]), array([ 0.62338763, 0.09949846]))
"""
# Convert the input argument `lambda_` to a numerical value.
if isinstance(lambda_, str):
if lambda_ not in _power_div_lambda_names:
names = repr(list(_power_div_lambda_names.keys()))[1:-1]
raise ValueError("invalid string for lambda_: {0!r}. "
"Valid strings are {1}".format(lambda_,
names))
lambda_ = _power_div_lambda_names[lambda_]
elif lambda_ is None:
lambda_ = 1
f_obs = np.asanyarray(f_obs)
f_obs_float = f_obs.astype(np.float64)
if f_exp is not None:
f_exp = np.asanyarray(f_exp)
bshape = _broadcast_shapes(f_obs_float.shape, f_exp.shape)
f_obs_float = _m_broadcast_to(f_obs_float, bshape)
f_exp = _m_broadcast_to(f_exp, bshape)
rtol = 1e-8 # to pass existing tests
with np.errstate(invalid='ignore'):
f_obs_sum = f_obs_float.sum(axis=axis)
f_exp_sum = f_exp.sum(axis=axis)
relative_diff = (np.abs(f_obs_sum - f_exp_sum) /
np.minimum(f_obs_sum, f_exp_sum))
diff_gt_tol = (relative_diff > rtol).any()
if diff_gt_tol:
msg = (f"For each axis slice, the sum of the observed "
f"frequencies must agree with the sum of the "
f"expected frequencies to a relative tolerance "
f"of {rtol}, but the percent differences are:\n"
f"{relative_diff}")
> raise ValueError(msg)
E ValueError: For each axis slice, the sum of the observed
frequencies must agree with the sum of the expected frequencies to a
relative tolerance of 1e-08, but the percent differences are:
E 0.025894044619658885
/usr/lib/python3/dist-packages/scipy/stats/stats.py:6694: ValueError
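The check that trips here appears to be new in the scipy 1.7 series (the tests pass with the scipy currently in testing): `power_divergence`, and therefore `chisquare`, now requires the totals of `f_obs` and `f_exp` to agree within a relative tolerance of 1e-8. A minimal sketch of that check (mirroring the quoted traceback, not scipy's actual code):

```python
import numpy as np

def sums_agree(f_obs, f_exp, rtol=1e-8):
    # Sketch of the consistency check in the traceback above: scipy
    # compares the observed and expected totals to a relative tolerance.
    f_obs = np.asarray(f_obs, dtype=np.float64)
    f_exp = np.asarray(f_exp, dtype=np.float64)
    rel_diff = abs(f_obs.sum() - f_exp.sum()) / min(f_obs.sum(), f_exp.sum())
    return rel_diff <= rtol

rng = np.random.default_rng(0)
a = rng.random(30)
b = rng.random(30)
print(sums_agree(a, a))  # identical totals pass the check
print(sums_agree(a, b))  # independent random draws almost surely fail it
```

Since the dask test feeds two independent uniform samples as `f_obs` and `f_exp`, their totals essentially never match, which is exactly what the ValueError above reports.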
______________________ test_two[power_divergence-kwargs5] ______________________
kind = 'power_divergence', kwargs = {}
f_obs = array([0.48154045, 0.91770608, 0.79433273, 0.677548 , 0.5172343 ,
0.97963439, 0.05970396, 0.87044663, 0.047389...74, 0.51919702,
0.17786529, 0.30872526, 0.16799736,
0.3470901 , 0.60993241, 0.97491444, 0.6667405 , 0.40817497])
f_exp = array([0.34841225, 0.36222138, 0.90096367, 0.34009505, 0.06369394,
0.50103548, 0.08646673, 0.08211717, 0.962662...77, 0.17116813,
0.62746241, 0.11182426, 0.35465983,
0.40134001, 0.23779315, 0.98321459, 0.06142563, 0.88909906])
ddof = 0, axis = 0, lambda_ = 1
E ValueError: For each axis slice, the sum of the observed
frequencies must agree with the sum of the expected frequencies to a
relative tolerance of 1e-08, but the percent differences are:
E 0.0255236948989553
/usr/lib/python3/dist-packages/scipy/stats/stats.py:6694: ValueError
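One way the dask test data could be made compatible with the new scipy check is to rescale the second sample so both totals agree (an illustration only, with hypothetical sample names `a` and `b` standing in for the test's random arrays; the dask maintainers may well choose a different fix):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(30)
b = rng.random(30)

# Rescale the second sample so both totals match; the pair then
# satisfies the observed/expected sum check raised in the log above.
b_rescaled = b * (a.sum() / b.sum())
print(np.isclose(a.sum(), b_rescaled.sum()))  # True
```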
______________________ test_two[power_divergence-kwargs6] ______________________
kind = 'power_divergence', kwargs = {'lambda_': 0}
f_obs = array([0.63467182, 0.96048488, 0.58221072, 0.71074857, 0.85604392,
0.80233744, 0.19603734, 0.51923681, 0.334861...59, 0.96676095,
0.6747112 , 0.15166506, 0.64653236,
0.11540545, 0.16450541, 0.11795982, 0.41130776, 0.97301235])
f_exp = array([0.13534693, 0.12792919, 0.16689546, 0.63389176, 0.68782358,
0.6177741 , 0.026473 , 0.58817575, 0.044479...02, 0.1481515 ,
0.12261963, 0.84576878, 0.93908736,
0.44969247, 0.14905221, 0.54827366, 0.15204124, 0.30575429])
ddof = 0, axis = 0, lambda_ = 0
E ValueError: For each axis slice, the sum of the observed
frequencies must agree with the sum of the expected frequencies to a
relative tolerance of 1e-08, but the percent differences are:
E 0.11828134694186175
/usr/lib/python3/dist-packages/scipy/stats/stats.py:6694: ValueError
______________________ test_two[power_divergence-kwargs7] ______________________
kind = 'power_divergence', kwargs = {'lambda_': -1}
f_obs = array([0.93323015, 0.06847719, 0.04883692, 0.11047738, 0.46259841,
0.20786193, 0.9073499 , 0.35431746, 0.389861...12, 0.75042707,
0.04320992, 0.58871239, 0.60498304,
0.21119014, 0.07340626, 0.64619613, 0.78444948, 0.57526428])
f_exp = array([0.24077443, 0.18040326, 0.14252158, 0.50056972, 0.11418123,
0.06264705, 0.12162502, 0.08841781, 0.660959...98, 0.81831997,
0.46537992, 0.0474171 , 0.87974671,
0.88955421, 0.94621617, 0.81090708, 0.89923667, 0.95013257])
ddof = 0, axis = 0, lambda_ = -1
def power_divergence(f_obs, f_exp=None, ddof=0, axis=0, lambda_=None):
"""Cressie-Read power divergence statistic and goodness of fit test.
This function tests the null hypothesis that the categorical data
has the given frequencies, using the Cressie-Read power divergence
statistic.
Parameters
----------
f_obs : array_like
Observed frequencies in each category.
f_exp : array_like, optional
Expected frequencies in each category. By default the
categories are
assumed to be equally likely.
ddof : int, optional
"Delta degrees of freedom": adjustment to the degrees of freedom
for the p-value. The p-value is computed using a chi-squared
distribution with ``k - 1 - ddof`` degrees of freedom, where `k`
is the number of observed frequencies. The default value of
`ddof`
is 0.
axis : int or None, optional
The axis of the broadcast result of `f_obs` and `f_exp`
along which to
apply the test. If axis is None, all values in `f_obs` are
treated
as a single data set. Default is 0.
lambda_ : float or str, optional
The power in the Cressie-Read power divergence statistic.
The default
is 1. For convenience, `lambda_` may be assigned one of the
following
strings, in which case the corresponding numerical value is
used::
String Value Description
"pearson" 1 Pearson's chi-squared statistic.
In this case, the function is
equivalent to `stats.chisquare`.
"log-likelihood" 0 Log-likelihood ratio. Also
known as
the G-test [3]_.
"freeman-tukey" -1/2 Freeman-Tukey statistic.
"mod-log-likelihood" -1 Modified log-likelihood ratio.
"neyman" -2 Neyman's statistic.
"cressie-read" 2/3 The power recommended in [5]_.
Returns
-------
statistic : float or ndarray
The Cressie-Read power divergence test statistic. The value is
a float if `axis` is None or if` `f_obs` and `f_exp` are 1-D.
pvalue : float or ndarray
The p-value of the test. The value is a float if `ddof` and the
return value `stat` are scalars.
See Also
--------
chisquare
Notes
-----
This test is invalid when the observed or expected frequencies in each
category are too small. A typical rule is that all of the observed
and expected frequencies should be at least 5.
Also, the sum of the observed and expected frequencies must be the same
for the test to be valid; `power_divergence` raises an error if the sums
do not agree within a relative tolerance of ``1e-8``.
When `lambda_` is less than zero, the formula for the statistic involves
dividing by `f_obs`, so a warning or error may be generated if any value
in `f_obs` is 0.
Similarly, a warning or error may be generated if any value in `f_exp` is
zero when `lambda_` >= 0.
The default degrees of freedom, k-1, are for the case when no parameters
of the distribution are estimated. If p parameters are estimated by
efficient maximum likelihood then the correct degrees of freedom are
k-1-p. If the parameters are estimated in a different way, then the
dof can be between k-1-p and k-1. However, it is also possible that
the asymptotic distribution is not a chisquare, in which case this
test is not appropriate.
This function handles masked arrays. If an element of `f_obs` or `f_exp`
is masked, then data at that position is ignored, and does not count
towards the size of the data set.
.. versionadded:: 0.13.0
References
----------
.. [1] Lowry, Richard. "Concepts and Applications of Inferential
Statistics". Chapter 8.
https://web.archive.org/web/20171015035606/http://faculty.vassar.edu/lowry/ch8pt1.html
.. [2] "Chi-squared test",
https://en.wikipedia.org/wiki/Chi-squared_test
.. [3] "G-test", https://en.wikipedia.org/wiki/G-test
.. [4] Sokal, R. R. and Rohlf, F. J. "Biometry: the principles and
       practice of statistics in biological research", New York: Freeman
       (1981)
.. [5] Cressie, N. and Read, T. R. C., "Multinomial Goodness-of-Fit
Tests", J. Royal Stat. Soc. Series B, Vol. 46, No. 3 (1984),
pp. 440-464.
Examples
--------
(See `chisquare` for more examples.)
When just `f_obs` is given, it is assumed that the expected frequencies
are uniform and given by the mean of the observed frequencies. Here we
perform a G-test (i.e. use the log-likelihood ratio statistic):
>>> from scipy.stats import power_divergence
>>> power_divergence([16, 18, 16, 14, 12, 12],
...                  lambda_='log-likelihood')
(2.006573162632538, 0.84823476779463769)
The expected frequencies can be given with the `f_exp` argument:
>>> power_divergence([16, 18, 16, 14, 12, 12],
... f_exp=[16, 16, 16, 16, 16, 8],
... lambda_='log-likelihood')
(3.3281031458963746, 0.6495419288047497)
When `f_obs` is 2-D, by default the test is applied to each column.
>>> obs = np.array([[16, 18, 16, 14, 12, 12], [32, 24, 16, 28, 20, 24]]).T
>>> obs.shape
(6, 2)
>>> power_divergence(obs, lambda_="log-likelihood")
(array([ 2.00657316,  6.77634498]), array([ 0.84823477,  0.23781225]))
By setting ``axis=None``, the test is applied to all data in the array,
which is equivalent to applying the test to the flattened array.
>>> power_divergence(obs, axis=None)
(23.31034482758621, 0.015975692534127565)
>>> power_divergence(obs.ravel())
(23.31034482758621, 0.015975692534127565)
`ddof` is the change to make to the default degrees of freedom.
>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=1)
(2.0, 0.73575888234288467)
The calculation of the p-values is done by broadcasting the
test statistic with `ddof`.
>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=[0,1,2])
(2.0, array([ 0.84914504, 0.73575888, 0.5724067 ]))
`f_obs` and `f_exp` are also broadcast. In the following, `f_obs` has
shape (6,) and `f_exp` has shape (2, 6), so the result of broadcasting
`f_obs` and `f_exp` has shape (2, 6). To compute the desired chi-squared
statistics, we must use ``axis=1``:
>>> power_divergence([16, 18, 16, 14, 12, 12],
... f_exp=[[16, 16, 16, 16, 16, 8],
... [8, 20, 20, 16, 12, 12]],
... axis=1)
(array([ 3.5 , 9.25]), array([ 0.62338763, 0.09949846]))
"""
# Convert the input argument `lambda_` to a numerical value.
if isinstance(lambda_, str):
if lambda_ not in _power_div_lambda_names:
names = repr(list(_power_div_lambda_names.keys()))[1:-1]
raise ValueError("invalid string for lambda_: {0!r}. "
"Valid strings are {1}".format(lambda_,
names))
lambda_ = _power_div_lambda_names[lambda_]
elif lambda_ is None:
lambda_ = 1
f_obs = np.asanyarray(f_obs)
f_obs_float = f_obs.astype(np.float64)
if f_exp is not None:
f_exp = np.asanyarray(f_exp)
bshape = _broadcast_shapes(f_obs_float.shape, f_exp.shape)
f_obs_float = _m_broadcast_to(f_obs_float, bshape)
f_exp = _m_broadcast_to(f_exp, bshape)
rtol = 1e-8 # to pass existing tests
with np.errstate(invalid='ignore'):
f_obs_sum = f_obs_float.sum(axis=axis)
f_exp_sum = f_exp.sum(axis=axis)
relative_diff = (np.abs(f_obs_sum - f_exp_sum) /
np.minimum(f_obs_sum, f_exp_sum))
diff_gt_tol = (relative_diff > rtol).any()
if diff_gt_tol:
msg = (f"For each axis slice, the sum of the observed "
f"frequencies must agree with the sum of the "
f"expected frequencies to a relative tolerance "
f"of {rtol}, but the percent differences are:\n"
f"{relative_diff}")
> raise ValueError(msg)
E ValueError: For each axis slice, the sum of the observed
frequencies must agree with the sum of the expected frequencies to a
relative tolerance of 1e-08, but the percent differences are:
E 0.11063329565866321
/usr/lib/python3/dist-packages/scipy/stats/stats.py:6694: ValueError
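The ValueError comes from the sum-agreement check in the scipy source quoted above. A minimal standalone sketch of that logic (pure NumPy; `check_sums` is a hypothetical helper written here for illustration, not a scipy API) shows why two independent random samples, as generated by the dask test, almost surely trip it:

```python
import numpy as np

def check_sums(f_obs, f_exp, rtol=1e-8):
    # Mirrors the check quoted from scipy.stats.power_divergence: the sums
    # of observed and expected frequencies must agree to a relative
    # tolerance of rtol, otherwise a ValueError is raised.
    f_obs_sum = np.asarray(f_obs, dtype=np.float64).sum()
    f_exp_sum = np.asarray(f_exp, dtype=np.float64).sum()
    relative_diff = abs(f_obs_sum - f_exp_sum) / min(f_obs_sum, f_exp_sum)
    if relative_diff > rtol:
        raise ValueError(f"sums differ: relative difference {relative_diff}")

rng = np.random.default_rng(0)
a = rng.random(30)  # plays the role of f_obs in the dask test
b = rng.random(30)  # plays the role of f_exp

try:
    check_sums(a, b)
    raised = False
except ValueError:
    raised = True
print(raised)  # True: independent uniform samples have unequal sums
```

This mirrors the dask test setup, where both arguments are independent draws of `np.random.random(size=30)`, so their sums practically never agree to within ``1e-8``.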
______________________ test_two[power_divergence-kwargs8]
______________________
kind = 'power_divergence', kwargs = {'lambda_': 'neyman'}
@pytest.mark.parametrize(
"kind, kwargs",
[
("ttest_ind", {}),
("ttest_ind", {"equal_var": False}),
("ttest_1samp", {}),
("ttest_rel", {}),
("chisquare", {}),
("power_divergence", {}),
("power_divergence", {"lambda_": 0}),
("power_divergence", {"lambda_": -1}),
("power_divergence", {"lambda_": "neyman"}),
],
)
def test_two(kind, kwargs):
a = np.random.random(size=30)
b = np.random.random(size=30)
a_ = da.from_array(a, 3)
b_ = da.from_array(b, 3)
dask_test = getattr(dask.array.stats, kind)
scipy_test = getattr(scipy.stats, kind)
with pytest.warns(None):  # maybe overflow warning (power_divergence)
result = dask_test(a_, b_, **kwargs)
> expected = scipy_test(a, b, **kwargs)
/usr/lib/python3/dist-packages/dask/array/tests/test_stats.py:88:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
f_obs = array([8.77388479e-01, 9.60149501e-01, 4.47046846e-01,
8.40131926e-01,
2.53213006e-01, 7.75601859e-01, 9.242306...546e-01,
9.89797308e-01, 3.58514314e-01, 1.74643187e-01, 8.59181200e-01,
8.11582172e-01, 8.30851765e-04])
f_exp = array([0.38556923, 0.78153729, 0.45783246, 0.6889513 , 0.85721604,
0.6226383 , 0.76431552, 0.6329189 , 0.037484...69, 0.16427409,
0.00692481, 0.91507232, 0.41494886,
0.90695644, 0.57371155, 0.36052197, 0.61507666, 0.87718468])
ddof = 0, axis = 0, lambda_ = -2
def power_divergence(f_obs, f_exp=None, ddof=0, axis=0, lambda_=None):
> raise ValueError(msg)
E ValueError: For each axis slice, the sum of the observed
frequencies must agree with the sum of the expected frequencies to a
relative tolerance of 1e-08, but the percent differences are:
E 0.1911078274656955
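Since the failure lies in the test's inputs rather than in scipy's computation, one possible adjustment on the dask side would be to rescale the second sample so the sums agree before calling the test. This is a sketch of the idea only, under the assumption that the check is the only obstacle; it is not the patch actually applied to either package:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(30)  # observed frequencies
b = rng.random(30)  # expected frequencies, independent of a

# Rescale the expected frequencies so their sum matches the observed sum,
# satisfying the relative-tolerance check (rtol=1e-8) in newer scipy.
b_scaled = b * (a.sum() / b.sum())

relative_diff = abs(a.sum() - b_scaled.sum()) / min(a.sum(), b_scaled.sum())
print(relative_diff <= 1e-8)  # True
```

With the sums matched, `scipy.stats.power_divergence(a, f_exp=b_scaled, ...)` would no longer raise the ValueError shown above.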