Brian Hulette created BEAM-12367:
------------------------------------

             Summary: SeriesGroupBy corr and cov do not raise the expected error
                 Key: BEAM-12367
                 URL: https://issues.apache.org/jira/browse/BEAM-12367
             Project: Beam
          Issue Type: Bug
          Components: sdk-py-core
            Reporter: Brian Hulette


SeriesGroupBy.corr (and likewise cov) should raise an error at construction 
time because it requires a second Series argument ('other'):

{code}
In [4]: df.groupby('A').B.corr()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-d760b6077290> in <module>
----> 1 df.groupby('A').B.corr()

~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in wrapper(*args, **kwargs)
    815                 return self.apply(curried)
    816 
--> 817             return self._python_apply_general(curried, self._obj_with_exclusions)
    818 
    819         wrapper.__name__ = name

~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in _python_apply_general(self, f, data)
    926             data after applying f
    927         """
--> 928         keys, values, mutated = self.grouper.apply(f, data, self.axis)
    929 
    930         return self._wrap_applied_output(

~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/ops.py in apply(self, f, data, axis)
    236             # group might be modified
    237             group_axes = group.axes
--> 238             res = f(group)
    239             if not _is_indexed_like(res, group_axes, axis):
    240                 mutated = True

~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in curried(x)
    804 
    805             def curried(x):
--> 806                 return f(x, *args, **kwargs)
    807 
    808             # preserve the name so we can detect it when calling plot methods,

TypeError: corr() missing 1 required positional argument: 'other'
{code}

However, pandas does not raise this TypeError when corr is called on an empty 
dataset (perhaps an upstream bug), so we don't raise it during proxy 
generation. As a result, the failure does not surface until the pipeline is 
running.
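
One possible explanation, consistent with the traceback above (where the TypeError comes from calling f on a group), is that the per-group function is never invoked when there are no groups. The sketch below illustrates that mechanism in plain Python. The names corr, apply_per_group and try_apply are hypothetical stand-ins for illustration only; this is not pandas or Beam code, and the real cause may differ.

```python
# Hypothetical stand-ins for illustration; not pandas or Beam code.

def corr(series, other):
    """Toy two-argument function: like Series.corr, it requires `other`."""
    return 0.0


def apply_per_group(func, groups):
    """Call func once per group, as a groupby-apply style wrapper might."""
    return {key: func(values) for key, values in groups.items()}


def try_apply(groups):
    """Return the exception type raised by the apply, or None on success."""
    try:
        apply_per_group(corr, groups)
    except TypeError as exc:
        return type(exc)
    return None


non_empty = {"a": [1.0, 2.0], "b": [3.0, 4.0]}
empty = {}  # plays the role of an empty dataset

# With data, corr is called without `other`, so a TypeError is raised.
non_empty_result = try_apply(non_empty)

# With no groups, corr is never called, so the missing argument goes unnoticed.
empty_result = try_apply(empty)

print(non_empty_result, empty_result)
```

Under this assumption, any check that relies on running the operation against empty data would miss the error, which matches the behavior reported here.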



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
