Brian Hulette created BEAM-12367:
------------------------------------
Summary: SeriesGroupBy corr and cov do not raise the expected error
Key: BEAM-12367
URL: https://issues.apache.org/jira/browse/BEAM-12367
Project: Beam
Issue Type: Bug
Components: sdk-py-core
Reporter: Brian Hulette
SeriesGroupBy.corr should raise an error at construction time because it
requires a second Series (the 'other' argument), as pandas reports on
non-empty data:
{code}
In [4]: df.groupby('A').B.corr()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-d760b6077290> in <module>
----> 1 df.groupby('A').B.corr()

~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in wrapper(*args, **kwargs)
    815 return self.apply(curried)
    816
--> 817 return self._python_apply_general(curried, self._obj_with_exclusions)
    818
    819 wrapper.__name__ = name

~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in _python_apply_general(self, f, data)
    926 data after applying f
    927 """
--> 928 keys, values, mutated = self.grouper.apply(f, data, self.axis)
    929
    930 return self._wrap_applied_output(

~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/ops.py in apply(self, f, data, axis)
    236 # group might be modified
    237 group_axes = group.axes
--> 238 res = f(group)
    239 if not _is_indexed_like(res, group_axes, axis):
    240 mutated = True

~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in curried(x)
    804
    805 def curried(x):
--> 806 return f(x, *args, **kwargs)
    807
    808 # preserve the name so we can detect it when calling plot methods,

TypeError: corr() missing 1 required positional argument: 'other'
{code}
However, pandas does not raise this TypeError when the same call is made on an
empty dataset (perhaps an upstream bug). Proxy generation therefore does not
hit the error, so nothing is raised at construction time and the failure only
surfaces once the pipeline is running.
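
One plausible reading of the traceback, not confirmed in this issue, is that the missing argument is only detected when the function is actually called on a group, so an input with no groups never triggers it. The sketch below illustrates that pattern in plain Python. The names corr, apply_per_group and check_call_eagerly are hypothetical stand-ins, not pandas or Beam code, and the eager check is just one possible way to surface the error without data, not the fix chosen for this issue.

```python
import inspect


def corr(series, other, method="pearson"):
    """Hypothetical stand-in for a method that needs a second Series."""
    raise NotImplementedError("only the signature matters for this sketch")


def apply_per_group(groups, f, *args, **kwargs):
    """Call f once per group, like the curried(x) frame in the traceback."""
    def curried(x):
        return f(x, *args, **kwargs)
    return [curried(group) for group in groups]


def check_call_eagerly(f, *args, **kwargs):
    """Raise TypeError for missing arguments without calling f on any data."""
    placeholder = object()  # stands in for the per-group Series
    inspect.signature(f).bind(placeholder, *args, **kwargs)


# Non-empty input: the missing 'other' argument is reported on the first group.
try:
    apply_per_group([[1.0, 2.0]], corr)
    nonempty_error = None
except TypeError as exc:
    nonempty_error = str(exc)

# Empty input: curried is never called, so nothing is raised.
empty_result = apply_per_group([], corr)

# Eager check: the same mistake is caught even though there is no data.
try:
    check_call_eagerly(corr)
    eager_error = None
except TypeError as exc:
    eager_error = str(exc)

print("non-empty:", nonempty_error)
print("empty:", empty_result)
print("eager:", eager_error)
```

Binding the arguments against the signature needs no data at all, which is why a check of this kind could run at construction time, assuming the wrapped method's signature is available to inspect.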