[ https://issues.apache.org/jira/browse/BEAM-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Beam JIRA Bot updated BEAM-12367:
---------------------------------
Priority: P3 (was: P2)
> SeriesGroupBy corr and cov do not raise the expected error at pipeline construction time
> ----------------------------------------------------------------------------------------
>
> Key: BEAM-12367
> URL: https://issues.apache.org/jira/browse/BEAM-12367
> Project: Beam
> Issue Type: Bug
> Components: dsl-dataframe, sdk-py-core
> Reporter: Brian Hulette
> Priority: P3
> Labels: dataframe-api, stale-P2
>
> SeriesGroupBy.corr should raise an error at pipeline construction time
> when it is called without arguments, because it requires a second Series
> (the 'other' argument):
> {code}
> In [4]: df.groupby('A').B.corr()
> ---------------------------------------------------------------------------
> TypeError Traceback (most recent call last)
> <ipython-input-4-d760b6077290> in <module>
> ----> 1 df.groupby('A').B.corr()
> ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in wrapper(*args, **kwargs)
> 815 return self.apply(curried)
> 816
> --> 817 return self._python_apply_general(curried, self._obj_with_exclusions)
> 818
> 819 wrapper.__name__ = name
> ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in _python_apply_general(self, f, data)
> 926 data after applying f
> 927 """
> --> 928 keys, values, mutated = self.grouper.apply(f, data, self.axis)
> 929
> 930 return self._wrap_applied_output(
> ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/ops.py in apply(self, f, data, axis)
> 236 # group might be modified
> 237 group_axes = group.axes
> --> 238 res = f(group)
> 239 if not _is_indexed_like(res, group_axes, axis):
> 240 mutated = True
> ~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/groupby/groupby.py in curried(x)
> 804
> 805 def curried(x):
> --> 806 return f(x, *args, **kwargs)
> 807
> 808 # preserve the name so we can detect it when calling plot methods,
> TypeError: corr() missing 1 required positional argument: 'other'
> {code}
> However, pandas does not raise this error when the operation is called on
> an empty dataset (perhaps an upstream bug), so it is not raised during
> proxy generation either. As a result, the failure does not surface until
> the pipeline is running.
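
One possible way to surface the missing-argument error without touching any data is to validate the call against the method's signature. The sketch below is only an illustration of that idea, not Beam's actual fix: `series_corr_stand_in`, `check_call_signature` and `construction_time_error` are hypothetical names, and the stand-in function only mirrors the signature of `pandas.Series.corr` so the example runs without pandas installed. It relies solely on `inspect.signature(...).bind(...)` from the standard library.

```python
import inspect


def series_corr_stand_in(self, other, method="pearson", min_periods=None):
    """Hypothetical stand-in that mirrors the signature of pandas.Series.corr."""
    raise NotImplementedError("signature only; never executed in this sketch")


def check_call_signature(func, *args, **kwargs):
    """Raise TypeError immediately if func(self, *args, **kwargs) cannot bind.

    Only the signature is inspected. func itself is never called, so the
    check does not depend on whether the underlying dataset is empty.
    """
    # A placeholder object stands in for `self`.
    inspect.signature(func).bind(object(), *args, **kwargs)


def construction_time_error(func, *args, **kwargs):
    """Return the TypeError message, or None if the arguments bind cleanly."""
    try:
        check_call_signature(func, *args, **kwargs)
    except TypeError as exc:
        return str(exc)
    return None


# Calling corr() with no arguments: `other` is missing, so an error is reported.
missing = construction_time_error(series_corr_stand_in)
# Supplying a placeholder for `other` binds successfully.
ok = construction_time_error(series_corr_stand_in, object())

print(missing)
print(ok)
```

With pandas available, the same check could presumably be pointed at `pandas.Series.corr` instead of the stand-in, assuming its signature still requires `other`. Whether this belongs in proxy generation or in a wrapper around the deferred method is a design question the issue leaves open.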