Github user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/19872#discussion_r154782452
--- Diff: python/pyspark/sql/group.py ---
@@ -89,8 +89,15 @@ def agg(self, *exprs):
        else:
            # Columns
            assert all(isinstance(c, Column) for c in exprs), "all exprs should be Column"
-           jdf = self._jgd.agg(exprs[0]._jc,
-                               _to_seq(self.sql_ctx._sc, [c._jc for c in exprs[1:]]))
+           if isinstance(exprs[0], UDFColumn):
+               assert all(isinstance(c, UDFColumn) for c in exprs)
--- End diff ---
So I'm a little worried about this change: if other folks have wrapped Java
UDAFs (which is reasonable, since there weren't other ways to make UDAFs in
PySpark before this), it seems like they won't be able to mix them. I'd
suggest maybe doing what @viirya suggested below, but with just a warning
instead of a failure until Spark 3.
What do y'all think?
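A minimal sketch of the warn-instead-of-fail idea (the `Column`/`UDFColumn` stand-in classes and the `check_agg_exprs` helper name here are hypothetical, just to illustrate the deprecation-style path, not the PR's actual code):

```python
import warnings


class Column:
    """Stand-in for pyspark.sql.Column (illustration only)."""


class UDFColumn(Column):
    """Stand-in for the UDF-backed column type this PR introduces."""


def check_agg_exprs(exprs):
    """If the first expr is a UDFColumn but the rest are mixed, emit a
    DeprecationWarning instead of raising, so existing wrapped Java
    UDAFs keep working until Spark 3."""
    if isinstance(exprs[0], UDFColumn):
        if not all(isinstance(c, UDFColumn) for c in exprs):
            warnings.warn(
                "Mixing UDF columns with other aggregate expressions is "
                "deprecated and will become an error in Spark 3.0",
                DeprecationWarning)
```

That way mixed expressions still go through the old code path, and users get a heads-up that the behavior will tighten later.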
---