Davies Liu commented on SPARK-10915:

Python UDFs are executed in batch mode to get reasonable performance. A UDAF 
would be much harder to implement in batch mode, especially when it is used 
together with other aggregate functions.

One possible solution is to apply a Python UDF to the output of CollectList; 
you can already do this today as a workaround.
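
A minimal sketch of that workaround, assuming a DataFrame with a grouping 
column and a numeric value column. The names median and median_per_key are 
hypothetical and only for illustration; the median is an arbitrary example of 
an aggregate written in plain Python. On older Spark versions collect_list 
may need a HiveContext.

```python
def median(values):
    """Plain-Python aggregate: median of a list of numbers, None if empty."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        return None
    mid = n // 2
    if n % 2 == 1:
        return float(ordered[mid])
    return (ordered[mid - 1] + ordered[mid]) / 2.0


def median_per_key(df, key_col, value_col):
    """Emulate a Python UDAF: collect each group's values, then apply a UDF."""
    # Imported here so that median() above can be used and tested without Spark.
    from pyspark.sql import functions as F
    from pyspark.sql.types import DoubleType

    median_udf = F.udf(median, DoubleType())
    # collect_list gathers every value of a group into a single array column;
    # the Python UDF then runs once per group on that array.
    return df.groupBy(key_col).agg(
        median_udf(F.collect_list(value_col)).alias("median")
    )
```

Calling median_per_key(df, "k", "v") would return one row per distinct value 
of k. Note that collect_list materializes all values of a group in memory, so 
this is likely only practical when individual groups are reasonably small, 
and the order of the collected values is not guaranteed.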

> Add support for UDAFs in Python
> -------------------------------
>                 Key: SPARK-10915
>                 URL: https://issues.apache.org/jira/browse/SPARK-10915
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, SQL
>            Reporter: Justin Uang
> This should support Python-defined lambdas.

This message was sent by Atlassian JIRA
