[GitHub] spark issue #19872: WIP: [SPARK-22274][PySpark] User-defined aggregation fun...

icexelloss Thu, 07 Dec 2017 16:48:06 -0800

Github user icexelloss commented on the issue:

    https://github.com/apache/spark/pull/19872
  
    To @holdenk's question: the Pandas group_agg UDF uses a fundamentally
    different physical plan than the existing Java/Scala UDFs, so it is hard
    to combine them in a single aggregation. I don't know a good way to do
    this; the closest option is probably to compute the Java/Scala and Python
    aggregations separately and join the results together.



---
