[ https://issues.apache.org/jira/browse/SPARK-15589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304692#comment-15304692 ]
holdenk commented on SPARK-15589:
---------------------------------
Of course, this needs to wait for the Python Dataset API to exist.
> Analyze simple PySpark closures and generate SQL expressions
> ------------------------------------------------------------
>
> Key: SPARK-15589
> URL: https://issues.apache.org/jira/browse/SPARK-15589
> Project: Spark
> Issue Type: Improvement
> Components: PySpark, SQL
> Reporter: holdenk
>
> Similar to SPARK-14083, we can try introspecting simple Python functions to
> see whether we can generate an equivalent SQL expression. This would give
> PySpark users an even greater performance increase than Scala users: not
> only would they benefit from better codegen, they would also avoid
> substantial serialization cost.
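
To make the idea concrete, here is a minimal sketch of what such introspection might look like. It is not the implementation proposed in this issue or anything that exists in Spark. The names lambda_to_sql, _translate and UnsupportedClosure are hypothetical, and for simplicity it assumes the closure is available as source text for a one-argument lambda that only reads attributes of its row argument and uses numeric literals, arithmetic, comparisons and and/or. Anything else is rejected so the caller could fall back to an ordinary Python UDF.

```python
import ast

# Hypothetical mapping from Python AST operator nodes to SQL operators.
_BIN_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}
_CMP_OPS = {ast.Eq: "=", ast.NotEq: "<>", ast.Lt: "<",
            ast.LtE: "<=", ast.Gt: ">", ast.GtE: ">="}
_BOOL_OPS = {ast.And: "AND", ast.Or: "OR"}


class UnsupportedClosure(Exception):
    """Raised when a closure is too complex for this sketch to translate."""


def lambda_to_sql(source):
    """Translate the source of a one-argument lambda into a SQL expression string."""
    fn = ast.parse(source.strip(), mode="eval").body
    if not isinstance(fn, ast.Lambda) or len(fn.args.args) != 1:
        raise UnsupportedClosure("expected a one-argument lambda")
    row_name = fn.args.args[0].arg
    return _translate(fn.body, row_name)


def _translate(node, row):
    # row.column -> column
    if (isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name)
            and node.value.id == row):
        return node.attr
    # Numeric literals only (bool is excluded even though it subclasses int).
    if (isinstance(node, ast.Constant) and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return repr(node.value)
    # a + b, a - b, a * b
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return "(%s %s %s)" % (_translate(node.left, row),
                               _BIN_OPS[type(node.op)],
                               _translate(node.right, row))
    # Single comparisons such as a > b (chained comparisons are rejected).
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and type(node.ops[0]) in _CMP_OPS):
        return "(%s %s %s)" % (_translate(node.left, row),
                               _CMP_OPS[type(node.ops[0])],
                               _translate(node.comparators[0], row))
    # a and b, a or b
    if isinstance(node, ast.BoolOp):
        joiner = " %s " % _BOOL_OPS[type(node.op)]
        return "(" + joiner.join(_translate(v, row) for v in node.values) + ")"
    raise UnsupportedClosure(ast.dump(node))


sql = lambda_to_sql("lambda r: r.age + 1 > 21 and r.height < 2")
print(sql)  # (((age + 1) > 21) AND (height < 2))
```

The resulting string could then be handed to something like pyspark.sql.functions.expr or DataFrame.filter, so that the predicate runs as a SQL expression rather than as a Python function. A real version would presumably inspect the function object itself rather than source text, and would need to check that Python and SQL semantics (nulls, integer overflow, division) actually agree before substituting one for the other.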
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)