[ https://issues.apache.org/jira/browse/SPARK-6883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566739#comment-14566739 ]
Josh Rosen commented on SPARK-6883:
-----------------------------------
I would not mind seeing some of the forked cloudpickle's changes ported back to
Spark (or having us bundle and use the forked cloudpickle), but I can't devote
any time to working on this. It would be great if someone could investigate
this and figure out a good way of managing third-party Python library
dependencies in Spark itself.
> Fork pyspark's cloudpickle as a separate dependency
> ---------------------------------------------------
>
> Key: SPARK-6883
> URL: https://issues.apache.org/jira/browse/SPARK-6883
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Reporter: Kyle Kelley
> Labels: fork
>
> IPython, pyspark, and picloud/multyvac/cloudpipe all rely on cloudpickle from
> various sources (cloud, pyspark, and multyvac, respectively). It would be
> great to have this as a separately maintained project that can:
> * Work with Python 3
> * Add tests!
> * Use higher order pickling (when on Python 3)
> * Be installed with pip
> We're starting this off at the PyCon sprints under
> https://github.com/cloudpipe/cloudpickle. We'd like to coordinate with
> PySpark to make it work across all the above mentioned projects.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)