[ 
https://issues.apache.org/jira/browse/SPARK-6883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566739#comment-14566739
 ] 

Josh Rosen commented on SPARK-6883:
-----------------------------------

I would not mind seeing some of the forked cloudpickle's changes ported back to 
Spark (or having us bundle and use the forked cloudpickle), but I can't devote 
any time to working on this. It would be great if someone wants to investigate 
this and figure out a good way of managing third-party Python library 
dependencies in Spark itself.

> Fork pyspark's cloudpickle as a separate dependency
> ---------------------------------------------------
>
>                 Key: SPARK-6883
>                 URL: https://issues.apache.org/jira/browse/SPARK-6883
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>            Reporter: Kyle Kelley
>              Labels: fork
>
> IPython, pyspark, picloud/multyvac/cloudpipe all rely on cloudpickle from 
> various sources (cloud, pyspark, and multyvac respectively). It would be 
> great to have this as a separately maintained project that can:
> * Work with Python 3
> * Add tests!
> * Use higher order pickling (when on Python 3)
> * Be installed with pip
> We're starting this off at the PyCon sprints under 
> https://github.com/cloudpipe/cloudpickle. We'd like to coordinate with 
> PySpark to make it work across all the above mentioned projects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
