holdenk created SPARK-17960:

             Summary: Upgrade to Py4J 0.10.4
                 Key: SPARK-17960
                 URL: https://issues.apache.org/jira/browse/SPARK-17960
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
            Reporter: holdenk
            Priority: Trivial

In general we should try to keep up to date with Py4J's new releases. The 
changes in this one are small ( 
https://github.com/bartdag/py4j/milestone/21?closed=1 ) and shouldn't impact 
Spark in any significant way so I'm going to tag this as a starter issue for 
someone looking to get a deeper understanding of how PySpark works.

Upgrading Py4J can be a bit tricky compared to updating other packages. In 
general the steps are:
1) Upgrade the Py4J version on the Java side
2) Update the py4j src zip file we bundle with Spark
3) Make sure everything still works (especially the streaming tests, because we 
do weird things to make streaming work and it's the most likely place to break 
during a Py4J upgrade).
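
As a rough aid for steps 1 and 2, the sketch below lists every file in a checkout whose name or contents mention a given version string, which gives a checklist of places to touch. It is only an illustration: the function name, the skipped directory names, and the size cap are assumptions of this sketch, not part of Spark's tooling.

```python
import os


def find_version_references(root, needle, max_bytes=1_000_000):
    """Return sorted paths (relative to root) that mention needle.

    A file matches if its name or its raw contents contain needle.
    Hypothetical helper for illustration; not part of Spark.
    """
    hits = set()
    needle_bytes = needle.encode("utf-8")
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: skip version-control metadata and build output.
        dirnames[:] = [d for d in dirnames if d not in {".git", "target"}]
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            if needle in name:
                # Catches bundled archives named after the version.
                hits.add(rel)
                continue
            try:
                if os.path.getsize(path) > max_bytes:
                    continue  # skip large files to keep the scan cheap
                with open(path, "rb") as handle:
                    if needle_bytes in handle.read():
                        hits.add(rel)
            except OSError:
                continue  # unreadable file; ignore it
    return sorted(hits)
```

Calling find_version_references(".", old_version) from the top of the checkout, where old_version is whatever Py4J version string the tree currently uses, should surface both the Java-side build files and the bundled src zip.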

You can see how these bits have been done in past releases by looking in the 
git log for the last time we changed the Py4J version numbers. Sometimes even 
for "compatible" releases like this one we may need to make some small code 
changes inside of PySpark because we hook into Py4J's internals, but I don't 
think this should be the case here.
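
One way to find those past changes is git's pickaxe search (git log -S), which lists commits that added or removed a given string. The sketch below wraps it in Python; the helper names and the default commit limit are assumptions made for illustration, and the search string is whatever version marker the tree currently contains.

```python
import subprocess


def pickaxe_command(needle, max_count=5):
    """Build a git log command listing commits that added or removed needle."""
    return ["git", "log", "--oneline", "--max-count=%d" % max_count, "-S" + needle]


def recent_version_commits(repo_dir, needle, max_count=5):
    """Run the pickaxe search in repo_dir and return one line per commit.

    Hypothetical helper; requires git on PATH and repo_dir to be a checkout.
    """
    result = subprocess.run(
        pickaxe_command(needle, max_count),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Running git show on any of the returned commit hashes then shows which files a previous upgrade touched.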

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
