[
https://issues.apache.org/jira/browse/SPARK-17960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen updated SPARK-17960:
------------------------------
Assignee: Jagadeesan A S
> Upgrade to Py4J 0.10.4
> ----------------------
>
> Key: SPARK-17960
> URL: https://issues.apache.org/jira/browse/SPARK-17960
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Reporter: holdenk
> Assignee: Jagadeesan A S
> Priority: Trivial
> Labels: starter
> Fix For: 2.1.0
>
>
> In general we should try and keep up to date with Py4J's new releases. The
> changes in this one are small (
> https://github.com/bartdag/py4j/milestone/21?closed=1 ) and shouldn't impact
> Spark in any significant way so I'm going to tag this as a starter issue for
> someone looking to get a deeper understanding of how PySpark works.
> Upgrading Py4J can be a bit tricky compared to updating other packages. In
> general the steps are:
> 1) Upgrade the Py4J version on the Java side
> 2) Update the py4j src zip file we bundle with Spark
> 3) Make sure everything still works (especially the streaming tests, because
> we do weird things to make streaming work and it's the most likely place to
> break during a Py4J upgrade).
> You can see how these bits have been done in past releases by looking in the
> git log for the last time we changed the Py4J version numbers. Sometimes, even
> for "compatible" releases like this one, we may need to make some small code
> changes inside PySpark because we hook into Py4J's internals, but I don't
> think this should be the case here.
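
A rough sketch of one way to sanity-check steps 1 and 2 of the quoted description: after bumping the version, scan file contents for leftover references to an older bundled Py4J source zip. The helper name, the file names, and the file contents below are all hypothetical and are not taken from the Spark repository; the only assumption about Spark is that the bundled zip is named in the form py4j-<version>-src.zip.

```python
import re

# Assumed naming pattern for the bundled source zip, e.g. "py4j-0.10.3-src.zip".
PY4J_ZIP = re.compile(r"py4j-(\d+(?:\.\d+)+)-src\.zip")


def stale_py4j_references(files, target):
    """Return (path, version) pairs whose Py4J version differs from target.

    `files` maps a path to that file's text content (hypothetical input shape).
    """
    stale = []
    for path, text in sorted(files.items()):
        for version in PY4J_ZIP.findall(text):
            if version != target:
                stale.append((path, version))
    return stale


# Illustrative contents only; these are not real Spark files.
example = {
    "launcher.sh": 'export PYTHONPATH="$HOME/lib/py4j-0.10.3-src.zip"',
    "docs.mk": "PYTHONPATH=../lib/py4j-0.10.4-src.zip",
}

print(stale_py4j_references(example, "0.10.4"))
# prints [('launcher.sh', '0.10.3')]
```

A check like this only catches leftover version strings. It says nothing about step 3, so the test suites (the streaming tests in particular) still need to be run.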
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)