GitHub user davies opened a pull request:
https://github.com/apache/spark/pull/2830
[SPARK-3971] [MLLib] [PySpark] hotfix: Customized pickler should work in
cluster mode
Customized picklers must be registered before unpickling, but on an executor
there is no way to register them before the tasks run.
So we need to register the picklers inside the tasks themselves: duplicate
javaToPython() and pythonToJava() in MLlib, and call SerDe.initialize() before
pickling or unpickling.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/davies/spark fix_pickle
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/2830.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2830
----
commit 0f02050a989bcd30be0ad4464b5407a162f6eca0
Author: Davies Liu <[email protected]>
Date: 2014-10-16T19:38:20Z
hotfix: Customized pickler does not work in cluster
----