Repository: spark
Updated Branches:
  refs/heads/branch-1.6 4fdac3c27 -> d7223bb9f

[SPARK-16077] [PYSPARK] catch the exception from pickle.whichmodule()

## What changes were proposed in this pull request?

When we don't know which module an object came from, we call pickle.whichmodule() to go through all the loaded modules and find the object. This can fail because of some modules, for example six; see https://bitbucket.org/gutworth/six/issues/63/importing-six-breaks-pickling

We should ignore the exception here and use `__main__` as the module name (meaning that we can't find the module).

## How was this patch tested?

Manually tested. Can't have a unit test for this.

Author: Davies Liu <dav...@databricks.com>

Closes #13788 from davies/whichmodule.

(cherry picked from commit d48935400ca47275f677b527c636976af09332c8)
Signed-off-by: Davies Liu <davies....@gmail.com>

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d7223bb9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d7223bb9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d7223bb9

Branch: refs/heads/branch-1.6
Commit: d7223bb9fdc54edcc1a45cead9a71b5bac49b2ab
Parents: 4fdac3c
Author: Davies Liu <dav...@databricks.com>
Authored: Fri Jun 24 14:35:34 2016 -0700
Committer: Davies Liu <davies....@gmail.com>
Committed: Fri Jun 24 14:35:51 2016 -0700

----------------------------------------------------------------------
 python/pyspark/cloudpickle.py | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/d7223bb9/python/pyspark/cloudpickle.py
----------------------------------------------------------------------
diff --git a/python/pyspark/cloudpickle.py b/python/pyspark/cloudpickle.py
index e56e22a..822ae46 100644
--- a/python/pyspark/cloudpickle.py
+++ b/python/pyspark/cloudpickle.py
@@ -169,7 +169,12 @@ class CloudPickler(Pickler):
 
         if name is None:
             name = obj.__name__
-        modname = pickle.whichmodule(obj, name)
+        try:
+            # whichmodule() could fail, see
+            # https://bitbucket.org/gutworth/six/issues/63/importing-six-breaks-pickling
+            modname = pickle.whichmodule(obj, name)
+        except Exception:
+            modname = None
         # print('which gives %s %s %s' % (modname, obj, name))
         try:
             themodule = sys.modules[modname]
@@ -326,7 +331,12 @@ class CloudPickler(Pickler):
 
         modname = getattr(obj, "__module__", None)
         if modname is None:
-            modname = pickle.whichmodule(obj, name)
+            try:
+                # whichmodule() could fail, see
+                # https://bitbucket.org/gutworth/six/issues/63/importing-six-breaks-pickling
+                modname = pickle.whichmodule(obj, name)
+            except Exception:
+                modname = '__main__'
 
         if modname == '__main__':
             themodule = None
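
The fallback described in the commit message can be sketched in isolation. This is a minimal illustration and not the code in cloudpickle.py: `safe_whichmodule`, `_BrokenLazyModule`, `demo` and the module name `_hypothetical_broken_module` are hypothetical names introduced here. The broken module only simulates a lazy module whose attribute lookup raises something other than AttributeError, which is the kind of failure the linked six issue describes.

```python
import pickle
import sys
import types


def safe_whichmodule(obj, name, default="__main__"):
    """Return the module name for obj, or `default` if the lookup raises.

    Hypothetical helper: it mirrors the try/except added in the patch.
    """
    try:
        return pickle.whichmodule(obj, name)
    except Exception:
        # The scan over sys.modules can fail for reasons unrelated to obj,
        # so treat any failure as "module not found".
        return default


class _BrokenLazyModule(types.ModuleType):
    """Hypothetical stand-in for a lazy module whose attribute lookup fails."""

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails on the module.
        raise RuntimeError("cannot resolve %r" % attr)


def demo():
    """Resolve a module-less function while a broken module is loaded."""
    def orphan():
        return 42

    # With no __module__, whichmodule() has to scan sys.modules.
    orphan.__module__ = None

    fake_name = "_hypothetical_broken_module"
    sys.modules[fake_name] = _BrokenLazyModule(fake_name)
    try:
        return safe_whichmodule(orphan, "orphan")
    finally:
        # Always remove the fake module so later imports are unaffected.
        del sys.modules[fake_name]
```

Calling `demo()` returns `'__main__'`, whether the scan raises or simply finds no match. The sketch uses `'__main__'` as the default, following the commit message; the first hunk of the diff falls back to `None` instead.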