Suzen Fylke created SPARK-32094:
-----------------------------------
Summary: Patch cloudpickle.py with typing module side-effect fix
Key: SPARK-32094
URL: https://issues.apache.org/jira/browse/SPARK-32094
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 3.0.0, 2.4.6
Reporter: Suzen Fylke
PySpark's cloudpickle.py and cloudpickle versions below 1.3.0 interfere with
dill unpickling because they define types.ClassType, a name that dill does
not define. This results in the following error:
{code:java}
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/apache_beam/internal/pickler.py", line 279, in loads
    return dill.loads(s)
  File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 317, in loads
    return load(file, ignore)
  File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 305, in load
    obj = pik.load()
  File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 577, in _load_type
    return _reverse_typemap[name]
KeyError: 'ClassType'{code}
(See [https://github.com/cloudpipe/cloudpickle/issues/82].)
This was fixed in cloudpickle 1.3.0
([https://github.com/cloudpipe/cloudpickle/pull/337]), but PySpark's
cloudpickle.py does not include this change yet.
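
The interaction described above can be modeled with the standard library alone. The sketch below is a simplified illustration, not dill's or cloudpickle's actual code: `build_typemaps` is a hypothetical stand-in for the kind of name/type registry behind `_reverse_typemap` in the traceback, and it assumes the side effect binds `ClassType` to `type`. A writer process that has the alias records `type` under the name `ClassType`, and a reader process without the alias then fails the lookup with the same `KeyError`.

```python
import types


def build_typemaps(namespace):
    """Build name<->type registries by scanning a module namespace.

    Hypothetical, simplified model of a pickler's type registry;
    it is not dill's actual implementation.
    """
    forward = {}  # type object -> name written into the pickle
    reverse = {}  # name -> type object looked up when unpickling
    for name, obj in namespace.items():
        if isinstance(obj, type):
            forward[obj] = name  # a later alias of the same type wins
            reverse[name] = obj
    return forward, reverse


# A clean Python 3 process: the `types` namespace has no ClassType.
clean_ns = {k: v for k, v in vars(types).items() if k != "ClassType"}

# A process where an import side effect added the Python 2 era alias
# (assumed here to be bound to `type`).
patched_ns = dict(clean_ns)
patched_ns["ClassType"] = type

writer_forward, _ = build_typemaps(patched_ns)
_, reader_reverse = build_typemaps(clean_ns)

# The writer records `type` under the alias name.
name_on_wire = writer_forward[type]

# The clean reader cannot resolve that name.
try:
    reader_reverse[name_on_wire]
    lookup_error = None
except KeyError as exc:
    lookup_error = exc

print("name written by patched process:", name_on_wire)
print("lookup failure in clean process:", repr(lookup_error))
```

In this model the failure disappears once the writer no longer sees the alias, which matches the report: removing the `types` side effect from PySpark's cloudpickle.py keeps the extra name out of pickles produced alongside it.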