[ https://issues.apache.org/jira/browse/SPARK-32094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Suzen Fylke updated SPARK-32094:
--------------------------------
Description:
PySpark's cloudpickle.py and versions of cloudpickle below 1.3.0 interfere with
dill unpickling because, as an import side effect, they define
types.ClassType, which is undefined in dill. This results in the following
error:
{code:java}
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/apache_beam/internal/pickler.py", line 279, in loads
    return dill.loads(s)
  File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 317, in loads
    return load(file, ignore)
  File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 305, in load
    obj = pik.load()
  File "/usr/local/lib/python3.6/site-packages/dill/_dill.py", line 577, in _load_type
    return _reverse_typemap[name]
KeyError: 'ClassType'
{code}
(See [https://github.com/cloudpipe/cloudpickle/issues/82])
This was fixed in cloudpickle 1.3.0
([https://github.com/cloudpipe/cloudpickle/pull/337]), but PySpark's
cloudpickle.py does not include this change yet.
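The failure mode can be illustrated with a simplified model. This is only a sketch and not dill's or cloudpickle's actual code: forward_typemap, reverse_typemap, load_type and the two stand-in namespaces are hypothetical names invented for illustration, and the sketch assumes that a type is written by name from a map built after the side effect and read back through a map built without it.

```python
import types


def forward_typemap(namespace):
    # Hypothetical helper: type object -> name. If one type is reachable
    # under several names, the name added last wins.
    return {obj: name for name, obj in vars(namespace).items()
            if isinstance(obj, type)}


def reverse_typemap(namespace):
    # Hypothetical helper: name -> type object, used when reading.
    return {name: obj for name, obj in vars(namespace).items()
            if isinstance(obj, type)}


def load_type(name, typemap):
    # Stand-in for a by-name lookup like the one in the traceback above.
    return typemap[name]


# Stand-in namespaces (assumptions): one clean, one with the extra name added.
clean = types.SimpleNamespace(TypeType=type, FunctionType=types.FunctionType)
patched = types.SimpleNamespace(**vars(clean))
patched.ClassType = type  # models the side effect described in this issue

# Writer side: its map was built after the side effect happened.
written_name = forward_typemap(patched)[type]

# Reader side: its map was built without the side effect.
reader_map = reverse_typemap(clean)

try:
    load_type(written_name, reader_map)
    failed_with = None
except KeyError as exc:
    failed_with = exc.args[0]

print(written_name, failed_with)
```

In this model the writer records the name added by the side effect, and the reader's lookup fails because its map never saw that name. Whether the real libraries build their maps exactly this way is an assumption of the sketch.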
> Patch cloudpickle.py with typing module side-effect fix
> -------------------------------------------------------
>
> Key: SPARK-32094
> URL: https://issues.apache.org/jira/browse/SPARK-32094
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 2.4.6, 3.0.0
> Reporter: Suzen Fylke
> Priority: Major
>