Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/16429
Thanks for your interest, @azmras. I just checked it as below:
```python
sc.parallelize(range(100), 8)
```
```
Traceback (most recent call last):
  File ".../spark/python/pyspark/cloudpickle.py", line 107, in dump
    return Pickler.dump(self, obj)
  File "/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pickle.py", line 409, in dump
    self.save(obj)
  File "/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pickle.py", line 751, in save_tuple
    save(element)
  File "/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File ".../spark/python/pyspark/cloudpickle.py", line 214, in save_function
    self.save_function_tuple(obj)
  File ".../spark/python/pyspark/cloudpickle.py", line 244, in save_function_tuple
    code, f_globals, defaults, closure, dct, base_globals = self.extract_func_data(func)
  File ".../spark/python/pyspark/cloudpickle.py", line 306, in extract_func_data
    func_global_refs = self.extract_code_globals(code)
  File ".../spark/python/pyspark/cloudpickle.py", line 288, in extract_code_globals
    out_names.add(names[oparg])
IndexError: tuple index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../spark/python/pyspark/rdd.py", line 198, in __repr__
    return self._jrdd.toString()
  File ".../spark/python/pyspark/rdd.py", line 2438, in _jrdd
    self._jrdd_deserializer, profiler)
  File ".../spark/python/pyspark/rdd.py", line 2371, in _wrap_function
    pickled_command, broadcast_vars, env, includes = _prepare_for_python_RDD(sc, command)
  File ".../spark/python/pyspark/rdd.py", line 2357, in _prepare_for_python_RDD
    pickled_command = ser.dumps(command)
  File ".../spark/python/pyspark/serializers.py", line 452, in dumps
    return cloudpickle.dumps(obj, 2)
  File ".../spark/python/pyspark/cloudpickle.py", line 667, in dumps
    cp.dump(obj)
  File ".../spark/python/pyspark/cloudpickle.py", line 115, in dump
    if "'i' format requires" in e.message:
AttributeError: 'IndexError' object has no attribute 'message'
```
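A side note on the second traceback: Python 3 exceptions have no `message` attribute, so the `e.message` check in `dump` raises `AttributeError` and hides the original `IndexError`. Below is a minimal sketch of a check that works on both Python 2 and 3, assuming the intent is only to inspect the error text. The helper name `error_text` is hypothetical and not part of cloudpickle.

```python
def error_text(exc):
    """Return the exception's message on both Python 2 and Python 3."""
    # Python 2 exceptions may expose `.message`; Python 3 ones do not,
    # so fall back to str(exc), which is available on both.
    return getattr(exc, "message", None) or str(exc)


try:
    ()[0]  # indexing an empty tuple raises IndexError
except IndexError as e:
    text = error_text(e)
    # Same substring test as in the traceback, but without touching e.message.
    is_struct_error = "'i' format requires" in text
```

With a check like this, the original `IndexError` would surface instead of the unrelated `AttributeError`.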
It looks like another issue with Python 3.6.0. This is only related to the
hijacked `collections.namedtuple`.
We should port
https://github.com/cloudpipe/cloudpickle/commit/4945361c2db92095f934b92a6c00316243caf3cc.
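
For context, the first `IndexError` comes from `extract_code_globals` indexing `names[oparg]`. Python 3.6 switched to fixed-width two-byte instructions, so code that walks `co_code` by hand with the older variable-width layout can compute a wrong `oparg`; I assume that is what happens here. I have not reproduced the linked commit, but a rough sketch of one way to avoid manual bytecode parsing is to use `dis.get_instructions` (available since Python 3.4). The `GLOBAL_OPS` set and the sample function `f` are assumptions for illustration only.

```python
import dis

# Opcodes that reference a global name (assumed set for this sketch).
GLOBAL_OPS = {
    dis.opmap["LOAD_GLOBAL"],
    dis.opmap["STORE_GLOBAL"],
    dis.opmap["DELETE_GLOBAL"],
}


def extract_code_globals(code):
    """Collect the global names referenced by a code object."""
    out_names = set()
    # dis decodes the instruction stream, so no manual oparg arithmetic.
    for instr in dis.get_instructions(code):
        if instr.opcode in GLOBAL_OPS:
            out_names.add(instr.argval)
    # Recurse into nested code objects (lambdas, inner functions).
    for const in code.co_consts:
        if hasattr(const, "co_code"):
            out_names |= extract_code_globals(const)
    return out_names


def f(x):
    # `len` and `some_global` are both looked up as globals.
    return len(x) + some_global


names = extract_code_globals(f.__code__)
```

Letting `dis` decode the instructions keeps the logic independent of the bytecode layout of a particular Python version.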