Dongjoon Hyun created SPARK-54065:
-------------------------------------
Summary: Fix `test_in_memory_data_source` in Python 3.14
Key: SPARK-54065
URL: https://issues.apache.org/jira/browse/SPARK-54065
Project: Spark
Issue Type: Sub-task
Components: PySpark
Affects Versions: 4.1.0
Reporter: Dongjoon Hyun
{code}
======================================================================
ERROR [0.007s]: test_in_memory_data_source (pyspark.sql.tests.test_python_datasource.PythonDataSourceTests.test_in_memory_data_source)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/__w/spark/spark/python/pyspark/serializers.py", line 460, in dumps
return cloudpickle.dumps(obj, pickle_protocol)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
File "/__w/spark/spark/python/pyspark/cloudpickle/cloudpickle.py", line 1537, in dumps
cp.dump(obj)
~~~~~~~^^^^^
File "/__w/spark/spark/python/pyspark/cloudpickle/cloudpickle.py", line 1303, in dump
return super().dump(obj)
~~~~~~~~~~~~^^^^^
TypeError: cannot pickle '_abc._abc_data' object
when serializing dict item '_abc_impl'
when serializing tuple item 0
when serializing cell reconstructor arguments
when serializing cell object
when serializing tuple item 0
when serializing dict item '__closure__'
when serializing tuple item 1
when serializing function state
when serializing function object
when serializing dict item '__annotate_func__'
when serializing tuple item 0
when serializing abc.ABCMeta state
when serializing abc.ABCMeta object
when serializing tuple item 0
when serializing cell reconstructor arguments
when serializing cell object
when serializing tuple item 0
when serializing dict item '__closure__'
when serializing tuple item 1
when serializing function state
when serializing function object
when serializing dict item 'reader'
when serializing tuple item 0
when serializing abc.ABCMeta state
when serializing abc.ABCMeta object
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/__w/spark/spark/python/pyspark/sql/tests/test_python_datasource.py", line 283, in test_in_memory_data_source
self.spark.dataSource.register(InMemoryDataSource)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
File "/__w/spark/spark/python/pyspark/sql/datasource.py", line 1197, in register
wrapped = _wrap_function(sc, dataSource)
File "/__w/spark/spark/python/pyspark/sql/udf.py", line 59, in _wrap_function
pickled_command, broadcast_vars, env, includes = _prepare_for_python_RDD(sc, command)
~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
File "/__w/spark/spark/python/pyspark/core/rdd.py", line 5121, in _prepare_for_python_RDD
pickled_command = ser.dumps(command)
File "/__w/spark/spark/python/pyspark/serializers.py", line 470, in dumps
raise pickle.PicklingError(msg)
_pickle.PicklingError: Could not serialize object: TypeError: cannot pickle '_abc._abc_data' object
{code}
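
For illustration only (not part of the failing test): the innermost TypeError in the trace above can be observed without Spark or cloudpickle, assuming CPython, where ABCMeta keeps its per-class state in a private `_abc_impl` attribute. The `Reader` class and the `try_pickle_abc_state` helper are hypothetical names, not PySpark code. This sketch only shows that the `_abc_impl` object itself is not picklable; it does not reproduce the Python 3.14 path through `__annotate_func__` that leads cloudpickle to it.

```python
import abc
import pickle


class Reader(abc.ABC):
    """Hypothetical stand-in for an ABC-based class such as a data source reader."""


def try_pickle_abc_state(cls):
    """Try to pickle cls._abc_impl; return the error text, or None if it pickled."""
    # Assumption: CPython's C-accelerated _abc module, which stores an
    # _abc._abc_data instance on every ABCMeta class as `_abc_impl`.
    impl = getattr(cls, "_abc_impl", None)
    try:
        pickle.dumps(impl)
    except (TypeError, pickle.PicklingError) as exc:
        return str(exc)
    return None


message = try_pickle_abc_state(Reader)
print(message)
```

If `message` is not None here, any serializer that walks into a class dict containing `_abc_impl` (as the "when serializing dict item '_abc_impl'" note in the trace suggests) would be expected to hit the same error.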