dongjoon-hyun commented on PR #52768:
URL: https://github.com/apache/spark/pull/52768#issuecomment-3459472098

   For the following question, @HyukjinKwon .
   
   > does this make python 3.14 support?
   
   1. For `Classic` mode, we still need the `MLFlow` package and SPARK-54065 (cloudpickle).
   2. For `Connect` mode, we need the `MLFlow` package, SPARK-54065 (cloudpickle), and SPARK-54068 (`TypeError: Object of type PlanMetrics is not JSON serializable`).
   
   I'm still investigating the `cloudpickle` issue via SPARK-54065, which aims to fix the following `test_in_memory_data_source` failure specifically on Python 3.14.
   
   ```
   ======================================================================
    ERROR [0.007s]: test_in_memory_data_source (pyspark.sql.tests.test_python_datasource.PythonDataSourceTests.test_in_memory_data_source)
   ----------------------------------------------------------------------
   Traceback (most recent call last):
     File "/__w/spark/spark/python/pyspark/serializers.py", line 460, in dumps
       return cloudpickle.dumps(obj, pickle_protocol)
              ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
      File "/__w/spark/spark/python/pyspark/cloudpickle/cloudpickle.py", line 1537, in dumps
       cp.dump(obj)
       ~~~~~~~^^^^^
      File "/__w/spark/spark/python/pyspark/cloudpickle/cloudpickle.py", line 1303, in dump
       return super().dump(obj)
              ~~~~~~~~~~~~^^^^^
   TypeError: cannot pickle '_abc._abc_data' object
   when serializing dict item '_abc_impl'
   when serializing tuple item 0
   when serializing cell reconstructor arguments
   when serializing cell object
   when serializing tuple item 0
   when serializing dict item '__closure__'
   when serializing tuple item 1
   when serializing function state
   when serializing function object
   when serializing dict item '__annotate_func__'
   when serializing tuple item 0
   when serializing abc.ABCMeta state
   when serializing abc.ABCMeta object
   when serializing tuple item 0
   when serializing cell reconstructor arguments
   when serializing cell object
   when serializing tuple item 0
   when serializing dict item '__closure__'
   when serializing tuple item 1
   when serializing function state
   when serializing function object
   when serializing dict item 'reader'
   when serializing tuple item 0
   when serializing abc.ABCMeta state
   when serializing abc.ABCMeta object
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
      File "/__w/spark/spark/python/pyspark/sql/tests/test_python_datasource.py", line 283, in test_in_memory_data_source
       self.spark.dataSource.register(InMemoryDataSource)
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
      File "/__w/spark/spark/python/pyspark/sql/datasource.py", line 1197, in register
       wrapped = _wrap_function(sc, dataSource)
      File "/__w/spark/spark/python/pyspark/sql/udf.py", line 59, in _wrap_function
        pickled_command, broadcast_vars, env, includes = _prepare_for_python_RDD(sc, command)
                                                         ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
      File "/__w/spark/spark/python/pyspark/core/rdd.py", line 5121, in _prepare_for_python_RDD
       pickled_command = ser.dumps(command)
     File "/__w/spark/spark/python/pyspark/serializers.py", line 470, in dumps
       raise pickle.PicklingError(msg)
    _pickle.PicklingError: Could not serialize object: TypeError: cannot pickle '_abc._abc_data' object
   ```

