This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 085dfeb2bed [SPARK-43949][PYTHON] Upgrade cloudpickle to 2.2.1
085dfeb2bed is described below

commit 085dfeb2bed61f6d43d9b99b299373e797ac8f17
Author: Hyukjin Kwon <gurwls...@apache.org>
AuthorDate: Fri Jun 2 19:38:13 2023 +0900

    [SPARK-43949][PYTHON] Upgrade cloudpickle to 2.2.1

    ### What changes were proposed in this pull request?

    This PR proposes to upgrade Cloudpickle from 2.2.0 to 2.2.1.

    ### Why are the changes needed?

    Cloudpickle 2.2.1 has a fix (https://github.com/cloudpipe/cloudpickle/pull/495) for namedtuple issue (https://github.com/cloudpipe/cloudpickle/issues/460).
    PySpark relies on namedtuple heavily especially for RDD. We should upgrade and fix it.

    ### Does this PR introduce _any_ user-facing change?

    Yes, see https://github.com/cloudpipe/cloudpickle/issues/460.

    ### How was this patch tested?

    Relies on cloudpickle's unittests. Existing test cases should pass too.

    Closes #41433 from HyukjinKwon/cloudpickle-upgrade.

    Authored-by: Hyukjin Kwon <gurwls...@apache.org>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 python/pyspark/cloudpickle/__init__.py         |  2 +-
 python/pyspark/cloudpickle/cloudpickle_fast.py |  4 ++--
 python/pyspark/cloudpickle/compat.py           | 17 +++++++++++++++--
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/python/pyspark/cloudpickle/__init__.py b/python/pyspark/cloudpickle/__init__.py
index efbf1178d43..af35a0a194b 100644
--- a/python/pyspark/cloudpickle/__init__.py
+++ b/python/pyspark/cloudpickle/__init__.py
@@ -5,4 +5,4 @@ from pyspark.cloudpickle.cloudpickle_fast import CloudPickler, dumps, dump  # no
 # expose their Pickler subclass at top-level under the "Pickler" name.
 Pickler = CloudPickler
 
-__version__ = '2.2.0'
+__version__ = '2.2.1'
diff --git a/python/pyspark/cloudpickle/cloudpickle_fast.py b/python/pyspark/cloudpickle/cloudpickle_fast.py
index 8741dcbdaaa..63aaffa096b 100644
--- a/python/pyspark/cloudpickle/cloudpickle_fast.py
+++ b/python/pyspark/cloudpickle/cloudpickle_fast.py
@@ -111,8 +111,8 @@ load, loads = pickle.load, pickle.loads
 
 def _class_getnewargs(obj):
     type_kwargs = {}
-    if "__slots__" in obj.__dict__:
-        type_kwargs["__slots__"] = obj.__slots__
+    if "__module__" in obj.__dict__:
+        type_kwargs["__module__"] = obj.__module__
 
     __dict__ = obj.__dict__.get('__dict__', None)
     if isinstance(__dict__, property):
diff --git a/python/pyspark/cloudpickle/compat.py b/python/pyspark/cloudpickle/compat.py
index 837d0f279ab..5e9b52773d2 100644
--- a/python/pyspark/cloudpickle/compat.py
+++ b/python/pyspark/cloudpickle/compat.py
@@ -1,5 +1,18 @@
 import sys
 
 
-import pickle  # noqa: F401
-from pickle import Pickler  # noqa: F401
+if sys.version_info < (3, 8):
+    try:
+        import pickle5 as pickle  # noqa: F401
+        from pickle5 import Pickler  # noqa: F401
+    except ImportError:
+        import pickle  # noqa: F401
+
+        # Use the Python pickler for old CPython versions
+        from pickle import _Pickler as Pickler  # noqa: F401
+else:
+    import pickle  # noqa: F401
+
+    # Pickler will the C implementation in CPython and the Python
+    # implementation in PyPy
+    from pickle import Pickler  # noqa: F401

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
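
Editorial note on the `_class_getnewargs` hunk: the change records a class's `__module__` in the keyword arguments used to rebuild it. The sketch below is a stdlib-only illustration of why that could matter, not cloudpickle's actual code. The helper names `class_newargs_sketch` and `rebuild_class_sketch` and the module name `my_pkg.models` are hypothetical, and the link to the namedtuple issue is an assumption based on the commit message.

```python
# Illustrative sketch only: these helpers are hypothetical and are NOT
# cloudpickle's actual implementation.


def class_newargs_sketch(cls):
    """Collect the arguments needed to rebuild an empty copy of a class."""
    type_kwargs = {}
    # Carry __module__ only when the class defines it in its own namespace,
    # mirroring the condition shown in the diff.
    if "__module__" in cls.__dict__:
        type_kwargs["__module__"] = cls.__module__
    return type(cls), cls.__name__, cls.__bases__, type_kwargs


def rebuild_class_sketch(metaclass, name, bases, type_kwargs):
    """Recreate an empty class from the collected arguments."""
    return metaclass(name, bases, dict(type_kwargs))


# A dynamically created class with an explicit __module__
# ("my_pkg.models" is a made-up module name).
Point = type("Point", (object,), {"__module__": "my_pkg.models"})

# Rebuild with the recorded keyword arguments.
rebuilt = rebuild_class_sketch(*class_newargs_sketch(Point))

# Rebuild without them: type() then derives __module__ from the module
# in which the class is created, not from the original class.
metaclass, name, bases, _ = class_newargs_sketch(Point)
rebuilt_without = rebuild_class_sketch(metaclass, name, bases, {})

print(rebuilt.__module__)
print(rebuilt_without.__module__ == Point.__module__)
```

When `__module__` is passed along, the rebuilt class reports the same module as the original. When it is omitted, the rebuilt class reports the module where `type()` was called.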