[GitHub] inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1
inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1 URL: https://github.com/apache/spark/pull/20691#discussion_r249238272 ## File path: python/pyspark/broadcast.py ## @@ -110,7 +110,7 @@ def __init__(self, sc=None, value=None, pickle_registry=None, path=None, def dump(self, value, f): try: -pickle.dump(value, f, 2) +pickle.dump(value, f, pickle.HIGHEST_PROTOCOL) Review comment: It happened here: https://github.com/apache/spark/commit/6cf507685efd01df77d663145ae08e48c7f92948#diff-bb67501acde415576c589b478e16c60aR82 Since then it never changed. I agree that there was no particular reason for that since pickle.HIGHEST_PROTOCOL in Python 2 versions is 2 for ages, not 3 or 4. Using pickle.HIGHEST_PROTOCOL consistently should be safe for that reason. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1
inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1 URL: https://github.com/apache/spark/pull/20691#discussion_r249238114 ## File path: python/pyspark/serializers.py ## @@ -606,7 +604,7 @@ class PickleSerializer(FramedSerializer): """ def dumps(self, obj): -return pickle.dumps(obj, protocol) +return pickle.dumps(obj, pickle.HIGHEST_PROTOCOL) Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1
inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1 URL: https://github.com/apache/spark/pull/20691#discussion_r249145481 ## File path: python/pyspark/serializers.py ## @@ -65,7 +65,7 @@ from itertools import izip as zip, imap as map else: import pickle -protocol = 3 +protocol = min(pickle.HIGHEST_PROTOCOL, 4) Review comment: Done. Also reorganised older commits a bit. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1
inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1 URL: https://github.com/apache/spark/pull/20691#discussion_r249093196 ## File path: python/pyspark/cloudpickle.py ## @@ -72,6 +78,22 @@ PY3 = True Review comment: > Is this reversed? looks like this is meant to be true when running Python 3? (I know you didn't change it here) Can you state it more precisely? What I see is `PY3 = True` is executed when `sys.version < '3'` is False --- that's OK. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1
inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1 URL: https://github.com/apache/spark/pull/20691#discussion_r249090813 ## File path: python/pyspark/cloudpickle.py ## @@ -42,20 +42,26 @@ """ from __future__ import print_function -import dis -from functools import partial -import imp import io -import itertools -import logging +import dis Review comment: > Not a big deal but did these imports need to shuffle? I'm not clear what was added or removed It's an exact copy of cloudpickle 0.6.1 code. So the import order was not shuffled by me on purpose. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org