[GitHub] inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1

2019-01-19 Thread GitBox
inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] 
Update cloudpickle to v0.6.1
URL: https://github.com/apache/spark/pull/20691#discussion_r249238272
 
 

 ##
 File path: python/pyspark/broadcast.py
 ##
 @@ -110,7 +110,7 @@ def __init__(self, sc=None, value=None, 
pickle_registry=None, path=None,
 
 def dump(self, value, f):
 try:
-pickle.dump(value, f, 2)
+pickle.dump(value, f, pickle.HIGHEST_PROTOCOL)
 
 Review comment:
   It happened here: 
https://github.com/apache/spark/commit/6cf507685efd01df77d663145ae08e48c7f92948#diff-bb67501acde415576c589b478e16c60aR82
   Since then it never changed.
   I agree that there was no particular reason for that since 
pickle.HIGHEST_PROTOCOL in Python 2 versions is 2 for ages, not 3 or 4. Using 
pickle.HIGHEST_PROTOCOL consistently should be safe for that reason.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1

2019-01-19 Thread GitBox
inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] 
Update cloudpickle to v0.6.1
URL: https://github.com/apache/spark/pull/20691#discussion_r249238114
 
 

 ##
 File path: python/pyspark/serializers.py
 ##
 @@ -606,7 +604,7 @@ class PickleSerializer(FramedSerializer):
 """
 
 def dumps(self, obj):
-return pickle.dumps(obj, protocol)
+return pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1

2019-01-18 Thread GitBox
inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] 
Update cloudpickle to v0.6.1
URL: https://github.com/apache/spark/pull/20691#discussion_r249145481
 
 

 ##
 File path: python/pyspark/serializers.py
 ##
 @@ -65,7 +65,7 @@
 from itertools import izip as zip, imap as map
 else:
 import pickle
-protocol = 3
+protocol = min(pickle.HIGHEST_PROTOCOL, 4)
 
 Review comment:
   Done. Also reorganised older commits a bit.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1

2019-01-18 Thread GitBox
inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] 
Update cloudpickle to v0.6.1
URL: https://github.com/apache/spark/pull/20691#discussion_r249093196
 
 

 ##
 File path: python/pyspark/cloudpickle.py
 ##
 @@ -72,6 +78,22 @@
 PY3 = True
 
 Review comment:
   > Is this reversed? looks like this is meant to be true when running Python 
3? (I know you didn't change it here)
   
   Can you state it more precisely? What I see is `PY3 = True` is executed when 
`sys.version < '3'` is False --- that's OK.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] Update cloudpickle to v0.6.1

2019-01-18 Thread GitBox
inpefess commented on a change in pull request #20691: [SPARK-18161] [Python] 
Update cloudpickle to v0.6.1
URL: https://github.com/apache/spark/pull/20691#discussion_r249090813
 
 

 ##
 File path: python/pyspark/cloudpickle.py
 ##
 @@ -42,20 +42,26 @@
 """
 from __future__ import print_function
 
-import dis
-from functools import partial
-import imp
 import io
-import itertools
-import logging
+import dis
 
 Review comment:
   > Not a big deal but did these imports need to shuffle? I'm not clear what 
was added or removed
   
   It's an exact copy of cloudpickle 0.6.1 code. So the import order was not 
shuffled by me on purpose.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org