HyukjinKwon commented on a change in pull request #28395:
URL: https://github.com/apache/spark/pull/28395#discussion_r417672569
##########
File path: python/pyspark/rdd.py
##########
@@ -877,6 +877,19 @@ def collect(self):
         sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
         return list(_load_from_socket(sock_info, self._jrdd_deserializer))
+    def collectWithJobGroup(self, groupId, description, interruptOnCancel=False):
+ """
+ .. note:: Experimental
+
+ When collect rdd, use this method to specify job group.
+
+ .. versionadded:: 3.0.0
Review comment:
I don't think we should ignore the existing definitions of the API stability
tags. It might be best to stick to what's documented for those tags ...
Of course we should take the amended semver into account, but I think that's
rather a rubric for deciding when we can break things.
I rather strongly think we should deprecate/remove this one specifically later,
when we're able, so that we stick to the standard approach. In the meantime, I also
agree with having these APIs as a step to migrate.
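For anyone following along, here is a minimal usage sketch of how the proposed API
compares with the existing `setJobGroup` + `collect` pattern it is meant to bridge.
The group id and description strings are purely illustrative, and
`collectWithJobGroup` is assumed to behave as shown in the diff above:

```python
from pyspark import SparkContext

sc = SparkContext.getOrCreate()
rdd = sc.parallelize(range(100))

# Existing pattern: set the job group on the SparkContext, then collect.
sc.setJobGroup("query-42", "collect via setJobGroup", interruptOnCancel=True)
old_result = rdd.collect()

# API added in this PR: pass the job group with the collect itself, so the
# group id, description, and cancel flag travel with the action that runs
# the job rather than relying on state set separately on the context.
new_result = rdd.collectWithJobGroup("query-42", "collect with job group",
                                     interruptOnCancel=True)
```

Either way, the running job group can then be cancelled with
`sc.cancelJobGroup("query-42")`.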
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]