viirya commented on a change in pull request #24070: [SPARK-23961][PYTHON] Fix error when toLocalIterator goes out of scope
URL: https://github.com/apache/spark/pull/24070#discussion_r272078879
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala
 ##########
 @@ -168,7 +168,42 @@ private[spark] object PythonRDD extends Logging {
   }
 
   def toLocalIteratorAndServe[T](rdd: RDD[T]): Array[Any] = {
-    serveIterator(rdd.toLocalIterator, s"serve toLocalIterator")
 
 Review comment:
  > It is also possible that this change would be very beneficial: if the iterator is not fully consumed, it can avoid triggering unneeded jobs, whereas the previous behavior eagerly queued jobs for all partitions. In this sense, the change here more closely follows the Scala behavior.
   
  Once the local iterator goes out of scope on the Python side, will the remaining jobs still be triggered on the Scala side, even though it can no longer write into the closed connection?
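  
  For reference, a minimal sketch of the Scala-side laziness under discussion (the object name, partition count, and data here are illustrative, not taken from this PR): `RDD.toLocalIterator` submits one job per partition, and only as the consumer advances into that partition, so abandoning the iterator early leaves the remaining jobs unsubmitted.
  
  ```scala
  import org.apache.spark.{SparkConf, SparkContext}
  
  object LocalIteratorLaziness {
    def main(args: Array[String]): Unit = {
      val sc = new SparkContext(
        new SparkConf().setMaster("local[2]").setAppName("toLocalIterator-sketch"))
  
      // Four partitions: toLocalIterator runs a separate job per partition,
      // only when the consumer actually reaches that partition.
      val rdd = sc.parallelize(1 to 100, 4)
  
      val it = rdd.toLocalIterator
      // Reading only the first few elements runs a job for partition 0 alone;
      // if the iterator is dropped here, jobs for partitions 1-3 are never
      // submitted. The question above is whether the same holds once the
      // Python side closes the serving connection.
      println(it.take(5).mkString(", "))
  
      sc.stop()
    }
  }
  ```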
