BryanCutler commented on a change in pull request #24070: [SPARK-23961][PYTHON]
Fix error when toLocalIterator goes out of scope
URL: https://github.com/apache/spark/pull/24070#discussion_r272738566
##########
File path: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala
##########
@@ -168,7 +168,42 @@ private[spark] object PythonRDD extends Logging {
}
def toLocalIteratorAndServe[T](rdd: RDD[T]): Array[Any] = {
- serveIterator(rdd.toLocalIterator, s"serve toLocalIterator")
Review comment:
The previous behavior was that the Scala local iterator would advance as
long as the `write` calls to the socket were not blocked. This means that when
Python reads a batch (of auto-batched elements) from the current partition,
it unblocks the Scala `write` call, which could then start the next job.
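To make the mechanism concrete, here is a minimal, hypothetical sketch (not Spark code) of that producer/consumer coupling: a bounded queue stands in for the socket's send buffer, so the producer (playing the role of the Scala local iterator) can only advance past a partition once its blocked write is consumed by the reader. All names here (`serve_iterator`, `buf`, `advanced`) are invented for illustration.

```python
import threading
import queue

buf = queue.Queue(maxsize=1)  # stand-in for the socket's send buffer
advanced = []                 # partitions the producer has started

def serve_iterator(partitions):
    # Producer: like the Scala local iterator, it starts the next
    # partition only after its (possibly blocked) write completes.
    for i, batch in enumerate(partitions):
        advanced.append(i)    # "start the job" for partition i
        buf.put(batch)        # blocks while the buffer is full

t = threading.Thread(target=serve_iterator,
                     args=([["a"], ["b"], ["c"]],), daemon=True)
t.start()

first = buf.get()   # the Python side reads one batch...
second = buf.get()  # ...each read unblocks a pending write,
                    # letting the producer advance to the next partition
```

The point of the sketch is the coupling: each read on the consuming side releases exactly one blocked write, so the producer's progress (and hence when the next job is triggered) is driven by how fast Python consumes batches.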