srowen commented on a change in pull request #23716: [SPARK-26734] [STREAMING] Fix StackOverflowError with large block queue
URL: https://github.com/apache/spark/pull/23716#discussion_r252895479
 
 

 ##########
 File path: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceivedBlockTracker.scala
 ##########
 @@ -112,7 +112,7 @@ private[streaming] class ReceivedBlockTracker(
   def allocateBlocksToBatch(batchTime: Time): Unit = synchronized {
     if (lastAllocatedBatchTime == null || batchTime > lastAllocatedBatchTime) {
       val streamIdToBlocks = streamIds.map { streamId =>
 -        (streamId, getReceivedBlockQueue(streamId).clone())
 +        (streamId, getReceivedBlockQueue(streamId).clone().dequeueAll(x => true))
 
 Review comment:
   This basically just copies the contents, right? Would it be more robust to construct a new mutable.Seq from the queue, so we're sure about what the type is? I know it won't be a Queue here, but it sounds like the representation does kind of matter; see the sketch at the end of this comment.
   
   Was https://issues.apache.org/jira/browse/SPARK-23358 / https://issues.apache.org/jira/browse/SPARK-23391 really the cause, or did it mostly uncover this?
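   
   Roughly what I have in mind, as a self-contained sketch: plain strings stand in for the ReceivedBlockInfo entries, and ArrayBuffer is only one way to get an explicit mutable.Seq, so treat the exact types as illustrative rather than a proposed patch.
   
       import scala.collection.mutable
   
       object QueueCopySketch {
         def main(args: Array[String]): Unit = {
           // Stand-in for one stream's ReceivedBlockQueue; the real queue holds
           // ReceivedBlockInfo entries inside ReceivedBlockTracker.
           val receivedBlockQueue = mutable.Queue("block-1", "block-2", "block-3")
   
           // What the PR does: drain a clone of the queue. The contents are
           // copied, but the result type is whatever dequeueAll happens to return.
           val drained = receivedBlockQueue.clone().dequeueAll(_ => true)
   
           // The alternative: build an explicit collection from the queue so the
           // copied representation is unambiguous. An array-backed buffer also
           // avoids the deep recursion that a linked-list-backed Queue can hit
           // when a very large queue is serialized.
           val copied: mutable.Seq[String] =
             mutable.ArrayBuffer(receivedBlockQueue.toSeq: _*)
   
           println(drained.toList == copied.toList)  // same contents either way
           println(receivedBlockQueue.size)          // original queue is untouched
         }
       }
   
   Either way the elements written out are the same; the question is only whether the serialized type is something chosen explicitly or whatever dequeueAll's return type happens to be.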

