bersprockets commented on pull request #30221:
URL: https://github.com/apache/spark/pull/30221#issuecomment-721213619


   > FYI I just tried and can't find a scenario that has multiple method calls on `hasNext()` without `next()`.
   
   @gengliangwang My repro case is such an example. When 
`BypassMergeSortShuffleWriter#write` is driving the scan, there are multiple 
consecutive calls to `hasNext` at the start of each task. This causes trouble 
only with V1 Avro. In datasource V2, there seems to be an intervening 
iterator that properly handles the repeated `hasNext` calls and so 
protects the iterator in `AvroPartitionReaderFactory` from them.
   
   I know of no case where there are consecutive calls to `next` without an 
intervening `hasNext`, but the latest commit to this PR handles that case as well.
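
   To illustrate the contract being discussed, here is a minimal sketch of an iterator that tolerates both repeated `hasNext` calls and `next` calls with no intervening `hasNext`. It is not the code in this PR: `BufferedRowIterator` and its fields are hypothetical names, and the wrapped `source` iterator merely stands in for a stateful reader whose read position advances on every fetch.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical wrapper for illustration only; not Spark's Avro reader code.
final class BufferedRowIterator<T> implements Iterator<T> {
    private final Iterator<T> source; // stands in for a stateful file reader
    private T buffered;               // row fetched ahead by hasNext()
    private boolean hasBuffered = false;

    BufferedRowIterator(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        // Fetch at most one row ahead; repeated calls reuse the buffered row
        // instead of advancing the underlying reader again.
        if (!hasBuffered && source.hasNext()) {
            buffered = source.next();
            hasBuffered = true;
        }
        return hasBuffered;
    }

    @Override
    public T next() {
        // Calling hasNext() here makes next() safe without a prior hasNext().
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T row = buffered;
        buffered = null;
        hasBuffered = false;
        return row;
    }
}
```

   With this shape, calling `hasNext` any number of times before `next` never skips a row, because the lookahead is fetched once and held until `next` consumes it.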

