wankunde commented on PR #42206:
URL: https://github.com/apache/spark/pull/42206#issuecomment-1663191874

   > It looks fine to me, except maybe check the code for left semi joins.
   > 
   > I could not make the crash happen with left semi joins. I think the bug might actually exist in that code (within the same task, I see a call to `processRows` _after_ eager cleanup). However, it seems that for left semi joins, the optimizer moves the `Window` after the `Join` (that is, the windowing is performed on the joined result), so there is no X row to copy.
   > 
   > By the way, there is a reason you see `processRows` called again even after `BufferedIterator.hasNext` returns false: `FileFormatWriter` calls `hasNext` to see if the iterator is empty. If it is, it instantiates an instance of `EmptyDirectoryDataWriter`, which also calls `hasNext`.
   
   Thanks for your review. I have fixed this issue for LeftSemi SMJ.
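
   For readers following the thread, the repeated `hasNext` call described in the quoted comment can be illustrated with a toy model. This is a minimal sketch in Python, not Spark code: the class `EagerCleanupIterator`, its `_done` guard, and the `process_rows_calls` counter are hypothetical names chosen for illustration. It assumes an iterator that releases its resources as soon as it sees the input is exhausted, and shows why a second `has_next` call must not reach the row-processing step again.

```python
class EagerCleanupIterator:
    """Toy iterator that frees its resources as soon as the input is exhausted."""

    def __init__(self, rows):
        self._source = iter(rows)
        self._buffer = []
        self._resources_open = True
        self._done = False            # guard: set once exhaustion is observed
        self.process_rows_calls = 0   # counter used only for illustration

    def _process_rows(self):
        # Touching released resources is the failure mode being modelled.
        if not self._resources_open:
            raise RuntimeError("process_rows called after eager cleanup")
        self.process_rows_calls += 1
        for row in self._source:
            self._buffer.append(row)
            return  # buffer one row at a time

    def has_next(self):
        if self._buffer:
            return True
        if self._done:
            # A caller asked again after exhaustion; answer without processing.
            return False
        self._process_rows()
        if not self._buffer:
            self._done = True
            self._resources_open = False  # eager cleanup
        return bool(self._buffer)

    def next(self):
        if not self.has_next():
            raise StopIteration
        return self._buffer.pop(0)


# Two callers probing an empty iterator, as in the scenario described above.
empty = EagerCleanupIterator([])
first_probe = empty.has_next()
second_probe = empty.has_next()
```

   Without the `_done` guard, the second probe would call `_process_rows` after cleanup and raise, which is the shape of the problem discussed in the review.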


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]