amogh-jahagirdar commented on PR #15006: URL: https://github.com/apache/iceberg/pull/15006#issuecomment-3746921997
> Can you give some quick notes on which versions of Spark are affected here? It also seems like we are safe if AQE is disabled, is that true? I don't mean to rush you when you are working on the fix, but it would be nice to know the blast radius here.

Sure thing, no problem at all! It'd be Spark 3.4/3.5/4.0/4.1.

TBH, I'm not entirely sure that disabling AQE always guarantees avoiding this. I would need to double check our exact file-splitting logic, but fundamentally this issue can happen when a file task is split across Spark tasks AND there's a delete that spans more than one of those splits. We observed that disabling AQE avoided the issue, but it may just be that AQE amplifies it. I do think there are cases beyond AQE where this could happen, for example depending on split sizes. I'd have to check further.
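To make the triggering condition concrete, here is a toy sketch in Python. It is not Iceberg's or Spark's planning logic: it assumes, purely for illustration, that splits are contiguous row ranges and that a delete is a list of row positions. The names `plan_splits`, `splits_touched`, and `at_risk` are hypothetical helpers, not real APIs.

```python
from bisect import bisect_right


def plan_splits(total_rows, rows_per_split):
    """Toy stand-in for split planning: contiguous [start, end) row ranges.

    Assumption: real planning works on byte ranges and target split sizes;
    row ranges are used here only to keep the model small.
    """
    if rows_per_split <= 0:
        raise ValueError("rows_per_split must be positive")
    return [
        (start, min(start + rows_per_split, total_rows))
        for start in range(0, total_rows, rows_per_split)
    ]


def splits_touched(splits, delete_positions):
    """Return the sorted indexes of splits containing at least one deleted row."""
    starts = [start for start, _ in splits]
    touched = set()
    for pos in delete_positions:
        idx = bisect_right(starts, pos) - 1  # last split starting at or before pos
        if idx >= 0 and pos < splits[idx][1]:
            touched.add(idx)
    return sorted(touched)


def at_risk(splits, delete_positions):
    """Condition from the comment: the file is split AND the delete spans >1 split."""
    return len(splits) > 1 and len(splits_touched(splits, delete_positions)) > 1


if __name__ == "__main__":
    splits = plan_splits(total_rows=10, rows_per_split=4)
    print(splits, at_risk(splits, [1, 9]))
```

In this model the same delete is harmless when the file is read as a single split and flagged once the file is cut into smaller pieces, which matches the suspicion above that split sizes, and not only AQE, may determine whether the issue shows up.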
