zsxwing commented on pull request #29131: URL: https://github.com/apache/spark/pull/29131#issuecomment-659873692
Thanks for raising the PR. Could you clarify what the cost of keeping this workaround is?

I believe KAFKA-7703 has been fixed, since you have verified it using my reproduction code. However, I'd be more conservative. Although I did report KAFKA-7703, I never had any evidence that it was exactly the issue we hit in production, or that it was the only possible issue. Unfortunately, there were not enough logs to prove it. What I do know is that the workaround we patched into Spark did prevent the Kafka consumer from reporting incorrect offsets, but it could also be hiding other, unknown issues.

Currently there is no Spark release using Kafka 2.5.0, so I don't feel confident that no other unknown issue can cause the same incorrect-offset problem. If the cost of keeping this workaround is minor, can we wait until a Spark release using Kafka 2.5.0 has been out for a while? Once such a release is available and people start using it, I can look at our internal logs to see whether the warning log in `fetchLatestOffsets` is really gone, which would be evidence that KAFKA-7703 is likely the only issue.
