HeartSaVioR commented on pull request #34089: URL: https://github.com/apache/spark/pull/34089#issuecomment-1035844411
Defending the change is not going to be productive. I'm not in favor of post-review precisely because of this. The fact is that no one knew about the breaking changes. The analysis was done at a very high level, and we soon found out there were more, thanks to folks in the Kafka community. Whether the breaking changes are minor is not the point. We should have tried to find all of them and evaluate the risks before moving on. We would be much more confident if we had a requirement to consult the Kafka community when upgrading versions. Whether we enforce this for minor version upgrades or only for major version upgrades is another story.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-679%3A+Producer+will+enable+the+strongest+delivery+guarantee+by+default

This is great in terms of stability, but there is no silver bullet. It is a trade-off between possible data loss and performance.

https://cwiki.apache.org/confluence/display/KAFKA/An+analysis+of+the+impact+of+max.in.flight.requests.per.connection+and+acks+on+Producer+performance

The analysis concludes, "We don't understand the behavior of acks=all and acks=1 across different workloads and across the entire latency spectrum. We should leave the default as is.", and yet the default has changed.
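To make the trade-off concrete, here is a minimal sketch of how a user could pin the producer settings explicitly instead of inheriting whatever the client default happens to be. It assumes the pre-KIP-679 defaults were `acks=1` and `enable.idempotence=false` (please verify against the Kafka release notes for your client version). The helper name `pinned_kafka_options` and the constant `LEGACY_PRODUCER_DEFAULTS` are hypothetical, made up for illustration; only the `kafka.` option prefix comes from Spark's Kafka integration.

```python
# Assumed pre-KIP-679 producer defaults (Kafka clients before 3.0).
# This is an assumption for illustration; check it for your client version.
LEGACY_PRODUCER_DEFAULTS = {
    "acks": "1",
    "enable.idempotence": "false",
}


def pinned_kafka_options(overrides=None):
    """Build Spark-style options ("kafka." prefix) that pin producer settings.

    Hypothetical helper: starts from the assumed legacy defaults, applies any
    caller overrides, and prefixes every key with "kafka.".
    """
    settings = dict(LEGACY_PRODUCER_DEFAULTS)
    settings.update(overrides or {})
    return {f"kafka.{key}": value for key, value in settings.items()}


if __name__ == "__main__":
    # Print the options that would be handed to the Kafka sink.
    for name, value in sorted(pinned_kafka_options().items()):
        print(f"{name}={value}")
```

With PySpark, the resulting dictionary could be passed along the lines of `df.write.format("kafka").options(**pinned_kafka_options())`. Pinning the values either way (legacy or `acks=all`) makes the data loss vs. performance choice explicit rather than dependent on the client version.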
