HeartSaVioR commented on pull request #34089:
URL: https://github.com/apache/spark/pull/34089#issuecomment-1034196444


   I don't understand at all. My concern is a general one; I'm not trying to 
claim anything against, or blame, any specific project or version.
   
   The problem we have now is: if we upgrade Kafka from 2.8.1 to 3.x and later 
find blocker issue(s) (anything, e.g. a perf regression) in the Kafka client, 
we have no way to replace the Kafka artifact at runtime to mitigate. We would 
have to just wait for the Kafka community to fix the issue and release a new 
version. Wouldn't that already be problematic? Would we downgrade and release a 
bugfix version immediately just because our dependency has a serious issue?
   
   When someone upgrades a version, they are exposed to the benefits of the new 
version, but also to the risks of new bugs. A new minor version may carry some 
risk; a new major version may carry more, if the project respects semver. We 
seem willing to take the risks, but the risks are effectively propagated to end 
users in their production. That said, we should build a solid enough rationale 
before upgrading a dependency. I'm less concerned about a minor version 
upgrade, but a major version upgrade is a completely different story.
   
   * What are the benefits of upgrading Kafka from 2.8.1 to 3.1 (or even 3.0), 
from Spark's point of view?
   * What are the risks of not upgrading Kafka to 3.1 (or even 3.0)?
   
   For example, upgrading Kafka to 2.5 was very reasonable, because it let 
Spark deal with the hanging issue on offset fetch via a new AdminClient 
feature. The benefits clearly outweighed the risks.
   
   Kafka's amazing guarantee that an old client can talk to a new broker (and 
vice versa) doesn't only help us treat a minor version upgrade of the Kafka 
client as safe. It also works the opposite way: we don't need to rush to 
upgrade the Kafka client unless we identify real pain points when an older 
client communicates with a newer broker.
   
   We are going to release a new "minor" version of Spark, not a major version, 
which means end users will expect a certain amount of risk in upgrading, but 
not as much as a major version upgrade would bring. We have to keep the overall 
risk of upgrading Spark from 3.2 to 3.3 under control.
   
   @ijuma 
   I see lots of bug tickets with affected versions of 2.x but resolved 
versions of 3.0.x and 3.1.0. Does this mean we can no longer expect new 
bugfixes in 2.8.x?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
