Hi Colin/David,

> Like David said, basically it boils down to creating a feature flag for the new proposed __consumer_offsets version, or using a new IBP/metadata.version for it. Both approaches have pros and cons. Using an IBP/metadata.version bump reduces the size of the testing matrix. But using a feature flag allows people to avoid any bugs or pain associated with the change if they don't care about enabling it. This is basically the classic "should I use a feature flag or not?" discussion and we need to have it on a case-by-case basis.
I think most users are not going to care to manage versions for a bunch of different features. The IBP today has many shortcomings, but at least it's tied to a version that users understand (i.e. the release version). How would users know after upgrading to Kafka 3.1, for example, that they need to upgrade the metadata.version to 3 and offsets.version to 4 (or whatever)? It's a lot of overhead trying to understand all of the potential features and what each upgrade actually means to them.

I am wondering if we could give them something more convenient which is tied to the release version. For example, maybe we could use a command like `kafka-features upgrade --release 3.1`, which the broker would then translate to an upgrade to the latest versions of the individual features at the time of the 3.1 release. Basically it's just a static map from release version to feature versions to make the upgrade process more convenient.

Thanks,
Jason
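For illustration, the static mapping described above could be as simple as the sketch below. The feature names and level numbers here are hypothetical, not values from the KIP:

```java
import java.util.Map;

public final class ReleaseFeatureLevels {
    // Hypothetical map from release version to the finalized feature levels
    // current at that release. "kafka-features upgrade --release 3.1" would
    // look up "3.1" and submit one update for all of these features at once.
    private static final Map<String, Map<String, Short>> LEVELS_BY_RELEASE = Map.of(
            "3.0", Map.of("metadata.version", (short) 1),
            "3.1", Map.of("metadata.version", (short) 3, "offsets.version", (short) 4));

    public static Map<String, Short> levelsFor(String release) {
        Map<String, Short> levels = LEVELS_BY_RELEASE.get(release);
        if (levels == null) {
            throw new IllegalArgumentException("Unknown release: " + release);
        }
        return levels;
    }
}
```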
On Wed, Nov 17, 2021 at 6:20 PM Jason Gustafson <ja...@confluent.io> wrote:

> A few additional questions:

> 1. Currently the IBP tells us what version of individual inter-broker RPCs will be used. I think the plan in this KIP is to use ApiVersions request instead to find the highest compatible version (just like clients). Do I have that right?

> 2. The following wasn't very clear to me:

>> Brokers will be able to observe changes to metadata.version by observing the metadata log, and could then submit a new ApiVersionsRequest to the other Kafka nodes.

> Is the purpose of submitting new ApiVersions requests to update the features or the RPC versions? Does metadata.version also influence the versions that a broker advertises? It would help to have more detail about this.

> 3. I imagine users will want to know before performing an upgrade whether downgrading will be safe. Would the --dry-run flag tell them this?

> Thanks,
> Jason
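The "highest compatible version" negotiation mentioned in question 1 works the way it does for clients today: each side advertises a supported [min, max] range per API, and the caller uses the top of the intersection. A toy sketch of that rule, with no Kafka types assumed:

```java
public final class VersionNegotiation {
    // Returns the highest version both sides support, i.e. the top of the
    // intersection of [ourMin, ourMax] and [theirMin, theirMax] as learned
    // from an ApiVersions exchange. Throws if the ranges do not overlap.
    public static short highestCompatible(short ourMin, short ourMax,
                                          short theirMin, short theirMax) {
        short candidate = (short) Math.min(ourMax, theirMax);
        if (candidate < Math.max(ourMin, theirMin)) {
            throw new IllegalStateException("no overlapping version range");
        }
        return candidate;
    }
}
```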
> On Wed, Nov 17, 2021 at 3:55 PM Colin McCabe <cmcc...@apache.org> wrote:

>> On Wed, Nov 17, 2021, at 11:28, Jason Gustafson wrote:

>>> Hi David,

>>> Forgive me if this ground has been covered already. Today, we have a few other things that we have latched onto the IBP, such as upgrades to the format of records in __consumer_offsets. I've been assuming that metadata.version is not covering this. Is that right or is there some other plan to take care of cases like this?

>> I think metadata.version could cover changes to things like __consumer_offsets, if people want it to. Or to put it another way, that is out of scope for this KIP.

>> Like David said, basically it boils down to creating a feature flag for the new proposed __consumer_offsets version, or using a new IBP/metadata.version for it. Both approaches have pros and cons. Using an IBP/metadata.version bump reduces the size of the testing matrix. But using a feature flag allows people to avoid any bugs or pain associated with the change if they don't care about enabling it. This is basically the classic "should I use a feature flag or not?" discussion and we need to have it on a case-by-case basis.

>> I think it's worth calling out that having a 1:1 mapping between IBP versions and metadata.versions will result in some metadata.versions that "don't do anything" (aka they do the same thing as the previous metadata.version). For example, if we change StopReplicaRequest again, that will not affect KRaft mode, but probably would require an IBP bump and hence a metadata.version bump. I think that's OK. It's not that different from updating your IBP and getting support for ZStandard, when your deployment doesn't use ZStandard compression.

>> best,
>> Colin

>>> Thanks,
>>> Jason

>>> On Wed, Nov 17, 2021 at 10:17 AM Jun Rao <j...@confluent.io.invalid> wrote:

>>>> Hi, Colin,

>>>> Thanks for the reply.

>>>> For case b, I am not sure that I understand your suggestion. Does "each subsequent level for metadata.version corresponds to an IBP version" mean that we need to keep IBP forever? Could you describe the upgrade process in this case?

>>>> Jun

>>>> On Tue, Nov 16, 2021 at 3:45 PM Colin McCabe <cmcc...@apache.org> wrote:

>>>>> On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote:

>>>>>> Hi, David, Colin,

>>>>>> Thanks for the reply.

>>>>>> 16. Discussed with David offline a bit. We have 3 cases.

>>>>>> a. We upgrade from an old version where the metadata.version has already been finalized. In this case it makes sense to stay with that feature version after the upgrade.

>>>>> +1

>>>>>> b. We upgrade from an old version where no metadata.version has been finalized. In this case, it makes sense to leave metadata.version disabled since we don't know if all brokers have been upgraded.

>>>>> This is the scenario I was hoping to avoid by saying that ALL KRaft clusters have metadata.version of at least 1, and each subsequent level for metadata.version corresponds to an IBP version. The existing KRaft clusters in 3.0 and earlier are preview (not for production) so I think this change is OK for 3.x (given that it affects only KRaft). Then IBP is irrelevant for KRaft clusters (the config is ignored, possibly with a WARN or ERROR message generated if it is set).

>>>>>> c. We are starting from a brand new cluster and of course no metadata.version has been finalized. In this case, the KIP says it will pick the metadata.version in meta.properties. In the common case, people probably won't set the metadata.version in the meta.properties file explicitly. So, it will be useful to put a default (stable) version there when generating the meta.properties file.

>>>>> Hmm. I was assuming that clusters where the admin didn't specify any metadata.version during formatting would get the latest metadata.version. Partly, because this is what we do for IBP today. It would be good to clarify this...

>>>>>> Also, it would be useful to clarify that if a FeatureLevelRecord exists for metadata.version, the metadata.version in meta.properties will be ignored.

>>>>> Yeah, I agree.

>>>>> best,
>>>>> Colin
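A minimal sketch of the case (c) bootstrap rule being discussed: take metadata.version from meta.properties if the operator set one, otherwise fall back to the latest version the software supports, mirroring what the IBP does today when unset. The property name and the fallback level are assumptions for illustration:

```java
import java.io.FileReader;
import java.io.IOException;
import java.util.Properties;

public final class FormatTimeMetadataVersion {
    // Hypothetical: the newest metadata.version this build knows about.
    private static final short LATEST_SUPPORTED = 3;

    public static short initialMetadataVersion(String metaPropertiesPath) throws IOException {
        Properties props = new Properties();
        try (FileReader reader = new FileReader(metaPropertiesPath)) {
            props.load(reader);
        }
        // An operator-specified version wins; otherwise default to the latest.
        String configured = props.getProperty("metadata.version");
        return configured == null ? LATEST_SUPPORTED : Short.parseShort(configured);
    }
}
```

Either way, as noted just above, a FeatureLevelRecord already committed to the metadata log would take precedence over whatever meta.properties says.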
>>>>>> Thanks,
>>>>>> Jun

>>>>>> On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe <cmcc...@apache.org> wrote:

>>>>>>> On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote:

>>>>>>>> Hi, David,

>>>>>>>> Thanks for the reply.

>>>>>>>> 16. My first concern is that the KIP picks up meta.version inconsistently during the deployment. If a new cluster is started, we pick up the highest version. If we upgrade, we leave the feature version unchanged.

>>>>>>> Hi Jun,

>>>>>>> Thanks again for taking a look.

>>>>>>> The proposed behavior in KIP-778 is consistent with how it works today. Upgrading the software is distinct from upgrading the IBP.

>>>>>>> I think it is important to keep these two operations ("upgrading IBP/metadata version" and "upgrading software version") separate. If they are coupled it will create a situation where software upgrades are difficult and dangerous.

>>>>>>> Consider a situation where you find some bug in your current software, and you want to upgrade to new software that fixes the bug. If upgrades and IBP bumps are coupled, you can't do this without also bumping the IBP, which is usually considered a high-risk change. That means that either you have to make a special build that includes only the fix (time-consuming and error-prone), live with the bug for longer, or be very conservative about ever introducing new IBP/metadata versions. None of those are really good choices.

>>>>>>>> Intuitively, it seems that independent of how a cluster is deployed, we should always pick the same feature version.

>>>>>>> I think it makes sense to draw a distinction between upgrading an existing cluster and deploying a new one. What most people want out of upgrades is that things should keep working, but with bug fixes. If we change that, it just makes people more reluctant to upgrade (which is always a problem...)

>>>>>>>> I think we need to think this through in this KIP. My second concern is that as a particular version matures, it's inconvenient for a user to manually upgrade every feature version. As long as we have a path to achieve that in the future, we don't need to address that in this KIP.

>>>>>>> If people are managing a large number of Kafka clusters, they will want to do some sort of A/B testing with IBP/metadata versions. So if you have 1000 Kafka clusters, you roll out the new IBP version to 10 of them and see how it goes. If that goes well, you roll it out to more, etc.

>>>>>>> So, the automation needs to be at the cluster management layer, not at the Kafka layer. Each Kafka cluster doesn't know how well things went in the other 999 clusters. Automatically upgrading is a bad thing for the same reason Kafka automatically upgrading its own software version would be a bad thing -- it could lead to a disruption to a sensitive cluster at the wrong time.

>>>>>>> For people who are just managing one or two Kafka clusters, upgrading feature versions isn't a big burden and can be done manually. This is all consistent with how IBP works today.

>>>>>>> Also, we already have a command-line option to the feature tool which upgrades all features to the latest available, if that is what the administrator desires. Perhaps we could add documentation saying that this should be done as the last step of the upgrade.

>>>>>>> best,
>>>>>>> Colin

>>>>>>>> 21. "./kafka-features.sh delete": Deleting a feature seems a bit weird since the logic is always there. Would it be better to use disable?

>>>>>>>> Jun

>>>>>>>> On Fri, Nov 5, 2021 at 8:11 AM David Arthur <david.art...@confluent.io.invalid> wrote:
>>>>>>>>> Colin and Jun, thanks for the additional comments!

>>>>>>>>> Colin:

>>>>>>>>>> We've been talking about having an automated RPC compatibility checker

>>>>>>>>> Do we have a way to mark fields in schemas as deprecated? It can stay in the RPC, it just complicates the logic a bit.

>>>>>>>>>> It would be nice if the active controller could validate that a majority of the quorum could use the proposed metadata.version. The active controller should have this information, right? If we don't have recent information from a quorum of voters, we wouldn't be active.

>>>>>>>>> I believe we should have this information from the ApiVersionsResponse. It would be good to do this validation to avoid a situation where a quorum leader can't be elected due to unprocessable records.
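The validation being discussed reduces to a majority check over the per-voter supported ranges learned from ApiVersionsResponse. A sketch, assuming the controller holds a recent range for each voter (the types are illustrative, not from the KIP):

```java
import java.util.Collection;

public final class QuorumUpgradeCheck {
    // A voter's supported metadata.version range, as reported via ApiVersions.
    public record VoterRange(int voterId, short minSupported, short maxSupported) {}

    // Allow a metadata.version bump only if a majority of quorum voters can
    // read records at the proposed level, so that any electable leader is
    // able to process the log.
    public static boolean majoritySupports(Collection<VoterRange> voters, short proposed) {
        long supporting = voters.stream()
                .filter(v -> v.minSupported() <= proposed && proposed <= v.maxSupported())
                .count();
        return supporting > voters.size() / 2;
    }
}
```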
>>>>>>>>>> Do we need delete as a command separate from downgrade?

>>>>>>>>> I think from an operator's perspective, it is nice to distinguish between changing a feature flag and unsetting it. It might be surprising to an operator to see the flag's version set to nothing when they requested the downgrade to version 0 (or less).

>>>>>>>>>> it seems like we should spell out that metadata.version begins at 1 in KRaft clusters

>>>>>>>>> I added this text:

>>>>>>>>>> Introduce an IBP version to indicate the lowest software version that supports *metadata.version*. Below this IBP, the *metadata.version* is undefined and will not be examined. At or above this IBP, the *metadata.version* must be *0* for ZooKeeper clusters and will be initialized as *1* for KRaft clusters.

>>>>>>>>>> We probably also want an RPC implemented by both brokers and controllers that will reveal the min and max supported versions for each feature level supported by the server

>>>>>>>>> This is available in ApiVersionsResponse (we include the server's supported features as well as the cluster's finalized features)

>>>>>>>>> --------

>>>>>>>>> Jun:

>>>>>>>>> 12. I've updated the KIP with AdminClient changes

>>>>>>>>> 14. You're right, it looks like I missed a few sections regarding snapshot generation. I've corrected it

>>>>>>>>> 16. This feels more like an enhancement to KIP-584. I agree it could be useful, but perhaps we could address it separately from KRaft upgrades?

>>>>>>>>> 20. Indeed snapshots are not strictly necessary during an upgrade, I've reworded this

>>>>>>>>> Thanks!
>>>>>>>>> David

>>>>>>>>> On Thu, Nov 4, 2021 at 6:51 PM Jun Rao <j...@confluent.io.invalid> wrote:

>>>>>>>>>> Hi, David, Jose and Colin,

>>>>>>>>>> Thanks for the reply. A few more comments.

>>>>>>>>>> 12. It seems that we haven't updated the AdminClient accordingly?

>>>>>>>>>> 14. "Metadata snapshot is generated and sent to the other inactive controllers and to brokers". I thought we wanted each broker to generate its own snapshot independently? If only the controller generates the snapshot, how do we force other brokers to pick it up?

>>>>>>>>>> 16. If a feature version is new, one may not want to enable it immediately after the cluster is upgraded. However, if a feature version has been stable, requiring every user to run a command to upgrade to that version seems inconvenient. One way to improve this is for each feature to define one version as the default. Then, when we upgrade a cluster, we will automatically upgrade the feature to the default version. An admin could use the tool to upgrade to a version higher than the default.

>>>>>>>>>> 20. "The quorum controller can assist with this process by generating a metadata snapshot after a metadata.version increase has been committed to the metadata log. This snapshot will be a convenient way to let broker and controller components rebuild their entire in-memory state following an upgrade." The new version of the software could read both the new and the old version. Is generating a new snapshot during upgrade needed?

>>>>>>>>>> Jun

>>>>>>>>>> On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe <cmcc...@apache.org> wrote:

>>>>>>>>>>> On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote:

>>>>>>>>>>>> Hi, David,

>>>>>>>>>>>> One more comment.

>>>>>>>>>>>> 16. The main reason why KIP-584 requires finalizing a feature manually is that in the ZK world, the controller doesn't know all brokers in a cluster. A broker temporarily down is not registered in ZK. In the KRaft world, the controller keeps track of all brokers, including those that are temporarily down. This makes it possible for the controller to automatically finalize a feature -- it's safe to do so when all brokers support that feature. This will make the upgrade process much simpler since no manual command is required to turn on a new feature. Have we considered this?

>>>>>>>>>>>> Thanks,

>>>>>>>>>>>> Jun
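The automatic finalization Jun describes rests on a simple safety rule: since a KRaft controller tracks every registered broker, including ones temporarily down, the highest level that is safe to finalize is the minimum of the brokers' maximum supported levels. A sketch of that rule, with illustrative types not taken from the KIP:

```java
import java.util.Collection;
import java.util.OptionalInt;

public final class AutoFinalize {
    // A registered broker's supported feature range.
    public record BrokerRange(short minSupported, short maxSupported) {}

    // The highest feature level every registered broker can support; empty if
    // no brokers are registered. A real implementation would also verify the
    // result is at least each broker's minSupported level.
    public static OptionalInt safeFinalizedLevel(Collection<BrokerRange> brokers) {
        return brokers.stream()
                .mapToInt(b -> b.maxSupported())
                .min();
    }
}
```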
>>>>>>>>>>> Hi Jun,

>>>>>>>>>>> I guess David commented on this point already, but I'll comment as well. I always had the perception that users viewed rolls as potentially risky and were looking for ways to reduce the risk. Not enabling features right away after installing new software seems like one way to do that. If we had a feature to automatically upgrade during a roll, I'm not sure that I would recommend that people use it, because if something fails, it makes it harder to tell if the new feature is at fault, or something else in the new software.

>>>>>>>>>>> We already tell users to do a "double roll" when going to a new IBP. (Just to give background to people who haven't heard that phrase, the first roll installs the new software, and the second roll updates the IBP). So this KIP-778 mechanism is basically very similar to that, except the second thing isn't a roll, but just an upgrade command. So I think this is consistent with what we currently do.

>>>>>>>>>>> Also, just like David said, we can always add auto-upgrade later if there is demand...

>>>>>>>>>>> best,
>>>>>>>>>>> Colin

>>>>>>>>>>>> On Thu, Oct 7, 2021 at 5:19 PM Jun Rao <j...@confluent.io> wrote:

>>>>>>>>>>>>> Hi, David,

>>>>>>>>>>>>> Thanks for the KIP. A few comments below.

>>>>>>>>>>>>> 10. It would be useful to describe how the controller node determines the RPC version used to communicate to other controller nodes. There seems to be a bootstrap problem. A controller node can't read the log and therefore the feature level until a quorum leader is elected. But leader election requires an RPC.

>>>>>>>>>>>>> 11. For downgrades, it would be useful to describe how to determine the downgrade process (generating new snapshot, propagating the snapshot, etc) has completed. We could block the UpdateFeature request until the process is completed. However, since the process could take time, the request could time out. Another way is through DescribeFeature and the server only reports downgraded versions after the process is completed.

>>>>>>>>>>>>> 12. Since we are changing UpdateFeaturesRequest, do we need to change the AdminClient api for updateFeatures too?

>>>>>>>>>>>>> 13. For the paragraph starting with "In the absence of an operator defined value for metadata.version", in KIP-584, we described how to finalize features with New cluster bootstrap. In that case, it's inconvenient for the users to have to run an admin tool to finalize the version for each feature. Instead, the system detects that the /features path is missing in ZK and thus automatically finalizes every feature with the latest supported version. Could we do something similar in the KRaft mode?

>>>>>>>>>>>>> 14. After the quorum leader generates a new snapshot, how do we force other nodes to pick up the new snapshot?

>>>>>>>>>>>>> 15. I agree with Jose that it will be useful to describe when generating a new snapshot is needed. To me, it seems the new snapshot is only needed when incompatible changes are made.

>>>>>>>>>>>>> 7. Jose, what control records were you referring to?

>>>>>>>>>>>>> Thanks,

>>>>>>>>>>>>> Jun
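On question 12: KIP-584 already added an Admin API for this, so the question is whether its shape must change alongside UpdateFeaturesRequest. For reference, finalizing a feature through that API as it existed around the time of this thread looks roughly like the following; the FeatureUpdate constructor has since evolved, so treat this as a sketch rather than the current API:

```java
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.FeatureUpdate;
import org.apache.kafka.clients.admin.UpdateFeaturesOptions;

public final class FinalizeMetadataVersion {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // Finalize "metadata.version" at level 1; the boolean disallows
            // an implicit downgrade if the cluster is already above level 1.
            admin.updateFeatures(
                    Map.of("metadata.version", new FeatureUpdate((short) 1, false)),
                    new UpdateFeaturesOptions()
            ).all().get();
        }
    }
}
```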
>> >> > >> >> > > >>> >> >> > >> >> > > >>> >> >> > >> >> > > >>> You're right, we don't necessarily need to _transmit_ a >> >> > >> snapshot, >> >> > >> >> > since >> >> > >> >> > > >>> each node can generate its own equivalent snapshot >> >> > >> >> > > >>> >> >> > >> >> > > >>> 2. Generate snapshot on downgrade >> >> > >> >> > > >>> > > Metadata snapshot is generated and sent to the >> other >> >> > >> inactive >> >> > >> >> > > >>> > controllers and to brokers (this snapshot may be >> lossy!) >> >> > >> >> > > >>> > Why do we need to send this downgraded snapshot to >> the >> >> > >> brokers? >> >> > >> >> The >> >> > >> >> > > >>> > replicas have seen the FeatureLevelRecord and >> noticed the >> >> > >> >> > downgrade. >> >> > >> >> > > >>> > Can we have the replicas each independently generate >> a >> >> > >> downgraded >> >> > >> >> > > >>> > snapshot at the offset for the downgraded >> >> > FeatureLevelRecord? >> >> > >> I >> >> > >> >> > > assume >> >> > >> >> > > >>> > that the active controller will guarantee that all >> >> records >> >> > >> after >> >> > >> >> > the >> >> > >> >> > > >>> > FatureLevelRecord use the downgraded version. If so, >> it >> >> > would >> >> > >> be >> >> > >> >> > good >> >> > >> >> > > >>> > to mention that explicitly. >> >> > >> >> > > >>> >> >> > >> >> > > >>> >> >> > >> >> > > >>> Similar to above, yes a broker that detects a >> downgrade via >> >> > >> >> > > >>> FeatureLevelRecord could generate its own downgrade >> >> snapshot >> >> > and >> >> > >> >> > reload >> >> > >> >> > > >>> its >> >> > >> >> > > >>> state from that. This does get a little fuzzy when we >> >> > consider >> >> > >> >> cases >> >> > >> >> > > where >> >> > >> >> > > >>> brokers are on different software versions and could be >> >> > >> generating >> >> > >> >> a >> >> > >> >> > > >>> downgrade snapshot for version X, but using different >> >> > versions >> >> > >> of >> >> > >> >> the >> >> > >> >> > > >>> code. >> >> > >> >> > > >>> It might be safer to let the controller generate the >> >> > snapshot so >> >> > >> >> each >> >> > >> >> > > >>> broker (regardless of software version) gets the same >> >> > records. >> >> > >> >> > However, >> >> > >> >> > > >>> for >> >> > >> >> > > >>> upgrades (or downgrades) we expect the whole cluster >> to be >> >> > >> running >> >> > >> >> > the >> >> > >> >> > > >>> same >> >> > >> >> > > >>> software version before triggering the metadata.version >> >> > change, >> >> > >> so >> >> > >> >> > > perhaps >> >> > >> >> > > >>> this isn't a likely scenario. Thoughts? >> >> > >> >> > > >>> >> >> > >> >> > > >>> >> >> > >> >> > > >>> 3. Max metadata version >> >> > >> >> > > >>> > >For the first release that supports >> metadata.version, we >> >> > can >> >> > >> >> > simply >> >> > >> >> > > >>> > initialize metadata.version with the current (and >> only) >> >> > >> version. >> >> > >> >> > For >> >> > >> >> > > >>> future >> >> > >> >> > > >>> > releases, we will need a mechanism to bootstrap a >> >> > particular >> >> > >> >> > version. >> >> > >> >> > > >>> This >> >> > >> >> > > >>> > could be done using the meta.properties file or some >> >> > similar >> >> > >> >> > > mechanism. >> >> > >> >> > > >>> The >> >> > >> >> > > >>> > reason we need the allow for a specific initial >> version >> >> is >> >> > to >> >> > >> >> > support >> >> > >> >> > > >>> the >> >> > >> >> > > >>> > use case of starting a Kafka cluster at version X >> with an >> >> > >> older >> >> > >> >> > > >>> > metadata.version. 
>> >> > >> >> > > >>> >> >> > >> >> > > >>> >> >> > >> >> > > >>> I assume that the Active Controller will learn the >> metadata >> >> > >> version >> >> > >> >> > of >> >> > >> >> > > >>> > the broker through the BrokerRegistrationRequest. How >> >> will >> >> > the >> >> > >> >> > Active >> >> > >> >> > > >>> > Controller learn about the max metadata version of >> the >> >> > >> inactive >> >> > >> >> > > >>> > controller nodes? We currently don't send a >> registration >> >> > >> request >> >> > >> >> > from >> >> > >> >> > > >>> > the inactive controller to the active controller. >> >> > >> >> > > >>> >> >> > >> >> > > >>> >> >> > >> >> > > >>> This came up during the design, but I neglected to add >> it >> >> to >> >> > the >> >> > >> >> KIP. >> >> > >> >> > > We >> >> > >> >> > > >>> will need a mechanism for determining the supported >> >> features >> >> > of >> >> > >> >> each >> >> > >> >> > > >>> controller similar to how brokers use >> >> > BrokerRegistrationRequest. >> >> > >> >> > > Perhaps >> >> > >> >> > > >>> controllers could write a FeatureLevelRecord (or >> similar) >> >> to >> >> > the >> >> > >> >> > > metadata >> >> > >> >> > > >>> log indicating their supported version. WDYT? >> >> > >> >> > > >>> >> >> > >> >> > > >>> Why do you need to bootstrap a particular version? >> Isn't >> >> the >> >> > >> intent >> >> > >> >> > > >>> > that the broker will learn the active metadata >> version by >> >> > >> reading >> >> > >> >> > the >> >> > >> >> > > >>> > metadata before unfencing? >> >> > >> >> > > >>> >> >> > >> >> > > >>> >> >> > >> >> > > >>> This bootstrapping is needed for when a KRaft cluster >> is >> >> > first >> >> > >> >> > > started. If >> >> > >> >> > > >>> we don't have this mechanism, the cluster can't really >> do >> >> > >> anything >> >> > >> >> > > until >> >> > >> >> > > >>> the operator finalizes the metadata.version with the >> tool. >> >> > The >> >> > >> >> > > >>> bootstrapping will be done by the controller and the >> >> brokers >> >> > >> will >> >> > >> >> see >> >> > >> >> > > this >> >> > >> >> > > >>> version as a record (like you say). I'll add some text >> to >> >> > >> clarify >> >> > >> >> > this. >> >> > >> >> > > >>> >> >> > >> >> > > >>> >> >> > >> >> > > >>> 4. Reject Registration - This is related to the bullet >> >> point >> >> > >> above. >> >> > >> >> > > >>> > What will be the behavior of the active controller >> if the >> >> > >> broker >> >> > >> >> > > sends >> >> > >> >> > > >>> > a metadata version that is not compatible with the >> >> cluster >> >> > >> wide >> >> > >> >> > > >>> > metadata version? >> >> > >> >> > > >>> >> >> > >> >> > > >>> >> >> > >> >> > > >>> If a broker starts up with a lower supported version >> range >> >> > than >> >> > >> the >> >> > >> >> > > >>> current >> >> > >> >> > > >>> cluster metadata.version, it should log an error and >> >> > shutdown. >> >> > >> This >> >> > >> >> > is >> >> > >> >> > > in >> >> > >> >> > > >>> line with KIP-584. >> >> > >> >> > > >>> >> >> > >> >> > > >>> 5. Discover upgrade >> >> > >> >> > > >>> > > This snapshot will be a convenient way to let >> broker >> >> and >> >> > >> >> > controller >> >> > >> >> > > >>> > components rebuild their entire in-memory state >> following >> >> > an >> >> > >> >> > upgrade. >> >> > >> >> > > >>> > Can we rely on the presence of the >> FeatureLevelRecord for >> >> > the >> >> > >> >> > > metadata >> >> > >> >> > > >>> > version for this functionality? 
>>>>>>>>>>>>>> On Wed, Sep 29, 2021 at 5:09 PM José Armando García Sancio <jsan...@confluent.io.invalid> wrote:

>>>>>>>>>>>>>>> One more comment.

>>>>>>>>>>>>>>> 7. Downgrade records

>>>>>>>>>>>>>>> I think we should explicitly mention that the downgrade process will downgrade both metadata records and controller records. In KIP-630 we introduced two control records for snapshots.

>>>>>>>>> --
>>>>>>>>> -David