Hi Colin,

On Fri, Apr 25, 2025 at 5:42 PM Colin McCabe <cmcc...@apache.org> wrote:
> > The "initiating the downgrade" section states "It will also check that
> > the requested metadata version is compatible with all the other
> > KIP-584 features that are enabled." What do you mean by this?
>
> So, in Kafka version 4.0, there are no KIP-584 features that rely on a
> specific metadata version. However, there could be some in the future. This
> is just a statement that if we do have these kinds of cross-feature
> dependencies in the future, we will enforce them.

Thanks. I took a look at KIP-584 and KIP-1022. Maybe I missed it, but
neither of them defines how dependencies are specified or checked. I
suspect that we will need a KIP if we ever add dependencies between
features.
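
If we do add them, I would expect the validation to look roughly like
this (a self-contained sketch with hypothetical names and a hypothetical
dependency table, not actual Kafka code):

    import java.util.Map;

    public class DowngradeCheck {
        // Hypothetical table: feature name -> minimum metadata.version
        // feature level it requires. Today this would be empty, since no
        // KIP-584 feature depends on a specific MV.
        static final Map<String, Short> MIN_MV_FOR_FEATURE = Map.of();

        static void validateMvDowngrade(short targetMv,
                                        Map<String, Short> enabledFeatures) {
            for (String feature : enabledFeatures.keySet()) {
                Short requiredMv = MIN_MV_FOR_FEATURE.get(feature);
                if (requiredMv != null && targetMv < requiredMv) {
                    throw new IllegalArgumentException(
                        "Cannot downgrade metadata.version to " + targetMv +
                        " while " + feature + " requires at least " +
                        requiredMv);
                }
            }
        }
    }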

> > In the same section, you have "If the metadata version is too old to
> > support KIP-919 controller registrations (pre 3.7.0-IV0), we will
> > simply assume that the controllers do support downgrade, because we
> > cannot check. This is similar to how metadata version upgrade is
> > handled in the absence of controller registrations." What happens if
> > this assumption is incorrect, the controller doesn't support
> > downgrade? Is there something that the user can do and we can document
> > to mitigate any issues that result from this?
> This is the scenario where some controllers are running the 3.6.0 release or
> older, and some controllers are running 4.2 or newer, on a very old metadata
> version without controller registrations. There really isn't anything we can
> do to detect this since, by definition, the pre-3.7-IV0 metadata versions
> don't have controller registrations.
>

Why are the controllers running 3.6.x? They could be running 4.0.x, but
the MV hasn't been upgraded to 3.7-IV0 (when controller registration was
added), no?
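
Either way, my mental model of the check being described is something
like the following (an illustrative sketch only; the numeric level and
registration shape are made up, the real values live in the controller
code):

    import java.util.List;

    public class RegistrationCheck {
        // Illustrative numeric level for 3.7-IV0, the first MV with
        // KIP-919 controller registrations.
        static final short MV_3_7_IV0 = 15;

        // Each registration advertises the MV range the controller
        // supports.
        record ControllerRegistration(short minSupportedMv,
                                      short maxSupportedMv) {}

        static boolean controllersSupportMv(short targetMv, short currentMv,
                List<ControllerRegistration> regs) {
            if (currentMv < MV_3_7_IV0) {
                // Pre-KIP-919 there are no registrations to consult, so
                // we have to assume the controllers can handle the target
                // MV.
                return true;
            }
            return regs.stream().allMatch(r ->
                targetMv >= r.minSupportedMv() &&
                targetMv <= r.maxSupportedMv());
        }
    }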

> Any problems that result from this should be clearable by restarting the
> ancient controller processes, though (since they will then load the snapshot
> at the older MV).

Are you assuming that the user is going to upgrade the software
version when they restart the "ancient" controller processes?

I am trying to think through what would happen if a controller on
software version <= 4.0.x becomes leader after a controller downgrades
the MV. The controllers on software version <= 4.0.x will still have
metadata records in their logs that were serialized using the
pre-downgrade (newer) MV, no?

> > In the same section, you have "If the updateFeatures RPC specified
> > multiple features, the metadata version downgrade record will be
> > emitted last." Is there a reason why this is required? I think that
> > all of the records would be in the same record batch so it is not
> > clear to me why you need the invariant that the metadata version is
> > last.
> It's required because in theory the other features could depend on 
> metadata.version. So we want to avoid the potential invalid state where MV 
> has been downgraded but the features which depend on it have not.

I still don't fully understand why this is required. First, assuming
feature dependencies are supported, a dependency could exist between
any two features. I know that you are not designing for arbitrary
feature downgrades, but the dependency could be between
transaction.version and group.version. Would those feature-level
records also need to be ordered if we support such downgrades in the
future?
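
For concreteness, here is how I read the "MV last" rule (a sketch with
made-up types, just to pin down what the ordering means):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    public class RecordOrdering {
        record FeatureLevelRecord(String featureName, short featureLevel) {}

        // Build the records for one updateFeatures request, emitting the
        // metadata.version record last so that no prefix of the batch
        // shows MV downgraded while a feature that might depend on it has
        // not been.
        static List<FeatureLevelRecord> buildBatch(Map<String, Short> updates) {
            List<FeatureLevelRecord> records = new ArrayList<>();
            FeatureLevelRecord mvRecord = null;
            for (Map.Entry<String, Short> e : updates.entrySet()) {
                FeatureLevelRecord r =
                    new FeatureLevelRecord(e.getKey(), e.getValue());
                if (e.getKey().equals("metadata.version")) {
                    mvRecord = r;
                } else {
                    records.add(r);
                }
            }
            if (mvRecord != null) {
                records.add(mvRecord);
            }
            return records;
        }
    }

If transaction.version could one day depend on group.version, the same
reasoning would seem to require a topological order over all the
records, not just "MV last".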

> > In the same section, you have "When a broker or controller replays the
> > new FeatureLevelRecord telling it to change the metadata.version, it
> > will immediately trigger writing a __cluster_metadata snapshot with
> > the new metadata version." I assume that a new snapshot will be
> > generated only when the metadata version decreases, is that correct?
> > Can you explain how a change in metadata version will be detected? For
> > example, the FeatureLevelRecord may be in the cluster metadata
> > checkpoint. In that case a new checkpoint doesn't need to be
> > generated. I get the impression that this solution depends on the
> > KRaft listener always returning the latest checkpoint in the
> > RaftClient#handleLoadSnapshot. If so, should we make that explicit?
> >
>
> Hmm, I'm a bit confused about the question. The current Raft interfaces do 
> distinguish between records we load from a snapshot and records we load from 
> the log. So there is no risk of confusing the FeatureLevelRecord we loaded 
> from the latest snapshot with the FeatureLevelRecord we loaded from the log.

Right, I was thinking of the following log and snapshot:

Snapshot at end offset 100. All records between 0 and 99, inclusive,
have been included in the snapshot. The MV in the snapshot is X.

    offset 100      -> metadata.version = Y -- MV was previously upgraded to Y
    offsets 101-109 -> ...                  -- records serialized using MV Y
    offset 110      -> metadata.version = X -- MV was downgraded back to X

Before a snapshot that includes offset 110 (MV X) could be generated,
the node restarts. How would the code identify that the records between
100 and 110 need to be snapshotted using metadata version X? Note that
the metadata loader can batch all of the records between 100 and 110
into one delta.
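
To make the ambiguity concrete (made-up types, not the real loader
code):

    public class MvChangeDetection {
        // One delta as handed to the listener, covering offsets 100-110
        // above.
        record Delta(short mvBefore, short mvAfter, boolean sawMvRecord) {}

        static boolean shouldSnapshot(Delta delta) {
            // Comparing the endpoints misses the X -> Y -> X round trip,
            // because mvBefore == mvAfter == X for the batched delta.
            boolean endpointsDiffer = delta.mvBefore() != delta.mvAfter();
            // Tracking whether any metadata.version FeatureLevelRecord was
            // replayed inside the delta would catch it.
            return endpointsDiffer || delta.sawMvRecord();
        }
    }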

> > Given that checkpoint generation is asynchronous from committing the
> > new metadata version, should we have metrics (or a different
> > mechanism) that the user can monitor to determine when it is safe to
> > downgrade the software version of a node?
> >
> That's a fair point. We should have some documentation about waiting for a 
> snapshot file to appear with a date later than the RPC date, I guess.

I was also thinking about whether we can expose this in a metric, e.g.
LastSnapshottedMetadataVersion. If that metric is X, the user must run
software versions that are new enough to support MV X. What do you
think? The metric could expose the numeric MV rather than the "string"
MV.
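
Something like the following, conceptually (the metric name is my
proposal, and the plumbing is sketched rather than the real metrics
code):

    import java.util.concurrent.atomic.AtomicInteger;

    public class SnapshotMetrics {
        // Proposed gauge: the numeric feature level of the newest MV that
        // has been written into a completed __cluster_metadata snapshot;
        // -1 until the first snapshot completes.
        private final AtomicInteger lastSnapshottedMetadataVersion =
            new AtomicInteger(-1);

        void onSnapshotCompleted(short snapshotMv) {
            lastSnapshottedMetadataVersion.set(snapshotMv);
        }

        int lastSnapshottedMetadataVersion() {
            return lastSnapshottedMetadataVersion.get();
        }
    }

An operator would then wait for the gauge to reach the downgraded level
before rolling the software version back.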

Thanks,
-- 
-José
