Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-12-17 Thread David Arthur
Tom, thanks for taking a look!

1. Yes, you're right, I'll fix that.
2. "Components" refers to broker objects like ReplicaManager,
ClientQuotaManager, etc. Controller components (so far) don't need to
reload any state based on metadata.version changes.
3. Yes, if a controller reads a FeatureLevelRecord from a snapshot or
commit event and it's for an unsupported metadata.version, it will
terminate.
4. I think it should be impossible for the active controller to read an
unsupported metadata.version, barring bugs.
5. Yes, that's possible; however, once B sees the new version, it will close
its connections to A and force the ApiVersions exchange to happen again
(see the sketch after this list). This is an area we're looking to improve
in a future KIP.
6. Yes, it should; I'll add that.
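
To make point 5 a bit more concrete, here is a minimal sketch of the
renegotiation outcome: for each API key, a broker ends up using the highest
version that both sides support. The types below are hypothetical stand-ins,
not Kafka's actual ApiVersions classes.

import java.util.Optional;

// Hypothetical illustration of picking an RPC version after the ApiVersions
// exchange: intersect our supported range with the peer's advertised range
// and use the highest common version, if one exists.
final class RpcVersionNegotiation {

    static final class VersionRange {
        final short min;
        final short max;
        VersionRange(short min, short max) { this.min = min; this.max = max; }
    }

    static Optional<Short> highestCommonVersion(VersionRange ours, VersionRange peers) {
        short low = (short) Math.max(ours.min, peers.min);
        short high = (short) Math.min(ours.max, peers.max);
        return high >= low ? Optional.of(high) : Optional.empty();
    }
}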

Cheers,
David

On Fri, Dec 17, 2021 at 12:16 PM Tom Bentley  wrote:

> Hi David,
>
> Thanks for the KIP. Sorry I'm so late in looking at this. I have a few
> questions.
>
> 1. In the Proposed Changes Overview it says "The controller validates that
> the cluster can be safely downgraded to this version (override with
> --force)" I think that be "--unsafe"?
>
> 2. Also in this section it says "Components reload their state with new
> version" the word "components" is a little vague. I think it means
> controllers and brokers, right?
>
> 3. In the compatibility section it says "A controller running old software
> will join the quorum and begin replicating the metadata log. If this
> inactive controller encounters a FeatureLevelRecord for *metadata.version*
> that it cannot support, it should terminate." I assume "encounters"
> includes the snapshot case?
>
> 4. "In the unlikely event that an active controller encounters an
> unsupported *metadata.version*, it should resign and terminate. ", I'm
> unclear on how this could happen, could you elaborate?
>
> 5. In the ApiVersions part of the Upgrades section, "Brokers will observe
> changes to *metadata.version *as they replicate records from the metadata
> log. If a new *metadata.version* is seen, brokers will renegotiate
> compatible RPCs with other brokers through the ApiVersions workflow."
> Is it possible that broker A does a metadata fetch, sees a changed
> metadata.version and attempts renegotiation with broker B before B has seen
> the metadata.version?
>
> 6. Why does "kafka-features.sh downgrade" not support "--release"?
>
> Kind regards,
>
> Tom
>
> On Tue, Nov 30, 2021 at 10:26 PM Jun Rao  wrote:
>
> > Hi, David,
> >
> > Thanks for the reply. No more questions from me.
> >
> > Jun
> >
> > On Tue, Nov 30, 2021 at 1:52 PM David Arthur
> >  wrote:
> >
> > > Thanks Jun,
> > >
> > > 30: I clarified the description of the "upgrade" command to:
> > >
> > > The FEATURE and VERSION arguments can be repeated to indicate an
> upgrade
> > of
> > > > multiple features in one request. Alternatively, the RELEASE flag can
> > be
> > > > given to upgrade to the default versions of the specified release.
> > These
> > > > two options, FEATURE and RELEASE, are mutually exclusive. If neither
> is
> > > > given, this command will do nothing.
> > >
> > >
> > > Basically, you must provide either "kafka-features.sh upgrade --release
> > > 3.2" or something like "kafka-features.sh upgrade --feature X --version
> > 10"
> > >
> > > -David
> > >
> > > On Tue, Nov 30, 2021 at 2:51 PM Jun Rao 
> > wrote:
> > >
> > > > Hi, David,
> > > >
> > > > Thanks for the reply. Just one more minor comment.
> > > >
> > > > 30. ./kafka-features.sh upgrade: It seems that the release param is
> > > > optional. Could you describe the semantic when release is not
> > specified?
> > > >
> > > > Jun
> > > >
> > > > On Mon, Nov 29, 2021 at 5:06 PM David Arthur
> > > >  wrote:
> > > >
> > > > > Jun, I updated the KIP with the "disable" CLI.
> > > > >
> > > > > For 16, I think you're asking how we can safely introduce the
> > > > > initial version of other feature flags in the future. This will
> > > probably
> > > > > depend on the nature of the feature that the new flag is gating. We
> > can
> > > > > probably take a similar approach and say version 1 is backwards
> > > > compatible
> > > > > to some point (possibly an IBP or metadata.version?).
> > > > >
> > > > > -David
> > > > >
> > > > > On Fri, Nov 19, 2021 at 3:00 PM Jun Rao 
> > > > wrote:
> > > > >
> > > > > > Hi, David,
> > > > > >
> > > > > > Thanks for the reply. The new CLI sounds reasonable to me.
> > > > > >
> > > > > > 16.
> > > > > > For case C, choosing the latest version sounds good to me.
> > > > > > For case B, for metadata.version, we pick version 1 since it just
> > > > happens
> > > > > > that for metadata.version version 1 is backward compatible. How
> do
> > we
> > > > > make
> > > > > > this more general for other features?
> > > > > >
> > > > > > 21. Do you still plan to change "delete" to "disable" in the CLI?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Nov 18, 2021 at 11:50 AM David Arthur
> > > > > >  

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-12-17 Thread Tom Bentley
Hi David,

Thanks for the KIP. Sorry I'm so late in looking at this. I have a few
questions.

1. In the Proposed Changes Overview it says "The controller validates that
the cluster can be safely downgraded to this version (override with
--force)" I think that be "--unsafe"?

2. Also in this section it says "Components reload their state with new
version" the word "components" is a little vague. I think it means
controllers and brokers, right?

3. In the compatibility section it says "A controller running old software
will join the quorum and begin replicating the metadata log. If this
inactive controller encounters a FeatureLevelRecord for *metadata.version*
that it cannot support, it should terminate." I assume "encounters"
includes the snapshot case?

4. "In the unlikely event that an active controller encounters an
unsupported *metadata.version*, it should resign and terminate. ", I'm
unclear on how this could happen, could you elaborate?

5. In the ApiVersions part of the Upgrades section, "Brokers will observe
changes to *metadata.version *as they replicate records from the metadata
log. If a new *metadata.version* is seen, brokers will renegotiate
compatible RPCs with other brokers through the ApiVersions workflow."
Is it possible that broker A does a metadata fetch, sees a changed
metadata.version and attempts renegotiation with broker B before B has seen
the metadata.version?

6. Why does "kafka-features.sh downgrade" not support "--release"?

Kind regards,

Tom

On Tue, Nov 30, 2021 at 10:26 PM Jun Rao  wrote:

> Hi, David,
>
> Thanks for the reply. No more questions from me.
>
> Jun
>
> On Tue, Nov 30, 2021 at 1:52 PM David Arthur
>  wrote:
>
> > Thanks Jun,
> >
> > 30: I clarified the description of the "upgrade" command to:
> >
> > The FEATURE and VERSION arguments can be repeated to indicate an upgrade
> of
> > > multiple features in one request. Alternatively, the RELEASE flag can
> be
> > > given to upgrade to the default versions of the specified release.
> These
> > > two options, FEATURE and RELEASE, are mutually exclusive. If neither is
> > > given, this command will do nothing.
> >
> >
> > Basically, you must provide either "kafka-features.sh upgrade --release
> > 3.2" or something like "kafka-features.sh upgrade --feature X --version
> 10"
> >
> > -David
> >
> > On Tue, Nov 30, 2021 at 2:51 PM Jun Rao 
> wrote:
> >
> > > Hi, David,
> > >
> > > Thanks for the reply. Just one more minor comment.
> > >
> > > 30. ./kafka-features.sh upgrade: It seems that the release param is
> > > optional. Could you describe the semantic when release is not
> specified?
> > >
> > > Jun
> > >
> > > On Mon, Nov 29, 2021 at 5:06 PM David Arthur
> > >  wrote:
> > >
> > > > Jun, I updated the KIP with the "disable" CLI.
> > > >
> > > > For 16, I think you're asking how we can safely introduce the
> > > > initial version of other feature flags in the future. This will
> > probably
> > > > depend on the nature of the feature that the new flag is gating. We
> can
> > > > probably take a similar approach and say version 1 is backwards
> > > compatible
> > > > to some point (possibly an IBP or metadata.version?).
> > > >
> > > > -David
> > > >
> > > > On Fri, Nov 19, 2021 at 3:00 PM Jun Rao 
> > > wrote:
> > > >
> > > > > Hi, David,
> > > > >
> > > > > Thanks for the reply. The new CLI sounds reasonable to me.
> > > > >
> > > > > 16.
> > > > > For case C, choosing the latest version sounds good to me.
> > > > > For case B, for metadata.version, we pick version 1 since it just
> > > happens
> > > > > that for metadata.version version 1 is backward compatible. How do
> we
> > > > make
> > > > > this more general for other features?
> > > > >
> > > > > 21. Do you still plan to change "delete" to "disable" in the CLI?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Nov 18, 2021 at 11:50 AM David Arthur
> > > > >  wrote:
> > > > >
> > > > > > Jun,
> > > > > >
> > > > > > The KIP has some changes to the CLI for KIP-584. With Jason's
> > > > suggestion
> > > > > > incorporated, these three commands would look like:
> > > > > >
> > > > > > 1) kafka-features.sh upgrade --release latest
> > > > > > upgrades all known features to their defaults in the latest
> release
> > > > > >
> > > > > > 2) kafka-features.sh downgrade --release 3.x
> > > > > > downgrade all known features to the default versions from 3.x
> > > > > >
> > > > > > 3) kafka-features.sh describe --release latest
> > > > > > Describe the known features of the latest release
> > > > > >
> > > > > > The --upgrade-all and --downgrade-all are replaced by specifying
> each
> > > > > > feature+version or giving the --release argument. One problem
> with
> > > > > > --downgrade-all is it's not clear what the target versions should
> > be:
> > > > the
> > > > > > previous version before the last upgrade -- or the lowest
> supported
> > > > > > version. Since downgrades will be less common, I think it's
> better
> 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-30 Thread Jun Rao
Hi, David,

Thanks for the reply. No more questions from me.

Jun

On Tue, Nov 30, 2021 at 1:52 PM David Arthur
 wrote:

> Thanks Jun,
>
> 30: I clarified the description of the "upgrade" command to:
>
> The FEATURE and VERSION arguments can be repeated to indicate an upgrade of
> > multiple features in one request. Alternatively, the RELEASE flag can be
> > given to upgrade to the default versions of the specified release. These
> > two options, FEATURE and RELEASE, are mutually exclusive. If neither is
> > given, this command will do nothing.
>
>
> Basically, you must provide either "kafka-features.sh upgrade --release
> 3.2" or something like "kafka-features.sh upgrade --feature X --version 10"
>
> -David
>
> On Tue, Nov 30, 2021 at 2:51 PM Jun Rao  wrote:
>
> > Hi, David,
> >
> > Thanks for the reply. Just one more minor comment.
> >
> > 30. ./kafka-features.sh upgrade: It seems that the release param is
> > optional. Could you describe the semantic when release is not specified?
> >
> > Jun
> >
> > On Mon, Nov 29, 2021 at 5:06 PM David Arthur
> >  wrote:
> >
> > > Jun, I updated the KIP with the "disable" CLI.
> > >
> > > For 16, I think you're asking how we can safely introduce the
> > > initial version of other feature flags in the future. This will
> probably
> > > depend on the nature of the feature that the new flag is gating. We can
> > > probably take a similar approach and say version 1 is backwards
> > compatible
> > > to some point (possibly an IBP or metadata.version?).
> > >
> > > -David
> > >
> > > On Fri, Nov 19, 2021 at 3:00 PM Jun Rao 
> > wrote:
> > >
> > > > Hi, David,
> > > >
> > > > Thanks for the reply. The new CLI sounds reasonable to me.
> > > >
> > > > 16.
> > > > For case C, choosing the latest version sounds good to me.
> > > > For case B, for metadata.version, we pick version 1 since it just
> > happens
> > > > that for metadata.version version 1 is backward compatible. How do we
> > > make
> > > > this more general for other features?
> > > >
> > > > 21. Do you still plan to change "delete" to "disable" in the CLI?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > >
> > > > On Thu, Nov 18, 2021 at 11:50 AM David Arthur
> > > >  wrote:
> > > >
> > > > > Jun,
> > > > >
> > > > > The KIP has some changes to the CLI for KIP-584. With Jason's
> > > suggestion
> > > > > incorporated, these three commands would look like:
> > > > >
> > > > > 1) kafka-features.sh upgrade --release latest
> > > > > upgrades all known features to their defaults in the latest release
> > > > >
> > > > > 2) kafka-features.sh downgrade --release 3.x
> > > > > downgrade all known features to the default versions from 3.x
> > > > >
> > > > > 3) kafka-features.sh describe --release latest
> > > > > Describe the known features of the latest release
> > > > >
> > > > The --upgrade-all and --downgrade-all are replaced by specifying each
> > > > > feature+version or giving the --release argument. One problem with
> > > > > --downgrade-all is it's not clear what the target versions should
> be:
> > > the
> > > > > previous version before the last upgrade -- or the lowest supported
> > > > > version. Since downgrades will be less common, I think it's better
> to
> > > > make
> > > > > the operator be more explicit about the desired downgrade version
> > > (either
> > > > > through [--feature X --version Y] or [--release 3.1]). Does that
> seem
> > > > > reasonable?
> > > > >
> > > > > 16. For case C, I think we will want to always use the latest
> > > > > metadata.version for new clusters (like we do for IBP). We can
> always
> > > > > change what we mean by "default" down the road. Also, this will
> > likely
> > > > > become a bit of information we include in release and upgrade notes
> > > with
> > > > > each release.
> > > > >
> > > > > -David
> > > > >
> > > > > On Thu, Nov 18, 2021 at 1:14 PM Jun Rao 
> > > > wrote:
> > > > >
> > > > > > Hi, Jason, David,
> > > > > >
> > > > > > Just to clarify on the interaction with the end user, the design
> in
> > > > > KIP-584
> > > > > > allows one to do the following.
> > > > > > (1) kafka-features.sh  --upgrade-all --bootstrap-server
> > > > > > kafka-broker0.prn1:9071 to upgrade all features to the latest
> > version
> > > > > known
> > > > > > by the tool. The user doesn't need to know a specific feature
> > > version.
> > > > > > (2) kafka-features.sh  --downgrade-all --bootstrap-server
> > > > > > kafka-broker0.prn1:9071 to downgrade all features to the latest
> > > version
> > > > > > known by the tool. The user doesn't need to know a specific
> feature
> > > > > > version.
> > > > > > (3) kafka-features.sh  --describe --bootstrap-server
> > > > > > kafka-broker0.prn1:9071 to find out the supported version for
> each
> > > > > feature.
> > > > > > This allows the user to upgrade each feature individually.
> > > > > >
> > > > > > Most users will be doing (1) and occasionally (2), and won't need
> > to
> > > > know
> > > > > > the exact version 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-30 Thread David Arthur
Thanks Jun,

30: I clarified the description of the "upgrade" command to:

The FEATURE and VERSION arguments can be repeated to indicate an upgrade of
> multiple features in one request. Alternatively, the RELEASE flag can be
> given to upgrade to the default versions of the specified release. These
> two options, FEATURE and RELEASE, are mutually exclusive. If neither is
> given, this command will do nothing.


Basically, you must provide either "kafka-features.sh upgrade --release
3.2" or something like "kafka-features.sh upgrade --feature X --version 10"

-David

On Tue, Nov 30, 2021 at 2:51 PM Jun Rao  wrote:

> Hi, David,
>
> Thanks for the reply. Just one more minor comment.
>
> 30. ./kafka-features.sh upgrade: It seems that the release param is
> optional. Could you describe the semantic when release is not specified?
>
> Jun
>
> On Mon, Nov 29, 2021 at 5:06 PM David Arthur
>  wrote:
>
> > Jun, I updated the KIP with the "disable" CLI.
> >
> > For 16, I think you're asking how we can safely introduce the
> > initial version of other feature flags in the future. This will probably
> > depend on the nature of the feature that the new flag is gating. We can
> > probably take a similar approach and say version 1 is backwards
> compatible
> > to some point (possibly an IBP or metadata.version?).
> >
> > -David
> >
> > On Fri, Nov 19, 2021 at 3:00 PM Jun Rao 
> wrote:
> >
> > > Hi, David,
> > >
> > > Thanks for the reply. The new CLI sounds reasonable to me.
> > >
> > > 16.
> > > For case C, choosing the latest version sounds good to me.
> > > For case B, for metadata.version, we pick version 1 since it just
> happens
> > > that for metadata.version version 1 is backward compatible. How do we
> > make
> > > this more general for other features?
> > >
> > > 21. Do you still plan to change "delete" to "disable" in the CLI?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > >
> > > On Thu, Nov 18, 2021 at 11:50 AM David Arthur
> > >  wrote:
> > >
> > > > Jun,
> > > >
> > > > The KIP has some changes to the CLI for KIP-584. With Jason's
> > suggestion
> > > > incorporated, these three commands would look like:
> > > >
> > > > 1) kafka-features.sh upgrade --release latest
> > > > upgrades all known features to their defaults in the latest release
> > > >
> > > > 2) kafka-features.sh downgrade --release 3.x
> > > > downgrade all known features to the default versions from 3.x
> > > >
> > > > 3) kafka-features.sh describe --release latest
> > > > Describe the known features of the latest release
> > > >
> > > > The --upgrade-all and --downgrade-all are replaced by specifying each
> > > > feature+version or giving the --release argument. One problem with
> > > > --downgrade-all is it's not clear what the target versions should be:
> > the
> > > > previous version before the last upgrade -- or the lowest supported
> > > > version. Since downgrades will be less common, I think it's better to
> > > make
> > > > the operator be more explicit about the desired downgrade version
> > (either
> > > > through [--feature X --version Y] or [--release 3.1]). Does that seem
> > > > reasonable?
> > > >
> > > > 16. For case C, I think we will want to always use the latest
> > > > metadata.version for new clusters (like we do for IBP). We can always
> > > > change what we mean by "default" down the road. Also, this will
> likely
> > > > become a bit of information we include in release and upgrade notes
> > with
> > > > each release.
> > > >
> > > > -David
> > > >
> > > > On Thu, Nov 18, 2021 at 1:14 PM Jun Rao 
> > > wrote:
> > > >
> > > > > Hi, Jason, David,
> > > > >
> > > > > Just to clarify on the interaction with the end user, the design in
> > > > KIP-584
> > > > > allows one to do the following.
> > > > > (1) kafka-features.sh  --upgrade-all --bootstrap-server
> > > > > kafka-broker0.prn1:9071 to upgrade all features to the latest
> version
> > > > known
> > > > > by the tool. The user doesn't need to know a specific feature
> > version.
> > > > > (2) kafka-features.sh  --downgrade-all --bootstrap-server
> > > > > kafka-broker0.prn1:9071 to downgrade all features to the latest
> > version
> > > > > known by the tool. The user doesn't need to know a specific feature
> > > > > version.
> > > > > (3) kafka-features.sh  --describe --bootstrap-server
> > > > > kafka-broker0.prn1:9071 to find out the supported version for each
> > > > feature.
> > > > > This allows the user to upgrade each feature individually.
> > > > >
> > > > > Most users will be doing (1) and occasionally (2), and won't need
> to
> > > know
> > > > > the exact version of each feature.
> > > > >
> > > > > 16. For case C, what's the default version? Is it always the
> latest?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > >
> > > > > On Thu, Nov 18, 2021 at 8:15 AM David Arthur
> > > > >  wrote:
> > > > >
> > > > > > Colin, thanks for the detailed response. I understand what you're
> > > > saying
> > > > > > and I agree with your 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-30 Thread Jun Rao
Hi, David,

Thanks for the reply. Just one more minor comment.

30. ./kafka-features.sh upgrade: It seems that the release param is
optional. Could you describe the semantic when release is not specified?

Jun

On Mon, Nov 29, 2021 at 5:06 PM David Arthur
 wrote:

> Jun, I updated the KIP with the "disable" CLI.
>
> For 16, I think you're asking how we can safely introduce the
> initial version of other feature flags in the future. This will probably
> depend on the nature of the feature that the new flag is gating. We can
> probably take a similar approach and say version 1 is backwards compatible
> to some point (possibly an IBP or metadata.version?).
>
> -David
>
> On Fri, Nov 19, 2021 at 3:00 PM Jun Rao  wrote:
>
> > Hi, David,
> >
> > Thanks for the reply. The new CLI sounds reasonable to me.
> >
> > 16.
> > For case C, choosing the latest version sounds good to me.
> > For case B, for metadata.version, we pick version 1 since it just happens
> > that for metadata.version version 1 is backward compatible. How do we
> make
> > this more general for other features?
> >
> > 21. Do you still plan to change "delete" to "disable" in the CLI?
> >
> > Thanks,
> >
> > Jun
> >
> >
> >
> > On Thu, Nov 18, 2021 at 11:50 AM David Arthur
> >  wrote:
> >
> > > Jun,
> > >
> > > The KIP has some changes to the CLI for KIP-584. With Jason's
> suggestion
> > > incorporated, these three commands would look like:
> > >
> > > 1) kafka-features.sh upgrade --release latest
> > > upgrades all known features to their defaults in the latest release
> > >
> > > 2) kafka-features.sh downgrade --release 3.x
> > > downgrade all known features to the default versions from 3.x
> > >
> > > 3) kafka-features.sh describe --release latest
> > > Describe the known features of the latest release
> > >
> > > The --upgrade-all and --downgrade-all are replaced by specifying each
> > > feature+version or giving the --release argument. One problem with
> > > --downgrade-all is it's not clear what the target versions should be:
> the
> > > previous version before the last upgrade -- or the lowest supported
> > > version. Since downgrades will be less common, I think it's better to
> > make
> > > the operator be more explicit about the desired downgrade version
> (either
> > > through [--feature X --version Y] or [--release 3.1]). Does that seem
> > > reasonable?
> > >
> > > 16. For case C, I think we will want to always use the latest
> > > metadata.version for new clusters (like we do for IBP). We can always
> > > change what we mean by "default" down the road. Also, this will likely
> > > become a bit of information we include in release and upgrade notes
> with
> > > each release.
> > >
> > > -David
> > >
> > > On Thu, Nov 18, 2021 at 1:14 PM Jun Rao 
> > wrote:
> > >
> > > > Hi, Jason, David,
> > > >
> > > > Just to clarify on the interaction with the end user, the design in
> > > KIP-584
> > > > allows one to do the following.
> > > > (1) kafka-features.sh  --upgrade-all --bootstrap-server
> > > > kafka-broker0.prn1:9071 to upgrade all features to the latest version
> > > known
> > > > by the tool. The user doesn't need to know a specific feature
> version.
> > > > (2) kafka-features.sh  --downgrade-all --bootstrap-server
> > > > kafka-broker0.prn1:9071 to downgrade all features to the latest
> version
> > > > known by the tool. The user doesn't need to know a specific feature
> > > > version.
> > > > (3) kafka-features.sh  --describe --bootstrap-server
> > > > kafka-broker0.prn1:9071 to find out the supported version for each
> > > feature.
> > > > This allows the user to upgrade each feature individually.
> > > >
> > > > Most users will be doing (1) and occasionally (2), and won't need to
> > know
> > > > the exact version of each feature.
> > > >
> > > > 16. For case C, what's the default version? Is it always the latest?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Thu, Nov 18, 2021 at 8:15 AM David Arthur
> > > >  wrote:
> > > >
> > > > > Colin, thanks for the detailed response. I understand what you're
> > > saying
> > > > > and I agree with your rationale.
> > > > >
> > > > > It seems like we could just initialize their cluster.metadata to 1
> > when
> > > > the
> > > > > > software is upgraded to 3.2.
> > > > > >
> > > > >
> > > > > Concretely, this means the controller would see that there is no
> > > > > FeatureLevelRecord in the log, and so it would bootstrap one.
> > Normally
> > > > for
> > > > > cluster initialization, this would be read from meta.properties,
> but
> > in
> > > > the
> > > > > case of preview clusters we would need to special case it to
> > initialize
> > > > the
> > > > > version to 1.
> > > > >
> > > > > Once the new FeatureLevelRecord has been committed, any nodes
> > (brokers
> > > or
> > > > > controllers) that are running the latest software will react to the
> > new
> > > > > metadata.version. This means we will need to make this initial
> > version
> > > > of 1
> > > > > be 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-29 Thread David Arthur
Jun, I updated the KIP with the "disable" CLI.

For 16, I think you're asking how we can safely introduce the
initial version of other feature flags in the future. This will probably
depend on the nature of the feature that the new flag is gating. We can
probably take a similar approach and say version 1 is backwards compatible
to some point (possibly an IBP or metadata.version?).

-David

On Fri, Nov 19, 2021 at 3:00 PM Jun Rao  wrote:

> Hi, David,
>
> Thanks for the reply. The new CLI sounds reasonable to me.
>
> 16.
> For case C, choosing the latest version sounds good to me.
> For case B, for metadata.version, we pick version 1 since it just happens
> that for metadata.version version 1 is backward compatible. How do we make
> this more general for other features?
>
> 21. Do you still plan to change "delete" to "disable" in the CLI?
>
> Thanks,
>
> Jun
>
>
>
> On Thu, Nov 18, 2021 at 11:50 AM David Arthur
>  wrote:
>
> > Jun,
> >
> > The KIP has some changes to the CLI for KIP-584. With Jason's suggestion
> > incorporated, these three commands would look like:
> >
> > 1) kafka-features.sh upgrade --release latest
> > upgrades all known features to their defaults in the latest release
> >
> > 2) kafka-features.sh downgrade --release 3.x
> > downgrade all known features to the default versions from 3.x
> >
> > 3) kafka-features.sh describe --release latest
> > Describe the known features of the latest release
> >
> > The --upgrade-all and --downgrade-all are replaced by specifying each
> > feature+version or giving the --release argument. One problem with
> > --downgrade-all is it's not clear what the target versions should be: the
> > previous version before the last upgrade -- or the lowest supported
> > version. Since downgrades will be less common, I think it's better to
> make
> > the operator be more explicit about the desired downgrade version (either
> > through [--feature X --version Y] or [--release 3.1]). Does that seem
> > reasonable?
> >
> > 16. For case C, I think we will want to always use the latest
> > metadata.version for new clusters (like we do for IBP). We can always
> > change what we mean by "default" down the road. Also, this will likely
> > become a bit of information we include in release and upgrade notes with
> > each release.
> >
> > -David
> >
> > On Thu, Nov 18, 2021 at 1:14 PM Jun Rao 
> wrote:
> >
> > > Hi, Jason, David,
> > >
> > > Just to clarify on the interaction with the end user, the design in
> > KIP-584
> > > allows one to do the following.
> > > (1) kafka-features.sh  --upgrade-all --bootstrap-server
> > > kafka-broker0.prn1:9071 to upgrade all features to the latest version
> > known
> > > by the tool. The user doesn't need to know a specific feature version.
> > > (2) kafka-features.sh  --downgrade-all --bootstrap-server
> > > kafka-broker0.prn1:9071 to downgrade all features to the latest version
> > > known by the tool. The user doesn't need to know a specific feature
> > > version.
> > > (3) kafka-features.sh  --describe --bootstrap-server
> > > kafka-broker0.prn1:9071 to find out the supported version for each
> > feature.
> > > This allows the user to upgrade each feature individually.
> > >
> > > Most users will be doing (1) and occasionally (2), and won't need to
> know
> > > the exact version of each feature.
> > >
> > > 16. For case C, what's the default version? Is it always the latest?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > > On Thu, Nov 18, 2021 at 8:15 AM David Arthur
> > >  wrote:
> > >
> > > > Colin, thanks for the detailed response. I understand what you're
> > saying
> > > > and I agree with your rationale.
> > > >
> > > > It seems like we could just initialize their cluster.metadata to 1
> when
> > > the
> > > > > software is upgraded to 3.2.
> > > > >
> > > >
> > > > Concretely, this means the controller would see that there is no
> > > > FeatureLevelRecord in the log, and so it would bootstrap one.
> Normally
> > > for
> > > > cluster initialization, this would be read from meta.properties, but
> in
> > > the
> > > > case of preview clusters we would need to special case it to
> initialize
> > > the
> > > > version to 1.
> > > >
> > > > Once the new FeatureLevelRecord has been committed, any nodes
> (brokers
> > or
> > > > controllers) that are running the latest software will react to the
> new
> > > > metadata.version. This means we will need to make this initial
> version
> > > of 1
> > > > be backwards compatible to what we have in 3.0 and 3.1 (since some
> > > brokers
> > > > will be on the new software and some on older/preview versions)
> > > >
> > > > I agree that auto-upgrading preview users from IBP 3.0 to
> > > metadata.version
> > > > 1 (equivalent to IBP 3.2) is probably fine.
> > > >
> > > > Back to Jun's case B:
> > > >
> > > > b. We upgrade from an old version where no metadata.version has been
> > > > > finalized. In this case, it makes sense to leave metadata.version
> > > > disabled
> > > > > since we don't 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-19 Thread Jun Rao
Hi, David,

Thanks for the reply. The new CLI sounds reasonable to me.

16.
For case C, choosing the latest version sounds good to me.
For case B, for metadata.version, we pick version 1 since it just happens
that for metadata.version version 1 is backward compatible. How do we make
this more general for other features?

21. Do you still plan to change "delete" to "disable" in the CLI?

Thanks,

Jun



On Thu, Nov 18, 2021 at 11:50 AM David Arthur
 wrote:

> Jun,
>
> The KIP has some changes to the CLI for KIP-584. With Jason's suggestion
> incorporated, these three commands would look like:
>
> 1) kafka-features.sh upgrade --release latest
> upgrades all known features to their defaults in the latest release
>
> 2) kafka-features.sh downgrade --release 3.x
> downgrade all known features to the default versions from 3.x
>
> 3) kafka-features.sh describe --release latest
> Describe the known features of the latest release
>
> The --upgrade-all and --downgrade-all are replaced by specifying each
> feature+version or giving the --release argument. One problem with
> --downgrade-all is it's not clear what the target versions should be: the
> previous version before the last upgrade -- or the lowest supported
> version. Since downgrades will be less common, I think it's better to make
> the operator be more explicit about the desired downgrade version (either
> through [--feature X --version Y] or [--release 3.1]). Does that seem
> reasonable?
>
> 16. For case C, I think we will want to always use the latest
> metadata.version for new clusters (like we do for IBP). We can always
> change what we mean by "default" down the road. Also, this will likely
> become a bit of information we include in release and upgrade notes with
> each release.
>
> -David
>
> On Thu, Nov 18, 2021 at 1:14 PM Jun Rao  wrote:
>
> > Hi, Jason, David,
> >
> > Just to clarify on the interaction with the end user, the design in
> KIP-584
> > allows one to do the following.
> > (1) kafka-features.sh  --upgrade-all --bootstrap-server
> > kafka-broker0.prn1:9071 to upgrade all features to the latest version
> known
> > by the tool. The user doesn't need to know a specific feature version.
> > (2) kafka-features.sh  --downgrade-all --bootstrap-server
> > kafka-broker0.prn1:9071 to downgrade all features to the latest version
> > known by the tool. The user doesn't need to know a specific feature
> > version.
> > (3) kafka-features.sh  --describe --bootstrap-server
> > kafka-broker0.prn1:9071 to find out the supported version for each
> feature.
> > This allows the user to upgrade each feature individually.
> >
> > Most users will be doing (1) and occasionally (2), and won't need to know
> > the exact version of each feature.
> >
> > 16. For case C, what's the default version? Is it always the latest?
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Thu, Nov 18, 2021 at 8:15 AM David Arthur
> >  wrote:
> >
> > > Colin, thanks for the detailed response. I understand what you're
> saying
> > > and I agree with your rationale.
> > >
> > > It seems like we could just initialize their cluster.metadata to 1 when
> > the
> > > > software is upgraded to 3.2.
> > > >
> > >
> > > Concretely, this means the controller would see that there is no
> > > FeatureLevelRecord in the log, and so it would bootstrap one. Normally
> > for
> > > cluster initialization, this would be read from meta.properties, but in
> > the
> > > case of preview clusters we would need to special case it to initialize
> > the
> > > version to 1.
> > >
> > > Once the new FeatureLevelRecord has been committed, any nodes (brokers
> or
> > > controllers) that are running the latest software will react to the new
> > > metadata.version. This means we will need to make this initial version
> > of 1
> > > be backwards compatible to what we have in 3.0 and 3.1 (since some
> > brokers
> > > will be on the new software and some on older/preview versions)
> > >
> > > I agree that auto-upgrading preview users from IBP 3.0 to
> > metadata.version
> > > 1 (equivalent to IBP 3.2) is probably fine.
> > >
> > > Back to Jun's case B:
> > >
> > > b. We upgrade from an old version where no metadata.version has been
> > > > finalized. In this case, it makes sense to leave metadata.version
> > > disabled
> > > > since we don't know if all brokers have been upgraded.
> > >
> > >
> > > Instead of leaving it uninitialized, we would initialize it to 1 which
> > > would be backwards compatible to 3.0 and 3.1. This would eliminate the
> > need
> > > to check a "final IBP" or deal with 3.2+ clusters without a
> > > metadata.version set. The downgrade path for 3.2 back to a preview
> > release
> > > should be fine since we are saying that metadata.version 1 is
> compatible
> > > with those releases. The FeatureLevelRecord would exist, but it would
> be
> > > ignored (we might need to make sure this actually works in 3.0).
> > >
> > > For clarity, here are the three workflows of Jun's three cases:
> > >
> > > A (KRaft 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-18 Thread David Arthur
Jun,

The KIP has some changes to the CLI for KIP-584. With Jason's suggestion
incorporated, these three commands would look like:

1) kafka-features.sh upgrade --release latest
upgrades all known features to their defaults in the latest release

2) kafka-features.sh downgrade --release 3.x
downgrade all known features to the default versions from 3.x

3) kafka-features.sh describe --release latest
Describe the known features of the latest release

The --upgrade-all and --downgrade-all are replaced by specifying each
feature+version or giving the --release argument. One problem with
--downgrade-all is it's not clear what the target versions should be: the
previous version before the last upgrade -- or the lowest supported
version. Since downgrades will be less common, I think it's better to make
the operator be more explicit about the desired downgrade version (either
through [--feature X --version Y] or [--release 3.1]). Does that seem
reasonable?

16. For case C, I think we will want to always use the latest
metadata.version for new clusters (like we do for IBP). We can always
change what we mean by "default" down the road. Also, this will likely
become a bit of information we include in release and upgrade notes with
each release.

-David

On Thu, Nov 18, 2021 at 1:14 PM Jun Rao  wrote:

> Hi, Jason, David,
>
> Just to clarify on the interaction with the end user, the design in KIP-584
> allows one to do the following.
> (1) kafka-features.sh  --upgrade-all --bootstrap-server
> kafka-broker0.prn1:9071 to upgrade all features to the latest version known
> by the tool. The user doesn't need to know a specific feature version.
> (2) kafka-features.sh  --downgrade-all --bootstrap-server
> kafka-broker0.prn1:9071 to downgrade all features to the latest version
> known by the tool. The user doesn't need to know a specific feature
> version.
> (3) kafka-features.sh  --describe --bootstrap-server
> kafka-broker0.prn1:9071 to find out the supported version for each feature.
> This allows the user to upgrade each feature individually.
>
> Most users will be doing (1) and occasionally (2), and won't need to know
> the exact version of each feature.
>
> 16. For case C, what's the default version? Is it always the latest?
>
> Thanks,
>
> Jun
>
>
> On Thu, Nov 18, 2021 at 8:15 AM David Arthur
>  wrote:
>
> > Colin, thanks for the detailed response. I understand what you're saying
> > and I agree with your rationale.
> >
> > It seems like we could just initialize their cluster.metadata to 1 when
> the
> > > software is upgraded to 3.2.
> > >
> >
> > Concretely, this means the controller would see that there is no
> > FeatureLevelRecord in the log, and so it would bootstrap one. Normally
> for
> > cluster initialization, this would be read from meta.properties, but in
> the
> > case of preview clusters we would need to special case it to initialize
> the
> > version to 1.
> >
> > Once the new FeatureLevelRecord has been committed, any nodes (brokers or
> > controllers) that are running the latest software will react to the new
> > metadata.version. This means we will need to make this initial version
> of 1
> > be backwards compatible to what we have in 3.0 and 3.1 (since some
> brokers
> > will be on the new software and some on older/preview versions)
> >
> > I agree that auto-upgrading preview users from IBP 3.0 to
> metadata.version
> > 1 (equivalent to IBP 3.2) is probably fine.
> >
> > Back to Jun's case B:
> >
> > b. We upgrade from an old version where no metadata.version has been
> > > finalized. In this case, it makes sense to leave metadata.version
> > disabled
> > > since we don't know if all brokers have been upgraded.
> >
> >
> > Instead of leaving it uninitialized, we would initialize it to 1 which
> > would be backwards compatible to 3.0 and 3.1. This would eliminate the
> need
> > to check a "final IBP" or deal with 3.2+ clusters without a
> > metadata.version set. The downgrade path for 3.2 back to a preview
> release
> > should be fine since we are saying that metadata.version 1 is compatible
> > with those releases. The FeatureLevelRecord would exist, but it would be
> > ignored (we might need to make sure this actually works in 3.0).
> >
> > For clarity, here are the three workflows of Jun's three cases:
> >
> > A (KRaft 3.2+ to KRaft 3.2+):
> > * User initializes cluster with an explicit metadata.version X, or the
> tool
> > selects the default
> > * After upgrade, cluster stays at version X until operator upgrades to Y
> >
> > B (KRaft Preview to KRaft 3.2+):
> > * User initializes cluster without metadata.version
> > * After controller leader is upgraded, initializes metadata.version to 1
> > * After the cluster is upgraded, an operator may further upgrade to
> version
> > Y
> >
> > C (New KRaft 3.2+):
> > * User initializes cluster with an explicit metadata.version X, or the
> tool
> > selects the default
> >
> >
> > This has been mentioned in this thread, but we should explicitly call out
> > 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-18 Thread Jun Rao
Hi, Jason, David,

Just to clarify on the interaction with the end user, the design in KIP-584
allows one to do the following.
(1) kafka-features.sh  --upgrade-all --bootstrap-server
kafka-broker0.prn1:9071 to upgrade all features to the latest version known
by the tool. The user doesn't need to know a specific feature version.
(2) kafka-features.sh  --downgrade-all --bootstrap-server
kafka-broker0.prn1:9071 to downgrade all features to the latest version
known by the tool. The user doesn't need to know a specific feature version.
(3) kafka-features.sh  --describe --bootstrap-server
kafka-broker0.prn1:9071 to find out the supported version for each feature.
This allows the user to upgrade each feature individually.

Most users will be doing (1) and occasionally (2), and won't need to know
the exact version of each feature.

16. For case C, what's the default version? Is it always the latest?

Thanks,

Jun


On Thu, Nov 18, 2021 at 8:15 AM David Arthur
 wrote:

> Colin, thanks for the detailed response. I understand what you're saying
> and I agree with your rationale.
>
> It seems like we could just initialize their cluster.metadata to 1 when the
> > software is upgraded to 3.2.
> >
>
> Concretely, this means the controller would see that there is no
> FeatureLevelRecord in the log, and so it would bootstrap one. Normally for
> cluster initialization, this would be read from meta.properties, but in the
> case of preview clusters we would need to special case it to initialize the
> version to 1.
>
> Once the new FeatureLevelRecord has been committed, any nodes (brokers or
> controllers) that are running the latest software will react to the new
> metadata.version. This means we will need to make this initial version of 1
> be backwards compatible to what we have in 3.0 and 3.1 (since some brokers
> will be on the new software and some on older/preview versions)
>
> I agree that auto-upgrading preview users from IBP 3.0 to metadata.version
> 1 (equivalent to IBP 3.2) is probably fine.
>
> Back to Jun's case B:
>
> b. We upgrade from an old version where no metadata.version has been
> > finalized. In this case, it makes sense to leave metadata.version
> disabled
> > since we don't know if all brokers have been upgraded.
>
>
> Instead of leaving it uninitialized, we would initialize it to 1 which
> would be backwards compatible to 3.0 and 3.1. This would eliminate the need
> to check a "final IBP" or deal with 3.2+ clusters without a
> metadata.version set. The downgrade path for 3.2 back to a preview release
> should be fine since we are saying that metadata.version 1 is compatible
> with those releases. The FeatureLevelRecord would exist, but it would be
> ignored (we might need to make sure this actually works in 3.0).
>
> For clarity, here are the three workflows of Jun's three cases:
>
> A (KRaft 3.2+ to KRaft 3.2+):
> * User initializes cluster with an explicit metadata.version X, or the tool
> selects the default
> * After upgrade, cluster stays at version X until operator upgrades to Y
>
> B (KRaft Preview to KRaft 3.2+):
> * User initializes cluster without metadata.version
> * After controller leader is upgraded, initializes metadata.version to 1
> * After the cluster is upgraded, an operator may further upgrade to version
> Y
>
> C (New KRaft 3.2+):
> * User initializes cluster with an explicit metadata.version X, or the tool
> selects the default
>
>
> This has been mentioned in this thread, but we should explicitly call out
> that the absence of metadata.version in meta.properties will be used
> to identify that we are in case B. After 3.2, we will require that
> metadata.version is present in meta.properties for new clusters. If users
> omit the --metadata-version flag from kafka-storage.sh, we should fill in a
> default.
>
> So how about the following changes/clarifications to the KIP:
>
> * Starting with 3.2, metadata.version is required in KRaft, IBP is ignored
> * If meta.properties does not include metadata.version, it indicates this
> cluster was initialized with a preview release
> * If upgrading from a preview release to 3.2+, the controller will
> initialize metadata.version=1
> * metadata.version=1 is backwards compatible with preview releases
>
> -David
>
>
> On Thu, Nov 18, 2021 at 10:02 AM David Arthur 
> wrote:
>
> > Jason,
> >
> > 1/2. You've got it right. The intention is that metadata.version will
> gate
> > both RPCs (like IBP does) as well as metadata records. So, when a broker
> > sees that metadata.version changed, it may start advertising new RPCs and
> > it will need to refresh the ApiVersions it has for other brokers. I can
> try
> > to make this more explicit in the KIP.
> >
> > 3. Yes, that's the intention of the --dry-run flag. The current
> > implementation in kafka-features.sh does a dry run on the client side,
> but
> > this KIP pushes the validation down to the controller. This will allow us
> > to have the context needed to do proper validation
> >
> > Re: version 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-18 Thread David Arthur
Colin, thanks for the detailed response. I understand what you're saying
and I agree with your rationale.

It seems like we could just initialize their cluster.metadata to 1 when the
> software is upgraded to 3.2.
>

Concretely, this means the controller would see that there is no
FeatureLevelRecord in the log, and so it would bootstrap one. Normally for
cluster initialization, this would be read from meta.properties, but in the
case of preview clusters we would need to special case it to initialize the
version to 1.
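
A minimal sketch of that decision, with hypothetical names (the real quorum
controller code will look different):

import java.util.Optional;

// Illustrative sketch of the special case described above: if the metadata
// log contains no FeatureLevelRecord, the active controller bootstraps one.
final class MetadataVersionBootstrap {
    static final short PREVIEW_COMPATIBLE_VERSION = 1;

    static short initialMetadataVersion(Optional<Short> versionFromLog,
                                        Optional<Short> versionFromMetaProperties) {
        if (versionFromLog.isPresent()) {
            // A FeatureLevelRecord was already committed; nothing to bootstrap.
            return versionFromLog.get();
        }
        // Newly formatted clusters record metadata.version at format time;
        // preview clusters (3.0/3.1) did not, so default them to version 1.
        return versionFromMetaProperties.orElse(PREVIEW_COMPATIBLE_VERSION);
    }
}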

Once the new FeatureLevelRecord has been committed, any nodes (brokers or
controllers) that are running the latest software will react to the new
metadata.version. This means we will need to make this initial version of 1
be backwards compatible to what we have in 3.0 and 3.1 (since some brokers
will be on the new software and some on older/preview versions)

I agree that auto-upgrading preview users from IBP 3.0 to metadata.version
1 (equivalent to IBP 3.2) is probably fine.

Back to Jun's case B:

b. We upgrade from an old version where no metadata.version has been
> finalized. In this case, it makes sense to leave metadata.version disabled
> since we don't know if all brokers have been upgraded.


Instead of leaving it uninitialized, we would initialize it to 1 which
would be backwards compatible to 3.0 and 3.1. This would eliminate the need
to check a "final IBP" or deal with 3.2+ clusters without a
metadata.version set. The downgrade path for 3.2 back to a preview release
should be fine since we are saying that metadata.version 1 is compatible
with those releases. The FeatureLevelRecord would exist, but it would be
ignored (we might need to make sure this actually works in 3.0).

For clarity, here are the three workflows of Jun's three cases:

A (KRaft 3.2+ to KRaft 3.2+):
* User initializes cluster with an explicit metadata.version X, or the tool
selects the default
* After upgrade, cluster stays at version X until operator upgrades to Y

B (KRaft Preview to KRaft 3.2+):
* User initializes cluster without metadata.version
* After controller leader is upgraded, initializes metadata.version to 1
* After the cluster is upgraded, an operator may further upgrade to version
Y

C (New KRaft 3.2+):
* User initializes cluster with an explicit metadata.version X, or the tool
selects the default


This has been mentioned in this thread, but we should explicitly call out
that the absence of metadata.version in meta.properties will be used
to identify that we are in case B. After 3.2, we will require that
metadata.version is present in meta.properties for new clusters. If users
omit the --metadata-version flag from kafka-storage.sh, we should fill in a
default.
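
To make the detection concrete, a minimal sketch assuming a metadata.version
key in meta.properties (the key name is part of this proposal, not existing
behavior):

import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Optional;
import java.util.Properties;

// Illustrative sketch: detect whether this node was formatted by a preview
// release (no metadata.version in meta.properties) or by 3.2+ (key present).
final class MetaPropertiesCheck {
    static Optional<Short> readMetadataVersion(File metaPropertiesFile) throws IOException {
        Properties props = new Properties();
        try (FileReader reader = new FileReader(metaPropertiesFile)) {
            props.load(reader);
        }
        String value = props.getProperty("metadata.version");
        return value == null ? Optional.empty() : Optional.of(Short.parseShort(value));
    }
}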

So how about the following changes/clarifications to the KIP:

* Starting with 3.2, metadata.version is required in KRaft, IBP is ignored
* If meta.properties does not include metadata.version, it indicates this
cluster was initialized with a preview release
* If upgrading from a preview release to 3.2+, the controller will
initialize metadata.version=1
* metadata.version=1 is backwards compatible with preview releases

-David


On Thu, Nov 18, 2021 at 10:02 AM David Arthur 
wrote:

> Jason,
>
> 1/2. You've got it right. The intention is that metadata.version will gate
> both RPCs (like IBP does) as well as metadata records. So, when a broker
> sees that metadata.version changed, it may start advertising new RPCs and
> it will need to refresh the ApiVersions it has for other brokers. I can try
> to make this more explicit in the KIP.
>
> 3. Yes, that's the intention of the --dry-run flag. The current
> implementation in kafka-features.sh does a dry run on the client side, but
> this KIP pushes the validation down to the controller. This will allow us
> to have the context needed to do proper validation
>
> Re: version number complexity -- yes this has come up in offline
> discussions. With just one feature flag to manage, it's not so bad, but
> explicitly managing even a few would be a burden. I like your suggestion of
> the "--release" flag. That could act as a sort of manifest of versions
> (i.e., the defaults from that release). We could also support something
> like "kafka-features upgrade --release latest" to bring everything to the
> highest supported version.
>
>
>
> On Wed, Nov 17, 2021 at 9:36 PM Jason Gustafson 
> wrote:
>
>> Hi Colin/David,
>>
>> > Like David said, basically it boils down to creating a feature flag for
>> the new proposed __consumer_offsets version, or using a new
>> IBP/metadata.version for it. Both approaches have pros and cons. Using an
>> IBP/metadata.version bump reduces the size of the testing matrix. But
>> using
>> a feature flag allows people to avoid any bugs or pain associated with the
>> change if they don't care about enabling it. This is basically the classic
>> "should I use a feature flag or not?" discussion and we need to have it on
>> a case-by-case basis.
>>
>> I think most users are not going 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-18 Thread David Arthur
Jason,

1/2. You've got it right. The intention is that metadata.version will gate
both RPCs (like IBP does) as well as metadata records. So, when a broker
sees that metadata.version changed, it may start advertising new RPCs and
it will need to refresh the ApiVersions it has for other brokers. I can try
to make this more explicit in the KIP.

3. Yes, that's the intention of the --dry-run flag. The current
implementation in kafka-features.sh does a dry run on the client side, but
this KIP pushes the validation down to the controller. This will allow us
to have the context needed to do proper validation.
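
A sketch of what that controller-side check might look like, using
hypothetical names and deliberately simplified rules (the actual validation
will consider which metadata records the target version can represent):

// Illustrative only: controller-side validation of a feature level change,
// with a dry-run option that reports the outcome without committing anything.
final class FeatureUpdateValidation {
    static String validate(short currentLevel, short requestedLevel,
                           short minSupported, short maxSupported,
                           boolean allowUnsafeDowngrade, boolean dryRun) {
        if (requestedLevel < minSupported || requestedLevel > maxSupported) {
            return "REJECTED: requested level is outside the supported range";
        }
        if (requestedLevel < currentLevel && !allowUnsafeDowngrade) {
            // A real implementation would also check whether the downgrade is
            // lossless, i.e. no committed metadata depends on a newer level.
            return "REJECTED: downgrade is not known to be safe";
        }
        return dryRun ? "OK (dry run, nothing committed)" : "OK (change applied)";
    }
}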

Re: version number complexity -- yes this has come up in offline
discussions. With just one feature flag to manage, it's not so bad, but
explicitly managing even a few would be a burden. I like your suggestion of
the "--release" flag. That could act as a sort of manifest of versions
(i.e., the defaults from that release). We could also support something
like "kafka-features upgrade --release latest" to bring everything to the
highest supported version.
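
A minimal sketch of that manifest, with made-up version numbers purely for
illustration:

import java.util.Map;

// Illustrative sketch of a static "release manifest": each release name maps
// to the default feature levels shipped with that release. The versions here
// are invented for the example only.
final class ReleaseManifest {
    static final Map<String, Map<String, Short>> DEFAULTS = Map.of(
        "3.2", Map.of("metadata.version", (short) 1),
        "latest", Map.of("metadata.version", (short) 2));

    static Map<String, Short> featureLevelsFor(String release) {
        Map<String, Short> levels = DEFAULTS.get(release);
        if (levels == null) {
            throw new IllegalArgumentException("Unknown release: " + release);
        }
        return levels;
    }
}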



On Wed, Nov 17, 2021 at 9:36 PM Jason Gustafson 
wrote:

> Hi Colin/David,
>
> > Like David said, basically it boils down to creating a feature flag for
> the new proposed __consumer_offsets version, or using a new
> IBP/metadata.version for it. Both approaches have pros and cons. Using an
> IBP/metadata.version bump reduces the size of the testing matrix. But using
> a feature flag allows people to avoid any bugs or pain associated with the
> change if they don't care about enabling it. This is basically the classic
> "should I use a feature flag or not?" discussion and we need to have it on
> a case-by-case basis.
>
> I think most users are not going to care to manage versions for a bunch of
> different features. The IBP today has many shortcomings, but at least it's
> tied to a version that users understand (i.e. the release version). How
> would users know after upgrading to Kafka 3.1, for example, that they need
> to upgrade the metadata.version to 3 and offsets.version to 4 (or
> whatever)? It's a lot of overhead trying to understand all of the potential
> features and what each upgrade actually means to them. I am wondering if we
> could give them something more convenient which is tied to the release
> version. For example, maybe we could use a command like `kafka-features
> upgrade --release 3.1`, which the broker would then translate to an upgrade
> to the latest versions of the individual features at the time of the 3.1
> release. Basically it's just a static map from release version to feature
> versions to make the upgrade process more convenient.
>
> Thanks,
> Jason
>
>
>
>
> On Wed, Nov 17, 2021 at 6:20 PM Jason Gustafson 
> wrote:
>
> > A few additional questions:
> >
> > 1. Currently the IBP tells us what version of individual inter-broker
> RPCs
> > will be used. I think the plan in this KIP is to use ApiVersions request
> > instead to find the highest compatible version (just like clients). Do I
> > have that right?
> >
> > 2. The following wasn't very clear to me:
> >
> > > Brokers will be able to observe changes to metadata.version by
> observing
> > the metadata log, and could then submit a new ApiVersionsRequest to the
> > other Kafka nodes.
> >
> > Is the purpose of submitting new ApiVersions requests to update the
> > features or the RPC versions? Does metadata.version also influence the
> > versions that a broker advertises? It would help to have more detail
> about
> > this.
> >
> > 3. I imagine users will want to know before performing an upgrade whether
> > downgrading will be safe. Would the --dry-run flag tell them this?
> >
> > Thanks,
> > Jason
> >
> >
> >
> >
> >
> > On Wed, Nov 17, 2021 at 3:55 PM Colin McCabe  wrote:
> >
> >> On Wed, Nov 17, 2021, at 11:28, Jason Gustafson wrote:
> >> > Hi David,
> >> >
> >> > Forgive me if this ground has been covered already. Today, we have a
> few
> >> > other things that we have latched onto the IBP, such as upgrades to
> the
> >> > format of records in __consumer_offsets. I've been assuming that
> >> > metadata.version is not covering this. Is that right or is there some
> >> other
> >> > plan to take care of cases like this?
> >> >
> >>
> >> I think metadata.version could cover changes to things like
> >> __consumer_offsets, if people want it to. Or to put it another way,
> that is
> >> out of scope for this KIP.
> >>
> >> Like David said, basically it boils down to creating a feature flag for
> >> the new proposed __consumer_offsets version, or using a new
> >> IBP/metadata.version for it. Both approaches have pros and cons. Using
> an
> >> IBP/metadata.version bump reduces the size of the testing matrix. But
> using
> >> a feature flag allows people to avoid any bugs or pain associated with
> the
> >> change if they don't care about enabling it. This is basically the
> classic
> >> "should I use a feature flag or not?" discussion and we need to have it
> on
> >> a case-by-case 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-17 Thread Jason Gustafson
Hi Colin/David,

> Like David said, basically it boils down to creating a feature flag for
the new proposed __consumer_offsets version, or using a new
IBP/metadata.version for it. Both approaches have pros and cons. Using an
IBP/metadata.version bump reduces the size of the testing matrix. But using
a feature flag allows people to avoid any bugs or pain associated with the
change if they don't care about enabling it. This is basically the classic
"should I use a feature flag or not?" discussion and we need to have it on
a case-by-case basis.

I think most users are not going to care to manage versions for a bunch of
different features. The IBP today has many shortcomings, but at least it's
tied to a version that users understand (i.e. the release version). How
would users know after upgrading to Kafka 3.1, for example, that they need
to upgrade the metadata.version to 3 and offsets.version to 4 (or
whatever)? It's a lot of overhead trying to understand all of the potential
features and what each upgrade actually means to them. I am wondering if we
could give them something more convenient which is tied to the release
version. For example, maybe we could use a command like `kafka-features
upgrade --release 3.1`, which the broker would then translate to an upgrade
to the latest versions of the individual features at the time of the 3.1
release. Basically it's just a static map from release version to feature
versions to make the upgrade process more convenient.
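
To sketch the idea in code (the class name, feature names, and version numbers
below are made up purely to illustrate, not a proposal for actual values):

    import java.util.Map;

    // Hypothetical sketch: a static table from a release version to the default
    // finalized feature levels at the time of that release. A command like
    // "kafka-features upgrade --release 3.1" would just look up this table and
    // apply the resulting feature levels in one request.
    public final class ReleaseDefaults {
        private static final Map<String, Map<String, Short>> DEFAULTS = Map.of(
            "3.0", Map.of("metadata.version", (short) 1),
            "3.1", Map.of("metadata.version", (short) 3, "offsets.version", (short) 4)
        );

        public static Map<String, Short> featureLevelsFor(String release) {
            Map<String, Short> levels = DEFAULTS.get(release);
            if (levels == null) {
                throw new IllegalArgumentException("Unknown release: " + release);
            }
            return levels;
        }
    }

The table would only ever be appended to as new releases ship, so it stays cheap
to maintain.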

Thanks,
Jason




On Wed, Nov 17, 2021 at 6:20 PM Jason Gustafson  wrote:

> A few additional questions:
>
> 1. Currently the IBP tells us what version of individual inter-broker RPCs
> will be used. I think the plan in this KIP is to use ApiVersions request
> instead to find the highest compatible version (just like clients). Do I
> have that right?
>
> 2. The following wasn't very clear to me:
>
> > Brokers will be able to observe changes to metadata.version by observing
> the metadata log, and could then submit a new ApiVersionsRequest to the
> other Kafka nodes.
>
> Is the purpose of submitting new ApiVersions requests to update the
> features or the RPC versions? Does metadata.version also influence the
> versions that a broker advertises? It would help to have more detail about
> this.
>
> 3. I imagine users will want to know before performing an upgrade whether
> downgrading will be safe. Would the --dry-run flag tell them this?
>
> Thanks,
> Jason
>
>
>
>
>
> On Wed, Nov 17, 2021 at 3:55 PM Colin McCabe  wrote:
>
>> On Wed, Nov 17, 2021, at 11:28, Jason Gustafson wrote:
>> > Hi David,
>> >
>> > Forgive me if this ground has been covered already. Today, we have a few
>> > other things that we have latched onto the IBP, such as upgrades to the
>> > format of records in __consumer_offsets. I've been assuming that
>> > metadata.version is not covering this. Is that right or is there some
>> other
>> > plan to take care of cases like this?
>> >
>>
>> I think metadata.version could cover changes to things like
>> __consumer_offsets, if people want it to. Or to put it another way, that is
>> out of scope for this KIP.
>>
>> Like David said, basically it boils down to creating a feature flag for
>> the new proposed __consumer_offsets version, or using a new
>> IBP/metadata.version for it. Both approaches have pros and cons. Using an
>> IBP/metadata.version bump reduces the size of the testing matrix. But using
>> a feature flag allows people to avoid any bugs or pain associated with the
>> change if they don't care about enabling it. This is basically the classic
>> "should I use a feature flag or not?" discussion and we need to have it on
>> a case-by-case basis.
>>
>> I think it's worth calling out that having a 1:1 mapping between IBP
>> versions and metadata.versions will result in some metadata.versions that
>> "don't do anything" (aka they do the same thing as the previous
>> metadata.version). For example, if we change StopReplicaRequest again, that
>> will not affect KRaft mode, but probably would require an IBP bump and
>> hence a metadata.version bump. I think that's OK. It's not that different
>> from updating your IBP and getting support for ZStandard, when your
>> deployment doesn't use ZStandard compression.
>>
>> best,
>> Colin
>>
>> > Thanks,
>> > Jason
>> >
>> >
>> >
>> > On Wed, Nov 17, 2021 at 10:17 AM Jun Rao 
>> wrote:
>> >
>> >> Hi, Colin,
>> >>
>> >> Thanks for the reply.
>> >>
>> >> For case b, I am not sure that I understand your suggestion. Does "each
>> >> subsequent level for metadata.version corresponds to an IBP version"
>> mean
>> >> that we need to keep IBP forever? Could you describe the upgrade
>> process in
>> >> this case?
>> >>
>> >> Jun
>> >>
>> >> On Tue, Nov 16, 2021 at 3:45 PM Colin McCabe 
>> wrote:
>> >>
>> >> > On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote:
>> >> > > Hi, David, Colin,
>> >> > >
>> >> > > Thanks for the reply.
>> >> > >
>> >> > > 16. Discussed with David offline a bit. We have 3 cases.
>> 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-17 Thread Jason Gustafson
A few additional questions:

1. Currently the IBP tells us what version of individual inter-broker RPCs
will be used. I think the plan in this KIP is to use ApiVersions request
instead to find the highest compatible version (just like clients). Do I
have that right?

2. The following wasn't very clear to me:

> Brokers will be able to observe changes to metadata.version by observing
the metadata log, and could then submit a new ApiVersionsRequest to the
other Kafka nodes.

Is the purpose of submitting new ApiVersions requests to update the
features or the RPC versions? Does metadata.version also influence the
versions that a broker advertises? It would help to have more detail about
this.

3. I imagine users will want to know before performing an upgrade whether
downgrading will be safe. Would the --dry-run flag tell them this?

Thanks,
Jason





On Wed, Nov 17, 2021 at 3:55 PM Colin McCabe  wrote:

> On Wed, Nov 17, 2021, at 11:28, Jason Gustafson wrote:
> > Hi David,
> >
> > Forgive me if this ground has been covered already. Today, we have a few
> > other things that we have latched onto the IBP, such as upgrades to the
> > format of records in __consumer_offsets. I've been assuming that
> > metadata.version is not covering this. Is that right or is there some
> other
> > plan to take care of cases like this?
> >
>
> I think metadata.version could cover changes to things like
> __consumer_offsets, if people want it to. Or to put it another way, that is
> out of scope for this KIP.
>
> Like David said, basically it boils down to creating a feature flag for
> the new proposed __consumer_offsets version, or using a new
> IBP/metadata.version for it. Both approaches have pros and cons. Using an
> IBP/metadata.version bump reduces the size of the testing matrix. But using
> a feature flag allows people to avoid any bugs or pain associated with the
> change if they don't care about enabling it. This is basically the classic
> "should I use a feature flag or not?" discussion and we need to have it on
> a case-by-case basis.
>
> I think it's worth calling out that having a 1:1 mapping between IBP
> versions and metadata.versions will result in some metadata.versions that
> "don't do anything" (aka they do the same thing as the previous
> metadata.version). For example, if we change StopReplicaRequest again, that
> will not affect KRaft mode, but probably would require an IBP bump and
> hence a metadata.version bump. I think that's OK. It's not that different
> from updating your IBP and getting support for ZStandard, when your
> deployment doesn't use ZStandard compression.
>
> best,
> Colin
>
> > Thanks,
> > Jason
> >
> >
> >
> > On Wed, Nov 17, 2021 at 10:17 AM Jun Rao 
> wrote:
> >
> >> Hi, Colin,
> >>
> >> Thanks for the reply.
> >>
> >> For case b, I am not sure that I understand your suggestion. Does "each
> >> subsequent level for metadata.version corresponds to an IBP version"
> mean
> >> that we need to keep IBP forever? Could you describe the upgrade
> process in
> >> this case?
> >>
> >> Jun
> >>
> >> On Tue, Nov 16, 2021 at 3:45 PM Colin McCabe 
> wrote:
> >>
> >> > On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote:
> >> > > Hi, David, Colin,
> >> > >
> >> > > Thanks for the reply.
> >> > >
> >> > > 16. Discussed with David offline a bit. We have 3 cases.
> >> > > a. We upgrade from an old version where the metadata.version has
> >> already
> >> > > been finalized. In this case it makes sense to stay with that
> feature
> >> > > version after the upgrade.
> >> >
> >> > +1
> >> >
> >> > > b. We upgrade from an old version where no metadata.version has been
> >> > > finalized. In this case, it makes sense to leave metadata.version
> >> > disabled
> >> > > since we don't know if all brokers have been upgraded.
> >> >
> >> > This is the scenario I was hoping to avoid by saying that ALL KRaft
> >> > clusters have metadata.version of at least 1, and each subsequent
> level
> >> for
> >> > metadata.version corresponds to an IBP version. The existing KRaft
> >> clusters
> >> > in 3.0 and earlier are preview (not for production) so I think this
> >> change
> >> > is OK for 3.x (given that it affects only KRaft). Then IBP is
> irrelevant
> >> > for KRaft clusters (the config is ignored, possibly with a WARN or
> ERROR
> >> > message generated if it is set).
> >> >
> >> > > c. We are starting from a brand new cluster and of course no
> >> > > metadata.version has been finalized. In this case, the KIP says it
> will
> >> > > pick the metadata.version in meta.properties. In the common case,
> >> people
> >> > > probably won't set the metadata.version in the meta.properties file
> >> > > explicitly. So, it will be useful to put a default (stable) version
> >> there
> >> > > when generating the meta.properties file.

> >> >
> >> > Hmm. I was assuming that clusters where the admin didn't specify any
> >> > metadata.version during formatting would get the latest
> metadata.version.
> >> > Partly, because this is what we do for IBP 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-17 Thread Colin McCabe
On Wed, Nov 17, 2021, at 11:28, Jason Gustafson wrote:
> Hi David,
>
> Forgive me if this ground has been covered already. Today, we have a few
> other things that we have latched onto the IBP, such as upgrades to the
> format of records in __consumer_offsets. I've been assuming that
> metadata.version is not covering this. Is that right or is there some other
> plan to take care of cases like this?
>

I think metadata.version could cover changes to things like __consumer_offsets, 
if people want it to. Or to put it another way, that is out of scope for this 
KIP.

Like David said, basically it boils down to creating a feature flag for the new 
proposed __consumer_offsets version, or using a new IBP/metadata.version for 
it. Both approaches have pros and cons. Using an IBP/metadata.version bump 
reduces the size of the testing matrix. But using a feature flag allows people 
to avoid any bugs or pain associated with the change if they don't care about 
enabling it. This is basically the classic "should I use a feature flag or 
not?" discussion and we need to have it on a case-by-case basis.

I think it's worth calling out that having a 1:1 mapping between IBP versions 
and metadata.versions will result in some metadata.versions that "don't do 
anything" (aka they do the same thing as the previous metadata.version). For 
example, if we change StopReplicaRequest again, that will not affect KRaft 
mode, but probably would require an IBP bump and hence a metadata.version bump. 
I think that's OK. It's not that different from updating your IBP and getting 
support for ZStandard, when your deployment doesn't use ZStandard compression.

best,
Colin

> Thanks,
> Jason
>
>
>
> On Wed, Nov 17, 2021 at 10:17 AM Jun Rao  wrote:
>
>> Hi, Colin,
>>
>> Thanks for the reply.
>>
>> For case b, I am not sure that I understand your suggestion. Does "each
>> subsequent level for metadata.version corresponds to an IBP version" mean
>> that we need to keep IBP forever? Could you describe the upgrade process in
>> this case?
>>
>> Jun
>>
>> On Tue, Nov 16, 2021 at 3:45 PM Colin McCabe  wrote:
>>
>> > On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote:
>> > > Hi, David, Colin,
>> > >
>> > > Thanks for the reply.
>> > >
>> > > 16. Discussed with David offline a bit. We have 3 cases.
>> > > a. We upgrade from an old version where the metadata.version has
>> already
>> > > been finalized. In this case it makes sense to stay with that feature
>> > > version after the upgrade.
>> >
>> > +1
>> >
>> > > b. We upgrade from an old version where no metadata.version has been
>> > > finalized. In this case, it makes sense to leave metadata.version
>> > disabled
>> > > since we don't know if all brokers have been upgraded.
>> >
>> > This is the scenario I was hoping to avoid by saying that ALL KRaft
>> > clusters have metadata.version of at least 1, and each subsequent level
>> for
>> > metadata.version corresponds to an IBP version. The existing KRaft
>> clusters
>> > in 3.0 and earlier are preview (not for production) so I think this
>> change
>> > is OK for 3.x (given that it affects only KRaft). Then IBP is irrelevant
>> > for KRaft clusters (the config is ignored, possibly with a WARN or ERROR
>> > message generated if it is set).
>> >
>> > > c. We are starting from a brand new cluster and of course no
>> > > metadata.version has been finalized. In this case, the KIP says it will
>> > > pick the metadata.version in meta.properties. In the common case,
>> people
>> > > probably won't set the metadata.version in the meta.properties file
>> > > explicitly. So, it will be useful to put a default (stable) version
>> there
>> > > > when generating the meta.properties file.
>> >
>> > Hmm. I was assuming that clusters where the admin didn't specify any
>> > metadata.version during formatting would get the latest metadata.version.
>> > Partly, because this is what we do for IBP today. It would be good to
>> > clarify this...
>> >
>> > >
>> > > Also, it would be useful to clarify that if a FeatureLevelRecord exists
>> > for
>> > > metadata.version, the metadata.version in meta.properties will be
>> > ignored.
>> > >
>> >
>> > Yeah, I agree.
>> >
>> > best,
>> > Colin
>> >
>> > > Thanks,
>> > >
>> > > Jun
>> > >
>> > >
>> > > On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe 
>> > wrote:
>> > >
>> > >> On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote:
>> > >> > Hi, David,
>> > >> >
>> > >> > Thanks for the reply.
>> > >> >
>> > >> > 16. My first concern is that the KIP picks up meta.version
>> > inconsistently
>> > >> > during the deployment. If a new cluster is started, we pick up the
>> > >> highest
>> > >> > version. If we upgrade, we leave the feature version unchanged.
>> > >>
>> > >> Hi Jun,
>> > >>
>> > >> Thanks again for taking a look.
>> > >>
>> > >> The proposed behavior in KIP-778 is consistent with how it works
>> today.
>> > >> Upgrading the software is distinct from upgrading the IBP.
>> > >>
>> > >> I think it is important to keep these two operations ("upgrading

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-17 Thread Colin McCabe
On Wed, Nov 17, 2021, at 12:28, David Arthur wrote:
> Colin,
>
> I wasn't intending to suggest that IBP and metadata.version co-exist long
> term in KRaft. I was thinking we would have one final IBP that would signal
> KRaft clusters to start looking at metadata.version instead. This
> was working under the assumption that we should be able to upgrade 3.0
> clusters to a version with metadata.version. I don't think it would be too
> difficult since we could treat the absence of a feature flag as an implicit
> version 0 which would be compatible with 3.0.
>

Hi David,

Hmm... do we really have to have "one final IBP" for upgraded 3.0 / 3.1 KRaft 
clusters? It seems like we could just initialize their metadata.version to 1 
when the software is upgraded to 3.2.

Admittedly this is a bit of a hack -- in general, we don't want to couple 
software upgrades and metadata version bumps. But 3.0 and 3.1 were "preview," 
not intended for production, and this is a pretty minor speed bump for someone 
testing out pre-production software.

This would greatly simplify the system by getting rid of IBP completely in 
KRaft clusters, by substituting it with metadata.version.

>
> Assuming this lands in 3.2, we would have
>
> Software    IBP    metadata.version    effective version
> 3.0         3.0    -                   0
> 3.2         3.0    -                   0
> 3.2         3.2    -                   0
> 3.2         3.2    1                   1
>
> where metadata.version=0 is compatible with what we have today in KRaft.
>
> If we don't support 3.0 -> 3.x KRaft upgrades, then early adopters may be
> rather inconvenienced :) I can't say for sure that the extra complexity is
> worth the convenience of allowing upgrades from the preview versions.
>

So let's drill down to what the inconvenience would be. It would basically be 
an unexpected (if they didn't read the upgrade docs) IBP bump to whatever the 
IBP is in 3.2. Assuming that they started on KAFKA_3_0_IV0 and got 
auto-upgraded to the latest, after the upgrade they would have these additional 
features:

> 1. Introduce ListOffsets V7 (KIP-724)
> 2. Add topic IDs to Fetch requests/responses (KIP-516)

That seems pretty minor (again, noting that this ONLY applies to KRaft 3.0 
Preview users, NOT to ZK users). I don't think it's worth keeping IBP around 
just to prevent this inconvenience.

We have a limited amount of time and effort available to manage versioning 
issues. I don't think it's realistic to have to think about the cross-product 
of all possible metadata.version and IBP values when making a change.

We also have to expect new features to keep going in while the transition is 
taking place. That's why I think having a 1:1 mapping between new IBP versions 
and metadata.versions is helpful.

best,
Colin

>
> We probably need to make a decision on this since it impacts a few things
> in the KIP. What do folks think?
>
> -David
>
>
> On Wed, Nov 17, 2021 at 2:28 PM Jason Gustafson 
> wrote:
>
>> Hi David,
>>
>> Forgive me if this ground has been covered already. Today, we have a few
>> other things that we have latched onto the IBP, such as upgrades to the
>> format of records in __consumer_offsets. I've been assuming that
>> metadata.version is not covering this. Is that right or is there some other
>> plan to take care of cases like this?
>>
>> Thanks,
>> Jason
>>
>>
>>
>> On Wed, Nov 17, 2021 at 10:17 AM Jun Rao  wrote:
>>
>> > Hi, Colin,
>> >
>> > Thanks for the reply.
>> >
>> > For case b, I am not sure that I understand your suggestion. Does "each
>> > subsequent level for metadata.version corresponds to an IBP version" mean
>> > that we need to keep IBP forever? Could you describe the upgrade process
>> in
>> > this case?
>> >
>> > Jun
>> >
>> > On Tue, Nov 16, 2021 at 3:45 PM Colin McCabe  wrote:
>> >
>> > > On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote:
>> > > > Hi, David, Colin,
>> > > >
>> > > > Thanks for the reply.
>> > > >
>> > > > 16. Discussed with David offline a bit. We have 3 cases.
>> > > > a. We upgrade from an old version where the metadata.version has
>> > already
>> > > > been finalized. In this case it makes sense to stay with that feature
>> > > > version after the upgrade.
>> > >
>> > > +1
>> > >
>> > > > b. We upgrade from an old version where no metadata.version has been
>> > > > finalized. In this case, it makes sense to leave metadata.version
>> > > disabled
>> > > > since we don't know if all brokers have been upgraded.
>> > >
>> > > This is the scenario I was hoping to avoid by saying that ALL KRaft
>> > > clusters have metadata.version of at least 1, and each subsequent level
>> > for
>> > > metadata.version corresponds to an IBP version. The existing KRaft
>> > clusters
>> > > in 3.0 and earlier are preview (not for production) so I think this
>> > change
>> > > is OK for 3.x (given that it affects only KRaft). Then IBP is
>> irrelevant
>> > > for KRaft clusters (the config is ignored, possibly with a WARN or
>> ERROR
>> > > 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-17 Thread David Arthur
Jason, thanks for the review! No, I don't think that has been covered
explicitly and no metadata.version will not gate things like our offsets
format. The intention is that we will introduce new KIP-584 feature flags
for those formats once the need arises. For ZK clusters, we'll probably
keep using the IBP mechanism and only worry about introducing the feature
flag for KRaft clusters. I'll add a short section to the KIP with this
clarification.

Thanks!
-David

On Wed, Nov 17, 2021 at 3:28 PM David Arthur 
wrote:

> Colin,
>
> I wasn't intending to suggest that IBP and metadata.version co-exist long
> term in KRaft. I was thinking we would have one final IBP that would signal
> KRaft clusters to start looking at metadata.version instead. This
> was working under the assumption that we should be able to upgrade 3.0
> clusters to a version with metadata.version. I don't think it would be too
> difficult since we could treat the absence of a feature flag as an implicit
> version 0 which would be compatible with 3.0.
>
> Assuming this lands in 3.2, we would have
>
> Software    IBP    metadata.version    effective version
> 3.0         3.0    -                   0
> 3.2         3.0    -                   0
> 3.2         3.2    -                   0
> 3.2         3.2    1                   1
>
> where metadata.version=0 is compatible with what we have today in KRaft.
>
> If we don't support 3.0 -> 3.x KRaft upgrades, then early adopters may be
> rather inconvenienced :) I can't say for sure that the extra complexity is
> worth the convenience of allowing upgrades from the preview versions.
>
> We probably need to make a decision on this since it impacts a few things
> in the KIP. What do folks think?
>
> -David
>
>
> On Wed, Nov 17, 2021 at 2:28 PM Jason Gustafson 
> wrote:
>
>> Hi David,
>>
>> Forgive me if this ground has been covered already. Today, we have a few
>> other things that we have latched onto the IBP, such as upgrades to the
>> format of records in __consumer_offsets. I've been assuming that
>> metadata.version is not covering this. Is that right or is there some
>> other
>> plan to take care of cases like this?
>>
>> Thanks,
>> Jason
>>
>>
>>
>> On Wed, Nov 17, 2021 at 10:17 AM Jun Rao 
>> wrote:
>>
>> > Hi, Colin,
>> >
>> > Thanks for the reply.
>> >
>> > For case b, I am not sure that I understand your suggestion. Does "each
>> > subsequent level for metadata.version corresponds to an IBP version"
>> mean
>> > that we need to keep IBP forever? Could you describe the upgrade
>> process in
>> > this case?
>> >
>> > Jun
>> >
>> > On Tue, Nov 16, 2021 at 3:45 PM Colin McCabe 
>> wrote:
>> >
>> > > On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote:
>> > > > Hi, David, Colin,
>> > > >
>> > > > Thanks for the reply.
>> > > >
>> > > > 16. Discussed with David offline a bit. We have 3 cases.
>> > > > a. We upgrade from an old version where the metadata.version has
>> > already
>> > > > been finalized. In this case it makes sense to stay with that
>> feature
>> > > > version after the upgrade.
>> > >
>> > > +1
>> > >
>> > > > b. We upgrade from an old version where no metadata.version has been
>> > > > finalized. In this case, it makes sense to leave metadata.version
>> > > disabled
>> > > > since we don't know if all brokers have been upgraded.
>> > >
>> > > This is the scenario I was hoping to avoid by saying that ALL KRaft
>> > > clusters have metadata.version of at least 1, and each subsequent
>> level
>> > for
>> > > metadata.version corresponds to an IBP version. The existing KRaft
>> > clusters
>> > > in 3.0 and earlier are preview (not for production) so I think this
>> > change
>> > > is OK for 3.x (given that it affects only KRaft). Then IBP is
>> irrelevant
>> > > for KRaft clusters (the config is ignored, possibly with a WARN or
>> ERROR
>> > > message generated if it is set).
>> > >
>> > > > c. We are starting from a brand new cluster and of course no
>> > > > metadata.version has been finalized. In this case, the KIP says it
>> will
>> > > > pick the metadata.version in meta.properties. In the common case,
>> > people
>> > > > probably won't set the metadata.version in the meta.properties file
>> > > > explicitly. So, it will be useful to put a default (stable) version
>> > there
>> > > > when generating the meta.properties file.
>> > >
>> > > Hmm. I was assuming that clusters where the admin didn't specify any
>> > > metadata.version during formatting would get the latest
>> metadata.version.
>> > > Partly, because this is what we do for IBP today. It would be good to
>> > > clarify this...
>> > >
>> > > >
>> > > > Also, it would be useful to clarify that if a FeatureLevelRecord
>> exists
>> > > for
>> > > > metadata.version, the metadata.version in meta.properties will be
>> > > ignored.
>> > > >
>> > >
>> > > Yeah, I agree.
>> > >
>> > > best,
>> > > Colin
>> > >
>> > > > Thanks,
>> > > >
>> > > > Jun
>> > > >
>> > > >
>> > > > On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe 
>> > > wrote:
>> > > >
>> > > 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-17 Thread David Arthur
Colin,

I wasn't intending to suggest that IBP and metadata.version co-exist long
term in KRaft. I was thinking we would have one final IBP that would signal
KRaft clusters to start looking at metadata.version instead. This
was working under the assumption that we should be able to upgrade 3.0
clusters to a version with metadata.version. I don't think it would be too
difficult since we could treat the absence of a feature flag as an implicit
version 0 which would be compatible with 3.0.

Assuming this lands in 3.2, we would have

Software    IBP    metadata.version    effective version
3.0         3.0    -                   0
3.2         3.0    -                   0
3.2         3.2    -                   0
3.2         3.2    1                   1

where metadata.version=0 is compatible with what we have today in KRaft.
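
As a rough sketch of the implicit version 0 idea (not actual KIP code, just to
show how little logic it takes):

    import java.util.Optional;

    // Sketch: if no FeatureLevelRecord for metadata.version has ever been
    // committed to the metadata log, treat the cluster as being at an implicit
    // level 0, which behaves the same as today's KRaft clusters.
    public final class EffectiveMetadataVersion {
        public static short effectiveLevel(Optional<Short> finalizedMetadataVersion) {
            // Absence of the feature flag == implicit version 0.
            return finalizedMetadataVersion.orElse((short) 0);
        }
    }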

If we don't support 3.0 -> 3.x KRaft upgrades, then early adopters may be
rather inconvenienced :) I can't say for sure that the extra complexity is
worth the convenience of allowing upgrades from the preview versions.

We probably need to make a decision on this since it impacts a few things
in the KIP. What do folks think?

-David


On Wed, Nov 17, 2021 at 2:28 PM Jason Gustafson 
wrote:

> Hi David,
>
> Forgive me if this ground has been covered already. Today, we have a few
> other things that we have latched onto the IBP, such as upgrades to the
> format of records in __consumer_offsets. I've been assuming that
> metadata.version is not covering this. Is that right or is there some other
> plan to take care of cases like this?
>
> Thanks,
> Jason
>
>
>
> On Wed, Nov 17, 2021 at 10:17 AM Jun Rao  wrote:
>
> > Hi, Colin,
> >
> > Thanks for the reply.
> >
> > For case b, I am not sure that I understand your suggestion. Does "each
> > subsequent level for metadata.version corresponds to an IBP version" mean
> > that we need to keep IBP forever? Could you describe the upgrade process
> in
> > this case?
> >
> > Jun
> >
> > On Tue, Nov 16, 2021 at 3:45 PM Colin McCabe  wrote:
> >
> > > On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote:
> > > > Hi, David, Colin,
> > > >
> > > > Thanks for the reply.
> > > >
> > > > 16. Discussed with David offline a bit. We have 3 cases.
> > > > a. We upgrade from an old version where the metadata.version has
> > already
> > > > been finalized. In this case it makes sense to stay with that feature
> > > > version after the upgrade.
> > >
> > > +1
> > >
> > > > b. We upgrade from an old version where no metadata.version has been
> > > > finalized. In this case, it makes sense to leave metadata.version
> > > disabled
> > > > since we don't know if all brokers have been upgraded.
> > >
> > > This is the scenario I was hoping to avoid by saying that ALL KRaft
> > > clusters have metadata.version of at least 1, and each subsequent level
> > for
> > > metadata.version corresponds to an IBP version. The existing KRaft
> > clusters
> > > in 3.0 and earlier are preview (not for production) so I think this
> > change
> > > is OK for 3.x (given that it affects only KRaft). Then IBP is
> irrelevant
> > > for KRaft clusters (the config is ignored, possibly with a WARN or
> ERROR
> > > message generated if it is set).
> > >
> > > > c. We are starting from a brand new cluster and of course no
> > > > metadata.version has been finalized. In this case, the KIP says it
> will
> > > > pick the metadata.version in meta.properties. In the common case,
> > people
> > > > probably won't set the metadata.version in the meta.properties file
> > > > explicitly. So, it will be useful to put a default (stable) version
> > there
> > > > when generating the meta.properties file.
> > >
> > > Hmm. I was assuming that clusters where the admin didn't specify any
> > > metadata.version during formatting would get the latest
> metadata.version.
> > > Partly, because this is what we do for IBP today. It would be good to
> > > clarify this...
> > >
> > > >
> > > > Also, it would be useful to clarify that if a FeatureLevelRecord
> exists
> > > for
> > > > metadata.version, the metadata.version in meta.properties will be
> > > ignored.
> > > >
> > >
> > > Yeah, I agree.
> > >
> > > best,
> > > Colin
> > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe 
> > > wrote:
> > > >
> > > >> On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote:
> > > >> > Hi, David,
> > > >> >
> > > >> > Thanks for the reply.
> > > >> >
> > > >> > 16. My first concern is that the KIP picks up meta.version
> > > inconsistently
> > > >> > during the deployment. If a new cluster is started, we pick up the
> > > >> highest
> > > >> > version. If we upgrade, we leave the feature version unchanged.
> > > >>
> > > >> Hi Jun,
> > > >>
> > > >> Thanks again for taking a look.
> > > >>
> > > >> The proposed behavior in KIP-778 is consistent with how it works
> > today.
> > > >> Upgrading the software is distinct from upgrading the IBP.
> > > >>
> > > >> I think it is important to keep these two 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-17 Thread Jason Gustafson
Hi David,

Forgive me if this ground has been covered already. Today, we have a few
other things that we have latched onto the IBP, such as upgrades to the
format of records in __consumer_offsets. I've been assuming that
metadata.version is not covering this. Is that right or is there some other
plan to take care of cases like this?

Thanks,
Jason



On Wed, Nov 17, 2021 at 10:17 AM Jun Rao  wrote:

> Hi, Colin,
>
> Thanks for the reply.
>
> For case b, I am not sure that I understand your suggestion. Does "each
> subsequent level for metadata.version corresponds to an IBP version" mean
> that we need to keep IBP forever? Could you describe the upgrade process in
> this case?
>
> Jun
>
> On Tue, Nov 16, 2021 at 3:45 PM Colin McCabe  wrote:
>
> > On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote:
> > > Hi, David, Colin,
> > >
> > > Thanks for the reply.
> > >
> > > 16. Discussed with David offline a bit. We have 3 cases.
> > > a. We upgrade from an old version where the metadata.version has
> already
> > > been finalized. In this case it makes sense to stay with that feature
> > > version after the upgrade.
> >
> > +1
> >
> > > b. We upgrade from an old version where no metadata.version has been
> > > finalized. In this case, it makes sense to leave metadata.version
> > disabled
> > > since we don't know if all brokers have been upgraded.
> >
> > This is the scenario I was hoping to avoid by saying that ALL KRaft
> > clusters have metadata.version of at least 1, and each subsequent level
> for
> > metadata.version corresponds to an IBP version. The existing KRaft
> clusters
> > in 3.0 and earlier are preview (not for production) so I think this
> change
> > is OK for 3.x (given that it affects only KRaft). Then IBP is irrelevant
> > for KRaft clusters (the config is ignored, possibly with a WARN or ERROR
> > message generated if it is set).
> >
> > > c. We are starting from a brand new cluster and of course no
> > > metadata.version has been finalized. In this case, the KIP says it will
> > > pick the metadata.version in meta.properties. In the common case,
> people
> > > probably won't set the metadata.version in the meta.properties file
> > > explicitly. So, it will be useful to put a default (stable) version
> there
> > > when generating the meta.properties file.
> >
> > Hmm. I was assuming that clusters where the admin didn't specify any
> > metadata.version during formatting would get the latest metadata.version.
> > Partly, because this is what we do for IBP today. It would be good to
> > clarify this...
> >
> > >
> > > Also, it would be useful to clarify that if a FeatureLevelRecord exists
> > for
> > > metadata.version, the metadata.version in meta.properties will be
> > ignored.
> > >
> >
> > Yeah, I agree.
> >
> > best,
> > Colin
> >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > > On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe 
> > wrote:
> > >
> > >> On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote:
> > >> > Hi, David,
> > >> >
> > >> > Thanks for the reply.
> > >> >
> > >> > 16. My first concern is that the KIP picks up meta.version
> > inconsistently
> > >> > during the deployment. If a new cluster is started, we pick up the
> > >> highest
> > >> > version. If we upgrade, we leave the feature version unchanged.
> > >>
> > >> Hi Jun,
> > >>
> > >> Thanks again for taking a look.
> > >>
> > >> The proposed behavior in KIP-778 is consistent with how it works
> today.
> > >> Upgrading the software is distinct from upgrading the IBP.
> > >>
> > >> I think it is important to keep these two operations ("upgrading
> > >> IBP/metadata version" and "upgrading software version") separate. If
> > they
> > >> are coupled it will create a situation where software upgrades are
> > >> difficult and dangerous.
> > >>
> > >> Consider a situation where you find some bug in your current software,
> > and
> > >> you want to upgrade to new software that fixes the bug. If upgrades
> and
> > IBP
> > >> bumps are coupled, you can't do this without also bumping the IBP,
> > which is
> > >> usually considered a high-risk change. That means that either you have
> > to
> > >> make a special build that includes only the fix (time-consuming and
> > >> error-prone), live with the bug for longer, or be very conservative
> > about
> > >> ever introducing new IBP/metadata versions. None of those are really
> > good
> > >> choices.
> > >>
> > >> > Intuitively, it seems that independent of how a cluster is deployed,
> > we
> > >> > should always pick the same feature version.
> > >>
> > >> I think it makes sense to draw a distinction between upgrading an
> > existing
> > >> cluster and deploying a new one. What most people want out of upgrades
> > is
> > >> that things should keep working, but with bug fixes. If we change
> that,
> > it
> > >> just makes people more reluctant to upgrade (which is always a
> > problem...)
> > >>
> > >> > I think we need to think this through in this KIP. My second concern
> > is
> > >> > that as a particular version 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-17 Thread Jun Rao
Hi, Colin,

Thanks for the reply.

For case b, I am not sure that I understand your suggestion. Does "each
subsequent level for metadata.version corresponds to an IBP version" mean
that we need to keep IBP forever? Could you describe the upgrade process in
this case?

Jun

On Tue, Nov 16, 2021 at 3:45 PM Colin McCabe  wrote:

> On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote:
> > Hi, David, Colin,
> >
> > Thanks for the reply.
> >
> > 16. Discussed with David offline a bit. We have 3 cases.
> > a. We upgrade from an old version where the metadata.version has already
> > been finalized. In this case it makes sense to stay with that feature
> > version after the upgrade.
>
> +1
>
> > b. We upgrade from an old version where no metadata.version has been
> > finalized. In this case, it makes sense to leave metadata.version
> disabled
> > since we don't know if all brokers have been upgraded.
>
> This is the scenario I was hoping to avoid by saying that ALL KRaft
> clusters have metadata.version of at least 1, and each subsequent level for
> metadata.version corresponds to an IBP version. The existing KRaft clusters
> in 3.0 and earlier are preview (not for production) so I think this change
> is OK for 3.x (given that it affects only KRaft). Then IBP is irrelevant
> for KRaft clusters (the config is ignored, possibly with a WARN or ERROR
> message generated if it is set).
>
> > c. We are starting from a brand new cluster and of course no
> > metadata.version has been finalized. In this case, the KIP says it will
> > pick the metadata.version in meta.properties. In the common case, people
> > probably won't set the metadata.version in the meta.properties file
> > explicitly. So, it will be useful to put a default (stable) version there
> > when generating the meta.properties file.
>
> Hmm. I was assuming that clusters where the admin didn't specify any
> metadata.version during formatting would get the latest metadata.version.
> Partly, because this is what we do for IBP today. It would be good to
> clarify this...
>
> >
> > Also, it would be useful to clarify that if a FeatureLevelRecord exists
> for
> > metadata.version, the metadata.version in meta.properties will be
> ignored.
> >
>
> Yeah, I agree.
>
> best,
> Colin
>
> > Thanks,
> >
> > Jun
> >
> >
> > On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe 
> wrote:
> >
> >> On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote:
> >> > Hi, David,
> >> >
> >> > Thanks for the reply.
> >> >
> >> > 16. My first concern is that the KIP picks up meta.version
> inconsistently
> >> > during the deployment. If a new cluster is started, we pick up the
> >> highest
> >> > version. If we upgrade, we leave the feature version unchanged.
> >>
> >> Hi Jun,
> >>
> >> Thanks again for taking a look.
> >>
> >> The proposed behavior in KIP-778 is consistent with how it works today.
> >> Upgrading the software is distinct from upgrading the IBP.
> >>
> >> I think it is important to keep these two operations ("upgrading
> >> IBP/metadata version" and "upgrading software version") separate. If
> they
> >> are coupled it will create a situation where software upgrades are
> >> difficult and dangerous.
> >>
> >> Consider a situation where you find some bug in your current software,
> and
> >> you want to upgrade to new software that fixes the bug. If upgrades and
> IBP
> >> bumps are coupled, you can't do this without also bumping the IBP,
> which is
> >> usually considered a high-risk change. That means that either you have
> to
> >> make a special build that includes only the fix (time-consuming and
> >> error-prone), live with the bug for longer, or be very conservative
> about
> >> ever introducing new IBP/metadata versions. None of those are really
> good
> >> choices.
> >>
> >> > Intuitively, it seems that independent of how a cluster is deployed,
> we
> >> > should always pick the same feature version.
> >>
> >> I think it makes sense to draw a distinction between upgrading an
> existing
> >> cluster and deploying a new one. What most people want out of upgrades
> is
> >> that things should keep working, but with bug fixes. If we change that,
> it
> >> just makes people more reluctant to upgrade (which is always a
> problem...)
> >>
> >> > I think we need to think this through in this KIP. My second concern
> is
> >> > that as a particular version matures, it's inconvenient for a user to
> >> manually
> >> > upgrade every feature version. As long as we have a path to achieve
> that
> >> in
> >> > the future, we don't need to address that in this KIP.
> >>
> >> If people are managing a large number of Kafka clusters, they will want
> to
> >> do some sort of A/B testing with IBP/metadata versions. So if you have
> 1000
> >> Kafka clusters, you roll out the new IBP version to 10 of them and see
> how
> >> it goes. If that goes well, you roll it out to more, etc.
> >>
> >> So, the automation needs to be at the cluster management layer, not at
> the
> >> Kafka layer. Each Kafka cluster doesn't know how well 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Colin McCabe
On Tue, Nov 16, 2021, at 15:13, Jun Rao wrote:
> Hi, David, Colin,
>
> Thanks for the reply.
>
> 16. Discussed with David offline a bit. We have 3 cases.
> a. We upgrade from an old version where the metadata.version has already
> been finalized. In this case it makes sense to stay with that feature
> version after the upgrade.

+1

> b. We upgrade from an old version where no metadata.version has been
> finalized. In this case, it makes sense to leave metadata.version disabled
> since we don't know if all brokers have been upgraded.

This is the scenario I was hoping to avoid by saying that ALL KRaft clusters 
have metadata.version of at least 1, and each subsequent level for 
metadata.version corresponds to an IBP version. The existing KRaft clusters in 
3.0 and earlier are preview (not for production) so I think this change is OK 
for 3.x (given that it affects only KRaft). Then IBP is irrelevant for KRaft 
clusters (the config is ignored, possibly with a WARN or ERROR message 
generated if it is set).
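
To make the 1:1 correspondence concrete, a rough sketch of what the mapping
could look like (the level numbers and IBP names below are illustrative only,
not proposed values):

    // Illustrative sketch of a 1:1 mapping between metadata.version levels and
    // IBP versions. Level 1 would be the floor for all KRaft clusters; each new
    // IBP gets the next level, even if it changes nothing in KRaft mode.
    public enum MetadataVersionLevel {
        IBP_3_0(1),
        IBP_3_1(2),
        IBP_3_2(3);

        private final short level;

        MetadataVersionLevel(int level) {
            this.level = (short) level;
        }

        public short level() {
            return level;
        }

        public static MetadataVersionLevel latest() {
            MetadataVersionLevel[] values = values();
            return values[values.length - 1];
        }
    }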

> c. We are starting from a brand new cluster and of course no
> metadata.version has been finalized. In this case, the KIP says it will
> pick the metadata.version in meta.properties. In the common case, people
> probably won't set the metadata.version in the meta.properties file
> explicitly. So, it will be useful to put a default (stable) version there
> when generating the meta.properties file.

Hmm. I was assuming that clusters where the admin didn't specify any 
metadata.version during formatting would get the latest metadata.version. 
Partly, because this is what we do for IBP today. It would be good to clarify 
this...

>
> Also, it would be useful to clarify that if a FeatureLevelRecord exists for
> metadata.version, the metadata.version in meta.properties will be ignored.
>

Yeah, I agree.

best,
Colin

> Thanks,
>
> Jun
>
>
> On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe  wrote:
>
>> On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote:
>> > Hi, David,
>> >
>> > Thanks for the reply.
>> >
>> > 16. My first concern is that the KIP picks up meta.version inconsistently
>> > during the deployment. If a new cluster is started, we pick up the
>> highest
>> > version. If we upgrade, we leave the feature version unchanged.
>>
>> Hi Jun,
>>
>> Thanks again for taking a look.
>>
>> The proposed behavior in KIP-778 is consistent with how it works today.
>> Upgrading the software is distinct from upgrading the IBP.
>>
>> I think it is important to keep these two operations ("upgrading
>> IBP/metadata version" and "upgrading software version") separate. If they
>> are coupled it will create a situation where software upgrades are
>> difficult and dangerous.
>>
>> Consider a situation where you find some bug in your current software, and
>> you want to upgrade to new software that fixes the bug. If upgrades and IBP
>> bumps are coupled, you can't do this without also bumping the IBP, which is
>> usually considered a high-risk change. That means that either you have to
>> make a special build that includes only the fix (time-consuming and
>> error-prone), live with the bug for longer, or be very conservative about
>> ever introducing new IBP/metadata versions. None of those are really good
>> choices.
>>
>> > Intuitively, it seems that independent of how a cluster is deployed, we
>> > should always pick the same feature version.
>>
>> I think it makes sense to draw a distinction between upgrading an existing
>> cluster and deploying a new one. What most people want out of upgrades is
>> that things should keep working, but with bug fixes. If we change that, it
>> just makes people more reluctant to upgrade (which is always a problem...)
>>
>> > I think we need to think this through in this KIP. My second concern is
>> > that as a particular version matures, it's inconvenient for a user to
>> manually
>> > upgrade every feature version. As long as we have a path to achieve that
>> in
>> > the future, we don't need to address that in this KIP.
>>
>> If people are managing a large number of Kafka clusters, they will want to
>> do some sort of A/B testing with IBP/metadata versions. So if you have 1000
>> Kafka clusters, you roll out the new IBP version to 10 of them and see how
>> it goes. If that goes well, you roll it out to more, etc.
>>
>> So, the automation needs to be at the cluster management layer, not at the
>> Kafka layer. Each Kafka cluster doesn't know how well things went in the
>> other 999 clusters. Automatically upgrading is a bad thing for the same
>> reason Kafka automatically upgrading its own software version would be a
>> bad thing -- it could lead to a disruption to a sensitive cluster at the
>> wrong time.
>>
>> For people who are just managing one or two Kafka clusters, automatically
>> upgrading feature versions isn't a big burden and can be done manually.
>> This is all consistent with how IBP works today.
>>
>> Also, we already have a command-line option to the feature tool which
>> upgrades all features to 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Jun Rao
Hi, David, Colin,

Thanks for the reply.

16. Discussed with David offline a bit. We have 3 cases.
a. We upgrade from an old version where the metadata.version has already
been finalized. In this case it makes sense to stay with that feature
version after the upgrade.
b. We upgrade from an old version where no metadata.version has been
finalized. In this case, it makes sense to leave metadata.version disabled
since we don't know if all brokers have been upgraded.
c. We are starting from a brand new cluster and of course no
metadata.version has been finalized. In this case, the KIP says it will
pick the metadata.version in meta.properties. In the common case, people
probably won't set the metadata.version in the meta.properties file
explicitly. So, it will be useful to put a default (stable) version there
when generating the meta.properties file.

Also, it would be useful to clarify that if a FeatureLevelRecord exists for
metadata.version, the metadata.version in meta.properties will be ignored.
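
In code form, the precedence rule would look roughly like this (a sketch only,
not a proposed implementation):

    import java.util.Optional;

    // Sketch of the precedence rule: a committed FeatureLevelRecord in the
    // metadata log always wins; the metadata.version in meta.properties is only
    // consulted when no record has been committed yet (e.g. the very first
    // quorum leader election after formatting).
    public final class MetadataVersionResolver {
        public static short resolve(Optional<Short> fromFeatureLevelRecord,
                                    Optional<Short> fromMetaProperties,
                                    short defaultStableVersion) {
            if (fromFeatureLevelRecord.isPresent()) {
                return fromFeatureLevelRecord.get();   // committed record wins
            }
            return fromMetaProperties.orElse(defaultStableVersion);
        }
    }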

Thanks,

Jun


On Tue, Nov 16, 2021 at 12:39 PM Colin McCabe  wrote:

> On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote:
> > Hi, David,
> >
> > Thanks for the reply.
> >
> > 16. My first concern is that the KIP picks up meta.version inconsistently
> > during the deployment. If a new cluster is started, we pick up the
> highest
> > version. If we upgrade, we leave the feature version unchanged.
>
> Hi Jun,
>
> Thanks again for taking a look.
>
> The proposed behavior in KIP-778 is consistent with how it works today.
> Upgrading the software is distinct from upgrading the IBP.
>
> I think it is important to keep these two operations ("upgrading
> IBP/metadata version" and "upgrading software version") separate. If they
> are coupled it will create a situation where software upgrades are
> difficult and dangerous.
>
> Consider a situation where you find some bug in your current software, and
> you want to upgrade to new software that fixes the bug. If upgrades and IBP
> bumps are coupled, you can't do this without also bumping the IBP, which is
> usually considered a high-risk change. That means that either you have to
> make a special build that includes only the fix (time-consuming and
> error-prone), live with the bug for longer, or be very conservative about
> ever introducing new IBP/metadata versions. None of those are really good
> choices.
>
> > Intuitively, it seems that independent of how a cluster is deployed, we
> > should always pick the same feature version.
>
> I think it makes sense to draw a distinction between upgrading an existing
> cluster and deploying a new one. What most people want out of upgrades is
> that things should keep working, but with bug fixes. If we change that, it
> just makes people more reluctant to upgrade (which is always a problem...)
>
> > I think we need to think this through in this KIP. My second concern is
> > that as a particular version matures, it's inconvenient for a user to
> manually
> > upgrade every feature version. As long as we have a path to achieve that
> in
> > the future, we don't need to address that in this KIP.
>
> If people are managing a large number of Kafka clusters, they will want to
> do some sort of A/B testing with IBP/metadata versions. So if you have 1000
> Kafka clusters, you roll out the new IBP version to 10 of them and see how
> it goes. If that goes well, you roll it out to more, etc.
>
> So, the automation needs to be at the cluster management layer, not at the
> Kafka layer. Each Kafka cluster doesn't know how well things went in the
> other 999 clusters. Automatically upgrading is a bad thing for the same
> reason Kafka automatically upgrading its own software version would be a
> bad thing -- it could lead to a disruption to a sensitive cluster at the
> wrong time.
>
> For people who are just managing one or two Kafka clusters, automatically
> upgrading feature versions isn't a big burden and can be done manually.
> This is all consistent with how IBP works today.
>
> Also, we already have a command-line option to the feature tool which
> upgrades all features to the latest available, if that is what the
> administrator desires. Perhaps we could add documentation saying that this
> should be done as the last step of the upgrade.
>
> best,
> Colin
>
> >
> > 21. "./kafka-features.sh delete": Deleting a feature seems a bit weird
> > since the logic is always there. Would it be better to use disable?
> >
> > Jun
> >
> > On Fri, Nov 5, 2021 at 8:11 AM David Arthur
> >  wrote:
> >
> >> Colin and Jun, thanks for the additional comments!
> >>
> >> Colin:
> >>
> >> > We've been talking about having an automated RPC compatibility checker
> >>
> >> Do we have a way to mark fields in schemas as deprecated? It can stay in
> >> the RPC, it just complicates the logic a bit.
> >>
> >> > It would be nice if the active controller could validate that a
> majority
> >> of the quorum could use the proposed metadata.version. The active
> >> controller should have this 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Guozhang Wang
Yeah I agree that checking a majority of voters support the
metadata.version is sufficient. What I was originally considering is
whether (in the future) we could consider encoding the metadata.version
value in the vote request as well, so that the elected leader is supposed
to have a version that's supported by a majority of the quorum.
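
As a rough sketch of the idea (the Vote RPC does not carry such a field today,
so this is purely illustrative):

    // Sketch only: imagine the vote request carried the candidate's maximum
    // supported metadata.version. A voter could then refuse to vote for a
    // candidate that cannot read the currently finalized metadata.version.
    public final class VoteCheck {
        public static boolean grantVote(boolean candidateLogIsUpToDate,
                                        short candidateMaxSupportedMetadataVersion,
                                        short finalizedMetadataVersion) {
            return candidateLogIsUpToDate
                && candidateMaxSupportedMetadataVersion >= finalizedMetadataVersion;
        }
    }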

Guozhang

On Tue, Nov 16, 2021 at 2:02 PM Colin McCabe  wrote:

> On Tue, Nov 16, 2021, at 13:36, Guozhang Wang wrote:
> > Hi Colin,
> >
> > If we allow downgrades of metadata.version, which would also be appended to
> > the log, then the length of the __cluster_metadata log may not be a safe
> > indicator of a higher version, since an older-version record could still be
> > appended later than a new-version record, right?
> >
>
> That's fair... the longer log could be at a lower metadata.version.
>
> However, can you think of a scenario where the upgrade / downgrade logic
> doesn't protect us here?
>
> Checking that a majority of the voters support the new metadata.version
> we're going to seems to cover all the cases I can think of. I guess there
> could be time-of-check, time-of-use race conditions, but they only happen
> if you are doing a feature downgrade at the same time as a software version
> downgrade. This is something the cluster management software should not do
> (I guess we should spell this out in the KIP). I think nobody would want to
> do that since it would mean rolling the cluster while you're messing with
> feature flags.
>
> best,
> Colin
>


-- 
-- Guozhang


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Colin McCabe
On Tue, Nov 16, 2021, at 13:36, Guozhang Wang wrote:
> Hi Colin,
>
> If we allow downgrades of metadata.version, which would also be appended to
> the log, then the length of the __cluster_metadata log may not be a safe
> indicator of a higher version, since an older-version record could still be
> appended later than a new-version record, right?
>

That's fair... the longer log could be at a lower metadata.version.

However, can you think of a scenario where the upgrade / downgrade logic 
doesn't protect us here?

Checking that a majority of the voters support the new metadata.version we're 
going to seems to cover all the cases I can think of. I guess there could be 
time-of-check, time-of-use race conditions, but they only happen if you are 
doing a feature downgrade at the same time as a software version downgrade. 
This is something the cluster management software should not do (I guess we 
should spell this out in the KIP). I think nobody would want to do that since it 
would mean rolling the cluster while you're messing with feature flags.
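
The check itself would be simple, roughly like this (a sketch, assuming the
active controller already knows each voter's supported range from ApiVersions):

    import java.util.Collection;

    // Sketch: before committing a FeatureLevelRecord for a new metadata.version,
    // the active controller verifies that a majority of ALL quorum voters (not
    // just the currently reachable ones) report support for the proposed level.
    public final class QuorumFeatureCheck {
        public static boolean majoritySupports(Collection<Short> maxSupportedLevelPerVoter,
                                               int totalVoters,
                                               short proposedLevel) {
            long supporters = maxSupportedLevelPerVoter.stream()
                .filter(max -> max >= proposedLevel)
                .count();
            return supporters > totalVoters / 2;
        }
    }

Voters we have not heard from simply do not count toward the majority.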

best,
Colin


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Guozhang Wang
Hi Colin,

If we allow downgrades of metadata.version, which would also be appended to
the log, then the length of the __cluster_metadata log may not be a safe
indicator of a higher version, since an older-version record could still be
appended later than a new-version record, right?

On Tue, Nov 16, 2021 at 1:16 PM Colin McCabe  wrote:

> On Tue, Nov 16, 2021, at 06:36, David Arthur wrote:
> > An interesting case here is how to handle a version update if a majority
> of
> > the quorum supports it, but the leader doesn't. For example, if C1 was
> the
> > leader and an upgrade to version 4 was requested. Maybe this would
> trigger
> > C1 to resign and inform the client to retry the update later.
> >
>
> Hmm, wouldn't we want to just reject the version update in this case? I
> don't see what the advantage of allowing it would be.
>
> The reason for requiring a majority rather than all voters is mainly to
> cover the case where a voter is down, I thought. That clearly doesn't apply
> if the un-upgraded voter is the leader itself...
>
> >
> > We may eventually want to consider the metadata.version when electing a
> > leader, but as long as we have the majority requirement before
> committing a
> > new metadata.version, I think we should be safe.
> >
>
> Yeah, this is safe. If we elect a leader at metadata.version X then that
> means that a majority of the cluster is at least at version X. Proof by
> contradiction: assume that this is not the case. Then the newly elected
> leader must have a shorter __cluster_metadata log than a majority of the
> voters. But this is incompatible with winning a Raft election.
>
> In the case where the leader is "behind" some of the other voters, those
> voters will truncate their logs to match the new leader. This will
> downgrade them. Basically this is the case where the feature upgrade was
> proposed, but never fully completed.
>
> best,
> Colin
>
>
> > -David
> >
> > On Mon, Nov 15, 2021 at 12:52 PM Guozhang Wang 
> wrote:
> >
> >> Thanks David,
> >>
> >> 1. Got it. One thing I'm still not very clear is why it's sufficient to
> >> select a metadata.version which is supported by majority of the quorum,
> but
> >> not the whole quorum (i.e. choosing the lowest version among all the
> quorum
> >> members)? Since the leader election today does not take this value into
> >> consideration, we are not guaranteed that newly selected leaders would
> >> always be able to recognize and support the initialized metadata.version
> >> right?
> >>
> >> 2. Yeah I think I agree the behavior-but-not-RPC-change scenario is
> beyond
> >> the scope of this KIP, we can defer it to later discussions.
> >>
> >> On Mon, Nov 15, 2021 at 8:13 AM David Arthur
> >>  wrote:
> >>
> >> > Guozhang, thanks for the review!
> >> >
> >> > 1, As we've defined it so far, the initial metadata.version is set by
> an
> >> > operator via the "kafka-storage.sh" tool. It would be possible for
> >> > different values to be selected, but only the quorum leader would
> commit
> >> a
> >> > FeatureLevelRecord with the version they read locally. See the above
> >> reply
> >> > to Jun's question for a little more detail.
> >> >
> >> > We need to enable the KRaft RPCs regardless of metadata.version (vote,
> >> > heartbeat, fetch, etc) so that the quorum can be formed, a leader can
> be
> >> > elected, and followers can learn about the selected metadata.version.
> I
> >> > believe the quorum startup procedure would go something like:
> >> >
> >> > * Controller nodes start, read their logs, begin leader election
> >> > * While the elected node is becoming leader (in
> >> > QuorumMetaLogListener#handleLeaderChange), initialize
> metadata.version if
> >> > necessary
> >> > * Followers replicate the FeatureLevelRecord
> >> >
> >> > This (probably) means the quorum peers must continue to rely on
> >> ApiVersions
> >> > to negotiate compatible RPC versions and these versions cannot depend
> on
> >> > metadata.version.
> >> >
> >> > Does this make sense?
> >> >
> >> >
> >> > 2, ApiVersionResponse includes the broker's supported feature flags as
> >> well
> >> > as the cluster-wide finalized feature flags. We probably need to add
> >> > something like the feature flag "epoch" to this response payload in
> order
> >> > to see which broker is most up to date.
> >> >
> >> > If the new feature flag version included RPC changes, we are helped by
> >> the
> >> > fact that a client won't attempt to use the new RPC until it has
> >> discovered
> >> > a broker that supports it via ApiVersions. The problem is more
> difficult
> >> > for cases like you described where the feature flag upgrade changes
> the
> >> > behavior of the broker, but not its RPCs. This is actually the same
> >> problem
> >> > as upgrading the IBP. During a rolling restart, clients may hit
> different
> >> > brokers with different capabilities and not know it.
> >> >
> >> > This probably warrants further investigation, but hopefully you agree
> it
> >> is
> >> > beyond the scope of this KIP :)
> >> 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Guozhang Wang
Thanks. I think having the leader election consider the metadata.version
would be a good idea moving forward, but we do not need to include it in this
KIP.

On Tue, Nov 16, 2021 at 6:37 AM David Arthur
 wrote:

> Guozhang,
>
> 1. By requiring a majority of all controller nodes to support the version
> selected by the leader, we increase the likelihood that the next leader
> will also support it. We can't guarantee that all nodes definitely support
> the selected metadata.version because there could always be an offline or
> partitioned peer that is running old software.
>
> If a controller running old software manages to get elected, we hit this case:
>
>  In the unlikely event that an active controller encounters an unsupported
> > metadata.version, it should resign and terminate.
>
>
> So, given this, we should be able to eventually elect a controller that
> does support the metadata.version.
>
>
> Consider controllers C1, C2, C3 with this arrangement:
>
> Node  SoftwareVer  MaxMetadataVer
> C1    3.2          1
> C2    3.3          4
> C3    3.3          4
>
> If the current metadata.version is 1 and we're trying to upgrade to 4, we
> would allow it since two of the three nodes support it. If any one
> controller is down while we are attempting an upgrade, we would require
> that both of the remaining alive nodes support the target metadata.version
> (since we require a majority of _all_ controller nodes, not just alive
> ones).
>
> An interesting case here is how to handle a version update if a majority of
> the quorum supports it, but the leader doesn't. For example, if C1 was the
> leader and an upgrade to version 4 was requested. Maybe this would trigger
> C1 to resign and inform the client to retry the update later.
>
> We may eventually want to consider the metadata.version when electing a
> leader, but as long as we have the majority requirement before committing a
> new metadata.version, I think we should be safe.
>
> -David
>
> On Mon, Nov 15, 2021 at 12:52 PM Guozhang Wang  wrote:
>
> > Thanks David,
> >
> > 1. Got it. One thing I'm still not very clear is why it's sufficient to
> > select a metadata.version which is supported by majority of the quorum,
> but
> > not the whole quorum (i.e. choosing the lowest version among all the
> quorum
> > members)? Since the leader election today does not take this value into
> > consideration, we are not guaranteed that newly selected leaders would
> > always be able to recognize and support the initialized metadata.version
> > right?
> >
> > 2. Yeah I think I agree the behavior-but-not-RPC-change scenario is
> beyond
> > the scope of this KIP, we can defer it to later discussions.
> >
> > On Mon, Nov 15, 2021 at 8:13 AM David Arthur
> >  wrote:
> >
> > > Guozhang, thanks for the review!
> > >
> > > 1, As we've defined it so far, the initial metadata.version is set by
> an
> > > operator via the "kafka-storage.sh" tool. It would be possible for
> > > different values to be selected, but only the quorum leader would
> commit
> > a
> > > FeatureLevelRecord with the version they read locally. See the above
> > reply
> > > to Jun's question for a little more detail.
> > >
> > > We need to enable the KRaft RPCs regardless of metadata.version (vote,
> > > heartbeat, fetch, etc) so that the quorum can be formed, a leader can
> be
> > > elected, and followers can learn about the selected metadata.version. I
> > > believe the quorum startup procedure would go something like:
> > >
> > > * Controller nodes start, read their logs, begin leader election
> > > * While the elected node is becoming leader (in
> > > QuorumMetaLogListener#handleLeaderChange), initialize metadata.version
> if
> > > necessary
> > > * Followers replicate the FeatureLevelRecord
> > >
> > > This (probably) means the quorum peers must continue to rely on
> > ApiVersions
> > > to negotiate compatible RPC versions and these versions cannot depend
> on
> > > metadata.version.
> > >
> > > Does this make sense?
> > >
> > >
> > > 2, ApiVersionResponse includes the broker's supported feature flags as
> > well
> > > as the cluster-wide finalized feature flags. We probably need to add
> > > something like the feature flag "epoch" to this response payload in
> order
> > > to see which broker is most up to date.
> > >
> > > If the new feature flag version included RPC changes, we are helped by
> > the
> > > fact that a client won't attempt to use the new RPC until it has
> > discovered
> > > a broker that supports it via ApiVersions. The problem is more
> difficult
> > > for cases like you described where the feature flag upgrade changes the
> > > behavior of the broker, but not its RPCs. This is actually the same
> > problem
> > > as upgrading the IBP. During a rolling restart, clients may hit
> different
> > > brokers with different capabilities and not know it.
> > >
> > > This probably warrants further investigation, but hopefully you agree
> it
> > is
> > > beyond the scope of this KIP :)
> > >
> > > -David
> > >

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Colin McCabe
On Tue, Nov 16, 2021, at 06:36, David Arthur wrote:
> An interesting case here is how to handle a version update if a majority of
> the quorum supports it, but the leader doesn't. For example, if C1 was the
> leader and an upgrade to version 4 was requested. Maybe this would trigger
> C1 to resign and inform the client to retry the update later.
>

Hmm, wouldn't we want to just reject the version update in this case? I don't 
see what the advantage of allowing it would be.

The reason for requiring a majority rather than all voters is mainly to cover 
the case where a voter is down, I thought. That clearly doesn't apply if the 
un-upgraded voter is the leader itself...

>
> We may eventually want to consider the metadata.version when electing a
> leader, but as long as we have the majority requirement before committing a
> new metadata.version, I think we should be safe.
>

Yeah, this is safe. If we elect a leader at metadata.version X then that means 
that a majority of the cluster is at least at version X. Proof by 
contradiction: assume that this is not the case. Then the newly elected leader 
must have a shorter __cluster_metadata log than a majority of the voters. But 
this is incompatible with winning a Raft election.

In the case where the leader is "behind" some of the other voters, those voters 
will truncate their logs to match the new leader. This will downgrade them. 
Basically this is the case where the feature upgrade was proposed, but never 
fully completed.

best,
Colin


> -David
>
> On Mon, Nov 15, 2021 at 12:52 PM Guozhang Wang  wrote:
>
>> Thanks David,
>>
>> 1. Got it. One thing I'm still not very clear is why it's sufficient to
>> select a metadata.version which is supported by majority of the quorum, but
>> not the whole quorum (i.e. choosing the lowest version among all the quorum
>> members)? Since the leader election today does not take this value into
>> consideration, we are not guaranteed that newly selected leaders would
>> always be able to recognize and support the initialized metadata.version
>> right?
>>
>> 2. Yeah I think I agree the behavior-but-not-RPC-change scenario is beyond
>> the scope of this KIP, we can defer it to later discussions.
>>
>> On Mon, Nov 15, 2021 at 8:13 AM David Arthur
>>  wrote:
>>
>> > Guozhang, thanks for the review!
>> >
>> > 1, As we've defined it so far, the initial metadata.version is set by an
>> > operator via the "kafka-storage.sh" tool. It would be possible for
>> > different values to be selected, but only the quorum leader would commit
>> a
>> > FeatureLevelRecord with the version they read locally. See the above
>> reply
>> > to Jun's question for a little more detail.
>> >
>> > We need to enable the KRaft RPCs regardless of metadata.version (vote,
>> > heartbeat, fetch, etc) so that the quorum can be formed, a leader can be
>> > elected, and followers can learn about the selected metadata.version. I
>> > believe the quorum startup procedure would go something like:
>> >
>> > * Controller nodes start, read their logs, begin leader election
>> > * While the elected node is becoming leader (in
>> > QuorumMetaLogListener#handleLeaderChange), initialize metadata.version if
>> > necessary
>> > * Followers replicate the FeatureLevelRecord
>> >
>> > This (probably) means the quorum peers must continue to rely on
>> ApiVersions
>> > to negotiate compatible RPC versions and these versions cannot depend on
>> > metadata.version.
>> >
>> > Does this make sense?
>> >
>> >
>> > 2, ApiVersionResponse includes the broker's supported feature flags as
>> well
>> > as the cluster-wide finalized feature flags. We probably need to add
>> > something like the feature flag "epoch" to this response payload in order
>> > to see which broker is most up to date.
>> >
>> > If the new feature flag version included RPC changes, we are helped by
>> the
>> > fact that a client won't attempt to use the new RPC until it has
>> discovered
>> > a broker that supports it via ApiVersions. The problem is more difficult
>> > for cases like you described where the feature flag upgrade changes the
>> > behavior of the broker, but not its RPCs. This is actually the same
>> problem
>> > as upgrading the IBP. During a rolling restart, clients may hit different
>> > brokers with different capabilities and not know it.
>> >
>> > This probably warrants further investigation, but hopefully you agree it
>> is
>> > beyond the scope of this KIP :)
>> >
>> > -David
>> >
>> >
>> > On Mon, Nov 15, 2021 at 10:26 AM David Arthur > >
>> > wrote:
>> >
>> > > Jun, thanks for the comments!
>> > >
>> > > 16, When a new cluster is deployed, we don't select the highest
>> available
>> > > metadata.version, but rather the quorum leader picks a bootstrap
>> version
>> > > defined in meta.properties. As mentioned earlier, we should add
>> > validation
>> > > here to ensure a majority of the followers could support this version
>> > > before initializing it. This would avoid a situation where a 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Colin McCabe
Hi David,

On Mon, Nov 15, 2021, at 08:13, David Arthur wrote:
> This (probably) means the quorum peers must continue to rely on ApiVersions
> to negotiate compatible RPC versions and these versions cannot depend on
> metadata.version.

I agree. Can we explicitly spell out in KIP that quorum peers use ApiVersions 
to negotiate compatible RPC versions, rather than using metadata.version?
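
To make "negotiate" concrete: each side advertises a [min, max] version range per
API key, and the sender uses the highest version in the intersection of the two
ranges. A toy sketch of that rule (illustrative names only, not the actual client
code):

import java.util.OptionalInt;

// Toy sketch of ApiVersions-style version negotiation (illustrative only).
public final class RpcVersionNegotiation {

    // Pick the highest RPC version supported by both peers, if any.
    static OptionalInt negotiate(short localMin, short localMax,
                                 short remoteMin, short remoteMax) {
        int min = Math.max(localMin, remoteMin);
        int max = Math.min(localMax, remoteMax);
        return max >= min ? OptionalInt.of(max) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        // e.g. local Fetch range [0, 13], peer advertises [4, 12] -> use version 12
        System.out.println(negotiate((short) 0, (short) 13, (short) 4, (short) 12));
    }
}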

>
> If the new feature flag version included RPC changes, we are helped by the
> fact that a client won't attempt to use the new RPC until it has discovered
> a broker that supports it via ApiVersions. The problem is more difficult
> for cases like you described where the feature flag upgrade changes the
> behavior of the broker, but not its RPCs. This is actually the same problem
> as upgrading the IBP. During a rolling restart, clients may hit different
> brokers with different capabilities and not know it.
>
> This probably warrants further investigation, but hopefully you agree it is
> beyond the scope of this KIP :)
>

Yes, this is an existing issue (at least in theory... doesn't seem to be in 
practice) and out of scope for this KIP.

> Thinking a bit more on this, we do need to define a state where we're
> running newer software, but we don't have the feature flag set. This could
> happen if we were running an older IBP that did not support KIP-778.

I took another look at the KIP and I see that it calls for IBP to continue to 
exist alongside metadata.version. In our previous offline discussions, I think 
we were assuming that metadata.version would replace the IBP completely. Also, 
there would be a 1:1 correspondence between new IBP versions and new 
metadata.version levels. Basically ZK-based clusters will keep using IBP, the 
same as they've always done, whereas KRaft-based clusters will use 
metadata.version instead.

I actually don't think it makes sense to keep IBP if we are introducing 
metadata.version. I think metadata.version should replace it completely in the 
KRaft case, and ZK-based brokers should keep working exactly like they 
currently do. This avoids all the complexity described above. What do you think?

Also, it seems like KIP-778 is not present on the main KIP wiki page... can you 
add it?

best,
Colin


> -David
>
>
> On Mon, Nov 15, 2021 at 10:26 AM David Arthur 
> wrote:
>
>> Jun, thanks for the comments!
>>
>> 16, When a new cluster is deployed, we don't select the highest available
>> metadata.version, but rather the quorum leader picks a bootstrap version
>> defined in meta.properties. As mentioned earlier, we should add validation
>> here to ensure a majority of the followers could support this version
>> before initializing it. This would avoid a situation where a failover
>> results in a new leader who can't support the selected metadata.version.
>>
>> Thinking a bit more on this, we do need to define a state where we're
>> running newer software, but we don't have the feature flag set. This could
>> happen if we were running an older IBP that did not support KIP-778.
>> Following on this, it doesn't seem too difficult to consider a case where
>> the IBP has been upgraded, but we still have not finalized a
>> metadata.version. Here are some possible version combinations (assuming
>> KIP-778 is added to Kafka 3.2):
>>
>> Case  Software  IBP  metadata.version  effective version
>> ----------------------------------------------------------
>> A     3.1       3.1  -                 0   software too old for feature flag
>> B     3.2       3.1  -                 0   feature flag supported, but IBP too old
>> C     3.2       3.2  -                 0   feature flag supported, but not initialized
>> D     3.2       3.2  1                 1   feature flag initialized to 1 (via operator or bootstrap process)
>> ...
>> E     3.8       3.1  -                 0   ...IBP too old
>> F     3.8       3.2  -                 0   ...not initialized
>> G     3.8       3.2  4                 4
>>
>>
>> Here I'm defining version 0 as "no metadata.version set". So back to your
>> comment, I think the KIP omits discussion of case C from the above table
>> which I can amend. Does that cover your concerns, or am I missing something
>> else?
>>
>>
>> > it's inconvenient for a user to manually upgrade every feature version
>>
>> For this, we would probably want to extend the capabilities of KIP-584. I
>> don't think anything we've discussed for KIP-778 will preclude us from
>> adding some kind of auto-upgrade in the future.
>>
>> 21, "disable" sounds good to me. I agree "delete feature-x" sounds a bit
>> weird.
>>
>>
>>
>> On Mon, Nov 8, 2021 at 8:47 PM Guozhang Wang  wrote:
>>
>>> Hello David,
>>>
>>> Thanks for the very nice writeup! It helped me a lot to refresh my memory
>>> on KIP-630/590/584 :)
>>>
>>> I just had two clarification questions after reading through the KIP:
>>>
>>> 1. For the initialization procedure, do we guarantee that all the quorum

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Colin McCabe
On Fri, Nov 5, 2021, at 08:10, David Arthur wrote:
> Colin and Jun, thanks for the additional comments!
>
> Colin:
>
>> We've been talking about having an automated RPC compatibility checker
>
> Do we have a way to mark fields in schemas as deprecated? It can stay in
> the RPC, it just complicates the logic a bit.
>

Hi David,

The easiest thing to do is probably just to add a comment to the "description" 
field.

Thinking about it more, I really think we should stick to the normal RPC 
versioning rules. Making exceptions just tends to lead to issues down the road.

>> It would be nice if the active controller could validate that a majority
> of the quorum could use the proposed metadata.version. The active
> controller should have this information, right? If we don't have recent
> information  from a quorum of voters, we wouldn't be active.
>
> I believe we should have this information from the ApiVersionsResponse. It
> would be good to do this validation to avoid a situation where a
> quorum leader can't be elected due to unprocessable records.
>

Sounds good... can you add this to the KIP?

>> Do we need delete as a command separate from downgrade?
>
> I think from an operator's perspective, it is nice to distinguish between
> changing a feature flag and unsetting it. It might be surprising to an
> operator to see the flag's version set to nothing when they requested the
> downgrade to version 0 (or less).
>

Fair enough. If we want to go this route we should also specify whether setting 
the version to 0 is legal, or whether it's necessary to use "delete".

Jun's proposal of using "disable" instead of delete also seems reasonable 
here...

>
> > it seems like we should spell out that metadata.version begins at 1 in
> >KRaft clusters
>
> I added this text:
>

Thanks.

> > We probably also want an RPC implemented by both brokers and controllers
> > that will reveal the min and max supported versions for each feature level
> > supported by the server
>
> This is available in ApiVersionsResponse (we include the server's supported
> features as well as the cluster's finalized features)

Good point.

>
> Jun:
>
> 20. Indeed snapshots are not strictly necessary during an upgrade, I've
> reworded this
>

I thought we agreed that we would do a snapshot before each upgrade so that if 
there was a big problem with the new metadata version, we would at least have a 
clean snapshot to fall back on?

best,
Colin


>
> Thanks!
> David
>
>
> On Thu, Nov 4, 2021 at 6:51 PM Jun Rao  wrote:
>
>> Hi, David, Jose and Colin,
>>
>> Thanks for the reply. A few more comments.
>>
>> 12. It seems that we haven't updated the AdminClient accordingly?
>>
>> 14. "Metadata snapshot is generated and sent to the other inactive
>> controllers and to brokers". I thought we wanted each broker to generate
>> its own snapshot independently? If only the controller generates the
>> snapshot, how do we force other brokers to pick it up?
>>
>> 16. If a feature version is new, one may not want to enable it immediately
>> after the cluster is upgraded. However, if a feature version has been
>> stable, requiring every user to run a command to upgrade to that version
>> seems inconvenient. One way to improve this is for each feature to define
>> one version as the default. Then, when we upgrade a cluster, we will
>> automatically upgrade the feature to the default version. An admin could
>> use the tool to upgrade to a version higher than the default.
>>
>> 20. "The quorum controller can assist with this process by generating a
>> metadata snapshot after a metadata.version increase has been committed to
>> the metadata log. This snapshot will be a convenient way to let broker and
>> controller components rebuild their entire in-memory state following an
>> upgrade." The new version of the software could read both the new and the
>> old version. Is generating a new snapshot during upgrade needed?
>>
>> Jun
>>
>>
>> On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe  wrote:
>>
>> > On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote:
>> > > Hi, David,
>> > >
>> > > One more comment.
>> > >
>> > > 16. The main reason why KIP-584 requires finalizing a feature manually
>> is
>> > > that in the ZK world, the controller doesn't know all brokers in a
>> > cluster.
>> > > A broker temporarily down is not registered in ZK. in the KRaft world,
>> > the
>> > > controller keeps track of all brokers, including those that are
>> > temporarily
>> > > down. This makes it possible for the controller to automatically
>> > finalize a
>> > > feature---it's safe to do so when all brokers support that feature.
>> This
>> > > will make the upgrade process much simpler since no manual command is
>> > > required to turn on a new feature. Have we considered this?
>> > >
>> > > Thanks,
>> > >
>> > > Jun
>> >
>> > Hi Jun,
>> >
>> > I guess David commented on this point already, but I'll comment as well.
>> I
>> > always had the perception that users viewed rolls as potentially risky
>> and
>> > were 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread Colin McCabe
On Fri, Nov 5, 2021, at 15:18, Jun Rao wrote:
> Hi, David,
>
> Thanks for the reply.
>
> 16. My first concern is that the KIP picks up meta.version inconsistently
> during the deployment. If a new cluster is started, we pick up the highest
> version. If we upgrade, we leave the feature version unchanged.

Hi Jun,

Thanks again for taking a look.

The proposed behavior in KIP-778 is consistent with how it works today. 
Upgrading the software is distinct from upgrading the IBP.

I think it is important to keep these two operations ("upgrading IBP/metadata 
version" and "upgrading software version") separate. If they are coupled it 
will create a situation where software upgrades are difficult and dangerous.

Consider a situation where you find some bug in your current software, and you 
want to upgrade to new software that fixes the bug. If upgrades and IBP bumps 
are coupled, you can't do this without also bumping the IBP, which is usually 
considered a high-risk change. That means that either you have to make a 
special build that includes only the fix (time-consuming and error-prone), live 
with the bug for longer, or be very conservative about ever introducing new 
IBP/metadata versions. None of those are really good choices.

> Intuitively, it seems that independent of how a cluster is deployed, we
> should always pick the same feature version.

I think it makes sense to draw a distinction between upgrading an existing 
cluster and deploying a new one. What most people want out of upgrades is that 
things should keep working, but with bug fixes. If we change that, it just 
makes people more reluctant to upgrade (which is always a problem...)

> I think we need to think this through in this KIP. My second concern is
> that as a particular version matures, it's inconvenient for a user to manually
> upgrade every feature version. As long as we have a path to achieve that in
> the future, we don't need to address that in this KIP.

If people are managing a large number of Kafka clusters, they will want to do 
some sort of A/B testing with IBP/metadata versions. So if you have 1000 Kafka 
clusters, you roll out the new IBP version to 10 of them and see how it goes. 
If that goes well, you roll it out to more, etc.

So, the automation needs to be at the cluster management layer, not at the 
Kafka layer. Each Kafka cluster doesn't know how well things went in the other 
999 clusters. Automatically upgrading is a bad thing for the same reason Kafka 
automatically upgrading its own software version would be a bad thing -- it 
could lead to a disruption to a sensitive cluster at the wrong time.

For people who are just managing one or two Kafka clusters, upgrading feature 
versions isn't a big burden and can be done manually. This is 
all consistent with how IBP works today.

Also, we already have a command-line option to the feature tool which upgrades 
all features to the latest available, if that is what the administrator 
desires. Perhaps we could add documentation saying that this should be done as 
the last step of the upgrade.

best,
Colin

>
> 21. "./kafka-features.sh delete": Deleting a feature seems a bit weird
> since the logic is always there. Would it be better to use disable?
>
> Jun
>
> On Fri, Nov 5, 2021 at 8:11 AM David Arthur
>  wrote:
>
>> Colin and Jun, thanks for the additional comments!
>>
>> Colin:
>>
>> > We've been talking about having an automated RPC compatibility checker
>>
>> Do we have a way to mark fields in schemas as deprecated? It can stay in
>> the RPC, it just complicates the logic a bit.
>>
>> > It would be nice if the active controller could validate that a majority
>> of the quorum could use the proposed metadata.version. The active
>> controller should have this information, right? If we don't have recent
>> information  from a quorum of voters, we wouldn't be active.
>>
>> I believe we should have this information from the ApiVersionsResponse. It
>> would be good to do this validation to avoid a situation where a
>> quorum leader can't be elected due to unprocessable records.
>>
>> > Do we need delete as a command separate from downgrade?
>>
>> I think from an operator's perspective, it is nice to distinguish between
>> changing a feature flag and unsetting it. It might be surprising to an
>> operator to see the flag's version set to nothing when they requested the
>> downgrade to version 0 (or less).
>>
>> > it seems like we should spell out that metadata.version begins at 1 in
>> KRaft clusters
>>
>> I added this text:
>>
>> Introduce an IBP version to indicate the lowest software version that
>> > supports *metadata.version*. Below this IBP, the *metadata.version* is
>> > undefined and will not be examined. At or above this IBP, the
>> > *metadata.version* must be *0* for ZooKeeper clusters and will be
>> > initialized as *1* for KRaft clusters.
>>
>>
>> > We probably also want an RPC implemented by both brokers and controllers
>> that will reveal the 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-16 Thread David Arthur
Guozhang,

1. By requiring a majority of all controller nodes to support the version
selected by the leader, we increase the likelihood that the next leader
will also support it. We can't guarantee that all nodes definitely support
the selected metadata.version because there could always be an offline or
partitioned peer that is running old software.

If a controller running old software manages to get elected, we hit this case:

> In the unlikely event that an active controller encounters an unsupported
> metadata.version, it should resign and terminate.


So, given this, we should be able to eventually elect a controller that
does support the metadata.version.


Consider controllers C1, C2, C3 with this arrangement:

Node  SoftwareVer  MaxMetadataVer
C1    3.2          1
C2    3.3          4
C3    3.3          4

If the current metadata.version is 1 and we're trying to upgrade to 4, we
would allow it since two of the three nodes support it. If any one
controller is down while we are attempting an upgrade, we would require
that both of the remaining alive nodes support the target metadata.version
(since we require a majority of _all_ controller nodes, not just alive
ones).
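
Spelling that check out, something like this rough sketch (made-up types; the
real validation would use the ranges reported via ApiVersions):

import java.util.Map;

// Rough sketch of the "majority of ALL controller nodes" check described above.
// A controller that is offline or whose supported range is unknown has no entry
// in the map, so it counts against the majority.
final class MetadataVersionUpgradeCheck {

    // supportedRange maps controller id -> {minSupported, maxSupported}
    static boolean canUpgrade(Map<Integer, short[]> supportedRange,
                              int totalControllerCount,
                              short targetVersion) {
        long supporting = supportedRange.values().stream()
            .filter(r -> r[0] <= targetVersion && targetVersion <= r[1])
            .count();
        return supporting > totalControllerCount / 2;
    }

    public static void main(String[] args) {
        // C1 supports only up to 1, C2 and C3 support up to 4.
        Map<Integer, short[]> ranges = Map.of(
            1, new short[]{1, 1},
            2, new short[]{1, 4},
            3, new short[]{1, 4});
        System.out.println(canUpgrade(ranges, 3, (short) 4)); // true: 2 of 3 support it
        System.out.println(canUpgrade(Map.of(1, new short[]{1, 1},
                                             2, new short[]{1, 4}),
                                      3, (short) 4));         // false: with C3 down, only C2 supports 4
    }
}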

An interesting case here is how to handle a version update if a majority of
the quorum supports it, but the leader doesn't. For example, if C1 was the
leader and an upgrade to version 4 was requested. Maybe this would trigger
C1 to resign and inform the client to retry the update later.

We may eventually want to consider the metadata.version when electing a
leader, but as long as we have the majority requirement before committing a
new metadata.version, I think we should be safe.

-David

On Mon, Nov 15, 2021 at 12:52 PM Guozhang Wang  wrote:

> Thanks David,
>
> 1. Got it. One thing I'm still not very clear is why it's sufficient to
> select a metadata.version which is supported by majority of the quorum, but
> not the whole quorum (i.e. choosing the lowest version among all the quorum
> members)? Since the leader election today does not take this value into
> consideration, we are not guaranteed that newly selected leaders would
> always be able to recognize and support the initialized metadata.version
> right?
>
> 2. Yeah I think I agree the behavior-but-not-RPC-change scenario is beyond
> the scope of this KIP, we can defer it to later discussions.
>
> On Mon, Nov 15, 2021 at 8:13 AM David Arthur
>  wrote:
>
> > Guozhang, thanks for the review!
> >
> > 1, As we've defined it so far, the initial metadata.version is set by an
> > operator via the "kafka-storage.sh" tool. It would be possible for
> > different values to be selected, but only the quorum leader would commit
> a
> > FeatureLevelRecord with the version they read locally. See the above
> reply
> > to Jun's question for a little more detail.
> >
> > We need to enable the KRaft RPCs regardless of metadata.version (vote,
> > heartbeat, fetch, etc) so that the quorum can be formed, a leader can be
> > elected, and followers can learn about the selected metadata.version. I
> > believe the quorum startup procedure would go something like:
> >
> > * Controller nodes start, read their logs, begin leader election
> > * While the elected node is becoming leader (in
> > QuorumMetaLogListener#handleLeaderChange), initialize metadata.version if
> > necessary
> > * Followers replicate the FeatureLevelRecord
> >
> > This (probably) means the quorum peers must continue to rely on
> ApiVersions
> > to negotiate compatible RPC versions and these versions cannot depend on
> > metadata.version.
> >
> > Does this make sense?
> >
> >
> > 2, ApiVersionResponse includes the broker's supported feature flags as
> well
> > as the cluster-wide finalized feature flags. We probably need to add
> > something like the feature flag "epoch" to this response payload in order
> > to see which broker is most up to date.
> >
> > If the new feature flag version included RPC changes, we are helped by
> the
> > fact that a client won't attempt to use the new RPC until it has
> discovered
> > a broker that supports it via ApiVersions. The problem is more difficult
> > for cases like you described where the feature flag upgrade changes the
> > behavior of the broker, but not its RPCs. This is actually the same
> problem
> > as upgrading the IBP. During a rolling restart, clients may hit different
> > brokers with different capabilities and not know it.
> >
> > This probably warrants further investigation, but hopefully you agree it
> is
> > beyond the scope of this KIP :)
> >
> > -David
> >
> >
> > On Mon, Nov 15, 2021 at 10:26 AM David Arthur  >
> > wrote:
> >
> > > Jun, thanks for the comments!
> > >
> > > 16, When a new cluster is deployed, we don't select the highest
> available
> > > metadata.version, but rather the quorum leader picks a bootstrap
> version
> > > defined in meta.properties. As mentioned earlier, we should add
> > validation
> > > here to ensure a majority of the followers could support this version
> > > 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-15 Thread Guozhang Wang
Thanks David,

1. Got it. One thing I'm still not very clear on is why it's sufficient to
select a metadata.version which is supported by a majority of the quorum, but
not by the whole quorum (i.e. choosing the lowest version among all the quorum
members)? Since the leader election today does not take this value into
consideration, we are not guaranteed that newly elected leaders would
always be able to recognize and support the initialized metadata.version,
right?

2. Yeah I think I agree the behavior-but-not-RPC-change scenario is beyond
the scope of this KIP, we can defer it to later discussions.

On Mon, Nov 15, 2021 at 8:13 AM David Arthur
 wrote:

> Guozhang, thanks for the review!
>
> 1, As we've defined it so far, the initial metadata.version is set by an
> operator via the "kafka-storage.sh" tool. It would be possible for
> different values to be selected, but only the quorum leader would commit a
> FeatureLevelRecord with the version they read locally. See the above reply
> to Jun's question for a little more detail.
>
> We need to enable the KRaft RPCs regardless of metadata.version (vote,
> heartbeat, fetch, etc) so that the quorum can be formed, a leader can be
> elected, and followers can learn about the selected metadata.version. I
> believe the quorum startup procedure would go something like:
>
> * Controller nodes start, read their logs, begin leader election
> * While the elected node is becoming leader (in
> QuorumMetaLogListener#handleLeaderChange), initialize metadata.version if
> necessary
> * Followers replicate the FeatureLevelRecord
>
> This (probably) means the quorum peers must continue to rely on ApiVersions
> to negotiate compatible RPC versions and these versions cannot depend on
> metadata.version.
>
> Does this make sense?
>
>
> 2, ApiVersionResponse includes the broker's supported feature flags as well
> as the cluster-wide finalized feature flags. We probably need to add
> something like the feature flag "epoch" to this response payload in order
> to see which broker is most up to date.
>
> If the new feature flag version included RPC changes, we are helped by the
> fact that a client won't attempt to use the new RPC until it has discovered
> a broker that supports it via ApiVersions. The problem is more difficult
> for cases like you described where the feature flag upgrade changes the
> behavior of the broker, but not its RPCs. This is actually the same problem
> as upgrading the IBP. During a rolling restart, clients may hit different
> brokers with different capabilities and not know it.
>
> This probably warrants further investigation, but hopefully you agree it is
> beyond the scope of this KIP :)
>
> -David
>
>
> On Mon, Nov 15, 2021 at 10:26 AM David Arthur 
> wrote:
>
> > Jun, thanks for the comments!
> >
> > 16, When a new cluster is deployed, we don't select the highest available
> > metadata.version, but rather the quorum leader picks a bootstrap version
> > defined in meta.properties. As mentioned earlier, we should add
> validation
> > here to ensure a majority of the followers could support this version
> > before initializing it. This would avoid a situation where a failover
> > results in a new leader who can't support the selected metadata.version.
> >
> > Thinking a bit more on this, we do need to define a state where we're
> > running newer software, but we don't have the feature flag set. This
> could
> > happen if we were running an older IBP that did not support KIP-778.
> > Following on this, it doesn't seem too difficult to consider a case where
> > the IBP has been upgraded, but we still have not finalized a
> > metadata.version. Here are some possible version combinations (assuming
> > KIP-778 is added to Kafka 3.2):
> >
> > Case  Software  IBP  metadata.version  effective version
> > ----------------------------------------------------------
> > A     3.1       3.1  -                 0   software too old for feature flag
> > B     3.2       3.1  -                 0   feature flag supported, but IBP too old
> > C     3.2       3.2  -                 0   feature flag supported, but not initialized
> > D     3.2       3.2  1                 1   feature flag initialized to 1 (via operator or bootstrap process)
> > ...
> > E     3.8       3.1  -                 0   ...IBP too old
> > F     3.8       3.2  -                 0   ...not initialized
> > G     3.8       3.2  4                 4
> >
> >
> > Here I'm defining version 0 as "no metadata.version set". So back to your
> > comment, I think the KIP omits discussion of case C from the above table
> > which I can amend. Does that cover your concerns, or am I missing
> something
> > else?
> >
> >
> > > it's inconvenient for a user to manually upgrade every feature version
> >
> > For this, we would probably want to extend the capabilities of KIP-584. I
> > don't think anything we've discussed for KIP-778 will preclude us from
> > adding some 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-15 Thread David Arthur
Guozhang, thanks for the review!

1, As we've defined it so far, the initial metadata.version is set by an
operator via the "kafka-storage.sh" tool. It would be possible for
different values to be selected, but only the quorum leader would commit a
FeatureLevelRecord with the version they read locally. See the above reply
to Jun's question for a little more detail.

We need to enable the KRaft RPCs regardless of metadata.version (vote,
heartbeat, fetch, etc) so that the quorum can be formed, a leader can be
elected, and followers can learn about the selected metadata.version. I
believe the quorum startup procedure would go something like:

* Controller nodes start, read their logs, begin leader election
* While the elected node is becoming leader (in
QuorumMetaLogListener#handleLeaderChange), initialize metadata.version if
necessary
* Followers replicate the FeatureLevelRecord

This (probably) means the quorum peers must continue to rely on ApiVersions
to negotiate compatible RPC versions and these versions cannot depend on
metadata.version.
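
The initialization step in the middle bullet above could look roughly like this
sketch (all names made up, not the actual controller code):

// Sketch of the "initialize metadata.version if necessary" step above.
// Names are illustrative; the real logic would live in the quorum controller.
class MetadataVersionInitializer {

    private final short bootstrapVersion;   // e.g. read from meta.properties
    private Short committedVersion;         // null until a FeatureLevelRecord is replicated

    MetadataVersionInitializer(short bootstrapVersion) {
        this.bootstrapVersion = bootstrapVersion;
    }

    // Called while the elected node is becoming leader (roughly the
    // QuorumMetaLogListener#handleLeaderChange hook mentioned above).
    void handleLeaderChange(boolean becameLeader) {
        if (becameLeader && committedVersion == null) {
            appendFeatureLevelRecord("metadata.version", bootstrapVersion);
        }
    }

    // Stand-in for appending to the metadata log; followers then replicate
    // this record and learn the selected metadata.version.
    private void appendFeatureLevelRecord(String feature, short level) {
        System.out.printf("append FeatureLevelRecord(%s -> %d)%n", feature, level);
        committedVersion = level;
    }

    public static void main(String[] args) {
        new MetadataVersionInitializer((short) 1).handleLeaderChange(true);
    }
}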

Does this make sense?


2, ApiVersionResponse includes the broker's supported feature flags as well
as the cluster-wide finalized feature flags. We probably need to add
something like the feature flag "epoch" to this response payload in order
to see which broker is most up to date.
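
For reference, the KIP-584 AdminClient already surfaces the supported flags, the
finalized flags, and an epoch on the describeFeatures path; a rough sketch (error
handling omitted):

import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.FeatureMetadata;

// Sketch: read the broker-supported and cluster-wide finalized feature flags
// via the KIP-584 AdminClient. (The discussion above is about additionally
// surfacing an epoch in ApiVersionsResponse itself.)
public class DescribeFeaturesExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            FeatureMetadata features = admin.describeFeatures().featureMetadata().get();
            System.out.println("Supported: " + features.supportedFeatures());
            System.out.println("Finalized: " + features.finalizedFeatures());
            System.out.println("Epoch:     " + features.finalizedFeaturesEpoch());
        }
    }
}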

If the new feature flag version included RPC changes, we are helped by the
fact that a client won't attempt to use the new RPC until it has discovered
a broker that supports it via ApiVersions. The problem is more difficult
for cases like you described where the feature flag upgrade changes the
behavior of the broker, but not its RPCs. This is actually the same problem
as upgrading the IBP. During a rolling restart, clients may hit different
brokers with different capabilities and not know it.

This probably warrants further investigation, but hopefully you agree it is
beyond the scope of this KIP :)

-David


On Mon, Nov 15, 2021 at 10:26 AM David Arthur 
wrote:

> Jun, thanks for the comments!
>
> 16, When a new cluster is deployed, we don't select the highest available
> metadata.version, but rather the quorum leader picks a bootstrap version
> defined in meta.properties. As mentioned earlier, we should add validation
> here to ensure a majority of the followers could support this version
> before initializing it. This would avoid a situation where a failover
> results in a new leader who can't support the selected metadata.version.
>
> Thinking a bit more on this, we do need to define a state where we're
> running newer software, but we don't have the feature flag set. This could
> happen if we were running an older IBP that did not support KIP-778.
> Following on this, it doesn't seem too difficult to consider a case where
> the IBP has been upgraded, but we still have not finalized a
> metadata.version. Here are some possible version combinations (assuming
> KIP-778 is added to Kafka 3.2):
>
> Case  Software  IBP  metadata.version  effective version
> ----------------------------------------------------------
> A     3.1       3.1  -                 0   software too old for feature flag
> B     3.2       3.1  -                 0   feature flag supported, but IBP too old
> C     3.2       3.2  -                 0   feature flag supported, but not initialized
> D     3.2       3.2  1                 1   feature flag initialized to 1 (via operator or bootstrap process)
> ...
> E     3.8       3.1  -                 0   ...IBP too old
> F     3.8       3.2  -                 0   ...not initialized
> G     3.8       3.2  4                 4
>
>
> Here I'm defining version 0 as "no metadata.version set". So back to your
> comment, I think the KIP omits discussion of case C from the above table
> which I can amend. Does that cover your concerns, or am I missing something
> else?
>
>
> > it's inconvenient for a user to manually upgrade every feature version
>
> For this, we would probably want to extend the capabilities of KIP-584. I
> don't think anything we've discussed for KIP-778 will preclude us from
> adding some kind of auto-upgrade in the future.
>
> 21, "disable" sounds good to me. I agree "delete feature-x" sounds a bit
> weird.
>
>
>
> On Mon, Nov 8, 2021 at 8:47 PM Guozhang Wang  wrote:
>
>> Hello David,
>>
>> Thanks for the very nice writeup! It helped me a lot to refresh my memory
>> on KIP-630/590/584 :)
>>
>> I just had two clarification questions after reading through the KIP:
>>
>> 1. For the initialization procedure, do we guarantee that all the quorum
>> nodes (inactive candidates and leaders, a.k.a. controllers) would always
>> initialize with the same metadata.version? If yes, how is that guaranteed?
>> More specifically, when a quorum candidate is starting up, would it avoid
>> handling any controller requests (including APIVersionsRequest) from its
>> peers in the quorum until it completes reading the local log? 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-15 Thread David Arthur
Jun, thanks for the comments!

16, When a new cluster is deployed, we don't select the highest available
metadata.version, but rather the quorum leader picks a bootstrap version
defined in meta.properties. As mentioned earlier, we should add validation
here to ensure a majority of the followers could support this version
before initializing it. This would avoid a situation where a failover
results in a new leader who can't support the selected metadata.version.

Thinking a bit more on this, we do need to define a state where we're
running newer software, but we don't have the feature flag set. This could
happen if we were running an older IBP that did not support KIP-778.
Following on this, it doesn't seem too difficult to consider a case where
the IBP has been upgraded, but we still have not finalized a
metadata.version. Here are some possible version combinations (assuming
KIP-778 is added to Kafka 3.2):

Case  Software  IBP  metadata.version  effective version
----------------------------------------------------------
A     3.1       3.1  -                 0   software too old for feature flag
B     3.2       3.1  -                 0   feature flag supported, but IBP too old
C     3.2       3.2  -                 0   feature flag supported, but not initialized
D     3.2       3.2  1                 1   feature flag initialized to 1 (via operator or bootstrap process)
...
E     3.8       3.1  -                 0   ...IBP too old
F     3.8       3.2  -                 0   ...not initialized
G     3.8       3.2  4                 4
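
In code form, the rules in this table boil down to something like the following
sketch (version constants are illustrative, and "no metadata.version set" is
modeled as null, i.e. the version 0 defined just below):

// Sketch of the "effective version" rules in the table above. All constants
// are illustrative and assume KIP-778 ships in 3.2.
final class EffectiveMetadataVersion {

    static int effectiveVersion(double softwareVersion, double ibp, Integer metadataVersion) {
        if (softwareVersion < 3.2) return 0;   // case A: software too old for the feature flag
        if (ibp < 3.2) return 0;               // cases B, E: flag supported, but IBP too old
        if (metadataVersion == null) return 0; // cases C, F: flag supported, but not initialized
        return metadataVersion;                // cases D, G: finalized metadata.version wins
    }

    public static void main(String[] args) {
        System.out.println(effectiveVersion(3.2, 3.2, null)); // case C -> 0
        System.out.println(effectiveVersion(3.8, 3.2, 4));    // case G -> 4
    }
}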


Here I'm defining version 0 as "no metadata.version set". So back to your
comment, I think the KIP omits discussion of case C from the above table
which I can amend. Does that cover your concerns, or am I missing something
else?


> it's inconvenient for a user to manually upgrade every feature version

For this, we would probably want to extend the capabilities of KIP-584. I
don't think anything we've discussed for KIP-778 will preclude us from
adding some kind of auto-upgrade in the future.

21, "disable" sounds good to me. I agree "delete feature-x" sounds a bit
weird.



On Mon, Nov 8, 2021 at 8:47 PM Guozhang Wang  wrote:

> Hello David,
>
> Thanks for the very nice writeup! It helped me a lot to refresh my memory
> on KIP-630/590/584 :)
>
> I just had two clarification questions after reading through the KIP:
>
> 1. For the initialization procedure, do we guarantee that all the quorum
> nodes (inactive candidates and leaders, a.k.a. controllers) would always
> initialize with the same metadata.version? If yes, how is that guaranteed?
> More specifically, when a quorum candidate is starting up, would it avoid
> handling any controller requests (including APIVersionsRequest) from its
> peers in the quorum until it completes reading the local log? And even if
> yes, what would happen if there's no FeatureLevelRecord found, and
> different nodes read different values from their local meta.properties file
> or initializing from their binary's hard-code values?
>
> 2. This is not only for the metadata.version itself, but for general
> feature.versions: when a version is upgraded / downgraded, since brokers
> would read the FeatureLevelRecord at their own pace, there will always be a
> race window when some brokers has processed the record and completed the
> upgrade while others have not, hence may behave differently --- I'm
> thinking for the future like the specific replica selector to allow
> fetching from follower, and even more advanced selectors --- i.e. should we
> consider letting clients to only talk to brokers with the highest metadata
> log offsets for example?
>
>
> Guozhang
>
>
>
>
> On Fri, Nov 5, 2021 at 3:18 PM Jun Rao  wrote:
>
> > Hi, David,
> >
> > Thanks for the reply.
> >
> > 16. My first concern is that the KIP picks up meta.version inconsistently
> > during the deployment. If a new cluster is started, we pick up the
> highest
> > version. If we upgrade, we leave the feature version unchanged.
> > Intuitively, it seems that independent of how a cluster is deployed, we
> > should always pick the same feature version. I think we need to think
> > this through in this KIP. My second concern is that as a particular
> version
> > matures, it's inconvenient for a user to manually upgrade every feature
> > version. As long as we have a path to achieve that in the future, we
> don't
> > need to address that in this KIP.
> >
> > 21. "./kafka-features.sh delete": Deleting a feature seems a bit weird
> > since the logic is always there. Would it be better to use disable?
> >
> > Jun
> >
> > On Fri, Nov 5, 2021 at 8:11 AM David Arthur
> >  wrote:
> >
> > > Colin and Jun, thanks for the additional comments!
> > >
> > > Colin:
> > >
> > > > We've been talking about having an automated RPC compatibility
> checker
> > >
> > > Do we have a way to mark fields in schemas as deprecated? It can stay
> in
> > > the RPC, it just 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-08 Thread Guozhang Wang
Hello David,

Thanks for the very nice writeup! It helped me a lot to refresh my memory
on KIP-630/590/584 :)

I just had two clarification questions after reading through the KIP:

1. For the initialization procedure, do we guarantee that all the quorum
nodes (inactive candidates and leaders, a.k.a. controllers) would always
initialize with the same metadata.version? If yes, how is that guaranteed?
More specifically, when a quorum candidate is starting up, would it avoid
handling any controller requests (including APIVersionsRequest) from its
peers in the quorum until it completes reading the local log? And even if
yes, what would happen if there's no FeatureLevelRecord found, and
different nodes read different values from their local meta.properties file
or initialize from their binary's hard-coded values?

2. This is not only for metadata.version itself, but for feature versions in
general: when a version is upgraded / downgraded, since brokers would read the
FeatureLevelRecord at their own pace, there will always be a race window in
which some brokers have processed the record and completed the upgrade while
others have not, and hence may behave differently --- I'm thinking of future
features like the specific replica selector that allows fetching from
followers, and even more advanced selectors --- i.e. should we consider
letting clients only talk to brokers with the highest metadata log offsets,
for example?


Guozhang




On Fri, Nov 5, 2021 at 3:18 PM Jun Rao  wrote:

> Hi, David,
>
> Thanks for the reply.
>
> 16. My first concern is that the KIP picks up meta.version inconsistently
> during the deployment. If a new cluster is started, we pick up the highest
> version. If we upgrade, we leave the feature version unchanged.
> Intuitively, it seems that independent of how a cluster is deployed, we
> should always pick the same feature version. I think we need to think
> this through in this KIP. My second concern is that as a particular version
> matures, it's inconvenient for a user to manually upgrade every feature
> version. As long as we have a path to achieve that in the future, we don't
> need to address that in this KIP.
>
> 21. "./kafka-features.sh delete": Deleting a feature seems a bit weird
> since the logic is always there. Would it be better to use disable?
>
> Jun
>
> On Fri, Nov 5, 2021 at 8:11 AM David Arthur
>  wrote:
>
> > Colin and Jun, thanks for the additional comments!
> >
> > Colin:
> >
> > > We've been talking about having an automated RPC compatibility checker
> >
> > Do we have a way to mark fields in schemas as deprecated? It can stay in
> > the RPC, it just complicates the logic a bit.
> >
> > > It would be nice if the active controller could validate that a
> majority
> > of the quorum could use the proposed metadata.version. The active
> > controller should have this information, right? If we don't have recent
> > information  from a quorum of voters, we wouldn't be active.
> >
> > I believe we should have this information from the ApiVersionsResponse.
> It
> > would be good to do this validation to avoid a situation where a
> > quorum leader can't be elected due to unprocessable records.
> >
> > > Do we need delete as a command separate from downgrade?
> >
> > I think from an operator's perspective, it is nice to distinguish between
> > changing a feature flag and unsetting it. It might be surprising to an
> > operator to see the flag's version set to nothing when they requested the
> > downgrade to version 0 (or less).
> >
> > > it seems like we should spell out that metadata.version begins at 1 in
> > KRaft clusters
> >
> > I added this text:
> >
> > Introduce an IBP version to indicate the lowest software version that
> > > supports *metadata.version*. Below this IBP, the *metadata.version* is
> > > undefined and will not be examined. At or above this IBP, the
> > > *metadata.version* must be *0* for ZooKeeper clusters and will be
> > > initialized as *1* for KRaft clusters.
> >
> >
> > > We probably also want an RPC implemented by both brokers and
> controllers
> > that will reveal the min and max supported versions for each feature
> level
> > supported by the server
> >
> > This is available in ApiVersionsResponse (we include the server's
> supported
> > features as well as the cluster's finalized features)
> >
> > 
> >
> > Jun:
> >
> > 12. I've updated the KIP with AdminClient changes
> >
> > 14. You're right, it looks like I missed a few sections regarding
> snapshot
> > generation. I've corrected it
> >
> > 16. This feels more like an enhancement to KIP-584. I agree it could be
> > useful, but perhaps we could address it separately from KRaft upgrades?
> >
> > 20. Indeed snapshots are not strictly necessary during an upgrade, I've
> > reworded this
> >
> >
> > Thanks!
> > David
> >
> >
> > On Thu, Nov 4, 2021 at 6:51 PM Jun Rao  wrote:
> >
> > > Hi, David, Jose and Colin,
> > >
> > > Thanks for the reply. A few more comments.
> > >
> > > 12. It seems that we 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-05 Thread Jun Rao
Hi, David,

Thanks for the reply.

16. My first concern is that the KIP picks up meta.version inconsistently
during the deployment. If a new cluster is started, we pick up the highest
version. If we upgrade, we leave the feature version unchanged.
Intuitively, it seems that independent of how a cluster is deployed, we
should always pick the same feature version. I think we need to think
this through in this KIP. My second concern is that as a particular version
matures, it's inconvenient for a user to manually upgrade every feature
version. As long as we have a path to achieve that in the future, we don't
need to address that in this KIP.

21. "./kafka-features.sh delete": Deleting a feature seems a bit weird
since the logic is always there. Would it be better to use disable?

Jun

On Fri, Nov 5, 2021 at 8:11 AM David Arthur
 wrote:

> Colin and Jun, thanks for the additional comments!
>
> Colin:
>
> > We've been talking about having an automated RPC compatibility checker
>
> Do we have a way to mark fields in schemas as deprecated? It can stay in
> the RPC, it just complicates the logic a bit.
>
> > It would be nice if the active controller could validate that a majority
> of the quorum could use the proposed metadata.version. The active
> controller should have this information, right? If we don't have recent
> information  from a quorum of voters, we wouldn't be active.
>
> I believe we should have this information from the ApiVersionsResponse. It
> would be good to do this validation to avoid a situation where a
> quorum leader can't be elected due to unprocessable records.
>
> > Do we need delete as a command separate from downgrade?
>
> I think from an operator's perspective, it is nice to distinguish between
> changing a feature flag and unsetting it. It might be surprising to an
> operator to see the flag's version set to nothing when they requested the
> downgrade to version 0 (or less).
>
> > it seems like we should spell out that metadata.version begins at 1 in
> KRaft clusters
>
> I added this text:
>
> Introduce an IBP version to indicate the lowest software version that
> > supports *metadata.version*. Below this IBP, the *metadata.version* is
> > undefined and will not be examined. At or above this IBP, the
> > *metadata.version* must be *0* for ZooKeeper clusters and will be
> > initialized as *1* for KRaft clusters.
>
>
> > We probably also want an RPC implemented by both brokers and controllers
> that will reveal the min and max supported versions for each feature level
> supported by the server
>
> This is available in ApiVersionsResponse (we include the server's supported
> features as well as the cluster's finalized features)
>
> 
>
> Jun:
>
> 12. I've updated the KIP with AdminClient changes
>
> 14. You're right, it looks like I missed a few sections regarding snapshot
> generation. I've corrected it
>
> 16. This feels more like an enhancement to KIP-584. I agree it could be
> useful, but perhaps we could address it separately from KRaft upgrades?
>
> 20. Indeed snapshots are not strictly necessary during an upgrade, I've
> reworded this
>
>
> Thanks!
> David
>
>
> On Thu, Nov 4, 2021 at 6:51 PM Jun Rao  wrote:
>
> > Hi, David, Jose and Colin,
> >
> > Thanks for the reply. A few more comments.
> >
> > 12. It seems that we haven't updated the AdminClient accordingly?
> >
> > 14. "Metadata snapshot is generated and sent to the other inactive
> > controllers and to brokers". I thought we wanted each broker to generate
> > its own snapshot independently? If only the controller generates the
> > snapshot, how do we force other brokers to pick it up?
> >
> > 16. If a feature version is new, one may not want to enable it
> immediately
> > after the cluster is upgraded. However, if a feature version has been
> > stable, requiring every user to run a command to upgrade to that version
> > seems inconvenient. One way to improve this is for each feature to define
> > one version as the default. Then, when we upgrade a cluster, we will
> > automatically upgrade the feature to the default version. An admin could
> > use the tool to upgrade to a version higher than the default.
> >
> > 20. "The quorum controller can assist with this process by generating a
> > metadata snapshot after a metadata.version increase has been committed to
> > the metadata log. This snapshot will be a convenient way to let broker
> and
> > controller components rebuild their entire in-memory state following an
> > upgrade." The new version of the software could read both the new and the
> > old version. Is generating a new snapshot during upgrade needed?
> >
> > Jun
> >
> >
> > On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe  wrote:
> >
> > > On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote:
> > > > Hi, David,
> > > >
> > > > One more comment.
> > > >
> > > > 16. The main reason why KIP-584 requires finalizing a feature
> manually
> > is
> > > > that in the ZK world, the controller doesn't know all brokers in a
> > > 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-05 Thread David Arthur
Colin and Jun, thanks for the additional comments!

Colin:

> We've been talking about having an automated RPC compatibility checker

Do we have a way to mark fields in schemas as deprecated? It can stay in
the RPC, it just complicates the logic a bit.

> It would be nice if the active controller could validate that a majority
of the quorum could use the proposed metadata.version. The active
controller should have this information, right? If we don't have recent
information  from a quorum of voters, we wouldn't be active.

I believe we should have this information from the ApiVersionsResponse. It
would be good to do this validation to avoid a situation where a
quorum leader can't be elected due to unprocessable records.

> Do we need delete as a command separate from downgrade?

I think from an operator's perspective, it is nice to distinguish between
changing a feature flag and unsetting it. It might be surprising to an
operator to see the flag's version set to nothing when they requested the
downgrade to version 0 (or less).
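
For comparison, here is roughly what an upgrade and an explicit downgrade look
like through the KIP-584 Admin API (a sketch using the API shape from that KIP;
error handling omitted, and "metadata.version" is of course the flag proposed by
this KIP):

import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.FeatureUpdate;
import org.apache.kafka.clients.admin.UpdateFeaturesOptions;

// Sketch: upgrade vs. downgrade of a feature flag via the KIP-584 Admin API.
// How a "delete"/"disable" (level 0) should be expressed is what is being
// debated above.
public class FeatureUpdateExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // Upgrade the proposed metadata.version flag to level 4.
            admin.updateFeatures(
                Map.of("metadata.version", new FeatureUpdate((short) 4, false)),
                new UpdateFeaturesOptions()).all().get();

            // Downgrade back to level 1; the downgrade must be allowed explicitly.
            admin.updateFeatures(
                Map.of("metadata.version", new FeatureUpdate((short) 1, true)),
                new UpdateFeaturesOptions()).all().get();
        }
    }
}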

> it seems like we should spell out that metadata.version begins at 1 in
KRaft clusters

I added this text:

Introduce an IBP version to indicate the lowest software version that
> supports *metadata.version*. Below this IBP, the *metadata.version* is
> undefined and will not be examined. At or above this IBP, the
> *metadata.version* must be *0* for ZooKeeper clusters and will be
> initialized as *1* for KRaft clusters.


> We probably also want an RPC implemented by both brokers and controllers
that will reveal the min and max supported versions for each feature level
supported by the server

This is available in ApiVersionsResponse (we include the server's supported
features as well as the cluster's finalized features)



Jun:

12. I've updated the KIP with AdminClient changes

14. You're right, it looks like I missed a few sections regarding snapshot
generation. I've corrected it

16. This feels more like an enhancement to KIP-584. I agree it could be
useful, but perhaps we could address it separately from KRaft upgrades?

20. Indeed snapshots are not strictly necessary during an upgrade, I've
reworded this


Thanks!
David


On Thu, Nov 4, 2021 at 6:51 PM Jun Rao  wrote:

> Hi, David, Jose and Colin,
>
> Thanks for the reply. A few more comments.
>
> 12. It seems that we haven't updated the AdminClient accordingly?
>
> 14. "Metadata snapshot is generated and sent to the other inactive
> controllers and to brokers". I thought we wanted each broker to generate
> its own snapshot independently? If only the controller generates the
> snapshot, how do we force other brokers to pick it up?
>
> 16. If a feature version is new, one may not want to enable it immediately
> after the cluster is upgraded. However, if a feature version has been
> stable, requiring every user to run a command to upgrade to that version
> seems inconvenient. One way to improve this is for each feature to define
> one version as the default. Then, when we upgrade a cluster, we will
> automatically upgrade the feature to the default version. An admin could
> use the tool to upgrade to a version higher than the default.
>
> 20. "The quorum controller can assist with this process by generating a
> metadata snapshot after a metadata.version increase has been committed to
> the metadata log. This snapshot will be a convenient way to let broker and
> controller components rebuild their entire in-memory state following an
> upgrade." The new version of the software could read both the new and the
> old version. Is generating a new snapshot during upgrade needed?
>
> Jun
>
>
> On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe  wrote:
>
> > On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote:
> > > Hi, David,
> > >
> > > One more comment.
> > >
> > > 16. The main reason why KIP-584 requires finalizing a feature manually
> is
> > > that in the ZK world, the controller doesn't know all brokers in a
> > cluster.
> > > A broker temporarily down is not registered in ZK. In the KRaft world,
> > the
> > > controller keeps track of all brokers, including those that are
> > temporarily
> > > down. This makes it possible for the controller to automatically
> > finalize a
> > > feature---it's safe to do so when all brokers support that feature.
> This
> > > will make the upgrade process much simpler since no manual command is
> > > required to turn on a new feature. Have we considered this?
> > >
> > > Thanks,
> > >
> > > Jun
> >
> > Hi Jun,
> >
> > I guess David commented on this point already, but I'll comment as well.
> I
> > always had the perception that users viewed rolls as potentially risky
> and
> > were looking for ways to reduce the risk. Not enabling features right
> away
> > after installing new software seems like one way to do that. If we had a
> > feature to automatically upgrade during a roll, I'm not sure that I would
> > recommend that people use it, because if something fails, it makes it
> > harder to tell if the new feature is at fault, or something else in the
> > new software.

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-04 Thread Jun Rao
Hi, David, Jose and Colin,

Thanks for the reply. A few more comments.

12. It seems that we haven't updated the AdminClient accordingly?

14. "Metadata snapshot is generated and sent to the other inactive
controllers and to brokers". I thought we wanted each broker to generate
its own snapshot independently? If only the controller generates the
snapshot, how do we force other brokers to pick it up?

16. If a feature version is new, one may not want to enable it immediately
after the cluster is upgraded. However, if a feature version has been
stable, requiring every user to run a command to upgrade to that version
seems inconvenient. One way to improve this is for each feature to define
one version as the default. Then, when we upgrade a cluster, we will
automatically upgrade the feature to the default version. An admin could
use the tool to upgrade to a version higher than the default.
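
A minimal sketch of the resolution logic, assuming each feature declares a
default level (names here are hypothetical, not part of any API):

import java.util.Map;

// Hypothetical sketch of "one default version per feature": when the cluster
// software is upgraded, the controller would finalize each feature at its
// declared default; an operator can still raise it further with the tool.
public class DefaultFeatureLevels {

    static final Map<String, Short> DEFAULTS = Map.of("metadata.version", (short) 1);

    // Level the controller would finalize automatically after a software upgrade.
    static short autoFinalizedLevel(String feature, short currentFinalizedLevel) {
        short dflt = DEFAULTS.getOrDefault(feature, (short) 0);
        // Never move a feature backwards as part of the automatic step.
        return (short) Math.max(currentFinalizedLevel, dflt);
    }

    public static void main(String[] args) {
        System.out.println(autoFinalizedLevel("metadata.version", (short) 0)); // 1
        System.out.println(autoFinalizedLevel("metadata.version", (short) 2)); // 2, already above default
    }
}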

20. "The quorum controller can assist with this process by generating a
metadata snapshot after a metadata.version increase has been committed to
the metadata log. This snapshot will be a convenient way to let broker and
controller components rebuild their entire in-memory state following an
upgrade." The new version of the software could read both the new and the
old version. Is generating a new snapshot during upgrade needed?

Jun


On Wed, Nov 3, 2021 at 5:42 PM Colin McCabe  wrote:

> On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote:
> > Hi, David,
> >
> > One more comment.
> >
> > 16. The main reason why KIP-584 requires finalizing a feature manually is
> > that in the ZK world, the controller doesn't know all brokers in a
> cluster.
> > A broker temporarily down is not registered in ZK. In the KRaft world,
> the
> > controller keeps track of all brokers, including those that are
> temporarily
> > down. This makes it possible for the controller to automatically
> finalize a
> > feature---it's safe to do so when all brokers support that feature. This
> > will make the upgrade process much simpler since no manual command is
> > required to turn on a new feature. Have we considered this?
> >
> > Thanks,
> >
> > Jun
>
> Hi Jun,
>
> I guess David commented on this point already, but I'll comment as well. I
> always had the perception that users viewed rolls as potentially risky and
> were looking for ways to reduce the risk. Not enabling features right away
> after installing new software seems like one way to do that. If we had a
> feature to automatically upgrade during a roll, I'm not sure that I would
> recommend that people use it, because if something fails, it makes it
> harder to tell if the new feature is at fault, or something else in the new
> software.
>
> We already tell users to do a "double roll" when going to a new IBP. (Just
> to give background to people who haven't heard that phrase, the first roll
> installs the new software, and the second roll updates the IBP). So this
> KIP-778 mechanism is basically very similar to that, except the second
> thing isn't a roll, but just an upgrade command. So I think this is
> consistent with what we currently do.
>
> Also, just like David said, we can always add auto-upgrade later if there
> is demand...
>
> best,
> Colin
>
>
> >
> > On Thu, Oct 7, 2021 at 5:19 PM Jun Rao  wrote:
> >
> >> Hi, David,
> >>
> >> Thanks for the KIP. A few comments below.
> >>
> >> 10. It would be useful to describe how the controller node determines
> the
> >> RPC version used to communicate to other controller nodes. There seems
> to
> >> be a bootstrap problem. A controller node can't read the log and
> >> therefore the feature level until a quorum leader is elected. But leader
> >> election requires an RPC.
> >>
> >> 11. For downgrades, it would be useful to describe how to determine the
> >> downgrade process (generating new snapshot, propagating the snapshot,
> etc)
> >> has completed. We could block the UpdateFeature request until the
> process
> >> is completed. However, since the process could take time, the request
> could
> >> time out. Another way is through DescribeFeature and the server only
> >> reports downgraded versions after the process is completed.
> >>
> >> 12. Since we are changing UpdateFeaturesRequest, do we need to change
> the
> >> AdminClient api for updateFeatures too?
> >>
> >> 13. For the paragraph starting with "In the absence of an operator
> >> defined value for metadata.version", in KIP-584, we described how to
> >> finalize features with New cluster bootstrap. In that case, it's
> >> inconvenient for the users to have to run an admin tool to finalize the
> >> version for each feature. Instead, the system detects that the /features
> >> path is missing in ZK and thus automatically finalizes every feature
> with
> >> the latest supported version. Could we do something similar in the KRaft
> >> mode?
> >>
> >> 14. After the quorum leader generates a new snapshot, how do we force
> >> other nodes to pick up the new snapshot?
> >>
> >> 15. I agree with Jose that it will be useful to describe when generating a
> >> new snapshot is needed. To me, it seems the new snapshot is only needed
> >> when incompatible changes are made.

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-03 Thread Colin McCabe
On Tue, Oct 12, 2021, at 10:34, Jun Rao wrote:
> Hi, David,
>
> One more comment.
>
> 16. The main reason why KIP-584 requires finalizing a feature manually is
> that in the ZK world, the controller doesn't know all brokers in a cluster.
> A broker temporarily down is not registered in ZK. In the KRaft world, the
> controller keeps track of all brokers, including those that are temporarily
> down. This makes it possible for the controller to automatically finalize a
> feature---it's safe to do so when all brokers support that feature. This
> will make the upgrade process much simpler since no manual command is
> required to turn on a new feature. Have we considered this?
>
> Thanks,
>
> Jun

Hi Jun,

I guess David commented on this point already, but I'll comment as well. I 
always had the perception that users viewed rolls as potentially risky and were 
looking for ways to reduce the risk. Not enabling features right away after 
installing new software seems like one way to do that. If we had a feature to 
automatically upgrade during a roll, I'm not sure that I would recommend that 
people use it, because if something fails, it makes it harder to tell if the 
new feature is at fault, or something else in the new software.

We already tell users to do a "double roll" when going to a new IBP. (Just to 
give background to people who haven't heard that phrase, the first roll 
installs the new software, and the second roll updates the IBP). So this 
KIP-778 mechanism is basically very similar to that, except the second thing 
isn't a roll, but just an upgrade command. So I think this is consistent with 
what we currently do.

Also, just like David said, we can always add auto-upgrade later if there is 
demand...

best,
Colin


>
> On Thu, Oct 7, 2021 at 5:19 PM Jun Rao  wrote:
>
>> Hi, David,
>>
>> Thanks for the KIP. A few comments below.
>>
>> 10. It would be useful to describe how the controller node determines the
>> RPC version used to communicate to other controller nodes. There seems to
>> be a bootstrap problem. A controller node can't read the log and
>> therefore the feature level until a quorum leader is elected. But leader
>> election requires an RPC.
>>
>> 11. For downgrades, it would be useful to describe how to determine the
>> downgrade process (generating new snapshot, propagating the snapshot, etc)
>> has completed. We could block the UpdateFeature request until the process
>> is completed. However, since the process could take time, the request could
>> time out. Another way is through DescribeFeature and the server only
>> reports downgraded versions after the process is completed.
>>
>> 12. Since we are changing UpdateFeaturesRequest, do we need to change the
>> AdminClient api for updateFeatures too?
>>
>> 13. For the paragraph starting with "In the absence of an operator
>> defined value for metadata.version", in KIP-584, we described how to
>> finalize features with New cluster bootstrap. In that case, it's
>> inconvenient for the users to have to run an admin tool to finalize the
>> version for each feature. Instead, the system detects that the /features
>> path is missing in ZK and thus automatically finalizes every feature with
>> the latest supported version. Could we do something similar in the KRaft
>> mode?
>>
>> 14. After the quorum leader generates a new snapshot, how do we force
>> other nodes to pick up the new snapshot?
>>
>> 15. I agree with Jose that it will be useful to describe when generating a
>> new snapshot is needed. To me, it seems the new snapshot is only needed
>> when incompatible changes are made.
>>
>> 7. Jose, what control records were you referring?
>>
>> Thanks,
>>
>> Jun
>>
>>
>> On Tue, Oct 5, 2021 at 8:53 AM David Arthur 
>> wrote:
>>
>>> Jose, thanks for the thorough review and comments!
>>>
>>> I am out of the office until next week, so I probably won't be able to
>>> update the KIP until then. Here are some replies to your questions:
>>>
>>> 1. Generate snapshot on upgrade
>>> > > Metadata snapshot is generated and sent to the other nodes
>>> > Why does the Active Controller need to generate a new snapshot and
>>> > force a snapshot fetch from the replicas (inactive controller and
>>> > brokers) on an upgrade? Isn't writing the FeatureLevelRecord good
>>> > enough to communicate the upgrade to the replicas?
>>>
>>>
>>> You're right, we don't necessarily need to _transmit_ a snapshot, since
>>> each node can generate its own equivalent snapshot
>>>
>>> 2. Generate snapshot on downgrade
>>> > > Metadata snapshot is generated and sent to the other inactive
>>> > controllers and to brokers (this snapshot may be lossy!)
>>> > Why do we need to send this downgraded snapshot to the brokers? The
>>> > replicas have seen the FeatureLevelRecord and noticed the downgrade.
>>> > Can we have the replicas each independently generate a downgraded
>>> > snapshot at the offset for the downgraded FeatureLevelRecord? I assume
>>> > that the active controller will 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-03 Thread Colin McCabe
Hi David,

Thanks again for the KIP.

David Arthur wrote:
 > 101. Change AllowDowngrade bool to DowngradeType int8 in
 > UpgradeFeatureRequest RPC. I'm wondering if we can kind of "cheat" on this
 > incompatible change since it's not currently in use and totally remove the
 > old field and leave the versions at 0+. Thoughts?

We've been talking about having an automated RPC compatibility checker, and 
doing something like this might make that more complex. On the other hand, I 
suppose it could be annotated as a special case.

David Arthur wrote:
 > The active controller should probably validate whatever value is read from
 > meta.properties against its own range of supported versions (statically
 > defined in code). If the operator sets a version unsupported by the active
 > controller, that sounds like a configuration error and we should shutdown.
 > I'm not sure what other validation we could do here without introducing
 > ordering dependencies (e.g., must have quorum before initializing the
 > version)

It would be nice if the active controller could validate that a majority of the 
quorum could use the proposed metadata.version. The active controller should 
have this information, right? If we don't have recent information from a
quorum of voters, we wouldn't be active.

David Arthur wrote:
 > Hey folks, I just updated the KIP with details on proposed changes to the
 > kafka-features.sh tool. It includes four proposed sub-commands which will
 > provide the Basic and Advanced functions detailed in KIP-584. Please have a
 > look, thanks!

I like the new sub-commands... seems very clean.

Do we need delete as a command separate from downgrade? In KIP-584, we agreed
to keep version 0 reserved for "no such feature flag." So downgrading to
version 0 should be the same as deletion.

On another note, it seems like we should spell out that metadata.version begins 
at 1 in KRaft clusters, and that it's always 0 in ZK-based clusters. Maybe this 
is obvious but it would be good to write it down here.

It seems like we never got closure on the issue of simplifying feature levels. 
As I said earlier, I think the min/max thing is not needed for 99% of the 
use-cases we've talked about for feature levels. Certainly, it's not needed for 
metadata.version. Having it be mandatory for every feature level, whether it 
needs it or not, is an extra complication that I think we should get rid of, 
before this interface becomes set in stone. (Use-cases that need this can 
simply have two feature flags, X_min and X_max, as I said before.)

We probably also want an RPC implemented by both brokers and controllers that 
will reveal the min and max supported versions for each feature level supported 
by the server. This is useful for diagnostics. And I suppose we should have a 
command line option to the features command that broadcasts it to everyone and 
returns the results (or lack of result, if the connection couldn't be made...)

best,
Colin


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-11-03 Thread David Arthur
Hey folks, I just updated the KIP with details on proposed changes to the
kafka-features.sh tool. It includes four proposed sub-commands which will
provide the Basic and Advanced functions detailed in KIP-584. Please have a
look, thanks!
https://cwiki.apache.org/confluence/display/KAFKA/KIP-778%3A+KRaft+Upgrades#KIP778:KRaftUpgrades-KIP-584Addendum

Aside from this change, if there isn't any more feedback on the KIP I'd
like to start a vote soon.

Cheers,
David

On Thu, Oct 21, 2021 at 3:09 AM Kowshik Prakasam
 wrote:

> Hi David,
>
> Thanks for the explanations. A few comments below.
>
> 7001. Sounds good.
>
> 7002. Sounds good. The --force-downgrade-all option can be used for the
> basic CLI while the --force-downgrade option can be used for the advanced
> CLI.
>
> 7003. I like your suggestion on separate sub-commands, I agree it's more
> convenient to use.
>
> 7004/7005. Your explanation sounds good to me. Regarding the min finalized
> version level, this becomes useful for feature version deprecation as
> explained here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-Featureversiondeprecation
> . This is not implemented yet, and the work item is tracked in KAFKA-10622.
>
>
> Cheers,
> Kowshik
>
>
>
> On Fri, Oct 15, 2021 at 11:38 AM David Arthur  wrote:
>
> > >
> > > How does the active controller know what is a valid `metadata.version`
> > > to persist? Could the active controller learn this from the
> > > ApiVersions response from all of the inactive controllers?
> >
> >
> > The active controller should probably validate whatever value is read
> from
> > meta.properties against its own range of supported versions (statically
> > defined in code). If the operator sets a version unsupported by the
> active
> > controller, that sounds like a configuration error and we should
> shutdown.
> > I'm not sure what other validation we could do here without introducing
> > ordering dependencies (e.g., must have quorum before initializing the
> > version)
> >
> > For example, let's say that we have a cluster that only has remote
> > > controllers, what are the valid metadata.version in that case?
> >
> >
> > I believe it would be the intersection of supported versions across all
> > brokers and controllers. This does raise a concern with upgrading the
> > metadata.version in general. Currently, the active controller only
> > validates the target version based on the brokers' supported versions. We
> > will need to include controllers' supported versions here as well (using
> > ApiVersions, probably).
> >
> > On Fri, Oct 15, 2021 at 1:44 PM José Armando García Sancio
> >  wrote:
> >
> > > On Fri, Oct 15, 2021 at 7:24 AM David Arthur  wrote:
> > > > Hmm. So I think you are proposing the following flow:
> > > > > 1. Cluster metadata partition replicas establish a quorum using
> > > > > ApiVersions and the KRaft protocol.
> > > > > 2. Inactive controllers send a registration RPC to the active
> > > controller.
> > > > > 3. The active controller persists this information to the metadata
> > log.
> > > >
> > > >
> > > > What happens if the inactive controllers send a metadata.version
> range
> > > > > that is not compatible with the metadata.version set for the
> cluster?
> > > >
> > > >
> > > > As we discussed offline, we don't need the explicit registration
> step.
> > > Once
> > > > a controller has joined the quorum, it will learn about the finalized
> > > > "metadata.version" level once it reads that record.
> > >
> > > How does the active controller know what is a valid `metadata.version`
> > > to persist? Could the active controller learn this from the
> > > ApiVersions response from all of the inactive controllers? For
> > > example, let's say that we have a cluster that only has remote
> > > controllers, what are the valid metadata.version in that case?
> > >
> > > > If it encounters a
> > > > version it can't support it should probably shutdown since it might
> not
> > > be
> > > > able to process any more records.
> > >
> > > I think that makes sense. If a controller cannot replay the metadata
> > > log, it might as well not be part of the quorum. If the cluster
> > > continues in this state it won't guarantee availability based on the
> > > replication factor.
> > >
> > > Thanks
> > > --
> > > -Jose
> > >
> >
> >
> > --
> > David Arthur
> >
>


-- 
David Arthur


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-21 Thread Kowshik Prakasam
Hi David,

Thanks for the explanations. A few comments below.

7001. Sounds good.

7002. Sounds good. The --force-downgrade-all option can be used for the
basic CLI while the --force-downgrade option can be used for the advanced
CLI.

7003. I like your suggestion on separate sub-commands, I agree it's more
convenient to use.

7004/7005. Your explanation sounds good to me. Regarding the min finalized
version level, this becomes useful for feature version deprecation as
explained here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-Featureversiondeprecation
. This is not implemented yet, and the work item is tracked in KAFKA-10622.


Cheers,
Kowshik



On Fri, Oct 15, 2021 at 11:38 AM David Arthur  wrote:

> >
> > How does the active controller know what is a valid `metadata.version`
> > to persist? Could the active controller learn this from the
> > ApiVersions response from all of the inactive controllers?
>
>
> The active controller should probably validate whatever value is read from
> meta.properties against its own range of supported versions (statically
> defined in code). If the operator sets a version unsupported by the active
> controller, that sounds like a configuration error and we should shutdown.
> I'm not sure what other validation we could do here without introducing
> ordering dependencies (e.g., must have quorum before initializing the
> version)
>
> For example, let's say that we have a cluster that only has remote
> > controllers, what are the valid metadata.version in that case?
>
>
> I believe it would be the intersection of supported versions across all
> brokers and controllers. This does raise a concern with upgrading the
> metadata.version in general. Currently, the active controller only
> validates the target version based on the brokers' supported versions. We
> will need to include controllers' supported versions here as well (using
> ApiVersions, probably).
>
> On Fri, Oct 15, 2021 at 1:44 PM José Armando García Sancio
>  wrote:
>
> > On Fri, Oct 15, 2021 at 7:24 AM David Arthur  wrote:
> > > Hmm. So I think you are proposing the following flow:
> > > > 1. Cluster metadata partition replicas establish a quorum using
> > > > ApiVersions and the KRaft protocol.
> > > > 2. Inactive controllers send a registration RPC to the active
> > controller.
> > > > 3. The active controller persists this information to the metadata
> log.
> > >
> > >
> > > What happens if the inactive controllers send a metadata.version range
> > > > that is not compatible with the metadata.version set for the cluster?
> > >
> > >
> > > As we discussed offline, we don't need the explicit registration step.
> > Once
> > > a controller has joined the quorum, it will learn about the finalized
> > > "metadata.version" level once it reads that record.
> >
> > How does the active controller know what is a valid `metadata.version`
> > to persist? Could the active controller learn this from the
> > ApiVersions response from all of the inactive controllers? For
> > example, let's say that we have a cluster that only has remote
> > controllers, what are the valid metadata.version in that case?
> >
> > > If it encounters a
> > > version it can't support it should probably shutdown since it might not
> > be
> > > able to process any more records.
> >
> > I think that makes sense. If a controller cannot replay the metadata
> > log, it might as well not be part of the quorum. If the cluster
> > continues in this state it won't guarantee availability based on the
> > replication factor.
> >
> > Thanks
> > --
> > -Jose
> >
>
>
> --
> David Arthur
>


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-15 Thread David Arthur
>
> How does the active controller know what is a valid `metadata.version`
> to persist? Could the active controller learn this from the
> ApiVersions response from all of the inactive controllers?


The active controller should probably validate whatever value is read from
meta.properties against its own range of supported versions (statically
defined in code). If the operator sets a version unsupported by the active
controller, that sounds like a configuration error and we should shutdown.
I'm not sure what other validation we could do here without introducing
ordering dependencies (e.g., must have quorum before initializing the
version)
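
A minimal sketch of that validation, assuming the supported range is
statically defined in code (hypothetical names, not actual code):

// Hypothetical sketch: validate an operator-provided initial metadata.version
// (e.g. read from meta.properties) against the versions this controller's
// software statically supports, and fail fast on a mismatch.
public class MetadataVersionValidation {

    // Statically defined in code for a given software release.
    static final short MIN_SUPPORTED_METADATA_VERSION = 1;
    static final short MAX_SUPPORTED_METADATA_VERSION = 1;

    static void validateInitialVersion(short fromMetaProperties) {
        if (fromMetaProperties < MIN_SUPPORTED_METADATA_VERSION
                || fromMetaProperties > MAX_SUPPORTED_METADATA_VERSION) {
            // Configuration error: shut down rather than join the quorum.
            throw new IllegalStateException("metadata.version " + fromMetaProperties
                + " is outside this controller's supported range ["
                + MIN_SUPPORTED_METADATA_VERSION + ", " + MAX_SUPPORTED_METADATA_VERSION + "]");
        }
    }
}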

For example, let's say that we have a cluster that only has remote
> controllers, what are the valid metadata.version in that case?


I believe it would be the intersection of supported versions across all
brokers and controllers. This does raise a concern with upgrading the
metadata.version in general. Currently, the active controller only
validates the target version based on the brokers' supported versions. We
will need to include controllers' supported versions here as well (using
ApiVersions, probably).
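
Roughly, that computation would be an intersection of ranges, something like
this sketch (hypothetical types):

import java.util.Collection;
import java.util.Optional;

// Hypothetical sketch: intersect the supported metadata.version ranges of all
// registered brokers and all controllers to find the levels the cluster as a
// whole could finalize.
public class SupportedRangeIntersection {

    record VersionRange(short min, short max) {}

    static Optional<VersionRange> intersect(Collection<VersionRange> ranges) {
        if (ranges.isEmpty()) {
            return Optional.empty(); // no nodes registered yet, nothing to finalize
        }
        short min = Short.MIN_VALUE;
        short max = Short.MAX_VALUE;
        for (VersionRange r : ranges) {
            min = (short) Math.max(min, r.min());
            max = (short) Math.min(max, r.max());
        }
        return min <= max ? Optional.of(new VersionRange(min, max)) : Optional.empty();
    }
}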

On Fri, Oct 15, 2021 at 1:44 PM José Armando García Sancio
 wrote:

> On Fri, Oct 15, 2021 at 7:24 AM David Arthur  wrote:
> > Hmm. So I think you are proposing the following flow:
> > > 1. Cluster metadata partition replicas establish a quorum using
> > > ApiVersions and the KRaft protocol.
> > > 2. Inactive controllers send a registration RPC to the active
> controller.
> > > 3. The active controller persists this information to the metadata log.
> >
> >
> > What happens if the inactive controllers send a metadata.version range
> > > that is not compatible with the metadata.version set for the cluster?
> >
> >
> > As we discussed offline, we don't need the explicit registration step.
> Once
> > a controller has joined the quorum, it will learn about the finalized
> > "metadata.version" level once it reads that record.
>
> How does the active controller know what is a valid `metadata.version`
> to persist? Could the active controller learn this from the
> ApiVersions response from all of the inactive controllers? For
> example, let's say that we have a cluster that only has remote
> controllers, what are the valid metadata.version in that case?
>
> > If it encounters a
> > version it can't support it should probably shutdown since it might not
> be
> > able to process any more records.
>
> I think that makes sense. If a controller cannot replay the metadata
> log, it might as well not be part of the quorum. If the cluster
> continues in this state it won't guarantee availability based on the
> replication factor.
>
> Thanks
> --
> -Jose
>


-- 
David Arthur


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-15 Thread José Armando García Sancio
On Fri, Oct 15, 2021 at 7:24 AM David Arthur  wrote:
> Hmm. So I think you are proposing the following flow:
> > 1. Cluster metadata partition replicas establish a quorum using
> > ApiVersions and the KRaft protocol.
> > 2. Inactive controllers send a registration RPC to the active controller.
> > 3. The active controller persists this information to the metadata log.
>
>
> What happens if the inactive controllers send a metadata.version range
> > that is not compatible with the metadata.version set for the cluster?
>
>
> As we discussed offline, we don't need the explicit registration step. Once
> a controller has joined the quorum, it will learn about the finalized
> "metadata.version" level once it reads that record.

How does the active controller know what is a valid `metadata.version`
to persist? Could the active controller learn this from the
ApiVersions response from all of the inactive controllers? For
example, let's say that we have a cluster that only has remote
controllers, what are the valid metadata.version in that case?

> If it encounters a
> version it can't support it should probably shutdown since it might not be
> able to process any more records.

I think that makes sense. If a controller cannot replay the metadata
log, it might as well not be part of the quorum. If the cluster
continues in this state it won't guarantee availability based on the
replication factor.

Thanks
-- 
-Jose


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-15 Thread David Arthur
Hey everyone, I've updated the KIP with a few items we've discussed so far:

101. Change AllowDowngrade bool to DowngradeType int8 in
UpgradeFeatureRequest RPC. I'm wondering if we can kind of "cheat" on this
incompatible change since it's not currently in use and totally remove the
old field and leave the versions at 0+. Thoughts?

102. Add section on metadata.version initialization. This includes a new
option to kafka-storage.sh to let the operator set the initial version

103. Add section on behavior of controllers and brokers if they encounter
an unsupported version

Thanks for the great discussion so far!
-David

On Fri, Oct 15, 2021 at 10:23 AM David Arthur  wrote:

> Jose,
>
> Are you saying that for metadata.version X different software versions
>> could generate different snapshots? If so, I would consider this an
>> implementation bug, no? The format and content of a snapshot is a
>> public API that needs to be supported across software versions.
>
>
> I agree this would be a bug. This is probably a non-issue since we will
> recommend brokers to be at the same software version before performing an
> upgrade or downgrade. That eliminates the possibility that brokers could
> generate different snapshots.
>
> Hmm. So I think you are proposing the following flow:
>> 1. Cluster metadata partition replicas establish a quorum using
>> ApiVersions and the KRaft protocol.
>> 2. Inactive controllers send a registration RPC to the active controller.
>> 3. The active controller persists this information to the metadata log.
>
>
> What happens if the inactive controllers send a metadata.version range
>> that is not compatible with the metadata.version set for the cluster?
>
>
> As we discussed offline, we don't need the explicit registration step.
> Once a controller has joined the quorum, it will learn about the finalized
> "metadata.version" level once it reads that record. If it encounters a
> version it can't support it should probably shutdown since it might not be
> able to process any more records.
>
> However, this does lead to a race condition when a controller is catching
> up on the metadata log. Let's say a controller comes up and fetches the
> following records from the leader:
>
> > [M1, M2, M3, FeatureLevelRecord]
>
> Between the time the controller starts handling requests and it processes
> the FeatureLevelRecord, it could report an old value of metadata.version
> when handling ApiVersionsRequest. In the brokers, I believe we delay
> starting the request handler threads until the broker has been unfenced
> (meaning, it is reasonably caught up). We might need something similar for
> controllers.
>
> -David
>
> On Thu, Oct 14, 2021 at 9:29 PM David Arthur  wrote:
>
>> Kowshik, thanks for the review!
>>
>> 7001. An enum sounds like a good idea here. Especially since setting
>> Upgrade=false and Force=true doesn't really make sense. An enum would avoid
>> confusing/invalid combinations of flags
>>
>> 7002. How about adding --force-downgrade as an alternative to the
>> --downgrade argument? So, it would take the same arguments (list of
>> feature:version), but use the DOWNGRADE_UNSAFE option in the RPC.
>>
>> 7003. Yes we will need the advanced CLI since we will need to only modify
>> the "metadata.version" FF. I was kind of wondering if we might want
>> separate sub-commands for the different operations instead of all under
>> "update". E.g., "kafka-features.sh
>> upgrade|downgrade|force-downgrade|delete|describe".
>>
>> 7004/7005. The intention in this KIP is that we bump the
>> "metadata.version" liberally and that most upgrades are backwards
>> compatible. We're relying on this for feature gating as well as indicating
>> compatibility. The enum is indeed an implementation detail that is enforced
>> by the controller when handling UpdateFeaturesRequest. As for lossy
>> downgrades, this really only applies to the metadata log as we will lose
>> some fields and records when downgrading to an older version. This is
>> useful as an escape hatch for cases when a software upgrade has occurred,
>> the feature flag was increased, and then bugs are discovered. Without the
>> lossy downgrade scenario, we have no path back to a previous software
>> version.
>>
>> As for the min/max finalized version, I'm not totally clear on cases
>> where these would differ. I think for "metadata.version" we just want a
>> single finalized version for the whole cluster (like we do for IBP).
>>
>> -David
>>
>>
>> On Thu, Oct 14, 2021 at 1:59 PM José Armando García Sancio
>>  wrote:
>>
>>> On Tue, Oct 12, 2021 at 10:57 AM Colin McCabe 
>>> wrote:
>>> > > 11. For downgrades, it would be useful to describe how to determine
>>> the
>>> > > downgrade process (generating new snapshot, propagating the
>>> snapshot, etc)
>>> > > has completed. We could block the UpdateFeature request until the
>>> process
>>> > > is completed. However, since the process could take time, the
>>> request could
>>> > > time out. Another way is 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-15 Thread David Arthur
Jose,

Are you saying that for metadata.version X different software versions
> could generate different snapshots? If so, I would consider this an
> implementation bug, no? The format and content of a snapshot is a
> public API that needs to be supported across software versions.


I agree this would be a bug. This is probably a non-issue since we will
recommend brokers to be at the same software version before performing an
upgrade or downgrade. That eliminates the possibility that brokers could
generate different snapshots.

Hmm. So I think you are proposing the following flow:
> 1. Cluster metadata partition replicas establish a quorum using
> ApiVersions and the KRaft protocol.
> 2. Inactive controllers send a registration RPC to the active controller.
> 3. The active controller persists this information to the metadata log.


What happens if the inactive controllers send a metadata.version range
> that is not compatible with the metadata.version set for the cluster?


As we discussed offline, we don't need the explicit registration step. Once
a controller has joined the quorum, it will learn about the finalized
"metadata.version" level once it reads that record. If it encounters a
version it can't support it should probably shutdown since it might not be
able to process any more records.

However, this does lead to a race condition when a controller is catching
up on the metadata log. Let's say a controller comes up and fetches the
following records from the leader:

> [M1, M2, M3, FeatureLevelRecord]

Between the time the controller starts handling requests and it processes
the FeatureLevelRecord, it could report an old value of metadata.version
when handling ApiVersionsRequest. In the brokers, I believe we delay
starting the request handler threads until the broker has been unfenced
(meaning, it is reasonably caught up). We might need something similar for
controllers.
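
As a sketch of that "something similar" (purely hypothetical, not the actual
controller code), the controller could withhold its finalized
metadata.version until it has replayed up to the high watermark it saw at
startup:

import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a controller withholds its finalized metadata.version
// (e.g. in ApiVersionsResponse) until it has replayed the log up to the high
// watermark it observed at startup, so it never reports a stale level.
public class CaughtUpGate {
    private final long startupHighWatermark;
    private final AtomicLong lastAppliedOffset = new AtomicLong(-1L);
    private volatile short finalizedMetadataVersion = -1; // -1 means not yet known

    CaughtUpGate(long startupHighWatermark) {
        this.startupHighWatermark = startupHighWatermark;
    }

    void recordApplied(long offset, Short maybeNewMetadataVersion) {
        if (maybeNewMetadataVersion != null) {
            finalizedMetadataVersion = maybeNewMetadataVersion; // FeatureLevelRecord seen
        }
        lastAppliedOffset.set(offset);
    }

    // Only advertise a finalized level once we are reasonably caught up.
    Optional<Short> advertisedMetadataVersion() {
        if (lastAppliedOffset.get() >= startupHighWatermark && finalizedMetadataVersion >= 0) {
            return Optional.of(finalizedMetadataVersion);
        }
        return Optional.empty();
    }
}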

-David

On Thu, Oct 14, 2021 at 9:29 PM David Arthur  wrote:

> Kowshik, thanks for the review!
>
> 7001. An enum sounds like a good idea here. Especially since setting
> Upgrade=false and Force=true doesn't really make sense. An enum would avoid
> confusing/invalid combinations of flags
>
> 7002. How about adding --force-downgrade as an alternative to the
> --downgrade argument? So, it would take the same arguments (list of
> feature:version), but use the DOWNGRADE_UNSAFE option in the RPC.
>
> 7003. Yes we will need the advanced CLI since we will need to only modify
> the "metadata.version" FF. I was kind of wondering if we might want
> separate sub-commands for the different operations instead of all under
> "update". E.g., "kafka-features.sh
> upgrade|downgrade|force-downgrade|delete|describe".
>
> 7004/7005. The intention in this KIP is that we bump the
> "metadata.version" liberally and that most upgrades are backwards
> compatible. We're relying on this for feature gating as well as indicating
> compatibility. The enum is indeed an implementation detail that is enforced
> by the controller when handling UpdateFeaturesRequest. As for lossy
> downgrades, this really only applies to the metadata log as we will lose
> some fields and records when downgrading to an older version. This is
> useful as an escape hatch for cases when a software upgrade has occurred,
> the feature flag was increased, and then bugs are discovered. Without the
> lossy downgrade scenario, we have no path back to a previous software
> version.
>
> As for the min/max finalized version, I'm not totally clear on cases where
> these would differ. I think for "metadata.version" we just want a single
> finalized version for the whole cluster (like we do for IBP).
>
> -David
>
>
> On Thu, Oct 14, 2021 at 1:59 PM José Armando García Sancio
>  wrote:
>
>> On Tue, Oct 12, 2021 at 10:57 AM Colin McCabe  wrote:
>> > > 11. For downgrades, it would be useful to describe how to determine
>> the
>> > > downgrade process (generating new snapshot, propagating the snapshot,
>> etc)
>> > > has completed. We could block the UpdateFeature request until the
>> process
>> > > is completed. However, since the process could take time, the request
>> could
>> > > time out. Another way is through DescribeFeature and the server only
>> > > reports downgraded versions after the process is completed.
>> >
>> > Hmm.. I think we need to avoid blocking, since we don't know how long
>> it will take for all nodes to act on the downgrade request. After all, some
>> nodes may be down.
>> >
>> > But I agree we should have some way of knowing when the upgrade is
>> done. DescribeClusterResponse seems like the natural place to put
>> information about each node's feature level. While we're at it, we should
>> also add a boolean to indicate whether the given node is fenced. (This will
>> always be false for ZK mode, of course...)
>> >
>>
>> I agree. I think from the user's point of view, they would like to
>> know if it is safe to downgrade the software version of a specific
>> broker or controller. I think that it is safe to downgrade a broker or
>> controller when the metadata.version has been downgraded and the node
>> has generated a snapshot with the downgraded metadata.version.

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-14 Thread David Arthur
Kowshik, thanks for the review!

7001. An enum sounds like a good idea here. Especially since setting
Upgrade=false and Force=true doesn't really make sense. An enum would avoid
confusing/invalid combinations of flags

7002. How about adding --force-downgrade as an alternative to the
--downgrade argument? So, it would take the same arguments (list of
feature:version), but use the DOWNGRADE_UNSAFE option in the RPC.

7003. Yes we will need the advanced CLI since we will need to only modify
the "metadata.version" FF. I was kind of wondering if we might want
separate sub-commands for the different operations instead of all under
"update". E.g., "kafka-features.sh
upgrade|downgrade|force-downgrade|delete|describe".

7004/7005. The intention in this KIP is that we bump the "metadata.version"
liberally and that most upgrades are backwards compatible. We're relying on
this for feature gating as well as indicating compatibility. The enum is
indeed an implementation detail that is enforced by the controller when
handling UpdateFeaturesRequest. As for lossy downgrades, this really only
applies to the metadata log as we will lose some fields and records when
downgrading to an older version. This is useful as an escape hatch for
cases when a software upgrade has occurred, the feature flag was increased,
and then bugs are discovered. Without the lossy downgrade scenario, we have
no path back to a previous software version.

As for the min/max finalized version, I'm not totally clear on cases where
these would differ. I think for "metadata.version" we just want a single
finalized version for the whole cluster (like we do for IBP).

-David


On Thu, Oct 14, 2021 at 1:59 PM José Armando García Sancio
 wrote:

> On Tue, Oct 12, 2021 at 10:57 AM Colin McCabe  wrote:
> > > 11. For downgrades, it would be useful to describe how to determine the
> > > downgrade process (generating new snapshot, propagating the snapshot,
> etc)
> > > has completed. We could block the UpdateFeature request until the
> process
> > > is completed. However, since the process could take time, the request
> could
> > > time out. Another way is through DescribeFeature and the server only
> > > reports downgraded versions after the process is completed.
> >
> > Hmm.. I think we need to avoid blocking, since we don't know how long it
> will take for all nodes to act on the downgrade request. After all, some
> nodes may be down.
> >
> > But I agree we should have some way of knowing when the upgrade is done.
> DescribeClusterResponse seems like the natural place to put information
> about each node's feature level. While we're at it, we should also add a
> boolean to indicate whether the given node is fenced. (This will always be
> false for ZK mode, of course...)
> >
>
> I agree. I think from the user's point of view, they would like to
> know if it is safe to downgrade the software version of a specific
> broker or controller. I think that it is safe to downgrade a broker or
> controller when the metadata.version has been downgraded and the node
> has generated a snapshot with the downgraded metadata.version.
>
> --
> -Jose
>


-- 
David Arthur


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-14 Thread José Armando García Sancio
On Tue, Oct 12, 2021 at 10:57 AM Colin McCabe  wrote:
> > 11. For downgrades, it would be useful to describe how to determine the
> > downgrade process (generating new snapshot, propagating the snapshot, etc)
> > has completed. We could block the UpdateFeature request until the process
> > is completed. However, since the process could take time, the request could
> > time out. Another way is through DescribeFeature and the server only
> > reports downgraded versions after the process is completed.
>
> Hmm.. I think we need to avoid blocking, since we don't know how long it will 
> take for all nodes to act on the downgrade request. After all, some nodes may 
> be down.
>
> But I agree we should have some way of knowing when the upgrade is done. 
> DescribeClusterResponse seems like the natural place to put information about 
> each node's feature level. While we're at it, we should also add a boolean to 
> indicate whether the given node is fenced. (This will always be false for ZK 
> mode, of course...)
>

I agree. I think from the user's point of view, they would like to
know if it is safe to downgrade the software version of a specific
broker or controller. I think that it is safe to downgrade a broker or
controller when the metadata.version has been downgraded and the node
has generated a snapshot with the downgraded metadata.version.
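
A sketch of that check, with hypothetical names:

// Hypothetical sketch: a node's software can be safely downgraded once the
// cluster-wide metadata.version has been lowered AND the node has produced a
// snapshot written at (or below) that downgraded level.
public class SafeToDowngrade {

    record NodeState(short finalizedMetadataVersion, short latestSnapshotMetadataVersion) {}

    static boolean safeToDowngradeSoftware(NodeState node, short targetMetadataVersion) {
        return node.finalizedMetadataVersion() <= targetMetadataVersion
            && node.latestSnapshotMetadataVersion() <= targetMetadataVersion;
    }
}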

-- 
-Jose


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-14 Thread José Armando García Sancio
On Thu, Oct 7, 2021 at 5:20 PM Jun Rao  wrote:
> 7. Jose, what control records were you referring?
>

Hey Jun, in KRaft we have 3 control records.
- LeaderChangeMessage - this is persisted in the replicated log when a
new leader gets elected and the epoch increases. We never included
this record in the snapshot so this KIP doesn't need to process these
control records.
- SnapshotFooterRecord and SnapshotHeaderRecord - these records are
control records at the end and beginning of the snapshot respectively.
They only exist in the snapshot and never in the replicated log. This
KIP needs to make sure that we downgrade these records.

Thanks!
-Jose


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-14 Thread José Armando García Sancio
On Tue, Oct 5, 2021 at 8:53 AM David Arthur  wrote:
> 2. Generate snapshot on downgrade
> > > Metadata snapshot is generated and sent to the other inactive
> > controllers and to brokers (this snapshot may be lossy!)
> > Why do we need to send this downgraded snapshot to the brokers? The
> > replicas have seen the FeatureLevelRecord and noticed the downgrade.
> > Can we have the replicas each independently generate a downgraded
> > snapshot at the offset for the downgraded FeatureLevelRecord? I assume
> > that the active controller will guarantee that all records after the
> > FeatureLevelRecord use the downgraded version. If so, it would be good
> > to mention that explicitly.
>
>
> Similar to above, yes a broker that detects a downgrade via
> FeatureLevelRecord could generate its own downgrade snapshot and reload its
> state from that. This does get a little fuzzy when we consider cases where
> brokers are on different software versions and could be generating a
> downgrade snapshot for version X, but using different versions of the code.
> It might be safer to let the controller generate the snapshot so each
> broker (regardless of software version) gets the same records. However, for
> upgrades (or downgrades) we expect the whole cluster to be running the same
> software version before triggering the metadata.version change, so perhaps
> this isn't a likely scenario. Thoughts?
>
>

Are you saying that for metadata.version X different software versions
could generate different snapshots? If so, I would consider this an
implementation bug, no? The format and content of a snapshot is a
public API that needs to be supported across software versions.

> 3. Max metadata version
> > >For the first release that supports metadata.version, we can simply
> > initialize metadata.version with the current (and only) version. For future
> > releases, we will need a mechanism to bootstrap a particular version. This
> > could be done using the meta.properties file or some similar mechanism. The
> > reason we need the allow for a specific initial version is to support the
> > use case of starting a Kafka cluster at version X with an older
> > metadata.version.
>
>
> I assume that the Active Controller will learn the metadata version of
> > the broker through the BrokerRegistrationRequest. How will the Active
> > Controller learn about the max metadata version of the inactive
> > controller nodes? We currently don't send a registration request from
> > the inactive controller to the active controller.
>
>
> This came up during the design, but I neglected to add it to the KIP. We
> will need a mechanism for determining the supported features of each
> controller similar to how brokers use BrokerRegistrationRequest. Perhaps
> controllers could write a FeatureLevelRecord (or similar) to the metadata
> log indicating their supported version. WDYT?
>

Hmm. So I think you are proposing the following flow:
1. Cluster metadata partition replicas establish a quorum using
ApiVersions and the KRaft protocol.
2. Inactive controllers send a registration RPC to the active controller.
3. The active controller persists this information to the metadata log.

What happens if the inactive controllers send a metadata.version range
that is not compatible with the metadata.version set for the cluster?

> Why do you need to bootstrap a particular version? Isn't the intent
> > that the broker will learn the active metadata version by reading the
> > metadata before unfencing?
>
>
> This bootstrapping is needed for when a KRaft cluster is first started. If
> we don't have this mechanism, the cluster can't really do anything until
> the operator finalizes the metadata.version with the tool. The
> bootstrapping will be done by the controller and the brokers will see this
> version as a record (like you say). I'll add some text to clarify this.
>
>

Got it. A new cluster will use the metadata.version of the first
active controller. The first active controller will persist this
information on the metadata log. All replicas (inactive controller and
brokers) will configure themselves based on this record.
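
A sketch of that bootstrap decision on the first active controller, assuming
an optional operator-provided initial version (e.g. via kafka-storage.sh /
meta.properties); names are hypothetical:

import java.util.Optional;

// Hypothetical sketch: when the first active controller finds no
// FeatureLevelRecord in the log, it persists an initial metadata.version,
// either an operator-provided value or the latest this software supports.
public class MetadataVersionBootstrap {

    static final short LATEST_SUPPORTED_METADATA_VERSION = 1;

    static short initialMetadataVersion(boolean featureLevelRecordExists,
                                        Optional<Short> operatorProvided,
                                        short currentFinalized) {
        if (featureLevelRecordExists) {
            return currentFinalized; // already bootstrapped, nothing to do
        }
        return operatorProvided.orElse(LATEST_SUPPORTED_METADATA_VERSION);
    }
}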

> 4. Reject Registration - This is related to the bullet point above.
> > What will be the behavior of the active controller if the broker sends
> > a metadata version that is not compatible with the cluster wide
> > metadata version?
>
>
> If a broker starts up with a lower supported version range than the current
> cluster metadata.version, it should log an error and shutdown. This is in
> line with KIP-584.
>

Okay. We need to extend this for the controller case.

Thanks
-- 
-Jose


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-13 Thread Kowshik Prakasam
Hi David,

Thanks for the great KIP! It's good to see KIP-584 is starting to get used.

A few comments below.

7001. In the UpdateFeaturesRequest definition, there is a newly introduced
ForceDowngrade parameter. There is also an existing AllowDowngrade
parameter. My understanding is that the AllowDowngrade flag could be set in
a FeatureUpdate whenever a downgrade is attempted via the
Admin.updateFeatures API. Then, the ForceDowngrade flag could be set if we
would like to ask the controller to proceed with the downgrade even if it
was deemed unsafe. Could we document the distinction between the two flags
in the KIP? If we can just keep one of the flags, that would simplify
things. But, as it stands it seems like we will need both flags. To avoid
confusions, does it make sense to deprecate the dual boolean flags and
instead change the updateFeatures API to use an enum parameter called
DOWNGRADE_REQUEST_TYPE with 3 values: NONE (default value meaning no
downgrade is allowed), DOWNGRADE_SAFE (ask the controller to downgrade if
the operation is safe) and DOWNGRADE_UNSAFE (ask the controller to
downgrade disregarding safety)?
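
Something along these lines, as a sketch (not the final RPC or Admin API):

// Sketch of the proposed single enum replacing the two boolean flags. The
// names follow the suggestion above; the final field name and values belong
// to the KIP, not to this sketch.
public enum DowngradeRequestType {
    NONE((byte) 0),             // default: no downgrade allowed
    DOWNGRADE_SAFE((byte) 1),   // downgrade only if the controller deems it safe
    DOWNGRADE_UNSAFE((byte) 2); // downgrade even if metadata will be lost

    private final byte code;

    DowngradeRequestType(byte code) {
        this.code = code;
    }

    public byte code() {
        return code; // int8 value carried in UpdateFeaturesRequest
    }
}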

7002. The KIP introduces ForceDowngrade flag into UpdateFeaturesRequest
definition. But the KIP also introduces a generic --force CLI flag in the
kafka-features.sh tool. Should we call the CLI flag instead as
--force-downgrade instead so that its intent is more specific?

7003. Currently the kafka-features.sh tool does not yet implement the
"Advanced CLI usage" explained in KIP-584. The associated jira
is KAFKA-10621. For the needs of this KIP, do you need the advanced CLI or
would the basic version work?

7004. KIP-584 feature flag max version is typically incremented to indicate
a breaking change. It usually means that version level downgrades would
break something. Could this KIP explain why it is useful to support a lossy
downgrade for the metadata.version feature flag? i.e. what are some
situations when a lossy downgrade is useful?

7005. Regarding downgrade, I read the `enum MetadataVersions` type
introduced in this KIP that captures the rules for backwards compatibility.
For my understanding, is this enum an implementation detail on the
controller specific to this feature flag? i.e. when processing the
UpdateFeaturesRequest, would the controller introduce a flag-specific logic
(based on the enum values) to decide whether a downgrade is allowed?
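
For what it's worth, here is the kind of flag-specific rule I would imagine,
purely as an illustration (the real enum and its compatibility rules are
whatever the KIP and implementation define):

// Purely illustrative sketch of flag-specific downgrade logic keyed off a
// MetadataVersions-style enum; levels and compatibility flags are made up.
public enum MetadataVersionsSketch {
    V1(1, true),
    V2(2, true),   // backwards compatible with the previous level
    V3(3, false);  // introduces records/fields older levels cannot represent

    private final int level;
    private final boolean backwardsCompatibleWithPrevious;

    MetadataVersionsSketch(int level, boolean backwardsCompatibleWithPrevious) {
        this.level = level;
        this.backwardsCompatibleWithPrevious = backwardsCompatibleWithPrevious;
    }

    // A safe (non-lossy) downgrade is allowed only if every level being
    // stepped over is backwards compatible; otherwise an unsafe downgrade
    // would be required.
    static boolean canSafelyDowngrade(MetadataVersionsSketch from, MetadataVersionsSketch to) {
        for (MetadataVersionsSketch v : values()) {
            if (v.level > to.level && v.level <= from.level && !v.backwardsCompatibleWithPrevious) {
                return false;
            }
        }
        return to.level <= from.level;
    }
}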


Cheers,
Kowshik



On Tue, Oct 12, 2021 at 2:13 PM David Arthur  wrote:

> Jun and Colin, thanks very much for the comments. See below
>
> 10. Colin, I agree that ApiVersionsRequest|Response is the most
> straightforward approach here
>
> 11. This brings up an interesting point. Once a UpdateFeature request is
> finished processing, a subsequent "describe features" (ApiVersionsRequest)
> made by the admin client will go to a random broker, and so might not
> reflect the feature update. We could add blocking behavior to the admin
> client so it would poll ApiVersions for some time until the expected
> version is reflected on that broker. However, this does not mean _every_
> broker has finished upgrading/downgrading -- just the one handling that
> request. Maybe we could have the admin client poll all the brokers until
> the expected version is seen.
>
> If at a later time a broker comes online that needs to process an
> upgrade/downgrade, I don't think there will be a problem since the broker
> will be fenced until it catches up on latest metadata (which will include
> the feature level).
>
> 12. Yes, we will need changes to the admin client for "updateFeatures".
> I'll update the KIP to reflect this.
>
> 13. I'll expand the paragraph on the initial "metadata.version" into its
> own section and add some detail.
>
> 14/15. As mentioned, I think we can avoid this and rely on
> brokers/controllers generating their own snapshots. We will probably want
> some kind of manual recovery mode where we can force load a snapshot, but
> that is out of scope here (I think..)
>
> 16. Automatic upgrades should be feasible, but I think we will want to
> start with manual upgrades (while we work out the design and fix bugs).
> Following the design detailed in this KIP, we could have a controller
> component that (as you suggest) automatically finalizes the feature to the
> max of all broker supported versions. I can include a section on this or we
> could defer to a future KIP. WDYT?
>
> -David
>
>
> On Tue, Oct 12, 2021 at 1:57 PM Colin McCabe  wrote:
>
> > On Thu, Oct 7, 2021, at 17:19, Jun Rao wrote:
> > > Hi, David,
> > >
> > > Thanks for the KIP. A few comments below.
> > >
> > > 10. It would be useful to describe how the controller node determines
> the
> > > RPC version used to communicate to other controller nodes. There seems
> to
> > > be a bootstrap problem. A controller node can't read the log and
> > > therefore the feature level until a quorum leader is elected. But
> leader
> > > election requires an RPC.
> > >
> >
> > Hi Jun,
> >
> > I agree that we need to figure this out. I think it would be best to
> > simply use ApiVersionsRequest and ApiVersionsResponse.

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-12 Thread David Arthur
Jun and Colin, thanks very much for the comments. See below

10. Colin, I agree that ApiVersionsRequest|Response is the most
straightforward approach here

11. This brings up an interesting point. Once a UpdateFeature request is
finished processing, a subsequent "describe features" (ApiVersionsRequest)
made by the admin client will go to a random broker, and so might not
reflect the feature update. We could add blocking behavior to the admin
client so it would poll ApiVersions for some time until the expected
version is reflected on that broker. However, this does not mean _every_
broker has finished upgrading/downgrading -- just the one handling that
request. Maybe we could have the admin client poll all the brokers until
the expected version is seen.
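
A sketch of that client-side poll using the existing KIP-584 Admin API
(describeFeatures talks to a single broker per call, so this only shows the
"poll until reflected" idea, not a cluster-wide guarantee; accessor names are
from memory):

import java.time.Duration;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.FinalizedVersionRange;

// Sketch: after updateFeatures() returns, poll describeFeatures() until the
// broker answering the request reports the expected finalized level (or we
// give up). This does not prove every broker has seen the change.
public class WaitForFeatureLevel {

    static boolean waitForFinalizedLevel(Admin admin, String feature, short expectedLevel,
                                         Duration timeout) throws Exception {
        long deadline = System.currentTimeMillis() + timeout.toMillis();
        while (System.currentTimeMillis() < deadline) {
            FinalizedVersionRange range =
                admin.describeFeatures().featureMetadata().get().finalizedFeatures().get(feature);
            if (range != null && range.maxVersionLevel() == expectedLevel) {
                return true;
            }
            Thread.sleep(1000);
        }
        return false;
    }
}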

If at a later time a broker comes online that needs to process an
upgrade/downgrade, I don't think there will be a problem since the broker
will be fenced until it catches up on latest metadata (which will include
the feature level).

12. Yes, we will need changes to the admin client for "updateFeatures".
I'll update the KIP to reflect this.

13. I'll expand the paragraph on the initial "metadata.version" into its
own section and add some detail.

14/15. As mentioned, I think we can avoid this and rely on
brokers/controllers generating their own snapshots. We will probably want
some kind of manual recovery mode where we can force load a snapshot, but
that is out of scope here (I think..)

16. Automatic upgrades should be feasible, but I think we will want to
start with manual upgrades (while we work out the design and fix bugs).
Following the design detailed in this KIP, we could have a controller
component that (as you suggest) automatically finalizes the feature to the
max of all broker supported versions. I can include a section on this or we
could defer to a future KIP. WDYT?

-David


On Tue, Oct 12, 2021 at 1:57 PM Colin McCabe  wrote:

> On Thu, Oct 7, 2021, at 17:19, Jun Rao wrote:
> > Hi, David,
> >
> > Thanks for the KIP. A few comments below.
> >
> > 10. It would be useful to describe how the controller node determines the
> > RPC version used to communicate to other controller nodes. There seems to
> > be a bootstrap problem. A controller node can't read the log and
> > therefore the feature level until a quorum leader is elected. But leader
> > election requires an RPC.
> >
>
> Hi Jun,
>
> I agree that we need to figure this out. I think it would be best to
> simply use ApiVersionsRequest and ApiVersionsResponse. That way each
> controller can use the appropriate RPC versions when communicating with
> each other controller. This would allow us to upgrade them one by one.
>
> > 11. For downgrades, it would be useful to describe how to determine the
> > downgrade process (generating new snapshot, propagating the snapshot,
> etc)
> > has completed. We could block the UpdateFeature request until the process
> > is completed. However, since the process could take time, the request
> could
> > time out. Another way is through DescribeFeature and the server only
> > reports downgraded versions after the process is completed.
>
> Hmm.. I think we need to avoid blocking, since we don't know how long it
> will take for all nodes to act on the downgrade request. After all, some
> nodes may be down.
>
> But I agree we should have some way of knowing when the upgrade is done.
> DescribeClusterResponse seems like the natural place to put information
> about each node's feature level. While we're at it, we should also add a
> boolean to indicate whether the given node is fenced. (This will always be
> false for ZK mode, of course...)
>
> >
> > 12. Since we are changing UpdateFeaturesRequest, do we need to change the
> > AdminClient api for updateFeatures too?
> >
> > 13. For the paragraph starting with "In the absence of an operator
> defined
> > value for metadata.version", in KIP-584, we described how to finalize
> > features with New cluster bootstrap. In that case, it's inconvenient for
> > the users to have to run an admin tool to finalize the version for each
> > feature. Instead, the system detects that the /features path is missing
> in
> > ZK and thus automatically finalizes every feature with the latest
> supported
> > version. Could we do something similar in the KRaft mode?
> >
>
> Yes, I think we need to have a section describing how this ties into
> creating new clusters. The simplest thing is probably to have the
> controller notice that there are no FeatureRecords at all, and then create
> a record for the latest metadata.version. This is analogous to how we
> assume the latest IBP if no IBP is configured.
>
> There is also the question of how to create a cluster that starts up with
> something other than the latest metadata.version. We could have a config
> for that, like initial.metadata.version, or pass a flag to the
> controllers... alternately, we could pass a flag to "kafka-storage.sh
> format".
>
> > 14. After the quorum leader 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-12 Thread Colin McCabe
On Wed, Sep 29, 2021, at 10:34, David Arthur wrote:
> Hey all,
>
> I'd like to start a discussion around upgrades for KRaft. This design does
> not cover any of the ZooKeeper to KRaft migration (which will be covered in
> a future KIP).
>
> The proposal includes the addition of a "metadata.version" feature flag and
> deprecates the IBP. A procedure for downgrades is also included in the KIP.
>
> https://cwiki.apache.org/confluence/x/WAxACw
>
> Thanks!
> David

Hi David,

Thanks for the KIP! It's good to see us finding a use for the feature flag 
mechanism.

Since this is the first use of the mechanism, I think we should take some time 
to fix some issues with the original KIP, before the design gets "set in 
stone," so to speak.

1. It is not necessary to have two version numbers for each feature flag. I 
believe the rationale for adding this is that a few features would find it 
useful. However, those features can just define two flags (x_min and x_max). So 
I think each feature should have only one feature level number rather than a 
min and a max. This will simplify a lot of things (including your KIP...)

2. kafka-features.sh should use the more modern Argparse4J format rather than 
the old school "everything is a separate flag" format. This would allow us to 
have subcommands. For example, we could have an "upgrade" subcommand, a "nodes" 
subcommand, etc.

Check out kafka-storage.sh for an example of how this can work. There are three 
subcommands (info, format, random-uuid) and they each have a separate set of 
flags that they take. For example, "kafka-storage.sh format" takes an 
--ignore-formatted flag, but this is not valid for "kafka-storage.sh info".
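
To make that concrete, a minimal argparse4j sketch with subcommands (the subcommand and flag names are just illustrative, not the final CLI):

import net.sourceforge.argparse4j.ArgumentParsers;
import net.sourceforge.argparse4j.impl.Arguments;
import net.sourceforge.argparse4j.inf.ArgumentParser;
import net.sourceforge.argparse4j.inf.Namespace;
import net.sourceforge.argparse4j.inf.Subparser;
import net.sourceforge.argparse4j.inf.Subparsers;

public class FeaturesCommandSketch {
    public static void main(String[] args) {
        ArgumentParser parser = ArgumentParsers
            .newArgumentParser("kafka-features")
            .defaultHelp(true)
            .description("Manage feature flags on a Kafka cluster");

        Subparsers subparsers = parser.addSubparsers().dest("command");

        subparsers.addParser("describe")
            .help("Describe supported and finalized features");

        Subparser upgrade = subparsers.addParser("upgrade")
            .help("Finalize features at a higher version level");
        upgrade.addArgument("--feature").action(Arguments.append())
            .help("Name of a feature to upgrade (repeatable)");
        upgrade.addArgument("--version").action(Arguments.append())
            .help("Target level for the corresponding --feature (repeatable)");
        upgrade.addArgument("--release")
            .help("Upgrade all features to the defaults of this release");

        Namespace ns = parser.parseArgsOrFail(args);
        System.out.println("subcommand: " + ns.getString("command"));
    }
}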

best,
Colin


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-12 Thread Colin McCabe
On Thu, Oct 7, 2021, at 17:19, Jun Rao wrote:
> Hi, David,
>
> Thanks for the KIP. A few comments below.
>
> 10. It would be useful to describe how the controller node determines the
> RPC version used to communicate to other controller nodes. There seems to
> be a bootstrap problem. A controller node can't read the log and
> therefore the feature level until a quorum leader is elected. But leader
> election requires an RPC.
>

Hi Jun,

I agree that we need to figure this out. I think it would be best to simply use 
ApiVersionsRequest and ApiVersionsResponse. That way each controller can use 
the appropriate RPC versions when communicating with each other controller. 
This would allow us to upgrade them one by one.
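
In other words, for each API both sides advertise a version range, and the caller uses the highest version in the overlap. A simplified sketch (stand-in types, not the actual ApiVersions plumbing):

import java.util.OptionalInt;

final class RpcVersionNegotiation {
    record VersionRange(short min, short max) {}

    // Version to use for one API after an ApiVersions exchange, or empty if
    // the two controllers have no version in common for that API.
    static OptionalInt negotiate(VersionRange local, VersionRange peer) {
        short min = (short) Math.max(local.min(), peer.min());
        short max = (short) Math.min(local.max(), peer.max());
        return min <= max ? OptionalInt.of(max) : OptionalInt.empty();
    }
}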

> 11. For downgrades, it would be useful to describe how to determine the
> downgrade process (generating new snapshot, propagating the snapshot, etc)
> has completed. We could block the UpdateFeature request until the process
> is completed. However, since the process could take time, the request could
> time out. Another way is through DescribeFeature and the server only
> reports downgraded versions after the process is completed.

Hmm.. I think we need to avoid blocking, since we don't know how long it will 
take for all nodes to act on the downgrade request. After all, some nodes may 
be down.

But I agree we should have some way of knowing when the upgrade is done. 
DescribeClusterResponse seems like the natural place to put information about 
each node's feature level. While we're at it, we should also add a boolean to 
indicate whether the given node is fenced. (This will always be false for ZK 
mode, of course...)
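
Roughly, the per-node information could look like this (illustrative shape only, not the real DescribeClusterResponse schema):

import java.util.Collection;
import java.util.Map;

final class DescribeClusterNodeInfo {
    // Hypothetical per-node fields: finalized feature levels plus a fenced flag.
    record NodeFeatureState(int nodeId,
                            Map<String, Short> finalizedFeatureLevels,
                            boolean fenced) {}

    // An upgrade could be considered complete once every node reports the
    // expected finalized level for the feature.
    static boolean upgradeComplete(Collection<NodeFeatureState> nodes,
                                   String feature, short expected) {
        return nodes.stream().allMatch(n ->
            n.finalizedFeatureLevels().getOrDefault(feature, (short) -1) >= expected);
    }
}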

>
> 12. Since we are changing UpdateFeaturesRequest, do we need to change the
> AdminClient api for updateFeatures too?
>
> 13. For the paragraph starting with "In the absence of an operator defined
> value for metadata.version", in KIP-584, we described how to finalize
> features with New cluster bootstrap. In that case, it's inconvenient for
> the users to have to run an admin tool to finalize the version for each
> feature. Instead, the system detects that the /features path is missing in
> ZK and thus automatically finalizes every feature with the latest supported
> version. Could we do something similar in the KRaft mode?
>

Yes, I think we need to have a section describing how this ties into creating 
new clusters. The simplest thing is probably to have the controller notice that 
there are no FeatureRecords at all, and then create a record for the latest 
metadata.version. This is analogous to how we assume the latest IBP if no IBP 
is configured.

There is also the question of how to create a cluster that starts up with 
something other than the latest metadata.version. We could have a config for 
that, like initial.metadata.version, or pass a flag to the controllers... 
alternately, we could pass a flag to "kafka-storage.sh format".
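
Sketched out, the bootstrap check could be as simple as this (illustrative types, not the real controller classes):

final class MetadataVersionBootstrap {
    interface FeatureControlState {
        Short finalizedLevel(String feature);                  // null if no record exists
        void writeFeatureLevelRecord(String feature, short level);
    }

    // If the metadata log has no finalized level for metadata.version, the
    // active controller writes a FeatureLevelRecord for the latest version it
    // supports, mirroring the "assume latest IBP" behavior.
    static void maybeBootstrap(FeatureControlState state, short latestSupported) {
        if (state.finalizedLevel("metadata.version") == null) {
            state.writeFeatureLevelRecord("metadata.version", latestSupported);
        }
    }
}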

> 14. After the quorum leader generates a new snapshot, how do we force other
> nodes to pick up the new snapshot?
>
> 15. I agree with Jose that it will be useful to describe when generating a
> new snapshot is needed. To me, it seems the new snapshot is only needed
> when incompatible changes are made.
>

I think it would be good to always generate a snapshot right before the 
upgrade. Then, if the upgrade goes wrong, we have a metadata state we could 
revert back to, albeit with some effort and potential data loss. But, I agree 
that the rationale for this should be spelled out in the KIP.

I also think that the brokers should generate their own snapshots rather than 
fetching from the controller, both in the upgrade and downgrade case. Jose 
mentioned this earlier and I agree.

best,
Colin

> 7. Jose, what control records were you referring to?
>
> Thanks,
>
> Jun
>
>
> On Tue, Oct 5, 2021 at 8:53 AM David Arthur  wrote:
>
>> Jose, thanks for the thorough review and comments!
>>
>> I am out of the office until next week, so I probably won't be able to
>> update the KIP until then. Here are some replies to your questions:
>>
>> 1. Generate snapshot on upgrade
>> > > Metadata snapshot is generated and sent to the other nodes
>> > Why does the Active Controller need to generate a new snapshot and
>> > force a snapshot fetch from the replicas (inactive controller and
>> > brokers) on an upgrade? Isn't writing the FeatureLevelRecord good
>> > enough to communicate the upgrade to the replicas?
>>
>>
>> You're right, we don't necessarily need to _transmit_ a snapshot, since
>> each node can generate its own equivalent snapshot
>>
>> 2. Generate snapshot on downgrade
>> > > Metadata snapshot is generated and sent to the other inactive
>> > controllers and to brokers (this snapshot may be lossy!)
>> > Why do we need to send this downgraded snapshot to the brokers? The
>> > replicas have seen the FeatureLevelRecord and noticed the downgrade.
>> > Can we have the replicas each independently generate a downgraded
>> > snapshot at the offset for 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-12 Thread Jun Rao
Hi, David,

One more comment.

16. The main reason why KIP-584 requires finalizing a feature manually is
that in the ZK world, the controller doesn't know all brokers in a cluster.
A broker temporarily down is not registered in ZK. In the KRaft world, the
controller keeps track of all brokers, including those that are temporarily
down. This makes it possible for the controller to automatically finalize a
feature---it's safe to do so when all brokers support that feature. This
will make the upgrade process much simpler since no manual command is
required to turn on a new feature. Have we considered this?

Thanks,

Jun

On Thu, Oct 7, 2021 at 5:19 PM Jun Rao  wrote:

> Hi, David,
>
> Thanks for the KIP. A few comments below.
>
> 10. It would be useful to describe how the controller node determines the
> RPC version used to communicate to other controller nodes. There seems to
> be a bootstrap problem. A controller node can't read the log and
> therefore the feature level until a quorum leader is elected. But leader
> election requires an RPC.
>
> 11. For downgrades, it would be useful to describe how to determine the
> downgrade process (generating new snapshot, propagating the snapshot, etc)
> has completed. We could block the UpdateFeature request until the process
> is completed. However, since the process could take time, the request could
> time out. Another way is through DescribeFeature and the server only
> reports downgraded versions after the process is completed.
>
> 12. Since we are changing UpdateFeaturesRequest, do we need to change the
> AdminClient api for updateFeatures too?
>
> 13. For the paragraph starting with "In the absence of an operator
> defined value for metadata.version", in KIP-584, we described how to
> finalize features with New cluster bootstrap. In that case, it's
> inconvenient for the users to have to run an admin tool to finalize the
> version for each feature. Instead, the system detects that the /features
> path is missing in ZK and thus automatically finalizes every feature with
> the latest supported version. Could we do something similar in the KRaft
> mode?
>
> 14. After the quorum leader generates a new snapshot, how do we force
> other nodes to pick up the new snapshot?
>
> 15. I agree with Jose that it will be useful to describe when generating a
> new snapshot is needed. To me, it seems the new snapshot is only needed
> when incompatible changes are made.
>
> 7. Jose, what control records were you referring to?
>
> Thanks,
>
> Jun
>
>
> On Tue, Oct 5, 2021 at 8:53 AM David Arthur 
> wrote:
>
>> Jose, thanks for the thorough review and comments!
>>
>> I am out of the office until next week, so I probably won't be able to
>> update the KIP until then. Here are some replies to your questions:
>>
>> 1. Generate snapshot on upgrade
>> > > Metadata snapshot is generated and sent to the other nodes
>> > Why does the Active Controller need to generate a new snapshot and
>> > force a snapshot fetch from the replicas (inactive controller and
>> > brokers) on an upgrade? Isn't writing the FeatureLevelRecord good
>> > enough to communicate the upgrade to the replicas?
>>
>>
>> You're right, we don't necessarily need to _transmit_ a snapshot, since
>> each node can generate its own equivalent snapshot
>>
>> 2. Generate snapshot on downgrade
>> > > Metadata snapshot is generated and sent to the other inactive
>> > controllers and to brokers (this snapshot may be lossy!)
>> > Why do we need to send this downgraded snapshot to the brokers? The
>> > replicas have seen the FeatureLevelRecord and noticed the downgrade.
>> > Can we have the replicas each independently generate a downgraded
>> > snapshot at the offset for the downgraded FeatureLevelRecord? I assume
>> > that the active controller will guarantee that all records after the
> > FeatureLevelRecord use the downgraded version. If so, it would be good
>> > to mention that explicitly.
>>
>>
>> Similar to above, yes a broker that detects a downgrade via
>> FeatureLevelRecord could generate its own downgrade snapshot and reload
>> its
>> state from that. This does get a little fuzzy when we consider cases where
>> brokers are on different software versions and could be generating a
>> downgrade snapshot for version X, but using different versions of the
>> code.
>> It might be safer to let the controller generate the snapshot so each
>> broker (regardless of software version) gets the same records. However,
>> for
>> upgrades (or downgrades) we expect the whole cluster to be running the
>> same
>> software version before triggering the metadata.version change, so perhaps
>> this isn't a likely scenario. Thoughts?
>>
>>
>> 3. Max metadata version
>> > >For the first release that supports metadata.version, we can simply
>> > initialize metadata.version with the current (and only) version. For
>> future
>> > releases, we will need a mechanism to bootstrap a particular version.
>> This
>> > could be done using the meta.properties 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-07 Thread Jun Rao
Hi, David,

Thanks for the KIP. A few comments below.

10. It would be useful to describe how the controller node determines the
RPC version used to communicate to other controller nodes. There seems to
be a bootstrap problem. A controller node can't read the log and
therefore the feature level until a quorum leader is elected. But leader
election requires an RPC.

11. For downgrades, it would be useful to describe how to determine the
downgrade process (generating new snapshot, propagating the snapshot, etc)
has completed. We could block the UpdateFeature request until the process
is completed. However, since the process could take time, the request could
time out. Another way is through DescribeFeature and the server only
reports downgraded versions after the process is completed.

12. Since we are changing UpdateFeaturesRequest, do we need to change the
AdminClient api for updateFeatures too?

13. For the paragraph starting with "In the absence of an operator defined
value for metadata.version", in KIP-584, we described how to finalize
features with New cluster bootstrap. In that case, it's inconvenient for
the users to have to run an admin tool to finalize the version for each
feature. Instead, the system detects that the /features path is missing in
ZK and thus automatically finalizes every feature with the latest supported
version. Could we do something similar in the KRaft mode?

14. After the quorum leader generates a new snapshot, how do we force other
nodes to pick up the new snapshot?

15. I agree with Jose that it will be useful to describe when generating a
new snapshot is needed. To me, it seems the new snapshot is only needed
when incompatible changes are made.

7. Jose, what control records were you referring to?

Thanks,

Jun


On Tue, Oct 5, 2021 at 8:53 AM David Arthur  wrote:

> Jose, thanks for the thorough review and comments!
>
> I am out of the office until next week, so I probably won't be able to
> update the KIP until then. Here are some replies to your questions:
>
> 1. Generate snapshot on upgrade
> > > Metadata snapshot is generated and sent to the other nodes
> > Why does the Active Controller need to generate a new snapshot and
> > force a snapshot fetch from the replicas (inactive controller and
> > brokers) on an upgrade? Isn't writing the FeatureLevelRecord good
> > enough to communicate the upgrade to the replicas?
>
>
> You're right, we don't necessarily need to _transmit_ a snapshot, since
> each node can generate its own equivalent snapshot
>
> 2. Generate snapshot on downgrade
> > > Metadata snapshot is generated and sent to the other inactive
> > controllers and to brokers (this snapshot may be lossy!)
> > Why do we need to send this downgraded snapshot to the brokers? The
> > replicas have seen the FeatureLevelRecord and noticed the downgrade.
> > Can we have the replicas each independently generate a downgraded
> > snapshot at the offset for the downgraded FeatureLevelRecord? I assume
> > that the active controller will guarantee that all records after the
> > FeatureLevelRecord use the downgraded version. If so, it would be good
> > to mention that explicitly.
>
>
> Similar to above, yes a broker that detects a downgrade via
> FeatureLevelRecord could generate its own downgrade snapshot and reload its
> state from that. This does get a little fuzzy when we consider cases where
> brokers are on different software versions and could be generating a
> downgrade snapshot for version X, but using different versions of the code.
> It might be safer to let the controller generate the snapshot so each
> broker (regardless of software version) gets the same records. However, for
> upgrades (or downgrades) we expect the whole cluster to be running the same
> software version before triggering the metadata.version change, so perhaps
> this isn't a likely scenario. Thoughts?
>
>
> 3. Max metadata version
> > >For the first release that supports metadata.version, we can simply
> > initialize metadata.version with the current (and only) version. For
> future
> > releases, we will need a mechanism to bootstrap a particular version.
> This
> > could be done using the meta.properties file or some similar mechanism.
> The
> > reason we need to allow for a specific initial version is to support the
> > use case of starting a Kafka cluster at version X with an older
> > metadata.version.
>
>
> I assume that the Active Controller will learn the metadata version of
> > the broker through the BrokerRegistrationRequest. How will the Active
> > Controller learn about the max metadata version of the inactive
> > controller nodes? We currently don't send a registration request from
> > the inactive controller to the active controller.
>
>
> This came up during the design, but I neglected to add it to the KIP. We
> will need a mechanism for determining the supported features of each
> controller similar to how brokers use BrokerRegistrationRequest. Perhaps
> controllers could write a 

Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-10-05 Thread David Arthur
Jose, thanks for the thorough review and comments!

I am out of the office until next week, so I probably won't be able to
update the KIP until then. Here are some replies to your questions:

1. Generate snapshot on upgrade
> > Metadata snapshot is generated and sent to the other nodes
> Why does the Active Controller need to generate a new snapshot and
> force a snapshot fetch from the replicas (inactive controller and
> brokers) on an upgrade? Isn't writing the FeatureLevelRecord good
> enough to communicate the upgrade to the replicas?


You're right, we don't necessarily need to _transmit_ a snapshot, since
each node can generate its own equivalent snapshot

2. Generate snapshot on downgrade
> > Metadata snapshot is generated and sent to the other inactive
> controllers and to brokers (this snapshot may be lossy!)
> Why do we need to send this downgraded snapshot to the brokers? The
> replicas have seen the FeatureLevelRecord and noticed the downgrade.
> Can we have the replicas each independently generate a downgraded
> snapshot at the offset for the downgraded FeatureLevelRecord? I assume
> that the active controller will guarantee that all records after the
> FeatureLevelRecord use the downgraded version. If so, it would be good
> to mention that explicitly.


Similar to above, yes a broker that detects a downgrade via
FeatureLevelRecord could generate its own downgrade snapshot and reload its
state from that. This does get a little fuzzy when we consider cases where
brokers are on different software versions and could be generating a
downgrade snapshot for version X, but using different versions of the code.
It might be safer to let the controller generate the snapshot so each
broker (regardless of software version) gets the same records. However, for
upgrades (or downgrades) we expect the whole cluster to be running the same
software version before triggering the metadata.version change, so perhaps
this isn't a likely scenario. Thoughts?
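
For reference, the per-replica version of this would look roughly like the following (illustrative types, not the real metadata image classes): on seeing a FeatureLevelRecord that lowers metadata.version, the replica writes a snapshot at that offset, dropping or down-converting any records the older version cannot represent, which is why the snapshot may be lossy.

import java.util.ArrayList;
import java.util.List;

final class DowngradeSnapshot {
    interface MetadataRecord {
        boolean supportedAt(short metadataVersion);
        MetadataRecord downConvert(short metadataVersion); // may lose fields; null if droppable
    }

    static List<MetadataRecord> buildSnapshotRecords(List<MetadataRecord> currentImage,
                                                     short targetVersion) {
        List<MetadataRecord> out = new ArrayList<>();
        for (MetadataRecord record : currentImage) {
            if (record.supportedAt(targetVersion)) {
                out.add(record);
            } else {
                MetadataRecord converted = record.downConvert(targetVersion);
                if (converted != null) {
                    out.add(converted); // null means the record is simply dropped
                }
            }
        }
        return out;
    }
}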


3. Max metadata version
> >For the first release that supports metadata.version, we can simply
> initialize metadata.version with the current (and only) version. For future
> releases, we will need a mechanism to bootstrap a particular version. This
> could be done using the meta.properties file or some similar mechanism. The
> reason we need to allow for a specific initial version is to support the
> use case of starting a Kafka cluster at version X with an older
> metadata.version.


I assume that the Active Controller will learn the metadata version of
> the broker through the BrokerRegistrationRequest. How will the Active
> Controller learn about the max metadata version of the inactive
> controller nodes? We currently don't send a registration request from
> the inactive controller to the active controller.


This came up during the design, but I neglected to add it to the KIP. We
will need a mechanism for determining the supported features of each
controller similar to how brokers use BrokerRegistrationRequest. Perhaps
controllers could write a FeatureLevelRecord (or similar) to the metadata
log indicating their supported version. WDYT?

Why do you need to bootstrap a particular version? Isn't the intent
> that the broker will learn the active metadata version by reading the
> metadata before unfencing?


This bootstrapping is needed for when a KRaft cluster is first started. If
we don't have this mechanism, the cluster can't really do anything until
the operator finalizes the metadata.version with the tool. The
bootstrapping will be done by the controller and the brokers will see this
version as a record (like you say). I'll add some text to clarify this.
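
Illustratively, the initial version could be resolved at first start something like this (the meta.properties key name here is hypothetical, nothing the KIP has settled on), falling back to the latest supported version:

import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

final class InitialMetadataVersion {
    // Hypothetical key "initial.metadata.version" in meta.properties; if it is
    // absent we assume the latest supported version, matching the default
    // bootstrap behavior described above.
    static short resolve(String metaPropertiesPath, short latestSupported) throws IOException {
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream(metaPropertiesPath)) {
            props.load(in);
        }
        String configured = props.getProperty("initial.metadata.version");
        return configured == null ? latestSupported : Short.parseShort(configured);
    }
}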


4. Reject Registration - This is related to the bullet point above.
> What will be the behavior of the active controller if the broker sends
> a metadata version that is not compatible with the cluster wide
> metadata version?


If a broker starts up with a lower supported version range than the current
cluster metadata.version, it should log an error and shut down. This is in
line with KIP-584.
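
The check itself is simple; a sketch with illustrative types:

final class StartupCompatibilityCheck {
    record SupportedRange(short min, short max) {
        boolean contains(short level) {
            return level >= min && level <= max;
        }
    }

    // If the broker's supported range for metadata.version does not include the
    // cluster's finalized level, log an error and exit rather than join.
    static void checkOrExit(SupportedRange brokerSupport, short clusterFinalized) {
        if (!brokerSupport.contains(clusterFinalized)) {
            System.err.printf("metadata.version %d is outside this broker's supported range [%d, %d]; shutting down%n",
                clusterFinalized, brokerSupport.min(), brokerSupport.max());
            System.exit(1);
        }
    }
}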

5. Discover upgrade
> > This snapshot will be a convenient way to let broker and controller
> components rebuild their entire in-memory state following an upgrade.
> Can we rely on the presence of the FeatureLevelRecord for the metadata
> version for this functionality? If so, it avoids having to reload the
> snapshot.


For upgrades, yes probably since we won't need to "rewrite" any records in
this case. For downgrades, we will need to generate the snapshot and reload
everything.

6. Metadata version specification
> >  V4(version=4, isBackwardsCompatible=false, description="New metadata
> record type Bar"),


Very cool. Do you have plans to generate Apache Kafka HTML
> documentation for this information? Would be helpful to display this
> information to the user using the kafka-features.sh and feature RPC?


Hm good idea :) I'll add a brief section on documentation. This would
certainly be very useful
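
For example, the version specification enum could double as the source of the generated docs, along these lines (the constants and descriptions below are made up, following the V4 example quoted above):

enum MetadataVersionSpec {
    V1(1, true, "Initial metadata.version"),
    V2(2, true, "Adds field Foo to record Baz"),
    V3(3, false, "New metadata record type Bar");

    final int version;
    final boolean backwardsCompatible;
    final String description;

    MetadataVersionSpec(int version, boolean backwardsCompatible, String description) {
        this.version = version;
        this.backwardsCompatible = backwardsCompatible;
        this.description = description;
    }

    // Emit a simple HTML table, which could feed the generated documentation
    // mentioned above and the output of kafka-features.sh.
    static String toHtmlTable() {
        StringBuilder sb = new StringBuilder(
            "<table><tr><th>Version</th><th>Backwards compatible</th><th>Description</th></tr>\n");
        for (MetadataVersionSpec v : values()) {
            sb.append(String.format("<tr><td>%d</td><td>%s</td><td>%s</td></tr>%n",
                v.version, v.backwardsCompatible, v.description));
        }
        return sb.append("</table>").toString();
    }
}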


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-09-29 Thread José Armando García Sancio
One more comment.

7.Downgrade records
I think we should explicitly mention that the downgrade process will
downgrade both metadata records and controller records. In KIP-630 we
introduced two control records for snapshots.


Re: [DISCUSS] KIP-778 KRaft Upgrades

2021-09-29 Thread José Armando García Sancio
Thank you David for the detailed KIP.

1. Generate snapshot on upgrade
> Metadata snapshot is generated and sent to the other nodes
Why does the Active Controller need to generate a new snapshot and
force a snapshot fetch from the replicas (inactive controller and
brokers) on an upgrade? Isn't writing the FeatureLevelRecord good
enough to communicate the upgrade to the replicas?

2. Generate snapshot on downgrade
> Metadata snapshot is generated and sent to the other inactive controllers and 
> to brokers (this snapshot may be lossy!)
Why do we need to send this downgraded snapshot to the brokers? The
replicas have seen the FeatureLevelRecord and noticed the downgrade.
Can we have the replicas each independently generate a downgraded
snapshot at the offset for the downgraded FeatureLevelRecord? I assume
that the active controller will guarantee that all records after the
FeatureLevelRecord use the downgraded version. If so, it would be good
to mention that explicitly.

3. Max metadata version
>For the first release that supports metadata.version, we can simply initialize 
>metadata.version with the current (and only) version. For future releases, we 
>will need a mechanism to bootstrap a particular version. This could be done 
>using the meta.properties file or some similar mechanism. The reason we need 
>to allow for a specific initial version is to support the use case of
>starting a Kafka cluster at version X with an older metadata.version.

I assume that the Active Controller will learn the metadata version of
the broker through the BrokerRegistrationRequest. How will the Active
Controller learn about the max metadata version of the inactive
controller nodes? We currently don't send a registration request from
the inactive controller to the active controller.

Why do you need to bootstrap a particular version? Isn't the intent
that the broker will learn the active metadata version by reading the
metadata before unfencing?

4. Reject Registration - This is related to the bullet point above.
What will be the behavior of the active controller if the broker sends
a metadata version that is not compatible with the cluster wide
metadata version?

5. Discover upgrade
> This snapshot will be a convenient way to let broker and controller 
> components rebuild their entire in-memory state following an upgrade.
Can we rely on the presence of the FeatureLevelRecord for the metadata
version for this functionality? If so, it avoids having to reload the
snapshot.

6. Metadata version specification
>  V4(version=4, isBackwardsCompatible=false, description="New metadata record 
> type Bar"),

Very cool. Do you have plans to generate Apache Kafka HTML
documentation for this information? Would be helpful to display this
information to the user using the kafka-features.sh and feature RPC?

-- 
-Jose


[DISCUSS] KIP-778 KRaft Upgrades

2021-09-29 Thread David Arthur
Hey all,

I'd like to start a discussion around upgrades for KRaft. This design does
not cover any of the ZooKeeper to KRaft migration (which will be covered in
a future KIP).

The proposal includes the addition of a "metadata.version" feature flag and
deprecates the IBP. A procedure for downgrades is also included in the KIP.

https://cwiki.apache.org/confluence/x/WAxACw

Thanks!
David