Mickael, after looking more closely, I definitely think KAFKA-15010 is a blocker. It creates the case where the controller can totally miss a metadata update and not write it back to ZK. Since things like dynamic configs and ACLs are only read from ZK by the ZK brokers, we could have significant problems while the brokers are being migrated (when some are KRaft and some are ZK). E.g., ZK brokers could be totally unaware of an ACL change while the KRaft brokers have it. I have a fix ready here https://github.com/apache/kafka/pull/13758. I think we can get it committed soon.
Another blocker is KAFKA-15004 which was just merged to trunk. This is another dual-write bug where new topic/broker configs will not be written back to ZK by the controller. The fix for KAFKA-15010 has a few dependencies on fixes we made this past week, so we'll need to cherry-pick a few commits. The changes are totally contained within the migration area of code, so I think the risk in including them is fairly low.

-David

On Thu, May 25, 2023 at 2:15 PM Greg Harris <greg.har...@aiven.io.invalid> wrote:

> Hey all,
>
> A contributor just pointed out a small but noticeable flaw in the
> implementation of KIP-581
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-581%3A+Value+of+optional+null+field+which+has+default+value
> which is planned for this release.
> Impact: the feature works for root values in a record, but does not
> work for any fields within structs. Fields within structs will
> continue to have their previous, backwards-compatible behavior.
> The contributor has submitted a bug-fix PR which reports the problem
> and does not yet have a merge-able solution, but they are actively
> responding and interested in having this fixed:
> https://github.com/apache/kafka/pull/13748
> The overall fix should be a one-liner + some unit tests. While this is
> not a regression, it does make the feature largely useless, as the
> majority of use-cases will be for struct fields.
>
> Thanks!
> Greg Harris
>
> On Wed, May 24, 2023 at 7:05 PM Ismael Juma <ism...@juma.me.uk> wrote:
> >
> > I agree the migration should be functional - it wasn't obvious if the
> > migration issues are edge cases or not. If they are edge cases, I think
> > 3.5.1 would be fine given the preview status.
> >
> > I understand that a new RC is needed, but that doesn't mean we should let
> > everything in. Each change carries some risk. And if we don't agree on the
> > bar for the migration work, we may be having the same discussion next week.
> > :)
> >
> > Ismael
> >
> > On Wed, May 24, 2023, 12:00 PM Josep Prat <josep.p...@aiven.io.invalid> wrote:
> >
> > > Hi there,
> > > Is the plan described in KIP-833[1] still valid? In there it states that
> > > 3.5.0 should aim at deprecation of Zookeeper, so conceptually, the path to
> > > migrate to Kraft should be somewhat functional (in my opinion). If we don't
> > > want to deprecate Zookeeper in 3.5.0, then I share Ismael's opinion that
> > > these could be fixed in subsequent patches of 3.5.x. Just my 5cts.
> > >
> > > [1]:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-833:+Mark+KRaft+as+Production+Ready#KIP833:MarkKRaftasProductionReady-Kafka3.5
> > > Best,
> > >
> > > On Wed, May 24, 2023 at 8:51 PM Ismael Juma <ism...@juma.me.uk> wrote:
> > >
> > > > Are all these blockers? For example, zk to kraft migration is still in
> > > > preview - can we fix some of these in 3.5.1?
> > > >
> > > > Ismael
> > > >
> > > > On Wed, May 24, 2023, 10:22 AM Colin McCabe <cmcc...@apache.org> wrote:
> > > >
> > > > > Hi Mickael,
> > > > >
> > > > > Thanks for putting together this RC. Unfortunately, we've identified
> > > > > several blocker issues in this release candidate.
> > > > >
> > > > > KAFKA-15009: New ACLs are not written to ZK during migration
> > > > > KAFKA-15007: MV is not set correctly in the MetadataPropagator in migration.
> > > > > KAFKA-15004: Topic config changes are not synced during zk to kraft migration (dual-write)
> > > > > KAFKA-15003: TopicIdReplicaAssignment is not updated in migration (dual-write) when partitions are changed for topic
> > > > > KAFKA-14996: The KRaft controller should properly handle overly large user operations
> > > > >
> > > > > We are working on PRs for these issues and will get them in soon, we think!
> > > > >
> > > > > So unfortunately I have to leave a -1 here for RC0.
> > > > > Let's aim for another RC next week.
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > > On Wed, May 24, 2023, at 07:05, Mickael Maison wrote:
> > > > > > Hi David,
> > > > > >
> > > > > > We're already quite a bit behind schedule. If you think these fixes
> > > > > > are really important and can be ready in the next couple of days, I'm
> > > > > > open to backport them and build another release candidate. Let me know
> > > > > > once you've investigated the severity of KAFKA-15010.
> > > > > >
> > > > > > Thanks,
> > > > > > Mickael
> > > > > >
> > > > > >
> > > > > > On Tue, May 23, 2023 at 6:34 PM David Arthur
> > > > > > <david.art...@confluent.io.invalid> wrote:
> > > > > >>
> > > > > >> Mickael, we have some migration fixes on trunk, is it okay to cherry-pick
> > > > > >> these to 3.5?
> > > > > >>
> > > > > >> KAFKA-15007 Use the correct MetadataVersion in MigrationPropagator
> > > > > >> KAFKA-15009 Handle new ACLs in KRaft snapshot during migration
> > > > > >>
> > > > > >> There is another issue KAFKA-15010 that I'm also investigating to determine
> > > > > >> the impact and likelihood of seeing it in practice. This one may be a
> > > > > >> significant migration blocker
> > > > > >>
> > > > > >> Cheers,
> > > > > >> David
> > > > > >>
> > > > > >> On Tue, May 23, 2023 at 9:57 AM Mickael Maison <mickael.mai...@gmail.com> wrote:
> > > > > >>
> > > > > >> > Hi Christo,
> > > > > >> >
> > > > > >> > Yes this is expected. This happens when nested fields also accept
> > > > > >> > optional tagged fields. The tables list all fields, so they may
> > > > > >> > include _tagged_fields multiple times.
> > > > > >> > Clearly the layout of this page could be improved, if you have ideas
> > > > > >> > how to describe the protocol in a better way, feel free to share them.
> > > > > >> >
> > > > > >> > Thanks,
> > > > > >> > Mickael
> > > > > >> >
> > > > > >> > On Tue, May 23, 2023 at 3:50 PM Mickael Maison <mickael.mai...@gmail.com> wrote:
> > > > > >> > >
> > > > > >> > > Hi Josep,
> > > > > >> > >
> > > > > >> > > Good catch! I opened a PR to fix this:
> > > > > >> > > https://github.com/apache/kafka-site/pull/514
> > > > > >> > >
> > > > > >> > > Thanks,
> > > > > >> > > Mickael
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > On Tue, May 23, 2023 at 3:36 PM Christo Lolov <christolo...@gmail.com> wrote:
> > > > > >> > > >
> > > > > >> > > > Hey Mickael!
> > > > > >> > > >
> > > > > >> > > > I am giving a +1 (non-binding) for this candidate release.
> > > > > >> > > >
> > > > > >> > > > * Built from the binary tar.gz source with Java 17 and Scala 2.13 on Intel
> > > > > >> > > > (m5.4xlarge) and ARM (m6g.4xlarge) machines.
> > > > > >> > > > * Ran unit and integration tests on Intel and ARM machines.
> > > > > >> > > > * Ran the Quickstart in both Zookeeper and KRaft modes on Intel and ARM
> > > > > >> > > > machines.
> > > > > >> > > >
> > > > > >> > > > Question:
> > > > > >> > > > * I went through https://kafka.apache.org/35/protocol.html and there are
> > > > > >> > > > quite a few repetitive _tagged_fields fields within the same structures -
> > > > > >> > > > is this expected?
> > > > > >> > > >
> > > > > >> > > > On Tue, 23 May 2023 at 12:01, Josep Prat <josep.p...@aiven.io.invalid> wrote:
> > > > > >> > > >
> > > > > >> > > > > Hi Mickael,
> > > > > >> > > > > I just wanted to point out that I think the documentation you recently
> > > > > >> > > > > merged on Kafka site regarding the 3.5.0 version has a problem when it
> > > > > >> > > > > states the version number and the sub-menu that links to previous versions.
> > > > > >> > > > > Left a comment here:
> > > > > >> > > > > https://github.com/apache/kafka-site/pull/513#pullrequestreview-1438927939
> > > > > >> > > > >
> > > > > >> > > > > Best,
> > > > > >> > > > >
> > > > > >> > > > > On Tue, May 23, 2023 at 9:29 AM Josep Prat <josep.p...@aiven.io> wrote:
> > > > > >> > > > >
> > > > > >> > > > > > Hi Mickael,
> > > > > >> > > > > >
> > > > > >> > > > > > I can +1 this candidate. I verified the following:
> > > > > >> > > > > > - Built from source with Java 17 and Scala 2.13
> > > > > >> > > > > > - Signatures and hashes of the artifacts generated
> > > > > >> > > > > > - Navigated through Javadoc including links to JDK classes
> > > > > >> > > > > > - Run the unit tests
> > > > > >> > > > > > - Run integration tests
> > > > > >> > > > > > - Run the quickstart in KRaft and Zookeeper mode
> > > > > >> > > > > >
> > > > > >> > > > > > Best,
> > > > > >> > > > > >
> > > > > >> > > > > > On Mon, May 22, 2023 at 5:30 PM Mickael Maison <mimai...@apache.org> wrote:
> > > > > >> > > > > >
> > > > > >> > > > > >> Hello Kafka users, developers and client-developers,
> > > > > >> > > > > >>
> > > > > >> > > > > >> This is the first candidate for release of Apache Kafka 3.5.0.
> > > > > >> > > > > >> Some of the major features include:
> > > > > >> > > > > >> - KIP-710: Full support for distributed mode in dedicated MirrorMaker 2.0 clusters
> > > > > >> > > > > >> - KIP-881: Rack-aware Partition Assignment for Kafka Consumers
> > > > > >> > > > > >> - KIP-887: Add ConfigProvider to make use of environment variables
> > > > > >> > > > > >> - KIP-889: Versioned State Stores
> > > > > >> > > > > >> - KIP-894: Use incrementalAlterConfig for syncing topic configurations
> > > > > >> > > > > >> - KIP-900: KRaft kafka-storage.sh API additions to support SCRAM for Kafka Brokers
> > > > > >> > > > > >>
> > > > > >> > > > > >> Release notes for the 3.5.0 release:
> > > > > >> > > > > >> https://home.apache.org/~mimaison/kafka-3.5.0-rc0/RELEASE_NOTES.html
> > > > > >> > > > > >>
> > > > > >> > > > > >> *** Please download, test and vote by Friday, May 26, 5pm PT
> > > > > >> > > > > >>
> > > > > >> > > > > >> Kafka's KEYS file containing PGP keys we use to sign the release:
> > > > > >> > > > > >> https://kafka.apache.org/KEYS
> > > > > >> > > > > >>
> > > > > >> > > > > >> * Release artifacts to be voted upon (source and binary):
> > > > > >> > > > > >> https://home.apache.org/~mimaison/kafka-3.5.0-rc0/
> > > > > >> > > > > >>
> > > > > >> > > > > >> * Maven artifacts to be voted upon:
> > > > > >> > > > > >> https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > > > > >> > > > > >>
> > > > > >> > > > > >> * Javadoc:
> > > > > >> > > > > >> https://home.apache.org/~mimaison/kafka-3.5.0-rc0/javadoc/
> > > > > >> > > > > >>
> > > > > >> > > > > >> * Tag to be voted upon (off 3.5 branch) is the 3.5.0 tag:
> > > > > >> > > > > >> https://github.com/apache/kafka/releases/tag/3.5.0-rc0
> > > > > >> > > > > >>
> > > > > >> > > > > >> The PR adding the 35 documentation is not merged yet
> > > > > >> > > > > >> (https://github.com/apache/kafka-site/pull/513)
> > > > > >> > > > > >> * Documentation:
> > > > > >> > > > > >> https://kafka.apache.org/35/documentation.html
> > > > > >> > > > > >> * Protocol:
> > > > > >> > > > > >> https://kafka.apache.org/35/protocol.html
> > > > > >> > > > > >>
> > > > > >> > > > > >> * Successful Jenkins builds for the 3.5 branch:
> > > > > >> > > > > >> Unit/integration tests: Jenkins is not detecting the 3.5 branch,
> > > > > >> > > > > >> working with INFRA to sort it out:
> > > > > >> > > > > >> https://issues.apache.org/jira/browse/INFRA-24577
> > > > > >> > > > > >> System tests: The build is still running, I'll send an update once I
> > > > > >> > > > > >> have the results
> > > > > >> > > > > >>
> > > > > >> > > > > >> Thanks,
> > > > > >> > > > > >> Mickael
> > > > > >> > > > > >>
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > > --
> > > > > >> > > > > > [image: Aiven] <https://www.aiven.io>
> > > > > >> > > > > >
> > > > > >> > > > > > *Josep Prat*
> > > > > >> > > > > > Open Source Engineering Director, *Aiven*
> > > > > >> > > > > > josep.p...@aiven.io | +491715557497
> > > > > >> > > > > > aiven.io <https://www.aiven.io> |
> > > > > >> > > > > > <https://www.facebook.com/aivencloud>
> > > > > >> > > > > > <https://www.linkedin.com/company/aiven/> <https://twitter.com/aiven_io>
> > > > > >> > > > > > *Aiven Deutschland GmbH*
> > > > > >> > > > > > Alexanderufer 3-7, 10117 Berlin
> > > > > >> > > > > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > > > >> > > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > >> > > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > > --
> > > > > >> > > > > [image: Aiven] <https://www.aiven.io>
> > > > > >> > > > >
> > > > > >> > > > > *Josep Prat*
> > > > > >> > > > > Open Source Engineering Director, *Aiven*
> > > > > >> > > > > josep.p...@aiven.io | +491715557497
> > > > > >> > > > > aiven.io <https://www.aiven.io> | <https://www.facebook.com/aivencloud>
> > > > > >> > > > > <https://www.linkedin.com/company/aiven/> <https://twitter.com/aiven_io>
> > > > > >> > > > > *Aiven Deutschland GmbH*
> > > > > >> > > > > Alexanderufer 3-7, 10117 Berlin
> > > > > >> > > > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > > > >> > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > > >> > > > >
> > > > > >> >
> > > > > >>
> > > > > >>
> > > > > >> --
> > > > > >> -David
> > > > > >
> > > >
> > >
> > >
> > > --
> > > [image: Aiven] <https://www.aiven.io>
> > >
> > > *Josep Prat*
> > > Open Source Engineering Director, *Aiven*
> > > josep.p...@aiven.io | +491715557497
> > > aiven.io <https://www.aiven.io> | <https://www.facebook.com/aivencloud>
> > > <https://www.linkedin.com/company/aiven/> <https://twitter.com/aiven_io>
> > > *Aiven Deutschland GmbH*
> > > Alexanderufer 3-7, 10117 Berlin
> > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > Amtsgericht Charlottenburg, HRB 209739 B
> >

--
-David