We have another blocking issue for the RC :
https://issues.apache.org/jira/browse/KAFKA-16157. This bug is similar to
https://issues.apache.org/jira/browse/KAFKA-14616. The new issue however
can lead to the new topic having partitions that a producer cannot write to.

--Proven

On Tue, Jan 16, 2024 at 12:04 PM Proven Provenzano <pprovenz...@confluent.io>
wrote:

>
> I have a PR https://github.com/apache/kafka/pull/15197 for
> https://issues.apache.org/jira/browse/KAFKA-16131 that is building now.
> --Proven
>
> On Mon, Jan 15, 2024 at 5:03 AM Jakub Scholz <ja...@scholz.cz> wrote:
>
>> *> Hi Jakub,> > Thanks for trying the RC. I think what you found is a
>> blocker bug because it *
>> *> will generate huge amount of logspam. I guess we didn't find it in
>> junit
>> tests *
>> *> since logspam doesn't fail the automated tests. But certainly it's not
>> suitable *
>> *> for production. Did you file a JIRA yet?*
>>
>> Hi Colin,
>>
>> I opened https://issues.apache.org/jira/browse/KAFKA-16131.
>>
>> Thanks & Regards
>> Jakub
>>
>> On Mon, Jan 15, 2024 at 8:57 AM Colin McCabe <cmcc...@apache.org> wrote:
>>
>> > Hi Stanislav,
>> >
>> > Thanks for making the first RC. The fact that it's titled RC2 is messing
>> > with my mind a bit. I hope this doesn't make people think that we're
>> > farther along than we are, heh.
>> >
>> > On Sun, Jan 14, 2024, at 13:54, Jakub Scholz wrote:
>> > > *> Nice catch! It does seem like we should have gated this behind the
>> > > metadata> version as KIP-858 implies. Is the cluster configured with
>> > > multiple log> dirs? What is the impact of the error messages?*
>> > >
>> > > I did not observe any obvious impact. I was able to send and receive
>> > > messages as normally. But to be honest, I have no idea what else
>> > > this might impact, so I did not try anything special.
>> > >
>> > > I think everyone upgrading an existing KRaft cluster will go through
>> this
>> > > stage (running Kafka 3.7 with an older metadata version for at least a
>> > > while). So even if it is just a logged exception without any other
>> > impact I
>> > > wonder if it might scare users from upgrading. But I leave it to
>> others
>> > to
>> > > decide if this is a blocker or not.
>> > >
>> >
>> > Hi Jakub,
>> >
>> > Thanks for trying the RC. I think what you found is a blocker bug
>> because
>> > it will generate huge amount of logspam. I guess we didn't find it in
>> junit
>> > tests since logspam doesn't fail the automated tests. But certainly it's
>> > not suitable for production. Did you file a JIRA yet?
>> >
>> > > On Sun, Jan 14, 2024 at 10:17 PM Stanislav Kozlovski
>> > > <stanis...@confluent.io.invalid> wrote:
>> > >
>> > >> Hey Luke,
>> > >>
>> > >> This is an interesting problem. Given the fact that the KIP for
>> having a
>> > >> 3.8 release passed, I think it weights the scale towards not calling
>> > this a
>> > >> blocker and expecting it to be solved in 3.7.1.
>> > >>
>> > >> It is unfortunate that it would not seem safe to migrate to KRaft in
>> > 3.7.0
>> > >> (given the inability to rollback safely), but if that's true - the
>> same
>> > >> case would apply for 3.6.0. So in any case users w\ould be expected
>> to
>> > use a
>> > >> patch release for this.
>> >
>> > Hi Luke,
>> >
>> > Thanks for testing rollback. I think this is a case where the
>> > documentation is wrong. The intention was to for the steps to basically
>> be:
>> >
>> > 1. roll all the brokers into zk mode, but with migration enabled
>> > 2. take down the kraft quorum
>> > 3. rmr /controller, allowing a hybrid broker to take over.
>> > 4. roll all the brokers into zk mode without migration enabled (if
>> desired)
>> >
>> > With these steps, there isn't really unavailability since a ZK
>> controller
>> > can be elected quickly after the kraft quorum is gone.
>> >
>> > >> Further, since we will have a 3.8 release - it is
>> > >> likely we will ultimately recommend users upgrade from that version
>> > given
>> > >> its aim is to have strategic KRaft feature parity with ZK.
>> > >> That being said, I am not 100% on this. Let me know whether you think
>> > this
>> > >> should block the release, Luke. I am also tagging Colin and David to
>> > weigh
>> > >> in with their opinions, as they worked on the migration logic.
>> >
>> > The rollback docs are new in 3.7 so the fact that they're wrong is a
>> clear
>> > blocker, I think. But easy to fix, I believe. I will create a PR.
>> >
>> > best,
>> > Colin
>> >
>> > >>
>> > >> Hey Kirk and Chris,
>> > >>
>> > >> Unless I'm missing something - KAFKALESS-16029 is simply a bad log
>> due
>> > to
>> > >> improper closing. And the PR description implies this has been
>> present
>> > >> since 3.5. While annoying, I don't see a strong reason for this to
>> block
>> > >> the release.
>> > >>
>> > >> Hey Jakub,
>> > >>
>> > >> Nice catch! It does seem like we should have gated this behind the
>> > metadata
>> > >> version as KIP-858 implies. Is the cluster configured with multiple
>> log
>> > >> dirs? What is the impact of the error messages?
>> > >>
>> > >> Tagging Igor (the author of the KIP) to weigh in.
>> > >>
>> > >> Best,
>> > >> Stanislav
>> > >>
>> > >> On Sat, Jan 13, 2024 at 7:22 PM Jakub Scholz <ja...@scholz.cz>
>> wrote:
>> > >>
>> > >> > Hi,
>> > >> >
>> > >> > I was trying the RC2 and run into the following issue ... when I
>> run
>> > >> > 3.7.0-RC2 KRaft cluster with metadata version set to 3.6-IV2
>> metadata
>> > >> > version, I seem to be getting repeated errors like this in the
>> > controller
>> > >> > logs:
>> > >> >
>> > >> > 2024-01-13 16:58:01,197 INFO [QuorumController id=0]
>> > >> assignReplicasToDirs:
>> > >> > event failed with UnsupportedVersionException in 15 microseconds.
>> > >> > (org.apache.kafka.controller.QuorumController)
>> > >> > [quorum-controller-0-event-handler]
>> > >> > 2024-01-13 16:58:01,197 ERROR [ControllerApis nodeId=0] Unexpected
>> > error
>> > >> > handling request RequestHeader(apiKey=ASSIGN_REPLICAS_TO_DIRS,
>> > >> > apiVersion=0, clientId=1000, correlationId=14, headerVersion=2) --
>> > >> > AssignReplicasToDirsRequestData(brokerId=1000, brokerEpoch=5,
>> > >> > directories=[DirectoryData(id=w_uxN7pwQ6eXSMrOKceYIQ,
>> > >> > topics=[TopicData(topicId=bvAKLSwmR7iJoKv2yZgygQ,
>> > >> > partitions=[PartitionData(partitionIndex=2),
>> > >> > PartitionData(partitionIndex=1)]),
>> > >> > TopicData(topicId=uNe7f5VrQgO0zST6yH1jDQ,
>> > >> > partitions=[PartitionData(partitionIndex=0)])])]) with context
>> > >> > RequestContext(header=RequestHeader(apiKey=ASSIGN_REPLICAS_TO_DIRS,
>> > >> > apiVersion=0, clientId=1000, correlationId=14, headerVersion=2),
>> > >> > connectionId='172.16.14.219:9090-172.16.14.217:53590-7',
>> > clientAddress=/
>> > >> > 172.16.14.217, principal=User:CN=my-cluster-kafka,O=io.strimzi,
>> > >> > listenerName=ListenerName(CONTROLPLANE-9090), securityProtocol=SSL,
>> > >> > clientInformation=ClientInformation(softwareName=apache-kafka-java,
>> > >> > softwareVersion=3.7.0), fromPrivilegedListener=false,
>> > >> >
>> > >> >
>> > >>
>> >
>> principalSerde=Optional[org.apache.kafka.common.security.authenticator.DefaultKafkaPrincipalBuilder@71004ad2
>> > >> > ])
>> > >> > (kafka.server.ControllerApis) [quorum-controller-0-event-handler]
>> > >> > java.util.concurrent.CompletionException:
>> > >> > org.apache.kafka.common.errors.UnsupportedVersionException:
>> Directory
>> > >> > assignment is not supported yet.
>> > >> >
>> > >> >  at
>> > >> >
>> > >> >
>> > >>
>> >
>> java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332)
>> > >> >  at
>> > >> >
>> > >> >
>> > >>
>> >
>> java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347)
>> > >> >  at
>> > >> >
>> > >> >
>> > >>
>> >
>> java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:636)
>> > >> >  at
>> > >> >
>> > >> >
>> > >>
>> >
>> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
>> > >> >  at
>> > >> >
>> > >> >
>> > >>
>> >
>> java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
>> > >> >  at
>> > >> >
>> > >> >
>> > >>
>> >
>> org.apache.kafka.controller.QuorumController$ControllerWriteEvent.complete(QuorumController.java:880)
>> > >> >  at
>> > >> >
>> > >> >
>> > >>
>> >
>> org.apache.kafka.controller.QuorumController$ControllerWriteEvent.handleException(QuorumController.java:871)
>> > >> >  at
>> > >> >
>> > >> >
>> > >>
>> >
>> org.apache.kafka.queue.KafkaEventQueue$EventContext.completeWithException(KafkaEventQueue.java:148)
>> > >> >  at
>> > >> >
>> > >> >
>> > >>
>> >
>> org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:137)
>> > >> >  at
>> > >> >
>> > >> >
>> > >>
>> >
>> org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210)
>> > >> >  at
>> > >> >
>> > >> >
>> > >>
>> >
>> org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181)
>> > >> >  at java.base/java.lang.Thread.run(Thread.java:840)
>> > >> >
>> > >> > Caused by:
>> org.apache.kafka.common.errors.UnsupportedVersionException:
>> > >> > Directory assignment is not supported yet.
>> > >> >
>> > >> > Is that expected? I guess with the metadata version set to
>> 3.6-IV2, it
>> > >> > makes sense that the request is not supported. But shouldn't then
>> the
>> > >> > request not be sent at all by the brokers? (I did not opened a JIRA
>> > for
>> > >> it,
>> > >> > but I can open one if you agree this is not expected)
>> > >> >
>> > >> > Thanks & Regards
>> > >> > Jakub
>> > >> >
>> > >> > On Sat, Jan 13, 2024 at 8:03 AM Luke Chen <show...@gmail.com>
>> wrote:
>> > >> >
>> > >> > > Hi Stanislav,
>> > >> > >
>> > >> > > I commented in the "Apache Kafka 3.7.0 Release" thread, but maybe
>> > you
>> > >> > > missed it.
>> > >> > > cross-posting here:
>> > >> > >
>> > >> > > There is a bug KAFKA-16101
>> > >> > > <https://issues.apache.org/jira/browse/KAFKA-16101> reporting
>> that
>> > >> > "Kafka
>> > >> > > cluster will be unavailable during KRaft migration rollback".
>> > >> > > The impact for this issue is that if brokers try to rollback to
>> ZK
>> > mode
>> > >> > > during KRaft migration process, there will be a period of time
>> the
>> > >> > cluster
>> > >> > > is unavailable.
>> > >> > > Since ZK migrating to KRaft feature is a production ready
>> feature, I
>> > >> > think
>> > >> > > this should be addressed soon.
>> > >> > > Do you think this is a blocker for v3.7.0?
>> > >> > >
>> > >> > > Thanks.
>> > >> > > Luke
>> > >> > >
>> > >> > > On Sat, Jan 13, 2024 at 8:36 AM Chris Egerton <
>> > fearthecel...@gmail.com
>> > >> >
>> > >> > > wrote:
>> > >> > >
>> > >> > > > Thanks, Kirk!
>> > >> > > >
>> > >> > > > @Stanislav--do you believe that this warrants a new RC?
>> > >> > > >
>> > >> > > > On Fri, Jan 12, 2024, 19:08 Kirk True <k...@kirktrue.pro>
>> wrote:
>> > >> > > >
>> > >> > > > > Hi Chris/Stanislav,
>> > >> > > > >
>> > >> > > > > I'm working on the 'Unable to find FetchSessionHandler' log
>> > problem
>> > >> > > > > (KAFKA-16029) and have put out a draft PR (
>> > >> > > > > https://github.com/apache/kafka/pull/15186). I will use the
>> > >> > quickstart
>> > >> > > > > approach as a second means to reproduce/verify while I wait
>> for
>> > the
>> > >> > > PR's
>> > >> > > > > Jenkins job to finish.
>> > >> > > > >
>> > >> > > > > Thanks,
>> > >> > > > > Kirk
>> > >> > > > >
>> > >> > > > > On Fri, Jan 12, 2024, at 11:31 AM, Chris Egerton wrote:
>> > >> > > > > > Hi Stanislav,
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > > > Thanks for running this release!
>> > >> > > > > >
>> > >> > > > > > To verify, I:
>> > >> > > > > > - Built from source using Java 11 with both:
>> > >> > > > > > - - the 3.7.0-rc2 tag on GitHub
>> > >> > > > > > - - the kafka-3.7.0-src.tgz artifact from
>> > >> > > > > >
>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
>> > >> > > > > > - Checked signatures and checksums
>> > >> > > > > > - Ran the quickstart using both:
>> > >> > > > > > - - The kafka_2.13-3.7.0.tgz artifact from
>> > >> > > > > >
>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
>> > >> with
>> > >> > > Java
>> > >> > > > > 11
>> > >> > > > > > and Scala 13 in KRaft mode
>> > >> > > > > > - - Our shiny new broker Docker image,
>> apache/kafka:3.7.0-rc2
>> > >> > > > > > - Ran all unit tests
>> > >> > > > > > - Ran all integration tests for Connect and MM2
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > > > I found two minor areas for concern:
>> > >> > > > > >
>> > >> > > > > > 1. (Possibly a blocker)
>> > >> > > > > > When running the quickstart, I noticed this ERROR-level log
>> > >> message
>> > >> > > > being
>> > >> > > > > > emitted frequently (not not every time) when I killed my
>> > console
>> > >> > > > consumer
>> > >> > > > > > via ctrl-C:
>> > >> > > > > >
>> > >> > > > > > > [2024-01-12 11:00:31,088] ERROR [Consumer
>> > >> > > clientId=console-consumer,
>> > >> > > > > > groupId=console-consumer-74388] Unable to find
>> > >> FetchSessionHandler
>> > >> > > for
>> > >> > > > > node
>> > >> > > > > > 1. Ignoring fetch response
>> > >> > > > > > (org.apache.kafka.clients.consumer.internals.AbstractFetch)
>> > >> > > > > >
>> > >> > > > > > I see that this error message is already reported in
>> > >> > > > > > https://issues.apache.org/jira/browse/KAFKA-16029. I
>> think we
>> > >> > should
>> > >> > > > > > prioritize fixing it for this release. I know it's probably
>> > >> benign
>> > >> > > but
>> > >> > > > > it's
>> > >> > > > > > really not a good look for us when basic operations log
>> error
>> > >> > > messages,
>> > >> > > > > and
>> > >> > > > > > it may give new users some headaches.
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > > > 2. (Probably not a blocker)
>> > >> > > > > > The following unit tests failed the first time around, and
>> > all of
>> > >> > > them
>> > >> > > > > > passed the second time I ran them:
>> > >> > > > > >
>> > >> > > > > > - (clients)
>> > >> > > > >
>> ClientUtilsTest.testParseAndValidateAddressesWithReverseLookup()
>> > >> > > > > > - (clients) SelectorTest.testConnectionsByClientMetric()
>> > >> > > > > > - (clients)
>> Tls13SelectorTest.testConnectionsByClientMetric()
>> > >> > > > > > - (connect)
>> > >> > > TopicAdminTest.retryEndOffsetsShouldRetryWhenTopicNotFound
>> > >> > > > (I
>> > >> > > > > > thought I fixed this one! 🤬🤬)
>> > >> > > > > > - (core)
>> > ProducerIdManagerTest.testUnrecoverableErrors(Errors)[2]
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > > > Thanks again for your work on this release, and
>> > congratulations
>> > >> to
>> > >> > > > Kafka
>> > >> > > > > > Streams for having zero flaky unit tests during my
>> > >> > > highly-experimental
>> > >> > > > > > single laptop run!
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > > > Cheers,
>> > >> > > > > >
>> > >> > > > > > Chris
>> > >> > > > > >
>> > >> > > > > > On Thu, Jan 11, 2024 at 1:33 PM Stanislav Kozlovski
>> > >> > > > > > <stanis...@confluent.io.invalid> wrote:
>> > >> > > > > >
>> > >> > > > > > > Hello Kafka users, developers, and client-developers,
>> > >> > > > > > >
>> > >> > > > > > > This is the first candidate for release of Apache Kafka
>> > 3.7.0.
>> > >> > > > > > >
>> > >> > > > > > > Note it's named "RC2" because I had a few "failed" RCs
>> that
>> > I
>> > >> had
>> > >> > > > > > > cut/uploaded but ultimately had to scrap prior to
>> announcing
>> > >> due
>> > >> > to
>> > >> > > > new
>> > >> > > > > > > blockers arriving before I could even announce them.
>> > >> > > > > > >
>> > >> > > > > > > Further - I haven't yet been able to set up the system
>> tests
>> > >> > > > > successfully.
>> > >> > > > > > > And the integration/unit tests do have a few failures
>> that I
>> > >> have
>> > >> > > to
>> > >> > > > > spend
>> > >> > > > > > > time triaging. I would appreciate any help in case anyone
>> > >> notices
>> > >> > > any
>> > >> > > > > tests
>> > >> > > > > > > failing that they're subject matters experts in. Expect
>> me
>> > to
>> > >> > > follow
>> > >> > > > > up in
>> > >> > > > > > > a day or two with more detailed analysis.
>> > >> > > > > > >
>> > >> > > > > > > Major changes include:
>> > >> > > > > > > - Early Access to KIP-848 - the next generation of the
>> > consumer
>> > >> > > > > rebalance
>> > >> > > > > > > protocol
>> > >> > > > > > > - KIP-858: Adding JBOD support to KRaft
>> > >> > > > > > > - KIP-714: Observability into Client metrics via a
>> > standardized
>> > >> > > > > interface
>> > >> > > > > > >
>> > >> > > > > > > Check more information in the WIP blog post:
>> > >> > > > > > > https://github.com/apache/kafka-site/pull/578
>> > >> > > > > > >
>> > >> > > > > > > Release notes for the 3.7.0 release:
>> > >> > > > > > >
>> > >> > > > > > >
>> > >> > > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> >
>> https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/RELEASE_NOTES.html
>> > >> > > > > > >
>> > >> > > > > > > *** Please download, test and vote by Thursday, January
>> 18,
>> > 9am
>> > >> > PT
>> > >> > > > ***
>> > >> > > > > > >
>> > >> > > > > > > Usually these deadlines tend to be 2-3 days, but due to
>> this
>> > >> > being
>> > >> > > > the
>> > >> > > > > > > first RC and the tests not having ran yet, I am giving
>> it a
>> > bit
>> > >> > > more
>> > >> > > > > time.
>> > >> > > > > > >
>> > >> > > > > > > Kafka's KEYS file containing PGP keys we use to sign the
>> > >> release:
>> > >> > > > > > > https://kafka.apache.org/KEYS
>> > >> > > > > > >
>> > >> > > > > > > * Release artifacts to be voted upon (source and binary):
>> > >> > > > > > >
>> > https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/
>> > >> > > > > > >
>> > >> > > > > > > * Docker release artifact to be voted upon:
>> > >> > > > > > > apache/kafka:3.7.0-rc2
>> > >> > > > > > >
>> > >> > > > > > > * Maven artifacts to be voted upon:
>> > >> > > > > > >
>> > >> > > >
>> > >>
>> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>> > >> > > > > > >
>> > >> > > > > > > * Javadoc:
>> > >> > > > > > >
>> > >> > >
>> > https://home.apache.org/~stanislavkozlovski/kafka-3.7.0-rc2/javadoc/
>> > >> > > > > > >
>> > >> > > > > > > * Tag to be voted upon (off 3.7 branch) is the 3.7.0 tag:
>> > >> > > > > > > https://github.com/apache/kafka/releases/tag/3.7.0-rc2
>> > >> > > > > > >
>> > >> > > > > > > * Documentation:
>> > >> > > > > > > https://kafka.apache.org/37/documentation.html
>> > >> > > > > > >
>> > >> > > > > > > * Protocol:
>> > >> > > > > > > https://kafka.apache.org/37/protocol.html
>> > >> > > > > > >
>> > >> > > > > > > * Successful Jenkins builds for the 3.7 branch:
>> > >> > > > > > > Unit/integration tests:
>> > >> > > > > > >
>> > https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.7/58/
>> > >> > > > > > > There are failing tests here. I have to follow up with
>> > triaging
>> > >> > > some
>> > >> > > > of
>> > >> > > > > > > the failures and figuring out if they're actual problems
>> or
>> > >> > simply
>> > >> > > > > flakes.
>> > >> > > > > > >
>> > >> > > > > > > System tests:
>> > >> > > > > https://jenkins.confluent.io/job/system-test-kafka/job/3.7/
>> > >> > > > > > >
>> > >> > > > > > > No successful system test runs yet. I am working on
>> getting
>> > the
>> > >> > job
>> > >> > > > to
>> > >> > > > > run.
>> > >> > > > > > >
>> > >> > > > > > > * Successful Docker Image Github Actions Pipeline for 3.7
>> > >> branch:
>> > >> > > > > > > Attached are the scan_report and report_jvm output files
>> > from
>> > >> the
>> > >> > > > > Docker
>> > >> > > > > > > Build run:
>> > >> > > > > > >
>> > >> > > > >
>> > >> > >
>> > >>
>> https://github.com/apache/kafka/actions/runs/7486094960/job/20375761673
>> > >> > > > > > >
>> > >> > > > > > > And the final docker image build job - Docker Build Test
>> > >> > Pipeline:
>> > >> > > > > > > https://github.com/apache/kafka/actions/runs/7486178277
>> > >> > > > > > >
>> > >> > > > > > > The image is apache/kafka:3.7.0-rc2 -
>> > >> > > > > > >
>> > >> > > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> >
>> https://hub.docker.com/layers/apache/kafka/3.7.0-rc2/images/sha256-5b4707c08170d39549fbb6e2a3dbb83936a50f987c0c097f23cb26b4c210c226?context=explore
>> > >> > > > > > >
>> > >> > > > > > > /**************************************
>> > >> > > > > > >
>> > >> > > > > > > Thanks,
>> > >> > > > > > > Stanislav Kozlovski
>> > >> > > > > > >
>> > >> > > > > >
>> > >> > > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> > >>
>> > >> --
>> > >> Best,
>> > >> Stanislav
>> > >>
>> >
>>
>

Reply via email to