Hi, Ismael,

Thanks for the reply.

Yes, that's the main issue. By changing a field to non-nullable without a
version bump, it implies that this change applies to all existing versions.
If anyone implements this protocol for a client, the client could break
when talking to a broker before 4.0.

Since changing the protocol properly takes time, I agree that we can revert
the change in 4.0 for now and follow up on this with a separate KIP.

Thanks,

Jun

On Wed, Mar 5, 2025 at 11:41 PM Ismael Juma <m...@ismaeljuma.com> wrote:

> Hi Jun,
>
> Thanks for uncovering this issue. Thinking through it, I think there are
> two hard constraints:
>
> 1. We do not want 4.0 brokers to break recent releases of widely deployed
> clients.
> 2. We do not want 4.0 clients to break with supported broker versions (2.1
> and newer).
>
> The issue you raised would mean that 4.0 clients would fail during
> deserialization when they run into these edge cases with brokers with
> version from 2.1 to 3.9 (inclusive). Is it correct that this is your main
> concern? If so, I agree.
>
> For 4.0, let's make `records` nullable again. I don't like this change,
> because the broker should never return nullable records, but I can't think
> of another solution that satisfies the two hard constraints I outlined
> above. As you suggested in the PR discussion, let's discuss longer term
> improvements (including your suggestion in this thread) via a separate KIP.
>
> PR: https://github.com/apache/kafka/pull/19131
>
> Ismael
>
> On Mon, Mar 3, 2025 at 9:42 AM Jun Rao <j...@confluent.io.invalid> wrote:
>
> > Hi, Ismael,
> >
> > I want to start a followup discussion on the following update in the KIP.
> >
> > "Finally, we will fix a protocol specification inconsistency:
> FetchResponse
> >  has a records  field that is tagged as nullable  even though it is
> always
> > non-null and all librdkafka-based clients (which cover a large percentage
> > of clients in the wild) break if this field is set to null . We adjust
> the
> > spec to match reality - this way implementors do not have to find out
> about
> > this requirement during testing with real clients."
> >
> > As discussed in
> > https://github.com/apache/kafka/pull/18726#discussion_r1972525165, in
> 3.9,
> > we actually do have a few code paths that will leave the records field
> null
> > in the FetchResponse.
> >
> > Compression codec for topic is ZSTD and fetch version < 10.
> >
> >
> https://github.com/apache/kafka/blob/3.9/core/src/main/scala/kafka/server/KafkaApis.scala#L835
> >
> > Down-conversion of zstandard-compressed
> >
> >
> https://github.com/apache/kafka/blob/3.9/core/src/main/scala/kafka/server/KafkaApis.scala#L884
> >
> > Generic uncaught exception through
> >
> >
> https://github.com/apache/kafka/blob/3.9/clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java#L365
> >
> > So, it seems that we can't just mark records as non-nullable without a
> > version change since it can be null in some of the rare cases.
> >
> > The inconsistency in the current implementation is not ideal though and
> it
> > would be useful to improve it. One potential way is to bump up the
> > FetchRequest version and make records officially non-nullable. If we do
> > that, we probably want to do the same for
> > FetchResponse.AbortedTransactions, ShareFetchResponse.records and
> > potentially ProduceRequest.records for better consistency.
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Nov 27, 2023 at 12:12 AM Anton Agestam
> > <anton.ages...@aiven.io.invalid> wrote:
> >
> > > Sorry, the underscore was meant to refer to the
> > >
> > >
> >
> https://github.com/apache/kafka/tree/trunk/clients/src/main/resources/common/message
> > > directory in the previous message.
> > >
> > > Den fre 24 nov. 2023 kl 14:30 skrev Anton Agestam <
> > anton.ages...@aiven.io
> > > >:
> > >
> > > > Hi Ismael,
> > > >
> > > > This looks like a healthy KIP for Kafka 👍
> > > >
> > > > As the implementer of a Kafka Protocol library for Python,
> > Aiven-Open/kio
> > > > [1], I'm curious how this change will affect the library.
> > > >
> > > > We generate entities for the full protocol by introspecting the JSON
> > > > schema definitions under _. How will the KIP change those
> definitions?
> > > Will
> > > > the dropped versions be reflected as bumps of the lower limit
> > > > in "validVersions"?
> > > >
> > > > Thanks and BR,
> > > > Anton
> > > >
> > > > [1]: https://github.com/Aiven-Open/kio
> > > >
> > > > On 2023/01/03 16:17:24 Ismael Juma wrote:
> > > > > Hi all,
> > > > >
> > > > > I would like to start a discussion regarding the removal of very
> old
> > > > client
> > > > > protocol API versions in Apache Kafka 4.0 to improve
> maintainability
> > &
> > > > > supportability of Kafka. Please take a look at the proposal:
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-896%3A+Remove+old+client+protocol+API+versions+in+Kafka+4.0
> > > > >
> > > > > Ismael
> > > > >
> > > >
> > >
> >
>

Reply via email to