Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-03-11 Thread via GitHub


ijuma commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1982839120


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   Thanks @junrao. I responded to the KIP message and submitted 
https://github.com/apache/kafka/pull/19131



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-03-05 Thread via GitHub


junrao commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1981925892


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   cc @dajac since this is a potential 4.0.0 blocker.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-03-03 Thread via GitHub


junrao commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1977923340


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   Thanks. I missed the update to the KIP wiki. Somehow I didn't see this 
particular update in the discussion thread. We could discuss a bit more on the 
options as part of this KIP. I just sent an update to the discussion thread.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-28 Thread via GitHub


ijuma commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1976211371


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   You have pointed out a few things we had missed in the original analysis 
though.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-28 Thread via GitHub


junrao commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1976015748


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   @chia7712 : You are right that the Errors.TOPIC_AUTHORIZATION_FAILED path 
doesn't introduce null records. We do have a few paths that could generate null 
records. In addition to the two you pointed out, there is another one in the 
uncaught error handling.
   
https://github.com/apache/kafka/blob/3.9/core/src/main/scala/kafka/server/KafkaApis.scala#L175
   
   This eventually calls 
   
https://github.com/apache/kafka/blob/3.9/core/src/main/scala/kafka/server/KafkaApis.scala#L175



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-28 Thread via GitHub


ijuma commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1976211222


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   Hmm, we did not set that precedent here though, I did update KIP-896 and 
noted it in the discussion thread:
   
   > Finally, we will fix a protocol specification inconsistency: FetchResponse 
 has a records  field that is tagged as nullable  even though it is always 
non-null and all librdkafka-based clients (which cover a large percentage of 
clients in the wild) break if this field is set to null . We adjust the spec to 
match reality - this way implementors do not have to find out about this 
requirement during testing with real clients. This small change is not strictly 
related to this KIP, but it was found during the testing phase of this KIP and 
hence we included it here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-28 Thread via GitHub


junrao commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1976209643


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   @ijuma : 
   
   > 1. I can remove non-nullable from the schema if people feel strongly.
   
   That's my feeling since we don't want to set a precedence of changing a 
schema without a KIP, unless there is a strong reason.
   
   > 2. But I will still make sure the Kafka code never returns null for 
records.
   
   That's fine since it doesn't violate the original schema.
   
   Post 4.0, if we want to enforce non-null at the schema level, we could do 
that through a KIP.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-28 Thread via GitHub


ijuma commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1976186661


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   @junrao I don't think we can break so many clients in the wild. That would 
definitely require a KIP (and I would be against a KIP that did that 
immediately - it would require years of notice). So, we'd have to specify very 
clearly when null can or cannot be set. Your proposed definition doesn't 
actually cover the case that breaks librdkafka: [authz 
errors](https://github.com/confluentinc/librdkafka/blob/876cf4a7e4a339fda8cea1c4a650c168a351a176/tests/0119-consumer_auth.cpp#L141).
   
   My take:
   1. I can remove non-nullable from the schema if people feel strongly.
   2. But I will still make sure the Kafka code never returns null for records.
   
   It's actually more work to ensure the latter without the former, but I am ok 
going that direction. I am not ok with any solution that breaks recently 
released and widely deployed clients.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-28 Thread via GitHub


junrao commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1976167164


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   @chia7712 : My feeling is that we need a strong reason to change a schema 
without a KIP. Assuming that the above findings are correct, there is no strong 
reason. So, I'd recommend that we do the following in 4.0.
   
   1. Revert the schema change and make records nullable.
   2. Add a comment in the schema that records could be null only when 
errorCode is not 0.
   3. Ask librd to handle the potential null records correctly.
   
   Post 4.0, it would be useful to be more consistent on setting the values for 
records. We could (1) consistently set records to null for any error; or (2) 
make records non-nullable (ideally with a KIP). Either way, we probably want to 
handle the records in ShareFetchResponse in the same way.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-28 Thread via GitHub


chia7712 commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1976088168


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   If we agree that librdkafka should handle null records and the bug fix, we 
can simply reinstate the nullable field in the protocol. 
   
   Alternatively, we can announce that version 4.x will never return null 
records, and the FetchResponse will manually convert null records to empty 
record. This approach benefits all third-party client libraries, even those 
libraries don't handle null records.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-28 Thread via GitHub


junrao commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1976078647


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   It seems that 4.0 and trunk have the same path that could lead to null 
records. 
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java#L365



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-28 Thread via GitHub


junrao commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1976015748


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   @chia7712 : You are right that the Errors.TOPIC_AUTHORIZATION_FAILED path 
doesn't introduce null records. We do have a few paths that could generate null 
records. In addition to the two you pointed out, there is another one in the 
uncaught error handling.
   
https://github.com/apache/kafka/blob/3.9/core/src/main/scala/kafka/server/KafkaApis.scala#L175
   
   This eventually calls 
   
https://github.com/apache/kafka/blob/3.9/clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java#L365



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-26 Thread via GitHub


chia7712 commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1972681634


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   >  are you saying that we actually don't end up with null records in that 
case?
   
   A 3.9 server does return null records in extremely rare cases, but we don't 
end up with null records for newer clients as down-conversion does not happen.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-26 Thread via GitHub


ijuma commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1972672957


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   @chia7712 The unsupported compression case is extremely rare because it 
needs a client that does not support zstd and the topic to be configured with 
zstd. Each is rare in isolation (compression is usually defined on the producer 
and zstd has been supported by clients since 2.1) and the combination is even 
more so.
   
   For @junrao's question, @chia7712 are you saying that we actually don't end 
up with null records in that case? I haven't checked it carefully yet.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-26 Thread via GitHub


junrao commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1972659486


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   My point is that before 4.0, the server could already include null records 
for certain errors. So the client already needs to deal with that. If it 
doesn't, it's a bug in the client that needs to be fixed. Once the client is 
fixed, it's not necessary to make the records non-null. We could consider 
changing it to be non-null, but it would be useful to think through if this 
change should be applied consistently to other places such as 
ShareFetchResponse and ProduceRequest.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-26 Thread via GitHub


chia7712 commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1972645207


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   > It seems that before 
https://github.com/apache/kafka/commit/fe56fc98fa736c79c9dcbb1f64f810065161a1cc,
 we already have a bunch of calls like the following that will leave records as 
null in the fetchResponse.
   
   in the normal path, `FetchResponse.recordsOrFail(partitionData)` will return 
empty record to replace null.
   ```
   public static Records recordsOrFail(FetchResponseData.PartitionData 
partition) {
   if (partition.records() == null) return MemoryRecords.EMPTY;
   if (partition.records() instanceof Records) return (Records) 
partition.records();
   throw new ClassCastException("The record type is " + 
partition.records().getClass().getSimpleName() + ", which is not a subtype of " 
+
   Records.class.getSimpleName() + ". This method is only safe to 
call if the `FetchResponse` was deserialized from bytes.");
   }
   ```
   
   
https://github.com/apache/kafka/blob/3.9/core/src/main/scala/kafka/server/KafkaApis.scala#L848
   
   However, your comment inspired me to notice that we do have a path which 
returns null records. See the following links – it could return 
`FetchResponse.partitionResponse(tp, Errors.XXX)` when converting the records.
   
   
https://github.com/apache/kafka/blob/3.9/core/src/main/scala/kafka/server/KafkaApis.scala#L835
   
https://github.com/apache/kafka/blob/3.9/core/src/main/scala/kafka/server/KafkaApis.scala#L884
   
   ```
   val downConvertMagic =
 logConfig.map(_.recordVersion.value).flatMap { magic =>
   if (magic > RecordBatch.MAGIC_VALUE_V0 && versionId <= 1)
 Some(RecordBatch.MAGIC_VALUE_V0)
   else if (magic > RecordBatch.MAGIC_VALUE_V1 && versionId <= 3)
 Some(RecordBatch.MAGIC_VALUE_V1)
   else
 None
 }
   ```
   However, a 4.0 consumer should never receive a null record because it cannot 
use fetch versions v0-v3. Conversely, the server rejects v0-v3 requests, and 
the down-conversion process is also removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-02-26 Thread via GitHub


junrao commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1972525165


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   Sorry for the late reply. It seems that before 
https://github.com/apache/kafka/commit/fe56fc98fa736c79c9dcbb1f64f810065161a1cc,
 we already have a bunch of calls like the following that will leave records as 
null in the fetchResponse.
   
   `  erroneous += topicIdPartition -> 
FetchResponse.partitionResponse(topicIdPartition, 
Errors.TOPIC_AUTHORIZATION_FAILED)`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-29 Thread via GitHub


ijuma commented on PR #18726:
URL: https://github.com/apache/kafka/pull/18726#issuecomment-2621937261

   I added the following to KIP-896, so there is a record of this:
   
   > Finally, we will fix a protocol specification inconsistency: FetchResponse 
 has a records  field that is tagged as nullable  even though it is always 
non-null and all librdkafka-based clients (which cover a large percentage of 
clients in the wild) break if this field is set to null . We adjust the spec to 
match reality - this way implementors do not have to find out about this 
requirement during testing with real clients. This small change is not strictly 
related to this KIP, but it was found during the testing phase of this KIP and 
hence we included it here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-29 Thread via GitHub


ijuma merged PR #18726:
URL: https://github.com/apache/kafka/pull/18726


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-29 Thread via GitHub


ijuma commented on PR #18726:
URL: https://github.com/apache/kafka/pull/18726#issuecomment-2621901656

   Thanks for the reviews! We're getting close to the end of this KIP-896 saga. 
:)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-29 Thread via GitHub


mumrah commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1933988018


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   Since this behavior in librdkafka has existed for a long time, and we have 
not seen an issue with null records prior to the recent changes, I think it is 
safe to say we have never returned null records.
   
   I think fixing the protocol spec makes sense.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-29 Thread via GitHub


chia7712 commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1933658363


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   > Again, let's discuss a concrete use case - you're worried that the Java 
consumer might break because there is an unknown Kafka broker that implements 
this differently?
   
   I have no evidence that there is a broker implementation that returns null 
records. If librdkafka and related clients constitute 30-50% of clients, it is 
acceptable to fix the specification.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


ijuma commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1933347741


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   I disagree, the actual spec here is that this field can never be null. As I 
said, there is no Kafka implementation that doesn't work with librdkafka and 
related clients - it constitutes 30-50% of Kafka clients in the wild.
   
   What happened here is that the spec definition was actually incorrect - it 
specified the field as nullable even though it couldn't be. And that bug meant 
we accidentally broke a large part of the ecosystem. I don't see the benefit in 
not fixing the spec.
   
   Again, let's discuss a concrete use case - you're worried that the Java 
consumer might break because there is an unknown Kafka broker that implements 
this differently?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


ijuma commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1933347741


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   I disagree, the actual spec here is that this field can never be null. As I 
said, there is no Kafka implementation that doesn't work with librdkafka - it 
constitutes 30-50% of Kafka clients in the wild.
   
   What happened here is that the spec definition was actually incorrect - it 
specified the field as nullable even though it couldn't be. And that bug meant 
we accidentally broke a large part of the ecosystem. I don't see the benefit in 
not fixing the spec.
   
   Again, let's discuss a concrete use case - you're worried that the Java 
consumer might break because there is an unknown Kafka broker that implements 
this differently?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


chia7712 commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1933342089


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   > Please elaborate on the risk - what is the exact use case where that would 
happen? I couldn't come up with one.
   
   I believe `FetchResponse.json` should be considered part of the public 
interface. Consequently, modifying the "released spec" carries inherent risks. 
We cannot guarantee that no external implementations adhere to our 
specification. For instance, other server implementations might return null 
records, and after this PR, our 4.0 client would no longer be able to read them.
   
   > Keep in mind that released versions of Kafka never return null records
   
   that is true and it does not violate the spec, right? I mean "apache kafka 
can never return null even though the spec says it is valid to return null"



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


ijuma commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1933326937


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   Keep in mind that released versions of Kafka never returns null records - if 
that ever happened, it would have broken librdkafka. Same for any other 
implementation of the Kafka protocol.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


ijuma commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1933326937


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   Keep in mind that released versions of Kafka never return null records - if 
that ever happened, it would have broken librdkafka. Same for any other 
implementation of the Kafka protocol.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


ijuma commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1933326937


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   Keep in mind that released versions of Kafka never returns null records - if 
that ever happened, it would have broken librdkafka.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


ijuma commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1933326937


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   Keep in mind that Kafka never returns null records - if that ever happened, 
it would have broken librdkafka.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


ijuma commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1933326371


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   Please elaborate on the risk - what is the exact use case where that would 
happen? I couldn't come up with one.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


chia7712 commented on code in PR #18726:
URL: https://github.com/apache/kafka/pull/18726#discussion_r1933316449


##
clients/src/main/resources/common/message/FetchResponse.json:
##
@@ -106,7 +106,7 @@
 ]},
 { "name": "PreferredReadReplica", "type": "int32", "versions": "11+", 
"default": "-1", "ignorable": false, "entityType": "brokerId",
   "about": "The preferred read replica for the consumer to use on its 
next fetch request."},
-{ "name": "Records", "type": "records", "versions": "0+", 
"nullableVersions": "0+", "about": "The record data."}

Review Comment:
   Removing `nullableVersions` would prevent Kafka from "reading" null records. 
This could potentially pose a risk. Therefore, we might consider a strategy 
where Kafka strictly enforces that null records are never "written" while still 
maintaining the ability to "read" them.
   
   In summary, I favor @mumrah's suggestion of adding a null check. 
Furthermore, we should include a comment explaining the rationale behind 
preventing Kafka from writing null records.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


ijuma commented on PR #18726:
URL: https://github.com/apache/kafka/pull/18726#issuecomment-2620376230

   > With this patch it looks like we will only set records in 
processResponseCallback or when building an error response for a partition 
(FetchResponse#partitionResponse)
   
   We only changed this one place because it was the only place where it was 
incorrect outside of tests. This was confirmed after we make `records` 
non-nullable in the schema (only tests had to be updated).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


ijuma commented on PR #18726:
URL: https://github.com/apache/kafka/pull/18726#issuecomment-2620376991

   @chia7712 @mumrah Let me know if this looks good to you now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


ijuma commented on PR #18726:
URL: https://github.com/apache/kafka/pull/18726#issuecomment-2620376663

   The only test failing is:
   
   > ❌ KafkaAdminClientTest > testAdminClientApisAuthenticationFailure()


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


ijuma commented on PR #18726:
URL: https://github.com/apache/kafka/pull/18726#issuecomment-2619484430

   Is the suggestion to remove nullable from this field? I can try that change 
and check what fails.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


chia7712 commented on PR #18726:
URL: https://github.com/apache/kafka/pull/18726#issuecomment-2619440235

   @mumrah Thanks for sharing. You've saved me time :)
   
   > Should we add a null check on records in FetchPartitionData to prevent 
reintroducing this bug?
   
   I completely agree that 4.0 should not burn out third-party libraries. 
However, this seems to conflict with specification `nullableVersions": "0+`. I 
suggest adding a comment to explain the rationale behind the null check.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18646: Null records in fetch response breaks librdkafka [kafka]

2025-01-28 Thread via GitHub


ijuma commented on PR #18726:
URL: https://github.com/apache/kafka/pull/18726#issuecomment-2619186161

   Failures are unrelated to this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org