[ 
https://issues.apache.org/jira/browse/KAFKA-14935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17716418#comment-17716418
 ] 

Andrew Thaddeus Martin commented on KAFKA-14935:
------------------------------------------------

In `clients/build/resources/main/common/message/ApiVersionsResponse.json`, I've 
found this comment:
{code:java}
  // Version 3 is the first flexible version. Tagged fields are only supported in the body but
  // not in the header. The length of the header must not change in order to guarantee the
  // backward compatibility.{code}
So it looks like there is a carve-out for this one API key. I've looked through 
the source code and have not been able to find where this logic lives. Maybe 
everything using flexibleVersions is on Request Header v2 or Response Header 
v1, except for ApiVersionsResponse, which uses its own custom response 
header type. That seems plausible, but I'm still mostly guessing.
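
A minimal Python sketch of that guess follows. The function names and the `flexible` flag are hypothetical, not taken from the Kafka source. The rule for non-flexible versions (Request Header v1, Response Header v0) is an additional assumption that the comment above does not confirm.

```python
API_VERSIONS = 18  # API key for ApiVersions, as seen in the capture


def request_header_version(api_key: int, flexible: bool) -> int:
    """Guess: flexible request versions use Request Header v2, others v1."""
    return 2 if flexible else 1


def response_header_version(api_key: int, flexible: bool) -> int:
    """Guess: flexible versions use Response Header v1, except ApiVersions."""
    if api_key == API_VERSIONS:
        # Carve-out suggested by the comment in ApiVersionsResponse.json:
        # the header length must not change, so it stays at v0.
        return 0
    return 1 if flexible else 0
```

Under this guess, an ApiVersions v3 request would go out with Request Header v2 and come back with Response Header v0, which matches the capture in the issue description.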

> Wire Protocol Documentation Does Not Explain Header Versioning
> --------------------------------------------------------------
>
>                 Key: KAFKA-14935
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14935
>             Project: Kafka
>          Issue Type: Wish
>          Components: documentation, protocol
>            Reporter: Andrew Thaddeus Martin
>            Priority: Minor
>         Attachments: kafka-versions-exchange.pcap
>
>
> The documentation for Kafka's wire protocol does not explain how an 
> individual implementing a client is able to figure out:
>  # What version of request header to use when sending messages
>  # What version of response header to expect when receiving messages
> I've been working on writing a kafka client library, which is how I came 
> across this problem. Here is the specific situation that surprised me. I took 
> a pcap of the exchange that occurs when using kafka-broker-api-versions.sh to 
> pull version support from a single-node Kafka 3.3.1 cluster. The entire 
> request is:
> {noformat}
>     00 00 00 2b     # Length: 43
>     00 12           # API Key: ApiVersions (18)
>     00 03           # API Version: 3
>     00 00 00 00     # Correlation ID: 0
>     00 07 61 64 6d 69 6e 2d 31    # Client ID: admin-1
>     00              # Tagged fields: None
>     12 61 70 61 63 68 65 2d 6b 61 66 6b 61 2d 6a 61 76 61   # Client Software Name: apache-kafka-java
>     06 33 2e 33 2e 31                                       # Client Software Version: 3.3.1
>     00              # Tagged Fields{noformat}
> From the header, we can see that the request is an ApiVersions request, 
> version 3. But how do we know about the version of the request header? The 
> presence of the null byte (indicating a zero-length tag buffer) tells us that 
> it's the v2 request header:
> {noformat}
>     Request Header v2 => request_api_key request_api_version correlation_id client_id TAG_BUFFER 
>       request_api_key => INT16
>       request_api_version => INT16
>       correlation_id => INT32
>       client_id => NULLABLE_STRING{noformat}
> But how should anyone know that this is the right version of the request 
> header to use? What would happen if I sent it with a v0 or v1 request 
> header (still using a v3 ApiVersions request)? Is this even allowed? Nothing 
> in the payload itself tells us what the version of the request header 
> is, so how was the server able to decode what it received? Maybe the Kafka 
> server uses backtracking to support all of the possible request header 
> versions, but maybe it doesn't. Maybe instead, each recognized pair of 
> api_key+api_version is mapped to a specific request header version. It's not 
> clear without digging into the source code.
> I had originally decided to ignore this issue and proceed by assuming that 
> only the latest versions of request and response headers were ever used. But 
> then the response from kafka for this ApiVersions request began with:
> {noformat}
>      00 00 01 9f    # Length: 415
>      00 00 00 00    # Correlation ID: 0
>      00 00          # Error: No error
>      32             # Length: 50 (number of api_version objects that follow)
>      ...{noformat}
> Surprisingly, we get a v0 response header (an old version!). Here's the 
> difference between v0 and v1:
> {noformat}
>     Response Header v0 => correlation_id 
>       correlation_id => INT32
>     Response Header v1 => correlation_id TAG_BUFFER 
>       correlation_id => INT32{noformat}
> We do not see a null byte for an empty tag buffer, so we know this is v0. As 
> someone trying to implement a client, this was surprising to me. And on the 
> receiving end, it's no longer a "let the server figure it out with 
> heuristics" problem. The client has to be able to figure this out. How? 
> Backtracking? Some kind of implied mapping from api versions to response 
> versions?
> I want to understand how a client is expected to behave. I assume that over 
> the years people have been rediscovering whatever the answer is by reading 
> the source code and taking pcaps, but I'd like to see it spelled out plainly 
> in the documentation. Then all future client implementers can benefit from 
> this.
>  
> (I've attached the full pcap in case anyone wants to look through it.)
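
To check the byte-level reading of the capture quoted above, here is a minimal Python sketch that parses the request with the Request Header v2 layout and the start of the response with the Response Header v0 layout. The function names are hypothetical. The client ID length is taken to be the two-byte INT16 `00 07` that NULLABLE_STRING requires, which is also what makes the payload add up to the stated length of 43. Only an empty tag buffer is handled.

```python
import struct

# Request bytes from the capture (after the 4-byte length prefix).
REQUEST = bytes.fromhex(
    "0012" "0003" "00000000"                  # api key, api version, correlation id
    "0007" "61646d696e2d31"                   # client_id: INT16 length + "admin-1"
    "00"                                      # header tag buffer (empty)
    "12" "6170616368652d6b61666b612d6a617661" # client software name
    "06" "332e332e31"                         # client software version
    "00"                                      # body tag buffer (empty)
)


def parse_request_header_v2(buf: bytes):
    """Parse Request Header v2; returns (fields, offset of the request body)."""
    api_key, api_version, correlation_id, id_len = struct.unpack_from(">hhih", buf, 0)
    pos = 10
    client_id = None
    if id_len >= 0:  # a length of -1 encodes a null client_id
        client_id = buf[pos:pos + id_len].decode("utf-8")
        pos += id_len
    # TAG_BUFFER: only the empty case (a single 0x00 count byte) is handled.
    if buf[pos] != 0:
        raise ValueError("non-empty tag buffer not handled in this sketch")
    return (api_key, api_version, correlation_id, client_id), pos + 1


def parse_response_header_v0(buf: bytes):
    """Parse Response Header v0: just the INT32 correlation_id."""
    (correlation_id,) = struct.unpack_from(">i", buf, 0)
    return correlation_id, 4
```

Calling `parse_request_header_v2(REQUEST)` yields the fields `(18, 3, 0, "admin-1")` and a body offset of 18, so the body begins at the `12` length byte of the client software name. Parsing the response with the v0 layout leaves the error code `00 00` as the first two bytes of the body, as in the annotated capture.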



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
