Hi David,

Yeah, this is kind of difficult.

>From the high level, we either need forward compatibility (old brokers able to 
>read new ApiVersionsRequests) or some kind of challenge/response system.  I 
>think you're on the right track to be thinking about forward compatibility...  
>a challenge/response system would have a lot of overhead in cases where the 
>client opens a lot of new connections.

I agree that we can have the new versions "add stuff at the end" to maintain 
forward compatibility.  Tagged fields will also help here, by avoiding the need 
for version bumps.  We also have to think about the impact this will have on 
the request header.  It seems like we might need to freeze the version of the 
request header forever in order to support full forwards compatibility.  
Otherwise, the old brokers will not know how long the new request headers are.  
This is maybe OK if we can evolve the request header by adding tagged fields...

Another option is to fall all the way back to version 0 when the broker doesn't 
understand the client's supplied version.  Version 0 had no content in the 
request at all, so there is no need for the broker to know exactly how long the 
supplied request header is.  (This assumes that the first 4 fields of the 
request header will never change, which seems like a reasonable assumption...)

best,
Colin


On Tue, Sep 3, 2019, at 23:54, David Jacot wrote:
> Hi Colin,
> 
> Thanks for your input. Please, find my comments below:
> 
> >> Currently, we don't parse the contents of ApiVersionsRequest at all,
> since it's an empty message.  KIP-511 proposes adding some fields here,
> which will clearly change that situation.  In the future, any changes to
> ApiVersionsRequest will have to only add stuff at the end, rather than
> changing fields that already exist.  This will significantly limit our
> ability to add new things over time.
> >> Therefore, I think we should make sure that KIP-511 is implemented after
> KIP-482, so that the version we freeze into place can be one that includes
> the ability to add tagged fields, and includes the more efficient string
> and array serialization specified in KIP-482.  It's probably worth spelling
> that out here.
> 
> I agree with you. It makes sense to bump the version once for the two
> cases.
> 
> >> On another topic, when the client sends an unsupported version N of
> ApiVersionsRequest, and the broker only supports versions up to M, the
> broker currently falls back to sending a version 0 response.  Why can't the
> broker fall back to version M instead?  Previously, we only had versions 1
> and 0 of ApiVersionsRequest, so the two behaviors were identical.  But
> going forward, once we have version 2 and later of ApiVersoinsRequest, it
> doesn't make sense to fall back all the way to 0 if the broker supports 1
> (for example).
> >> If you agree, it would be good to spell this out in the KIP, so that if
> we want to add more things to the response, we can, without losing them
> each time the client's version of ApiVersionsRequest exceeds the broker's.
> 
> I fully agree with you and I have already outlined this in the proposal,
> see "ApiVersions Request/Response Handling". The idea is to use the version
> M in the broker so it can leverage the provided information it knows about.
> The broker can then send back response with version M as well. On the
> client side, it is a bit more tricky. As there is no version in the
> response, the client will use the version used for the request to parse the
> response. If fields have been added to the schema of the response version
> N, it won't work to parse M. As the client doesn't know the version used,
> we have two options: 1) use version 0 which is a prefix of all others but
> it means losing information; or 2) try versions in descending order from N
> to 0. I guess that N-1 would the one in most of the cases.
> 
> What is your opinion regarding the client side?
> 
> Best,
> David
> 
> On Wed, Sep 4, 2019 at 12:13 AM Colin McCabe <cmcc...@apache.org> wrote:
> 
> > Hi David,
> >
> > Thanks again for the KIP.
> >
> > Currently, we don't parse the contents of ApiVersionsRequest at all, since
> > it's an empty message.  KIP-511 proposes adding some fields here, which
> > will clearly change that situation.  In the future, any changes to
> > ApiVersionsRequest will have to only add stuff at the end, rather than
> > changing fields that already exist.  This will significantly limit our
> > ability to add new things over time.
> >
> > Therefore, I think we should make sure that KIP-511 is implemented after
> > KIP-482, so that the version we freeze into place can be one that includes
> > the ability to add tagged fields, and includes the more efficient string
> > and array serialization specified in KIP-482.  It's probably worth spelling
> > that out here.
> >
> > On another topic, when the client sends an unsupported version N of
> > ApiVersionsRequest, and the broker only supports versions up to M, the
> > broker currently falls back to sending a version 0 response.  Why can't the
> > broker fall back to version M instead?  Previously, we only had versions 1
> > and 0 of ApiVersionsRequest, so the two behaviors were identical.  But
> > going forward, once we have version 2 and later of ApiVersoinsRequest, it
> > doesn't make sense to fall back all the way to 0 if the broker supports 1
> > (for example).
> >
> > If you agree, it would be good to spell this out in the KIP, so that if we
> > want to add more things to the response, we can, without losing them each
> > time the client's version of ApiVersionsRequest exceeds the broker's.
> >
> > best,
> > Colin
> >
> >
> > On Tue, Sep 3, 2019, at 01:26, David Jacot wrote:
> > > Hi all,
> > >
> > > I have updated the KIP to address the various comments. I have also
> > added a
> > > section about the handling of the ApiVersionsRequest/Response in the
> > broker.
> > >
> > > Please, let me know what you think. I would like to make it for the next
> > > release if possible.
> > >
> > > Best,
> > > David
> > >
> > > On Fri, Aug 30, 2019 at 10:31 AM David Jacot <dja...@confluent.io>
> > wrote:
> > >
> > > > Hi Magnus,
> > > >
> > > > Thank you for your feedback. Please, find my comments below.
> > > >
> > > > 1. I thought that the clientId was meant for this purpose (providing
> > > > information about the application). Is there a gap I don't see?
> > > >
> > > > 2. I have put two fields to avoid requiring deep validation and
> > parsing on
> > > > the broker. I believe that it will be easier to use the metadata
> > downstream
> > > > like this.
> > > >
> > > > 3. Good point. I think that the broker should have some sort of
> > validation
> > > > and fail the ApiVersionRequest and/or the Connection if the validation
> > > > fails. To avoid backward compatibility issues, I think we should come
> > up
> > > > with a minimal validation and ensure it won't become more restrictive
> > in
> > > > the future.
> > > >
> > > > 4. I don't have strong opinion regarding this one but as the focus of
> > the
> > > > KIP is to gather the client's information, I suggest to discuss/address
> > > > this later on.
> > > >
> > > > Best,
> > > > David
> > > >
> > > > On Thu, Aug 29, 2019 at 6:56 PM Colin McCabe <cmcc...@apache.org>
> > wrote:
> > > >
> > > >> On Fri, Aug 23, 2019, at 00:07, Magnus Edenhill wrote:
> > > >> > Great proposal, this feature is well overdue!
> > > >> >
> > > >> > 1)
> > > >> > From an operator's perspective I don't think the kafka client
> > > >> > implementation name and version are sufficient,
> > > >> > I also believe the application name and version are of interest.
> > > >> > You could have all applications in your cluster run the same kafka
> > > >> client
> > > >> > and version, but only one type or
> > > >> > version of an application misbehave and needing to be tracked down.
> > > >>
> > > >> Hi Magnus,
> > > >>
> > > >> I think it might be better to leave this out of scope for now, and
> > think
> > > >> about it in the context of more generalized request tracing.  This is
> > a
> > > >> very deep rabbit hole, and I could easily see it delaying this KIP
> > for a
> > > >> long time.  For example, if you have multiple Spark jobs producing to
> > > >> Kafka, just knowing that a client is being used by Spark may not be
> > that
> > > >> helpful.  So maybe you want a third set of fields to describe the
> > spark
> > > >> application ID and version, etc?  And then maybe that, itself, was
> > created
> > > >> by some framework... etc. Probably better to defer this discussion
> > for now
> > > >> and see how version tracking works out.
> > > >>
> > > >> >
> > > >> > While the application and client name and version could be combined
> > in
> > > >> the
> > > >> > ClientName/ClientVersion fields by
> > > >> > the user (e.g. like User-Agent), it would not be in a generalized
> > format
> > > >> > and hard for generic monitoring tools to parse correctly.
> > > >> >
> > > >> > So I'd suggest keeping ClientName and ClientVersion as the client
> > > >> > implementation name ("java" or "org.apache.kafka...") and version,
> > > >> > which can't be changed by the user/app developer, and providing two
> > > >> > optional fields for the application counterpart:
> > > >> > ApplicationName and ApplicationVersion, which are backed by
> > > >> corresponding
> > > >> > configuration properties (application.name, application.version).
> > > >> >
> > > >> > 2)
> > > >> > Do ..Name and ..Version need to be two separate fields, seeing how
> > the
> > > >> two
> > > >> > fields are ambigious when separated?
> > > >> > If we're looking to identify unique versions, combining the two
> > fields
> > > >> > would be sufficient (e.g., "java-2.3.1", "librdkafka/1.2.0",
> > > >> "sarama@1.2.3")
> > > >> > and perhaps easier to work with.
> > > >> > The actual format or content of the name-version string is
> > irrelevant as
> > > >> > long as it identifies a unique name+version.
> > > >> >
> > > >>
> > > >> Hmm.  Wouldn't the same arguments you made above about a combined
> > > >> named+version field being "hard for generic monitoring tools to parse
> > > >> correctly" apply here?  In any case, there seems to be no reason not
> > to
> > > >> just have two fields.  It avoids string parsing.
> > > >>
> > > >> >
> > > >> > 3)
> > > >> > As for allowed characters, will the broker fail the
> > ApiVersionResponse
> > > >> if
> > > >> > any of these fields contain invalid characters,
> > > >> > or will the broker sanitize the strings?
> > > >> > For future backwards compatibility (when the broker constraints
> > change
> > > >> but
> > > >> > clients are not updated) I suggest the latter.
> > > >> >
> > > >>
> > > >> I would argue that we should be strict about the characters that we
> > > >> accept, and just close the connection if the string is bad.  There's
> > no
> > > >> reason to let clients troll us with "librdkafka  " (two spaces at the
> > end)
> > > >> or a version string with slashes or control characters in it.  In
> > fact, I
> > > >> would argue we should just allow something like ([\.][a-z][A-Z][0-9])+
> > > >> This ensures that JMX will work well.  We can always loosen the
> > > >> restrictions later if there is a real need.
> > > >>
> > > >> > 4)
> > > >> > And while we're at it, can we add the broker name and version to the
> > > >> > ApiVersionResponse?
> > > >> > While an application must not use this information to detect
> > features
> > > >> (Hi
> > > >> > Jay!), it is good for troubleshooting
> > > >> > and providing more meaningful logs to the client user in case a
> > feature
> > > >> > (based on the actual api versions) is not available.
> > > >>
> > > >> I can think of several cases where people tried to set up client-side
> > > >> hacks based on the broker version, and were only stopped by the fact
> > that
> > > >> we don't expose this information.  I agree with Jay that we should
> > think
> > > >> very carefully before exposing it.  In any case, this seems out of
> > scope...
> > > >>
> > > >> best,
> > > >> Colin
> > > >>
> > > >> >
> > > >> > /Magnus
> > > >> >
> > > >> >
> > > >> > Den tors 22 aug. 2019 kl 10:09 skrev David Jacot <
> > dja...@confluent.io>:
> > > >> >
> > > >> > > Hi Satish,
> > > >> > >
> > > >> > > Thank you for your feedback!
> > > >> > >
> > > >> > > Please find my answers below.
> > > >> > >
> > > >> > > >> Did you consider taking version property by loading
> > > >> > > “kafka/kafka-version.properties” as a resource while java client
> > is
> > > >> > > initialized?  “kafka/kafka-version.properties” is shipped with
> > > >> > > kafka-clients jar.
> > > >> > >
> > > >> > > I wasn't aware of the property file. It is exactly what I need.
> > > >> Thanks for
> > > >> > > pointing that out!
> > > >> > >
> > > >> > > >> I assume this metric value will be the total no of clients
> > > >> connected
> > > >> > > to a broker irrespective of whether name and version follow the
> > > >> > > expected pattern ([-.\w]+) or not.
> > > >> > >
> > > >> > > That is correct.
> > > >> > >
> > > >> > > >> It seems client name and client version are treated as tags for
> > > >> > > `ConnectedClients` metric. If so, you may implement this metric
> > > >> > > similar to `BrokerTopicMetrics` with topic tag as mentioned
> > here[1].
> > > >> > > When is the metric removed for a specific client-name and
> > > >> > > client-version?
> > > >> > >
> > > >> > > That is correct. Client name and version are treated as tags like
> > in
> > > >> > > BrokerTopicMetrics. My plan is to remove the metric when it goes
> > > >> > > back to zero - when all clients with a given name & version are
> > > >> > > disconnected.
> > > >> > >
> > > >> > > Best,
> > > >> > > David
> > > >> > >
> > > >> > > On Wed, Aug 21, 2019 at 6:52 PM Satish Duggana <
> > > >> satish.dugg...@gmail.com>
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Hi David,
> > > >> > > > Thanks for the KIP. I have a couple of questions.
> > > >> > > >
> > > >> > > > >> For the Java client, the idea is to define two constants in
> > the
> > > >> code
> > > >> > > to
> > > >> > > > store its name and its version. If possible, the version will
> > be set
> > > >> > > > automatically based on metadata coming from gradle (or the repo
> > > >> itself)
> > > >> > > to
> > > >> > > > avoid having to do manual changes.
> > > >> > > >
> > > >> > > > Did you consider taking version property by loading
> > > >> > > > “kafka/kafka-version.properties” as a resource while java
> > client is
> > > >> > > > initialized?  “kafka/kafka-version.properties” is shipped with
> > > >> > > > kafka-clients jar.
> > > >> > > >
> > > >> > > > >> kafka.server:type=ClientMetrics,name=ConnectedClients
> > > >> > > > I assume this metric value will be the total no of clients
> > connected
> > > >> > > > to a broker irrespective of whether name and version follow the
> > > >> > > > expected pattern ([-.\w]+) or not.
> > > >> > > >
> > > >> > > > >>
> > > >> > > >
> > > >> > >
> > > >>
> > kafka.server:type=ClientMetrics,name=ConnectedClients,clientname=([-.\w]+),clientversion=([-.\w]+)
> > > >> > > > It seems client name and client version are treated as tags for
> > > >> > > > `ConnectedClients` metric. If so, you may implement this metric
> > > >> > > > similar to `BrokerTopicMetrics` with topic tag as mentioned
> > here[1].
> > > >> > > > When is the metric removed for a specific client-name and
> > > >> > > > client-version?
> > > >> > > >
> > > >> > > > 1.
> > > >> > > >
> > > >> > >
> > > >>
> > https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaRequestHandler.scala#L231
> > > >> > > >
> > > >> > > > Thanks,
> > > >> > > > Satish.
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > >
> > > >> > > > On Wed, Aug 21, 2019 at 5:33 PM David Jacot <
> > dja...@confluent.io>
> > > >> wrote:
> > > >> > > > >
> > > >> > > > > Hi all,
> > > >> > > > >
> > > >> > > > > I would like to start a discussion for KIP-511:
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >>
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-511%3A+Collect+and+Expose+Client%27s+Name+and+Version+in+the+Brokers
> > > >> > > > >
> > > >> > > > > Let me know what you think.
> > > >> > > > >
> > > >> > > > > Best,
> > > >> > > > > David
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > >
> >
>

Reply via email to