Hi David, Thanks again for the KIP.
Currently, we don't parse the contents of ApiVersionsRequest at all, since it's an empty message. KIP-511 proposes adding some fields here, which will clearly change that situation. In the future, any changes to ApiVersionsRequest will have to only add stuff at the end, rather than changing fields that already exist. This will significantly limit our ability to add new things over time. Therefore, I think we should make sure that KIP-511 is implemented after KIP-482, so that the version we freeze into place can be one that includes the ability to add tagged fields, and includes the more efficient string and array serialization specified in KIP-482. It's probably worth spelling that out here. On another topic, when the client sends an unsupported version N of ApiVersionsRequest, and the broker only supports versions up to M, the broker currently falls back to sending a version 0 response. Why can't the broker fall back to version M instead? Previously, we only had versions 1 and 0 of ApiVersionsRequest, so the two behaviors were identical. But going forward, once we have version 2 and later of ApiVersoinsRequest, it doesn't make sense to fall back all the way to 0 if the broker supports 1 (for example). If you agree, it would be good to spell this out in the KIP, so that if we want to add more things to the response, we can, without losing them each time the client's version of ApiVersionsRequest exceeds the broker's. best, Colin On Tue, Sep 3, 2019, at 01:26, David Jacot wrote: > Hi all, > > I have updated the KIP to address the various comments. I have also added a > section about the handling of the ApiVersionsRequest/Response in the broker. > > Please, let me know what you think. I would like to make it for the next > release if possible. > > Best, > David > > On Fri, Aug 30, 2019 at 10:31 AM David Jacot <dja...@confluent.io> wrote: > > > Hi Magnus, > > > > Thank you for your feedback. Please, find my comments below. > > > > 1. I thought that the clientId was meant for this purpose (providing > > information about the application). Is there a gap I don't see? > > > > 2. I have put two fields to avoid requiring deep validation and parsing on > > the broker. I believe that it will be easier to use the metadata downstream > > like this. > > > > 3. Good point. I think that the broker should have some sort of validation > > and fail the ApiVersionRequest and/or the Connection if the validation > > fails. To avoid backward compatibility issues, I think we should come up > > with a minimal validation and ensure it won't become more restrictive in > > the future. > > > > 4. I don't have strong opinion regarding this one but as the focus of the > > KIP is to gather the client's information, I suggest to discuss/address > > this later on. > > > > Best, > > David > > > > On Thu, Aug 29, 2019 at 6:56 PM Colin McCabe <cmcc...@apache.org> wrote: > > > >> On Fri, Aug 23, 2019, at 00:07, Magnus Edenhill wrote: > >> > Great proposal, this feature is well overdue! > >> > > >> > 1) > >> > From an operator's perspective I don't think the kafka client > >> > implementation name and version are sufficient, > >> > I also believe the application name and version are of interest. > >> > You could have all applications in your cluster run the same kafka > >> client > >> > and version, but only one type or > >> > version of an application misbehave and needing to be tracked down. > >> > >> Hi Magnus, > >> > >> I think it might be better to leave this out of scope for now, and think > >> about it in the context of more generalized request tracing. This is a > >> very deep rabbit hole, and I could easily see it delaying this KIP for a > >> long time. For example, if you have multiple Spark jobs producing to > >> Kafka, just knowing that a client is being used by Spark may not be that > >> helpful. So maybe you want a third set of fields to describe the spark > >> application ID and version, etc? And then maybe that, itself, was created > >> by some framework... etc. Probably better to defer this discussion for now > >> and see how version tracking works out. > >> > >> > > >> > While the application and client name and version could be combined in > >> the > >> > ClientName/ClientVersion fields by > >> > the user (e.g. like User-Agent), it would not be in a generalized format > >> > and hard for generic monitoring tools to parse correctly. > >> > > >> > So I'd suggest keeping ClientName and ClientVersion as the client > >> > implementation name ("java" or "org.apache.kafka...") and version, > >> > which can't be changed by the user/app developer, and providing two > >> > optional fields for the application counterpart: > >> > ApplicationName and ApplicationVersion, which are backed by > >> corresponding > >> > configuration properties (application.name, application.version). > >> > > >> > 2) > >> > Do ..Name and ..Version need to be two separate fields, seeing how the > >> two > >> > fields are ambigious when separated? > >> > If we're looking to identify unique versions, combining the two fields > >> > would be sufficient (e.g., "java-2.3.1", "librdkafka/1.2.0", > >> "sarama@1.2.3") > >> > and perhaps easier to work with. > >> > The actual format or content of the name-version string is irrelevant as > >> > long as it identifies a unique name+version. > >> > > >> > >> Hmm. Wouldn't the same arguments you made above about a combined > >> named+version field being "hard for generic monitoring tools to parse > >> correctly" apply here? In any case, there seems to be no reason not to > >> just have two fields. It avoids string parsing. > >> > >> > > >> > 3) > >> > As for allowed characters, will the broker fail the ApiVersionResponse > >> if > >> > any of these fields contain invalid characters, > >> > or will the broker sanitize the strings? > >> > For future backwards compatibility (when the broker constraints change > >> but > >> > clients are not updated) I suggest the latter. > >> > > >> > >> I would argue that we should be strict about the characters that we > >> accept, and just close the connection if the string is bad. There's no > >> reason to let clients troll us with "librdkafka " (two spaces at the end) > >> or a version string with slashes or control characters in it. In fact, I > >> would argue we should just allow something like ([\.][a-z][A-Z][0-9])+ > >> This ensures that JMX will work well. We can always loosen the > >> restrictions later if there is a real need. > >> > >> > 4) > >> > And while we're at it, can we add the broker name and version to the > >> > ApiVersionResponse? > >> > While an application must not use this information to detect features > >> (Hi > >> > Jay!), it is good for troubleshooting > >> > and providing more meaningful logs to the client user in case a feature > >> > (based on the actual api versions) is not available. > >> > >> I can think of several cases where people tried to set up client-side > >> hacks based on the broker version, and were only stopped by the fact that > >> we don't expose this information. I agree with Jay that we should think > >> very carefully before exposing it. In any case, this seems out of scope... > >> > >> best, > >> Colin > >> > >> > > >> > /Magnus > >> > > >> > > >> > Den tors 22 aug. 2019 kl 10:09 skrev David Jacot <dja...@confluent.io>: > >> > > >> > > Hi Satish, > >> > > > >> > > Thank you for your feedback! > >> > > > >> > > Please find my answers below. > >> > > > >> > > >> Did you consider taking version property by loading > >> > > “kafka/kafka-version.properties” as a resource while java client is > >> > > initialized? “kafka/kafka-version.properties” is shipped with > >> > > kafka-clients jar. > >> > > > >> > > I wasn't aware of the property file. It is exactly what I need. > >> Thanks for > >> > > pointing that out! > >> > > > >> > > >> I assume this metric value will be the total no of clients > >> connected > >> > > to a broker irrespective of whether name and version follow the > >> > > expected pattern ([-.\w]+) or not. > >> > > > >> > > That is correct. > >> > > > >> > > >> It seems client name and client version are treated as tags for > >> > > `ConnectedClients` metric. If so, you may implement this metric > >> > > similar to `BrokerTopicMetrics` with topic tag as mentioned here[1]. > >> > > When is the metric removed for a specific client-name and > >> > > client-version? > >> > > > >> > > That is correct. Client name and version are treated as tags like in > >> > > BrokerTopicMetrics. My plan is to remove the metric when it goes > >> > > back to zero - when all clients with a given name & version are > >> > > disconnected. > >> > > > >> > > Best, > >> > > David > >> > > > >> > > On Wed, Aug 21, 2019 at 6:52 PM Satish Duggana < > >> satish.dugg...@gmail.com> > >> > > wrote: > >> > > > >> > > > Hi David, > >> > > > Thanks for the KIP. I have a couple of questions. > >> > > > > >> > > > >> For the Java client, the idea is to define two constants in the > >> code > >> > > to > >> > > > store its name and its version. If possible, the version will be set > >> > > > automatically based on metadata coming from gradle (or the repo > >> itself) > >> > > to > >> > > > avoid having to do manual changes. > >> > > > > >> > > > Did you consider taking version property by loading > >> > > > “kafka/kafka-version.properties” as a resource while java client is > >> > > > initialized? “kafka/kafka-version.properties” is shipped with > >> > > > kafka-clients jar. > >> > > > > >> > > > >> kafka.server:type=ClientMetrics,name=ConnectedClients > >> > > > I assume this metric value will be the total no of clients connected > >> > > > to a broker irrespective of whether name and version follow the > >> > > > expected pattern ([-.\w]+) or not. > >> > > > > >> > > > >> > >> > > > > >> > > > >> kafka.server:type=ClientMetrics,name=ConnectedClients,clientname=([-.\w]+),clientversion=([-.\w]+) > >> > > > It seems client name and client version are treated as tags for > >> > > > `ConnectedClients` metric. If so, you may implement this metric > >> > > > similar to `BrokerTopicMetrics` with topic tag as mentioned here[1]. > >> > > > When is the metric removed for a specific client-name and > >> > > > client-version? > >> > > > > >> > > > 1. > >> > > > > >> > > > >> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaRequestHandler.scala#L231 > >> > > > > >> > > > Thanks, > >> > > > Satish. > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > On Wed, Aug 21, 2019 at 5:33 PM David Jacot <dja...@confluent.io> > >> wrote: > >> > > > > > >> > > > > Hi all, > >> > > > > > >> > > > > I would like to start a discussion for KIP-511: > >> > > > > > >> > > > > >> > > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-511%3A+Collect+and+Expose+Client%27s+Name+and+Version+in+the+Brokers > >> > > > > > >> > > > > Let me know what you think. > >> > > > > > >> > > > > Best, > >> > > > > David > >> > > > > >> > > > >> > > >> > > >