Hi Magnus, Thank you for your feedback. Please, find my comments below.
1. I thought that the clientId was meant for this purpose (providing information about the application). Is there a gap I don't see? 2. I have put two fields to avoid requiring deep validation and parsing on the broker. I believe that it will be easier to use the metadata downstream like this. 3. Good point. I think that the broker should have some sort of validation and fail the ApiVersionRequest and/or the Connection if the validation fails. To avoid backward compatibility issues, I think we should come up with a minimal validation and ensure it won't become more restrictive in the future. 4. I don't have strong opinion regarding this one but as the focus of the KIP is to gather the client's information, I suggest to discuss/address this later on. Best, David On Thu, Aug 29, 2019 at 6:56 PM Colin McCabe <cmcc...@apache.org> wrote: > On Fri, Aug 23, 2019, at 00:07, Magnus Edenhill wrote: > > Great proposal, this feature is well overdue! > > > > 1) > > From an operator's perspective I don't think the kafka client > > implementation name and version are sufficient, > > I also believe the application name and version are of interest. > > You could have all applications in your cluster run the same kafka client > > and version, but only one type or > > version of an application misbehave and needing to be tracked down. > > Hi Magnus, > > I think it might be better to leave this out of scope for now, and think > about it in the context of more generalized request tracing. This is a > very deep rabbit hole, and I could easily see it delaying this KIP for a > long time. For example, if you have multiple Spark jobs producing to > Kafka, just knowing that a client is being used by Spark may not be that > helpful. So maybe you want a third set of fields to describe the spark > application ID and version, etc? And then maybe that, itself, was created > by some framework... etc. Probably better to defer this discussion for now > and see how version tracking works out. > > > > > While the application and client name and version could be combined in > the > > ClientName/ClientVersion fields by > > the user (e.g. like User-Agent), it would not be in a generalized format > > and hard for generic monitoring tools to parse correctly. > > > > So I'd suggest keeping ClientName and ClientVersion as the client > > implementation name ("java" or "org.apache.kafka...") and version, > > which can't be changed by the user/app developer, and providing two > > optional fields for the application counterpart: > > ApplicationName and ApplicationVersion, which are backed by corresponding > > configuration properties (application.name, application.version). > > > > 2) > > Do ..Name and ..Version need to be two separate fields, seeing how the > two > > fields are ambigious when separated? > > If we're looking to identify unique versions, combining the two fields > > would be sufficient (e.g., "java-2.3.1", "librdkafka/1.2.0", > "sarama@1.2.3") > > and perhaps easier to work with. > > The actual format or content of the name-version string is irrelevant as > > long as it identifies a unique name+version. > > > > Hmm. Wouldn't the same arguments you made above about a combined > named+version field being "hard for generic monitoring tools to parse > correctly" apply here? In any case, there seems to be no reason not to > just have two fields. It avoids string parsing. > > > > > 3) > > As for allowed characters, will the broker fail the ApiVersionResponse if > > any of these fields contain invalid characters, > > or will the broker sanitize the strings? > > For future backwards compatibility (when the broker constraints change > but > > clients are not updated) I suggest the latter. > > > > I would argue that we should be strict about the characters that we > accept, and just close the connection if the string is bad. There's no > reason to let clients troll us with "librdkafka " (two spaces at the end) > or a version string with slashes or control characters in it. In fact, I > would argue we should just allow something like ([\.][a-z][A-Z][0-9])+ > This ensures that JMX will work well. We can always loosen the > restrictions later if there is a real need. > > > 4) > > And while we're at it, can we add the broker name and version to the > > ApiVersionResponse? > > While an application must not use this information to detect features (Hi > > Jay!), it is good for troubleshooting > > and providing more meaningful logs to the client user in case a feature > > (based on the actual api versions) is not available. > > I can think of several cases where people tried to set up client-side > hacks based on the broker version, and were only stopped by the fact that > we don't expose this information. I agree with Jay that we should think > very carefully before exposing it. In any case, this seems out of scope... > > best, > Colin > > > > > /Magnus > > > > > > Den tors 22 aug. 2019 kl 10:09 skrev David Jacot <dja...@confluent.io>: > > > > > Hi Satish, > > > > > > Thank you for your feedback! > > > > > > Please find my answers below. > > > > > > >> Did you consider taking version property by loading > > > “kafka/kafka-version.properties” as a resource while java client is > > > initialized? “kafka/kafka-version.properties” is shipped with > > > kafka-clients jar. > > > > > > I wasn't aware of the property file. It is exactly what I need. Thanks > for > > > pointing that out! > > > > > > >> I assume this metric value will be the total no of clients connected > > > to a broker irrespective of whether name and version follow the > > > expected pattern ([-.\w]+) or not. > > > > > > That is correct. > > > > > > >> It seems client name and client version are treated as tags for > > > `ConnectedClients` metric. If so, you may implement this metric > > > similar to `BrokerTopicMetrics` with topic tag as mentioned here[1]. > > > When is the metric removed for a specific client-name and > > > client-version? > > > > > > That is correct. Client name and version are treated as tags like in > > > BrokerTopicMetrics. My plan is to remove the metric when it goes > > > back to zero - when all clients with a given name & version are > > > disconnected. > > > > > > Best, > > > David > > > > > > On Wed, Aug 21, 2019 at 6:52 PM Satish Duggana < > satish.dugg...@gmail.com> > > > wrote: > > > > > > > Hi David, > > > > Thanks for the KIP. I have a couple of questions. > > > > > > > > >> For the Java client, the idea is to define two constants in the > code > > > to > > > > store its name and its version. If possible, the version will be set > > > > automatically based on metadata coming from gradle (or the repo > itself) > > > to > > > > avoid having to do manual changes. > > > > > > > > Did you consider taking version property by loading > > > > “kafka/kafka-version.properties” as a resource while java client is > > > > initialized? “kafka/kafka-version.properties” is shipped with > > > > kafka-clients jar. > > > > > > > > >> kafka.server:type=ClientMetrics,name=ConnectedClients > > > > I assume this metric value will be the total no of clients connected > > > > to a broker irrespective of whether name and version follow the > > > > expected pattern ([-.\w]+) or not. > > > > > > > > >> > > > > > > > > kafka.server:type=ClientMetrics,name=ConnectedClients,clientname=([-.\w]+),clientversion=([-.\w]+) > > > > It seems client name and client version are treated as tags for > > > > `ConnectedClients` metric. If so, you may implement this metric > > > > similar to `BrokerTopicMetrics` with topic tag as mentioned here[1]. > > > > When is the metric removed for a specific client-name and > > > > client-version? > > > > > > > > 1. > > > > > > > > https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaRequestHandler.scala#L231 > > > > > > > > Thanks, > > > > Satish. > > > > > > > > > > > > > > > > > > > > On Wed, Aug 21, 2019 at 5:33 PM David Jacot <dja...@confluent.io> > wrote: > > > > > > > > > > Hi all, > > > > > > > > > > I would like to start a discussion for KIP-511: > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-511%3A+Collect+and+Expose+Client%27s+Name+and+Version+in+the+Brokers > > > > > > > > > > Let me know what you think. > > > > > > > > > > Best, > > > > > David > > > > > > > > > >