Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-10-29 Thread Ted Yu
Can the Discussion Thread link be filled out ?

Cheers

On Wed, Sep 27, 2017 at 2:03 AM, Mickael Maison 
wrote:

> I don't know the history either, I quickly scanned the KIP-55 threads
> and couldn't see it being discussed.
>
> Anyway, your suggestion sounds good to me, are you planning to do that
> as part of KIP-196 or should I create a new JIRA ?
>
> On Wed, Sep 27, 2017 at 3:20 AM, Ewen Cheslack-Postava
>  wrote:
> > It's a fair point that it would undo that sanitization. It's possible
> that
> > for compatibility reasons, doing so would require a bit more work and
> care
> > (e.g. supporting both sanitized and unsanitized for awhile so users have
> a
> > chance to migrate). But I guess my point is that I view the location
> where
> > sanitization is happening as more of a bug, and I actually haven't
> followed
> > the whole history of that discussion so I'm not entirely sure whether it
> > was intentional or just sort of happened along with the sanitization
> > required for generating ZK paths.
> >
> > Anyway, I think we shouldn't choose to cripple all metrics just because
> the
> > metrics involving those user principals weren't handled ideally. If
> you're
> > open to this, we could always make the change to the code affected by
> this
> > KIP, leave the user principal metrics sanitized as they are today, and
> then
> > follow up with another KIP to get all the metrics unsanitized when they
> hit
> > the metrics reporter, but for Jmx sanitized as they are today when
> > characters are encountered that Jmx cannot support.
> >
> > -Ewen
> >
> > On Tue, Sep 26, 2017 at 2:24 AM, Mickael Maison <
> mickael.mai...@gmail.com>
> > wrote:
> >
> >> Hi Ewen,
> >>
> >> By consistency, I meant having all fields sanitized the same way we
> >> were previously doing for user principal.
> >>
> >> But re-reading your previous email, I'm guessing you meant to also
> >> remove the current user principal sanitization from the metrics (only
> >> use that internally for ZK) and have all metrics use the actual
> >> values. If so, yes that's an option.
> >>
> >> On Mon, Sep 25, 2017 at 5:58 PM, Ewen Cheslack-Postava
> >>  wrote:
> >> > On Mon, Sep 25, 2017 at 3:35 AM, Mickael Maison <
> >> mickael.mai...@gmail.com>
> >> > wrote:
> >> >
> >> >> Hi Ewen,
> >> >>
> >> >> I understand your point of view and ideally we'd have one convention
> >> >> for handling all user provided strings. This KIP reused the
> >> >> sanitization mechanism we had in place for user principals.
> >> >>
> >> >> I think both ways have pros and cons but what I like about early
> >> >> sanitization (as is currently) is that it's consistent. While
> >> >> monitoring tools have to know about the sanitization, all metrics are
> >> >> sanitized the same way before being passed to reporters and that
> >> >> includes all "fields" in the metric name (client-id, user).
> >> >>
> >> >
> >> > How is just passing the actual values to the reporters any less
> >> consistent?
> >> > The "sanitization" process that was in place was really only for
> internal
> >> > purposes, right? i.e. so that we'd have a path for ZK that ZK could
> >> handle?
> >> >
> >> > I think the real question is why JmxReporter is being special-cased?
> >> >
> >> >
> >> >>
> >> >> Moving the sanitization into JMXReporter is a publicly visible change
> >> >> as it would affect the metrics we pass to other reporters.
> >> >>
> >> >
> >> > How would this be any more publicly visible than the change already
> being
> >> > made? In fact, since we haven't really specified what reporters should
> >> > accept, if anything the change to the sanitized strings is more of a
> >> > publicly visible change (you need to understand exactly what
> >> transformation
> >> > is being applied) than the change I am suggesting (which could be
> >> > considered a bug fix that now just fixes support for certain client
> IDs
> >> and
> >> > only affects JMX metric names because of JMX limitations).
> >> >
> >> > -Ewen
> >> >
> >> >
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> On Fri, Sep 22, 2017 at 8:53 PM, Ewen Cheslack-Postava
> >> >>  wrote:
> >> >> > Hi all,
> >> >> >
> >> >> > In working on the patch for KIP-196: Add metrics to Kafka Connect
> >> >> > framework, we realized that we have worker and connector/task IDs
> that
> >> >> are
> >> >> > to be included in metrics and those don't currently have
> constraints
> >> on
> >> >> > naming. I'd prefer to avoid adding naming restrictions or mangling
> >> names
> >> >> > unnecessarily, and for users that define a custom metrics reporter
> the
> >> >> name
> >> >> > mangling may be unexpected since their metrics system may not have
> the
> >> >> same
> >> >> > limitations as JMX.
> >> >> >
> >> >> > The text of the KIP is pretty JMX specific and doesn't really
> define
> >> >> where
> >> >> > this mangling happens. Currently, it is being applied essentially
> as
> >> >> early
> >> >> > as 

Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-27 Thread Mickael Maison
I don't know the history either, I quickly scanned the KIP-55 threads
and couldn't see it being discussed.

Anyway, your suggestion sounds good to me, are you planning to do that
as part of KIP-196 or should I create a new JIRA ?

On Wed, Sep 27, 2017 at 3:20 AM, Ewen Cheslack-Postava
 wrote:
> It's a fair point that it would undo that sanitization. It's possible that
> for compatibility reasons, doing so would require a bit more work and care
> (e.g. supporting both sanitized and unsanitized for awhile so users have a
> chance to migrate). But I guess my point is that I view the location where
> sanitization is happening as more of a bug, and I actually haven't followed
> the whole history of that discussion so I'm not entirely sure whether it
> was intentional or just sort of happened along with the sanitization
> required for generating ZK paths.
>
> Anyway, I think we shouldn't choose to cripple all metrics just because the
> metrics involving those user principals weren't handled ideally. If you're
> open to this, we could always make the change to the code affected by this
> KIP, leave the user principal metrics sanitized as they are today, and then
> follow up with another KIP to get all the metrics unsanitized when they hit
> the metrics reporter, but for Jmx sanitized as they are today when
> characters are encountered that Jmx cannot support.
>
> -Ewen
>
> On Tue, Sep 26, 2017 at 2:24 AM, Mickael Maison 
> wrote:
>
>> Hi Ewen,
>>
>> By consistency, I meant having all fields sanitized the same way we
>> were previously doing for user principal.
>>
>> But re-reading your previous email, I'm guessing you meant to also
>> remove the current user principal sanitization from the metrics (only
>> use that internally for ZK) and have all metrics use the actual
>> values. If so, yes that's an option.
>>
>> On Mon, Sep 25, 2017 at 5:58 PM, Ewen Cheslack-Postava
>>  wrote:
>> > On Mon, Sep 25, 2017 at 3:35 AM, Mickael Maison <
>> mickael.mai...@gmail.com>
>> > wrote:
>> >
>> >> Hi Ewen,
>> >>
>> >> I understand your point of view and ideally we'd have one convention
>> >> for handling all user provided strings. This KIP reused the
>> >> sanitization mechanism we had in place for user principals.
>> >>
>> >> I think both ways have pros and cons but what I like about early
>> >> sanitization (as is currently) is that it's consistent. While
>> >> monitoring tools have to know about the sanitization, all metrics are
>> >> sanitized the same way before being passed to reporters and that
>> >> includes all "fields" in the metric name (client-id, user).
>> >>
>> >
>> > How is just passing the actual values to the reporters any less
>> consistent?
>> > The "sanitization" process that was in place was really only for internal
>> > purposes, right? i.e. so that we'd have a path for ZK that ZK could
>> handle?
>> >
>> > I think the real question is why JmxReporter is being special-cased?
>> >
>> >
>> >>
>> >> Moving the sanitization into JMXReporter is a publicly visible change
>> >> as it would affect the metrics we pass to other reporters.
>> >>
>> >
>> > How would this be any more publicly visible than the change already being
>> > made? In fact, since we haven't really specified what reporters should
>> > accept, if anything the change to the sanitized strings is more of a
>> > publicly visible change (you need to understand exactly what
>> transformation
>> > is being applied) than the change I am suggesting (which could be
>> > considered a bug fix that now just fixes support for certain client IDs
>> and
>> > only affects JMX metric names because of JMX limitations).
>> >
>> > -Ewen
>> >
>> >
>> >>
>> >>
>> >>
>> >>
>> >> On Fri, Sep 22, 2017 at 8:53 PM, Ewen Cheslack-Postava
>> >>  wrote:
>> >> > Hi all,
>> >> >
>> >> > In working on the patch for KIP-196: Add metrics to Kafka Connect
>> >> > framework, we realized that we have worker and connector/task IDs that
>> >> are
>> >> > to be included in metrics and those don't currently have constraints
>> on
>> >> > naming. I'd prefer to avoid adding naming restrictions or mangling
>> names
>> >> > unnecessarily, and for users that define a custom metrics reporter the
>> >> name
>> >> > mangling may be unexpected since their metrics system may not have the
>> >> same
>> >> > limitations as JMX.
>> >> >
>> >> > The text of the KIP is pretty JMX specific and doesn't really define
>> >> where
>> >> > this mangling happens. Currently, it is being applied essentially as
>> >> early
>> >> > as possible. I would like to propose moving the name mangling into the
>> >> > JmxReporter itself so the only impact is on JMX metrics, but other
>> >> metrics
>> >> > reporters would see the original. In other words, leave
>> system-specific
>> >> > name mangling up to that system.
>> >> >
>> >> > In the JmxReporter, the mangling could remain the same (though I think
>> >> the
>> >> > mangling for 

Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-26 Thread Ewen Cheslack-Postava
It's a fair point that it would undo that sanitization. It's possible that
for compatibility reasons, doing so would require a bit more work and care
(e.g. supporting both sanitized and unsanitized for awhile so users have a
chance to migrate). But I guess my point is that I view the location where
sanitization is happening as more of a bug, and I actually haven't followed
the whole history of that discussion so I'm not entirely sure whether it
was intentional or just sort of happened along with the sanitization
required for generating ZK paths.

Anyway, I think we shouldn't choose to cripple all metrics just because the
metrics involving those user principals weren't handled ideally. If you're
open to this, we could always make the change to the code affected by this
KIP, leave the user principal metrics sanitized as they are today, and then
follow up with another KIP to get all the metrics unsanitized when they hit
the metrics reporter, but for Jmx sanitized as they are today when
characters are encountered that Jmx cannot support.

-Ewen

On Tue, Sep 26, 2017 at 2:24 AM, Mickael Maison 
wrote:

> Hi Ewen,
>
> By consistency, I meant having all fields sanitized the same way we
> were previously doing for user principal.
>
> But re-reading your previous email, I'm guessing you meant to also
> remove the current user principal sanitization from the metrics (only
> use that internally for ZK) and have all metrics use the actual
> values. If so, yes that's an option.
>
> On Mon, Sep 25, 2017 at 5:58 PM, Ewen Cheslack-Postava
>  wrote:
> > On Mon, Sep 25, 2017 at 3:35 AM, Mickael Maison <
> mickael.mai...@gmail.com>
> > wrote:
> >
> >> Hi Ewen,
> >>
> >> I understand your point of view and ideally we'd have one convention
> >> for handling all user provided strings. This KIP reused the
> >> sanitization mechanism we had in place for user principals.
> >>
> >> I think both ways have pros and cons but what I like about early
> >> sanitization (as is currently) is that it's consistent. While
> >> monitoring tools have to know about the sanitization, all metrics are
> >> sanitized the same way before being passed to reporters and that
> >> includes all "fields" in the metric name (client-id, user).
> >>
> >
> > How is just passing the actual values to the reporters any less
> consistent?
> > The "sanitization" process that was in place was really only for internal
> > purposes, right? i.e. so that we'd have a path for ZK that ZK could
> handle?
> >
> > I think the real question is why JmxReporter is being special-cased?
> >
> >
> >>
> >> Moving the sanitization into JMXReporter is a publicly visible change
> >> as it would affect the metrics we pass to other reporters.
> >>
> >
> > How would this be any more publicly visible than the change already being
> > made? In fact, since we haven't really specified what reporters should
> > accept, if anything the change to the sanitized strings is more of a
> > publicly visible change (you need to understand exactly what
> transformation
> > is being applied) than the change I am suggesting (which could be
> > considered a bug fix that now just fixes support for certain client IDs
> and
> > only affects JMX metric names because of JMX limitations).
> >
> > -Ewen
> >
> >
> >>
> >>
> >>
> >>
> >> On Fri, Sep 22, 2017 at 8:53 PM, Ewen Cheslack-Postava
> >>  wrote:
> >> > Hi all,
> >> >
> >> > In working on the patch for KIP-196: Add metrics to Kafka Connect
> >> > framework, we realized that we have worker and connector/task IDs that
> >> are
> >> > to be included in metrics and those don't currently have constraints
> on
> >> > naming. I'd prefer to avoid adding naming restrictions or mangling
> names
> >> > unnecessarily, and for users that define a custom metrics reporter the
> >> name
> >> > mangling may be unexpected since their metrics system may not have the
> >> same
> >> > limitations as JMX.
> >> >
> >> > The text of the KIP is pretty JMX specific and doesn't really define
> >> where
> >> > this mangling happens. Currently, it is being applied essentially as
> >> early
> >> > as possible. I would like to propose moving the name mangling into the
> >> > JmxReporter itself so the only impact is on JMX metrics, but other
> >> metrics
> >> > reporters would see the original. In other words, leave
> system-specific
> >> > name mangling up to that system.
> >> >
> >> > In the JmxReporter, the mangling could remain the same (though I think
> >> the
> >> > mangling for principals is an internal implementation detail, whereas
> >> this
> >> > name mangling is user-visible). The quota metrics on the broker would
> now
> >> > be inconsistent with the others, but I think trying to be less
> >> JMX-specific
> >> > given that we support pluggable reporters is the right direction to
> go.
> >> >
> >> > I think in practice this has no publicly visible impact and
> definitely no
> >> > compatibility concerns, it just 

Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-26 Thread Mickael Maison
Hi Ewen,

By consistency, I meant having all fields sanitized the same way we
were previously doing for user principal.

But re-reading your previous email, I'm guessing you meant to also
remove the current user principal sanitization from the metrics (only
use that internally for ZK) and have all metrics use the actual
values. If so, yes that's an option.

On Mon, Sep 25, 2017 at 5:58 PM, Ewen Cheslack-Postava
 wrote:
> On Mon, Sep 25, 2017 at 3:35 AM, Mickael Maison 
> wrote:
>
>> Hi Ewen,
>>
>> I understand your point of view and ideally we'd have one convention
>> for handling all user provided strings. This KIP reused the
>> sanitization mechanism we had in place for user principals.
>>
>> I think both ways have pros and cons but what I like about early
>> sanitization (as is currently) is that it's consistent. While
>> monitoring tools have to know about the sanitization, all metrics are
>> sanitized the same way before being passed to reporters and that
>> includes all "fields" in the metric name (client-id, user).
>>
>
> How is just passing the actual values to the reporters any less consistent?
> The "sanitization" process that was in place was really only for internal
> purposes, right? i.e. so that we'd have a path for ZK that ZK could handle?
>
> I think the real question is why JmxReporter is being special-cased?
>
>
>>
>> Moving the sanitization into JMXReporter is a publicly visible change
>> as it would affect the metrics we pass to other reporters.
>>
>
> How would this be any more publicly visible than the change already being
> made? In fact, since we haven't really specified what reporters should
> accept, if anything the change to the sanitized strings is more of a
> publicly visible change (you need to understand exactly what transformation
> is being applied) than the change I am suggesting (which could be
> considered a bug fix that now just fixes support for certain client IDs and
> only affects JMX metric names because of JMX limitations).
>
> -Ewen
>
>
>>
>>
>>
>>
>> On Fri, Sep 22, 2017 at 8:53 PM, Ewen Cheslack-Postava
>>  wrote:
>> > Hi all,
>> >
>> > In working on the patch for KIP-196: Add metrics to Kafka Connect
>> > framework, we realized that we have worker and connector/task IDs that
>> are
>> > to be included in metrics and those don't currently have constraints on
>> > naming. I'd prefer to avoid adding naming restrictions or mangling names
>> > unnecessarily, and for users that define a custom metrics reporter the
>> name
>> > mangling may be unexpected since their metrics system may not have the
>> same
>> > limitations as JMX.
>> >
>> > The text of the KIP is pretty JMX specific and doesn't really define
>> where
>> > this mangling happens. Currently, it is being applied essentially as
>> early
>> > as possible. I would like to propose moving the name mangling into the
>> > JmxReporter itself so the only impact is on JMX metrics, but other
>> metrics
>> > reporters would see the original. In other words, leave system-specific
>> > name mangling up to that system.
>> >
>> > In the JmxReporter, the mangling could remain the same (though I think
>> the
>> > mangling for principals is an internal implementation detail, whereas
>> this
>> > name mangling is user-visible). The quota metrics on the broker would now
>> > be inconsistent with the others, but I think trying to be less
>> JMX-specific
>> > given that we support pluggable reporters is the right direction to go.
>> >
>> > I think in practice this has no publicly visible impact and definitely no
>> > compatibility concerns, it just moves where we're doing the JMX name
>> > mangling. However, since discussion about metric naming/character
>> > substitutions had happened here recently I wanted to raise it here and
>> make
>> > sure there would be agreement on this direction.
>> >
>> > (Long term I'd also like to get the required instantiation of JmxReporter
>> > removed as well, but that requires its own KIP.)
>> >
>> > Thanks,
>> > Ewen
>> >
>> > On Thu, Sep 14, 2017 at 2:09 AM, Tom Bentley 
>> wrote:
>> >
>> >> Hi Mickael,
>> >>
>> >> I was just wondering why the restriction was imposed for Java clients
>> the
>> >> first place, do you know?
>> >>
>> >> Cheers,
>> >>
>> >> Tom
>> >>
>> >> On 14 September 2017 at 09:16, Ismael Juma  wrote:
>> >>
>> >> > Thanks for the KIP Mickael. I suggest starting a vote.
>> >> >
>> >> > Ismael
>> >> >
>> >> > On Mon, Aug 21, 2017 at 2:51 PM, Mickael Maison <
>> >> mickael.mai...@gmail.com>
>> >> > wrote:
>> >> >
>> >> > > Hi all,
>> >> > >
>> >> > > I have created a KIP to cleanup the way client-ids are handled by
>> >> > > brokers and clients.
>> >> > >
>> >> > > Currently the Java clients have some restrictions on the client-ids
>> >> > > that are not enforced by the brokers. Using 3rd party clients,
>> >> > > client-ids containing any characters can be used causing 

Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-25 Thread Ewen Cheslack-Postava
On Mon, Sep 25, 2017 at 3:35 AM, Mickael Maison 
wrote:

> Hi Ewen,
>
> I understand your point of view and ideally we'd have one convention
> for handling all user provided strings. This KIP reused the
> sanitization mechanism we had in place for user principals.
>
> I think both ways have pros and cons but what I like about early
> sanitization (as is currently) is that it's consistent. While
> monitoring tools have to know about the sanitization, all metrics are
> sanitized the same way before being passed to reporters and that
> includes all "fields" in the metric name (client-id, user).
>

How is just passing the actual values to the reporters any less consistent?
The "sanitization" process that was in place was really only for internal
purposes, right? i.e. so that we'd have a path for ZK that ZK could handle?

I think the real question is why JmxReporter is being special-cased?


>
> Moving the sanitization into JMXReporter is a publicly visible change
> as it would affect the metrics we pass to other reporters.
>

How would this be any more publicly visible than the change already being
made? In fact, since we haven't really specified what reporters should
accept, if anything the change to the sanitized strings is more of a
publicly visible change (you need to understand exactly what transformation
is being applied) than the change I am suggesting (which could be
considered a bug fix that now just fixes support for certain client IDs and
only affects JMX metric names because of JMX limitations).

-Ewen


>
>
>
>
> On Fri, Sep 22, 2017 at 8:53 PM, Ewen Cheslack-Postava
>  wrote:
> > Hi all,
> >
> > In working on the patch for KIP-196: Add metrics to Kafka Connect
> > framework, we realized that we have worker and connector/task IDs that
> are
> > to be included in metrics and those don't currently have constraints on
> > naming. I'd prefer to avoid adding naming restrictions or mangling names
> > unnecessarily, and for users that define a custom metrics reporter the
> name
> > mangling may be unexpected since their metrics system may not have the
> same
> > limitations as JMX.
> >
> > The text of the KIP is pretty JMX specific and doesn't really define
> where
> > this mangling happens. Currently, it is being applied essentially as
> early
> > as possible. I would like to propose moving the name mangling into the
> > JmxReporter itself so the only impact is on JMX metrics, but other
> metrics
> > reporters would see the original. In other words, leave system-specific
> > name mangling up to that system.
> >
> > In the JmxReporter, the mangling could remain the same (though I think
> the
> > mangling for principals is an internal implementation detail, whereas
> this
> > name mangling is user-visible). The quota metrics on the broker would now
> > be inconsistent with the others, but I think trying to be less
> JMX-specific
> > given that we support pluggable reporters is the right direction to go.
> >
> > I think in practice this has no publicly visible impact and definitely no
> > compatibility concerns, it just moves where we're doing the JMX name
> > mangling. However, since discussion about metric naming/character
> > substitutions had happened here recently I wanted to raise it here and
> make
> > sure there would be agreement on this direction.
> >
> > (Long term I'd also like to get the required instantiation of JmxReporter
> > removed as well, but that requires its own KIP.)
> >
> > Thanks,
> > Ewen
> >
> > On Thu, Sep 14, 2017 at 2:09 AM, Tom Bentley 
> wrote:
> >
> >> Hi Mickael,
> >>
> >> I was just wondering why the restriction was imposed for Java clients
> the
> >> first place, do you know?
> >>
> >> Cheers,
> >>
> >> Tom
> >>
> >> On 14 September 2017 at 09:16, Ismael Juma  wrote:
> >>
> >> > Thanks for the KIP Mickael. I suggest starting a vote.
> >> >
> >> > Ismael
> >> >
> >> > On Mon, Aug 21, 2017 at 2:51 PM, Mickael Maison <
> >> mickael.mai...@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi all,
> >> > >
> >> > > I have created a KIP to cleanup the way client-ids are handled by
> >> > > brokers and clients.
> >> > >
> >> > > Currently the Java clients have some restrictions on the client-ids
> >> > > that are not enforced by the brokers. Using 3rd party clients,
> >> > > client-ids containing any characters can be used causing some
> strange
> >> > > behaviours in the way brokers handle metrics and quotas.
> >> > >
> >> > > Feedback is appreciated.
> >> > >
> >> > > Thanks
> >> > >
> >> >
> >>
>


Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-25 Thread Mickael Maison
Hi Ewen,

I understand your point of view and ideally we'd have one convention
for handling all user provided strings. This KIP reused the
sanitization mechanism we had in place for user principals.

I think both ways have pros and cons but what I like about early
sanitization (as is currently) is that it's consistent. While
monitoring tools have to know about the sanitization, all metrics are
sanitized the same way before being passed to reporters and that
includes all "fields" in the metric name (client-id, user).

Moving the sanitization into JMXReporter is a publicly visible change
as it would affect the metrics we pass to other reporters.




On Fri, Sep 22, 2017 at 8:53 PM, Ewen Cheslack-Postava
 wrote:
> Hi all,
>
> In working on the patch for KIP-196: Add metrics to Kafka Connect
> framework, we realized that we have worker and connector/task IDs that are
> to be included in metrics and those don't currently have constraints on
> naming. I'd prefer to avoid adding naming restrictions or mangling names
> unnecessarily, and for users that define a custom metrics reporter the name
> mangling may be unexpected since their metrics system may not have the same
> limitations as JMX.
>
> The text of the KIP is pretty JMX specific and doesn't really define where
> this mangling happens. Currently, it is being applied essentially as early
> as possible. I would like to propose moving the name mangling into the
> JmxReporter itself so the only impact is on JMX metrics, but other metrics
> reporters would see the original. In other words, leave system-specific
> name mangling up to that system.
>
> In the JmxReporter, the mangling could remain the same (though I think the
> mangling for principals is an internal implementation detail, whereas this
> name mangling is user-visible). The quota metrics on the broker would now
> be inconsistent with the others, but I think trying to be less JMX-specific
> given that we support pluggable reporters is the right direction to go.
>
> I think in practice this has no publicly visible impact and definitely no
> compatibility concerns, it just moves where we're doing the JMX name
> mangling. However, since discussion about metric naming/character
> substitutions had happened here recently I wanted to raise it here and make
> sure there would be agreement on this direction.
>
> (Long term I'd also like to get the required instantiation of JmxReporter
> removed as well, but that requires its own KIP.)
>
> Thanks,
> Ewen
>
> On Thu, Sep 14, 2017 at 2:09 AM, Tom Bentley  wrote:
>
>> Hi Mickael,
>>
>> I was just wondering why the restriction was imposed for Java clients the
>> first place, do you know?
>>
>> Cheers,
>>
>> Tom
>>
>> On 14 September 2017 at 09:16, Ismael Juma  wrote:
>>
>> > Thanks for the KIP Mickael. I suggest starting a vote.
>> >
>> > Ismael
>> >
>> > On Mon, Aug 21, 2017 at 2:51 PM, Mickael Maison <
>> mickael.mai...@gmail.com>
>> > wrote:
>> >
>> > > Hi all,
>> > >
>> > > I have created a KIP to cleanup the way client-ids are handled by
>> > > brokers and clients.
>> > >
>> > > Currently the Java clients have some restrictions on the client-ids
>> > > that are not enforced by the brokers. Using 3rd party clients,
>> > > client-ids containing any characters can be used causing some strange
>> > > behaviours in the way brokers handle metrics and quotas.
>> > >
>> > > Feedback is appreciated.
>> > >
>> > > Thanks
>> > >
>> >
>>


Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-22 Thread Ewen Cheslack-Postava
Hi all,

In working on the patch for KIP-196: Add metrics to Kafka Connect
framework, we realized that we have worker and connector/task IDs that are
to be included in metrics and those don't currently have constraints on
naming. I'd prefer to avoid adding naming restrictions or mangling names
unnecessarily, and for users that define a custom metrics reporter the name
mangling may be unexpected since their metrics system may not have the same
limitations as JMX.

The text of the KIP is pretty JMX specific and doesn't really define where
this mangling happens. Currently, it is being applied essentially as early
as possible. I would like to propose moving the name mangling into the
JmxReporter itself so the only impact is on JMX metrics, but other metrics
reporters would see the original. In other words, leave system-specific
name mangling up to that system.

In the JmxReporter, the mangling could remain the same (though I think the
mangling for principals is an internal implementation detail, whereas this
name mangling is user-visible). The quota metrics on the broker would now
be inconsistent with the others, but I think trying to be less JMX-specific
given that we support pluggable reporters is the right direction to go.

I think in practice this has no publicly visible impact and definitely no
compatibility concerns, it just moves where we're doing the JMX name
mangling. However, since discussion about metric naming/character
substitutions had happened here recently I wanted to raise it here and make
sure there would be agreement on this direction.

(Long term I'd also like to get the required instantiation of JmxReporter
removed as well, but that requires its own KIP.)

Thanks,
Ewen

On Thu, Sep 14, 2017 at 2:09 AM, Tom Bentley  wrote:

> Hi Mickael,
>
> I was just wondering why the restriction was imposed for Java clients the
> first place, do you know?
>
> Cheers,
>
> Tom
>
> On 14 September 2017 at 09:16, Ismael Juma  wrote:
>
> > Thanks for the KIP Mickael. I suggest starting a vote.
> >
> > Ismael
> >
> > On Mon, Aug 21, 2017 at 2:51 PM, Mickael Maison <
> mickael.mai...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > I have created a KIP to cleanup the way client-ids are handled by
> > > brokers and clients.
> > >
> > > Currently the Java clients have some restrictions on the client-ids
> > > that are not enforced by the brokers. Using 3rd party clients,
> > > client-ids containing any characters can be used causing some strange
> > > behaviours in the way brokers handle metrics and quotas.
> > >
> > > Feedback is appreciated.
> > >
> > > Thanks
> > >
> >
>


Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-14 Thread Tom Bentley
Hi Mickael,

I was just wondering why the restriction was imposed for Java clients the
first place, do you know?

Cheers,

Tom

On 14 September 2017 at 09:16, Ismael Juma  wrote:

> Thanks for the KIP Mickael. I suggest starting a vote.
>
> Ismael
>
> On Mon, Aug 21, 2017 at 2:51 PM, Mickael Maison 
> wrote:
>
> > Hi all,
> >
> > I have created a KIP to cleanup the way client-ids are handled by
> > brokers and clients.
> >
> > Currently the Java clients have some restrictions on the client-ids
> > that are not enforced by the brokers. Using 3rd party clients,
> > client-ids containing any characters can be used causing some strange
> > behaviours in the way brokers handle metrics and quotas.
> >
> > Feedback is appreciated.
> >
> > Thanks
> >
>


Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-14 Thread Ismael Juma
Thanks for the KIP Mickael. I suggest starting a vote.

Ismael

On Mon, Aug 21, 2017 at 2:51 PM, Mickael Maison 
wrote:

> Hi all,
>
> I have created a KIP to cleanup the way client-ids are handled by
> brokers and clients.
>
> Currently the Java clients have some restrictions on the client-ids
> that are not enforced by the brokers. Using 3rd party clients,
> client-ids containing any characters can be used causing some strange
> behaviours in the way brokers handle metrics and quotas.
>
> Feedback is appreciated.
>
> Thanks
>


Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-13 Thread Gwen Shapira
OK. LGTM then :)

On Wed, Sep 13, 2017 at 12:38 PM Mickael Maison 
wrote:

> Yes exactly !
> I've updated the KIP to make it more explicit.
>
> Also I noticed my initial email didn't contain the link to the KIP, so
> here it is:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-190%3A+Handle+client-ids+consistently+between+clients+and+brokers
>
> On Tue, Sep 12, 2017 at 7:23 PM, Gwen Shapira  wrote:
> > If I understand you correctly, you are saying:
> > 1. KIP-190 will not affect anyone who doesn't use special characters in
> > their client IDs
> > 2. Those who have special characters in client IDs already have tons of
> > metrics issues and won't be inconvenienced by a KIP that fixes them.
> >
> > Did I get it right?
> >
> > On Sat, Sep 9, 2017 at 9:54 AM Mickael Maison 
> > wrote:
> >
> >> Hi Gwen, thanks for taking a look at the KIP.
> >>
> >> I understand your concern trying to make the transition as smooth as
> >> possible. However there are several issues with the way client-ids
> >> with special characters are handled:
> >> Client-ids that contain invalid ObjectName characters (colon, equals,
> >> etc) currently fail to be registered by the build-in JMX reporter so
> >> they already don't appear in all monitoring systems ! These also cause
> >> issues with Quotas.
> >>
> >> The Java clients as well as the kafka-configs.sh tool already reject
> >> them (even though the error you get from the Produce/Consumer is
> >> pretty cryptic).
> >>
> >> People currently using client-ids with special characters have to be
> >> running 3rd party clients and probably encounter strange quotas issues
> >> as well as missing metrics (if they use JMX).
> >>
> >> So if we really want to do the smallest possible change, we could only
> >> encode ObjectName special characters instead of all special
> >> characters. That way at least the JMX reporter would work correctly.
> >> Display a warning when using any other special characters. Then in a
> >> later release, encode everything like we currently do for the
> >> User/Principal.
> >>
> >> What do you think ?
> >>
> >> On Fri, Sep 1, 2017 at 7:33 AM, Gwen Shapira  wrote:
> >> > Thanks for bumping this. I do have a concern:
> >> >
> >> > This proposal changes the names of existing metrics - as such, it will
> >> > require all owners of monitoring systems to update their dashboards.
> It
> >> > will also complicate monitoring of multiple clusters with different
> >> > versions and require some modifications to existing monitoring
> automation
> >> > systems.
> >> >
> >> > What do you think of an alternative solution:
> >> > 1. For the next release, add the validations on the broker side and
> >> print a
> >> > "warning" that this client id is invalid, that it will break metrics
> and
> >> > that it will be rejected in newer versions.
> >> > 2. Few releases later, actually turn the validation on and return
> >> > InvalidClientID error to clients.
> >> >
> >> > We did something similar when we deprecated acks=2.
> >> >
> >> > Gwen
> >> >
> >> >
> >> > On Thu, Aug 31, 2017 at 12:13 PM Mickael Maison <
> >> mickael.mai...@gmail.com>
> >> > wrote:
> >> >
> >> >> Even though it's pretty non controversial, I was expecting a few
> >> comments.
> >> >> I'll wait until next week for comments then I'll start the vote.
> >> >>
> >> >> Thanks
> >> >>
> >> >> On Mon, Aug 21, 2017 at 6:51 AM, Mickael Maison
> >> >>  wrote:
> >> >> > Hi all,
> >> >> >
> >> >> > I have created a KIP to cleanup the way client-ids are handled by
> >> >> > brokers and clients.
> >> >> >
> >> >> > Currently the Java clients have some restrictions on the client-ids
> >> >> > that are not enforced by the brokers. Using 3rd party clients,
> >> >> > client-ids containing any characters can be used causing some
> strange
> >> >> > behaviours in the way brokers handle metrics and quotas.
> >> >> >
> >> >> > Feedback is appreciated.
> >> >> >
> >> >> > Thanks
> >> >>
> >>
>


Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-13 Thread Mickael Maison
Yes exactly !
I've updated the KIP to make it more explicit.

Also I noticed my initial email didn't contain the link to the KIP, so
here it is:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-190%3A+Handle+client-ids+consistently+between+clients+and+brokers

On Tue, Sep 12, 2017 at 7:23 PM, Gwen Shapira  wrote:
> If I understand you correctly, you are saying:
> 1. KIP-190 will not affect anyone who doesn't use special characters in
> their client IDs
> 2. Those who have special characters in client IDs already have tons of
> metrics issues and won't be inconvenienced by a KIP that fixes them.
>
> Did I get it right?
>
> On Sat, Sep 9, 2017 at 9:54 AM Mickael Maison 
> wrote:
>
>> Hi Gwen, thanks for taking a look at the KIP.
>>
>> I understand your concern trying to make the transition as smooth as
>> possible. However there are several issues with the way client-ids
>> with special characters are handled:
>> Client-ids that contain invalid ObjectName characters (colon, equals,
>> etc) currently fail to be registered by the build-in JMX reporter so
>> they already don't appear in all monitoring systems ! These also cause
>> issues with Quotas.
>>
>> The Java clients as well as the kafka-configs.sh tool already reject
>> them (even though the error you get from the Produce/Consumer is
>> pretty cryptic).
>>
>> People currently using client-ids with special characters have to be
>> running 3rd party clients and probably encounter strange quotas issues
>> as well as missing metrics (if they use JMX).
>>
>> So if we really want to do the smallest possible change, we could only
>> encode ObjectName special characters instead of all special
>> characters. That way at least the JMX reporter would work correctly.
>> Display a warning when using any other special characters. Then in a
>> later release, encode everything like we currently do for the
>> User/Principal.
>>
>> What do you think ?
>>
>> On Fri, Sep 1, 2017 at 7:33 AM, Gwen Shapira  wrote:
>> > Thanks for bumping this. I do have a concern:
>> >
>> > This proposal changes the names of existing metrics - as such, it will
>> > require all owners of monitoring systems to update their dashboards. It
>> > will also complicate monitoring of multiple clusters with different
>> > versions and require some modifications to existing monitoring automation
>> > systems.
>> >
>> > What do you think of an alternative solution:
>> > 1. For the next release, add the validations on the broker side and
>> print a
>> > "warning" that this client id is invalid, that it will break metrics and
>> > that it will be rejected in newer versions.
>> > 2. Few releases later, actually turn the validation on and return
>> > InvalidClientID error to clients.
>> >
>> > We did something similar when we deprecated acks=2.
>> >
>> > Gwen
>> >
>> >
>> > On Thu, Aug 31, 2017 at 12:13 PM Mickael Maison <
>> mickael.mai...@gmail.com>
>> > wrote:
>> >
>> >> Even though it's pretty non controversial, I was expecting a few
>> comments.
>> >> I'll wait until next week for comments then I'll start the vote.
>> >>
>> >> Thanks
>> >>
>> >> On Mon, Aug 21, 2017 at 6:51 AM, Mickael Maison
>> >>  wrote:
>> >> > Hi all,
>> >> >
>> >> > I have created a KIP to cleanup the way client-ids are handled by
>> >> > brokers and clients.
>> >> >
>> >> > Currently the Java clients have some restrictions on the client-ids
>> >> > that are not enforced by the brokers. Using 3rd party clients,
>> >> > client-ids containing any characters can be used causing some strange
>> >> > behaviours in the way brokers handle metrics and quotas.
>> >> >
>> >> > Feedback is appreciated.
>> >> >
>> >> > Thanks
>> >>
>>


Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-12 Thread Gwen Shapira
If I understand you correctly, you are saying:
1. KIP-190 will not affect anyone who doesn't use special characters in
their client IDs
2. Those who have special characters in client IDs already have tons of
metrics issues and won't be inconvenienced by a KIP that fixes them.

Did I get it right?

On Sat, Sep 9, 2017 at 9:54 AM Mickael Maison 
wrote:

> Hi Gwen, thanks for taking a look at the KIP.
>
> I understand your concern trying to make the transition as smooth as
> possible. However there are several issues with the way client-ids
> with special characters are handled:
> Client-ids that contain invalid ObjectName characters (colon, equals,
> etc) currently fail to be registered by the build-in JMX reporter so
> they already don't appear in all monitoring systems ! These also cause
> issues with Quotas.
>
> The Java clients as well as the kafka-configs.sh tool already reject
> them (even though the error you get from the Produce/Consumer is
> pretty cryptic).
>
> People currently using client-ids with special characters have to be
> running 3rd party clients and probably encounter strange quotas issues
> as well as missing metrics (if they use JMX).
>
> So if we really want to do the smallest possible change, we could only
> encode ObjectName special characters instead of all special
> characters. That way at least the JMX reporter would work correctly.
> Display a warning when using any other special characters. Then in a
> later release, encode everything like we currently do for the
> User/Principal.
>
> What do you think ?
>
> On Fri, Sep 1, 2017 at 7:33 AM, Gwen Shapira  wrote:
> > Thanks for bumping this. I do have a concern:
> >
> > This proposal changes the names of existing metrics - as such, it will
> > require all owners of monitoring systems to update their dashboards. It
> > will also complicate monitoring of multiple clusters with different
> > versions and require some modifications to existing monitoring automation
> > systems.
> >
> > What do you think of an alternative solution:
> > 1. For the next release, add the validations on the broker side and
> print a
> > "warning" that this client id is invalid, that it will break metrics and
> > that it will be rejected in newer versions.
> > 2. Few releases later, actually turn the validation on and return
> > InvalidClientID error to clients.
> >
> > We did something similar when we deprecated acks=2.
> >
> > Gwen
> >
> >
> > On Thu, Aug 31, 2017 at 12:13 PM Mickael Maison <
> mickael.mai...@gmail.com>
> > wrote:
> >
> >> Even though it's pretty non controversial, I was expecting a few
> comments.
> >> I'll wait until next week for comments then I'll start the vote.
> >>
> >> Thanks
> >>
> >> On Mon, Aug 21, 2017 at 6:51 AM, Mickael Maison
> >>  wrote:
> >> > Hi all,
> >> >
> >> > I have created a KIP to cleanup the way client-ids are handled by
> >> > brokers and clients.
> >> >
> >> > Currently the Java clients have some restrictions on the client-ids
> >> > that are not enforced by the brokers. Using 3rd party clients,
> >> > client-ids containing any characters can be used causing some strange
> >> > behaviours in the way brokers handle metrics and quotas.
> >> >
> >> > Feedback is appreciated.
> >> >
> >> > Thanks
> >>
>


Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-09 Thread Mickael Maison
Hi Gwen, thanks for taking a look at the KIP.

I understand your concern trying to make the transition as smooth as
possible. However there are several issues with the way client-ids
with special characters are handled:
Client-ids that contain invalid ObjectName characters (colon, equals,
etc) currently fail to be registered by the build-in JMX reporter so
they already don't appear in all monitoring systems ! These also cause
issues with Quotas.

The Java clients as well as the kafka-configs.sh tool already reject
them (even though the error you get from the Produce/Consumer is
pretty cryptic).

People currently using client-ids with special characters have to be
running 3rd party clients and probably encounter strange quotas issues
as well as missing metrics (if they use JMX).

So if we really want to do the smallest possible change, we could only
encode ObjectName special characters instead of all special
characters. That way at least the JMX reporter would work correctly.
Display a warning when using any other special characters. Then in a
later release, encode everything like we currently do for the
User/Principal.

What do you think ?

On Fri, Sep 1, 2017 at 7:33 AM, Gwen Shapira  wrote:
> Thanks for bumping this. I do have a concern:
>
> This proposal changes the names of existing metrics - as such, it will
> require all owners of monitoring systems to update their dashboards. It
> will also complicate monitoring of multiple clusters with different
> versions and require some modifications to existing monitoring automation
> systems.
>
> What do you think of an alternative solution:
> 1. For the next release, add the validations on the broker side and print a
> "warning" that this client id is invalid, that it will break metrics and
> that it will be rejected in newer versions.
> 2. Few releases later, actually turn the validation on and return
> InvalidClientID error to clients.
>
> We did something similar when we deprecated acks=2.
>
> Gwen
>
>
> On Thu, Aug 31, 2017 at 12:13 PM Mickael Maison 
> wrote:
>
>> Even though it's pretty non controversial, I was expecting a few comments.
>> I'll wait until next week for comments then I'll start the vote.
>>
>> Thanks
>>
>> On Mon, Aug 21, 2017 at 6:51 AM, Mickael Maison
>>  wrote:
>> > Hi all,
>> >
>> > I have created a KIP to cleanup the way client-ids are handled by
>> > brokers and clients.
>> >
>> > Currently the Java clients have some restrictions on the client-ids
>> > that are not enforced by the brokers. Using 3rd party clients,
>> > client-ids containing any characters can be used causing some strange
>> > behaviours in the way brokers handle metrics and quotas.
>> >
>> > Feedback is appreciated.
>> >
>> > Thanks
>>


Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-09-01 Thread Gwen Shapira
Thanks for bumping this. I do have a concern:

This proposal changes the names of existing metrics - as such, it will
require all owners of monitoring systems to update their dashboards. It
will also complicate monitoring of multiple clusters with different
versions and require some modifications to existing monitoring automation
systems.

What do you think of an alternative solution:
1. For the next release, add the validations on the broker side and print a
"warning" that this client id is invalid, that it will break metrics and
that it will be rejected in newer versions.
2. Few releases later, actually turn the validation on and return
InvalidClientID error to clients.

We did something similar when we deprecated acks=2.

Gwen


On Thu, Aug 31, 2017 at 12:13 PM Mickael Maison 
wrote:

> Even though it's pretty non controversial, I was expecting a few comments.
> I'll wait until next week for comments then I'll start the vote.
>
> Thanks
>
> On Mon, Aug 21, 2017 at 6:51 AM, Mickael Maison
>  wrote:
> > Hi all,
> >
> > I have created a KIP to cleanup the way client-ids are handled by
> > brokers and clients.
> >
> > Currently the Java clients have some restrictions on the client-ids
> > that are not enforced by the brokers. Using 3rd party clients,
> > client-ids containing any characters can be used causing some strange
> > behaviours in the way brokers handle metrics and quotas.
> >
> > Feedback is appreciated.
> >
> > Thanks
>


Re: [DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-08-31 Thread Mickael Maison
Even though it's pretty non controversial, I was expecting a few comments.
I'll wait until next week for comments then I'll start the vote.

Thanks

On Mon, Aug 21, 2017 at 6:51 AM, Mickael Maison
 wrote:
> Hi all,
>
> I have created a KIP to cleanup the way client-ids are handled by
> brokers and clients.
>
> Currently the Java clients have some restrictions on the client-ids
> that are not enforced by the brokers. Using 3rd party clients,
> client-ids containing any characters can be used causing some strange
> behaviours in the way brokers handle metrics and quotas.
>
> Feedback is appreciated.
>
> Thanks


[DISCUSS] KIP-190: Handle client-ids consistently between clients and brokers

2017-08-21 Thread Mickael Maison
Hi all,

I have created a KIP to cleanup the way client-ids are handled by
brokers and clients.

Currently the Java clients have some restrictions on the client-ids
that are not enforced by the brokers. Using 3rd party clients,
client-ids containing any characters can be used causing some strange
behaviours in the way brokers handle metrics and quotas.

Feedback is appreciated.

Thanks