Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-17 Thread Ron Dagostino
Hi again, Rajini.  Would we ever want the max session time to differ
across SASL mechanisms?  Now that this KIP supports all SASL mechanisms,
I'm wondering whether this config still needs the
"[listener].[mechanism]." prefix.  I've kept the prefix in the KIP for
now, but it would be easier to set the value once for all mechanisms,
and I don't see that as being a problem.  Let me know what you think.

Ron

On Mon, Sep 17, 2018 at 9:51 PM Ron Dagostino  wrote:

> Hi Rajini.  The KIP is updated.  Aside from a once-over to make sure it is
> all accurate, I think we need to confirm the metrics.  The decision to not
> reject authentications that use tokens with too-long a lifetime allowed the
> metrics to be simpler.  I decided that in addition to tracking these
> metrics on the broker:
>
> failed-reauthentication-{rate,total} and
> successful-reauthentication-{rate,total}
>
> we simply need one more set of broker metrics to track the subset of
> clients that are not upgraded to v2.1.0 and are still using a V0
> SaslAuthenticateRequest:
>
> failed-v0-authentication-{rate,total} and
> successful-v0-authentication-{rate,total}
>
> See the Migration section of the KIP for details of how this would be used.
>
> I wonder if we need a broker metric documenting the number of "expired"
> sessions killed by the broker since it would be the same as
> successful-v0-authentication-total. I've eliminated that from the KIP for
> now.  Thoughts?
>
> There is also a client-side metric for re-authentication latency tracking
> (still unnamed -- do you have a preference?)
>
> I think we're close to being able to put this KIP up for a vote.
>
> Ron
>
>
> On Mon, Sep 17, 2018 at 2:45 PM Ron Dagostino  wrote:
>
>> <<> the KIP
>> Right, that was a concern I had mentioned in a follow-up email.  Agree it
>> should go alongside errors.
>>
>> <<> <<> <<> Ah, ok, I was not aware that this case existed.  Agreed, for consistency,
>> server will always send it back to the client.
>>
>> <<> for lifetime]
>> Since we now agree that the server will send it back, yes, there is no
>> need for this.
>>
>> <<> [sasl.login.refresh.reauthenticate.enable]
>> I think this might have been more useful when we weren't necessarily
>> going to support all SASL mechanisms and/or when the broker was not going
>> to advertise the fact that it supported re-authentication.  You are
>> correct, now that we support it for all SASL mechanisms and we are bumping
>> an API version, I think it is okay to simply enable it wherever both the
>> client and server meet the required versions.
>>
>> <<> <<> <<> <<> <<> <<> I was going under the assumption that it would matter, but based on your
>> pushback I just realized that the same functionality can be implemented as
>> part of token validation if there is a desire to limit token lifetimes to a
>> certain max value (and the token validator has to be provided in production
>> anyway since all we provide out-of-the-box is the unsecured validator).  So
>> I'm willing to abandon this check as part of re-authentication.
>>
>> I'll adjust the KIP accordingly a bit later.  Thanks for the continued
>> feedback/discussion.
>>
>> Ron
>>
>>
>>
>>
>> On Mon, Sep 17, 2018 at 2:10 PM Rajini Sivaram 
>> wrote:
>>
>>> Hi Ron,
>>>
>>>
>>> *1) Is there a reason to communicate this value back to the client when
>>> the
>>> client already knows it?  It's an extra network round-trip, and since the
>>> SASL round-trips are defined by the spec I'm not certain adding an extra
>>> round-trip is acceptable.*
>>>
>>> I wasn't suggesting an extra round-trip for propagating session
>>> lifetime. I
>>> was expecting session lifetime to be added to the last SASL_AUTHENTICATE
>>> response from the broker. Because SASL is a challenge-response mechanism,
>>> SaslServer knows when the response being sent is the last one and hence
>>> can
>>> send the session lifetime in the response (in the same way as we
>>> propagate
>>> errors). I was expecting this to be added as an extra field alongside
>>> errors, not in the opaque body as mentioned in the KIP. The opaque byte
>>> array strictly conforms to the SASL mechanism wire protocol and we want
>>> to
>>> keep it that way.
>>>
>>> As you have said, we don't need server to propagate session lifetime for
>>> OAUTHBEARER since client knows the token lifetime. But server also knows
>>> the credential lifetime and by having the server decide the lifetime, we
>>> can use the same code path for all mechanisms. If we only have a constant
>>> max lifetime in the server, for PLAIN and SCRAM we will end up having the
>>> same lifetime for all credentials with no ability to set actual expiry.
>>> We
>>> use SCRAM for delegation token based authentication where credentials
>>> have
>>> an expiry time, so we do need to be able to set individual credential
>>> lifetimes, where the server knows the expiry time but client may not.
>>>
>>> *2) I also just realized that if the 

Jenkins build is back to normal : kafka-trunk-jdk8 #2971

2018-09-17 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-17 Thread Ron Dagostino
Hi Rajini.  The KIP is updated.  Aside from a once-over to make sure it is
all accurate, I think we need to confirm the metrics.  The decision to not
reject authentications that use tokens with too-long a lifetime allowed the
metrics to be simpler.  I decided that in addition to tracking these
metrics on the broker:

failed-reauthentication-{rate,total} and
successful-reauthentication-{rate,total}

we simply need one more set of broker metrics to track the subset of
clients that are not upgraded to v2.1.0 and are still using a V0
SaslAuthenticateRequest:

failed-v0-authentication-{rate,total} and
successful-v0-authentication-{rate,total}

See the Migration section of the KIP for details of how this would be used.

I wonder if we need a broker metric documenting the number of "expired"
sessions killed by the broker since it would be the same as
successful-v0-authentication-total. I've eliminated that from the KIP for
now.  Thoughts?

There is also a client-side metric for re-authentication latency tracking
(still unnamed -- do you have a preference?)
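
For illustration only, a rough sketch of how a rate/total pair like
failed-reauthentication-{rate,total} could be registered with Kafka's metrics
library (org.apache.kafka.common.metrics); the metric group name, descriptions
and class name below are placeholders, not what the KIP specifies:

{code:java}
import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.Meter;

public class ReauthenticationMetricsSketch {

    private final Sensor failedReauthentication;

    public ReauthenticationMetricsSketch(Metrics metrics) {
        // One sensor backs both the rate and the cumulative total.
        failedReauthentication = metrics.sensor("failed-reauthentication");
        failedReauthentication.add(new Meter(
                metrics.metricName("failed-reauthentication-rate", "sasl-metrics",
                        "Rate of failed re-authentications"),
                metrics.metricName("failed-reauthentication-total", "sasl-metrics",
                        "Total number of failed re-authentications")));
    }

    // Called whenever a re-authentication attempt fails.
    public void recordFailure() {
        failedReauthentication.record();
    }
}
{code}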

I think we're close to being able to put this KIP up for a vote.

Ron


On Mon, Sep 17, 2018 at 2:45 PM Ron Dagostino  wrote:

> << the KIP
> Right, that was a concern I had mentioned in a follow-up email.  Agree it
> should go alongside errors.
>
> << << << Ah, ok, I was not aware that this case existed.  Agreed, for consistency,
> server will always send it back to the client.
>
> << for lifetime]
> Since we now agree that the server will send it back, yes, there is no
> need for this.
>
> << [sasl.login.refresh.reauthenticate.enable]
> I think this might have been more useful when we weren't necessarily going
> to support all SASL mechanisms and/or when the broker was not going to
> advertise the fact that it supported re-authentication.  You are correct,
> now that we support it for all SASL mechanisms and we are bumping an API
> version, I think it is okay to simply enable it wherever both the client
> and server meet the required versions.
>
> << << << << << << I was going under the assumption that it would matter, but based on your
> pushback I just realized that the same functionality can be implemented as
> part of token validation if there is a desire to limit token lifetimes to a
> certain max value (and the token validator has to be provided in production
> anyway since all we provide out-of-the-box is the unsecured validator).  So
> I'm willing to abandon this check as part of re-authentication.
>
> I'll adjust the KIP accordingly a bit later.  Thanks for the continued
> feedback/discussion.
>
> Ron
>
>
>
>
> On Mon, Sep 17, 2018 at 2:10 PM Rajini Sivaram 
> wrote:
>
>> Hi Ron,
>>
>>
>> *1) Is there a reason to communicate this value back to the client when
>> the
>> client already knows it?  It's an extra network round-trip, and since the
>> SASL round-trips are defined by the spec I'm not certain adding an extra
>> round-trip is acceptable.*
>>
>> I wasn't suggesting an extra round-trip for propagating session lifetime.
>> I
>> was expecting session lifetime to be added to the last SASL_AUTHENTICATE
>> response from the broker. Because SASL is a challenge-response mechanism,
>> SaslServer knows when the response being sent is the last one and hence
>> can
>> send the session lifetime in the response (in the same way as we propagate
>> errors). I was expecting this to be added as an extra field alongside
>> errors, not in the opaque body as mentioned in the KIP. The opaque byte
>> array strictly conforms to the SASL mechanism wire protocol and we want to
>> keep it that way.
>>
>> As you have said, we don't need server to propagate session lifetime for
>> OAUTHBEARER since client knows the token lifetime. But server also knows
>> the credential lifetime and by having the server decide the lifetime, we
>> can use the same code path for all mechanisms. If we only have a constant
>> max lifetime in the server, for PLAIN and SCRAM we will end up having the
>> same lifetime for all credentials with no ability to set actual expiry. We
>> use SCRAM for delegation token based authentication where credentials have
>> an expiry time, so we do need to be able to set individual credential
>> lifetimes, where the server knows the expiry time but client may not.
>>
>> *2) I also just realized that if the client is to learn the credential
>> lifetime we wouldn't want to put special-case code in the Authenticator
>> for
>> GSSAPI and OAUTHBEARER; we would want to expose the value generically,
>> probably as a negotiated property on the SaslClient instance.*
>>
>> I was trying to avoid this altogether. Client doesn't need to know
>> credential lifetime. Server asks the client to re-authenticate within its
>> session lifetime.
>>
>> 3) From the KIP, I wasn't entirely clear about the purpose of the two
>> configs:
>>
>> sasl.login.refresh.reauthenticate.enable: Do we need this? Client knows if
>> broker version supports re-authentication based on the 

Re: [VOTE] KIP-110: Add Codec for ZStandard Compression

2018-09-17 Thread Dongjin Lee
Hi All,

This KIP is now accepted with:
- 3 Binding +1 (Ismael, Jason, Harsha)
- 3 Non-binding +1 (Manikumar, Mickael, Priyank)

Thanks everyone for the votes.

Best,
Dongjin

@Dong Lin

Thanks for reminding the rule.

@Ivan

Special Thanks to you. Without your support, this KIP could not progress.

On Tue, Sep 18, 2018 at 9:42 AM Dong Lin  wrote:

> Hey Dongjin,
>
> The KIP passes vote after having 3 binding +1 votes and more binding +1
> votes than -1 votes. So it is not necessary to have more vote for this KIP.
>
> Thanks,
> Dong
>
> On Mon, Sep 17, 2018 at 5:37 PM, Dongjin Lee  wrote:
>
> > @Ismael
> >
> > +3 (binding), +3 (non-binding) until now. Do we need more vote?
> >
> > Thanks,
> > Dongjin
> >
> > On Sat, Sep 15, 2018 at 4:03 AM Priyank Shah 
> > wrote:
> >
> > > +1(non-binding)
> > >
> > > On 9/13/18, 1:44 AM, "Mickael Maison" 
> wrote:
> > >
> > > +1 (non binding)
> > > Thanks for your perseverance, it's great to see this KIP finally
> > > reaching completion!
> > > On Thu, Sep 13, 2018 at 4:58 AM Harsha  wrote:
> > > >
> > > > +1 (binding).
> > > >
> > > > Thanks,
> > > > Harsha
> > > >
> > > > On Wed, Sep 12, 2018, at 4:56 PM, Jason Gustafson wrote:
> > > > > Great contribution! +1
> > > > >
> > > > > On Wed, Sep 12, 2018 at 10:20 AM, Manikumar <
> > > manikumar.re...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > +1 (non-binding).
> > > > > >
> > > > > > Thanks for the KIP.
> > > > > >
> > > > > > On Wed, Sep 12, 2018 at 10:44 PM Ismael Juma <
> > ism...@juma.me.uk>
> > > wrote:
> > > > > >
> > > > > > > Thanks for the KIP, +1 (binding).
> > > > > > >
> > > > > > > Ismael
> > > > > > >
> > > > > > > On Wed, Sep 12, 2018 at 10:02 AM Dongjin Lee <
> > > dong...@apache.org> wrote:
> > > > > > >
> > > > > > > > Hello, I would like to start a VOTE on KIP-110: Add Codec
> > > for ZStandard
> > > > > > > > Compression.
> > > > > > > >
> > > > > > > > The KIP:
> > > > > > > >
> > > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > 110%3A+Add+Codec+for+ZStandard+Compression
> > > > > > > > Discussion thread:
> > > > > > > >
> > > https://www.mail-archive.com/dev@kafka.apache.org/msg88673.html
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Dongjin
> > > > > > > >
> > > > > > > > --
> > > > > > > > *Dongjin Lee*
> > > > > > > >
> > > > > > > > *A hitchhiker in the mathematical world.*
> > > > > > > >
> > > > > > > > *github:  
> github.com/dongjinleekr
> > > > > > > > linkedin:
> > > > > > > kr.linkedin.com/in/dongjinleekr
> > > > > > > > slideshare:
> > > > > > > > www.slideshare.net/dongjinleekr
> > > > > > > > *
> > > > > > > >
> > > > > > >
> > > > > >
> > >
> > >
> > >
> > >
> >
> > --
> > *Dongjin Lee*
> >
> > *A hitchhiker in the mathematical world.*
> >
> > *github:  github.com/dongjinleekr
> > linkedin:
> kr.linkedin.com/in/dongjinleekr
> > slideshare:
> > www.slideshare.net/dongjinleekr
> > *
> >
>


-- 
*Dongjin Lee*

*A hitchhiker in the mathematical world.*

*github:  github.com/dongjinleekr
linkedin: kr.linkedin.com/in/dongjinleekr
slideshare:
www.slideshare.net/dongjinleekr
*


Re: [VOTE] KIP-110: Add Codec for ZStandard Compression

2018-09-17 Thread Dong Lin
Hey Dongjin,

The KIP passes vote after having 3 binding +1 votes and more binding +1
votes than -1 votes. So it is not necessary to have more vote for this KIP.

Thanks,
Dong

On Mon, Sep 17, 2018 at 5:37 PM, Dongjin Lee  wrote:

> @Ismael
>
> +3 (binding), +3 (non-binding) until now. Do we need more vote?
>
> Thanks,
> Dongjin
>
> On Sat, Sep 15, 2018 at 4:03 AM Priyank Shah 
> wrote:
>
> > +1(non-binding)
> >
> > On 9/13/18, 1:44 AM, "Mickael Maison"  wrote:
> >
> > +1 (non binding)
> > Thanks for your perseverance, it's great to see this KIP finally
> > reaching completion!
> > On Thu, Sep 13, 2018 at 4:58 AM Harsha  wrote:
> > >
> > > +1 (binding).
> > >
> > > Thanks,
> > > Harsha
> > >
> > > On Wed, Sep 12, 2018, at 4:56 PM, Jason Gustafson wrote:
> > > > Great contribution! +1
> > > >
> > > > On Wed, Sep 12, 2018 at 10:20 AM, Manikumar <
> > manikumar.re...@gmail.com>
> > > > wrote:
> > > >
> > > > > +1 (non-binding).
> > > > >
> > > > > Thanks for the KIP.
> > > > >
> > > > > On Wed, Sep 12, 2018 at 10:44 PM Ismael Juma <
> ism...@juma.me.uk>
> > wrote:
> > > > >
> > > > > > Thanks for the KIP, +1 (binding).
> > > > > >
> > > > > > Ismael
> > > > > >
> > > > > > On Wed, Sep 12, 2018 at 10:02 AM Dongjin Lee <
> > dong...@apache.org> wrote:
> > > > > >
> > > > > > > Hello, I would like to start a VOTE on KIP-110: Add Codec
> > for ZStandard
> > > > > > > Compression.
> > > > > > >
> > > > > > > The KIP:
> > > > > > >
> > > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > 110%3A+Add+Codec+for+ZStandard+Compression
> > > > > > > Discussion thread:
> > > > > > >
> > https://www.mail-archive.com/dev@kafka.apache.org/msg88673.html
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Dongjin
> > > > > > >
> > > > > > > --
> > > > > > > *Dongjin Lee*
> > > > > > >
> > > > > > > *A hitchhiker in the mathematical world.*
> > > > > > >
> > > > > > > *github:  github.com/dongjinleekr
> > > > > > > linkedin:
> > > > > > kr.linkedin.com/in/dongjinleekr
> > > > > > > slideshare:
> > > > > > > www.slideshare.net/dongjinleekr
> > > > > > > *
> > > > > > >
> > > > > >
> > > > >
> >
> >
> >
> >
>
> --
> *Dongjin Lee*
>
> *A hitchhiker in the mathematical world.*
>
> *github:  github.com/dongjinleekr
> linkedin: kr.linkedin.com/in/dongjinleekr
> slideshare:
> www.slideshare.net/dongjinleekr
> *
>


Re: [VOTE] KIP-110: Add Codec for ZStandard Compression

2018-09-17 Thread Dongjin Lee
@Ismael

+3 (binding), +3 (non-binding) until now. Do we need more vote?

Thanks,
Dongjin

On Sat, Sep 15, 2018 at 4:03 AM Priyank Shah  wrote:

> +1(non-binding)
>
> On 9/13/18, 1:44 AM, "Mickael Maison"  wrote:
>
> +1 (non binding)
> Thanks for your perseverance, it's great to see this KIP finally
> reaching completion!
> On Thu, Sep 13, 2018 at 4:58 AM Harsha  wrote:
> >
> > +1 (binding).
> >
> > Thanks,
> > Harsha
> >
> > On Wed, Sep 12, 2018, at 4:56 PM, Jason Gustafson wrote:
> > > Great contribution! +1
> > >
> > > On Wed, Sep 12, 2018 at 10:20 AM, Manikumar <
> manikumar.re...@gmail.com>
> > > wrote:
> > >
> > > > +1 (non-binding).
> > > >
> > > > Thanks for the KIP.
> > > >
> > > > On Wed, Sep 12, 2018 at 10:44 PM Ismael Juma 
> wrote:
> > > >
> > > > > Thanks for the KIP, +1 (binding).
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Wed, Sep 12, 2018 at 10:02 AM Dongjin Lee <
> dong...@apache.org> wrote:
> > > > >
> > > > > > Hello, I would like to start a VOTE on KIP-110: Add Codec
> for ZStandard
> > > > > > Compression.
> > > > > >
> > > > > > The KIP:
> > > > > >
> > > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 110%3A+Add+Codec+for+ZStandard+Compression
> > > > > > Discussion thread:
> > > > > >
> https://www.mail-archive.com/dev@kafka.apache.org/msg88673.html
> > > > > >
> > > > > > Thanks,
> > > > > > Dongjin
> > > > > >
> > > > > > --
> > > > > > *Dongjin Lee*
> > > > > >
> > > > > > *A hitchhiker in the mathematical world.*
> > > > > >
> > > > > > *github:  github.com/dongjinleekr
> > > > > > linkedin:
> > > > > kr.linkedin.com/in/dongjinleekr
> > > > > > slideshare:
> > > > > > www.slideshare.net/dongjinleekr
> > > > > > *
> > > > > >
> > > > >
> > > >
>
>
>
>

-- 
*Dongjin Lee*

*A hitchhiker in the mathematical world.*

*github:  github.com/dongjinleekr
linkedin: kr.linkedin.com/in/dongjinleekr
slideshare:
www.slideshare.net/dongjinleekr
*


Re: [VOTE] KIP-372: Naming Repartition Topics for Joins and Grouping

2018-09-17 Thread Dongjin Lee
Great improvements. +1. (Non-binding)

On Tue, Sep 18, 2018 at 5:14 AM Matthias J. Sax 
wrote:

> +1 (binding)
>
> -Matthias
>
> On 9/17/18 1:12 PM, Guozhang Wang wrote:
> > +1 from me, thanks Bill !
> >
> > On Mon, Sep 17, 2018 at 12:43 PM, Bill Bejeck  wrote:
> >
> >> All,
> >>
> >> I'd like to start the voting process for KIP-372.  Here's the link to
> the
> >> updated proposal
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >> 372%3A+Naming+Repartition+Topics+for+Joins+and+Grouping
> >>
> >> I'll start with my own +1.
> >>
> >> Thanks,
> >> Bill
> >>
> >
> >
> >
>
>

-- 
*Dongjin Lee*

*A hitchhiker in the mathematical world.*

*github:  github.com/dongjinleekr
linkedin: kr.linkedin.com/in/dongjinleekr
slideshare:
www.slideshare.net/dongjinleekr
*


Re: [DISCUSS] KIP-347: Enable batching in FindCoordinatorRequest

2018-09-17 Thread Yishun Guan
@Guozhang Wang What do you think?
On Fri, Sep 14, 2018 at 2:39 PM Yishun Guan  wrote:
>
> Hi All,
>
> After looking into AdminClient.java and ConsumerClient.java, following
> the original idea, I think some type-specific code is unavoidable
> (we can have an enum class that contains a list of batch-enabled APIs).
> As for the compatibility code that breaks down the batch, we need to
> either map one Builder to multiple Builders, or map one request to
> multiple requests. (I am not an expert, so I would love others' input
> on this.) This will be an extra conditional check before building or
> sending out a request.
>
> From my observation, a batching optimization for requests is currently only
> needed in KafkaAdminClient (in other words, we want to replace the
> for-loop with a batch request). That limits the scope of the
> optimization, so maybe this optimization seems a little trivial
> compared to the incompatibility risk or inconsistency within the code that we
> might face?
>
> If we are not comfortable with making it "ugly and dirty" (or maybe I just
> couldn't come up with a balanced solution) within
> AdminNetworkClient.java and ConsumerNetworkClient.java, we should
> revisit this: https://github.com/apache/kafka/pull/5353 or postpone
> this improvement?
>
> Thanks,
> Yishun
> On Thu, Sep 6, 2018 at 5:22 PM Yishun Guan  wrote:
> >
> > Hi Colin and Guozhang,
> >
> > I see. But even if the fall-back logic goes into AdminClient and 
> > ConsumerClient, it still has to be somehow type-specific, right?
> > So either way, there will be API-specific processing code somewhere?
> >
> > Thanks,
> > Yishun
> >
> >
> > On Tue, Sep 4, 2018 at 5:46 PM Colin McCabe  wrote:
> > >
> > > Hi Yishun,
> > >
> > > I agree with Guozhang.  NetworkClient is the wrong place to put things 
> > > which are specific to a particular message type.  NetworkClient should 
> > > not have to care what the type of the message is that it is sending.
> > >
> > > Adding type-specific handling is much more "ugly and dirty" than adding 
> > > some compatibility code to AdminClient and ConsumerClient.  It is true 
> > > that there is some code duplication, but I think it will be minimal in 
> > > this case.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Tue, Sep 4, 2018, at 13:28, Guozhang Wang wrote:
> > > > Hello Yishun,
> > > >
> > > > I reviewed the latest wiki page, and noticed that the special handling
> > > > logic needs to be in the NetworkClient.
> > > >
> > > > Comparing it with another alternative way, i.e. we add the fall-back 
> > > > logic
> > > > in the AdminClient, as well as in the ConsumerClient to capture the
> > > > UnsupportedException and fallback, because the two of them are possibly
> > > > sending FindCoordinatorRequest (though for consumers today we do not 
> > > > expect
> > > > it to send for more than one coordinator); personally I think the 
> > > > current
> > > > approach is better, but I'd like to hear other people's opinion as well
> > > > (cc'ed Colin, who implemented the AdminClient).
> > > >
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Mon, Sep 3, 2018 at 11:57 AM, Yishun Guan  wrote:
> > > >
> > > > > Hi Guozhang,
> > > > >
> > > > > Yes, I totally agree with you. Like I said, I think it is overkill
> > > > > for now, to make so many changes for a small improvement.
> > > > > And like you said, the only way to do this is the "ugly and dirty"
> > > > > way, do you think we should still apply this improvement?
> > > > >
> > > > > I updated the KIP with a new BuildToList() (method name pending) and added
> > > > > a check condition in the doSend().
> > > > > This is the KIP:https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > 347%3A++Enable+batching+in+FindCoordinatorRequest
> > > > >
> > > > > Let me know what you think.
> > > > >
> > > > > Thanks,
> > > > > Yishun
> > > > > On Sun, Sep 2, 2018 at 10:02 PM Guozhang Wang  
> > > > > wrote:
> > > > > >
> > > > > > Hi Yishun,
> > > > > >
> > > > > > I was actually not suggesting we should immediately make such a
> > > > > > dramatic change on the AbstractRequest APIs, which will affect all
> > > > > > request types,
> > > > > > just clarifying if it is your intent or not, since your code 
> > > > > > snippet in
> > > > > the
> > > > > > KIP has "@Override"  :)
> > > > > >
> > > > > > I think an alternative way is to add such a function for
> > > > > > FindCoordinator only, i.e. besides the overridden `public
> > > > > > FindCoordinatorRequest build(short version)` we can have one more
> > > > > function
> > > > > > (note the function name needs to be different, since Java type erasure
> > > > > > would otherwise make it impossible to differentiate these two, but we can
> > > > > > consider
> > > > > a
> > > > > > better name: buildMulti is only for illustration)
> > > > > >
> > > > > > public List<FindCoordinatorRequest> buildMulti(short version)
> > > > > >
> > > > > >
> > > > > > It does mean that we now need to special-handle
> > > > > > 
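
To make the idea concrete, a self-contained sketch of what a buildMulti-style
fallback could look like; the stub class and the version cut-off below are
made up for illustration and are not Kafka's real request classes:

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class FindCoordinatorBatchSketch {

    // Stand-in for a FindCoordinatorRequest that can carry one or more keys.
    static class FindCoordinatorRequestStub {
        final List<String> coordinatorKeys;

        FindCoordinatorRequestStub(List<String> coordinatorKeys) {
            this.coordinatorKeys = coordinatorKeys;
        }
    }

    private final List<String> keys;

    FindCoordinatorBatchSketch(List<String> keys) {
        this.keys = keys;
    }

    List<FindCoordinatorRequestStub> buildMulti(short version) {
        // Hypothetical cut-off: assume batching is only supported from version 3 onwards.
        if (version >= 3)
            return Collections.singletonList(new FindCoordinatorRequestStub(keys));
        // Compatibility path: fall back to one single-key request per coordinator key.
        List<FindCoordinatorRequestStub> requests = new ArrayList<>();
        for (String key : keys)
            requests.add(new FindCoordinatorRequestStub(Collections.singletonList(key)));
        return requests;
    }
}
{code}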

[jira] [Resolved] (KAFKA-7370) Enhance FileConfigProvider to read a directory

2018-09-17 Thread Robert Yokota (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Yokota resolved KAFKA-7370.
--
Resolution: Won't Do

> Enhance FileConfigProvider to read a directory
> --
>
> Key: KAFKA-7370
> URL: https://issues.apache.org/jira/browse/KAFKA-7370
> Project: Kafka
>  Issue Type: Improvement
>  Components: config
>Affects Versions: 2.0.0
>Reporter: Robert Yokota
>Assignee: Robert Yokota
>Priority: Minor
>
> Currently FileConfigProvider can read a Properties file as a set of key-value 
> pairs.  This enhancement is to augment FileConfigProvider so that it can also 
> read a directory, where the file names are the keys and the corresponding 
> file contents are the values.
> This will allow for easier integration with secret management systems where 
> each secret is often an individual file, such as in Docker and Kubernetes.
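
Since the issue was resolved "Won't Do", purely for reference, the core of such
an enhancement would be reading a directory into key/value pairs, roughly as in
the standalone sketch below (not wired into Kafka's ConfigProvider interface;
class and method names are illustrative only):

{code:java}
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class DirectoryConfigReader {

    // Read every regular file in a directory and return file name -> file contents.
    public static Map<String, String> readDirectory(String dir) throws IOException {
        Map<String, String> values = new HashMap<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(Paths.get(dir))) {
            for (Path file : stream) {
                if (Files.isRegularFile(file)) {
                    // Key is the file name (e.g. the secret's name), value is its contents.
                    values.put(file.getFileName().toString(),
                            new String(Files.readAllBytes(file)).trim());
                }
            }
        }
        return values;
    }
}
{code}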



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: kafka-trunk-jdk8 #2970

2018-09-17 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] MINOR: log and fail on missing task in Streams (#5655)

[lindong28] KAFKA-5690; Add support to list ACLs for a given principal (KIP-357)

--
[...truncated 2.68 MB...]

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 PASSED


Build failed in Jenkins: kafka-2.0-jdk8 #145

2018-09-17 Thread Apache Jenkins Server
See 


Changes:

[rajinisivaram] MINOR: Increase timeout in log4j system test to avoid transient 
failures

--
[...truncated 434.09 KB...]
kafka.controller.PartitionStateMachineTest > 
testInvalidNonexistentPartitionToOfflinePartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidNonexistentPartitionToOfflinePartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOnlineTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOnlineTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransitionZkUtilsExceptionFromCreateStates 
STARTED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransitionZkUtilsExceptionFromCreateStates 
PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidNewPartitionToNonexistentPartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidNewPartitionToNonexistentPartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidOnlinePartitionToNewPartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidOnlinePartitionToNewPartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransitionErrorCodeFromStateLookup STARTED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransitionErrorCodeFromStateLookup PASSED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOnlineTransitionForControlledShutdown STARTED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOnlineTransitionForControlledShutdown PASSED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransitionZkUtilsExceptionFromStateLookup 
STARTED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransitionZkUtilsExceptionFromStateLookup 
PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidOnlinePartitionToNonexistentPartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidOnlinePartitionToNonexistentPartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidOfflinePartitionToNewPartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidOfflinePartitionToNewPartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransition PASSED

kafka.controller.ControllerEventManagerTest > testEventThatThrowsException 
STARTED

kafka.controller.ControllerEventManagerTest > testEventThatThrowsException 
PASSED

kafka.controller.ControllerEventManagerTest > testSuccessfulEvent STARTED

kafka.controller.ControllerEventManagerTest > testSuccessfulEvent PASSED

kafka.controller.ControllerFailoverTest > testHandleIllegalStateException 
STARTED

kafka.controller.ControllerFailoverTest > testHandleIllegalStateException PASSED

kafka.network.SocketServerTest > testGracefulClose STARTED

kafka.network.SocketServerTest > testGracefulClose PASSED

kafka.network.SocketServerTest > 
testSendActionResponseWithThrottledChannelWhereThrottlingAlreadyDone STARTED

kafka.network.SocketServerTest > 
testSendActionResponseWithThrottledChannelWhereThrottlingAlreadyDone PASSED

kafka.network.SocketServerTest > controlThrowable STARTED

kafka.network.SocketServerTest > controlThrowable PASSED

kafka.network.SocketServerTest > testRequestMetricsAfterStop STARTED

kafka.network.SocketServerTest > testRequestMetricsAfterStop PASSED

kafka.network.SocketServerTest > testConnectionIdReuse STARTED

kafka.network.SocketServerTest > testConnectionIdReuse PASSED

kafka.network.SocketServerTest > testClientDisconnectionUpdatesRequestMetrics 
STARTED

kafka.network.SocketServerTest > testClientDisconnectionUpdatesRequestMetrics 
PASSED

kafka.network.SocketServerTest > testProcessorMetricsTags STARTED

kafka.network.SocketServerTest > testProcessorMetricsTags PASSED

kafka.network.SocketServerTest > testMaxConnectionsPerIp STARTED

kafka.network.SocketServerTest > testMaxConnectionsPerIp PASSED

kafka.network.SocketServerTest > testConnectionId STARTED

kafka.network.SocketServerTest > testConnectionId PASSED

kafka.network.SocketServerTest > 
testBrokerSendAfterChannelClosedUpdatesRequestMetrics STARTED

kafka.network.SocketServerTest > 
testBrokerSendAfterChannelClosedUpdatesRequestMetrics PASSED

kafka.network.SocketServerTest > testNoOpAction STARTED


Jenkins build is back to normal : kafka-trunk-jdk10 #492

2018-09-17 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-349 Priorities for Source Topics

2018-09-17 Thread Matthias J. Sax
I see. That makes sense.

Actually, GlobalKTable works exactly this way -- it's not just a
broadcasted table, it's also a non-synchronized table. Not sure if this
would work for you -- the broadcasted property might be a deal breaker
for your use case.

Personally, I believe there is a design space with broadcasted vs
sharded, and time-synchronized and non-synchronized, offering 4
different implementations. Atm, we only have two of them and I would
love to get all four variants.
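
For concreteness, a minimal Streams DSL sketch of that GlobalKTable variant;
the topic names, serdes, and join logic are only illustrative:

{code:java}
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;

public class GlobalTableJoinSketch {

    public static void build(StreamsBuilder builder) {
        // The "table" side: broadcast to every instance and NOT time-synchronized,
        // so lookups always hit the latest value for the key.
        GlobalKTable<String, String> items =
                builder.globalTable("items", Consumed.with(Serdes.String(), Serdes.String()));

        KStream<String, String> events =
                builder.stream("events", Consumed.with(Serdes.String(), Serdes.String()));

        // Join each event against the current item record, regardless of timestamps.
        events.join(items,
                (eventKey, eventValue) -> eventKey,                  // map event to item key
                (eventValue, itemValue) -> eventValue + "|" + itemValue)
              .to("joined-events");
    }
}
{code}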


-Matthias


On 9/17/18 11:18 AM, Thomas Becker wrote:
> Hi Matthias,
> I'm familiar with how the timestamp synchronization currently works. I also 
> submit that it does not work for our use-case, which is the following: The 
> table-backing topic contains records with the best available data we have for 
> a given item. IF a record in this topic is updated, we would always prefer to 
> join using this data *regardless* of whether it is "newer" than the incoming 
> event we are trying to join it with.
> 
> Essentially, streams assumes that we must want the table data that was 
> current at the time the event was produced, and here we simply don't. If we 
> have newer data, we want that. But my larger concern here is actually 
> reprocessing; when doing that the older table-data will be log compacted away 
> and the current timestamp semantics will result in events that occurred prior 
> to the latest table updates being unjoined. Does this make sense now?
> 
> Thanks!
> Tommy
> 
> On Mon, 2018-09-17 at 09:51 -0700, Matthias J. Sax wrote:
> 
> I am not sure if this feature would help with stream-table joins. Also
> 
> note, that we recently merged a PR that improves the timestamp
> 
> synchronization of Kafka Streams -- this will vastly improve the guarantees.
> 
> 
> What I don't understand:
> 
> 
> So table records that have been updated recently will not be read until the 
> stream records reach or exceed that same timestamp.
> 
> 
> Yes, this is on purpose / by design.
> 
> 
> and if they do it will be with old data
> 
> 
> What do you mean by "old data"? By definition, the stream record will
> 
> join with a table that contains data up-to the stream record's
> 
> timestamp. It does semantically not make sense to advance the table
> 
> beyond the stream record's timestamp, because if you do this, you would
> 
> semantically join with "future data" what---from my point of view---is
> 
> semantically incorrect.
> 
> 
> Shameless plug: you might want to read
> 
> https://www.confluent.io/blog/streams-tables-two-sides-same-coin
> 
> 
> 
> 
> -Matthias
> 
> 
> On 9/17/18 8:23 AM, Thomas Becker wrote:
> 
> For my part, a major use-case for this feature is stream-table joins. 
> Currently, KafkaStreams does the wrong thing in some cases because the only 
> message choosing strategy available is timestamp-based. So table records that 
> have been updated recently will not be read until the stream records reach or 
> exceed that same timestamp. So there is no guarantee these records get joined 
> at all, and if they do it will be with old data. I realize we're talking 
> about the consumer here and not streams specifically, but as it stands I 
> can't even write a non-streams application that does a join but prioritizes 
> table-topic records over stream records without using multiple consumers.
> 
> 
> On Wed, 2018-09-05 at 08:18 -0700, Colin McCabe wrote:
> 
> 
> Hi all,
> 
> 
> 
> I agree that DISCUSS is more appropriate than VOTE at this point, since I 
> don't remember the last discussion coming to a definite conclusion.
> 
> 
> 
> I guess my concern is that this will add complexity and memory consumption on 
> the server side.  In the case of incremental fetch requests, we will have to 
> track at least two extra bytes per partition, to know what the priority of 
> each partition is within each active fetch session.
> 
> 
> 
> It would be nice to hear more about the use-cases for this feature.  I think 
> Gwen asked about this earlier, and I don't remember reading a response.  The 
> fact that we're now talking about Samza interfaces is a bit of a red flag.  
> After all, Samza didn't need partition priorities to do what it did.  You can 
> do a lot with muting partitions and using appropriate threading in your code.
> 
> 
> 
> For example, you can hand data from a partition off to a work queue with a 
> fixed size, which is handled by a separate service thread.  If the queue gets 
> full, you can mute the partition until some of the buffered data is 
> processed.  Kafka Streams uses a similar approach to avoid reading partition 
> data that isn't immediately needed.
> 
> 
> 
> There might be some use-cases that need priorities eventually, but I'm 
> concerned that we're jumping the gun by trying to implement this before we 
> know what they are.
> 
> 
> 
> best,
> 
> 
> Colin
> 
> 
> 
> 
> On Wed, Sep 5, 2018, at 01:06, Jan Filipiak wrote:
> 
> 
> 
> On 05.09.2018 02:38, 
> 

Re: [VOTE] KIP-372: Naming Repartition Topics for Joins and Grouping

2018-09-17 Thread Matthias J. Sax
+1 (binding)

-Matthias

On 9/17/18 1:12 PM, Guozhang Wang wrote:
> +1 from me, thanks Bill !
> 
> On Mon, Sep 17, 2018 at 12:43 PM, Bill Bejeck  wrote:
> 
>> All,
>>
>> I'd like to start the voting process for KIP-372.  Here's the link to the
>> updated proposal
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> 372%3A+Naming+Repartition+Topics+for+Joins+and+Grouping
>>
>> I'll start with my own +1.
>>
>> Thanks,
>> Bill
>>
> 
> 
> 



signature.asc
Description: OpenPGP digital signature


Re: [VOTE] KIP-372: Naming Repartition Topics for Joins and Grouping

2018-09-17 Thread Guozhang Wang
+1 from me, thanks Bill !

On Mon, Sep 17, 2018 at 12:43 PM, Bill Bejeck  wrote:

> All,
>
> I'd like to start the voting process for KIP-372.  Here's the link to the
> updated proposal
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 372%3A+Naming+Repartition+Topics+for+Joins+and+Grouping
>
> I'll start with my own +1.
>
> Thanks,
> Bill
>



-- 
-- Guozhang


[VOTE] KIP-372: Naming Repartition Topics for Joins and Grouping

2018-09-17 Thread Bill Bejeck
All,

I'd like to start the voting process for KIP-372.  Here's the link to the
updated proposal
https://cwiki.apache.org/confluence/display/KAFKA/KIP-372%3A+Naming+Repartition+Topics+for+Joins+and+Grouping

I'll start with my own +1.

Thanks,
Bill


[jira] [Resolved] (KAFKA-7150) Error in processing fetched data for one partition may stop follower fetching other partitions

2018-09-17 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-7150.

Resolution: Not A Problem

This is no longer a problem since we have removed the logic to raise a fatal 
error from the out of range error handling.

> Error in processing fetched data for one partition may stop follower fetching 
> other partitions
> --
>
> Key: KAFKA-7150
> URL: https://issues.apache.org/jira/browse/KAFKA-7150
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 0.10.2.2, 0.11.0.3, 1.0.2, 1.1.0
>Reporter: Anna Povzner
>Priority: Major
>
> If the followers fails to process data for one topic partitions, like out of 
> order offsets error, the whole ReplicaFetcherThread is killed, which also 
> stops fetching for other topic partitions serviced by this fetcher thread. 
> This may result in un-necessary under-replicated partitions. I think it would 
> be better to continue fetching for other topic partitions, and just remove 
> the partition with an error from the responsibility of the fetcher thread.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7414) Do not fail broker on out of range offsets in replica fetcher

2018-09-17 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-7414.

   Resolution: Fixed
Fix Version/s: 2.1.0

> Do not fail broker on out of range offsets in replica fetcher
> -
>
> Key: KAFKA-7414
> URL: https://issues.apache.org/jira/browse/KAFKA-7414
> Project: Kafka
>  Issue Type: Improvement
>  Components: replication
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
> Fix For: 2.1.0
>
>
> In the replica fetcher, we have logic to detect the case when the follower's 
> offset is ahead of the leader's. If unclean leader election is not enabled, 
> we raise a fatal error and kill the broker. 
> This behavior is inconsistent depending on the message format. With 
> KIP-101/KIP-279, upon becoming a follower, the replica would use leader epoch 
> information to reconcile the end of the log with the leader and simply 
> truncate. Additionally, with the old format, the check is not really 
> bulletproof for detecting data loss since the unclean leader's end offset 
> might have already caught up to the follower's offset at the time of its 
> initial fetch or when it queries for the current log end offset.
> To make the logic consistent, we could raise a fatal error whenever the 
> follower has to truncate below the high watermark. However, the fatal error 
> is probably overkill and it would be better to log a warning since most of 
> the damage is already done if the leader has already been elected and this 
> causes a huge blast radius.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7417) Some replicas cannot become in-sync

2018-09-17 Thread Mikhail Khomenko (JIRA)
Mikhail Khomenko created KAFKA-7417:
---

 Summary: Some replicas cannot become in-sync
 Key: KAFKA-7417
 URL: https://issues.apache.org/jira/browse/KAFKA-7417
 Project: Kafka
  Issue Type: Bug
  Components: replication
Affects Versions: 2.0.0
Reporter: Mikhail Khomenko


Hi,
we have faced the following issue - some replicas cannot become in-sync.
Distribution of in-sync replicas amongst topics is random. For instance:
{code:java}
$ kafka-topics --zookeeper 1.2.3.4:8181 --describe --topic TEST
Topic:TEST PartitionCount:8 ReplicationFactor:3 Configs:
Topic: TEST Partition: 0 Leader: 2 Replicas: 0,2,1 Isr: 0,1,2
Topic: TEST Partition: 1 Leader: 1 Replicas: 1,0,2 Isr: 0,1,2
Topic: TEST Partition: 2 Leader: 2 Replicas: 2,1,0 Isr: 0,1,2
Topic: TEST Partition: 3 Leader: 2 Replicas: 0,1,2 Isr: 0,1,2
Topic: TEST Partition: 4 Leader: 1 Replicas: 1,2,0 Isr: 0,1,2
Topic: TEST Partition: 5 Leader: 2 Replicas: 2,0,1 Isr: 0,1,2
Topic: TEST Partition: 6 Leader: 0 Replicas: 0,2,1 Isr: 0,1,2
Topic: TEST Partition: 7 Leader: 0 Replicas: 1,0,2 Isr: 0,2{code}
Files in segment TEST-7 are equal (the same md5sum) on all 3 brokers. They were also
checked by kafka.tools.DumpLogSegments - the messages are the same.


We have a 3-broker cluster configuration with Confluent Kafka 5.0.0 (it's Apache
Kafka 2.0.0).
Each broker has the following configuration:
{code:java}
advertised.host.name = null
advertised.listeners = PLAINTEXT://1.2.3.4:9200
advertised.port = null
alter.config.policy.class.name = null
alter.log.dirs.replication.quota.window.num = 11
alter.log.dirs.replication.quota.window.size.seconds = 1
authorizer.class.name = 
auto.create.topics.enable = true
auto.leader.rebalance.enable = true
background.threads = 10
broker.id = 1
broker.id.generation.enable = true
broker.interceptor.class = class 
org.apache.kafka.server.interceptor.DefaultBrokerInterceptor
broker.rack = null
client.quota.callback.class = null
compression.type = producer
connections.max.idle.ms = 60
controlled.shutdown.enable = true
controlled.shutdown.max.retries = 3
controlled.shutdown.retry.backoff.ms = 5000
controller.socket.timeout.ms = 3
create.topic.policy.class.name = null
default.replication.factor = 3
delegation.token.expiry.check.interval.ms = 360
delegation.token.expiry.time.ms = 8640
delegation.token.master.key = null
delegation.token.max.lifetime.ms = 60480
delete.records.purgatory.purge.interval.requests = 1
delete.topic.enable = true
fetch.purgatory.purge.interval.requests = 1000
group.initial.rebalance.delay.ms = 3000
group.max.session.timeout.ms = 30
group.min.session.timeout.ms = 6000
host.name = 
inter.broker.listener.name = null
inter.broker.protocol.version = 2.0
leader.imbalance.check.interval.seconds = 300
leader.imbalance.per.broker.percentage = 10
listener.security.protocol.map = 
PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
listeners = PLAINTEXT://0.0.0.0:9200
log.cleaner.backoff.ms = 15000
log.cleaner.dedupe.buffer.size = 134217728
log.cleaner.delete.retention.ms = 8640
log.cleaner.enable = true
log.cleaner.io.buffer.load.factor = 0.9
log.cleaner.io.buffer.size = 524288
log.cleaner.io.max.bytes.per.second = 1.7976931348623157E308
log.cleaner.min.cleanable.ratio = 0.5
log.cleaner.min.compaction.lag.ms = 0
log.cleaner.threads = 1
log.cleanup.policy = [delete]
log.dir = /tmp/kafka-logs
log.dirs = /var/lib/kafka/data
log.flush.interval.messages = 9223372036854775807
log.flush.interval.ms = null
log.flush.offset.checkpoint.interval.ms = 6
log.flush.scheduler.interval.ms = 9223372036854775807
log.flush.start.offset.checkpoint.interval.ms = 6
log.index.interval.bytes = 4096
log.index.size.max.bytes = 10485760
log.message.downconversion.enable = true
log.message.format.version = 2.0
log.message.timestamp.difference.max.ms = 9223372036854775807
log.message.timestamp.type = CreateTime
log.preallocate = false
log.retention.bytes = -1
log.retention.check.interval.ms = 30
log.retention.hours = 8760
log.retention.minutes = null
log.retention.ms = null
log.roll.hours = 168
log.roll.jitter.hours = 0
log.roll.jitter.ms = null
log.roll.ms = null
log.segment.bytes = 1073741824
log.segment.delete.delay.ms = 6
max.connections.per.ip = 2147483647
max.connections.per.ip.overrides = 
max.incremental.fetch.session.cache.slots = 1000
message.max.bytes = 112
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 3
min.insync.replicas = 2
num.io.threads = 8
num.network.threads = 8
num.partitions = 8
num.recovery.threads.per.data.dir = 1
num.replica.alter.log.dirs.threads = null
num.replica.fetchers = 4
offset.metadata.max.bytes = 4096
offsets.commit.required.acks = -1
offsets.commit.timeout.ms = 5000
offsets.load.buffer.size = 5242880
offsets.retention.check.interval.ms = 60
offsets.retention.minutes = 525600

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-17 Thread Ron Dagostino
<<
wrote:

> Hi Ron,
>
>
> *1) Is there a reason to communicate this value back to the client when the
> client already knows it?  It's an extra network round-trip, and since the
> SASL round-trips are defined by the spec I'm not certain adding an extra
> round-trip is acceptable.*
>
> I wasn't suggesting an extra round-trip for propagating session lifetime. I
> was expecting session lifetime to be added to the last SASL_AUTHENTICATE
> response from the broker. Because SASL is a challenge-response mechanism,
> SaslServer knows when the response being sent is the last one and hence can
> send the session lifetime in the response (in the same way as we propagate
> errors). I was expecting this to be added as an extra field alongside
> errors, not in the opaque body as mentioned in the KIP. The opaque byte
> array strictly conforms to the SASL mechanism wire protocol and we want to
> keep it that way.
>
> As you have said, we don't need server to propagate session lifetime for
> OAUTHBEARER since client knows the token lifetime. But server also knows
> the credential lifetime and by having the server decide the lifetime, we
> can use the same code path for all mechanisms. If we only have a constant
> max lifetime in the server, for PLAIN and SCRAM we will end up having the
> same lifetime for all credentials with no ability to set actual expiry. We
> use SCRAM for delegation token based authentication where credentials have
> an expiry time, so we do need to be able to set individual credential
> lifetimes, where the server knows the expiry time but client may not.
>
> *2) I also just realized that if the client is to learn the credential
> lifetime we wouldn't want to put special-case code in the Authenticator for
> GSSAPI and OAUTHBEARER; we would want to expose the value generically,
> probably as a negotiated property on the SaslClient instance.*
>
> I was trying to avoid this altogether. Client doesn't need to know
> credential lifetime. Server asks the client to re-authenticate within its
> session lifetime.
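
Purely as an illustration of that split of responsibilities, a minimal
client-side sketch: the client just takes whatever session lifetime the broker
returns on the final SaslAuthenticate response and schedules a
re-authentication before it expires (the class name and the 90% factor below
are arbitrary, not from the KIP):

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ReauthenticationScheduler {

    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    /**
     * sessionLifetimeMs is the value the broker returned on the final
     * SaslAuthenticate response; 0 would mean "no re-authentication required".
     */
    public void schedule(long sessionLifetimeMs, Runnable reauthenticate) {
        if (sessionLifetimeMs <= 0)
            return; // broker did not ask for re-authentication
        // Re-authenticate somewhat before the session actually expires.
        long delayMs = (long) (sessionLifetimeMs * 0.9);
        executor.schedule(reauthenticate, delayMs, TimeUnit.MILLISECONDS);
    }
}
{code}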
>
> 3) From the KIP, I wasn't entirely clear about the purpose of the two
> configs:
>
> sasl.login.refresh.reauthenticate.enable: Do we need this? Client knows if
> broker version supports re-authentication based on the SASL_AUTHENTICATE
> version returned in ApiVersionsResponse. Client knows if broker is
> configured to enable re-authentication based on session lifetime returned
> in SaslAuthenticateResponse. If broker has re-authentication configured and
> client supports re-authentication, you would always want to re-authenticate.
> So I wasn't sure why we need a config to opt in or out on the client-side.
>
>
> connections.max.reauth.ms: We obviously need a broker-side config. Not
> entirely sure about the semantics of the config that drives
> re-authentication. In particular, I wasn't expecting that we would reject
> tokens or tickets simply because they were too long-lived. Since tokens or
> tickets are granted by a 3rd party authority, I am not sure if clients will
> always have control over the lifetime. Do we need to support any more
> scenarios than these:
>
>
> A) reauth.ms=10,credential.lifetime.ms=10 : Broker sets
> session.lifetime=10,
> so this works.
>
> B) reauth.ms=10, credential.lifetime.ms=5 : Broker sets
> session.lifetime=5,
> so this works.
> C) reauth.ms=10, credential.lifetime.ms=20 : Broker sets
> session.lifetime=10. Client re-authenticates even though token was not
> refreshed. Does this matter?
> D) reauth.ms=Long.MAX_VALUE, credential.lifetime.ms=10: Broker sets
> session.lifetime=10, client re-authenticates based on credential expiry.
> E) reauth.ms=0 (default), credential.lifetime.ms=10 : Broker sets
> session.lifetime=0, Broker doesn't terminate sessions, client doesn't
> re-authenticate. We generate useful metrics.
> F) reauth.ms=0 (default),no lifetime for credential (e.g. PLAIN): Broker
> sets session.lifetime=0, Broker doesn't terminate sessions, client doesn't
> re-authenticate
> G) reauth.ms=10,no lifetime for credential (e.g. PLAIN) : Broker sets
> session.lifetime=10. Client re-authenticates.
>
> I would have thought that D) is the typical scenario for OAuth/Kerberos to
> respect token expiry time. G) would be typical scenario for PLAIN to force
> re-authenication at regular intervals. A/B/C are useful to force
> re-authentication in scenarios where you might check for credential
> revocation in the server. And E/F are useful to disable re-authentication
> and generate metrics (also the default behaviour useful during migration).
> Have I missed something?
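
A tiny sketch of the rule those scenarios seem to imply on the broker side --
the session lifetime is the minimum of connections.max.reauth.ms and the
credential's remaining lifetime, with 0 meaning "don't force
re-authentication" (class and method names here are illustrative only):

{code:java}
public class SessionLifetimeSketch {

    /**
     * maxReauthMs is the broker's connections.max.reauth.ms; credentialLifetimeMs
     * is the credential's remaining lifetime, or null if the mechanism has no
     * expiry (e.g. PLAIN).
     */
    public static long sessionLifetimeMs(long maxReauthMs, Long credentialLifetimeMs) {
        if (maxReauthMs == 0)
            return 0;                     // scenarios E/F: re-authentication disabled
        if (credentialLifetimeMs == null)
            return maxReauthMs;           // scenario G: no credential expiry
        return Math.min(maxReauthMs, credentialLifetimeMs); // scenarios A-D
    }

    public static void main(String[] args) {
        System.out.println(sessionLifetimeMs(10, 10L));             // A -> 10
        System.out.println(sessionLifetimeMs(10, 5L));              // B -> 5
        System.out.println(sessionLifetimeMs(10, 20L));             // C -> 10
        System.out.println(sessionLifetimeMs(Long.MAX_VALUE, 10L)); // D -> 10
        System.out.println(sessionLifetimeMs(0, 10L));              // E -> 0
    }
}
{code}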
>
>
> On Mon, Sep 17, 2018 at 4:27 PM, Ron Dagostino  wrote:
>
> > Hi yet again, Rajini.  I also just realized that if the client is to
> learn
> > the credential lifetime we wouldn't want to put special-case code in the
> > Authenticator for GSSAPI and OAUTHBEARER; we would want to expose the
> value
> > generically, probably as a negotiated property on the SaslClient
> instance.
> > We might be talking 

Re: [DISCUSS] KIP-349 Priorities for Source Topics

2018-09-17 Thread Thomas Becker
To sum up here, I don't disagree that timestamp semantics are nice and often 
useful. But currently, there is no way to opt-out of these semantics. In our 
case the timestamp of an updated item record, which say, provides a better or 
corrected description, is simply preferred over the old record, period. Trying 
to force a temporal relationship between that and an event where the item was 
viewed is non-sensical.


On Mon, 2018-09-17 at 18:18 +, Thomas Becker wrote:

Hi Matthias,

I'm familiar with how the timestamp synchronization currently works. I also 
submit that it does not work for our use-case, which is the following: The 
table-backing topic contains records with the best available data we have for a 
given item. IF a record in this topic is updated, we would always prefer to 
join using this data *regardless* of whether it is "newer" than the incoming 
event we are trying to join it with.


Essentially, streams assumes that we must want the table data that was current 
at the time the event was produced, and here we simply don't. If we have newer 
data, we want that. But my larger concern here is actually reprocessing; when 
doing that the older table-data will be log compacted away and the current 
timestamp semantics will result in events that occurred prior to the latest 
table updates being unjoined. Does this make sense now?


Thanks!

Tommy


On Mon, 2018-09-17 at 09:51 -0700, Matthias J. Sax wrote:


I am not sure if this feature would help with stream-table joins. Also


note, that we recently merged a PR that improves the timestamp


synchronization of Kafka Streams -- this will vastly improve the guarantees.



What I don't understand:



So table records that have been updated recently will not be read until the 
stream records reach or exceed that same timestamp.



Yes, this is on purpose / by design.



and if they do it will be with old data



What do you mean by "old data"? By definition, the stream record will


join with a table that contains data up-to the stream record's


timestamp. It does semantically not make sense to advance the table


beyond the stream record's timestamp, because if you do this, you would


join with "future data", which---from my point of view---is


semantically incorrect.



Shameless plug: you might want to read


https://www.confluent.io/blog/streams-tables-two-sides-same-coin





-Matthias



On 9/17/18 8:23 AM, Thomas Becker wrote:


For my part, a major use-case for this feature is stream-table joins. 
Currently, KafkaStreams does the wrong thing in some cases because the only 
message choosing strategy available is timestamp-based. So table records that 
have been updated recently will not be read until the stream records reach or 
exceed that same timestamp. So there is no guarantee these records get joined 
at all, and if they do it will be with old data. I realize we're talking about 
the consumer here and not streams specifically, but as it stands I can't even 
write a non-streams application that does a join but prioritizes table-topic 
records over stream records without using multiple consumers.



On Wed, 2018-09-05 at 08:18 -0700, Colin McCabe wrote:



Hi all,




I agree that DISCUSS is more appropriate than VOTE at this point, since I don't 
remember the last discussion coming to a definite conclusion.




I guess my concern is that this will add complexity and memory consumption on 
the server side.  In the case of incremental fetch requests, we will have to 
track at least two extra bytes per partition, to know what the priority of each 
partition is within each active fetch session.




It would be nice to hear more about the use-cases for this feature.  I think 
Gwen asked about this earlier, and I don't remember reading a response.  The 
fact that we're now talking about Samza interfaces is a bit of a red flag.  
After all, Samza didn't need partition priorities to do what it did.  You can 
do a lot with muting partitions and using appropriate threading in your code.




For example, you can hand data from a partition off to a work queue with a 
fixed size, which is handled by a separate service thread.  If the queue gets 
full, you can mute the partition until some of the buffered data is processed.  
Kafka Streams uses a similar approach to avoid reading partition data that 
isn't immediately needed.
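
For concreteness, a minimal sketch of that pause/resume pattern (the class name, queue capacity, and types below are arbitrary; a separate worker thread, not shown, drains the queue):

```java
import java.time.Duration;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Sketch only: a bounded hand-off queue plus pause()/resume(), as described above.
class BoundedHandoffLoop {
    static final int CAPACITY = 1000;
    final BlockingQueue<ConsumerRecord<String, String>> workQueue =
            new ArrayBlockingQueue<>(CAPACITY);

    void pollLoop(KafkaConsumer<String, String> consumer) throws InterruptedException {
        while (true) {
            if (workQueue.remainingCapacity() == 0) {
                consumer.pause(consumer.assignment());      // mute until the worker catches up
            } else if (!consumer.paused().isEmpty() && workQueue.size() < CAPACITY / 2) {
                consumer.resume(consumer.paused());         // un-mute once there is headroom
            }
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(100))) {
                workQueue.put(record);                      // may block briefly if the worker lags
            }
        }
    }
}
```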




There might be some use-cases that need priorities eventually, but I'm 
concerned that we're jumping the gun by trying to implement this before we know 
what they are.




best,



Colin





On Wed, Sep 5, 2018, at 01:06, Jan Filipiak wrote:




On 05.09.2018 02:38, 
n...@afshartous.com
 wrote:




On Sep 4, 2018, at 4:20 PM, Jan Filipiak 

Re: [DISCUSS] KIP-349 Priorities for Source Topics

2018-09-17 Thread Thomas Becker
Hi Matthias,
I'm familiar with how the timestamp synchronization currently works. I also 
submit that it does not work for our use-case, which is the following: The 
table-backing topic contains records with the best available data we have for a 
given item. IF a record in this topic is updated, we would always prefer to 
join using this data *regardless* of whether it is "newer" than the incoming 
event we are trying to join it with.

Essentially, streams assumes that we must want the table data that was current 
at the time the event was produced, and here we simply don't. If we have newer 
data, we want that. But my larger concern here is actually reprocessing; when 
doing that the older table-data will be log compacted away and the current 
timestamp semantics will result in events that occurred prior to the latest 
table updates being unjoined. Does this make sense now?

Thanks!
Tommy

On Mon, 2018-09-17 at 09:51 -0700, Matthias J. Sax wrote:

I am not sure if this feature would help with stream-table joins. Also

note, that we recently merged a PR that improves the timestamp

synchronization of Kafka Streams -- this will vastly improve the guarantees.


What I don't understand:


So table records that have been updated recently will not be read until the 
stream records reach or exceed that same timestamp.


Yes, this is on purpose / by design.


and if they do it will be with old data


What do you mean by "old data"? By definition, the stream record will

join with a table that contains data up-to the stream record's

timestamp. Semantically, it does not make sense to advance the table

beyond the stream record's timestamp, because if you do this, you would

join with "future data", which---from my point of view---is

semantically incorrect.


Shameless plug: you might want to read

https://www.confluent.io/blog/streams-tables-two-sides-same-coin




-Matthias


On 9/17/18 8:23 AM, Thomas Becker wrote:

For my part, a major use-case for this feature is stream-table joins. 
Currently, KafkaStreams does the wrong thing in some cases because the only 
message choosing strategy available is timestamp-based. So table records that 
have been updated recently will not be read until the stream records reach or 
exceed that same timestamp. So there is no guarantee these records get joined 
at all, and if they do it will be with old data. I realize we're talking about 
the consumer here and not streams specifically, but as it stands I can't even 
write a non-streams application that does a join but prioritizes table-topic 
records over stream records without using multiple consumers.


On Wed, 2018-09-05 at 08:18 -0700, Colin McCabe wrote:


Hi all,



I agree that DISCUSS is more appropriate than VOTE at this point, since I don't 
remember the last discussion coming to a definite conclusion.



I guess my concern is that this will add complexity and memory consumption on 
the server side.  In the case of incremental fetch requests, we will have to 
track at least two extra bytes per partition, to know what the priority of each 
partition is within each active fetch session.



It would be nice to hear more about the use-cases for this feature.  I think 
Gwen asked about this earlier, and I don't remember reading a response.  The 
fact that we're now talking about Samza interfaces is a bit of a red flag.  
After all, Samza didn't need partition priorities to do what it did.  You can 
do a lot with muting partitions and using appropriate threading in your code.



For example, you can hand data from a partition off to a work queue with a 
fixed size, which is handled by a separate service thread.  If the queue gets 
full, you can mute the partition until some of the buffered data is processed.  
Kafka Streams uses a similar approach to avoid reading partition data that 
isn't immediately needed.



There might be some use-cases that need priorities eventually, but I'm 
concerned that we're jumping the gun by trying to implement this before we know 
what they are.



best,


Colin




On Wed, Sep 5, 2018, at 01:06, Jan Filipiak wrote:



On 05.09.2018 02:38, 
n...@afshartous.com
 wrote:



On Sep 4, 2018, at 4:20 PM, Jan Filipiak <jan.filip...@trivago.com> wrote:



what I meant is literally this interface:



https://samza.apache.org/learn/documentation/0.7.0/api/javadocs/org/apache/samza/system/chooser/MessageChooser.html
 



Hi Jan,



Thanks for the reply and I have a few questions.  This Samza doc



   https://samza.apache.org/learn/documentation/0.14/container/streams.html 




indicates that the chooser is set via configuration.  Are you 

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-17 Thread Rajini Sivaram
Hi Ron,


*1) Is there a reason to communicate this value back to the client when the
client already knows it?  It's an extra network round-trip, and since the
SASL round-trips are defined by the spec I'm not certain adding an extra
round-trip is acceptable.*

I wasn't suggesting an extra round-trip for propagating session lifetime. I
was expecting session lifetime to be added to the last SASL_AUTHENTICATE
response from the broker. Because SASL is a challenge-response mechanism,
SaslServer knows when the response being sent is the last one and hence can
send the session lifetime in the response (in the same way as we propagate
errors). I was expecting this to be added as an extra field alongside
errors, not in the opaque body as mentioned in the KIP. The opaque byte
array strictly conforms to the SASL mechanism wire protocol and we want to
keep it that way.

As you have said, we don't need server to propagate session lifetime for
OAUTHBEARER since client knows the token lifetime. But server also knows
the credential lifetime and by having the server decide the lifetime, we
can use the same code path for all mechanisms. If we only have a constant
max lifetime in the server, for PLAIN and SCRAM we will end up having the
same lifetime for all credentials with no ability to set actual expiry. We
use SCRAM for delegation token based authentication where credentials have
an expiry time, so we do need to be able to set individual credential
lifetimes, where the server knows the expiry time but client may not.

*2) I also just realized that if the client is to learn the credential
lifetime we wouldn't want to put special-case code in the Authenticator for
GSSAPI and OAUTHBEARER; we would want to expose the value generically,
probably as a negotiated property on the SaslClient instance.*

I was trying to avoid this altogether. Client doesn't need to know
credential lifetime. Server asks the client to re-authenticate within its
session lifetime.

3) From the KIP, I wasn't entirely clear about the purpose of the two
configs:

sasl.login.refresh.reauthenticate.enable: Do we need this? Client knows if
broker version supports re-authentication based on the SASL_AUTHENTICATE
version returned in ApiVersionsResponse. Client knows if broker is
configured to enable re-authentication based on session lifetime returned
in SaslAuthenticateResponse. If broker has re-authentication configured and
client supports re-authentication, you would always want to re-authenticate.
So I wasn't sure why we need a config to opt in or out on the client-side.


connections.max.reauth.ms: We obviously need a broker-side config. Not
entirely sure about the semantics of the config that drives
re-authentication. In particular, I wasn't expecting that we would reject
tokens or tickets simply because they were too long-lived. Since tokens or
tickets are granted by a 3rd party authority, I am not sure if clients will
always have control over the lifetime. Do we need to support any more
scenarios than these:


A) reauth.ms=10,credential.lifetime.ms=10 : Broker sets session.lifetime=10,
so this works.

B) reauth.ms=10, credential.lifetime.ms=5 : Broker sets session.lifetime=5,
so this works.
C) reauth.ms=10, credential.lifetime.ms=20 : Broker sets
session.lifetime=10. Client re-authenticates even though token was not
refreshed. Does this matter?
D) reauth.ms=Long.MAX_VALUE, credential.lifetime.ms=10: Broker sets
session.lifetime=10, client re-authenticates based on credential expiry.
E) reauth.ms=0 (default), credential.lifetime.ms=10 : Broker sets
session.lifetime=0, Broker doesn't terminate sessions, client doesn't
re-authenticate. We generate useful metrics.
F) reauth.ms=0 (default),no lifetime for credential (e.g. PLAIN): Broker
sets session.lifetime=0, Broker doesn't terminate sessions, client doesn't
re-authenticate
G) reauth.ms=10,no lifetime for credential (e.g. PLAIN) : Broker sets
session.lifetime=10. Client re-authenticates.

I would have thought that D) is the typical scenario for OAuth/Kerberos to
respect token expiry time. G) would be the typical scenario for PLAIN to force
re-authentication at regular intervals. A/B/C are useful to force
re-authentication in scenarios where you might check for credential
revocation in the server. And E/F are useful to disable re-authentication
and generate metrics (also the default behaviour useful during migration).
Have I missed something?
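
For reference, the broker-side rule implied by scenarios A-G above fits in a few lines (a sketch only; the method and parameter names are mine, not from the KIP):

```java
// Sketch of the broker-side rule implied by scenarios A-G above. A returned value
// of 0 means the broker neither terminates the session nor expects re-authentication.
static long sessionLifetimeMs(long connectionsMaxReauthMs, Long credentialLifetimeMs) {
    if (connectionsMaxReauthMs <= 0)
        return 0L;                                                 // E/F: metrics only
    if (credentialLifetimeMs == null)
        return connectionsMaxReauthMs;                             // G: no credential lifetime (e.g. PLAIN)
    return Math.min(connectionsMaxReauthMs, credentialLifetimeMs); // A/B/C/D
}
```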


On Mon, Sep 17, 2018 at 4:27 PM, Ron Dagostino  wrote:

> Hi yet again, Rajini.  I also just realized that if the client is to learn
> the credential lifetime we wouldn't want to put special-case code in the
> Authenticator for GSSAPI and OAUTHBEARER; we would want to expose the value
> generically, probably as a negotiated property on the SaslClient instance.
> We might be talking about making the negotiated property key part of the
> public API.  In other words, the SaslClient would be responsible for
> exposing the credential (i.e. token or ticket) lifetime at a 

Re: [VOTE] KIP-358: Migrate Streams API to Duration instead of long ms times

2018-09-17 Thread Nikolay Izhikov
John, 

Got it.

Will do my best to meet this deadline.

On Mon, 17/09/2018 at 11:52 -0500, John Roesler wrote:
> Yay! Thanks so much for sticking with this Nikolay.
> 
> I look forward to your PR!
> 
> Not to put pressure on you, but just to let you know, the deadline for
> getting your pr *merged* for 2.1 is _October 1st_,
> so you basically have 2 weeks to send the PR, have the reviews, and get it
> merged.
> 
> (see
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=91554044)
> 
> Thanks again,
> -John
> 
> On Mon, Sep 17, 2018 at 10:29 AM Nikolay Izhikov 
> wrote:
> 
> > This KIP is now accepted with:
> > - 3 binding +1
> > - 2 non binding +1
> > 
> > Thanks, all.
> > 
> > Especially, John, Matthias, Guozhang, Bill, Damian!
> > 
> > > On Thu, 13/09/2018 at 22:16 -0700, Guozhang Wang wrote:
> > > +1 (binding), thank you Nikolay!
> > > 
> > > Guozhang
> > > 
> > > On Thu, Sep 13, 2018 at 9:39 AM, Matthias J. Sax 
> > > wrote:
> > > 
> > > > Thanks for the KIP.
> > > > 
> > > > +1 (binding)
> > > > 
> > > > 
> > > > -Matthias
> > > > 
> > > > On 9/5/18 8:52 AM, John Roesler wrote:
> > > > > I'm a +1 (non-binding)
> > > > > 
> > > > > On Mon, Sep 3, 2018 at 8:33 AM Nikolay Izhikov 
> > > > 
> > > > wrote:
> > > > > 
> > > > > > Dear commiters.
> > > > > > 
> > > > > > Please, vote on a KIP.
> > > > > > 
> > > > > > On Fri, 31/08/2018 at 12:05 -0500, John Roesler wrote:
> > > > > > > Hi Nikolay,
> > > > > > > 
> > > > > > > You can start a PR any time, but we cannot merge it (and probably
> > 
> > won't
> > > > 
> > > > do
> > > > > > > serious reviews) until after the KIP is voted and approved.
> > > > > > > 
> > > > > > > Sometimes people start a PR during discussion just to help
> > 
> > provide more
> > > > > > > context, but it's not required (and can also be distracting
> > 
> > because the
> > > > > > 
> > > > > > KIP
> > > > > > > discussion should avoid implementation details).
> > > > > > > 
> > > > > > > Let's wait one more day for any other comments and plan to start
> > 
> > the
> > > > 
> > > > vote
> > > > > > > on Monday if there are no other debates.
> > > > > > > 
> > > > > > > Once you start the vote, you have to leave it up for at least 72
> > 
> > hours,
> > > > > > 
> > > > > > and
> > > > > > > it requires 3 binding votes to pass. Only Kafka Committers have
> > 
> > binding
> > > > > > > votes (https://kafka.apache.org/committers).
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > -John
> > > > > > > 
> > > > > > > On Fri, Aug 31, 2018 at 11:09 AM Bill Bejeck 
> > > > 
> > > > wrote:
> > > > > > > 
> > > > > > > > Hi Nickolay,
> > > > > > > > 
> > > > > > > > Thanks for the clarification.
> > > > > > > > 
> > > > > > > > -Bill
> > > > > > > > 
> > > > > > > > On Fri, Aug 31, 2018 at 11:59 AM Nikolay Izhikov <
> > 
> > nizhi...@apache.org
> > > > > > > > wrote:
> > > > > > > > 
> > > > > > > > > Hello, John.
> > > > > > > > > 
> > > > > > > > > This is my first KIP, so, please, help me with kafka
> > 
> > development
> > > > > > 
> > > > > > process.
> > > > > > > > > 
> > > > > > > > > Should I start to work on PR now? Or should I wait for a
> > 
> > "+1" from
> > > > > > > > > commiters?
> > > > > > > > > 
> > > > > > > > > On Fri, 31/08/2018 at 10:33 -0500, John Roesler wrote:
> > > > > > > > > > I see. I guess that once we are in the PR-reviewing phase,
> > 
> > we'll
> > > > > > 
> > > > > > be in
> > > > > > > > 
> > > > > > > > a
> > > > > > > > > > better position to see what else can/should be done, and
> > 
> > we can
> > > > > > 
> > > > > > talk
> > > > > > > > > 
> > > > > > > > > about
> > > > > > > > > > follow-on work at that time.
> > > > > > > > > > 
> > > > > > > > > > Thanks for the clarification,
> > > > > > > > > > -John
> > > > > > > > > > 
> > > > > > > > > > On Fri, Aug 31, 2018 at 1:19 AM Nikolay Izhikov <
> > > > > > 
> > > > > > nizhi...@apache.org>
> > > > > > > > > 
> > > > > > > > > wrote:
> > > > > > > > > > 
> > > > > > > > > > > Hello, Bill
> > > > > > > > > > > 
> > > > > > > > > > > > In the "Proposed Changes" section, there is "Try to
> > 
> > reduce the
> > > > > > > > > > > 
> > > > > > > > > > > visibility of methods in next tickets" does that mean
> > 
> > eventual
> > > > > > > > > 
> > > > > > > > > deprecation
> > > > > > > > > > > and removal?
> > > > > > > > > > > 
> > > > > > > > > > > 1. Some methods will become deprecated. I think they
> > 
> > will be
> > > > > > 
> > > > > > removed
> > > > > > > > 
> > > > > > > > in
> > > > > > > > > > > the future.
> > > > > > > > > > > You can find list of deprecated methods in KIP.
> > > > > > > > > > > 
> > > > > > > > > > > 2. Some internal methods can't be deprecated or hid from
> > 
> > the
> > > > > > 
> > > > > > user for
> > > > > > > > > 
> > > > > > > > > now.
> > > > > > > > > > > I was trying to say that we should research possibility
> > 
> > to reduce
> > > > > > > > > > > visibility of *internal* methods that are *public* now.
> > > > > > > > > > > That kind of changes is out of 

Build failed in Jenkins: kafka-trunk-jdk10 #491

2018-09-17 Thread Apache Jenkins Server
See 

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H23 (ubuntu xenial) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
Fetching upstream changes from https://github.com/apache/kafka.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from 
https://github.com/apache/kafka.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:888)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1155)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
at hudson.scm.SCM.checkout(SCM.java:504)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1794)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags 
--progress https://github.com/apache/kafka.git 
+refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: error: Could not read 42f07849917fadb444802add590e7bb9ca4f6ba2
remote: Counting objects: 10671, done.

Re: [VOTE] KIP-358: Migrate Streams API to Duration instead of long ms times

2018-09-17 Thread John Roesler
Yay! Thanks so much for sticking with this Nikolay.

I look forward to your PR!

Not to put pressure on you, but just to let you know, the deadline for
getting your pr *merged* for 2.1 is _October 1st_,
so you basically have 2 weeks to send the PR, have the reviews, and get it
merged.

(see
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=91554044)

Thanks again,
-John

On Mon, Sep 17, 2018 at 10:29 AM Nikolay Izhikov 
wrote:

> This KIP is now accepted with:
> - 3 binding +1
> - 2 non binding +1
>
> Thanks, all.
>
> Especially, John, Matthias, Guozhang, Bill, Damian!
>
> On Thu, 13/09/2018 at 22:16 -0700, Guozhang Wang wrote:
> > +1 (binding), thank you Nikolay!
> >
> > Guozhang
> >
> > On Thu, Sep 13, 2018 at 9:39 AM, Matthias J. Sax 
> > wrote:
> >
> > > Thanks for the KIP.
> > >
> > > +1 (binding)
> > >
> > >
> > > -Matthias
> > >
> > > On 9/5/18 8:52 AM, John Roesler wrote:
> > > > I'm a +1 (non-binding)
> > > >
> > > > On Mon, Sep 3, 2018 at 8:33 AM Nikolay Izhikov 
> > >
> > > wrote:
> > > >
> > > > > Dear commiters.
> > > > >
> > > > > Please, vote on a KIP.
> > > > >
> > > > > On Fri, 31/08/2018 at 12:05 -0500, John Roesler wrote:
> > > > > > Hi Nikolay,
> > > > > >
> > > > > > You can start a PR any time, but we cannot merge it (and probably
> won't
> > >
> > > do
> > > > > > serious reviews) until after the KIP is voted and approved.
> > > > > >
> > > > > > Sometimes people start a PR during discussion just to help
> provide more
> > > > > > context, but it's not required (and can also be distracting
> because the
> > > > >
> > > > > KIP
> > > > > > discussion should avoid implementation details).
> > > > > >
> > > > > > Let's wait one more day for any other comments and plan to start
> the
> > >
> > > vote
> > > > > > on Monday if there are no other debates.
> > > > > >
> > > > > > Once you start the vote, you have to leave it up for at least 72
> hours,
> > > > >
> > > > > and
> > > > > > it requires 3 binding votes to pass. Only Kafka Committers have
> binding
> > > > > > votes (https://kafka.apache.org/committers).
> > > > > >
> > > > > > Thanks,
> > > > > > -John
> > > > > >
> > > > > > On Fri, Aug 31, 2018 at 11:09 AM Bill Bejeck 
> > >
> > > wrote:
> > > > > >
> > > > > > > Hi Nickolay,
> > > > > > >
> > > > > > > Thanks for the clarification.
> > > > > > >
> > > > > > > -Bill
> > > > > > >
> > > > > > > On Fri, Aug 31, 2018 at 11:59 AM Nikolay Izhikov <
> nizhi...@apache.org
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hello, John.
> > > > > > > >
> > > > > > > > This is my first KIP, so, please, help me with kafka
> development
> > > > >
> > > > > process.
> > > > > > > >
> > > > > > > > Should I start to work on PR now? Or should I wait for a
> "+1" from
> > > > > > > > commiters?
> > > > > > > >
> > > > > > > > On Fri, 31/08/2018 at 10:33 -0500, John Roesler wrote:
> > > > > > > > > I see. I guess that once we are in the PR-reviewing phase,
> we'll
> > > > >
> > > > > be in
> > > > > > >
> > > > > > > a
> > > > > > > > > better position to see what else can/should be done, and
> we can
> > > > >
> > > > > talk
> > > > > > > >
> > > > > > > > about
> > > > > > > > > follow-on work at that time.
> > > > > > > > >
> > > > > > > > > Thanks for the clarification,
> > > > > > > > > -John
> > > > > > > > >
> > > > > > > > > On Fri, Aug 31, 2018 at 1:19 AM Nikolay Izhikov <
> > > > >
> > > > > nizhi...@apache.org>
> > > > > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hello, Bill
> > > > > > > > > >
> > > > > > > > > > > In the "Proposed Changes" section, there is "Try to
> reduce the
> > > > > > > > > >
> > > > > > > > > > visibility of methods in next tickets" does that mean
> eventual
> > > > > > > >
> > > > > > > > deprecation
> > > > > > > > > > and removal?
> > > > > > > > > >
> > > > > > > > > > 1. Some methods will become deprecated. I think they
> will be
> > > > >
> > > > > removed
> > > > > > >
> > > > > > > in
> > > > > > > > > > the future.
> > > > > > > > > > You can find list of deprecated methods in KIP.
> > > > > > > > > >
> > > > > > > > > > 2. Some internal methods can't be deprecated or hid from
> the
> > > > >
> > > > > user for
> > > > > > > >
> > > > > > > > now.
> > > > > > > > > > I was trying to say that we should research possibility
> to reduce
> > > > > > > > > > visibility of *internal* methods that are *public* now.
> > > > > > > > > > That kind of changes is out of the scope of current KIP,
> so we
> > > > >
> > > > > have
> > > > > > >
> > > > > > > to
> > > > > > > > do
> > > > > > > > > > it in the next tickets.
> > > > > > > > > >
> > > > > > > > > > I don't expect that internal methods will be removed.
> > > > > > > > > >
> > > > > > > > > > > On Thu, 30/08/2018 at 18:59 -0400, Bill Bejeck wrote:
> > > > > > > > > > > Sorry for chiming in late, there was a lot of detail
> to catch
> > > > >
> > > > > up
> > > > > > >
> > > > > > > on.
> > > > > > > > > > >
> > > > > > > > > > > Overall I'm +1 in the KIP.  But I do have one 

Re: [DISCUSS] KIP-349 Priorities for Source Topics

2018-09-17 Thread Matthias J. Sax
I am not sure if this feature would help with stream-table joins. Also
note, that we recently merged a PR that improves the timestamp
synchronization of Kafka Streams -- this will vastly improve the guarantees.

What I don't understand:

> So table records that have been updated recently will not be read until the 
> stream records reach or exceed that same timestamp.

Yes, this is on purpose / by design.

> and if they do it will be with old data

What do you mean by "old data"? By definition, the stream record will
join with a table that contains data up to the stream record's
timestamp. Semantically, it does not make sense to advance the table
beyond the stream record's timestamp, because if you do this, you would
join with "future data", which---from my point of view---is
semantically incorrect.
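
To make that concrete, in a plain stream-table join (sketch below; topic names and types are made up) each stream record is joined against the table as of that record's timestamp, not against whatever the table happens to contain "now":

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

// Sketch only; topic names and value types are made up for illustration.
class StreamTableJoinExample {
    static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> views = builder.stream("item-view-events");
        KTable<String, String> items = builder.table("item-metadata");

        // Each view record joins against the table as of that record's timestamp;
        // table updates with later timestamps are not visible to it.
        views.join(items, (view, description) -> view + " / " + description)
             .to("enriched-views");

        return builder.build();
    }
}
```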

Shameless plug: you might want to read
https://www.confluent.io/blog/streams-tables-two-sides-same-coin



-Matthias

On 9/17/18 8:23 AM, Thomas Becker wrote:
> For my part, a major use-case for this feature is stream-table joins. 
> Currently, KafkaStreams does the wrong thing in some cases because the only 
> message choosing strategy available is timestamp-based. So table records that 
> have been updated recently will not be read until the stream records reach or 
> exceed that same timestamp. So there is no guarantee these records get joined 
> at all, and if they do it will be with old data. I realize we're talking 
> about the consumer here and not streams specifically, but as it stands I 
> can't even write a non-streams application that does a join but prioritizes 
> table-topic records over stream records without using multiple consumers.
> 
> On Wed, 2018-09-05 at 08:18 -0700, Colin McCabe wrote:
> 
> Hi all,
> 
> 
> I agree that DISCUSS is more appropriate than VOTE at this point, since I 
> don't remember the last discussion coming to a definite conclusion.
> 
> 
> I guess my concern is that this will add complexity and memory consumption on 
> the server side.  In the case of incremental fetch requests, we will have to 
> track at least two extra bytes per partition, to know what the priority of 
> each partition is within each active fetch session.
> 
> 
> It would be nice to hear more about the use-cases for this feature.  I think 
> Gwen asked about this earlier, and I don't remember reading a response.  The 
> fact that we're now talking about Samza interfaces is a bit of a red flag.  
> After all, Samza didn't need partition priorities to do what it did.  You can 
> do a lot with muting partitions and using appropriate threading in your code.
> 
> 
> For example, you can hand data from a partition off to a work queue with a 
> fixed size, which is handled by a separate service thread.  If the queue gets 
> full, you can mute the partition until some of the buffered data is 
> processed.  Kafka Streams uses a similar approach to avoid reading partition 
> data that isn't immediately needed.
> 
> 
> There might be some use-cases that need priorities eventually, but I'm 
> concerned that we're jumping the gun by trying to implement this before we 
> know what they are.
> 
> 
> best,
> 
> Colin
> 
> 
> 
> On Wed, Sep 5, 2018, at 01:06, Jan Filipiak wrote:
> 
> 
> On 05.09.2018 02:38, n...@afshartous.com wrote:
> 
> 
> On Sep 4, 2018, at 4:20 PM, Jan Filipiak <jan.filip...@trivago.com> wrote:
> 
> 
> what I meant is literally this interface:
> 
> 
> https://samza.apache.org/learn/documentation/0.7.0/api/javadocs/org/apache/samza/system/chooser/MessageChooser.html
>  
> 
> 
> Hi Jan,
> 
> 
> Thanks for the reply and I have a few questions.  This Samza doc
> 
> 
>https://samza.apache.org/learn/documentation/0.14/container/streams.html 
> 
> 
> 
> indicates that the chooser is set via configuration.  Are you suggesting 
> adding a new configuration for Kafka ?  Seems like we could also have a 
> method on KafkaConsumer
> 
> 
>  public void register(MessageChooser messageChooser)
> 
> I don't have strong opinions regarding this. I like configs, I also
> 
> don't think it would be a problem to have both.
> 
> 
> 
> to make it more dynamic.
> 
> 
> Also, the Samza MessageChooser interface has method
> 
> 
>/* Notify the chooser that a new envelope is available for a processing. */
> 
> void update(IncomingMessageEnvelope envelope)
> 
> 
> and I’m wondering how this method would be translated to Kafka API.  In 
> particular what corresponds to IncomingMessageEnvelope.
> 
> I think Samza uses the envelope abstraction as they support other sources
> 
> besides Kafka as well. They are more
> 
> on the spark end of things when it comes to different input types. I
> 
> don't have strong opinions but it feels like
> 
> we wouldn't need such a thing in the kafka 

[jira] [Resolved] (KAFKA-5690) kafka-acls command should be able to list per principal

2018-09-17 Thread Dong Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin resolved KAFKA-5690.
-
Resolution: Fixed

> kafka-acls command should be able to list per principal
> ---
>
> Key: KAFKA-5690
> URL: https://issues.apache.org/jira/browse/KAFKA-5690
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.0, 0.11.0.0
>Reporter: Koelli Mungee
>Assignee: Manikumar
>Priority: Major
> Fix For: 2.1.0
>
>
> Currently the `kafka-acls` command has a `--list` option that can list per 
> resource which is --topic  or --group  or --cluster. In order 
> to look at the ACLs for a particular principal the user needs to iterate 
> through the entire list to figure out what privileges a particular principal 
> has been granted. An option to list the ACL per principal would simplify this 
> process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Apache Kafka 2.1.0 Release Plan

2018-09-17 Thread Dong Lin
Hey everyone,

Thanks for your support!

Just a reminder that your new KIP needs to pass vote by end of next Monday
(Sep 24) to be included in Apache Kafka 2.1.0 release.

Cheers,
Dong

On Thu, Sep 13, 2018 at 10:38 PM, Dongjin Lee  wrote:

> Great. Thanks!
>
> - Dongjin
>
> On Mon, Sep 10, 2018 at 11:48 AM Matthias J. Sax 
> wrote:
>
> > Thanks a lot! You are on a run after 1.1.1 release.
> >
> > I see something coming up for myself in 4 month. :)
> >
> > On 9/9/18 6:15 PM, Guozhang Wang wrote:
> > > Dong, thanks for driving the release!
> > >
> > > On Sun, Sep 9, 2018 at 5:57 PM, Ismael Juma  wrote:
> > >
> > >> Thanks for volunteering Dong!
> > >>
> > >> Ismael
> > >>
> > >> On Sun, 9 Sep 2018, 17:32 Dong Lin,  wrote:
> > >>
> > >>> Hi all,
> > >>>
> > >>> I would like to be the release manager for our next time-based
> feature
> > >>> release 2.1.0.
> > >>>
> > >>> The recent Kafka release history can be found at
> > >>> https://cwiki.apache.org/confluence/display/KAFKA/Future+
> release+plan.
> > >> The
> > >>> release plan (with open issues and planned KIPs) for 2.1.0 can be
> found
> > >> at
> > >>> https://cwiki.apache.org/confluence/pages/viewpage.
> > >> action?pageId=91554044.
> > >>>
> > >>> Here are the dates we have planned for Apache Kafka 2.1.0 release:
> > >>>
> > >>> 1) KIP Freeze: Sep 24, 2018.
> > >>> (A KIP must be accepted by this date in order to be considered for this
> > >>> release.)
> > >>>
> > >>> 2) Feature Freeze: Oct 1, 2018
> > >>> Major features merged & working on stabilization, minor features have
> > PR,
> > >>> release branch cut; anything not in this state will be automatically
> > >> moved
> > >>> to the next release in JIRA.
> > >>>
> > >>> 3) Code Freeze: Oct 15, 2018 (Tentatively)
> > >>>
> > >>> The KIP and feature freeze dates are about 3-4 weeks from now. Please plan
> > >>> accordingly for the features you want to push into Apache Kafka 2.1.0
> > >> release.
> > >>>
> > >>>
> > >>> Cheers,
> > >>> Dong
> > >>>
> > >>
> > >
> > >
> > >
> >
> >
>
> --
> *Dongjin Lee*
>
> *A hitchhiker in the mathematical world.*
>
> *github:  github.com/dongjinleekr
> linkedin: kr.linkedin.com/in/dongjinleekr
> slideshare:
> www.slideshare.net/dongjinleekr
> *
>


[jira] [Resolved] (KAFKA-7316) Use of filter method in KTable.scala may result in StackOverflowError

2018-09-17 Thread Guozhang Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-7316.
--
Resolution: Fixed

> Use of filter method in KTable.scala may result in StackOverflowError
> -
>
> Key: KAFKA-7316
> URL: https://issues.apache.org/jira/browse/KAFKA-7316
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0
>Reporter: Ted Yu
>Priority: Major
>  Labels: scala
> Fix For: 2.0.1, 2.1.0
>
> Attachments: 7316.v4.txt
>
>
> In this thread:
> http://search-hadoop.com/m/Kafka/uyzND1dNbYKXzC4F1?subj=Issue+in+Kafka+2+0+0+
> Druhin reported seeing StackOverflowError when using filter method from 
> KTable.scala
> This can be reproduced with the following change:
> {code}
> diff --git 
> a/streams/streams-scala/src/test/scala/org/apache/kafka/streams/scala/StreamToTableJoinScalaIntegrationTestImplicitSerdes.scala
>  b/streams/streams-scala/src/test/scala
> index 3d1bab5..e0a06f2 100644
> --- 
> a/streams/streams-scala/src/test/scala/org/apache/kafka/streams/scala/StreamToTableJoinScalaIntegrationTestImplicitSerdes.scala
> +++ 
> b/streams/streams-scala/src/test/scala/org/apache/kafka/streams/scala/StreamToTableJoinScalaIntegrationTestImplicitSerdes.scala
> @@ -58,6 +58,7 @@ class StreamToTableJoinScalaIntegrationTestImplicitSerdes 
> extends StreamToTableJ
>  val userClicksStream: KStream[String, Long] = 
> builder.stream(userClicksTopic)
>  val userRegionsTable: KTable[String, String] = 
> builder.table(userRegionsTopic)
> +userRegionsTable.filter { case (_, count) => true }
>  // Compute the total per region by summing the individual click counts 
> per region.
>  val clicksPerRegion: KTable[String, Long] =
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] KIP-358: Migrate Streams API to Duration instead of long ms times

2018-09-17 Thread Nikolay Izhikov
This KIP is now accepted with:
- 3 binding +1
- 2 non binding +1

Thanks, all.

Especially, John, Matthias, Guozhang, Bill, Damian!

On Thu, 13/09/2018 at 22:16 -0700, Guozhang Wang wrote:
> +1 (binding), thank you Nikolay!
> 
> Guozhang
> 
> On Thu, Sep 13, 2018 at 9:39 AM, Matthias J. Sax 
> wrote:
> 
> > Thanks for the KIP.
> > 
> > +1 (binding)
> > 
> > 
> > -Matthias
> > 
> > On 9/5/18 8:52 AM, John Roesler wrote:
> > > I'm a +1 (non-binding)
> > > 
> > > On Mon, Sep 3, 2018 at 8:33 AM Nikolay Izhikov 
> > 
> > wrote:
> > > 
> > > > Dear commiters.
> > > > 
> > > > Please, vote on a KIP.
> > > > 
> > > > On Fri, 31/08/2018 at 12:05 -0500, John Roesler wrote:
> > > > > Hi Nikolay,
> > > > > 
> > > > > You can start a PR any time, but we cannot merge it (and probably won't
> > 
> > do
> > > > > serious reviews) until after the KIP is voted and approved.
> > > > > 
> > > > > Sometimes people start a PR during discussion just to help provide 
> > > > > more
> > > > > context, but it's not required (and can also be distracting because 
> > > > > the
> > > > 
> > > > KIP
> > > > > discussion should avoid implementation details).
> > > > > 
> > > > > Let's wait one more day for any other comments and plan to start the
> > 
> > vote
> > > > > on Monday if there are no other debates.
> > > > > 
> > > > > Once you start the vote, you have to leave it up for at least 72 
> > > > > hours,
> > > > 
> > > > and
> > > > > it requires 3 binding votes to pass. Only Kafka Committers have 
> > > > > binding
> > > > > votes (https://kafka.apache.org/committers).
> > > > > 
> > > > > Thanks,
> > > > > -John
> > > > > 
> > > > > On Fri, Aug 31, 2018 at 11:09 AM Bill Bejeck 
> > 
> > wrote:
> > > > > 
> > > > > > Hi Nickolay,
> > > > > > 
> > > > > > Thanks for the clarification.
> > > > > > 
> > > > > > -Bill
> > > > > > 
> > > > > > On Fri, Aug 31, 2018 at 11:59 AM Nikolay Izhikov 
> > > > > >  > > > > > wrote:
> > > > > > 
> > > > > > > Hello, John.
> > > > > > > 
> > > > > > > This is my first KIP, so, please, help me with kafka development
> > > > 
> > > > process.
> > > > > > > 
> > > > > > > Should I start to work on PR now? Or should I wait for a "+1" from
> > > > > > > commiters?
> > > > > > > 
> > > > > > > On Fri, 31/08/2018 at 10:33 -0500, John Roesler wrote:
> > > > > > > > I see. I guess that once we are in the PR-reviewing phase, we'll
> > > > 
> > > > be in
> > > > > > 
> > > > > > a
> > > > > > > > better position to see what else can/should be done, and we can
> > > > 
> > > > talk
> > > > > > > 
> > > > > > > about
> > > > > > > > follow-on work at that time.
> > > > > > > > 
> > > > > > > > Thanks for the clarification,
> > > > > > > > -John
> > > > > > > > 
> > > > > > > > On Fri, Aug 31, 2018 at 1:19 AM Nikolay Izhikov <
> > > > 
> > > > nizhi...@apache.org>
> > > > > > > 
> > > > > > > wrote:
> > > > > > > > 
> > > > > > > > > Hello, Bill
> > > > > > > > > 
> > > > > > > > > > In the "Proposed Changes" section, there is "Try to reduce 
> > > > > > > > > > the
> > > > > > > > > 
> > > > > > > > > visibility of methods in next tickets" does that mean eventual
> > > > > > > 
> > > > > > > deprecation
> > > > > > > > > and removal?
> > > > > > > > > 
> > > > > > > > > 1. Some methods will become deprecated. I think they will be
> > > > 
> > > > removed
> > > > > > 
> > > > > > in
> > > > > > > > > the future.
> > > > > > > > > You can find list of deprecated methods in KIP.
> > > > > > > > > 
> > > > > > > > > 2. Some internal methods can't be deprecated or hid from the
> > > > 
> > > > user for
> > > > > > > 
> > > > > > > now.
> > > > > > > > > I was trying to say that we should research possibility to 
> > > > > > > > > reduce
> > > > > > > > > visibility of *internal* methods that are *public* now.
> > > > > > > > > That kind of changes is out of the scope of current KIP, so we
> > > > 
> > > > have
> > > > > > 
> > > > > > to
> > > > > > > do
> > > > > > > > > it in the next tickets.
> > > > > > > > > 
> > > > > > > > > I don't expect that internal methods will be removed.
> > > > > > > > > 
> > > > > > > > > > On Thu, 30/08/2018 at 18:59 -0400, Bill Bejeck wrote:
> > > > > > > > > > Sorry for chiming in late, there was a lot of detail to 
> > > > > > > > > > catch
> > > > 
> > > > up
> > > > > > 
> > > > > > on.
> > > > > > > > > > 
> > > > > > > > > > Overall I'm +1 in the KIP.  But I do have one question about
> > > > 
> > > > the
> > > > > > 
> > > > > > KIP
> > > > > > > in
> > > > > > > > > > regards to Matthias's comments about defining dual use.
> > > > > > > > > > 
> > > > > > > > > > In the "Proposed Changes" section, there is "Try to reduce 
> > > > > > > > > > the
> > > > > > > 
> > > > > > > visibility
> > > > > > > > > > of methods in next tickets" does that mean eventual
> > > > 
> > > > deprecation and
> > > > > > > > > 
> > > > > > > > > removal?
> > > > > > > > > > I thought we were aiming to keep the dual use methods? Or 
> > > > > > > > > > does
> > > > 
> > > > that
> > > > > > > 
> > > > > > > 

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-17 Thread Ron Dagostino
Hi yet again, Rajini.  I also just realized that if the client is to learn
the credential lifetime we wouldn't want to put special-case code in the
Authenticator for GSSAPI and OAUTHBEARER; we would want to expose the value
generically, probably as a negotiated property on the SaslClient instance.
We might be talking about making the negotiated property key part of the
public API.  In other words, the SaslClient would be responsible for
exposing the credential (i.e. token or ticket) lifetime at a well-known
negotiated property name, such as "Credential.Lifetime", and putting a Long
value there if there is a lifetime.  That well-known key (e.g.
"Credential.Lifetime") would be part of the public API, right?

Ron

On Mon, Sep 17, 2018 at 11:03 AM Ron Dagostino  wrote:

> Hi again, Rajini.  After thinking about this a little while, it occurs to
> me that maybe the communication of max session lifetime should occur via
> SASL_HANDSHAKE after all.  Here's why.  The value communicated is the max
> session lifetime allowed, and the client can assume it is the session
> lifetime for that particular session unless the particular SASL mechanism
> could result in a shorter session that would be obvious to the client and
> the server.  In particular, for OAUTHBEARER, the session lifetime will be
> the token lifetime, which the client and server will both know.  Is there a
> reason to communicate this value back to the client when the client already
> knows it?  It's an extra network round-trip, and since the SASL round-trips
> are defined by the spec I'm not certain adding an extra round-trip is
> acceptable.  Even if we decide we can add it, it helps with latency if we
> don't.
>
> Kerberos may be a bit different -- I don't know if the broker can learn
> the session lifetime.  If it can then the same thing holds -- both client
> and server will know the session lifetime and there is no reason to
> communicate it back.  If the server can't learn the lifetime then I don't
> think adding an extra SASL_AUTHENTICATE round trip is going to help, anyway.
>
> Also, by communicating the max session lifetime in the SASL_HANDSHAKE
> response, both OAUTHBEARER and GSSAPI clients will be able to know before
> sending any SASL_AUTHENTICATE requests whether their credential violates
> the maximum.  This allows a behaving client to give a good error message.
> A malicious client would ignore the value and send a longer-lived
> credential, and then that would be rejected on the server side.
>
> I'm still good with ExpiringCredential not needing to be public.
>
> What do you think?
>
> Ron
>
> On Mon, Sep 17, 2018 at 9:13 AM Ron Dagostino  wrote:
>
>> Hi Rajini.  I've updated the KIP to reflect the decision to fully support
>> this for all SASL mechanisms and to not require the ExpiringCredential
>> interface to be public.
>>
>> Ron
>>
>> On Mon, Sep 17, 2018 at 6:55 AM Ron Dagostino  wrote:
>>
>>> Actually, I think the metric remains the same assuming the
>>> authentication fails when the token  lifetime is too long.  Negative max
>>> config on server counts what would have been killed because maybe client
>>> re-auth is not turned on; positive max enables the kill, which is counted
>>> by a second metric as proposed.  The current proposal had already stated
>>> that a non-zero value would disallow an authentication with a token that
>>> has too large a lifetime, and that is still the case, I think.  But let’s
>>> make sure we are on the same page about all this.
>>>
>>> Ron
>>>
>>> > On Sep 17, 2018, at 6:42 AM, Ron Dagostino  wrote:
>>> >
>>> > Hi Rajini.  I see what you are saying.  The background login refresh
>>> thread does have to factor into the decision for OAUTHBEARER because there
>>> is no new token to re-authenticate with until that refresh succeeds
>>> (similarly with Kerberos), but I think you are right that the interface
>>> doesn’t necessarily have to be public.  The server does decide the time for
>>> PLAIN and the SCRAM-related mechanisms under the current proposal, but it
>>> is done in SaslHandshakeResponse, and at that point the server won’t have
>>> any token yet; making it happen via SaslAuthenticateRequest at the very end
>>> does allow the server to know everything for all mechanisms.
>>> >
>>> > I see two potential issues to discuss.  First, the server must
>>> communicate a time that exceeds the (token or ticket) refresh period on the
>>> client.  Assuming it communicates the expiration time of the token/ticket,
>>> that’s the best it can do.  So I think that’ll be fine.
>>> >
>>> > The second issue is what happens if the server is configured to accept
>>> a max value — say, an hour — and the token is good for longer.  I assumed
>>> that the client should not be allowed to authenticate in the first place
>>> because it could then simply re-authenticate with the same token after an
>>> hour, which defeats the motivations for this KIP.  So do we agree the max
>>> token lifetime allowed 

Re: [DISCUSS] KIP-349 Priorities for Source Topics

2018-09-17 Thread Thomas Becker
For my part, a major use-case for this feature is stream-table joins. 
Currently, KafkaStreams does the wrong thing in some cases because the only 
message choosing strategy available is timestamp-based. So table records that 
have been updated recently will not be read until the stream records reach or 
exceed that same timestamp. So there is no guarantee these records get joined 
at all, and if they do it will be with old data. I realize we're talking about 
the consumer here and not streams specifically, but as it stands I can't even 
write a non-streams application that does a join but prioritizes table-topic 
records over stream records without using multiple consumers.

On Wed, 2018-09-05 at 08:18 -0700, Colin McCabe wrote:

Hi all,


I agree that DISCUSS is more appropriate than VOTE at this point, since I don't 
remember the last discussion coming to a definite conclusion.


I guess my concern is that this will add complexity and memory consumption on 
the server side.  In the case of incremental fetch requests, we will have to 
track at least two extra bytes per partition, to know what the priority of each 
partition is within each active fetch session.


It would be nice to hear more about the use-cases for this feature.  I think 
Gwen asked about this earlier, and I don't remember reading a response.  The 
fact that we're now talking about Samza interfaces is a bit of a red flag.  
After all, Samza didn't need partition priorities to do what it did.  You can 
do a lot with muting partitions and using appropriate threading in your code.


For example, you can hand data from a partition off to a work queue with a 
fixed size, which is handled by a separate service thread.  If the queue gets 
full, you can mute the partition until some of the buffered data is processed.  
Kafka Streams uses a similar approach to avoid reading partition data that 
isn't immediately needed.


There might be some use-cases that need priorities eventually, but I'm 
concerned that we're jumping the gun by trying to implement this before we know 
what they are.


best,

Colin



On Wed, Sep 5, 2018, at 01:06, Jan Filipiak wrote:


On 05.09.2018 02:38, n...@afshartous.com wrote:


On Sep 4, 2018, at 4:20 PM, Jan Filipiak <jan.filip...@trivago.com> wrote:


what I meant is literally this interface:


https://samza.apache.org/learn/documentation/0.7.0/api/javadocs/org/apache/samza/system/chooser/MessageChooser.html
 


Hi Jan,


Thanks for the reply and I have a few questions.  This Samza doc


   https://samza.apache.org/learn/documentation/0.14/container/streams.html 



indicates that the chooser is set via configuration.  Are you suggesting adding 
a new configuration for Kafka ?  Seems like we could also have a method on 
KafkaConsumer


 public void register(MessageChooser messageChooser)

I don't have strong opinions regarding this. I like configs, I also

don't think it would be a problem to have both.



to make it more dynamic.


Also, the Samza MessageChooser interface has method


   /* Notify the chooser that a new envelope is available for a processing. */

void update(IncomingMessageEnvelope envelope)


and I’m wondering how this method would be translated to Kafka API.  In 
particular what corresponds to IncomingMessageEnvelope.

I think Samza uses the envelope abstraction as they support other sources

besides Kafka as well. They are more

on the spark end of things when it comes to different input types. I

don't have strong opinions but it feels like

we wouldn't need such a thing in the Kafka consumer but just use a

regular ConsumerRecord or so.
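
For what it's worth, a Kafka-flavored version of that interface might look roughly like the following (purely hypothetical; no such interface exists in Kafka today):

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;

// Purely hypothetical sketch of a Samza-style chooser adapted to Kafka types,
// as discussed above. This interface does not exist in Kafka.
public interface MessageChooser {
    /* Notify the chooser that a new record is available for processing. */
    void update(ConsumerRecord<byte[], byte[]> record);

    /* Return the next record to process, or null if nothing should be chosen yet. */
    ConsumerRecord<byte[], byte[]> choose();
}
```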


Best,

--

   Nick








This email and any attachments may contain confidential and privileged material 
for the sole use of the intended recipient. Any review, copying, or 
distribution of this email (or any attachments) by others is prohibited. If you 
are not the intended recipient, please contact the sender immediately and 
permanently delete this email and any attachments. No employee or agent of TiVo 
Inc. is authorized to conclude any binding agreement on behalf of TiVo Inc. by 
email. Binding agreements with TiVo Inc. may only be made by a signed written 
agreement.


Re: [VOTE] KIP-361: Add Consumer Configuration to Disable Auto Topic Creation

2018-09-17 Thread Dhruvil Shah
Thank you for the votes and discussion, everyone. The KIP has passed with 3
binding votes (Ismael, Gwen, Matthias) and 5 non-binding votes (Brandon,
Bill, Manikumar, Colin, Mickael).

- Dhruvil

On Mon, Sep 17, 2018 at 1:59 AM Mickael Maison 
wrote:

> +1 (non-binding)
> Thanks for the KIP!
> On Sun, Sep 16, 2018 at 7:40 PM Matthias J. Sax 
> wrote:
> >
> > +1 (binding)
> >
> > -Matthias
> >
> > On 9/14/18 4:57 PM, Ismael Juma wrote:
> > > Thanks for the KIP, +1 (binding).
> > >
> > > Ismael
> > >
> > > On Fri, Sep 14, 2018 at 4:56 PM Dhruvil Shah 
> wrote:
> > >
> > >> Hi all,
> > >>
> > >> I would like to start a vote on KIP-361.
> > >>
> > >> Link to the KIP:
> > >>
> > >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-361%3A+Add+Consumer+Configuration+to+Disable+Auto+Topic+Creation
> > >>
> > >> Thanks,
> > >> Dhruvil
> > >>
> > >
> >
>


Re: Accessing Topology Builder

2018-09-17 Thread John Roesler
If I understand the request, it's about tracking the latencies for a
specific record, not the aggregated latencies for each processor.

Jorge,

The diff you posted only contains the library-side changes, and it's not
obvious how you would use this to insert the desired tracing code.
Perhaps you could provide a snippet demonstrating how you want to use this
change to enable tracing?

Also, as Matthias said, you would need to create a KIP to propose this
change, but of course we can continue this preliminary discussion until you
feel confident to create the KIP.

Off the top of my head, here are some other approaches you might evaluate:
* you mentioned interceptors. Perhaps we could create a
ProcessorInterceptor interface and add a config to set it.
* perhaps we could simply build the tracing headers into Streams. Is there
a benefit to making it customizable?

Thanks for considering this problem!
-John
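
For concreteness, the wrapping-supplier idea Guozhang raises in the quoted message below could look roughly like this (class name and timing logic are illustrative only, not an existing Streams API):

```java
import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.ProcessorSupplier;

// Sketch only: wraps a user-supplied ProcessorSupplier so each process() call can
// be timed/traced without modifying InternalTopologyBuilder.
class TracingProcessorSupplier<K, V> implements ProcessorSupplier<K, V> {
    private final ProcessorSupplier<K, V> delegate;

    TracingProcessorSupplier(final ProcessorSupplier<K, V> delegate) {
        this.delegate = delegate;
    }

    @Override
    public Processor<K, V> get() {
        final Processor<K, V> inner = delegate.get();
        return new AbstractProcessor<K, V>() {
            @Override
            public void init(final ProcessorContext context) {
                super.init(context);
                inner.init(context);
            }

            @Override
            public void process(final K key, final V value) {
                final long startNs = System.nanoTime();             // a tracer would start a span here
                inner.process(key, value);
                final long elapsedNs = System.nanoTime() - startNs; // ...and report/finish it here
            }

            @Override
            public void close() {
                inner.close();
            }
        };
    }
}
```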

On Mon, Sep 17, 2018 at 12:30 AM Guozhang Wang  wrote:

> Hello Jorge,
>
> From the TracingProcessor implementation it seems you want to track
> per-processor processing latency, is that right? If this is the case you
> can actually use the per-processor metrics which include latency sensors.
>
> If you do want to track, for a certain record, what's the latency of
> processing it, then you'd probably need the processor implementation in
> your repo. In this case, though, I'd suggest providing a
> WrapperProcessorSupplier for users rather than modifying
> InternalStreamsTopology: more specifically, you can provide an
> `abstract WrapperProcessorSupplier
> implements ProcessorSupplier` and then let users instantiate this class
> instead of the "bare-metal" interface. WDYT?
>
>
> Guozhang
>
> On Sun, Sep 16, 2018 at 12:56 PM, Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > Thanks for your answer, Matthias!
> >
> > What I'm looking for is something similar to interceptors, but for Stream
> > Processors.
> >
> > In Zipkin -and probably other tracing implementations as well- we are
> using
> > Headers to propagate the context of a trace (i.e. adding metadata to the
> > Kafka Record, so we can create references to a trace).
> > Now that Headers are part of Kafka Streams Processor API, we can
> propagate
> > context from input (Consumers) to outputs (Producers) by using
> > `KafkaClientSupplier` (e.g. <
> > https://github.com/openzipkin/brave/blob/master/
> > instrumentation/kafka-streams/src/main/java/brave/kafka/streams/
> > TracingKafkaClientSupplier.java
> > >).
> >
> > "Input to Output" traces could be enough for some use-cases, but we are
> > looking for a more detailed trace -that could cover cases like
> side-effects
> > (e.g. for each processor), where input/output and processors latencies
> can
> > be recorded. This is why I have been looking for how to decorate the
> > `ProcessorSupplier` and all the changes shown in the comparison. Here is
> a
> > gist of how we are planning to decorate the `addProcessor` method:
> > https://github.com/openzipkin/brave/compare/master...jeqo:
> > kafka-streams-topology#diff-8282914d84039affdf7c37251b905b44R7
> >
> > Hope this makes a bit more sense now :)
> >
> > El dom., 16 sept. 2018 a las 20:51, Matthias J. Sax (<
> > matth...@confluent.io>)
> > escribió:
> >
> > > >> I'm experimenting on how to add tracing to Kafka Streams.
> > >
> > > What do you mean by this exactly? Is there a JIRA? I am fine removing
> > > `final` from `InternalTopologyBuilder#addProcessor()` -- it's an
> > > internal class.
> > >
> > > However, the diff also shows
> > >
> > > > public Topology(final InternalTopologyBuilder
> internalTopologyBuilder)
> > {
> > >
> > > This has two impacts: first, it modifies `Topology` what is part of
> > > public API and would require a KIP. Second, it exposes
> > > `InternalTopologyBuilder` as part of the public API -- something we
> > > should not do.
> > >
> > > I am also not sure, why you want to do this (btw: also public API
> change
> > > requiring a KIP). However, this should not be necessary.
> > >
> > > > public StreamsBuilder(final Topology topology)  {
> > >
> > >
> > > I think I am lacking some context on what you are trying to achieve. Maybe you
> can
> > > elaborate on the problem you are trying to solve?
> > >
> > >
> > > -Matthias
> > >
> > > On 9/15/18 10:31 AM, Jorge Esteban Quilcate Otoya wrote:
> > > > Hi everyone,
> > > >
> > > > I'm experimenting on how to add tracing to Kafka Streams.
> > > >
> > > > One option is to override and access
> > > > `InternalTopologyBuilder#addProcessor`. Currently this method is
> > > final,
> > > > and the builder is not exposed as part of `StreamsBuilder`:
> > > >
> > > > ```
> > > > public class StreamsBuilder {
> > > >
> > > > /** The actual topology that is constructed by this
> StreamsBuilder.
> > > */
> > > > private final Topology topology = new Topology();
> > > >
> > > > /** The topology's internal builder. */
> > > > final InternalTopologyBuilder internalTopologyBuilder =
> 

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-17 Thread Ron Dagostino
Hi again, Rajini.  After thinking about this a little while, it occurs to
me that maybe the communication of max session lifetime should occur via
SASL_HANDSHAKE after all.  Here's why.  The value communicated is the max
session lifetime allowed, and the client can assume it is the session
lifetime for that particular session unless the particular SASL mechanism
could result in a shorter session that would be obvious to the client and
the server.  In particular, for OAUTHBEARER, the session lifetime will be
the token lifetime, which the client and server will both know.  Is there a
reason to communicate this value back to the client when the client already
knows it?  It's an extra network round-trip, and since the SASL round-trips
are defined by the spec I'm not certain adding an extra round-trip is
acceptable.  Even if we decide we can add it, it helps with latency if we
don't.

Kerberos may be a bit different -- I don't know if the broker can learn the
session lifetime.  If it can then the same thing holds -- both client and
server will know the session lifetime and there is no reason to communicate
it back.  If the server can't learn the lifetime then I don't think adding
an extra SASL_AUTHENTICATE round trip is going to help, anyway.

Also, by communicating the max session lifetime in the SASL_HANDSHAKE
response, both OAUTHBEARER and GSSAPI clients will be able to know before
sending any SASL_AUTHENTICATE requests whether their credential violates
the maximum.  This allows a well-behaved client to give a good error message.
A malicious client would ignore the value and send a longer-lived
credential, and then that would be rejected on the server side.
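
As a rough illustration of that client-side check (the class, the variable
names, and where the values come from are assumptions, not actual client
code -- the max would come from the SaslHandshakeResponse and the expiration
from the client's own credential):

```
import org.apache.kafka.common.errors.SaslAuthenticationException;

public final class CredentialLifetimeCheck {

    public static void failFastIfTooLong(final long tokenExpirationMs,
                                         final long maxSessionLifetimeMs,
                                         final long nowMs) {
        final long credentialLifetimeMs = tokenExpirationMs - nowMs;
        if (maxSessionLifetimeMs > 0 && credentialLifetimeMs > maxSessionLifetimeMs) {
            // A well-behaved client can report this clearly before sending any
            // SASL_AUTHENTICATE request; the broker would reject it anyway.
            throw new SaslAuthenticationException("Credential lifetime " + credentialLifetimeMs
                    + " ms exceeds the broker's maximum session lifetime of "
                    + maxSessionLifetimeMs + " ms");
        }
    }
}
```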

I'm still good with ExpiringCredential not needing to be public.

What do you think?

Ron


On Mon, Sep 17, 2018 at 9:13 AM Ron Dagostino  wrote:

> Hi Rajini.  I've updated the KIP to reflect the decision to fully support
> this for all SASL mechanisms and to not require the ExpiringCredential
> interface to be public.
>
> Ron
>
> On Mon, Sep 17, 2018 at 6:55 AM Ron Dagostino  wrote:
>
>> Actually, I think the metric remains the same assuming the authentication
>> fails when the token  lifetime is too long.  Negative max config on server
>> counts what would have been killed because maybe client re-auth is not
>> turned on; positive max enables the kill, which is counted by a second
>> metric as proposed.  The current proposal had already stated that a
>> non-zero value would disallow an authentication with a token that has too
>> large a lifetime, and that is still the case, I think.  But let’s make sure
>> we are on the same page about all this.
>>
>> Ron
>>
>> > On Sep 17, 2018, at 6:42 AM, Ron Dagostino  wrote:
>> >
>> > Hi Rajini.  I see what you are saying.  The background login refresh
>> thread does have to factor into the decision for OAUTHBEARER because there
>> is no new token to re-authenticate with until that refresh succeeds
>> (similarly with Kerberos), but I think you are right that the interface
>> doesn’t necessarily have to be public.  The server does decide the time for
>> PLAIN and the SCRAM-related mechanisms under the current proposal, but it
>> is done in SaslHandshakeResponse, and at that point the server won’t have
>> any token yet; making it happen via SaslAuthenticateRequest at the very end
>> does allow the server to know everything for all mechanisms.
>> >
>> > I see two potential issues to discuss.  First, the server must
>> communicate a time that exceeds the (token or ticket) refresh period on the
>> client.  Assuming it communicates the expiration time of the token/ticket,
>> that’s the best it can do.  So I think that’ll be fine.
>> >
>> > The second issue is what happens if the server is configured to accept
>> a max value — say, an hour — and the token is good for longer.  I assumed
>> that the client should not be allowed to authenticate in the first place
>> because it could then simply re-authenticate with the same token after an
>> hour, which defeats the motivations for this KIP.  So do we agree the max
>> token lifetime allowed will be enforced at authentication time?  Assuming
>> so, then we need to discuss migration.
>> >
>> > The current proposal includes a metric that can be used to identify if
>> an OAUTHBEARER client is misconfigured — the number of connections that
>> would have been killed (could be non-zero when the configured max is a
>> negative number). Do we still want this type of an indication, and if so,
>> is it still done the same way — a negative number for max, but instead of
>> counting the number of kills that would have been done it counts the number
>> of authentications that would have been failed?
>> >
>> > Ron
>> >
>> >> On Sep 17, 2018, at 6:06 AM, Rajini Sivaram 
>> wrote:
>> >>
>> >> Hi Ron,
>> >>
>> >> Thinking a bit more about other SASL mechanisms, I think the issue is
>> that
>> >> in the current proposal, clients decide the re-authentication period.
>> For
>> >> 

Re: [DISCUSS] KIP-358: Migrate Streams API to Duration instead of long ms times

2018-09-17 Thread Damian Guy
Thanks +1 (binding)

On Sun, 16 Sep 2018 at 19:37 Nikolay Izhikov  wrote:

> Dear commiters,
>
> I got two binding +1 in [VOTE] thread for this KIP [1].
> I still need one more.
>
> Please, take a look at KIP.
>
> [1]
> https://lists.apache.org/thread.html/e976352e7e42d459091ee66ac790b6a0de7064eac0c57760d50c983b@%3Cdev.kafka.apache.org%3E
>
> > On Thu, 13/09/2018 at 19:33 +0300, Nikolay Izhikov wrote:
> > Fixed.
> >
> > Thanks, for help!
> >
> > Please, take a look and vote.
> >
> > > On Thu, 13/09/2018 at 08:40 -0700, Matthias J. Sax wrote:
> > > No need to start a new voting thread :)
> > >
> > > For the KIP update, I think it should be:
> > >
> > > > ReadOnlyWindowStore {
> > > > //Deprecated methods.
> > > > WindowStoreIterator<V> fetch(K key, long timeFrom, long timeTo);
> > > > KeyValueIterator<Windowed<K>, V> fetch(K from, K to, long timeFrom, long timeTo);
> > > > KeyValueIterator<Windowed<K>, V> fetchAll(long timeFrom, long timeTo);
> > > >
> > > > //New methods.
> > > > WindowStoreIterator<V> fetch(K key, Instant from, Duration duration) throws IllegalArgumentException;
> > > > KeyValueIterator<Windowed<K>, V> fetch(K from, K to, Instant from, Duration duration) throws IllegalArgumentException;
> > > > KeyValueIterator<Windowed<K>, V> fetchAll(Instant from, Duration duration) throws IllegalArgumentException;
> > > > }
> > > >
> > > >
> > > > WindowStore {
> > > > //New methods.
> > > > WindowStoreIterator<V> fetch(K key, long timeFrom, long timeTo);
> > > > KeyValueIterator<Windowed<K>, V> fetch(K from, K to, long timeFrom, long timeTo);
> > > > KeyValueIterator<Windowed<K>, V> fetchAll(long timeFrom, long timeTo);
> > > > }
> > >
> > > Ie, long-versions are replaced with Instant/Duration in
> > > `ReadOnlyWindowStore`, and `long` method are added in `WindowStore` --
> > > this way, we effectively "move" the long-versions from
> > > `ReadOnlyWindowStore` to `WindowStore`.
> > >
> > > -Matthias
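
To make the proposed migration concrete, here is a rough usage sketch against
the signatures quoted above (it assumes a ReadOnlyWindowStore<String, Long>
obtained via KafkaStreams#store; the Instant/Duration overload is the one
proposed in this KIP, not an existing API):

```
import java.time.Duration;
import java.time.Instant;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.state.ReadOnlyWindowStore;
import org.apache.kafka.streams.state.WindowStoreIterator;

public final class WindowFetchExample {

    static void printLastHour(final ReadOnlyWindowStore<String, Long> store, final String key) {
        final Instant from = Instant.now().minus(Duration.ofHours(1));

        // Deprecated long-based call (moves to WindowStore under the proposal):
        // store.fetch(key, from.toEpochMilli(), Instant.now().toEpochMilli());

        // Proposed Instant/Duration variant:
        try (final WindowStoreIterator<Long> iter = store.fetch(key, from, Duration.ofHours(1))) {
            while (iter.hasNext()) {
                final KeyValue<Long, Long> entry = iter.next(); // entry.key is the window start timestamp
                System.out.println(entry.key + " -> " + entry.value);
            }
        }
    }
}
```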
> > >
> > > On 9/13/18 8:08 AM, Nikolay Izhikov wrote:
> > > > Hello, Matthias.
> > > >
> > > > > I like the KIP as-is. Feel free to start a VOTE thread.
> > > >
> > > >
> > > > I'm already started one [1].
> > > > Can you vote in it or I should create a new one?
> > > >
> > > >
> > > > I've updated KIP.
> > > > This has been changed:
> > > >
> > > > ReadOnlyWindowStore {
> > > > //Deprecated methods.
> > > > WindowStoreIterator<V> fetch(K key, long timeFrom, long timeTo);
> > > > KeyValueIterator<Windowed<K>, V> fetch(K from, K to, long timeFrom, long timeTo);
> > > > KeyValueIterator<Windowed<K>, V> fetchAll(long timeFrom, long timeTo);
> > > > }
> > > >
> > > > WindowStore {
> > > >   //New methods.
> > > > WindowStoreIterator<V> fetch(K key, Instant from, Duration duration) throws IllegalArgumentException;
> > > > KeyValueIterator<Windowed<K>, V> fetch(K from, K to, Instant from, Duration duration) throws IllegalArgumentException;
> > > > KeyValueIterator<Windowed<K>, V> fetchAll(Instant from, Duration duration) throws IllegalArgumentException;
> > > > }
> > > >
> > > > [1]
> https://lists.apache.org/thread.html/e976352e7e42d459091ee66ac790b6a0de7064eac0c57760d50c983b@%3Cdev.kafka.apache.org%3E
> > > >
> > > > On Wed, 12/09/2018 at 15:46 -0700, Matthias J. Sax wrote:
> > > > > Great!
> > > > >
> > > > > I did not double check the ReadOnlySessionStore interface before,
> and
> > > > > just assumed it would take a timestamp, too. My bad.
> > > > >
> > > > > Please update the KIP for ReadOnlyWindowStore and WindowStore.
> > > > >
> > > > > I like the KIP as-is. Feel free to start a VOTE thread. Even if
> there
> > > > > might be minor follow up comments, we can vote in parallel.
> > > > >
> > > > >
> > > > > -Matthias
> > > > >
> > > > > On 9/12/18 1:06 PM, John Roesler wrote:
> > > > > > Hi Nikolay,
> > > > > >
> > > > > > Yes, the changes we discussed for ReadOnlyXxxStore and XxxStore
> should be
> > > > > > in this KIP.
> > > > > >
> > > > > > And you're right, it seems like ReadOnlySessionStore is not
> necessary to
> > > > > > touch, since it doesn't reference any `long` timestamps.
> > > > > >
> > > > > > Thanks,
> > > > > > -John
> > > > > >
> > > > > > On Wed, Sep 12, 2018 at 4:36 AM Nikolay Izhikov <
> nizhi...@apache.org> wrote:
> > > > > >
> > > > > > > Hello, Matthias.
> > > > > > >
> > > > > > > > His proposal is, to deprecate existing methods on
> `ReadOnlyWindowStore`>
> > > > > > >
> > > > > > > and `ReadOnlySessionStore` and add them to `WindowStore` and>
> `SessionStore`
> > > > > > > > Does this make sense?
> > > > > > >
> > > > > > > You both are experienced Kafka developers, so yes, it does
> make a sense to
> > > > > > > me :).
> > > > > > > Do we want to make this change in KIP-358 or it required
> another KIP?
> > > > > > >
> > > > > > > > Btw: the KIP misses `ReadOnlySessionStore` atm.
> > > > > > >
> > > > > > > Sorry, but I don't understand you.
> > > > > > > As far as I can see, there is only 2 methods in
> `ReadOnlySessionStore`.
> > > > > > > Which method should be migrated to Duration?
> > > > > > >
> > > > > > >
> > > > 

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-17 Thread Ron Dagostino
Hi Rajini.  I've updated the KIP to reflect the decision to fully support
this for all SASL mechanisms and to not require the ExpiringCredential
interface to be public.

Ron

On Mon, Sep 17, 2018 at 6:55 AM Ron Dagostino  wrote:

> Actually, I think the metric remains the same assuming the authentication
> fails when the token  lifetime is too long.  Negative max config on server
> counts what would have been killed because maybe client re-auth is not
> turned on; positive max enables the kill, which is counted by a second
> metric as proposed.  The current proposal had already stated that a
> non-zero value would disallow an authentication with a token that has too
> large a lifetime, and that is still the case, I think.  But let’s make sure
> we are on the same page about all this.
>
> Ron
>
> > On Sep 17, 2018, at 6:42 AM, Ron Dagostino  wrote:
> >
> > Hi Rajini.  I see what you are saying.  The background login refresh
> thread does have to factor into the decision for OAUTHBEARER because there
> is no new token to re-authenticate with until that refresh succeeds
> (similarly with Kerberos), but I think you are right that the interface
> doesn’t necessarily have to be public.  The server does decide the time for
> PLAIN and the SCRAM-related mechanisms under the current proposal, but it
> is done in SaslHandshakeResponse, and at that point the server won’t have
> any token yet; making it happen via SaslAuthenticateRequest at the very end
> does allow the server to know everything for all mechanisms.
> >
> > I see two potential issues to discuss.  First, the server must
> communicate a time that exceeds the (token or ticket) refresh period on the
> client.  Assuming it communicates the expiration time of the token/ticket,
> that’s the best it can do.  So I think that’ll be fine.
> >
> > The second issue is what happens if the server is configured to accept a
> max value — say, an hour — and the token is good for longer.  I assumed
> that the client should not be allowed to authenticate in the first place
> because it could then simply re-authenticate with the same token after an
> hour, which defeats the motivations for this KIP.  So do we agree the max
> token lifetime allowed will be enforced at authentication time?  Assuming
> so, then we need to discuss migration.
> >
> > The current proposal includes a metric that can be used to identify if
> an OAUTHBEARER client is misconfigured — the number of connections that
> would have been killed (could be non-zero when the configured max is a
> negative number). Do we still want this type of an indication, and if so,
> is it still done the same way — a negative number for max, but instead of
> counting the number of kills that would have been done it counts the number
> of authentications that would have been failed?
> >
> > Ron
> >
> >> On Sep 17, 2018, at 6:06 AM, Rajini Sivaram 
> wrote:
> >>
> >> Hi Ron,
> >>
> >> Thinking a bit more about other SASL mechanisms, I think the issue is
> that
> >> in the current proposal, clients decide the re-authentication period.
> For
> >> mechanisms like PLAIN or SCRAM, we would actually want server to
> determine
> >> the re-authentication interval. For example, the PLAIN or SCRAM database
> >> could specify the expiry time for each principal, or the broker could be
> >> configured with a single expiry time for all principals of that
> mechanism.
> >> For OAuth, brokers do know the expiry time. Haven't figured out what to
> do
> >> with Kerberos, but in any case for broker-side connection termination on
> >> expiry, we need the broker to know/decide the expiry time. So I would
> like
> >> to suggest a slightly different approach to managing re-authentications.
> >>
> >> 1) Instead of changing SASL_HANDSHAKE version number, we bump up
> >> SASL_AUTHENTICATE version number.
> >> 2) In the final SASL_AUTHENTICATE response, broker returns the expiry
> time
> >> of this authenticated session. This is the interval after which broker
> will
> >> terminate the connection if it wasn't re-authenticated.
> >> 3) We no longer need the public interface change to add `
> >> org.apache.kafka.common.security.expiring.ExpiringCredential` for the
> >> client-side. Instead we schedule the next re-authentication on the
> client
> >> based on the expiry time provided by the server (some time earlier than
> the
> >> expiry).
> >> 4) If client uses SASL_AUTHENTICATE v0, broker will not return expiry
> time.
> >> The connection will be terminated if that feature is enabled (the same
> code
> >> path as a client failing to re-authenticate on time).
> >>
> >> Thoughts?
> >>
> >>> On Sat, Sep 15, 2018 at 3:06 AM, Ron Dagostino 
> wrote:
> >>>
> >>> Hi everyone.  I've updated the KIP to reflect all discussion to date,
> >>> including the decision to go with the low-level approach.  This latest
> >>> version of the KIP includes the above "connections.max.reauth.ms"
> >>> proposal,
> >>> which I know has not been discussed yet.  It mentions new 

Re: [DISCUSS] KIP-273 Kafka to support using ETCD beside Zookeeper

2018-09-17 Thread lbarikyn



On 2018/09/14 21:00:10, Colin McCabe  wrote: 
> On Fri, Sep 14, 2018, at 00:46, lbari...@gmail.com wrote:
> > Hello, Colin!
> > 
> > I have read the discussion on KIP-273 and want to raise this question. 
> > Deploying Kafka to Kubernetes is a crucial problem in my current work. 
> > And the use of ZooKeeper causes trouble, because it is unstable in a 
> > container. 
> 
> Hi Ibarikyn,
> 
> I haven't heard any reports of ZooKeeper being unstable in containers.  There 
> are a lot of companies already deploying Kafka and ZooKeeper this way, so I 
> am surprised to hear that you had trouble.  Can you be a little more clear 
> about what issues you are seeing with ZooKeeper in containers?
> 
> > In this discussion you have mentioned that the goal is to get rid of it 
> > in future releases. But since then there have been no posts about it. Also, I 
> > cannot see such a plan in the roadmap. Could you please tell us more about 
> > the work being done in this area.
> 
> ZooKeeper (and etcd, and consul, and the others) were originally designed as 
> configuration management systems.  They're really not set up to do what Kafka 
> does with them, which is use them as one half of a data management system.  
> We need things like uniform access control, the ability to cache metadata 
> locally, the ability to propagate only what has changed to brokers, etc.  
> That requires us to manage our own metadata log.  Kafka is a system for 
> managing logs, so this is a natural design.
> 
> There have been a lot of discussions about pluggable consensus before on the 
> mailing list.  Even if we wanted to keep using a configuration management 
> system for metadata, pluggable consensus is a really bad idea.  It means we 
> would have to support multiple systems, getting correspondingly less testing 
> in each.  We also would have to use least common denominator APIs, since each 
> system is somewhat different.  And configuration would get harder for new 
> users.
> 
> best,
> Colin
> 
Hello,

Thank you for the answer!
I can see your point and agree with some of the ideas (e.g. the complexity of 
configuring custom systems for users).

Some of the troubles with ZooKeeper that I mentioned:
 - it doesn't support SSL (at least the version that is used in Kafka)
 - the cluster is unstable
 - DNS names are confusing

But I am talking about another issue. As far as I am concerned, getting rid of 
the dependency on ZooKeeper (and other configuration management systems) is the 
right direction. But you did not estimate when we will see it in Kafka. I would 
like to know whether it is work in progress or something like that. Do you have 
any estimate of when this feature will be released?
By the way, how is the ZooKeeper-less version supposed to work? What technology 
will be used as the metadata store?
Since I am very interested in this feature, I would like to try to contribute. 
Could you point me to the issues/tickets so I can get acquainted with the 
situation and maybe start working on it?


Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-17 Thread Ron Dagostino
Actually, I think the metric remains the same assuming the authentication fails 
when the token  lifetime is too long.  Negative max config on server counts 
what would have been killed because maybe client re-auth is not turned on; 
positive max enables the kill, which is counted by a second metric as proposed. 
 The current proposal had already stated that a non-zero value would disallow 
an authentication with a token that has too large a lifetime, and that is still 
the case, I think.  But let’s make sure we are on the same page about all this.

Ron

> On Sep 17, 2018, at 6:42 AM, Ron Dagostino  wrote:
> 
> Hi Rajini.  I see what you are saying.  The background login refresh thread 
> does have to factor into the decision for OAUTHBEARER because there is no new 
> token to re-authenticate with until that refresh succeeds (similarly with 
> Kerberos), but I think you are right that the interface doesn’t necessarily 
> have to be public.  The server does decide the time for PLAIN and the 
> SCRAM-related mechanisms under the current proposal, but it is done in 
> SaslHandshakeResponse, and at that point the server won’t have any token yet; 
> making it happen via SaslAuthenticateRequest at the very end does allow the 
> server to know everything for all mechanisms.
> 
> I see two potential issues to discuss.  First, the server must communicate a 
> time that exceeds the (token or ticket) refresh period on the client.  
> Assuming it communicates the expiration time of the token/ticket, that’s the 
> best it can do.  So I think that’ll be fine.
> 
> The second issue is what happens if the server is configured to accept a max 
> value — say, an hour — and the token is good for longer.  I assumed that the 
> client should not be allowed to authenticate in the first place because it 
> could then simply re-authenticate with the same token after an hour, which 
> defeats the motivations for this KIP.  So do we agree the max token lifetime 
> allowed will be enforced at authentication time?  Assuming so, then we need 
> to discuss migration.  
> 
> The current proposal includes a metric that can be used to identify if an 
> OAUTHBEARER client is misconfigured — the number of connections that would 
> have been killed (could be non-zero when the configured max is a negative 
> number). Do we still want this type of an indication, and if so, is it still 
> done the same way — a negative number for max, but instead of counting the 
> number of kills that would have been done it counts the number of 
> authentications that would have been failed?
> 
> Ron
> 
>> On Sep 17, 2018, at 6:06 AM, Rajini Sivaram  wrote:
>> 
>> Hi Ron,
>> 
>> Thinking a bit more about other SASL mechanisms, I think the issue is that
>> in the current proposal, clients decide the re-authentication period. For
>> mechanisms like PLAIN or SCRAM, we would actually want server to determine
>> the re-authentication interval. For example, the PLAIN or SCRAM database
>> could specify the expiry time for each principal, or the broker could be
>> configured with a single expiry time for all principals of that mechanism.
>> For OAuth, brokers do know the expiry time. Haven't figured out what to do
>> with Kerberos, but in any case for broker-side connection termination on
>> expiry, we need the broker to know/decide the expiry time. So I would like
>> to suggest a slightly different approach to managing re-authentications.
>> 
>> 1) Instead of changing SASL_HANDSHAKE version number, we bump up
>> SASL_AUTHENTICATE version number.
>> 2) In the final SASL_AUTHENTICATE response, broker returns the expiry time
>> of this authenticated session. This is the interval after which broker will
> >> terminate the connection if it wasn't re-authenticated.
>> 3) We no longer need the public interface change to add `
>> org.apache.kafka.common.security.expiring.ExpiringCredential` for the
>> client-side. Instead we schedule the next re-authentication on the client
>> based on the expiry time provided by the server (some time earlier than the
>> expiry).
>> 4) If client uses SASL_AUTHENTICATE v0, broker will not return expiry time.
>> The connection will be terminated if that feature is enabled (the same code
> >> path as a client failing to re-authenticate on time).
>> 
>> Thoughts?
>> 
>>> On Sat, Sep 15, 2018 at 3:06 AM, Ron Dagostino  wrote:
>>> 
>>> Hi everyone.  I've updated the KIP to reflect all discussion to date,
>>> including the decision to go with the low-level approach.  This latest
>>> version of the KIP includes the above "connections.max.reauth.ms"
>>> proposal,
>>> which I know has not been discussed yet.  It mentions new metrics, some of
>>> which may not have been discussed either (and names are missing from some
>>> of them).  Regardless, this new version is the closest yet to a version
>>> that can be put to a vote next week.
>>> 
>>> Ron
>>> 
 On Fri, Sep 14, 2018 at 8:59 PM Ron Dagostino  wrote:
 
 Minor correction: I'm proposing 

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-17 Thread Ron Dagostino
Hi Rajini.  I see what you are saying.  The background login refresh thread 
does have to factor into the decision for OAUTHBEARER because there is no new 
token to re-authenticate with until that refresh succeeds (similarly with 
Kerberos), but I think you are right that the interface doesn’t necessarily 
have to be public.  The server does decide the time for PLAIN and the 
SCRAM-related mechanisms under the current proposal, but it is done in 
SaslHandshakeResponse, and at that point the server won’t have any token yet; 
making it happen via SaslAuthenticateRequest at the very end does allow the 
server to know everything for all mechanisms.

I see two potential issues to discuss.  First, the server must communicate a 
time that exceeds the (token or ticket) refresh period on the client.  Assuming 
it communicates the expiration time of the token/ticket, that’s the best it can 
do.  So I think that’ll be fine.

The second issue is what happens if the server is configured to accept a max 
value — say, an hour — and the token is good for longer.  I assumed that the 
client should not be allowed to authenticate in the first place because it 
could then simply re-authenticate with the same token after an hour, which 
defeats the motivations for this KIP.  So do we agree the max token lifetime 
allowed will be enforced at authentication time?  Assuming so, then we need to 
discuss migration.  

The current proposal includes a metric that can be used to identify if an 
OAUTHBEARER client is misconfigured — the number of connections that would have 
been killed (could be non-zero when the configured max is a negative number). 
Do we still want this type of an indication, and if so, is it still done the 
same way — a negative number for max, but instead of counting the number of 
kills that would have been done it counts the number of authentications that 
would have been failed?

Ron

> On Sep 17, 2018, at 6:06 AM, Rajini Sivaram  wrote:
> 
> Hi Ron,
> 
> Thinking a bit more about other SASL mechanisms, I think the issue is that
> in the current proposal, clients decide the re-authentication period. For
> mechanisms like PLAIN or SCRAM, we would actually want server to determine
> the re-authentication interval. For example, the PLAIN or SCRAM database
> could specify the expiry time for each principal, or the broker could be
> configured with a single expiry time for all principals of that mechanism.
> For OAuth, brokers do know the expiry time. Haven't figured out what to do
> with Kerberos, but in any case for broker-side connection termination on
> expiry, we need the broker to know/decide the expiry time. So I would like
> to suggest a slightly different approach to managing re-authentications.
> 
> 1) Instead of changing SASL_HANDSHAKE version number, we bump up
> SASL_AUTHENTICATE version number.
> 2) In the final SASL_AUTHENTICATE response, broker returns the expiry time
> of this authenticated session. This is the interval after which broker will
> terminate the connection if it wasn't re-authenticated.
> 3) We no longer need the public interface change to add `
> org.apache.kafka.common.security.expiring.ExpiringCredential` for the
> client-side. Instead we schedule the next re-authentication on the client
> based on the expiry time provided by the server (some time earlier than the
> expiry).
> 4) If client uses SASL_AUTHENTICATE v0, broker will not return expiry time.
> The connection will be terminated if that feature is enabled (the same code
> path as a client failing to re-authenticate on time).
> 
> Thoughts?
> 
>> On Sat, Sep 15, 2018 at 3:06 AM, Ron Dagostino  wrote:
>> 
>> Hi everyone.  I've updated the KIP to reflect all discussion to date,
>> including the decision to go with the low-level approach.  This latest
>> version of the KIP includes the above "connections.max.reauth.ms"
>> proposal,
>> which I know has not been discussed yet.  It mentions new metrics, some of
>> which may not have been discussed either (and names are missing from some
>> of them).  Regardless, this new version is the closest yet to a version
>> that can be put to a vote next week.
>> 
>> Ron
>> 
>>> On Fri, Sep 14, 2018 at 8:59 PM Ron Dagostino  wrote:
>>> 
>>> Minor correction: I'm proposing "connections.max.reauth.ms" as the
>>> broker-side configuration property, not "
>>> connections.max.expired.credentials.ms".
>>> 
>>> Ron
>>> 
 On Fri, Sep 14, 2018 at 8:40 PM Ron Dagostino  wrote:
 
 Hi Rajini.  I'm going to choose *connections.max.expired.credentials.ms
 * as the option name
 because it is consistent with the comment I made in the section about
>> how
 to get the client and server to agree on credential lifetime:
 
 "We could add a new API call so that clients could ask servers for the
 lifetime they use, or we could extend the SaslHandshakeRequest/Response
>> API
 call to include that information in the server's 

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-17 Thread Rajini Sivaram
Hi Ron,

Thinking a bit more about other SASL mechanisms, I think the issue is that
in the current proposal, clients decide the re-authentication period. For
mechanisms like PLAIN or SCRAM, we would actually want server to determine
the re-authentication interval. For example, the PLAIN or SCRAM database
could specify the expiry time for each principal, or the broker could be
configured with a single expiry time for all principals of that mechanism.
For OAuth, brokers do know the expiry time. Haven't figured out what to do
with Kerberos, but in any case for broker-side connection termination on
expiry, we need the broker to know/decide the expiry time. So I would like
to suggest a slightly different approach to managing re-authentications.

1) Instead of changing SASL_HANDSHAKE version number, we bump up
SASL_AUTHENTICATE version number.
2) In the final SASL_AUTHENTICATE response, broker returns the expiry time
of this authenticated session. This is the interval after which broker will
terminate the connection if it wasn't re-authenticated.
3) We no longer need the public interface change to add `
org.apache.kafka.common.security.expiring.ExpiringCredential` for the
client-side. Instead we schedule the next re-authentication on the client
based on the expiry time provided by the server (some time earlier than the
expiry).
4) If client uses SASL_AUTHENTICATE v0, broker will not return expiry time.
The connection will be terminated if that feature is enabled (the same code
path as a client failing to re-authenticate on time).

Thoughts?
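
For what it's worth, here is a rough sketch of the client-side scheduling in
point 3 (the class name, the executor, and the 90% factor are only
illustrations of the idea, not proposed code):

```
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class ReauthScheduler {

    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    // sessionLifetimeMs is the value the broker returned in the final
    // SASL_AUTHENTICATE response; reauthenticate performs the new handshake.
    public void schedule(final long sessionLifetimeMs, final Runnable reauthenticate) {
        // Re-authenticate somewhat before expiry (e.g. at ~90% of the lifetime)
        // so the broker never has to terminate the connection.
        final long delayMs = (long) (sessionLifetimeMs * 0.9);
        executor.schedule(reauthenticate, delayMs, TimeUnit.MILLISECONDS);
    }
}
```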

On Sat, Sep 15, 2018 at 3:06 AM, Ron Dagostino  wrote:

> Hi everyone.  I've updated the KIP to reflect all discussion to date,
> including the decision to go with the low-level approach.  This latest
> version of the KIP includes the above "connections.max.reauth.ms"
> proposal,
> which I know has not been discussed yet.  It mentions new metrics, some of
> which may not have been discussed either (and names are missing from some
> of them).  Regardless, this new version is the closest yet to a version
> that can be put to a vote next week.
>
> Ron
>
> On Fri, Sep 14, 2018 at 8:59 PM Ron Dagostino  wrote:
>
> > Minor correction: I'm proposing "connections.max.reauth.ms" as the
> > broker-side configuration property, not "
> > connections.max.expired.credentials.ms".
> >
> > Ron
> >
> > On Fri, Sep 14, 2018 at 8:40 PM Ron Dagostino  wrote:
> >
> >> Hi Rajini.  I'm going to choose *connections.max.expired.credentials.ms
> >> * as the option name
> >> because it is consistent with the comment I made in the section about
> how
> >> to get the client and server to agree on credential lifetime:
> >>
> >> "We could add a new API call so that clients could ask servers for the
> >> lifetime they use, or we could extend the SaslHandshakeRequest/Response
> API
> >> call to include that information in the server's response – the client
> >> would then adopt that value"
> >>
> >>
> >> We set the config option value on the broker (with the "
> >> listener.name.mechanism." prefix), and we will return the configured
> >> value in the SaslHandshakeResponse (requiring a wire format change in
> >> addition to a version bump).  The value (referred to as "X" below) can
> be
> >> negative, zero, or positive and is to be interpreted as follows:
> >>
> >> X = 0: Fully disable.  The server will respond to re-authentications,
> but
> >> it won't kill any connections due to expiration, and it won't track
> either
> >> of the 2 metrics mentioned below.
> >>
> >> Now, a couple of definitions for when X != 0:
> >>
> >> 1) We define the *maximum allowed expiration time* to be the time
> >> determined by the point when (re-)authentication occurs plus |X|
> >> milliseconds.
> >> 2) We define the *requested expiration time* to be the maximum allowed
> >> expiration time except for the case of an OAuth bearer token, in which
> case
> >> it is the point at which the token expires.  (Kerberos presumably also
> >> specifies a ticket lifetime, but frankly I am not in a position to do
> any
> >> Kerberos-related coding in time for a 2.1.0 release and would prefer to
> >> ignore this piece of information for this KIP -- would it be acceptable
> to
> >> have someone else add it later?).
> >>
> >> Based on these definitions, we define the behavior as follows:
> >>
> >> *X > 0: Fully enable*
> >> A) The server will reject any authentication/re-authentication attempt
> >> when the requested expiration time is after the maximum allowed
> expiration
> >> time (which could only happen with an OAuth bearer token, assuming we
> skip
> >> dealing with Kerberos for now).
> >> B) The server will kill connections that are used beyond the requested
> >> expiration time.
> >> C) A broker metric will be maintained that documents the number
> >> connections killed by the broker.  This metric will be non-zero if a
> client
> >> is connecting to the broker with re-authentication either 

Re: [VOTE] KIP-361: Add Consumer Configuration to Disable Auto Topic Creation

2018-09-17 Thread Patrick Williams
Hi Matthias,

I'm still getting a lot of messages from Kafka and being cc'd on lots of things 
I don't need to get despite unsubscribing from the lists.
Is there anything more I can do?

 Thanks
 
Best,
 
Patrick Williams
 
Enterprise New Business Manager
+44 (0)7549 676279
patrick.willi...@storageos.com
 
20 Midtown
20 Proctor Street
Holborn
London WC1V 6NX
 
Twitter: @patch37
LinkedIn: linkedin.com/in/patrickwilliams4 

https://slack.storageos.com/
 
 

On 16/09/2018, 19:40, "Matthias J. Sax"  wrote:

+1 (binding)

-Matthias

On 9/14/18 4:57 PM, Ismael Juma wrote:
> Thanks for the KIP, +1 (binding).
> 
> Ismael
> 
> On Fri, Sep 14, 2018 at 4:56 PM Dhruvil Shah  wrote:
> 
>> Hi all,
>>
>> I would like to start a vote on KIP-361.
>>
>> Link to the KIP:
>>
>> 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-361%3A+Add+Consumer+Configuration+to+Disable+Auto+Topic+Creation
>>
>> Thanks,
>> Dhruvil
>>
> 





Re: [VOTE] KIP-361: Add Consumer Configuration to Disable Auto Topic Creation

2018-09-17 Thread Mickael Maison
+1 (non-binding)
Thanks for the KIP!
On Sun, Sep 16, 2018 at 7:40 PM Matthias J. Sax  wrote:
>
> +1 (binding)
>
> -Matthias
>
> On 9/14/18 4:57 PM, Ismael Juma wrote:
> > Thanks for the KIP, +1 (binding).
> >
> > Ismael
> >
> > On Fri, Sep 14, 2018 at 4:56 PM Dhruvil Shah  wrote:
> >
> >> Hi all,
> >>
> >> I would like to start a vote on KIP-361.
> >>
> >> Link to the KIP:
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-361%3A+Add+Consumer+Configuration+to+Disable+Auto+Topic+Creation
> >>
> >> Thanks,
> >> Dhruvil
> >>
> >
>


Re: [VOTE] KIP-365: Materialized, Serialized, Joined, Consumed and Produced with implicit Serde

2018-09-17 Thread Joan Goyeau
I see!
Thanks for the explanation Matt.

On Sun, 16 Sep 2018 at 20:10 Matthias J. Sax  wrote:

> People cannot pick if their vote is binding or not. If a committer
> (https://kafka.apache.org/committers) votes, it's binding; otherwise, not.
>
> Thus, there is no default -- it depends who votes.
>
>
> -Matthias
>
>
> On 9/13/18 2:41 PM, Joan Goyeau wrote:
> > Ok so a +1 is non binding by default.
> >
> > On Tue, 11 Sep 2018 at 17:25 Matthias J. Sax 
> wrote:
> >
> >> I only count 3 binding votes (Guozhang, Matthias, Damian). Plus 4
> >> non-binding (John, Ted, Bill, Dongjin) -- or 5 if you vote your own KIP
> :)
> >>
> >> -Matthias
> >>
> >> On 9/11/18 12:53 AM, Joan Goyeau wrote:
> >>> Ok, so this is now accepted.
> >>>
> >>> Binding votes: +5
> >>> Non-binding votes: +2
> >>>
> >>> Thanks
> >>>
> >>> On Wed, 5 Sep 2018 at 21:38 John Roesler  wrote:
> >>>
>  Hi Joan,
> 
>  Damian makes 3 binding votes, and the vote has been open longer than
> 72
>  hours, so your KIP vote has passed!
> 
>  It's customary for you to send a final reply to this thread stating
> that
>  the vote has passed, and stating the number of binding and non-binding
> >> +1s.
> 
>  Also please update the current state to Accepted here:
> 
> 
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-365%3A+Materialized%2C+Serialized%2C+Joined%2C+Consumed+and+Produced+with+implicit+Serde
> 
>  And move your kip into the Adopted table here:
> 
> 
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-365%3A+Materialized%2C+Serialized%2C+Joined%2C+Consumed+and+Produced+with+implicit+Serde
>  (release will be 2.1)
> 
>  And finally, move it to Accepted here:
>  https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Streams
> 
>  Lastly, you might want to make another pass over the PR and call for
> >> final
>  reviews.
> 
>  Thanks so much for managing this process,
>  -John
> 
>  On Mon, Sep 3, 2018 at 11:00 AM Damian Guy 
> >> wrote:
> 
> > +1
> >
> > On Sun, 2 Sep 2018 at 15:20 Matthias J. Sax 
>  wrote:
> >
> >> +1 (binding)
> >>
> >> On 9/1/18 2:40 PM, Guozhang Wang wrote:
> >>> +1 (binding).
> >>>
> >>> On Mon, Aug 27, 2018 at 5:20 PM, Dongjin Lee 
> > wrote:
> >>>
>  +1 (non-binding)
> 
>  On Tue, Aug 28, 2018 at 8:53 AM Bill Bejeck 
> > wrote:
> 
> > +1
> >
> > -Bill
> >
> > On Mon, Aug 27, 2018 at 3:24 PM Ted Yu 
>  wrote:
> >
> >> +1
> >>
> >> On Mon, Aug 27, 2018 at 12:18 PM John Roesler <
> j...@confluent.io>
>  wrote:
> >>
> >>> +1 (non-binding)
> >>>
> >>> On Sat, Aug 25, 2018 at 1:16 PM Joan Goyeau 
> > wrote:
> >>>
>  Hi,
> 
>  We want to make sure that we always have a serde for all
> > Materialized,
>  Serialized, Joined, Consumed and Produced.
>  For that we can make use of the implicit parameters in Scala.
> 
>  KIP:
> 
> 
> >>>
> >>
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 
> >>
> >
> 
> >>
> 365%3A+Materialized%2C+Serialized%2C+Joined%2C+Consumed+and+Produced+with+
>  implicit+Serde
> 
>  Github PR: https://github.com/apache/kafka/pull/5551
> 
>  Please make your votes.
>  Thanks
> 
> >>>
> >>
> >
> > --
> > *Dongjin Lee*
> >
> > *A hitchhiker in the mathematical world.*
> >
> > *github:  github.com/dongjinleekr
> > linkedin: kr.linkedin.com/in/
>  dongjinleekr
> > slideshare:
> >> www.slideshare.net/
>  dongjinleekr
> > *
> >
> 
> >>>
> >>>
> >>>
> >>
> >>
> >
> 
> >>>
> >>
> >>
> >
>
>


Re: [VOTE] KIP-366 - Make FunctionConversations private

2018-09-17 Thread Joan Goyeau
This KIP is now accepted with:
- 3 binding +1
- 2 non binding +1

Thanks all

On Fri, 14 Sep 2018 at 21:05 Matthias J. Sax  wrote:

> +1 (binding)
>
> -Matthias
>
> On 9/14/18 7:55 AM, Damian Guy wrote:
> > +1 (binding)
> >
> > Thanks
> >
> > On Fri, 14 Sep 2018 at 10:00 Joan Goyeau  wrote:
> >
> >> Ok so we have only one binding vote up to now. I guess we need at least
> 3
> >> binding votes.
> >>
> >> Thanks
> >>
> >> On Fri, 14 Sep 2018 at 09:59 Joan Goyeau  wrote:
> >>
> >>> Matt, This is now all updated.
> >>>
> >>> On Fri, 7 Sep 2018 at 03:34 Matthias J. Sax 
> >> wrote:
> >>>
>  Can you please update the KIP accordingly?
> 
>  It still says "make private" instead of "deprecating"
> 
>  -Matthias
> 
>  On 9/6/18 10:07 AM, Attila Sasvári wrote:
> > +1 (non-binding)
> >
> > On Thu, Sep 6, 2018 at 6:38 PM Guozhang Wang 
>  wrote:
> >
> >> +1 for deprecating and copying the class over to internals.
> >>
> >> On Thu, Sep 6, 2018 at 6:56 AM, Bill Bejeck 
> >> wrote:
> >>
> >>> +1
> >>>
> >>> -Bill
> >>>
> >>> On Thu, Sep 6, 2018 at 4:29 AM Joan Goyeau 
> wrote:
> >>>
>  Sournds good, I'll make the deprecation and copy the class over.
> 
>  Thanks
> 
>  On Wed, 5 Sep 2018 at 22:48 John Roesler 
> >> wrote:
> 
> > I'm a +1 (non-binding) because we doubt the class is in use.
> >
> > If you decide to copy it to a private version and deprecate the
> >>> original
> > instead, as Matthias suggested, I would still be a +1.
> >
> > Thanks,
> > -John
> >
> > On Sat, Sep 1, 2018 at 6:47 AM Joan Goyeau 
> >> wrote:
> >
> >> Hi,
> >>
> >> As pointed out in this comment
> >> https://github.com/apache/kafka/pull/5539#discussion_r212380648
> >>> "This
> >> class
> >> was already defaulted to public visibility, and we can't retract
> >> it
>  now,
> >> without a KIP.", the object FunctionConversions is only of
> >> internal
> >>> use
> > and
> >> therefore should be private to the lib only so that we can do
> >> changes
> >> without going through KIP like this one.
> >>
> >> KIP:
> >>
> >>
> >
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-366%3A+Make+
> >>> FunctionConversions+private
> >>
> >> Please make your votes.
> >> Thanks
> >>
> >
> 
> >>>
> >>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
> 
> 
> >>
> >
>
>