Re: [DISCUSS] KIP-354 Time-based log compaction policy

2018-09-03 Thread Brett Rann
Might also be worth moving to a vote thread? Discussion seems to have gone as 
far as it can. 

> On 4 Sep 2018, at 12:08, xiongqi wu  wrote:
> 
> Brett,
> 
> Yes, I will post PR tomorrow.
> 
> Xiongqi (Wesley) Wu
> 
> 
> On Sun, Sep 2, 2018 at 6:28 PM Brett Rann  wrote:
> 
> > +1 (non-binding) from me on the interface. I'd like to see someone familiar
> > with
> > the code comment on the approach, and note there's a couple of different
> > approaches: what's documented in the KIP, and what Xiaohe Dong was working
> > on
> > here:
> >
> > https://github.com/dongxiaohe/kafka/tree/dongxiaohe/log-cleaner-compaction-max-lifetime-2.0
> >
> > If you have code working already Xiongqi Wu could you share a PR? I'd be
> > happy
> > to start testing.
> >
> > On Tue, Aug 28, 2018 at 5:57 AM xiongqi wu  wrote:
> >
> > > Hi All,
> > >
> > > Do you have any additional comments on this KIP?
> > >
> > >
> > > On Thu, Aug 16, 2018 at 9:17 PM, xiongqi wu  wrote:
> > >
> > > > on 2)
> > > > The offsetmap is built starting from dirty segment.
> > > > The compaction starts from the beginning of the log partition. That's
> > how
> > > > it ensures the deletion of tombstone keys.
> > > > I will double check tomorrow.
> > > >
> > > > Xiongqi (Wesley) Wu
> > > >
> > > >
> > > > On Thu, Aug 16, 2018 at 6:46 PM Brett Rann 
> > > > wrote:
> > > >
> > > >> To just clarify a bit on 1. whether there's an external storage/DB
> > isn't
> > > >> relevant here.
> > > >> Compacted topics allow a tombstone record to be sent (a null value
> > for a
> > > >> key) which
> > > >> currently will result in old values for that key being deleted if some
> > > >> conditions are met.
> > > >> There are existing controls to make sure the old values will stay
> > around
> > > >> for a minimum
> > > >> time at least, but no dedicated control to ensure the tombstone will
> > > >> delete
> > > >> within a
> > > >> maximum time.
> > > >>
> > > >> One popular reason that maximum time for deletion is desirable right
> > now
> > > >> is
> > > >> GDPR with
> > > >> PII. But we're not proposing any GDPR awareness in kafka, just being
> > > able
> > > >> to guarantee
> > > >> a max time where a tombstoned key will be removed from the compacted
> > > >> topic.
> > > >>
> > > >> on 2)
> > > >> huh, i thought it kept track of the first dirty segment and didn't
> > > >> recompact older "clean" ones.
> > > >> But I didn't look at code or test for that.
> > > >>
> > > >> On Fri, Aug 17, 2018 at 10:57 AM xiongqi wu 
> > > wrote:
> > > >>
> > > >> > 1, The owner of the data (in this sense, Kafka is not the owner of the data)
> > > >> > should keep track of lifecycle of the data in some external
> > > storage/DB.
> > > >> > The owner determines when to delete the data and send the delete
> > > >> request to
> > > >> > kafka. Kafka doesn't know about the content of the data; it only provides
> > > >> > a means for deletion.
> > > >> >
> > > >> > 2, each time compaction runs, it will start from the first segment (no
> > > >> > matter whether it is compacted or not). The time estimation here is only
> > > used
> > > >> > to determine whether we should run compaction on this log partition.
> > > So
> > > >> we
> > > >> > only need to estimate uncompacted segments.
> > > >> >
> > > >> > On Thu, Aug 16, 2018 at 5:35 PM, Dong Lin 
> > > wrote:
> > > >> >
> > > >> > > Hey Xiongqi,
> > > >> > >
> > > >> > > Thanks for the update. I have two questions for the latest KIP.
> > > >> > >
> > > >> > > 1) The motivation section says that one use case is to delete PII
> > > >> > (Personal
> > > >> > > Identifiable information) data within 7 days while keeping non-PII
> > > >> > > indefinitely in compacted format. I suppose the use-case depends
> > on
> > > >> the
> > > >> > > application to determine when to delete those PII data. Could you
> > > >> explain
> > > >> > > how can application reliably determine the set of keys that should
> > > be
> > > >> > > deleted? Is the application required to always read messages from the topic
> > > >> after
> > > >> > > every restart and determine the keys to be deleted by looking at
> > > >> message
> > > >> > > timestamp, or is application supposed to persist the key->
> > timestamp
> > > >> > > information in a separate persistent storage system?
> > > >> > >
> > > >> > > 2) It is mentioned in the KIP that "we only need to estimate
> > > earliest
> > > >> > > message timestamp for un-compacted log segments because the
> > deletion
> > > >> > > requests that belong to compacted segments have already been
> > > >> processed".
> > > >> > > Not sure if it is correct. If a segment is compacted before user
> > > sends
> > > >> > > message to delete a key in this segment, it seems that we still
> > need
> > > >> to
> > > >> > > ensure that the segment will be compacted again within the given
> > > time
> > > >> > after
> > > >> > > the deletion is requested, right?
> > > >> > >
> > > >> > > Thanks,
> > > >> > > Dong
> > > >> > >
> > > >> > > On Thu, Aug 16, 2018 at 10:27 AM, xiongqi 

Re: [DISCUSS] KIP-354 Time-based log compaction policy

2018-09-03 Thread xiongqi wu
Brett,

Yes, I will post PR tomorrow.

Xiongqi (Wesley) Wu


On Sun, Sep 2, 2018 at 6:28 PM Brett Rann  wrote:

> +1 (non-binding) from me on the interface. I'd like to see someone familiar
> with
> the code comment on the approach, and note there's a couple of different
> approaches: what's documented in the KIP, and what Xiaohe Dong was working
> on
> here:
>
> https://github.com/dongxiaohe/kafka/tree/dongxiaohe/log-cleaner-compaction-max-lifetime-2.0
>
> If you have code working already Xiongqi Wu could you share a PR? I'd be
> happy
> to start testing.
>
> On Tue, Aug 28, 2018 at 5:57 AM xiongqi wu  wrote:
>
> > Hi All,
> >
> > Do you have any additional comments on this KIP?
> >
> >
> > On Thu, Aug 16, 2018 at 9:17 PM, xiongqi wu  wrote:
> >
> > > on 2)
> > > The offsetmap is built starting from dirty segment.
> > > The compaction starts from the beginning of the log partition. That's
> how
> > > it ensures the deletion of tombstone keys.
> > > I will double check tomorrow.
> > >
> > > Xiongqi (Wesley) Wu
> > >
> > >
> > > On Thu, Aug 16, 2018 at 6:46 PM Brett Rann 
> > > wrote:
> > >
> > >> To just clarify a bit on 1. whether there's an external storage/DB
> isn't
> > >> relevant here.
> > >> Compacted topics allow a tombstone record to be sent (a null value
> for a
> > >> key) which
> > >> currently will result in old values for that key being deleted if some
> > >> conditions are met.
> > >> There are existing controls to make sure the old values will stay
> around
> > >> for a minimum
> > >> time at least, but no dedicated control to ensure the tombstone will
> > >> delete
> > >> within a
> > >> maximum time.
> > >>
> > >> One popular reason that maximum time for deletion is desirable right
> now
> > >> is
> > >> GDPR with
> > >> PII. But we're not proposing any GDPR awareness in kafka, just being
> > able
> > >> to guarantee
> > >> a max time where a tombstoned key will be removed from the compacted
> > >> topic.
> > >>
> > >> on 2)
> > >> huh, i thought it kept track of the first dirty segment and didn't
> > >> recompact older "clean" ones.
> > >> But I didn't look at code or test for that.
> > >>
> > >> On Fri, Aug 17, 2018 at 10:57 AM xiongqi wu 
> > wrote:
> > >>
> > >> > 1, The owner of the data (in this sense, Kafka is not the owner of the data)
> > >> > should keep track of lifecycle of the data in some external
> > storage/DB.
> > >> > The owner determines when to delete the data and send the delete
> > >> request to
> > >> > kafka. Kafka doesn't know about the content of the data; it only provides
> > >> > a means for deletion.
> > >> >
> > >> > 2, each time compaction runs, it will start from the first segment (no
> > >> > matter whether it is compacted or not). The time estimation here is only
> > used
> > >> > to determine whether we should run compaction on this log partition.
> > So
> > >> we
> > >> > only need to estimate uncompacted segments.
> > >> >
> > >> > On Thu, Aug 16, 2018 at 5:35 PM, Dong Lin 
> > wrote:
> > >> >
> > >> > > Hey Xiongqi,
> > >> > >
> > >> > > Thanks for the update. I have two questions for the latest KIP.
> > >> > >
> > >> > > 1) The motivation section says that one use case is to delete PII
> > >> > (Personal
> > >> > > Identifiable information) data within 7 days while keeping non-PII
> > >> > > indefinitely in compacted format. I suppose the use-case depends
> on
> > >> the
> > >> > > application to determine when to delete those PII data. Could you
> > >> explain
> > >> > > how can application reliably determine the set of keys that should
> > be
> > >> > > deleted? Is the application required to always read messages from the topic
> > >> after
> > >> > > every restart and determine the keys to be deleted by looking at
> > >> message
> > >> > > timestamp, or is application supposed to persist the key->
> timestamp
> > >> > > information in a separate persistent storage system?
> > >> > >
> > >> > > 2) It is mentioned in the KIP that "we only need to estimate
> > earliest
> > >> > > message timestamp for un-compacted log segments because the
> deletion
> > >> > > requests that belong to compacted segments have already been
> > >> processed".
> > >> > > Not sure if it is correct. If a segment is compacted before user
> > sends
> > >> > > message to delete a key in this segment, it seems that we still
> need
> > >> to
> > >> > > ensure that the segment will be compacted again within the given
> > time
> > >> > after
> > >> > > the deletion is requested, right?
> > >> > >
> > >> > > Thanks,
> > >> > > Dong
> > >> > >
> > >> > > On Thu, Aug 16, 2018 at 10:27 AM, xiongqi wu  >
> > >> > wrote:
> > >> > >
> > >> > > > Hi Xiaohe,
> > >> > > >
> > >> > > > Quick note:
> > >> > > > 1) Use minimum of segment.ms and max.compaction.lag.ms
> > >> > > >  > 
> > >> >  > >>
> > >> > > >
> > >> > > > 2) I am not sure if I get your second question. first, we have
> > >> jitter
> > >> > > when
> > 
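To make the configuration being discussed in this thread concrete, here is a minimal sketch, assuming a broker that implements the max.compaction.lag.ms topic config proposed by KIP-354 (the topic name, key, and chosen values are illustrative, not taken from the KIP). It creates a compacted topic with the existing min.compaction.lag.ms and the proposed max.compaction.lag.ms, then sends a tombstone (a record with a null value), which is the deletion request the thread refers to.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class CompactionLagTombstoneExample {
    public static void main(String[] args) throws Exception {
        Properties adminProps = new Properties();
        adminProps.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        // cleanup.policy and min.compaction.lag.ms already exist today;
        // max.compaction.lag.ms is the new topic config proposed by KIP-354.
        Map<String, String> topicConfigs = new HashMap<>();
        topicConfigs.put("cleanup.policy", "compact");
        topicConfigs.put("min.compaction.lag.ms", "86400000");   // keep old values at least 1 day
        topicConfigs.put("max.compaction.lag.ms", "604800000");  // proposed: compact within 7 days

        try (AdminClient admin = AdminClient.create(adminProps)) {
            NewTopic topic = new NewTopic("pii-user-data", 3, (short) 1).configs(topicConfigs);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }

        Properties producerProps = new Properties();
        producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        producerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            // A tombstone is simply a record with a null value for the key to be deleted.
            producer.send(new ProducerRecord<>("pii-user-data", "user-42", null)).get();
        }
    }
}

With only min.compaction.lag.ms there is a guaranteed minimum retention of older values, but no upper bound on when the tombstone takes effect; the point of the KIP is that max.compaction.lag.ms would force the log cleaner to re-compact the partition within the configured time after the tombstone is written.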

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-03 Thread Ron Dagostino
I just realized there was one place in the KIP that stated that retries
could occur indefinitely (when a client attempts to change identity, which
we arbitrarily disallow).  This was a mistake, a holdover from a prior
draft of the KIP.  This is now fixed.  Retries are never allowed
indefinitely.

<<>>if a connection originally authenticates as User:user1, an attempt to
re-authenticate as anything else (e.g. User:user2) will fail.
>>>Retry is allowed in this case (still subject to the expiration of the
original credential as described above)

Ron
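The retry behaviour Ron describes in the quoted message below (a small initial delay for the first retry, double that for the second, four times the initial delay thereafter, and at most one retry after the credential expires) can be sketched as follows. This illustrates the schedule as described in the thread, not code from the KIP's pull request; the class and method names are assumptions.

import java.time.Duration;

public final class ReauthRetrySchedule {
    private final Duration initialDelay;   // e.g. Duration.ofMinutes(1)

    public ReauthRetrySchedule(Duration initialDelay) {
        this.initialDelay = initialDelay;
    }

    // Delay before the given retry attempt (1-based):
    // attempt 1 -> initial delay, attempt 2 -> 2x, every later attempt -> 4x.
    public Duration delayBeforeRetry(int attempt) {
        if (attempt <= 1) {
            return initialDelay;
        } else if (attempt == 2) {
            return initialDelay.multipliedBy(2);
        } else {
            return initialDelay.multipliedBy(4);
        }
    }

    // Retries continue at least until the credential expires, but not indefinitely:
    // at most one retry is attempted after the expiration time has passed.
    public boolean shouldRetry(long nowMs, long credentialExpiresAtMs, boolean retriedAfterExpiry) {
        return nowMs < credentialExpiresAtMs || !retriedAfterExpiry;
    }
}

For example, with a one-minute initial delay and a one-hour token, a retry that fails at minute 59 schedules one last attempt four minutes later, at the 63-minute mark, matching the example quoted below.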


On Mon, Sep 3, 2018 at 5:35 PM Ron Dagostino  wrote:

> Thanks for the feedback, Stanislav.
>
> << I am not too familiar with the networking code to evaluate your solution. >>
> Yeah, I wasn't familiar with it when I started, either, and I am hoping
> people who are intimately familiar with it will take a close look.  Some of
> that code seems to be very central to the core of Kafka, and injecting
> re-authentication attempts into the flow of replica fetching, sending
> transaction markers, and producing or consuming messages is not something I
> am convinced is acceptable under all circumstances.  To be clear, though, I
> am not saying it is problematic, either -- I just don't have enough
> experience or familiarity to know.  I really do want additional eyes on
> this if possible.
>
> Regarding your question about retries, here's the part of the KIP that
> deals with those:
>
> If a re-authentication attempt should fail then the connection will be
>> told to retry after some delay depending on how many retries have been
>> attempted so far: after some small amount of time for the first retry (e.g.
>> 1 minute), double that for the second retry, and 4 times the initial delay
>> for every retry thereafter.  Retry attempts generally occur at least until
>> the current credential expires (but not indefinitely – and of course they
>> won't continue if one of them actually succeeds).  There are certain errors
>> that result in retries not being attempted (i.e. if some internal state
>> doesn't make sense, which generally should not happen).
>
>
> << Would we retry endlessly? >> No.  There may be at most one retry after the client token expires.  So if
> a token expires after an hour, and several retry attempts fail including
> one at minute 59, then one last attempt will be made at the 63-minute mark.
>
> << Since the functionality for brokers to close connections is outside the
> scope of this KIP, what effect would success/failure in re-authentication have? >>
> Practically speaking, unless authorization is based on the contents of the
> token rather than ACLs, the ultimate success or failure of
> re-authentication has no effect.  The intent is definitely to follow this
> KIP with another one to add the ability for brokers to close connections
> that use expired credentials, and then at that point the client would have
> to successfully re-authenticate to avoid the connection being forcibly
> closed.
>
> Ron
>
>
> On Mon, Sep 3, 2018 at 12:58 PM Stanislav Kozlovski <
> stanis...@confluent.io> wrote:
>
>> Hi Ron,
>>
>> I really like this KIP, it is very needed.
>> I am still looking through it but unfortunately I am not too familiar with
>> the networking code to evaluate your solution.
>>
>> I'd like to ask what happens if re-authentication consistently fails.
>> Would
>> we retry endlessly? Since the functionality for brokers to close
>> connections is outside the scope of this KIP, what effect would
>> success/failure in re-authentication have? I think it's worth noting in
>> the
>> KIP
>>
>> I also think the rejected alternative of initiating a new connection
>> should
>> stay rejected. I am not aware of anything currently limiting the client to
>> connect to the same broker, but I think it would be best if we kept
>> Kafka's
>> options open (e.g addition of a load balancer in the future) and not
>> introduce additional client-broker statefulness.
>>
>> Thanks,
>> Stanislav
>>
>> On Tue, Aug 28, 2018 at 5:28 PM Ron Dagostino  wrote:
>>
>> > Hi everyone. I created KIP 368: Allow SASL Connections to Periodically
>> > Re-Authenticate
>> > <
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-368%3A+Allow+SASL+Connections+to+Periodically+Re-Authenticate
>> > >
>> > (
>> >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-368%3A+Allow+SASL+Connections+to+Periodically+Re-Authenticate
>> > ).
>> > The motivation for this KIP is as follows:
>> >
>> > The adoption of KIP-255: OAuth Authentication via SASL/OAUTHBEARER
>> > <
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75968876
>> >
>> > in
>> > release 2.0.0 creates the possibility of using information in the bearer
>> > token to make authorization decisions.  Unfortunately, however, Kafka
>> > connections are long-lived, so there is no ability to change the bearer
>> > token associated with a particular connection.  Allowing SASL
>> connections
>> > to periodically re-authenticate would resolve this.  In addition to this
>> > motivation there are two others that are security-related.  First, to
>> > eliminate access to Kafka for connected clients, the current
>> requirement is
>> > to 

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-03 Thread Ron Dagostino
Thanks for the feedback, Stanislav.

<< Would we retry endlessly? >> If a re-authentication attempt should fail then the connection will be told to retry after some delay depending on how many retries have been attempted
> so far: after some small amount of time for the first retry (e.g. 1
> minute), double that for the second retry, and 4 times the initial delay
> for every retry thereafter.  Retry attempts generally occur at least until
> the current credential expires (but not indefinitely – and of course they
> won't continue if one of them actually succeeds).  There are certain errors
> that result in retries not being attempted (i.e. if some internal state
> doesn't make sense, which generally should not happen).


On Mon, Sep 3, 2018 at 12:58 PM Stanislav Kozlovski  wrote:

> Hi Ron,
>
> I really like this KIP, it is very needed.
> I am still looking through it but unfortunately I am not too familiar with
> the networking code to evaluate your solution.
>
> I'd like to ask what happens if re-authentication consistently fails. Would
> we retry endlessly? Since the functionality for brokers to close
> connections is outside the scope of this KIP, what effect would
> success/failure in re-authentication have? I think it's worth noting in the
> KIP
>
> I also think the rejected alternative of initiating a new connection should
> stay rejected. I am not aware of anything currently limiting the client to
> connect to the same broker, but I think it would be best if we kept Kafka's
> options open (e.g addition of a load balancer in the future) and not
> introduce additional client-broker statefulness.
>
> Thanks,
> Stanislav
>
> On Tue, Aug 28, 2018 at 5:28 PM Ron Dagostino  wrote:
>
> > Hi everyone. I created KIP 368: Allow SASL Connections to Periodically
> > Re-Authenticate
> > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-368%3A+Allow+SASL+Connections+to+Periodically+Re-Authenticate
> > >
> > (
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-368%3A+Allow+SASL+Connections+to+Periodically+Re-Authenticate
> > ).
> > The motivation for this KIP is as follows:
> >
> > The adoption of KIP-255: OAuth Authentication via SASL/OAUTHBEARER
> > <
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75968876>
> > in
> > release 2.0.0 creates the possibility of using information in the bearer
> > token to make authorization decisions.  Unfortunately, however, Kafka
> > connections are long-lived, so there is no ability to change the bearer
> > token associated with a particular connection.  Allowing SASL connections
> > to periodically re-authenticate would resolve this.  In addition to this
> > motivation there are two others that are security-related.  First, to
> > eliminate access to Kafka for connected clients, the current requirement
> is
> > to remove all authorizations (i.e. remove all ACLs).  This is necessary
> > because of the long-lived nature of the connections.  It is operationally
> > simpler to shut off access at the point of authentication, and with the
> > release of KIP-86: Configurable SASL Callback Handlers
> > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-86%3A+Configurable+SASL+callback+handlers
> > >
> > it
> > is going to become more and more likely that installations will
> > authenticate users against external directories (e.g. via LDAP).  The
> > ability to stop Kafka access by simply disabling an account in an LDAP
> > directory (for example) is desirable.  The second motivating factor for
> > re-authentication related to security is that the use of short-lived
> tokens
> > is a common OAuth security recommendation, but issuing a short-lived
> token
> > to a Kafka client (or a broker when OAUTHBEARER is the inter-broker
> > protocol) currently has no benefit because once a client is connected to
> a
> > broker the client is never challenged again and the connection may remain
> > intact beyond the token expiration time (and may remain intact
> indefinitely
> > under perfect circumstances).  This KIP proposes adding the ability for
> > clients (and brokers when OAUTHBEARER is the inter-broker protocol) to
> > re-authenticate their connections to brokers and have the new bearer
> token
> > appear on their session rather than the old one.
> >
> > The description of this KIP is actually quite straightforward from a
> > functionality perspective; from an implementation perspective, though,
> the
> > KIP is not so straightforward, so it includes a pull request with a
> > proposed implementation.  See https://github.com/apache/kafka/pull/5582.
> >
> > Ron
> >
>
>
> --
> Best,
> Stanislav
>
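For readers less familiar with the SASL/OAUTHBEARER setup this motivation refers to, a client configured per KIP-255 looks roughly like the sketch below; the callback handler class name is a placeholder, while the other keys are existing client configuration properties. The point of KIP-368 is that a connection created with such a configuration currently keeps its original bearer token for the lifetime of the connection, no matter how short-lived the token is.

import java.util.Properties;

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.common.config.SaslConfigs;

public class OauthBearerClientConfig {
    public static Properties clientProps() {
        Properties props = new Properties();
        props.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, "broker:9093");
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        props.put(SaslConfigs.SASL_MECHANISM, "OAUTHBEARER");
        // Login module shipped with KIP-255 (Kafka 2.0).
        props.put(SaslConfigs.SASL_JAAS_CONFIG,
            "org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required;");
        // Custom handler (KIP-86) that fetches short-lived tokens; the class name is hypothetical.
        props.put("sasl.login.callback.handler.class",
            "com.example.auth.ShortLivedTokenLoginCallbackHandler");
        return props;
    }
}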


Re: [DISCUSS] KIP-369 Alternative Partitioner to Support "Always Round-Robin" Selection

2018-09-03 Thread Matthias J. Sax
Thanks for the KIP.

I was initially wondering if we need to add this into the Kafka code base or
not. You could just go with a custom `Partitioner` implementation.

The demand in KAFKA- does not seem to be big, actually. On the other
hand, it might be a generic enough use case to justify adding it to the Kafka code
base. Also, it's not much code and thus easy to maintain.

About naming: `RoundRobinPartitioner` seems to be a good name.
`KeylessPartitioner` does not describe how partitions are selected. I
don't see any conflict with `DefaultPartitioner` -- it implements a
hybrid strategy depending on whether the key is `null` or not (thus finding a name
better than `DefaultPartitioner` that describes that strategy seems to be
difficult).

Feel free to start a VOTE :) I don't expect this KIP to be controversial.


-Matthias
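
To illustrate how small the change under discussion is, here is a minimal sketch of an "always round-robin" partitioner written against the existing public Partitioner interface; it simply ignores the key. The class name and the choice to prefer partitions with an available leader are my own assumptions, not the KIP's final design.

import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

/** Round-robins records across partitions regardless of the key. */
public class AlwaysRoundRobinPartitioner implements Partitioner {

    private final AtomicInteger counter = new AtomicInteger(0);

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        // Prefer partitions with a live leader; fall back to all partitions otherwise.
        List<PartitionInfo> available = cluster.availablePartitionsForTopic(topic);
        List<PartitionInfo> candidates = available.isEmpty() ? cluster.partitionsForTopic(topic) : available;
        int next = (counter.getAndIncrement() & 0x7fffffff) % candidates.size();
        return candidates.get(next).partition();
    }

    @Override
    public void configure(Map<String, ?> configs) { }

    @Override
    public void close() { }
}

Whether shipped with Kafka or kept as custom code, producers would enable such a class via the existing partitioner.class setting.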

On 9/2/18 7:57 AM, M. Manna wrote:
> Thanks for all your comments and taking it easy on me for my first KIP :)
> 
>  I am trying to check if it's okay for us to start a vote on this? As per
> some recent comment I'll change the name to RoundRobinPartitioner.
> 
> I'll need to put some effort in writing Scala tests etc. since I'm a novice
> with Scala.
> 
> Please let me know your thoughts, and I'll update the status accordingly
> (and start working on the JIRA once it's approved).
> 
> Regards,
> 
> On Fri, 31 Aug 2018, 10:22 M. Manna,  wrote:
> 
>> Yes I’m more than happy to change it to a more appropriate name.
>>
>> The issue with RoundRobinPartitioner is that the DefaultPartitioner already
>> has a round-robin strategy associated with it. But if the community doesn’t mind the name,
>> I don’t either.
>>
>> Thanks for reading the KIP btw.
>>
>> Regards,
>>
>> On Fri, 31 Aug 2018 at 05:47, Magesh Nandakumar 
>> wrote:
>>
>>> +1 for this. The only small suggestion would be to possibly call this
>>> RoundRobinPartitioner, which makes the intent obvious.
>>>
>>> On Thu, Aug 30, 2018 at 5:31 PM Stephen Powis 
>>> wrote:
>>>
 Neat, this would be super helpful! I submitted this ages ago:
 https://issues.apache.org/jira/browse/KAFKA-

 On Fri, Aug 31, 2018 at 5:04 AM, Satish Duggana <
>>> satish.dugg...@gmail.com>
 wrote:

> +including both dev and user mailing lists.
>
> Hi,
> Thanks for the KIP.
>
> "* For us, the message keys represent some metadata which we use to
 either
> ignore messages (if a loop-back to the sender), or log some
 information.*"
>
> Above statement was mentioned in the KIP about how key value is used.
>>> I
> guess the topic is not configured to be compacted and you do not want
>>> to
> have partitioning based on that key. IMHO, it qualifies more as a
>>> header
> than a key. What do you think about building records with a specific
 header
> and consumers to execute the logic whether to process or ignore the
> messages based on that header value.
>
> Thanks,
> Satish.
>
>
> On Fri, Aug 31, 2018 at 1:32 AM, Satish Duggana <
 satish.dugg...@gmail.com>
> wrote:
>
>> Hi,
>> Thanks for the KIP.
>>
>> "* For us, the message keys represent some metadata which we use to
>> either ignore messages (if a loop-back to the sender), or log some
>> information.*"
>>
>> Above statement was mentioned in the KIP about how key value is
>>> used. I
>> guess the topic is not configured to be compacted and you do not
>>> want
 to
>> have partitioning based on that key. IMHO, it qualifies more as a
 header
>> than a key. What do you think about building records with a specific
> header
>> and consumers to execute the logic whether to process or ignore the
>> messages based on that header value.
>>
>> Thanks,
>> Satish.
>>
>>
>> On Fri, Aug 31, 2018 at 12:02 AM, M. Manna 
>>> wrote:
>>
>>> Hi Harsha,
>>>
>>> thanks for reading the KIP.
>>>
>>> The intent is to use the DefaultPartitioner logic for round-robin
>>> selection
>>> of partition regardless of key type.
>>>
>>> Implementing Partitioner interface isn’t the issue here, you would
 have
> to
>>> do that anyway if  you are implementing your own. But we also want
 this
> to
>>> be part of formal codebase.
>>>
>>> Regards,
>>>
>>> On Thu, 30 Aug 2018 at 16:58, Harsha  wrote:
>>>
 Hi,
   Thanks for the KIP. I am trying to understand the intent of
 the
 KIP.  Can the use case you specified be achieved by
 implementing
>>> the
 Partitioner interface here?
 https://github.com/apache/kafka/blob/trunk/clients/src/main/
>>> java/org/apache/kafka/clients/producer/Partitioner.java#L28
 .
 Use your custom partitioner to be configured in your producer
 clients.

 Thanks,
 Harsha

 On Thu, Aug 30, 2018, at 1:45 AM, M. Manna wrote:
> Hello,
>
> I opened a 

Re: KIP-213 - Scalable/Usable Foreign-Key KTable joins - Rebooted.

2018-09-03 Thread Adam Bellemare
Hi Jan

Thank you for taking the time to look into my PR. I have updated it
accordingly along with the suggestions from John. Please note that I am by
no means an expert on Java, so I really do appreciate any Java-specific
feedback you may have. Do not worry about being overly verbose on it.

You are correct with regards to the highwater mark growing unbounded. One
option would be to implement the rocksDB TTL to expire records. I am open
to other options as well.

I have tried to detail the reasoning behind it in the KIP - I have added
additional comments and I hope that it is clearer now.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable#KIP-213Supportnon-keyjoininginKTable-MultipleRapidForeign-KeyValueChanges-Whyahighwatertableisrequired
.

Please keep in mind that there may be something about ordering guarantees
that I am not aware of. As far as I know, once you begin to operate on
events in parallel across different nodes within the processor API, there
are no ordering guarantees and everything is simply first-come,
first-served (processed). If this is not the case then I am unaware of that
fact.



Thanks

Adam





On Mon, Sep 3, 2018 at 8:38 AM, Jan Filipiak 
wrote:

> Finished my deeper scan on your approach.
> Most of the comments I put at the PR are minor code style things.
> One forward call seems to be a bug though, would be great if you could
> double check.
>
> the one problem I see is that the high watermark store grows unbounded.
> A key being deleted from the source table does not lead to deletion in the
> watermark store.
>
> I also don't quite grasp the concept why it's needed.  I think the whole
> offset part can go away?
> It seems to deal with node failures of some kind but everything should
> turn out okay without it?
>
> Best Jan
>
>
> On 01.09.2018 20:44, Guozhang Wang wrote:
>
>> Yes Adam, that makes sense.
>>
>> I think it may be better to have a working PR to review before we complete
>> the VOTE thread. In my previous experience a large feature like this is
>> most definitely going to miss some devils in the details in the design
>> and wiki discussion phases.
>>
>> That would unfortunately mean that your implementations may need to be
>> modified / updated along with the review and further KIP discussion. I can
>> understand this can be painful, but that may be the best option we can do
>> to avoid as much work to be wasted as possible.
>>
>>
>> Guozhang
>>
>>
>> On Wed, Aug 29, 2018 at 10:06 AM, Adam Bellemare <
>> adam.bellem...@gmail.com>
>> wrote:
>>
>> Hi Guozhang
>>>
>>> By workflow I mean just the overall process of how the KIP is
>>> implemented.
>>> Any ideas on the ways to reduce the topic count, materializations, if
>>> there
>>> is a better way to resolve out-of-order than a highwater mark table, if
>>> the
>>> design philosophy of “keep everything encapsulated within the join
>>> function” is appropriate, etc. I can implement the changes that John
>>> suggested, but if my overall workflow is not acceptable I would rather
>>> address that before making minor changes.
>>>
>>> If this requires a full candidate PR ready to go to prod then I can make
>>> those changes. Hope that clears things up.
>>>
>>> Thanks
>>>
>>> Adam
>>>
>>> On Aug 29, 2018, at 12:42 PM, Guozhang Wang  wrote:

 Hi Adam,

 What do you mean by "additional comments on the workflow.", do you mean

>>> to
>>>
 let other review your PR https://github.com/apache/kafka/pull/5527 ? Is

>>> is
>>>
 ready for reviews?


 Guozhang

 On Tue, Aug 28, 2018 at 5:00 AM, Adam Bellemare <

>>> adam.bellem...@gmail.com>
>>>
 wrote:

 Okay, I will implement John's suggestion of namespacing the external
> headers prior to processing, and then removing the namespacing prior to
> emitting. A potential future KIP could be to provide this namespacing
> automatically.
>
> I would also appreciate any other additional comments on the workflow.
>
 My
>>>
 goal is suss out agreement prior to moving to a vote.
>
> On Mon, Aug 27, 2018 at 3:19 PM, Guozhang Wang 
>>
> wrote:
>>>
 I like John's idea as well: for this KIP specifically as we do not
>>
> expect
>>>
 any other consumers to read the repartition topics externally, we can
>> slightly prefix the header to be safe, while keeping the additional
>>
> cost
>>>
 (note the header field is per-record, so any additional byte is
>>
> per-record
>
>> as well) low.
>>
>>
>> Guozhang
>>
>> On Tue, Aug 21, 2018 at 11:58 AM, Adam Bellemare <
>>
> adam.bellem...@gmail.com
>
>> wrote:
>>
>> Hi John
>>>
>>> That is an excellent idea. The header usage I propose would be
>>> limited
>>> entirely to internal topics, and this could very well be the solution
>>>
>> to
>
>> potential conflicts. If we do not 

Re: [DISCUSS] KIP-347: Enable batching in FindCoordinatorRequest

2018-09-03 Thread Yishun Guan
Hi Guozhang,

Yes, I totally agree with you. Like I said, I think it is overkill
for now to make so many changes for a small improvement.
And like you said, the only way to do this is the "ugly and dirty"
way; do you think we should still apply this improvement?

I added a new BuildToList() (method name pending) and a check
condition in doSend().
This is the KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-347%3A++Enable+batching+in+FindCoordinatorRequest

Let me know what you think.

Thanks,
Yishun
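
As a rough illustration of the shape being discussed here (Guozhang's buildMulti suggestion, quoted further down this thread), the sketch below keeps the usual build(short version) and adds a FindCoordinator-only method that returns a single batched request when the broker version supports batching, or one request per coordinator key otherwise. The class, the version cut-off, and the toy Request type are illustrative assumptions, not the real Kafka classes.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class FindCoordinatorRequestBuilderSketch {

    /** Toy stand-in for the real FindCoordinatorRequest. */
    public static class Request {
        final List<String> coordinatorKeys;
        Request(List<String> coordinatorKeys) { this.coordinatorKeys = coordinatorKeys; }
    }

    private final List<String> coordinatorKeys;

    public FindCoordinatorRequestBuilderSketch(List<String> coordinatorKeys) {
        this.coordinatorKeys = coordinatorKeys;
    }

    /** Existing style: a single request. */
    public Request build(short version) {
        return new Request(coordinatorKeys);
    }

    /** Proposed addition: may fan out into several requests for versions without batching support. */
    public List<Request> buildMulti(short version) {
        boolean batchingSupported = version >= 3;   // illustrative version cut-off, not from the KIP
        if (batchingSupported || coordinatorKeys.size() <= 1) {
            return Collections.singletonList(new Request(coordinatorKeys));
        }
        List<Request> requests = new ArrayList<>();
        for (String key : coordinatorKeys) {
            requests.add(new Request(Collections.singletonList(key)));
        }
        return requests;
    }
}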
On Sun, Sep 2, 2018 at 10:02 PM Guozhang Wang  wrote:
>
> Hi Yishun,
>
> I was actually not suggesting we should immediately make such dramatic
> change on the AbstractRequest APIs which will affect all requests types,
> just clarifying if it is your intent or not, since your code snippet in the
> KIP has "@Override"  :)
>
> I think an alternative way is to add such a function for
> FindCoordinator only, i.e. besides the overridden `public
> FindCoordinatorRequest build(short version)` we can have one more function
> (note the function name needs to be different, since Java type erasure would
> otherwise make it impossible to differentiate these two, but we can consider a
> better name: buildMulti is only for illustration)
>
> public List<FindCoordinatorRequest> buildMulti(short version)
>
>
> It does mean that we now need to special-handle
> FindCoordinatorRequestBuilder in all callers from other requests, which is
> also a bit "ugly and dirty", but the change scope may be smaller. General
> changes on the AbstractRequestBuilder could be delayed until we realize
> this is a common usage for some other requests in their newer versions as
> well.
>
>
> Guozhang
>
>
> On Sun, Sep 2, 2018 at 4:10 PM, Yishun Guan  wrote:
>
> > Hi Guozhang,
> >
> > Yes, you are right. I didn't notice T build() is bounded to  > AbstractRequest>.
> > I was originally thinking T could be an AbstractedRequest or List<>
> > but I was wrong. Now the return type has to change from T build() to
> > List build
> > where . As you mentioned,
> > this required a change for all the requests, probably need
> > a new KIP too, do you think. I will update this KIP accordingly first.
> >
> > However, do you see other benefits of changing the return type for build()?
> > The original improvement that we want is this:
> > https://issues.apache.org/jira/browse/KAFKA-6788.
> > It seems like we have to make a lot of structural changes for this
> > small improvement.
> > I think changing the return type might benefit the project in the future,
> > but I don't know the project enough to say so. I would love to keep
> > working on this,
> > but do you see all these changes are worth for this story,
> > and if not, is there another way out?
> >
> > Thanks,
> > Yishun
> > On Sat, Sep 1, 2018 at 11:04 AM Guozhang Wang  wrote:
> > >
> > > Hello Yishun,
> > >
> > > Thanks for the updated KIP. I think option 1), i.e. return multiple
> > > requests from build() call is okay. Just to clarify: you are going to
> > > change `AbstractRequest#build(short version)` signature, and hence all
> > > requests need to be updated accordingly, but only FindCoordinator may
> > > return multiple requests in the list, while all others will return a
> > > singleton list, right?
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Fri, Aug 31, 2018 at 10:51 AM, Yishun Guan  wrote:
> > >
> > > > @Guozhang Wang Could you review this again when you have time? Thanks!
> > > > -Yishun
> > > > On Wed, Aug 29, 2018 at 11:57 AM Yishun Guan 
> > wrote:
> > > > >
> > > > > Hi, because I have made some significant changes on this design, so I
> > > > > want to reopen the discussion on this KIP:
> > > > > https://cwiki.apache.org/confluence/x/CgZPBQ
> > > > >
> > > > > Thanks,
> > > > > Yishun
> > > > > On Thu, Aug 16, 2018 at 5:06 PM Yishun Guan 
> > wrote:
> > > > > >
> > > > > > I see! Thanks!
> > > > > >
> > > > > > On Thu, Aug 16, 2018, 4:35 PM Guozhang Wang 
> > > > wrote:
> > > > > >>
> > > > > >> It is not implemented, but should not be hard to do so (and again
> > you
> > > > do
> > > > > >> NOT have to do that in this KIP, I'm bringing this up so that you
> > can
> > > > help
> > > > > >> thinking about the process).
> > > > > >>
> > > > > >> Quoting from Colin's comment:
> > > > > >>
> > > > > >> "
> > > > > >> The pattern is that you would try to send a request for more than
> > one
> > > > > >> group, and then you would get an UnsupportedVersionException
> > (nothing
> > > > would
> > > > > >> be sent on the wire, this is purely internal to the code).
> > > > > >> Then your code would handle the UVE by creating requests with an
> > older
> > > > > >> version that only had one group each.
> > > > > >> "
> > > > > >>
> > > > > >>
> > > > > >> Guozhang
> > > > > >>
> > > > > >>
> > > > > >> On Wed, Aug 15, 2018 at 4:44 PM, Yishun Guan 
> > > > wrote:
> > > > > >>
> > > > > >> > Hi, I am looking into AdminClient.scala and AdminClient.java,
> > and
> > > > also
> > > > > >> > looking into 

Re: [DISCUSS] KIP 368: Allow SASL Connections to Periodically Re-Authenticate

2018-09-03 Thread Stanislav Kozlovski
Hi Ron,

I really like this KIP, it is very needed.
I am still looking through it but unfortunately I am not too familiar with
the networking code to evaluate your solution.

I'd like to ask what happens if re-authentication consistently fails. Would
we retry endlessly? Since the functionality for brokers to close
connections is outside the scope of this KIP, what effect would
success/failure in re-authentication have? I think it's worth noting in the
KIP

I also think the rejected alternative of initiating a new connection should
stay rejected. I am not aware of anything currently limiting the client to
connect to the same broker, but I think it would be best if we kept Kafka's
options open (e.g addition of a load balancer in the future) and not
introduce additional client-broker statefulness.

Thanks,
Stanislav

On Tue, Aug 28, 2018 at 5:28 PM Ron Dagostino  wrote:

> Hi everyone. I created KIP 368: Allow SASL Connections to Periodically
> Re-Authenticate
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-368%3A+Allow+SASL+Connections+to+Periodically+Re-Authenticate
> >
> (
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-368%3A+Allow+SASL+Connections+to+Periodically+Re-Authenticate
> ).
> The motivation for this KIP is as follows:
>
> The adoption of KIP-255: OAuth Authentication via SASL/OAUTHBEARER
> 
> in
> release 2.0.0 creates the possibility of using information in the bearer
> token to make authorization decisions.  Unfortunately, however, Kafka
> connections are long-lived, so there is no ability to change the bearer
> token associated with a particular connection.  Allowing SASL connections
> to periodically re-authenticate would resolve this.  In addition to this
> motivation there are two others that are security-related.  First, to
> eliminate access to Kafka for connected clients, the current requirement is
> to remove all authorizations (i.e. remove all ACLs).  This is necessary
> because of the long-lived nature of the connections.  It is operationally
> simpler to shut off access at the point of authentication, and with the
> release of KIP-86: Configurable SASL Callback Handlers
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-86%3A+Configurable+SASL+callback+handlers
> >
> it
> is going to become more and more likely that installations will
> authenticate users against external directories (e.g. via LDAP).  The
> ability to stop Kafka access by simply disabling an account in an LDAP
> directory (for example) is desirable.  The second motivating factor for
> re-authentication related to security is that the use of short-lived tokens
> is a common OAuth security recommendation, but issuing a short-lived token
> to a Kafka client (or a broker when OAUTHBEARER is the inter-broker
> protocol) currently has no benefit because once a client is connected to a
> broker the client is never challenged again and the connection may remain
> intact beyond the token expiration time (and may remain intact indefinitely
> under perfect circumstances).  This KIP proposes adding the ability for
> clients (and brokers when OAUTHBEARER is the inter-broker protocol) to
> re-authenticate their connections to brokers and have the new bearer token
> appear on their session rather than the old one.
>
> The description of this KIP is actually quite straightforward from a
> functionality perspective; from an implementation perspective, though, the
> KIP is not so straightforward, so it includes a pull request with a
> proposed implementation.  See https://github.com/apache/kafka/pull/5582.
>
> Ron
>


-- 
Best,
Stanislav


Re: [VOTE] KIP-357: Add support to list ACLs per principal

2018-09-03 Thread Mickael Maison
+1 (non binding)
On Mon, Sep 3, 2018 at 3:14 PM Manikumar  wrote:
>
> bump up!  waiting for  2 more binding votes!
>
> On Tue, Aug 28, 2018 at 7:36 AM Satish Duggana 
> wrote:
>
> > +1 (non-binding)
> >
> > On Tue, Aug 28, 2018 at 2:59 AM, Harsha  wrote:
> >
> > > +1 (binding)
> > >
> > > -Harsha
> > >
> > > On Mon, Aug 27, 2018, at 12:46 PM, Jakub Scholz wrote:
> > > > +1 (non-binding)
> > > >
> > > > On Mon, Aug 27, 2018 at 6:24 PM Manikumar 
> > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I would like to start voting on KIP-357 which allows to list ACLs per
> > > > > principal using AclCommand (kafka-acls.sh)
> > > > >
> > > > > KIP:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 357%3A++Add+support+to+list+ACLs+per+principal
> > > > >
> > > > > Discussion Thread:
> > > > >
> > > > > https://lists.apache.org/thread.html/dc7f6005845a372a0a48a40872a32d
> > > 9ece03807a4fb1bb89d3645afb@%3Cdev.kafka.apache.org%3E
> > > > >
> > > > > Thanks,
> > > > > Manikumar
> > > > >
> > >
> >


Re: [VOTE] KIP-365: Materialized, Serialized, Joined, Consumed and Produced with implicit Serde

2018-09-03 Thread Damian Guy
+1

On Sun, 2 Sep 2018 at 15:20 Matthias J. Sax  wrote:

> +1 (binding)
>
> On 9/1/18 2:40 PM, Guozhang Wang wrote:
> > +1 (binding).
> >
> > On Mon, Aug 27, 2018 at 5:20 PM, Dongjin Lee  wrote:
> >
> >> +1 (non-binding)
> >>
> >> On Tue, Aug 28, 2018 at 8:53 AM Bill Bejeck  wrote:
> >>
> >>> +1
> >>>
> >>> -Bill
> >>>
> >>> On Mon, Aug 27, 2018 at 3:24 PM Ted Yu  wrote:
> >>>
>  +1
> 
>  On Mon, Aug 27, 2018 at 12:18 PM John Roesler 
> >> wrote:
> 
> > +1 (non-binding)
> >
> > On Sat, Aug 25, 2018 at 1:16 PM Joan Goyeau  wrote:
> >
> >> Hi,
> >>
> >> We want to make sure that we always have a serde for all
> >>> Materialized,
> >> Serialized, Joined, Consumed and Produced.
> >> For that we can make use of the implicit parameters in Scala.
> >>
> >> KIP:
> >>
> >>
> >
> 
> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>
> 365%3A+Materialized%2C+Serialized%2C+Joined%2C+Consumed+and+Produced+with+
> >> implicit+Serde
> >>
> >> Github PR: https://github.com/apache/kafka/pull/5551
> >>
> >> Please make your votes.
> >> Thanks
> >>
> >
> 
> >>>
> >>> --
> >>> *Dongjin Lee*
> >>>
> >>> *A hitchhiker in the mathematical world.*
> >>>
> >>> *github:  github.com/dongjinleekr
> >>> linkedin: kr.linkedin.com/in/
> >> dongjinleekr
> >>> slideshare:
> www.slideshare.net/
> >> dongjinleekr
> >>> *
> >>>
> >>
> >
> >
> >
>
>


Re: [VOTE] KIP-357: Add support to list ACLs per principal

2018-09-03 Thread Manikumar
bump up!  waiting for  2 more binding votes!

On Tue, Aug 28, 2018 at 7:36 AM Satish Duggana 
wrote:

> +1 (non-binding)
>
> On Tue, Aug 28, 2018 at 2:59 AM, Harsha  wrote:
>
> > +1 (binding)
> >
> > -Harsha
> >
> > On Mon, Aug 27, 2018, at 12:46 PM, Jakub Scholz wrote:
> > > +1 (non-binding)
> > >
> > > On Mon, Aug 27, 2018 at 6:24 PM Manikumar 
> > wrote:
> > >
> > > > Hi All,
> > > >
> > > > I would like to start voting on KIP-357 which allows to list ACLs per
> > > > principal using AclCommand (kafka-acls.sh)
> > > >
> > > > KIP:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 357%3A++Add+support+to+list+ACLs+per+principal
> > > >
> > > > Discussion Thread:
> > > >
> > > > https://lists.apache.org/thread.html/dc7f6005845a372a0a48a40872a32d
> > 9ece03807a4fb1bb89d3645afb@%3Cdev.kafka.apache.org%3E
> > > >
> > > > Thanks,
> > > > Manikumar
> > > >
> >
>


Re: KAFKA-1194

2018-09-03 Thread Stephane Maarek
Hi Manna

If you check the first email in this thread, I mentioned that I had already tested
the fix by Dong and that it unfortunately didn't work. I've been building
from trunk.

I've instead opened a PR based on various other people's fixes; it
passed Jenkins and also fixes the Windows issues encountered in KAFKA-1194:
https://github.com/apache/kafka/pull/5603
Tested for topic deletion

Would love your pair of eyes on this too and testing if you can

Cheers
Stephane

On Mon., 3 Sep. 2018, 11:42 am M. Manna,  wrote:

> Hi Stephan,
>
> https://jira.apache.org/jira/browse/KAFKA-7278
>
> This might (according to Dong) be a better fix for the issue. Could you
> kindly apply the patch to your Kafka (assuming that your version is
> reasonably close to the latest), and see if that helps?
>
> Regards,
>
> On Mon, 3 Sep 2018 at 02:27, Ismael Juma  wrote:
>
> > Hi Stephane,
> >
> > Does https://github.com/apache/kafka/pull/4947/files fix it for you?
> >
> > Ismael
> >
> > On Sun, Sep 2, 2018 at 11:25 AM Stephane Maarek <
> > steph...@simplemachines.com.au> wrote:
> >
> > > Hi,
> > >
> > > I've seen Dong has done some work on
> > > https://issues.apache.org/jira/browse/KAFKA-7278 which is said from
> the
> > > comments that it could have possibly fixed
> > > https://issues.apache.org/jira/browse/KAFKA-1194.
> > >
> > > I tested and it is unfortunately not the case...
> > > I have posted in KAFKA-1194 a way to reproduce the issue in a
> > deterministic
> > > way:
> > > ===
> > > C:\kafka_2.11-2.1.0-SNAPSHOT>bin\windows\kafka-server-start.bat
> > > config\server.properties
> > >
> > > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-topics.bat --zookeeper
> 127.0.0.1:2181
> > > --topic second_topic --create --partitions 3 --replication-factor 1
> > >
> > > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-console-producer.bat --broker-list
> > > 127.0.0.1:9092 --topic second_topic
> > > >hello
> > > >world
> > > >hello
> > > >Terminate batch job (Y/N)? Y
> > >
> > > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-topics.bat --zookeeper
> 127.0.0.1:2181
> > > --topic second_topic --delete
> > > 
> > >
> > >
> > > I usually wouldn't push for any issues for windows to be fixed, but
> this
> > > one is triggered from basic CLI usage and triggers a full broker
> failure
> > > that can't be recovered.
> > > It actually completely destroys the faith of Kafka newcomers learning
> > using
> > > Windows (which are 50% of the users I have in my course).
> > >
> > > I'm happy to help out in any way I can to test any patch.
> > >
> > > Thanks
> > > Stephane
> > >
> >
>


Re: [VOTE] KIP-358: Migrate Streams API to Duration instead of long ms times

2018-09-03 Thread Nikolay Izhikov
Dear committers,

Please vote on the KIP.
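
For anyone skimming the vote: the change being voted on replaces raw millisecond long parameters in the Streams API with java.time.Duration. A small before-and-after sketch, using TimeWindows as one example (the exact set of affected methods is listed in the KIP, not here):

import java.time.Duration;

import org.apache.kafka.streams.kstream.TimeWindows;

public class DurationApiSketch {
    public static void main(String[] args) {
        // Old style: raw millisecond longs, easy to get wrong.
        TimeWindows oldStyle = TimeWindows.of(300_000L);

        // Style proposed by KIP-358: java.time.Duration overloads.
        TimeWindows newStyle = TimeWindows.of(Duration.ofMinutes(5))
                                          .grace(Duration.ofMinutes(1));
    }
}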

On Fri, 31/08/2018 at 12:05 -0500, John Roesler wrote:
> Hi Nikolay,
> 
> You can start a PR any time, but we cannot merge it (and probably won't do
> serious reviews) until after the KIP is voted and approved.
> 
> Sometimes people start a PR during discussion just to help provide more
> context, but it's not required (and can also be distracting because the KIP
> discussion should avoid implementation details).
> 
> Let's wait one more day for any other comments and plan to start the vote
> on Monday if there are no other debates.
> 
> Once you start the vote, you have to leave it up for at least 72 hours, and
> it requires 3 binding votes to pass. Only Kafka Committers have binding
> votes (https://kafka.apache.org/committers).
> 
> Thanks,
> -John
> 
> On Fri, Aug 31, 2018 at 11:09 AM Bill Bejeck  wrote:
> 
> > Hi Nickolay,
> > 
> > Thanks for the clarification.
> > 
> > -Bill
> > 
> > On Fri, Aug 31, 2018 at 11:59 AM Nikolay Izhikov 
> > wrote:
> > 
> > > Hello, John.
> > > 
> > > This is my first KIP, so, please, help me with kafka development process.
> > > 
> > > Should I start to work on PR now? Or should I wait for a "+1" from
> > > commiters?
> > > 
> On Fri, 31/08/2018 at 10:33 -0500, John Roesler wrote:
> > > > I see. I guess that once we are in the PR-reviewing phase, we'll be in
> > 
> > a
> > > > better position to see what else can/should be done, and we can talk
> > > 
> > > about
> > > > follow-on work at that time.
> > > > 
> > > > Thanks for the clarification,
> > > > -John
> > > > 
> > > > On Fri, Aug 31, 2018 at 1:19 AM Nikolay Izhikov 
> > > 
> > > wrote:
> > > > 
> > > > > Hello, Bill
> > > > > 
> > > > > > In the "Proposed Changes" section, there is "Try to reduce the
> > > > > 
> > > > > visibility of methods in next tickets" does that mean eventual
> > > 
> > > deprecation
> > > > > and removal?
> > > > > 
> > > > > 1. Some methods will become deprecated. I think they will be removed
> > 
> > in
> > > > > the future.
> > > > > You can find list of deprecated methods in KIP.
> > > > > 
> > > > > 2. Some internal methods can't be deprecated or hid from the user for
> > > 
> > > now.
> > > > > I was trying to say that we should research possibility to reduce
> > > > > visibility of *internal* methods that are *public* now.
> > > > > That kind of changes is out of the scope of current KIP, so we have
> > 
> > to
> > > do
> > > > > it in the next tickets.
> > > > > 
> > > > > I don't expect that internal methods will be removed.
> > > > > 
> > > > > On Thu, 30/08/2018 at 18:59 -0400, Bill Bejeck wrote:
> > > > > > Sorry for chiming in late, there was a lot of detail to catch up
> > 
> > on.
> > > > > > 
> > > > > > Overall I'm +1 in the KIP.  But I do have one question about the
> > 
> > KIP
> > > in
> > > > > > regards to Matthias's comments about defining dual use.
> > > > > > 
> > > > > > In the "Proposed Changes" section, there is "Try to reduce the
> > > 
> > > visibility
> > > > > > of methods in next tickets" does that mean eventual deprecation and
> > > > > 
> > > > > removal?
> > > > > > I thought we were aiming to keep the dual use methods? Or does that
> > > 
> > > imply
> > > > > > we'll strive for more clear delineation between DSL and internal
> > 
> > use?
> > > > > > 
> > > > > > Thanks,
> > > > > > Bill
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > On Thu, Aug 30, 2018 at 5:59 PM Nikolay Izhikov <
> > 
> > nizhi...@apache.org
> > > > 
> > > > > 
> > > > > wrote:
> > > > > > 
> > > > > > > John, thank you.
> > > > > > > 
> > > > > > > I've updated KIP.
> > > > > > > 
> > > > > > > 
> > > > > > > Dear commiters, please take a look and share your opinion.
> > > > > > > 
> > > > > > > > On Thu, 30/08/2018 at 14:58 -0500, John Roesler wrote:
> > > > > > > > Oh! I missed one minor thing: UnlimitedWindows doesn't need to
> > > 
> > > set
> > > > > 
> > > > > grace
> > > > > > > > (it currently does not either).
> > > > > > > > 
> > > > > > > > Otherwise, it looks good to me!
> > > > > > > > 
> > > > > > > > Thanks so much,
> > > > > > > > -John
> > > > > > > > 
> > > > > > > > On Thu, Aug 30, 2018 at 5:30 AM Nikolay Izhikov <
> > > 
> > > nizhi...@apache.org
> > > > > > > 
> > > > > > > wrote:
> > > > > > > > 
> > > > > > > > > Hello, John.
> > > > > > > > > 
> > > > > > > > > I've updated the KIP according to your comments.
> > > > > > > > > Please, take a look.
> > > > > > > > > 
> > > > > > > > > Are we ready to vote now?
> > > > > > > > > 
> > > > > > > > > > On Wed, 29/08/2018 at 14:51 -0500, John Roesler wrote:
> > > > > > > > > > Hey Nikolay, sorry for the silence. I'm taking another look
> > > 
> > > at
> > > > > 
> > > > > the
> > > > > > > 
> > > > > > > KIP
> > > > > > > > > > before voting...
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > >1. I think the Window constructor should actually be
> > > > > 
> > > > > protected. I
> > > > > > > > > 
> > > > > > > > > don't
> > > > > > > > > >know if we need a constructor 

Re: KAFKA-1194

2018-09-03 Thread Ismael Juma
KAFKA-7278 doesn't fix the Windows issues.

Ismael

On Mon, 3 Sep 2018, 02:42 M. Manna,  wrote:

> Hi Stephan,
>
> https://jira.apache.org/jira/browse/KAFKA-7278
>
> This might (according to Dong) be a better fix for the issue. Could you
> kindly apply the patch to your Kafka (assuming that your version is
> reasonably close to the latest), and see if that helps?
>
> Regards,
>
> On Mon, 3 Sep 2018 at 02:27, Ismael Juma  wrote:
>
> > Hi Stephane,
> >
> > Does https://github.com/apache/kafka/pull/4947/files fix it for you?
> >
> > Ismael
> >
> > On Sun, Sep 2, 2018 at 11:25 AM Stephane Maarek <
> > steph...@simplemachines.com.au> wrote:
> >
> > > Hi,
> > >
> > > I've seen Dong has done some work on
> > > https://issues.apache.org/jira/browse/KAFKA-7278 which is said from
> the
> > > comments that it could have possibly fixed
> > > https://issues.apache.org/jira/browse/KAFKA-1194.
> > >
> > > I tested and it is unfortunately not the case...
> > > I have posted in KAFKA-1194 a way to reproduce the issue in a
> > deterministic
> > > way:
> > > ===
> > > C:\kafka_2.11-2.1.0-SNAPSHOT>bin\windows\kafka-server-start.bat
> > > config\server.properties
> > >
> > > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-topics.bat --zookeeper
> 127.0.0.1:2181
> > > --topic second_topic --create --partitions 3 --replication-factor 1
> > >
> > > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-console-producer.bat --broker-list
> > > 127.0.0.1:9092 --topic second_topic
> > > >hello
> > > >world
> > > >hello
> > > >Terminate batch job (Y/N)? Y
> > >
> > > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-topics.bat --zookeeper
> 127.0.0.1:2181
> > > --topic second_topic --delete
> > > 
> > >
> > >
> > > I usually wouldn't push for any issues for windows to be fixed, but
> this
> > > one is triggered from basic CLI usage and triggers a full broker
> failure
> > > that can't be recovered.
> > > It actually completely destroys the faith of Kafka newcomers learning
> > using
> > > Windows (which are 50% of the users I have in my course).
> > >
> > > I'm happy to help out in any way I can to test any patch.
> > >
> > > Thanks
> > > Stephane
> > >
> >
>


Re: [VOTE] KIP-336: Consolidate ExtendedSerializer/Serializer and ExtendedDeserializer/Deserializer

2018-09-03 Thread Viktor Somogyi-Vass
Apologies, miscounted the binding votes but the good news is that we need
only one.

Cheers,
Viktor
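
For context on what "option 1" in the quoted discussion below amounts to: the headers-aware serialize method becomes a default method that ignores the headers and delegates to the classic two-argument method, so existing Serializer implementations keep working (and keep binary compatibility) without any change. The interface below is a simplified stand-alone sketch of that pattern, not the actual interface from the Kafka source tree.

import java.io.Closeable;
import java.util.Map;

import org.apache.kafka.common.header.Headers;

public interface SketchSerializer<T> extends Closeable {

    default void configure(Map<String, ?> configs, boolean isKey) { }

    /** The classic method every existing serializer already implements. */
    byte[] serialize(String topic, T data);

    /** New default method: headers are available but ignored unless overridden. */
    default byte[] serialize(String topic, Headers headers, T data) {
        return serialize(topic, data);
    }

    @Override
    default void close() { }
}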

On Mon, Sep 3, 2018 at 11:09 AM Viktor Somogyi-Vass 
wrote:

> Hi All,
>
> I have completed my binary compatibility testing on this as well. Created
> a small script & example to make this reproducible:
> https://gist.github.com/viktorsomogyi/391defca73e7a46a2c6a40bc699231d4 .
>
> Also, thanks for the votes so far; we're still awaiting 2 binding votes.
>
> Cheers,
> Viktor
>
> On Thu, Aug 30, 2018 at 5:09 PM Harsha  wrote:
>
>> +1.
>> Thanks,
>> Harsha
>>
>> On Thu, Aug 30, 2018, at 4:19 AM, Attila Sasvári wrote:
>> > Thanks for the KIP and the updates Viktor!
>> >
>> > +1 (non-binding)
>> >
>> >
>> >
>> > On Wed, Aug 29, 2018 at 10:44 AM Manikumar 
>> > wrote:
>> >
>> > > +1 (non-binding)
>> > >
>> > > Thanks for the KIP.
>> > >
>> > > On Wed, Aug 29, 2018 at 1:41 AM Jason Gustafson 
>> > > wrote:
>> > >
>> > > > +1 Thanks for the updates.
>> > > >
>> > > > On Tue, Aug 28, 2018 at 1:15 AM, Viktor Somogyi-Vass <
>> > > > viktorsomo...@gmail.com> wrote:
>> > > >
>> > > > > Sure, I've added it. I'll also do the testing today.
>> > > > >
>> > > > > On Mon, Aug 27, 2018 at 5:03 PM Ismael Juma 
>> wrote:
>> > > > >
>> > > > > > Thanks Viktor. I think it would be good to verify that existing
>> > > > > > ExtendedSerializer implementations work without recompiling.
>> This
>> > > could
>> > > > > be
>> > > > > > done as a manual test. If you agree, I suggest adding it to the
>> > > testing
>> > > > > > plan section.
>> > > > > >
>> > > > > > Ismael
>> > > > > >
>> > > > > > On Mon, Aug 27, 2018 at 7:57 AM Viktor Somogyi-Vass <
>> > > > > > viktorsomo...@gmail.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Thanks guys, I've updated my KIP with this info (so to keep
>> > > solution
>> > > > > #1).
>> > > > > > > If you find it good enough, please vote as well or let me
>> know if
>> > > you
>> > > > > > think
>> > > > > > > something is missing.
>> > > > > > >
>> > > > > > > On Sat, Aug 25, 2018 at 1:14 AM Ismael Juma <
>> ism...@juma.me.uk>
>> > > > wrote:
>> > > > > > >
>> > > > > > > > I'm OK with 1 too. It makes me a bit sad that we don't have
>> a
>> > > path
>> > > > > for
>> > > > > > > > removing the method without headers, but it seems like the
>> > > simplest
>> > > > > and
>> > > > > > > > least confusing option (I am assuming that headers are not
>> needed
>> > > > in
>> > > > > > the
>> > > > > > > > serializers in the common case).
>> > > > > > > >
>> > > > > > > > Ismael
>> > > > > > > >
>> > > > > > > > On Fri, Aug 24, 2018 at 2:42 PM Jason Gustafson <
>> > > > ja...@confluent.io>
>> > > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > Hey Viktor,
>> > > > > > > > >
>> > > > > > > > > Good summary. I agree that option 1) seems like the
>> simplest
>> > > > choice
>> > > > > > > and,
>> > > > > > > > as
>> > > > > > > > > you note, we can always add the default implementation
>> later.
>> > > > I'll
>> > > > > > > leave
>> > > > > > > > > Ismael to make a case for the circular forwarding
>> approach ;)
>> > > > > > > > >
>> > > > > > > > > -Jason
>> > > > > > > > >
>> > > > > > > > > On Fri, Aug 24, 2018 at 3:02 AM, Viktor Somogyi-Vass <
>> > > > > > > > > viktorsomo...@gmail.com> wrote:
>> > > > > > > > >
>> > > > > > > > > > I think in the first draft I didn't provide an
>> implementation
>> > > > for
>> > > > > > > them
>> > > > > > > > as
>> > > > > > > > > > it seemed very simple and straightforward. I looked up a
>> > > couple
>> > > > > of
>> > > > > > > > > > implementations of the ExtendedSerializers on github
>> and the
>> > > > > > general
>> > > > > > > > > > behavior seems to be that they delegate to the 2
>> argument
>> > > > > > > (headerless)
>> > > > > > > > > > method:
>> > > > > > > > > >
>> > > > > > > > > >
>> https://github.com/khoitnm/practice-spring-kafka-grpc/blob/
>> > > > > > > > > >
>> a6fc9b3395762c4889807baedd822f4653d5dcdd/kafka-common/src/
>> > > > > > > > > > main/java/org/tnmk/common/kafka/serialization/protobuf/
>> > > > > > > > > > ProtobufSerializer.java
>> > > > > > > > > >
>> > > > > > >
>> > > >
>> https://github.com/hong-zhu/nxgen/blob/5cf1427d4e1a8f5c7fab47955af32a
>> > > > > > > > > > 0d4f4873af/nxgen-kafka-client/src/main/java/nxgen/kafka/
>> > > > > > > > > > client/event/serdes/EventSerializer.java
>> > > > > > > > > > https://github.com/jerry-jx/spring-kafka/blob/
>> > > > > > > > > >
>> ac323ec5b8b9a0ca975db2f7322ff6878fce481a/spring-kafka/src/
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > > main/java/org/springframework/kafka/support/serializer/
>> > > > > JsonSerializer.java
>> > > > > > > > > > https://github.com/enzobonggio/nonblocking-kafka/blob/
>> > > > > > > > > >
>> bc1a379b2d9593b46cf9604063bc5b38e2785d19/src/main/java/com/
>> > > > > > > > > > example/kafka/producer/CustomJsonSerializer.java
>> > > > > > > > > >
>> > > > > > > > > > Of course 4 example is not representative but it shows
>> that
>> 

Re: KIP-213 - Scalable/Usable Foreign-Key KTable joins - Rebooted.

2018-09-03 Thread Jan Filipiak

I finished my deeper scan of your approach.
Most of the comments I left on the PR are minor code-style things.
One forward call seems to be a bug, though; it would be great if you could
double-check.


The one problem I see is that the high-watermark store grows unbounded:
a key being deleted from the source table does not lead to a deletion in
the watermark store.


I also don't quite grasp why it's needed. I think the whole offset part
can go away? It seems to deal with node failures of some kind, but
everything should turn out okay without it?


Best Jan


On 01.09.2018 20:44, Guozhang Wang wrote:

Yes Adam, that makes sense.

I think it may be better to have a working PR to review before we complete
the VOTE thread. In my previous experience, a large feature like this is
most definitely going to miss some devils in the details during the design
and wiki discussion phases.

That would unfortunately mean that your implementation may need to be
modified / updated along with the review and further KIP discussion. I can
understand this can be painful, but it may be the best option we have for
avoiding wasted work.


Guozhang


On Wed, Aug 29, 2018 at 10:06 AM, Adam Bellemare 
wrote:


Hi Guozhang

By workflow I mean just the overall process of how the KIP is implemented.
Any ideas on ways to reduce the topic count or the materializations,
whether there is a better way to resolve out-of-order data than a
high-watermark table, and whether the design philosophy of “keep everything
encapsulated within the join function” is appropriate, etc. I can implement
the changes that John suggested, but if my overall workflow is not
acceptable I would rather address that before making minor changes.

If this requires a full candidate PR that is ready to go to prod, then I
can make those changes. Hope that clears things up.

Thanks

Adam


On Aug 29, 2018, at 12:42 PM, Guozhang Wang  wrote:

Hi Adam,

What do you mean by "additional comments on the workflow"? Do you mean to
let others review your PR https://github.com/apache/kafka/pull/5527 ? Is it
ready for review?


Guozhang

On Tue, Aug 28, 2018 at 5:00 AM, Adam Bellemare <

adam.bellem...@gmail.com>

wrote:


Okay, I will implement John's suggestion of namespacing the external
headers prior to processing, and then removing the namespacing prior to
emitting. A potential future KIP could be to provide this namespacing
automatically.

I would also appreciate any additional comments on the workflow. My goal
is to suss out agreement prior to moving to a vote.


On Mon, Aug 27, 2018 at 3:19 PM, Guozhang Wang 

wrote:

I like John's idea as well: for this KIP specifically, since we do not
expect any other consumers to read the repartition topics externally, we
can add a small prefix to the headers to be safe, while keeping the
additional cost low (note that the header field is per-record, so any
additional byte is a per-record cost as well).


Guozhang

On Tue, Aug 21, 2018 at 11:58 AM, Adam Bellemare <

adam.bellem...@gmail.com

wrote:


Hi John

That is an excellent idea. The header usage I propose would be limited
entirely to internal topics, and this could very well be the solution to
potential conflicts. If we do not officially reserve a prefix such as "__",
then I think this would be the safest idea, as it would entirely avoid any
accidents (for example, if a company is already using its own "__" prefix
for other reasons).

Thanks

Adam


On Tue, Aug 21, 2018 at 2:24 PM, John Roesler 

wrote:

Just a quick thought regarding headers:

I think there are no absolutely safe ways to avoid conflicts, but we can
still consider using some name patterns to reduce the likelihood as much as
possible, e.g. consider something like the internal topic naming:
"__internal_[name]"?

I think there is a safe way to avoid conflicts, since these headers are
only needed in internal topics (I think):
For internal and changelog topics, we can namespace all headers:
* user-defined headers are namespaced as "external." + headerKey
* internal headers are namespaced as "internal." + headerKey

This is a lot of characters, so we could use a sigil instead (e.g., "_" for
internal, "~" for external).

We simply apply the namespacing when we read user headers from external
topics into the topology and then de-namespace them before we emit them to
an external topic (via "to" or "through").
Now, it is not possible to collide with user-defined headers.
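
To make that concrete, here is a rough sketch of the two steps. The helper
class and prefix constants are invented purely for illustration; this is not
actual Kafka Streams code and not part of the KIP:

import org.apache.kafka.common.header.Header;
import org.apache.kafka.common.header.Headers;
import org.apache.kafka.common.header.internals.RecordHeader;
import org.apache.kafka.common.header.internals.RecordHeaders;

// Hypothetical helper: user headers get a namespace prefix when a record
// enters the topology, and the prefix is stripped (and internal headers
// dropped) before the record is written to an external topic.
public final class HeaderNamespacer {

    private static final String EXTERNAL_PREFIX = "external.";
    private static final String INTERNAL_PREFIX = "internal.";

    // Applied when reading user headers from an external source topic.
    public static Headers namespaceUserHeaders(Headers original) {
        Headers namespaced = new RecordHeaders();
        for (Header h : original) {
            namespaced.add(new RecordHeader(EXTERNAL_PREFIX + h.key(), h.value()));
        }
        return namespaced;
    }

    // Used internally by the join to add bookkeeping headers without any
    // risk of colliding with user-defined header keys.
    public static void addInternalHeader(Headers headers, String key, byte[] value) {
        headers.add(new RecordHeader(INTERNAL_PREFIX + key, value));
    }

    // Applied right before emitting to an external topic via to()/through().
    public static Headers denamespace(Headers namespaced) {
        Headers restored = new RecordHeaders();
        for (Header h : namespaced) {
            if (h.key().startsWith(EXTERNAL_PREFIX)) {
                restored.add(new RecordHeader(
                        h.key().substring(EXTERNAL_PREFIX.length()), h.value()));
            }
            // internal.* headers are intentionally not forwarded
        }
        return restored;
    }
}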

That said, I'd also be fine with just reserving "__" as a header prefix
and not worrying about collisions.

Thanks for the KIP,
-John

On Tue, Aug 21, 2018 at 9:48 AM Jan Filipiak <

jan.filip...@trivago.com

wrote:


Still haven't completely grasped it.
Sorry, will read more.


On 17.08.2018 21:23, Jan Filipiak wrote:
Cool stuff.

I made some random remarks. Did not touch the core of the algorithm yet.
Will do Monday, 100%.

I don't see Interactive Queries :) I like that!





On 17.08.2018 20:28, Adam Bellemare wrote:
I have submitted a PR with my code against 

Could not find or load main class kafka.Kafka (Eclipse Startup)

2018-09-03 Thread M. Manna
Hello,

I completed the setup according to the cwiki instructions:
https://cwiki.apache.org/confluence/display/KAFKA/Developer+Setup

I have also built all the sources and jars using `gradlew jarAll`

After that, I created a Scala application configuration in Eclipse to
debug. I provided the log4j config file and the server.properties file
locations as arguments. I also included the projects "Clients", "Tools",
and "Core" in the classpath. But the broker couldn't be started in debug
mode; I always get the following error:

Error: Could not find or load main class kafka.Kafka

Could someone guide me toward fixing this issue? I was able to run this in
the past, but for some reason it has stopped working and I am struggling to
get it back up. I hope I've followed the correct process above.

Regards,


Re: [DISCUSS] KIP-349 Priorities for Source Topics

2018-09-03 Thread Jan Filipiak




On 30.08.2018 15:17, Matthias J. Sax wrote:

Nick,

Would be good to understand the difference between the current approach
and what Jan suggested. If we believe that the current proposal is too
limited in functionality and also hard to extend later on, it might make
sense to work on a more generic solution from the beginning. On the
other hand, if we can extend the current proposal easily, I see no
reason not to build this incrementally.
The difference is that, from my POV, topic priorities are a special
implementation of a MessageChooser.




@Jan: Can you summarize the differences from your point of view? You also
linked to KIP-353 in the VOTE thread -- how does it relate to this KIP?
The relationship is rather obvious to me: KIP-353 can be implemented as
a MessageChooser.


-Matthias
To add more confusion: why not also make "pause partition", as in Connect,
part of the MessageChooser interface?
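
Just to illustrate what I mean, a chooser in the spirit of Samza's
MessageChooser could look roughly like this. This is purely a sketch with
made-up names; no such interface exists in the Kafka consumer today. Topic
priorities (KIP-349), KIP-353-style ordering, and partition pausing would
all be expressible as implementations of it:

import java.util.Collection;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.TopicPartition;

// Hypothetical pluggable interface: the consumer feeds it fetched records
// and asks it which record to hand to the application next.
public interface MessageChooser<K, V> {

    // Offer a fetched record to the chooser (one per partition at a time).
    void update(TopicPartition partition, ConsumerRecord<K, V> record);

    // Pick the next record the application should see, or null if none.
    ConsumerRecord<K, V> choose();

    // Partitions the chooser currently does not want any records from
    // (this is where "pause partition", as in Connect, would fit in).
    Collection<TopicPartition> pausedPartitions();
}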





On 8/12/18 11:15 PM, Matt Farmer wrote:

Ah, sorry, yes it does.

On Sun, Aug 12, 2018 at 2:58 PM  wrote:


Does this clarify?
--
   Nick

On Aug 9, 2018, at 7:44 PM, n...@afshartous.com wrote:

Since there are questions, I changed the heading from VOTE to DISCUSS.

On Aug 8, 2018, at 9:09 PM, Matt Farmer  wrote:

Is it worth spelling out explicitly what the behavior is when two topics
have the same priority? I'm a bit fuzzy on how we choose what topics to
consume from right now, if I'm being honest, so it could be useful to
outline the current behavior in the background and to spell out how that
would change (or if it would change) when two topics are given the same
priority.


I added a note to the KIP’s Compatibility section to clarify that the
current behavior would not change, in order to preserve backwards
compatibility.

Also, how does this play with max.poll.records? Does the consumer read from

all the topics in priority order until we've hit the number of records or
the poll timeout? Or does it immediately return the high priority records
without pulling low priority records?


My own interpretation would be to read from all the topics in priority
order as the consumer is subscribed to multiple topics.
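
As a rough illustration of that interpretation (not proposed API and not
actual KIP-349 behavior; just a sketch over a normal poll() result,
assuming a user-supplied topic-to-priority map):

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;

// Orders the records of one poll() so that higher-priority topics come
// first (lower number = higher priority). A real in-consumer implementation
// would additionally cap the batch at max.poll.records and keep the
// remainder buffered for the next poll rather than dropping it.
public final class PriorityOrdering {

    public static <K, V> List<ConsumerRecord<K, V>> inPriorityOrder(
            ConsumerRecords<K, V> polled, Map<String, Integer> priorityByTopic) {

        List<ConsumerRecord<K, V>> ordered = new ArrayList<>();
        polled.forEach(ordered::add);

        // List.sort is stable, so records of equally prioritized topics
        // keep their original poll order.
        ordered.sort(Comparator.comparingInt((ConsumerRecord<K, V> r) ->
                priorityByTopic.getOrDefault(r.topic(), Integer.MAX_VALUE)));

        return ordered;
    }
}
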
--
   Nick











Re: Add to contributor list

2018-09-03 Thread Murali Mani
Guozhang,

Could you please send the invitation for creating my new Apache ID
(muralimani)? I have sent the email to priv...@kafka.apache.org. I am
sending this mail as you are one of the PMC members of Apache Kafka.


On Sat, Sep 1, 2018 at 1:31 PM Murali Mani  wrote:

> Guozhang,
>
> In order to create an Apache ID, I have to submit the ICLA document, which
> I completed today. As per the process defined below, a PMC member (in this
> case you / the team for Apache Kafka) would need to send the activation
> email once the ID has been created. I shall notify you / the team to send
> the request for activation. I hope that is fine.
>
> https://www.apache.org/dev/new-committers-guide.html#cla
>
> Thanks for your understanding.
>
>
>
>
> On Sun, Aug 19, 2018 at 10:42 PM Guozhang Wang  wrote:
>
>> You need an apache id for the project (github account is separate), you
>> can
>> create one on apache website.
>>
>>
>> Guozhang
>>
>> On Sun, Aug 19, 2018 at 2:02 PM, Murali Mani 
>> wrote:
>>
>> > Yes, it is correct. muralimani is the user id of Github. The mail id for
>> > github - murali.m...@gmail.com
>> >
>> > On Sun, Aug 19, 2018 at 7:03 PM Guozhang Wang 
>> wrote:
>> >
>> > > Hello,
>> > >
>> > > I cannot find the id "muralimani" for either JIRA or wiki system.
>> Could
>> > you
>> > > double check if it is created?
>> > >
>> > >
>> > > Guozhang
>> > >
>> > >
>> > > On Sun, Aug 19, 2018 at 12:46 AM, Murali Mani 
>> > > wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > Could you please add me to the contributor list? My jira id is
>> > muralimani
>> > > >
>> > > > --
>> > > > regards
>> > > > Murali Krishnan Mani
>> > > > Ph: +44 (0)7432128519
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > -- Guozhang
>> > >
>> >
>> >
>> > --
>> > regards
>> > Murali Krishnan Mani
>> > Ph: +44 (0)7432128519
>> >
>>
>>
>>
>> --
>> -- Guozhang
>>
>
>
> --
> regards
> Murali Krishnan Mani
> Ph: +44 (0)7432128519
>


-- 
regards
Murali Krishnan Mani
Ph: +44 (0)7432128519


Re: [VOTE] KIP-359: Verify leader epoch in produce requests

2018-09-03 Thread Attila Sasvári
+1 (non-binding)

On Mon, Sep 3, 2018 at 12:21 PM Manikumar  wrote:

> +1 (non-binding)
> Thanks for the KIP.
>
> On Sun, Sep 2, 2018 at 10:06 PM Satish Duggana 
> wrote:
>
> > Thanks for the KIP,
> > +1 (non-binding)
> >
> > On Sun, Sep 2, 2018 at 7:59 PM, Matthias J. Sax 
> > wrote:
> >
> > > +1 (binding)
> > >
> > > On 8/30/18 11:30 PM, Dongjin Lee wrote:
> > > > Thanks for the KIP. I'm +1 (non-binding).
> > > >
> > > > Best,
> > > > Dongjin
> > > >
> > > > On Fri, Aug 31, 2018 at 9:25 AM Dong Lin 
> wrote:
> > > >
> > > >> Thanks for the KIP!
> > > >>
> > > >> +1 (binding)
> > > >>
> > > >> On Thu, Aug 30, 2018 at 3:56 PM, Jason Gustafson <
> ja...@confluent.io>
> > > >> wrote:
> > > >>
> > > >>> Hi All,
> > > >>>
> > > >>> I'd like to start the vote on KIP-359: https://cwiki.apache.org/
> > > >>> confluence/display/KAFKA/KIP-359%3A+Verify+leader+epoch+in+
> > > >>> produce+requests.
> > > >>> Thanks in advance for reviewing.
> > > >>>
> > > >>> -Jason
> > > >>>
> > > >>
> > > >
> > > >
> > >
> > >
> >
>


Re: [VOTE] KIP-359: Verify leader epoch in produce requests

2018-09-03 Thread Manikumar
+1 (non-binding)
Thanks for the KIP.

On Sun, Sep 2, 2018 at 10:06 PM Satish Duggana 
wrote:

> Thanks for the KIP,
> +1 (non-binding)
>
> On Sun, Sep 2, 2018 at 7:59 PM, Matthias J. Sax 
> wrote:
>
> > +1 (binding)
> >
> > On 8/30/18 11:30 PM, Dongjin Lee wrote:
> > > Thanks for the KIP. I'm +1 (non-binding).
> > >
> > > Best,
> > > Dongjin
> > >
> > > On Fri, Aug 31, 2018 at 9:25 AM Dong Lin  wrote:
> > >
> > >> Thanks for the KIP!
> > >>
> > >> +1 (binding)
> > >>
> > >> On Thu, Aug 30, 2018 at 3:56 PM, Jason Gustafson 
> > >> wrote:
> > >>
> > >>> Hi All,
> > >>>
> > >>> I'd like to start the vote on KIP-359: https://cwiki.apache.org/
> > >>> confluence/display/KAFKA/KIP-359%3A+Verify+leader+epoch+in+
> > >>> produce+requests.
> > >>> Thanks in advance for reviewing.
> > >>>
> > >>> -Jason
> > >>>
> > >>
> > >
> > >
> >
> >
>


Re: KAFKA-1194

2018-09-03 Thread M. Manna
Hi Stephan,

https://jira.apache.org/jira/browse/KAFKA-7278

This might (according to Dong) be a better fix for the issue. Could you
kindly apply the patch to your Kafka (assuming that your version is
reasonably close to the latest) and see if that helps?

Regards,

On Mon, 3 Sep 2018 at 02:27, Ismael Juma  wrote:

> Hi Stephane,
>
> Does https://github.com/apache/kafka/pull/4947/files fix it for you?
>
> Ismael
>
> On Sun, Sep 2, 2018 at 11:25 AM Stephane Maarek <
> steph...@simplemachines.com.au> wrote:
>
> > Hi,
> >
> > I've seen Dong has done some work on
> > https://issues.apache.org/jira/browse/KAFKA-7278 which is said from the
> > comments that it could have possibly fixed
> > https://issues.apache.org/jira/browse/KAFKA-1194.
> >
> > I tested and it is unfortunately not the case...
> > I have posted in KAFKA-1194 a way to reproduce the issue in a
> deterministic
> > way:
> > ===
> > C:\kafka_2.11-2.1.0-SNAPSHOT>bin\windows\kafka-server-start.bat
> > config\server.properties
> >
> > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-topics.bat --zookeeper 127.0.0.1:2181
> > --topic second_topic --create --partitions 3 --replication-factor 1
> >
> > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-console-producer.bat --broker-list
> > 127.0.0.1:9092 --topic second_topic
> > >hello
> > >world
> > >hello
> > >Terminate batch job (Y/N)? Y
> >
> > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-topics.bat --zookeeper 127.0.0.1:2181
> > --topic second_topic --delete
> > 
> >
> >
> > I usually wouldn't push for any issues for windows to be fixed, but this
> > one is triggered from basic CLI usage and triggers a full broker failure
> > that can't be recovered.
> > It actually completely destroys the faith of Kafka newcomers learning
> using
> > Windows (which are 50% of the users I have in my course).
> >
> > I'm happy to help out in any way I can to test any patch.
> >
> > Thanks
> > Stephane
> >
>


Re: [VOTE] KIP-336: Consolidate ExtendedSerializer/Serializer and ExtendedDeserializer/Deserializer

2018-09-03 Thread Viktor Somogyi-Vass
Hi All,

I have completed my binary compatibility testing on this as well. I created
a small script and example to make it reproducible:
https://gist.github.com/viktorsomogyi/391defca73e7a46a2c6a40bc699231d4

Also, thanks for the votes so far; we're still waiting for 2 binding votes.

Cheers,
Viktor

On Thu, Aug 30, 2018 at 5:09 PM Harsha  wrote:

> +1.
> Thanks,
> Harsha
>
> On Thu, Aug 30, 2018, at 4:19 AM, Attila Sasvári wrote:
> > Thanks for the KIP and the updates Viktor!
> >
> > +1 (non-binding)
> >
> >
> >
> > On Wed, Aug 29, 2018 at 10:44 AM Manikumar 
> > wrote:
> >
> > > +1 (non-binding)
> > >
> > > Thanks for the KIP.
> > >
> > > On Wed, Aug 29, 2018 at 1:41 AM Jason Gustafson 
> > > wrote:
> > >
> > > > +1 Thanks for the updates.
> > > >
> > > > On Tue, Aug 28, 2018 at 1:15 AM, Viktor Somogyi-Vass <
> > > > viktorsomo...@gmail.com> wrote:
> > > >
> > > > > Sure, I've added it. I'll also do the testing today.
> > > > >
> > > > > On Mon, Aug 27, 2018 at 5:03 PM Ismael Juma 
> wrote:
> > > > >
> > > > > > Thanks Viktor. I think it would be good to verify that existing
> > > > > > ExtendedSerializer implementations work without recompiling. This
> > > could
> > > > > be
> > > > > > done as a manual test. If you agree, I suggest adding it to the
> > > testing
> > > > > > plan section.
> > > > > >
> > > > > > Ismael
> > > > > >
> > > > > > On Mon, Aug 27, 2018 at 7:57 AM Viktor Somogyi-Vass <
> > > > > > viktorsomo...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks guys, I've updated my KIP with this info (so to keep
> > > solution
> > > > > #1).
> > > > > > > If you find it good enough, please vote as well or let me know
> if
> > > you
> > > > > > think
> > > > > > > something is missing.
> > > > > > >
> > > > > > > On Sat, Aug 25, 2018 at 1:14 AM Ismael Juma  >
> > > > wrote:
> > > > > > >
> > > > > > > > I'm OK with 1 too. It makes me a bit sad that we don't have a
> > > path
> > > > > for
> > > > > > > > removing the method without headers, but it seems like the
> > > simplest
> > > > > and
> > > > > > > > least confusing option (I am assuming that headers are not
> needed
> > > > in
> > > > > > the
> > > > > > > > serializers in the common case).
> > > > > > > >
> > > > > > > > Ismael
> > > > > > > >
> > > > > > > > On Fri, Aug 24, 2018 at 2:42 PM Jason Gustafson <
> > > > ja...@confluent.io>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hey Viktor,
> > > > > > > > >
> > > > > > > > > Good summary. I agree that option 1) seems like the
> simplest
> > > > choice
> > > > > > > and,
> > > > > > > > as
> > > > > > > > > you note, we can always add the default implementation
> later.
> > > > I'll
> > > > > > > leave
> > > > > > > > > Ismael to make a case for the circular forwarding approach
> ;)
> > > > > > > > >
> > > > > > > > > -Jason
> > > > > > > > >
> > > > > > > > > On Fri, Aug 24, 2018 at 3:02 AM, Viktor Somogyi-Vass <
> > > > > > > > > viktorsomo...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > I think in the first draft I didn't provide an
> implementation
> > > > for
> > > > > > > them
> > > > > > > > as
> > > > > > > > > > it seemed very simple and straightforward. I looked up a
> > > couple
> > > > > of
> > > > > > > > > > implementations of the ExtendedSerializers on github and
> the
> > > > > > general
> > > > > > > > > > behavior seems to be that they delegate to the 2 argument
> > > > > > > (headerless)
> > > > > > > > > > method:
> > > > > > > > > >
> > > > > > > > > >
> https://github.com/khoitnm/practice-spring-kafka-grpc/blob/
> > > > > > > > > >
> a6fc9b3395762c4889807baedd822f4653d5dcdd/kafka-common/src/
> > > > > > > > > > main/java/org/tnmk/common/kafka/serialization/protobuf/
> > > > > > > > > > ProtobufSerializer.java
> > > > > > > > > >
> > > > > > >
> > > >
> https://github.com/hong-zhu/nxgen/blob/5cf1427d4e1a8f5c7fab47955af32a
> > > > > > > > > > 0d4f4873af/nxgen-kafka-client/src/main/java/nxgen/kafka/
> > > > > > > > > > client/event/serdes/EventSerializer.java
> > > > > > > > > > https://github.com/jerry-jx/spring-kafka/blob/
> > > > > > > > > >
> ac323ec5b8b9a0ca975db2f7322ff6878fce481a/spring-kafka/src/
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > main/java/org/springframework/kafka/support/serializer/
> > > > > JsonSerializer.java
> > > > > > > > > > https://github.com/enzobonggio/nonblocking-kafka/blob/
> > > > > > > > > >
> bc1a379b2d9593b46cf9604063bc5b38e2785d19/src/main/java/com/
> > > > > > > > > > example/kafka/producer/CustomJsonSerializer.java
> > > > > > > > > >
> > > > > > > > > > Of course 4 example is not representative but it shows
> that
> > > > these
> > > > > > > users
> > > > > > > > > > usually delegate to the "headerless" (2 argument) method.
> > > I've
> > > > > > tried
> > > > > > > to
> > > > > > > > > > look it up on other code search sites but haven't had
> much
> > > luck
> > > > > so
> > > > > > > far.
> > > > > > > > > > Given these examples and the way they implement 
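
For reference, the delegation pattern described in the quoted survey above
looks roughly like this. This is a made-up serializer, shown purely to
illustrate the pattern against the pre-KIP-336 ExtendedSerializer
interface; it is not taken from any of the linked projects:

import java.util.Map;
import org.apache.kafka.common.header.Headers;
import org.apache.kafka.common.serialization.ExtendedSerializer;
import org.apache.kafka.common.serialization.StringSerializer;

// Illustrative only: the three-argument (header-aware) serialize() simply
// forwards to the two-argument "headerless" one and ignores the headers.
public class UppercaseStringSerializer implements ExtendedSerializer<String> {

    private final StringSerializer inner = new StringSerializer();

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        inner.configure(configs, isKey);
    }

    @Override
    public byte[] serialize(String topic, String data) {
        // the headerless method does the actual work
        return inner.serialize(topic, data == null ? null : data.toUpperCase());
    }

    @Override
    public byte[] serialize(String topic, Headers headers, String data) {
        // headers are ignored; delegate to the headerless method
        return serialize(topic, data);
    }

    @Override
    public void close() {
        inner.close();
    }
}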

Re: KAFKA-1194

2018-09-03 Thread Stephane Maarek
Sorry for the double email. I've compiled the fixes from the other PR, and
they fix the Windows problem: https://github.com/apache/kafka/pull/5603

It triggers a few errors in the unit tests, which are probably due to
changes in the order in which files are deleted. Let's continue the
discussion there?


On 3 September 2018 at 08:24, Stephane Maarek <
steph...@simplemachines.com.au> wrote:

> Hi Ismael, thanks for having a look!
>
> https://github.com/apache/kafka/pull/4947/files does not fix it for me.
>
> But this one does: https://github.com/apache/kafka/pull/4431
> It produces the following log: https://pastebin.com/NHeAcc6v
> As you can see the folders do get renamed, it gives the user a few odd
> WARNs, but I checked and the directories are fully gone nonetheless.
>
> I think the ideas of the PRs go back all along to
> https://github.com/apache/kafka/pull/154/files or
> https://github.com/apache/kafka/pull/1757
>
> I couldn't find how to map both PRs to the new Kafka codebase,
> unfortunately.
>
> On 3 September 2018 at 03:26, Ismael Juma  wrote:
>
>> Hi Stephane,
>>
>> Does https://github.com/apache/kafka/pull/4947/files fix it for you?
>>
>> Ismael
>>
>> On Sun, Sep 2, 2018 at 11:25 AM Stephane Maarek <
>> steph...@simplemachines.com.au> wrote:
>>
>> > Hi,
>> >
>> > I've seen Dong has done some work on
>> > https://issues.apache.org/jira/browse/KAFKA-7278 which is said from the
>> > comments that it could have possibly fixed
>> > https://issues.apache.org/jira/browse/KAFKA-1194.
>> >
>> > I tested and it is unfortunately not the case...
>> > I have posted in KAFKA-1194 a way to reproduce the issue in a
>> deterministic
>> > way:
>> > ===
>> > C:\kafka_2.11-2.1.0-SNAPSHOT>bin\windows\kafka-server-start.bat
>> > config\server.properties
>> >
>> > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-topics.bat --zookeeper
>> 127.0.0.1:2181
>> > --topic second_topic --create --partitions 3 --replication-factor 1
>> >
>> > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-console-producer.bat --broker-list
>> > 127.0.0.1:9092 --topic second_topic
>> > >hello
>> > >world
>> > >hello
>> > >Terminate batch job (Y/N)? Y
>> >
>> > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-topics.bat --zookeeper
>> 127.0.0.1:2181
>> > --topic second_topic --delete
>> > 
>> >
>> >
>> > I usually wouldn't push for any issues for windows to be fixed, but this
>> > one is triggered from basic CLI usage and triggers a full broker failure
>> > that can't be recovered.
>> > It actually completely destroys the faith of Kafka newcomers learning
>> using
>> > Windows (which are 50% of the users I have in my course).
>> >
>> > I'm happy to help out in any way I can to test any patch.
>> >
>> > Thanks
>> > Stephane
>> >
>>
>
>


Re: KAFKA-1194

2018-09-03 Thread Stephane Maarek
Hi Ismael, thanks for having a look!

https://github.com/apache/kafka/pull/4947/files does not fix it for me.

But this one does: https://github.com/apache/kafka/pull/4431
It produces the following log: https://pastebin.com/NHeAcc6v
As you can see, the folders do get renamed; it gives the user a few odd
WARNs, but I checked and the directories are fully gone nonetheless.

I think the ideas of these PRs go all the way back to
https://github.com/apache/kafka/pull/154/files or
https://github.com/apache/kafka/pull/1757

I couldn't find how to map both PRs to the new Kafka codebase,
unfortunately.

On 3 September 2018 at 03:26, Ismael Juma  wrote:

> Hi Stephane,
>
> Does https://github.com/apache/kafka/pull/4947/files fix it for you?
>
> Ismael
>
> On Sun, Sep 2, 2018 at 11:25 AM Stephane Maarek <
> steph...@simplemachines.com.au> wrote:
>
> > Hi,
> >
> > I've seen Dong has done some work on
> > https://issues.apache.org/jira/browse/KAFKA-7278 which is said from the
> > comments that it could have possibly fixed
> > https://issues.apache.org/jira/browse/KAFKA-1194.
> >
> > I tested and it is unfortunately not the case...
> > I have posted in KAFKA-1194 a way to reproduce the issue in a
> deterministic
> > way:
> > ===
> > C:\kafka_2.11-2.1.0-SNAPSHOT>bin\windows\kafka-server-start.bat
> > config\server.properties
> >
> > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-topics.bat --zookeeper 127.0.0.1:2181
> > --topic second_topic --create --partitions 3 --replication-factor 1
> >
> > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-console-producer.bat --broker-list
> > 127.0.0.1:9092 --topic second_topic
> > >hello
> > >world
> > >hello
> > >Terminate batch job (Y/N)? Y
> >
> > C:\kafka_2.11-2.1.0-SNAPSHOT>kafka-topics.bat --zookeeper 127.0.0.1:2181
> > --topic second_topic --delete
> > 
> >
> >
> > I usually wouldn't push for any issues for windows to be fixed, but this
> > one is triggered from basic CLI usage and triggers a full broker failure
> > that can't be recovered.
> > It actually completely destroys the faith of Kafka newcomers learning
> using
> > Windows (which are 50% of the users I have in my course).
> >
> > I'm happy to help out in any way I can to test any patch.
> >
> > Thanks
> > Stephane
> >
>