Re: [ANNOUNCE] New committer: Luke Chen

2022-02-09 Thread Mayuresh Gharat
Congratulations Luke!

Thanks,

Mayuresh

On Wed, Feb 9, 2022, 5:24 PM Ismael Juma  wrote:

> Congratulations Luke!
>
> On Wed, Feb 9, 2022 at 3:22 PM Guozhang Wang  wrote:
>
> > The PMC for Apache Kafka has invited Luke Chen (showuon) as a committer
> and
> > we are pleased to announce that he has accepted!
> >
> > Luke has been actively contributing to Kafka since early 2020. He has
> > made more than 120 commits on various components of Kafka, with notable
> > contributions to the rebalance protocol in Consumer and Streams (KIP-766,
> > KIP-726, KIP-591, KAFKA-12675 and KAFKA-12464, to just name a few), as
> well
> > as making an impact on improving test stability of the project. Aside
> from
> > all his code contributions, Luke has been a great participant in
> > discussions across the board, a very active and helpful reviewer of other
> > contributors' works, all of which are super valuable and highly
> appreciated
> > by the community.
> >
> >
> > Thanks for all of your contributions Luke. Congratulations!
> >
> > -- Guozhang, on behalf of the Apache Kafka PMC
> >
>


Re: Confluent Schema Registry Compatibility config

2021-12-16 Thread Mayuresh Gharat
Hi Folks,

I was reading docs on Confluent Schema Registry about Compatibility :
https://docs.confluent.io/platform/current/schema-registry/avro.html#compatibility-types

I am confused about "BACKWARD" vs "BACKWARD_TRANSITIVE".

Suppose we have 3 schemas X, X-1, X-2 and configure the schema registry with
compatibility = "BACKWARD". When we registered schema X-1, it must have been
checked against schema X-2. When we register schema X, it is checked against
schema X-1. So, by transitivity, schema X should also be compatible with X-2.

So I am wondering: what is the difference between "BACKWARD" and
"BACKWARD_TRANSITIVE"? An example would be really helpful.


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [ANNOUNCE] New committer: John Roesler

2019-11-12 Thread Mayuresh Gharat
Congratulations John!

Thanks,

Mayuresh

On Tue, Nov 12, 2019 at 4:54 PM Vahid Hashemian 
wrote:

> Congratulations John!
>
> --Vahid
>
> On Tue, Nov 12, 2019 at 4:38 PM Adam Bellemare 
> wrote:
>
> > Congratulations John, and thanks for all your help on KIP-213!
> >
> > > On Nov 12, 2019, at 6:24 PM, Bill Bejeck  wrote:
> > >
> > > Congratulations John!
> > >
> > > On Tue, Nov 12, 2019 at 6:20 PM Matthias J. Sax  >
> > > wrote:
> > >
> > >> Congrats John!
> > >>
> > >>
> > >>> On 11/12/19 2:52 PM, Boyang Chen wrote:
> > >>> Great work John! Well deserved
> > >>>
> > >>> On Tue, Nov 12, 2019 at 1:56 PM Guozhang Wang 
> > >> wrote:
> > >>>
> >  Hi Everyone,
> > 
> >  The PMC of Apache Kafka is pleased to announce a new Kafka
> committer,
> > >> John
> >  Roesler.
> > 
> >  John has been contributing to Apache Kafka since early 2018. His
> main
> >  contributions are primarily around Kafka Streams, but have also
> > included
> >  improving our test coverage beyond Streams as well. Besides his own
> > code
> >  contributions, John has also actively participated on community
> > >> discussions
> >  and reviews including several other contributors' big proposals like
> >  foreign-key join in Streams (KIP-213). He has also been writing,
> > >> presenting
> >  and evangelizing Apache Kafka in many venues.
> > 
> >  Congratulations, John! And look forward to more collaborations with
> > you
> > >> on
> >  Apache Kafka.
> > 
> > 
> >  Guozhang, on behalf of the Apache Kafka PMC
> > 
> > >>>
> > >>
> > >>
> >
>
>
> --
>
> Thanks!
> --Vahid
>


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [ANNOUNCE] New Committer: Vahid Hashemian

2019-01-15 Thread Mayuresh Gharat
congrats !!

On Tue, Jan 15, 2019 at 3:42 PM Matthias J. Sax 
wrote:

> Congrats!
>
> On 1/15/19 3:34 PM, Boyang Chen wrote:
> > This is exciting moment! Congrats Vahid!
> >
> > Boyang
> >
> > 
> > From: Rajini Sivaram 
> > Sent: Wednesday, January 16, 2019 6:50 AM
> > To: Users
> > Cc: dev
> > Subject: Re: [ANNOUNCE] New Committer: Vahid Hashemian
> >
> > Congratulations, Vahid! Well deserved!!
> >
> > Regards,
> >
> > Rajini
> >
> > On Tue, Jan 15, 2019 at 10:45 PM Jason Gustafson 
> wrote:
> >
> >> Hi All,
> >>
> >> The PMC for Apache Kafka has invited Vahid Hashemian as a project
> >> committer and
> >> we are
> >> pleased to announce that he has accepted!
> >>
> >> Vahid has made numerous contributions to the Kafka community over the
> past
> >> few years. He has authored 13 KIPs with core improvements to the
> consumer
> >> and the tooling around it. He has also contributed nearly 100 patches
> >> affecting all parts of the codebase. Additionally, Vahid puts a lot of
> >> effort into community engagement, helping others on the mail lists and
> >> sharing his experience at conferences and meetups.
> >>
> >> We appreciate the contributions and we are looking forward to more.
> >> Congrats Vahid!
> >>
> >> Jason, on behalf of the Apache Kafka PMC
> >>
> >
>
>

-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] : KIP-410: Add metric for request handler thread pool utilization by request type

2019-01-02 Thread Mayuresh Gharat
Hi Stanislav,

Thanks a lot for the feedback and sorry for the late reply.
I agree that we can have the name "RequestHandlerPoolUsagePercent" instead
of "RequestHandlerPoolUsageRate".
The main reason why I named it "rate" was the discussion we had on the
corresponding JIRA.
But now that I think of it, it is actually a fraction of the usage, so
"percent" sounds better than "rate".

Thanks,

Mayuresh

On Wed, Dec 26, 2018 at 1:56 AM Stanislav Kozlovski 
wrote:

> Hey there Mayuresh,
>
> Thanks for the KIP! This will prove to be very useful.
>
> I am wondering whether we should opt for the name of `
> RequestHandlerPoolUsagePercent`. We seem to use the "Percent" suffix to
> denote fractions of times in other places - e.g
> `NetworkProcessorAvgIdlePercent`
>
> The average fraction of time the network processors are idle
>
> kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent
> between 0 and 1, ideally > 0.3
>
>
> Whereas we use the "Rate" suffix to denote the number of events per second
> - e.g the clients' "connection-close-rate"
>
> 
>   connection-close-rate
>   Connections closed per second in the window.
>
> kafka.[producer|consumer|connect]:type=[producer|consumer|connect]-metrics,client-id=([-.\w]+)
> 
>
>
> A separate nit - you may want to update the discussion thread link in the
> KIP as it is pointing to the default one.
>
> Thanks,
> Stanislav
>
> On Thu, Dec 20, 2018 at 9:58 PM Mayuresh Gharat <
> gharatmayures...@gmail.com>
> wrote:
>
> > I would like to get feedback on the proposal to add a metric for request
> > handler thread pool utilization by request type. Please find the KIP
> here :
> > KIP-410: Add metric for request handler thread pool utilization by
> request
> > type
> > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-410%3A+Add+metric+for+request+handler+thread+pool+utilization+by+request+type
> > >
> >
> >
> > Thanks,
> >
> > Mayuresh
> >
>
>
> --
> Best,
> Stanislav
>


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2019-01-02 Thread Mayuresh Gharat
Hi Boyang,

Thanks a lot for the reply. I think I get what you are saying.
IIUC, once the group starts moving from static membership to dynamic
membership, the group will have a list of new member.id's, and the earlier
mapping of group.instance.id -> member.id will be outdated and become
inactive from the point of view of triggering rebalances. This assumes that
the time interval to move from static to dynamic membership is small.
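
For context, here is a rough sketch of how a static member would be
configured once KIP-345 lands (the config key follows the KIP; the group id,
instance id, topic and timeout below are made up). Dropping group.instance.id
and bouncing is what moves an instance back to dynamic membership as
discussed above:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class StaticMemberSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "GroupA");
        // Static membership: a stable id per instance, e.g. gc1 for consumer c1.
        // Removing this line and restarting turns the instance into a dynamic
        // member, which rejoins with UNKNOWN_MEMBER_ID as described above.
        props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, "gc1");
        // A generous session timeout so rolling bounces do not trigger rebalances.
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "300000");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("some-topic"));
            consumer.poll(Duration.ofMillis(100));
        }
    }
}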

Thanks,

Mayuresh

On Fri, Dec 28, 2018 at 11:37 PM Boyang Chen  wrote:

> Hey Mayuresh,
>
> thanks for giving another thorough thought on this matter! The session
> timeout is controlled on the member level (or more specifically member.id
> level), which has nothing to do with the static member mapping. Putting
> aside the low possibility of having a duplicate member id assignment
> throughout the restart, if this indeed happens when we switch a member c1
> to dynamic member, the static membership map will only contain [gc1 -> mc1]
> mapping.
> This is because c2~c4 will be evicted during rebalance as their
> corresponding mc2~mc4 have already timed out through the heartbeat purgatory. Next
> time the dynamic member restarts while group is in stable, the static map
> will be cleaned up and rebalance will be triggered as expected (we are
> already in dynamic mode now).
>
> I hope this answers your question, thanks!
>
> Boyang
>
> 
> From: Mayuresh Gharat 
> Sent: Saturday, December 22, 2018 2:21 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by
> specifying member id
>
> Hi Boyang,
>
> Regarding "However, we shall still attempt to remove the member static info
> if the given `member.id` points to an existing `group.instance.id` upon
> LeaveGroupRequest, because I could think of the possibility that in long
> term we could want to add static membership leave group logic for more
> fine-grained use cases."
>
> > I think, there is some confusion here. I am probably not putting it
> > right.
> >
> I agree, If a static member sends LeaveGroupRequest, it should be removed
> > from the group.
> >
> Now getting back to downgrade of static membership to Dynamic membership,
> > with the example described earlier  (copying it again for ease of
> reading)
> > :
> >
>
> >>1. Lets say we have 4 consumers :  c1, c2, c3, c4 in the static
> group.
> >>2. The group.instance.id for each of there are as follows :
> >>   - c1 -> gc1, c2 -> gc2, c3 -> gc3, c4 -> gc4
> >>3. The mapping on the GroupCordinator would be :
> >>   - gc1 -> mc1, gc2 -> mc2, gc3 -> mc3, gc4 -> mc4, where mc1, mc2,
> >>   mc3, mc4 are the randomly generated memberIds for c1, c2, c3, c4
> >>   respectively, by the GroupCoordinator.
> >>4. Now we do a restart to move the group to dynamic membership.
> >>5. We bounce c1 first and it rejoins with UNKNOWN_MEMBERID (since we
> >>don't persist the previously assigned memberId mc1 anywhere on the
> c1).
> >>
> > - We agree that there is no way to recognize that c1 was a part of the
> > group, *earlier*.  If yes, the statement : "The dynamic member rejoins
> > the group without `group.instance.id`. It will be accepted since it is a
> > known member." is not necessarily true, right?
> >
>
>
> > - Now I *agree* with "However, we shall still attempt to remove the
> > member static info if the given `member.id` points to an existing `
> > group.instance.id` upon LeaveGroupRequest, because I could think of the
> > possibility that in long term we could want to add static membership
> leave
> > group logic for more fine-grained use cases."
> >
> But that would only happen if the GroupCoordinator allocates the same
> > member.id (mc1) to the consumer c1, when it rejoins the group in step 5
> > above as a dynamic member, which is very rare as it is randomly
> generated,
> > but possible.
> >
>
>
> > - This raises another question, if the GroupCoordinator assigns a
> > member.id (mc1~) to consumer c1 after step 5. It will join the group and
> > rebalance and the group will become stable, eventually. Now the
> > GroupCoordinator still maintains a mapping of  "group.instance.id ->
> > member.id" (c1 -> gc1, c2 -> gc2, c3 -> gc3, c4 -> gc4) internally and
> > after some time, it realizes that it has not received heartbeat from the
> > consumer with "group.instance.id" = gc1. In that case, it will trigger
> > another rebalance assuming that a static member has left the group (when
> > actually it (c1) has not left the group but moved to dynamic membership).
> > This can result in multiple rebalances as the same will happen for c2, c3,
> > c4.

Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2018-12-21 Thread Mayuresh Gharat
Hi Boyang,

Regarding "However, we shall still attempt to remove the member static info
if the given `member.id` points to an existing `group.instance.id` upon
LeaveGroupRequest, because I could think of the possibility that in long
term we could want to add static membership leave group logic for more
fine-grained use cases."

> I think, there is some confusion here. I am probably not putting it
> right.
>
I agree, If a static member sends LeaveGroupRequest, it should be removed
> from the group.
>
Now getting back to downgrade of static membership to Dynamic membership,
> with the example described earlier  (copying it again for ease of reading)
> :
>

>>1. Lets say we have 4 consumers :  c1, c2, c3, c4 in the static group.
>>2. The group.instance.id for each of there are as follows :
>>   - c1 -> gc1, c2 -> gc2, c3 -> gc3, c4 -> gc4
>>3. The mapping on the GroupCordinator would be :
>>   - gc1 -> mc1, gc2 -> mc2, gc3 -> mc3, gc4 -> mc4, where mc1, mc2,
>>   mc3, mc4 are the randomly generated memberIds for c1, c2, c3, c4
>>   respectively, by the GroupCoordinator.
>>4. Now we do a restart to move the group to dynamic membership.
>>5. We bounce c1 first and it rejoins with UNKNOWN_MEMBERID (since we
>>don't persist the previously assigned memberId mc1 anywhere on the c1).
>>
> - We agree that there is no way to recognize that c1 was a part of the
> group, *earlier*.  If yes, the statement : "The dynamic member rejoins
> the group without `group.instance.id`. It will be accepted since it is a
> known member." is not necessarily true, right?
>


> - Now I *agree* with "However, we shall still attempt to remove the
> member static info if the given `member.id` points to an existing `
> group.instance.id` upon LeaveGroupRequest, because I could think of the
> possibility that in long term we could want to add static membership leave
> group logic for more fine-grained use cases."
>
But that would only happen if the GroupCoordinator allocates the same
> member.id (mc1) to the consumer c1, when it rejoins the group in step 5
> above as a dynamic member, which is very rare as it is randomly generated,
> but possible.
>


> - This raises another question, if the GroupCoordinator assigns a
> member.id (mc1~) to consumer c1 after step 5. It will join the group and
> rebalance and the group will become stable, eventually. Now the
> GroupCoordinator still maintains a mapping of  "group.instance.id ->
> member.id" (c1 -> gc1, c2 -> gc2, c3 -> gc3, c4 -> gc4) internally and
> after some time, it realizes that it has not received heartbeat from the
> consumer with "group.instance.id" = gc1. In that case, it will trigger
> another rebalance assuming that a static member has left the group (when
> actually it (c1) has not left the group but moved to dynamic membership).
> This can result in multiple rebalances as the same will happen for c2, c3,
> c4.
>

Thoughts?
One thing I can think of right now is to run:
removeMemberFromGroup(String groupId, list groupInstanceIdsToRemove,
RemoveMemberFromGroupOptions options)
with groupInstanceIdsToRemove = the full list of group.instance.ids, once we
have bounced all the members in the group. This assumes that we will be able
to complete the bounces before the GroupCoordinator realizes that it has not
received a heartbeat for any of them. This is tricky and error prone.
Will have to think more on this.
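
For illustration, here is roughly what that call could look like, sketched
against the shape the Admin API eventually took (removeMembersFromConsumerGroup,
added in Kafka 2.4) rather than the exact signature floated above; the group
and instance ids are the ones from the example:

import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.MemberToRemove;
import org.apache.kafka.clients.admin.RemoveMembersFromConsumerGroupOptions;

public class RemoveStaticMembersSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // Remove the bounced static members gc1..gc4 in one shot, so the
            // coordinator rebalances immediately instead of waiting for the
            // session timeout of each stale group.instance.id.
            RemoveMembersFromConsumerGroupOptions options =
                    new RemoveMembersFromConsumerGroupOptions(Arrays.asList(
                            new MemberToRemove("gc1"),
                            new MemberToRemove("gc2"),
                            new MemberToRemove("gc3"),
                            new MemberToRemove("gc4")));
            admin.removeMembersFromConsumerGroup("GroupA", options).all().get();
        }
    }
}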

Thanks,

Mayuresh


[DISCUSS] : KIP-410: Add metric for request handler thread pool utilization by request type

2018-12-20 Thread Mayuresh Gharat
I would like to get feedback on the proposal to add a metric for request
handler thread pool utilization by request type. Please find the KIP here :
KIP-410: Add metric for request handler thread pool utilization by request
type



Thanks,

Mayuresh


Re: [EXTERNAL] - Re: [DISCUSS] KIP-387: Fair Message Consumption Across Partitions in KafkaConsumer

2018-12-12 Thread Mayuresh Gharat
Hi ChienHsing,

We are actually working on buffering the already fetched data for paused
topicPartitions, so ideally it should not have any effect on performance.
Associated jira : https://issues.apache.org/jira/browse/KAFKA-7548

Thanks,

Mayuresh

On Wed, Dec 12, 2018 at 6:01 AM ChienHsing Wu  wrote:

> Hi Mayuresh,
>
> Thanks for the input!
>
> Pausing and Resuming are cumbersome and has some undesirable performance
> impact since pausing will in effect clean up the completed fetch and
> resuming will call the broker to retrieve again.
>
> The way I changed the code was just to parse the completed fetch earlier
> and ensure the order to retrieve are the same as the completed fetch queue.
> I did make code changes to take into account the following in Fetcher class.
>
> 1) exception handling
> 2) ensure the parsed partitions are not included in fetchablePartitions
> 3) clear buffer when not in the newly assigned partitions in
> clearBufferedDataForUnassignedPartitions
> 4) close them properly in close method
>
> Though the consumer does not guarantee explicit order, KIP 41 (link below)
> did intend to ensure fair distribution and therefore the round robin
> algorithm in the code. The change I propose was to enhance it.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-41%3A+KafkaConsumer+Max+Records#KIP-41:KafkaConsumerMaxRecords-EnsuringFairConsumption
>
> As for performance, the changes does not add any additional calls to the
> broker nor does it introduce significant processing logic; it just parses
> the completed fetch earlier and have a list to manage them.
>
>
> CH
>
> -Original Message-
> From: Mayuresh Gharat 
> Sent: Tuesday, December 11, 2018 6:58 PM
> To: dev@kafka.apache.org
> Subject: Re: [EXTERNAL] - Re: [DISCUSS] KIP-387: Fair Message Consumption
> Across Partitions in KafkaConsumer
>
> Hi ChienHsing,
>
> The other way I was thinking, this can be done outside of KafkaConsumer is
> by pausing and resuming TopicPartitions (may be in round robin fashion).
> There is some gotcha there as in you might not know if the consumer has
> already fetched data for the remaining partitions.
> Also I am not sure, if we need a KIP for this as the KafkaConsumer does
> not guarantee the end user, any order, I believe. So if this change goes
> in, I don't think its changing the underlying behavior.
> It would be good to check if this change will impact the performance of
> the consumer.
>
> Thanks,
>
> Mayuresh
>
>
> On Tue, Dec 11, 2018 at 11:03 AM ChienHsing Wu 
> wrote:
>
> > Hi Mayuresh,
> >
> > To serve one poll call the logic greedily gets records from one
> > completed fetch before including records from the next completed fetch
> > from the queue, as you described.
> >
> > The algorithm remembers the current completed fetch as starting one
> > when serving the next poll call. The net effect is that completed
> > fetch will be retrieved to serve as many poll calls before retrieving
> > records from any other completed fetches.
> >
> > For example, let's say the consumer has been assigned partition A, B
> > and C and the max.poll.records is set to 100. Right now we have
> > completed fetch A, and B. Each one has 300 records. It will take 6
> > poll calls to retrieve all record and the sequence of retrieved
> > partitions will be: A, A, A, B, B, B.
> >
> > Ideally, it should alternate between A and B. I was proposing to move
> > to the next one fetch for the next poll call based on the order in the
> > completed fetch queue, so the order becomes A, B, A, B, A, B. The
> > implementation parses the completed fetch only once.
> >
> > Thanks, CH
> >
> > -Original Message-
> > From: Mayuresh Gharat 
> > Sent: Tuesday, December 11, 2018 1:21 PM
> > To: dev@kafka.apache.org
> > Subject: Re: [EXTERNAL] - Re: [DISCUSS] KIP-387: Fair Message
> > Consumption Across Partitions in KafkaConsumer
> >
> > Hi ChienHsing,
> >
> > Thanks for the KIP.
> > It would be great if you can explain with an example, what you mean by "
> > Currently the implementation will return available records starting
> > from the last partition the last poll call retrieves records from.
> > This leads to unfair patterns of record consumption from multiple
> partitions."
> >
> > The KafkaConsumer sends fetch requests to multiple brokers, gets the
> > corresponding responses, and puts them into a single queue of
> > CompletedFetches. It then iterates over this completed-fetches queue and
> > peels off a number of records = max.poll.records from each completedFetch
> > for each poll() before moving on to the next completedFetch.

Re: [EXTERNAL] - Re: [DISCUSS] KIP-387: Fair Message Consumption Across Partitions in KafkaConsumer

2018-12-11 Thread Mayuresh Gharat
Hi ChienHsing,

The other way I was thinking this could be done, outside of the KafkaConsumer,
is by pausing and resuming TopicPartitions (maybe in a round-robin fashion).
There is a gotcha there, in that you might not know whether the consumer has
already fetched data for the remaining partitions.
Also, I am not sure if we need a KIP for this, as the KafkaConsumer does not
guarantee the end user any ordering, I believe. So if this change goes in, I
don't think it's changing the underlying behavior.
It would be good to check if this change will impact the performance of the
consumer.
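
For what it's worth, here is a rough sketch of that pause/resume approach as
a wrapper outside the consumer, round-robining which single partition is
"live" on each poll. It is only a sketch: as noted elsewhere in this thread,
pausing can throw away already-completed fetches unless the consumer buffers
data for paused partitions, so the performance caveat stands; the class and
method names are made up.

import java.time.Duration;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.common.TopicPartition;

public class RoundRobinPoller<K, V> {
    private final Consumer<K, V> consumer;
    private final Deque<TopicPartition> order = new ArrayDeque<>();

    public RoundRobinPoller(Consumer<K, V> consumer) {
        this.consumer = consumer;
    }

    // One "fair" poll: a single partition is resumed, all others are paused.
    public ConsumerRecords<K, V> pollFairly(Duration timeout) {
        Set<TopicPartition> assignment = consumer.assignment();
        order.retainAll(assignment);       // drop partitions that were revoked
        for (TopicPartition tp : assignment) {
            if (!order.contains(tp)) {
                order.addLast(tp);         // pick up newly assigned partitions
            }
        }
        if (order.isEmpty()) {
            return consumer.poll(timeout);
        }
        TopicPartition live = order.pollFirst();
        order.addLast(live);               // rotate so the next call favors another partition
        Set<TopicPartition> others = new HashSet<>(assignment);
        others.remove(live);
        consumer.pause(others);
        consumer.resume(Collections.singleton(live));
        try {
            return consumer.poll(timeout);
        } finally {
            consumer.resume(others);       // un-pause everything before the next round
        }
    }
}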

Thanks,

Mayuresh


On Tue, Dec 11, 2018 at 11:03 AM ChienHsing Wu 
wrote:

> Hi Mayuresh,
>
> To serve one poll call the logic greedily gets records from one completed
> fetch before including records from the next completed fetch from the
> queue, as you described.
>
> The algorithm remembers the current completed fetch as starting one when
> serving the next poll call. The net effect is that completed fetch will be
> retrieved to serve as many poll calls before retrieving records from any
> other completed fetches.
>
> For example, let's say the consumer has been assigned partition A, B and C
> and the max.poll.records is set to 100. Right now we have completed fetch
> A, and B. Each one has 300 records. It will take 6 poll calls to retrieve
> all record and the sequence of retrieved partitions will be: A, A, A, B, B,
> B.
>
> Ideally, it should alternate between A and B. I was proposing to move to
> the next one fetch for the next poll call based on the order in the
> completed fetch queue, so the order becomes A, B, A, B, A, B. The
> implementation parses the completed fetch only once.
>
> Thanks, CH
>
> -Original Message-
> From: Mayuresh Gharat 
> Sent: Tuesday, December 11, 2018 1:21 PM
> To: dev@kafka.apache.org
> Subject: Re: [EXTERNAL] - Re: [DISCUSS] KIP-387: Fair Message Consumption
> Across Partitions in KafkaConsumer
>
> Hi ChienHsing,
>
> Thanks for the KIP.
> It would be great if you can explain with an example, what you mean by "
> Currently the implementation will return available records starting from
> the last partition the last poll call retrieves records from. This leads to
> unfair patterns of record consumption from multiple partitions."
>
> The KafkaConsumer sends fetch requests to multiple brokers, gets the
> corresponding responses, and puts them into a single queue of
> CompletedFetches. It then iterates over this completed-fetches queue and
> peels off a number of records = max.poll.records from each completedFetch
> for each poll() before moving on to the next completedFetch. Also, it does
> not send a fetch request for a TopicPartition if we already have buffered
> data (a completedFetch or nextInlineRecord) for that TopicPartition. It also
> moves the TopicPartition to the end of the assignment queue once it has
> received data from the broker for that TopicPartition, to maintain a
> round-robin fetch sequence for fairness.
>
> Thanks,
>
> Mayuresh
>
> On Tue, Dec 11, 2018 at 9:13 AM ChienHsing Wu 
> wrote:
>
> > Jason,
> >
> >
> >
> > KIP 41 was initiated by you and this KIP is to change the logic
> > discussed in the Ensure Fair Consumption<
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_KAFKA_KIP-2D41-253A-2BKafkaConsumer-2BMax-2BRecords-23KIP-2D41-3AKafkaConsumerMaxRecords-2DEnsuringFairConsumption=DwIFaQ=ZgVRmm3mf2P1-XDAyDsu4A=Az03wMrbL9ToLW0OFyo3wo3985rhAKPMLmmg6RA3V7I=jeijHrRehjaysSML7ZSVlVEepS5LWchozwVVbwp7TLA=warXH2nttWvhdQhn-oSZuBYfZ_V2OY5ikbksVMzbt9o=
> >.
> > Your input on KIP-387<
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_
> > confluence_display_KAFKA_KIP-2D387-253A-2BFair-2BMessage-2BConsumption
> > -2BAcross-2BPartitions-2Bin-2BKafkaConsumer=DwIFaQ=ZgVRmm3mf2P1-XD
> > AyDsu4A=Az03wMrbL9ToLW0OFyo3wo3985rhAKPMLmmg6RA3V7I=jeijHrRehjaysS
> > ML7ZSVlVEepS5LWchozwVVbwp7TLA=Ptfb85HFvz0TqKSju21-_uV-U_0_HHNlnNf0kT
> > tRlgk=>
> > would be very valuable.
> >
> >
> >
> > Thanks, ChienHsing
> >
> >
> >
> > -Original Message-
> > From: ChienHsing Wu 
> > Sent: Tuesday, December 04, 2018 11:43 AM
> > To: dev@kafka.apache.org
> > Subject: RE: [EXTERNAL] - Re: [DISCUSS] KIP-387: Fair Message
> > Consumption Across Partitions in KafkaConsumer
> >
> >
> >
> > Hi,
> >
> >
> >
> > Any comments/updates? I am not sure the next steps if no one has any
> > further comments.
> >
> >
> >
> > Thanks, CH
> >
> >
> >
> > -Original Message-
> >
> 

Re: [EXTERNAL] - Re: [DISCUSS] KIP-387: Fair Message Consumption Across Partitions in KafkaConsumer

2018-12-11 Thread Mayuresh Gharat
Hi ChienHsing,

Thanks for the KIP.
It would be great if you can explain with an example, what you mean by "
Currently the implementation will return available records starting from
the last partition the last poll call retrieves records from. This leads to
unfair patterns of record consumption from multiple partitions."

The KafkaConsumer sends fetch requests to multiple brokers, gets the
corresponding responses, and puts them into a single queue of
CompletedFetches. It then iterates over this completed-fetches queue and
peels off a number of records = max.poll.records from each completedFetch for
each poll() before moving on to the next completedFetch. Also, it does not
send a fetch request for a TopicPartition if we already have buffered data
(a completedFetch or nextInlineRecord) for that TopicPartition. It also moves
the TopicPartition to the end of the assignment queue once it has received
data from the broker for that TopicPartition, to maintain a round-robin fetch
sequence for fairness.
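
For anyone who wants to see that batching behaviour directly, here is a small
probe sketch (topic, group id and broker address are made up) that prints
which partitions each poll() returns; with a couple of pre-filled partitions
and max.poll.records=100 you can watch one completedFetch being drained over
several polls before records from the next partition show up.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class PollOrderProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "poll-order-probe");
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("probe-topic"));
            for (int i = 0; i < 10; i++) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                Set<String> partitions = new TreeSet<>();
                for (ConsumerRecord<String, String> r : records) {
                    partitions.add(r.topic() + "-" + r.partition());
                }
                System.out.println("poll " + i + ": " + records.count()
                        + " records from " + partitions);
            }
        }
    }
}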

Thanks,

Mayuresh

On Tue, Dec 11, 2018 at 9:13 AM ChienHsing Wu  wrote:

> Jason,
>
>
>
> KIP 41 was initiated by you and this KIP is to change the logic discussed
> in the Ensure Fair Consumption<
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-41%3A+KafkaConsumer+Max+Records#KIP-41:KafkaConsumerMaxRecords-EnsuringFairConsumption>.
> Your input on KIP-387<
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-387%3A+Fair+Message+Consumption+Across+Partitions+in+KafkaConsumer>
> would be very valuable.
>
>
>
> Thanks, ChienHsing
>
>
>
> -Original Message-
> From: ChienHsing Wu 
> Sent: Tuesday, December 04, 2018 11:43 AM
> To: dev@kafka.apache.org
> Subject: RE: [EXTERNAL] - Re: [DISCUSS] KIP-387: Fair Message Consumption
> Across Partitions in KafkaConsumer
>
>
>
> Hi,
>
>
>
> Any comments/updates? I am not sure the next steps if no one has any
> further comments.
>
>
>
> Thanks, CH
>
>
>
> -Original Message-
>
> From: ChienHsing Wu mailto:chien...@opentext.com>>
>
> Sent: Tuesday, November 20, 2018 2:46 PM
>
> To: dev@kafka.apache.org
>
> Subject: RE: [EXTERNAL] - Re: [DISCUSS] KIP-387: Fair Message Consumption
> Across Partitions in KafkaConsumer
>
>
>
> Hi Matt,
>
>
>
> Thanks for the feedback.
>
>
>
> The issue with the current design is that it stays on the previous
> partition even if the last poll call consumes the max.poll.records; it will
> consume all records in that partition available at the consumer side to
> serve multiple poll calls before moving to the next partition.
>
>
>
> Introducing another threshold at partition level will decrease the number
> of records consumed in one partition within one poll call but will still
> use that same partition as the starting one in the next poll call.
>
>
>
> The same effect can be achieved by setting max.poll.records to 100 I
> believe. The main difference is that the client will need to make more poll
> calls when that value is set to 100, and because of the non-blocking nature
> I believe the cost of extra poll calls are not significant.
>
>
>
> Further thoughts?
>
>
>
> Thanks, CH
>
>
>
> -Original Message-
>
> From: Matt Farmer mailto:m...@frmr.me>>
>
> Sent: Monday, November 19, 2018 9:32 PM
>
> To: dev@kafka.apache.org
>
> Subject: [EXTERNAL] - Re: [DISCUSS] KIP-387: Fair Message Consumption
> Across Partitions in KafkaConsumer
>
>
>
> Hi there,
>
>
>
> Thanks for the KIP.
>
>
>
> We’ve run into issues with this at Mailchimp so something to address
> consuming behavior would save us from having to always ensure we’re running
> enough consumers that each consumer has only one partition (which is our
> usual MO).
>
>
>
> I wonder though if it would be simpler and more powerful to define the
> maximum number of records the consumer should pull from one partition
> before pulling some records from another?
>
>
>
> So if you set max.poll.records to 500 and then some new setting,
> max.poll.records.per.partition, to 100 then the Consumer would switch what
> partition it reads from every 100 records - looping back around to the
> first partition that had records if there aren’t 5 or more partitions with
> records.
>
>
>
> What do you think?
>
>
>
> On Mon, Nov 19, 2018 at 9:11 AM ChienHsing Wu  > wrote:
>
>
>
> > Hi, could anyone please review this KIP?
>
> >
>
> > Thanks, ChienHsing
>
> >
>
> > From: ChienHsing Wu
>
> > Sent: Friday, November 09, 2018 1:10 PM
>
> > To: dev@kafka.apache.org
>
> > Subject: RE: [DISCUSS] KIP-387: Fair Message Consumption Across
>
> > Partitions in KafkaConsumer
>
> >
>
> > Just to check: Will anyone review this? It's been silent for a week...
>
> > Thanks, ChienHsing
>
> >
>
> > From: ChienHsing Wu
>
> > Sent: Monday, November 05, 2018 4:18 PM
>
> > To: 'dev@kafka.apache.org' 
> > dev@kafka.apache.org>>
>
> > Subject: [DISCUSS] KIP-387: Fair Message 

Re: Kafka client Metadata update on demand?

2018-12-10 Thread Mayuresh Gharat
Hi Ming,

Kafka clients do update their metadata on NotLeaderForPartitionException.
The metadata update happens asynchronously.

Also, if you are getting this exception for a long time, it might mean
that your client is fetching metadata from a broker whose metadata cache is
not updated with the latest metadata (the broker has not yet processed the
UpdateMetadataRequest from the controller).
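
As a reference, here is a sketch of the producer-side settings that bound how
long stale metadata can linger (the values are only illustrative): retries
let a send survive the not-leader error while the asynchronous refresh
completes, and metadata.max.age.ms forces a periodic refresh even when no
errors occur.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class MetadataRefreshConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                StringSerializer.class.getName());
        // Retries give the asynchronous metadata refresh time to pick up the new
        // leader after a NotLeaderForPartitionException instead of failing the send.
        props.put(ProducerConfig.RETRIES_CONFIG, "10");
        props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, "200");
        // Upper bound on how old cached metadata may get before a forced refresh.
        props.put(ProducerConfig.METADATA_MAX_AGE_CONFIG, "60000");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("some-topic", "key", "value"));
        }
    }
}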

Thanks,

Mayuresh

On Mon, Dec 10, 2018 at 10:39 AM Ming Liu  wrote:

> Hey community,
> It seems Kafka Metadata update only happens at the pre-configured
> Metadata update interval.  During upgrade, when leader changes, the client
> will fail with NotLeaderForPartitionException until next Metadata update
> happened.  I wonder why we don't have the on-demand Metadata update (that
> is, have Metadata update when there is exception like
> NotLeaderForPartitionException)?
>
> Thanks!
> Ming
>


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [VOTE] KIP-345: Introduce static membership protocol to reduce consumer rebalances

2018-12-05 Thread Mayuresh Gharat
+1 (non-binding)

Thanks,

Mayuresh

On Tue, Dec 4, 2018 at 6:58 AM Mike Freyberger 
wrote:

> +1 (non binding)
>
> On 12/4/18, 9:43 AM, "Patrick Williams" 
> wrote:
>
> Pls take me off this VOTE list
>
> Best,
>
> Patrick Williams
>
> Sales Manager, UK & Ireland, Nordics & Israel
> StorageOS
> +44 (0)7549 676279
> patrick.willi...@storageos.com
>
> 20 Midtown
> 20 Proctor Street
> Holborn
> London WC1V 6NX
>
> Twitter: @patch37
> LinkedIn: linkedin.com/in/patrickwilliams4 <
> http://linkedin.com/in/patrickwilliams4>
>
> https://slack.storageos.com/
>
>
>
> On 03/12/2018, 17:34, "Guozhang Wang"  wrote:
>
> Hello Boyang,
>
> I've browsed through the new wiki and there are still a couple of
> minor
> things to notice:
>
> 1. RemoveMemberFromGroupOptions seems not defined anywhere.
>
> 2. LeaveGroupRequest added a list of group instance id, but still
> keep the
> member id as a singleton; is that intentional? I think to make the
> protocol
> consistent both member id and instance ids could be plural.
>
> 3. About the *kafka-remove-member-from-group.sh *tool, I'm
> wondering if we
> can defer adding this while just add the corresponding calls of the
> LeaveGroupRequest inside Streams until we have used it in
> production and
> hence have a better understanding on how flexible or extensible if
> we want
> to add any cmd tools. The rationale is that if we do not
> necessarily need
> it now, we can always add it later with a more think-through API
> design,
> but if we add the tool in a rush, we may need to extend or modify
> it soon
> after we realize its limits in operations.
>
> Otherwise, I'm +1 on the proposal.
>
> Guozhang
>
>
> On Mon, Dec 3, 2018 at 9:14 AM Boyang Chen 
> wrote:
>
> > Hey community friends,
> >
> > after another month of polishing, KIP-345<
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances
> >
> > design is ready for vote. Feel free to add your comment on the
> discussion
> > thread or here.
> >
> > Thanks for your time!
> >
> > Boyang
> > 
> > From: Boyang Chen 
> > Sent: Friday, November 9, 2018 6:35 AM
> > To: dev@kafka.apache.org
> > Subject: [VOTE] KIP-345: Introduce static membership protocol to
> reduce
> > consumer rebalances
> >
> > Hey all,
> >
> >
> > thanks so much for all the inputs on KIP-345 so far. The
> original proposal
> > has enhanced a lot with your help. To make sure the
> implementation go
> > smoothly without back and forth, I would like to start a vote on
> the final
> > design agreement now:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-<
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances
> > >
> >
> >
> 345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances<
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances
> > >
> >
> > KIP-345: Introduce static membership protocol to reduce ...<
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances
> > >
> > cwiki.apache.org
> > For stateful applications, one of the biggest performance
> bottleneck is
> > the state shuffling. In Kafka consumer, there is a concept called
> > "rebalance" which means that for given M partitions and N
> consumers in one
> > consumer group, Kafka will try to balance the load between
> consumers and
> > ideally have ...
> >
> >
> > Let me know if you have any questions.
> >
> >
> > Best,
> >
> > Boyang
> >
> >
>
> --
> -- Guozhang
>
>
>
>
>

-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [VOTE] KIP-394: Require member.id for initial join group request

2018-12-05 Thread Mayuresh Gharat
+1 (non-binding)

Thanks,

Mayuresh


On Wed, Dec 5, 2018 at 3:59 AM Boyang Chen  wrote:

> Hey friends,
>
> I would like to start a vote for KIP-394<
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-394%3A+Require+member.id+for+initial+join+group+request>.
> The goal of this KIP is to improve broker stability by fencing invalid join
> group requests.
>
> Best,
> Boyang
>
>

-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-394: Require member.id for initial join group request

2018-12-05 Thread Mayuresh Gharat
Thanks for the KIP Boyang and great to see the progress on solving the
rebalance issues with this and KIP-345.

Thanks,

Mayuresh

On Mon, Dec 3, 2018 at 4:57 AM Stanislav Kozlovski 
wrote:

> Everything sounds good to me.
>
> On Sun, Dec 2, 2018 at 1:24 PM Boyang Chen  wrote:
>
> > In fact, it's probably better to move KIP-394<
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-394%3A+Require+member.id+for+initial+join+group+request
> >
> > to the vote stage first, so that it's easier to finalize the timeline and
> > smooth the rollout plan for KIP-345. Jason and Stanislav, since you two
> > involve most in this KIP, could you let me know if there is still any
> > unclarity we want to resolve before moving to vote?
> >
> > Best,
> > Boyang
> > 
> > From: Boyang Chen 
> > Sent: Saturday, December 1, 2018 10:53 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-394: Require member.id for initial join group
> > request
> >
> > Thanks Jason for the reply! Since the overall motivation and design is
> > pretty clear, I will go ahead to start implementation and we could
> discuss
> > the underlying details in the PR.
> >
> > Best,
> > Boyang
> > 
> > From: Matthias J. Sax 
> > Sent: Saturday, December 1, 2018 3:12 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-394: Require member.id for initial join group
> > request
> >
> > SGTM.
> >
> > On 11/30/18 10:17 AM, Jason Gustafson wrote:
> > > Using the session expiration logic we already have seems like the
> > simplest
> > > option (this is probably a one or two line change). The rejoin should
> be
> > > quick anyway, so I don't think it's worth optimizing for unjoined new
> > > members. Just my two cents. This is more of an implementation detail,
> so
> > > need not necessarily be resolved here.
> > >
> > > -Jason
> > >
> > > On Fri, Nov 30, 2018 at 12:56 AM Boyang Chen 
> > wrote:
> > >
> > >> Thanks Matthias for the question. I'm thinking of having a separate
> hash
> > >> set called `registeredMemberIds` which
> > >> will be cleared out every time a group finishes one round of
> rebalance.
> > >> Since storing one id is pretty trivial, using
> > >> purgatory to track the id removal is a bit wasteful in my opinion.
> > >> 
> > >> From: Matthias J. Sax 
> > >> Sent: Friday, November 30, 2018 10:26 AM
> > >> To: dev@kafka.apache.org
> > >> Subject: Re: [DISCUSS] KIP-394: Require member.id for initial join
> > group
> > >> request
> > >>
> > >> Thanks! Makes sense.
> > >>
> > >> I missed that fact, that the `member.id` is added on the second
> > >> joinGroup request that contains the `member.id`.
> > >>
> > >> However, it seems there is another race condition for this design:
> > >>
> > >> If two consumers join at the same time, it it possible that the broker
> > >> assigns the same `member.id` to both (because none of them have
> joined
> > >> the group yet--ie, second joinGroup request not sent yet--, the
> > >> `member.id` is not store broker side yes and broker cannot check for
> > >> duplicates when creating a new `member.id`.
> > >>
> > >> The probability might be fairly low thought. However, what Stanislav
> > >> proposed, to add the `member.id` directly, and remove it after
> > >> `session.timeout.ms` sound like a save option that avoids this issue.
> > >>
> > >> Thoughts?
> > >>
> > >>
> > >> -Matthias
> > >>
> > >> On 11/28/18 8:15 PM, Boyang Chen wrote:
> > >>> Thanks Matthias for the question, and Stanislav for the explanation!
> > >>>
> > >>> For the scenario described, we will never let a member join the
> > >> GroupMetadata map
> > >>> if it uses UNKNOWN_MEMBER_ID. So the workflow will be like this:
> > >>>
> > >>>   1.  Group is empty. Consumer c1 started. Join with
> UNKNOWN_MEMBER_ID;
> > >>>   2.  Broker rejects while allocating a member.id to c1 in response
> > (c1
> > >> protocol version is current);
> > >>>   3.  c1 handles the error and rejoins with assigned member.id;
> > >>>   4.  Broker stores c1 in its group metadata;
> > >>>   5.  Consumer c2 started. Join with UNKNOWN_MEMBER_ID;
> > >>>   6.  Broker rejects while allocating a member.id to c2 in response
> > (c2
> > >> protocol version is current);
> > >>>   7.  c2 fails to get the response/crashes in the middle;
> > >>>   8.  After certain time, c2 restarts a join request with
> > >> UNKNOWN_MEMBER_ID;
> > >>>
> > >>> As you could see, c2 will repeat step 6~8 until successfully send
> back
> > a
> > >> join group request with allocated id.
> > >>> By then broker will include c2 within the broker metadata map.
> > >>>
> > >>> Does this sound clear to you?
> > >>>
> > >>> Best,
> > >>> Boyang
> > >>> 
> > >>> From: Stanislav Kozlovski 
> > >>> Sent: Wednesday, November 28, 2018 7:39 PM
> > >>> To: dev@kafka.apache.org
> > >>> Subject: Re: [DISCUSS] KIP-394: Require member.id for initial join
> > >> group request
> > >>>
> > 

Re: [VOTE] KIP-345: Introduce static membership protocol to reduce consumer rebalances

2018-12-03 Thread Mayuresh Gharat
Hi Folks,

Would it be good to move this to the DISCUSS thread and keep this thread
only for voting purposes? Otherwise it will be hard to coordinate responses
between the 2 threads.

Thanks,

Mayuresh



On Mon, Dec 3, 2018 at 5:43 PM Boyang Chen  wrote:

> Thanks Guozhang for the reply!
>
> 1. RemoveMemberFromGroupOptions seems not defined anywhere.
> Added the definition.
> 2. LeaveGroupRequest added a list of group instance id, but still keep the
> member id as a singleton; is that intentional? I think to make the protocol
> consistent both member id and instance ids could be plural.
> Since a dynamic member would send LeaveGroupRequest with its member.id,
> I feel it's ok to keep the existing API instead of expanding singleton to
> a list. Haven't
> been able to define a scenario where we need to pass a list of `member.id
> `.
> What do you think?
>
> 3. About the *kafka-remove-member-from-group.sh *tool, I'm wondering if we
> can defer adding this while just add the corresponding calls of the
> LeaveGroupRequest inside Streams until we have used it in production and
> hence have a better understanding on how flexible or extensible if we want
> to add any cmd tools. The rationale is that if we do not necessarily need
> it now, we can always add it later with a more think-through API design,
> but if we add the tool in a rush, we may need to extend or modify it soon
> after we realize its limits in operations.
> Totally agree. I moved this part to the future work, because tooling
> options could be addressed
> in a separate KIP and a universally favorable solution could be discussed
> independently (for different
> company setup)
>
> Best,
> Boyang
>
> 
> From: Guozhang Wang 
> Sent: Tuesday, December 4, 2018 1:27 AM
> To: dev
> Subject: Re: [VOTE] KIP-345: Introduce static membership protocol to
> reduce consumer rebalances
>
> Hello Boyang,
>
> I've browsed through the new wiki and there are still a couple of minor
> things to notice:
>
> 1. RemoveMemberFromGroupOptions seems not defined anywhere.
>
> 2. LeaveGroupRequest added a list of group instance id, but still keep the
> member id as a singleton; is that intentional? I think to make the protocol
> consistent both member id and instance ids could be plural.
>
> 3. About the *kafka-remove-member-from-group.sh *tool, I'm wondering if we
> can defer adding this while just add the corresponding calls of the
> LeaveGroupRequest inside Streams until we have used it in production and
> hence have a better understanding on how flexible or extensible if we want
> to add any cmd tools. The rationale is that if we do not necessarily need
> it now, we can always add it later with a more think-through API design,
> but if we add the tool in a rush, we may need to extend or modify it soon
> after we realize its limits in operations.
>
> Otherwise, I'm +1 on the proposal.
>
> Guozhang
>
>
> On Mon, Dec 3, 2018 at 9:14 AM Boyang Chen  wrote:
>
> > Hey community friends,
> >
> > after another month of polishing, KIP-345<
> >
> https://nam05.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-345%253A%2BIntroduce%2Bstatic%2Bmembership%2Bprotocol%2Bto%2Breduce%2Bconsumer%2Brebalancesdata=02%7C01%7C%7C94b56c977f3647e1141908d659458c8c%7C84df9e7fe9f640afb435%7C1%7C0%7C636794552568572008sdata=LiNnhFJm8Avri26aEBa3q4%2Fr4aRKVIrZzKHzn71U3Xk%3Dreserved=0
> >
> > design is ready for vote. Feel free to add your comment on the discussion
> > thread or here.
> >
> > Thanks for your time!
> >
> > Boyang
> > 
> > From: Boyang Chen 
> > Sent: Friday, November 9, 2018 6:35 AM
> > To: dev@kafka.apache.org
> > Subject: [VOTE] KIP-345: Introduce static membership protocol to reduce
> > consumer rebalances
> >
> > Hey all,
> >
> >
> > thanks so much for all the inputs on KIP-345 so far. The original
> proposal
> > has enhanced a lot with your help. To make sure the implementation go
> > smoothly without back and forth, I would like to start a vote on the
> final
> > design agreement now:
> >
> >
> https://nam05.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-data=02%7C01%7C%7C94b56c977f3647e1141908d659458c8c%7C84df9e7fe9f640afb435%7C1%7C0%7C636794552568572008sdata=GwbfkDFkY2m38V2e%2B6bEWU7PKWPoia5Hw6KmdOXrdcs%3Dreserved=0
> <
> >
> https://nam05.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-345%253A%2BIntroduce%2Bstatic%2Bmembership%2Bprotocol%2Bto%2Breduce%2Bconsumer%2Brebalancesdata=02%7C01%7C%7C94b56c977f3647e1141908d659458c8c%7C84df9e7fe9f640afb435%7C1%7C0%7C636794552568572008sdata=LiNnhFJm8Avri26aEBa3q4%2Fr4aRKVIrZzKHzn71U3Xk%3Dreserved=0
> > >
> >
> >
> 345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances<
> >
> 

Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2018-12-01 Thread Mayuresh Gharat
Hi Boyang,

Sounds good to me. We can keep this first version simple.
I can also start working on a follow-up KIP for hot backup and cold backup in
the meantime, while this KIP goes in once finalized, if it's OK with you.

Thanks,

Mayuresh

On Fri, Nov 30, 2018 at 9:31 PM Boyang Chen  wrote:

> Thanks Guozhang and Mayuresh for the followup here.
> > Also I was thinking if we can have a replace API, that takes in a map of
> > old to new instance Ids. Such that we can replace a consumer.
> > IF we have this api, and if a consumer host goes down due to hardware
> > issues, we can have another host spin up and take its place. This is
> like a
> > cold backup which can be a step towards providing the hot backup that we
> > discussed earlier in the KIP.
> I like Mayuresh's suggestion, and I think we could prepare follow-up work
> once 345 is done to add a replace API. For the
> very first version I feel this is not a must-have.
>
> For Streams, I think we do not need an extra config for the instance id,
> instead, we can re-use the way we construct the embedded consumer's client
> id as:
>
> [streams client-id] + "-StreamThread-" + [thread-id] + "-consumer"
>
> So as long as user's specify the unique streams client-id, the resulted
> consumer client-id / instance-id should be unique as well already.
> So Guozhang you mean stream will enable static membership automatically
> correct? That would make the logic simpler
> and fewer code change on stream side.
>
> As for the LeaveGroupRequest, as I understand it, your concern is that when
> we are shutting down a single Streams instance that may contain multiple
> threads, shutting down that instance would mean shutting down multiple
> members. Personally I'd prefer to make the LeaveGroupRequest API more
> general and less inclined to Streams (I think Mayuresh also suggested
> this). So I'd suggest that we keep the LeaveGroupRequest API as suggested,
> i.e. a list of member.instance.ids. And in Streams we can add a new API in
> KafkaStreams to expose:
>
> 1) the list of embedded consumer / producer client ids,
> 2) the producer's txn ids if EOS is turned on, and
> 3) the consumer's instance ids.
> I agree with the suggestion to make the leave group request change
> generic. So this new Stream API
> will be added on the rest layer to expose the necessary ids correct?
>
> Looking forward to your confirmation 
>
> Best,
> Boyang
>
> 
> From: Guozhang Wang 
> Sent: Saturday, December 1, 2018 7:00 AM
> To: dev
> Subject: Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by
> specifying member id
>
> Hi Boyang,
>
> For Streams, I think we do not need an extra config for the instance id,
> instead, we can re-use the way we construct the embedded consumer's client
> id as:
>
> [streams client-id] + "-StreamThread-" + [thread-id] + "-consumer"
>
> So as long as user's specify the unique streams client-id, the resulted
> consumer client-id / instance-id should be unique as well already.
>
> As for the LeaveGroupRequest, as I understand it, your concern is that when
> we are shutting down a single Streams instance that may contain multiple
> threads, shutting down that instance would mean shutting down multiple
> members. Personally I'd prefer to make the LeaveGroupRequest API more
> general and less inclined to Streams (I think Mayuresh also suggested
> this). So I'd suggest that we keep the LeaveGroupRequest API as suggested,
> i.e. a list of member.instance.ids. And in Streams we can add a new API in
> KafkaStreams to expose:
>
> 1) the list of embedded consumer / producer client ids,
> 2) the producer's txn ids if EOS is turned on, and
> 3) the consumer's instance ids.
>
> So that Streams operators can read those values from KafkaStreams directly
> before shutting it down and use the list in the LeaveGroupRequest API. How
> about that?
>
>
> Guozhang
>
>
> On Fri, Nov 30, 2018 at 7:45 AM Mayuresh Gharat <
> gharatmayures...@gmail.com>
> wrote:
>
> > I like Guozhang's suggestion to not have to wait for session timeout in
> > case we know that we want to downsize the consumer group and redistribute
> > the partitions among the remaining consumers.
> > IIUC, with the above suggestions, the admin api
> > "removeMemberFromGroup(groupId, list[instanceId])" or
> > "removeMemberFromGroup(groupId, instanceId)", will automatically cause a
> > rebalance, right?
> > I would prefer ist[instanceid] because that's more general scenario.
> >
> > Also I was thinking if we can have a replace API, that takes in a map of
> > old to new instance 

Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2018-11-30 Thread Mayuresh Gharat
> > "ids" and hence it sounds more consistent, plus on the producer side we
> > have a `transactional.id` whose semantics is a bit similar to this one,
> > i.e. for unique distinguishment of a client which may comes and goes but
> > need to be persist over multiple "instance life-times".
> > Sure we have enough votes for ids I will finalize the name to `
> > group.instance.id`, does that
> > sound good?
> >
> > Best,
> > Boyang
> > 
> > From: Guozhang Wang 
> > Sent: Wednesday, November 28, 2018 4:51 AM
> > To: dev
> > Subject: Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by
> > specifying member id
> >
> > Regarding Jason's question and Boyang's responses:
> >
> > 2) I once have a discussion about the LeaveGroupRequest for static
> members,
> > and the reason for not having it for static members is that we'd need to
> > make it a configurable behavior as well (i.e. the likelihood that a
> static
> > member may shutdown but come back later may be even larger than the
> > likelihood that a shutdown static member would not come back), and when a
> > shutdown is complete the instance cannot tell whether or not it will come
> > back by itself. And hence letting a third party (think: admin used by K8s
> > plugins) issuing a request to indicate static member changes would be
> more
> > plausible.
> >
> > I think having an optional list of all the static members that are still
> in
> > the group, rather than the members to be removed since the latter looks a
> > bit less flexible to me, in the request is a good idea (remember we
> allow a
> > group to have both static and dynamic members at the same time, so when
> > receiving the request, we will only do the diff and add / remove the
> static
> > members directly only, while still let the dynamic members to try to
> > re-join the group with the rebalance timeout).
> >
> > 3) personally I favor "ids" over "names" :) Since we already have some
> > "ids" and hence it sounds more consistent, plus on the producer side we
> > have a `transactional.id` whose semantics is a bit similar to this one,
> > i.e. for unique distinguishment of a client which may comes and goes but
> > need to be persist over multiple "instance life-times".
> >
> >
> > Guozhang
> >
> >
> > On Tue, Nov 27, 2018 at 10:00 AM Mayuresh Gharat <
> > gharatmayures...@gmail.com>
> > wrote:
> >
> > > Hi Boyang,
> > >
> > > Thanks for the replies. Please find the follow up queries below.
> > >
> > > 5. Regarding "So in summary, *the member will only be removed due
> to
> > > session timeout*. We shall remove it from both in-memory static member
> > name
> > > mapping and member list." If the rebalance is invoked manually using
> the
> > > the admin apis, how long should the group coordinator wait for the
> > members
> > > of the group to send a JoinGroupRequest for participating in the
> > rebalance?
> > > How is a lagging consumer handled?
> > > The plan is to disable member kick out when rebalance.timeout is
> reached,
> > > so basically we are not "waiting" any
> > > join group request from existing members; we shall just rebalance base
> on
> > > what we currently have within the group
> > > metadata. Lagging consumer will trigger rebalance later if session
> > timeout
> > > > rebalance timeout.
> > >
> > > >
> > > Just wanted to understand this better. Lets take an example, say we
> have
> > a
> > > > consumer group "GroupA" with 4 consumers  c1, c2, c3, c4.
> > > > Everything is running fine and suddenly C4 host has issues and it
> goes
> > > > down. Now we notice that we can still operate with c1, c2, c3 and
> don't
> > > > want to wait for
> > > > c4 to come back up. We use the admin api
> > > > "invokeConsumerRebalance("GroupA")".
> > > > Now the GroupCoordinator, will ask the members c1, c2, c3 to join the
> > > > group again (in there heartBeatResponse) as first step of rebalance.
> > > > Now lets say that c1, c2 immediately send a joinGroupRequest but c3
> is
> > > > delayed. At this stage, if we are not "waiting" on any join group
> > > request,
> > > > few things can happen :
> > > >
> > > >- c4's partitions are distributed only among c1,c2. c3 maintains
> its

Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2018-11-27 Thread Mayuresh Gharat
immediately. How does this plan sound?
>
> 3. I've been holding back on mentioning this, but I think we should
> reconsider the name `member.name`. I think we want something that suggests
> its expectation of uniqueness in the group. How about `group.instance.id`
> to go along with `group.id`?
>
> Yea, Dong and Stanislav also mentioned this naming. I personally buy in
> the namespace idea, and
>
> since we already use `member.name` in a lot of context, I decide to
> rename the config to `group.member.name`
>
> which should be sufficient for solving all the concerns we have now.
> Sounds good?
>
>
> Thank you for your great suggestions! Let me know if my reply makes sense
> her.
>
>
> Best,
>
> Boyang
>
> 
> From: Jason Gustafson 
> Sent: Tuesday, November 27, 2018 7:51 AM
> To: dev
> Subject: Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by
> specifying member id
>
> Hi Boyang,
>
> Thanks for the updates. Looks like we're headed in the right direction and
> clearly the interest that this KIP is receiving shows how strong the
> motivation is!
>
> I have a few questions:
>
> 1. This may be the same thing that Mayuresh is asking about. I think the
> suggestion in the KIP is that if a consumer sends JoinGroup with a member
> name, but no member id, then we will return the current member id
> associated with that name. It seems in this case that we wouldn't be able
> to protect from having two consumers active with the same configured
> member.name? For example, imagine that we had a consumer with member.name
> =A
> which is assigned member.id=1. Suppose it becomes a zombie and a new
> instance starts up with member.name=A. If it is also assigned member.id=1,
> then how can we detect the zombie if it comes back to life? Both instances
> will have the same member.id.
>
> The goal is to avoid a rebalance on a rolling restart, but we still need to
> fence previous members. I am wondering if we can generate a new member.id
> every time we receive a request from a static member with an unknown member
> id. If the old instance with the same member.name attempts any operation,
> then it will be fenced with an UNKNOWN_MEMBER_ID error. As long as the
> subscription of the new instance hasn't changed, then we can skip the
> rebalance and return the current assignment without forcing a rebalance.
>
> The trick to making this work is in the error handling of the zombie
> consumer. If the zombie simply resets its member.id and rejoins to get a
> new one upon receiving the UNKNOWN_MEMBER_ID error, then it would end up
> fencing the new member. We want to avoid this. There needs to be an
> expectation for static members that the member.id of a static member will
> not be changed except when a new member with the same member.name joins
> the
> group. Then we can treat UNKNOWN_MEMBER_ID as a fatal error for consumers
> with static member names.
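A minimal, self-contained sketch of the fencing rule described in the previous paragraph. This is illustrative Java pseudocode, not actual Kafka coordinator code; the in-memory map and the id format are assumptions made only for the example:

    import java.util.Map;
    import java.util.UUID;
    import java.util.concurrent.ConcurrentHashMap;

    // Illustrative sketch only (not actual Kafka coordinator code) of the fencing
    // rule above; the in-memory map and the id format are assumptions.
    public class StaticMemberFencingSketch {
        public static final String UNKNOWN_MEMBER_ID = "";

        // member.name -> currently valid member.id
        private final Map<String, String> staticMembers = new ConcurrentHashMap<>();

        public String onStaticMemberJoin(String memberName, String presentedMemberId) {
            if (UNKNOWN_MEMBER_ID.equals(presentedMemberId)) {
                // New incarnation of the static member: mint a fresh id. Any older
                // incarnation still holding the previous id is fenced from now on.
                String freshId = memberName + "-" + UUID.randomUUID();
                staticMembers.put(memberName, freshId);
                return freshId;
            }
            String currentId = staticMembers.get(memberName);
            if (!presentedMemberId.equals(currentId)) {
                // Stale id: reject. Per the discussion, a static member should treat
                // this as fatal instead of resetting its id and rejoining, otherwise
                // it would fence the newer instance.
                throw new IllegalStateException("Fenced member.id for static member "
                        + memberName + " (UNKNOWN_MEMBER_ID error)");
            }
            // Known member rejoining with its current id: the existing assignment can
            // be returned without forcing a rebalance if the subscription is unchanged.
            return currentId;
        }
    }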
>
> 2. The mechanics of the ConsumerRebalance API seem unclear to me. As far as
> I understand it, it is used for scaling down a consumer group and somehow
> bypasses normal session timeout expiration. I am wondering how critical
> this piece is and whether we can leave it for future work. If not, then it
> would be helpful to elaborate on its implementation. How would the
> coordinator know which members to kick out of the group?
>
> 3. I've been holding back on mentioning this, but I think we should
> reconsider the name `member.name`. I think we want something that suggests
> its expectation of uniqueness in the group. How about `group.instance.id`
> to go along with `group.id`?
>
> Thanks,
> Jason
>
>
>
> On Mon, Nov 26, 2018 at 10:18 AM Mayuresh Gharat <
> gharatmayures...@gmail.com>
> wrote:
>
> > Hi Boyang,
> >
> > Thanks a lot for replying to all the queries and discussions here, so
> > patiently.
> > Really appreciate it.
> >
> > Had a few questions and suggestions after rereading the current version
> of
> > the KIP :
> >
> >
> >1. Do you intend to have member.id is a static config like
> member.name
> >after KIP-345 and KIP-394?
> >2. Regarding "On client side, we add a new config called MEMBER_NAME
> in
> >ConsumerConfig. On consumer service init, if the MEMBER_NAME config is
> > set,
> >we will put it in the initial join group request to identify itself
> as a
> >static member (static membership); otherwise, we will still send
> >UNKNOWN_MEMBER_ID to ask broker for allocating a new random ID
> (dynamic
> >membership)."
> >   - What is the value of member_id sent in the first JoinGroupRequest

Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication

2018-11-26 Thread Mayuresh Gharat
Hi Edoardo,

Thanks a lot for the KIP.
 I have a few questions/suggestions in addition to what Radai has mentioned
above :

   1. Is this meant only for 1:1 replication, for example one Kafka cluster
   replicating to another, instead of having multiple Kafka clusters mirroring
   into one Kafka cluster?
   2. Are we relying on exactly-once produce in the replicator? If not, how
   are retries handled in the replicator?
   3. What is the recommended value for in-flight requests here? Is it
   supposed to be strictly 1? If yes, it would be great to mention that in the
   KIP.
   4. How is unclean leader election between the source cluster and destination
   cluster handled?
   5. How are offset resets for the replicator's consumer handled?
   6. It would be good to explain the workflow in the KIP, with an
   example, regarding how this KIP will change the replication scenario and
   how it will benefit the consumer apps (a rough sketch of the replication
   loop follows below).
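To make question 6 concrete, here is a minimal sketch of the mirroring loop this KIP targets, using only the existing client APIs. The cluster addresses, group id and topic name are placeholders, and the point where the source offset would be attached is exactly what KIP-391 defines, so it is only marked in a comment:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class ReplicatorSketch {
        public static void main(String[] args) {
            Properties consumerProps = new Properties();
            consumerProps.put("bootstrap.servers", "source-cluster:9092");      // placeholder address
            consumerProps.put("group.id", "replicator");
            consumerProps.put("key.deserializer",
                    "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            consumerProps.put("value.deserializer",
                    "org.apache.kafka.common.serialization.ByteArrayDeserializer");

            Properties producerProps = new Properties();
            producerProps.put("bootstrap.servers", "destination-cluster:9092"); // placeholder address
            // Question 3 above: ordering on retry matters for offset preservation.
            producerProps.put("max.in.flight.requests.per.connection", "1");
            producerProps.put("key.serializer",
                    "org.apache.kafka.common.serialization.ByteArraySerializer");
            producerProps.put("value.serializer",
                    "org.apache.kafka.common.serialization.ByteArraySerializer");

            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(consumerProps);
                 KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(producerProps)) {
                consumer.subscribe(Collections.singletonList("topicA"));        // placeholder topic
                while (true) {
                    for (ConsumerRecord<byte[], byte[]> rec : consumer.poll(Duration.ofSeconds(1))) {
                        // Today the destination broker assigns its own offsets to this record.
                        // Under KIP-391 the replicator would additionally ask the destination to
                        // write it at rec.offset(); that request flag/field is what the KIP adds
                        // and is not available in the current producer API.
                        producer.send(new ProducerRecord<>(rec.topic(), rec.partition(),
                                rec.key(), rec.value()));
                    }
                }
            }
        }
    }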

Thanks,

Mayuresh

On Mon, Nov 26, 2018 at 8:08 AM radai  wrote:

> a few questions:
>
> 1. how do you handle possible duplications caused by the "special"
> producer timing-out/retrying? are you explicitely relying on the
> "exactly once" sequencing?
> 2. what about the combination of log compacted topics + replicator
> downtime? by the time the replicator comes back up there might be
> "holes" in the source offsets (some msgs might have been compacted
> out)? how is that recoverable?
> 3. similarly, what if you try and fire up replication on a non-empty
> source topic? does the kip allow for offsets starting at some
> arbitrary X > 0 ? or would this have to be designed from the start.
>
> and lastly, since this KIP seems to be designed fro active-passive
> failover (there can be no produce traffic except the replicator)
> wouldnt a solution based on seeking to a time offset be more generic?
> your producers could checkpoint the last (say log append) timestamp of
> records theyve seen, and when restoring in the remote site seek to
> those timestamps (which will be metadata in their committed offsets) -
> assumming replication takes > 0 time you'd need to handle some dups,
> but every kafka consumer setup needs to know how to handle those
> anyway.
> On Fri, Nov 23, 2018 at 2:27 AM Edoardo Comar  wrote:
> >
> > Hi Stanislav
> >
> > > > The flag is needed to distinguish a batch with a desired base offset
> > of
> > > 0,
> > > from a regular batch for which offsets need to be generated.
> > > If the producer can provide offsets, why not provide a base offset of
> 0?
> >
> > a regular batch (for which offsets are generated by the broker on write)
> > is sent with a base offset of 0.
> > How could you distinguish it from a batch where you *want* the first
> > record to be written at offset 0 (i.e. be the first in the partition and
> > be rejected if there are records on the log already) ?
> > We wanted to avoid a "deep" inspection (and potentially decompression) of
> > the records.
> >
> > For the replicator use case, a single produce request where all the data
> > is to be assumed with offset,
> > or all without offsets, seems to suffice,
> > So we added only a toplevel flag, not a per-topic-partition one.
> >
> > Thanks for your interest !
> > cheers
> > Edo
> > --
> >
> > Edoardo Comar
> >
> > IBM Event Streams
> > IBM UK Ltd, Hursley Park, SO21 2JN
> >
> >
> > Stanislav Kozlovski  wrote on 22/11/2018
> 22:32:42:
> >
> > > From: Stanislav Kozlovski 
> > > To: dev@kafka.apache.org
> > > Date: 22/11/2018 22:33
> > > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for
> > > Cluster Replication
> > >
> > > Hey Edo & Mickael,
> > >
> > > > The flag is needed to distinguish a batch with a desired base offset
> > of
> > > 0,
> > > from a regular batch for which offsets need to be generated.
> > > If the producer can provide offsets, why not provide a base offset of
> 0?
> > >
> > > > (I am reading your post thinking about
> > > partitions rather than topics).
> > > Yes, I meant partitions. Sorry about that.
> > >
> > > Thanks for answering my questions :)
> > >
> > > Best,
> > > Stanislav
> > >
> > > On Thu, Nov 22, 2018 at 5:28 PM Edoardo Comar 
> wrote:
> > >
> > > > Hi Stanislav,
> > > >
> > > > you're right we envision the replicator use case to have a single
> > producer
> > > > with offsets per partition (I am reading your post thinking about
> > > > partitions rather than topics).
> > > >
> > > > If a regular producer was to send its own records at the same time,
> > it's
> > > > very likely that the one sending with an offset will fail because of
> > > > invalid offsets.
> > > > Same if two producers were sending with offsets, likely both would
> > then
> > > > fail.
> > > >
> > > > > Does it make sense to *lock* the topic from other producers while
> > there
> > > > is
> > > > > one that uses offsets?
> > > >
> > > > You could do that with ACL permissions if you wanted, I don't think
> it
> > > > needs to be mandated by 

Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2018-11-26 Thread Mayuresh Gharat
Hi Boyang,

Thanks a lot for patiently replying to all the queries and discussions here.
Really appreciate it.

Had a few questions and suggestions after rereading the current version of
the KIP :


   1. Do you intend to have member.id as a static config like member.name
   after KIP-345 and KIP-394?
   2. Regarding "On client side, we add a new config called MEMBER_NAME in
   ConsumerConfig. On consumer service init, if the MEMBER_NAME config is set,
   we will put it in the initial join group request to identify itself as a
   static member (static membership); otherwise, we will still send
   UNKNOWN_MEMBER_ID to ask broker for allocating a new random ID (dynamic
   membership)."
  - What is the value of member_id sent in the first JoinGroupRequest
  when member_name is set (using static membership)? Is it UNKNOWN_MEMBER_ID?
  (A small configuration sketch follows after this list.)
   3. Regarding "we are requiring member.id (if not unknown) to match the
   value stored in cache, otherwise reply MEMBER_ID_MISMATCH. The edge case
   that if we could have members with the same `member.name` (for example
   mis-configured instances with a valid member.id but added a used member
   name on runtime). When member name has duplicates, we could refuse join
   request from members with an outdated `member.id` (since we update the
   mapping upon each join group request). In an edge case where the client
   hits this exception in the response, it is suggesting that some other
   consumer takes its spot."
  - The part of "some other consumer takes the spot" would be
  intentional, right? Also, when you say "The edge case that if we
  could have members with the same `member.name` (for example
  mis-configured instances *with a valid member.id* but added a used member
  name on runtime).", what do you mean by *valid member id* here? Does it
  mean that there exists a mapping of member.name to member.id like
  *MemberA -> id1* on the GroupCoordinator and this consumer is trying to
  join with *member.name = MemberB and member.id = id1*?
   4. Depending on your explanation for point 2 and the point 3 above
   regarding returning back MEMBER_ID_MISMATCH on having a matching
   member_name but unknown member_id, if the consumer sends "UNKNOW_MEMBER_ID"
   on the first JoinGroupRequest and relies on the GroupCoordinator to give it
   a member_id, is the consumer suppose to remember member_id for
   joinGroupRequests? If yes, how are restarts handled?
   5. Regarding "So in summary, *the member will only be removed due to
   session timeout*. We shall remove it from both in-memory static member
   name mapping and member list."
  - If the rebalance is invoked manually using the admin APIs, how
  long should the group coordinator wait for the members of the group to send
  a JoinGroupRequest for participating in the rebalance? How is a lagging
  consumer handled?
   6. Another detail to take care is that we need to automatically take the
   hash of group id so that we know which broker to send this request to.
  - I assume this should be the same as the way we find the coordinator
  today, right? If yes, should we specify it in the KIP?
   7. Are there any specific failure scenarios when you say "other
   potential failure cases."? It would be good to mention them explicitly, if
   you think there are any.
   8. It would be good to have a rollback plan as you have for roll forward
   in the KIP.
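For points 1, 2 and 4 above, here is a small sketch of how a consumer instance might opt into static membership under this proposal. The config key is a placeholder (the thread is still deciding between member.name, group.member.name and group.instance.id), and the broker address and group name are invented:

    import java.util.Properties;

    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class StaticMemberConfigSketch {
        public static KafkaConsumer<String, String> create(String instanceName) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker:9092");        // placeholder address
            props.put("group.id", "my-streaming-group");          // placeholder group
            // Placeholder key: the thread is still deciding between member.name,
            // group.member.name and group.instance.id. The value must be unique and
            // stable per instance, e.g. supplied by the deployment system.
            props.put("group.member.name", instanceName);
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            // Per the KIP text quoted in point 2, the first JoinGroupRequest still
            // carries UNKNOWN_MEMBER_ID; the coordinator maps the static name to a
            // member.id on its side.
            return new KafkaConsumer<>(props);
        }
    }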

Thanks,

Mayuresh

On Mon, Nov 26, 2018 at 8:17 AM Mayuresh Gharat 
wrote:

> Hi Boyang,
>
> Do you have a discuss thread for KIP-394 that you mentioned here ?
>
> Thanks,
>
> Mayuresh
>
> On Mon, Nov 26, 2018 at 4:52 AM Boyang Chen  wrote:
>
>> Hey Dong, thanks for the follow-up here!
>>
>>
>> 1) It is not very clear to the user what is the difference between
>> member.name and client.id as both seems to be used to identify the
>> consumer. I am wondering if it would be more intuitive to name it
>> group.member.name (preferred choice since it matches the current group.id
>> config name) or rebalance.member.name to explicitly show that the id is
>> solely used for rebalance.
>> Great question. I feel `member.name` is enough to explain itself, it
>> seems not very
>> helpful to make the config name longer. Comparing `name` with `id` gives
>> user the
>> impression that they have the control over it with customized rule than
>> library decided.
>>
>> 2) In the interface change section it is said that
>> GroupMaxSessionTimeoutMs
>> will be changed to 30 minutes. It seems to suggest that we will change the
>> default value of this config. It does not seem necessary to increase the
>> time of consumer failure 

Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2018-11-26 Thread Mayuresh Gharat
> >
> > which by default is 5 minutes, given that broker can
> > use a deterministic algorithm to determine the partition -> member_name
> > mapping, each consumer should get assigned the same set of partitions
> > without requiring state shuffling. So it is not clear whether we have a
> > strong use-case for this new logic. Can you help clarify what is the
> > benefit of using topic "static_member_map" to persist member name map?
> > I have discussed with Guozhang offline, and I believe reusing the current
> > `_consumer_offsets`
> > topic is a better and unified solution.
> >
> > 7) Regarding the introduction of the expensionTimeoutMs config, it is
> > mentioned that "we are using expansion timeout to replace rebalance
> > timeout, which is configured by max.poll.intervals from client side, and
> > using registration timeout to replace session timeout". Currently the
> > default max.poll.interval.ms
> > is configured to be 5 minutes and there will
> > be only one rebalance if all new consumers can join within 5 minutes. So
> it
> > is not clear whether we have a strong use-case for this new config. Can
> you
> > explain what is the benefit of introducing this new config?
> > Previously our goal is to use expansion timeout as a workaround for
> > triggering multiple
> > rebalances when scaling up members are not joining at the same time. It
> is
> > decided to
> > be addressed by client side protocol change, so we will not introduce
> > expansion timeout.
> >
> > 8) It is mentioned that "To distinguish between previous version of
> > protocol, we will also increase the join group request version to v4 when
> > MEMBER_NAME is set" and "If the broker version is not the latest (< v4),
> > the join group request shall be downgraded to v3 without setting the
> member
> > Id". It is probably simpler to just say that this feature is enabled if
> > JoinGroupRequest V4 is supported on both client and broker and
> MEMBER_NAME
> > is configured with non-empty string.
> > Yep, addressed this!
> >
> > 9) It is mentioned that broker may return NO_STATIC_MEMBER_INFO_SET error
> > in OffsetCommitResponse for "commit requests under static membership".
> Can
> > you clarify how broker determines whether the commit request is under
> > static membership?
> >
> > We have agreed that commit request shouldn't be affected by the new
> > membership, thus
> > removing it here. Thanks for catching this!
> >
> > Let me know if you have further suggestions or concerns. Thank you for
> > your valuable feedback
> > to help me design the KIP better! (And I will try to address your
> > feedbacks in next round Mayuresh ??)
> >
> > Best,
> > Boyang
> > 
> > From: Mayuresh Gharat 
> > Sent: Wednesday, November 21, 2018 7:50 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by
> > specifying member id
> >
> > Hi Boyang,
> >
> > Thanks for updating the KIP. This is a step good direction for stateful
> > applications and also mirroring applications whose latency is affected
> due
> > to the rebalance issues that we have today.
> >
> > I had a few questions on the current version of the KIP :
> > For the effectiveness of the KIP, consumer with member.name set will
> *not
> > send leave group request* when they go offline
> >
> > > By this you mean, even if the application has not called
> > > KafkaConsumer.poll() within session timeout, it will not be sending the
> > > LeaveGroup request, right?
> > >
> >
> > Broker will maintain an in-memory mapping of {member.name -> member.id} to
> > track member uniqueness.

Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2018-11-20 Thread Mayuresh Gharat
Hi Boyang,

Thanks for updating the KIP. This is a step in a good direction for stateful
applications and also for mirroring applications whose latency is affected by
the rebalance issues that we have today.

I had a few questions on the current version of the KIP :
For the effectiveness of the KIP, consumer with member.name set will *not
send leave group request* when they go offline

> By this you mean, even if the application has not called
> KafkaConsumer.poll() within session timeout, it will not be sending the
> LeaveGroup request, right?
>

Broker will maintain an in-memory mapping of {member.name → member.id} to
track member uniqueness.

> When is the member.name removed from this map?
>

Member.id must be set if the *member.name* is already within the map.
Otherwise reply MISSING_MEMBER_ID

> How is this case handled on the client side? What is the application that
> is using the KafkaConsumer supposed to do in this scenario?
>

Session timeout is the timeout we will trigger rebalance when a member goes
offline for too long (not sending heartbeat request). To make static
membership effective, we should increase the default max session timeout to
30 min so that end user could config it freely.

> This would mean that it might take more time to detect unowned topic
> partitions and may cause delay for applications that perform data mirroring
> tasks. I discussed this with our sre and we have a suggestion to make here
> as listed below separately.
>

Currently there is a config called *rebalance timeout* which is configured
by consumer *max.poll.intervals*. The reason we set it to poll interval is
because consumer could only send request within the call of poll() and we
want to wait sufficient time for the join group request. When reaching
rebalance timeout, the group will move towards completingRebalance stage
and remove unjoined groups

> you meant remove unjoined members of the group, right ?
>

Currently there is a config called *rebalance timeout* which is configured
by consumer *max.poll.intervals*. The reason we set it to poll interval is
because consumer could only send request within the call of poll() and we
want to wait sufficient time for the join group request. When reaching
rebalance timeout, the group will move towards completingRebalance stage
and remove unjoined groups. This is actually conflicting with the design of
static membership, because those temporarily unavailable members will
potentially reattempt the join group and trigger extra rebalances.
Internally we would optimize this logic by having rebalance timeout only in
charge of stopping prepare rebalance stage, without removing non-responsive
members immediately.

> What do you mean by " Internally we would optimize this logic by having
> rebalance timeout only in charge of stopping prepare rebalance stage,
> without removing non-responsive members immediately." There would not be a
> full rebalance if the lagging consumer sent a JoinGroup request later,
> right ? If yes, can you highlight this in the KIP ?
>

Scale Up

> The KIP talks about the scale-up scenario but it's not quite clear how we
> handle it. Are we adding a separate "expansion.timeout" or are we adding a
> "learner" status? Can you shed more light on how this is handled in the KIP,
> if it's handled?
>


*Discussion*
Larger session timeouts causing latency rise for getting data for un-owned
topic partitions :

> I think Jason had brought this up earlier about having a way to say how
> many members/consumer hosts are you choosing to be in the consumer group.
> If we can do this, then in case of mirroring applications we can do this :
> Lets say we have a mirroring application that consumes from Kafka cluster
> A and produces to Kafka cluster B.
> Depending on the data and the Kafka cluster configuration, Kafka service
> providers can set a mirroring group saying that it will take, for example
> 300 consumer hosts/members to achieve the desired throughput and latency
> for mirroring and can have additional 10 consumer hosts as spare in the
> same group.
> So when the first 300 members/consumers to join the group will start
> mirroring the data from Kafka cluster A to Kafka cluster B.
> The remaining 10 consumer members can sit idle.
> The moment one of the consumers (for example, consumer number 54) from the
> first 300 members goes out of the group (crosses the session timeout), it (the
> groupCoordinator) can just assign the topicPartitions from consumer
> member 54 to one of the spare hosts.
> Once the consumer member 54 comes back up, it can start as being a part of
> the spare pool.
> This enables us to have lower session timeouts and low latency mirroring,
> in cases where the service providers are OK with having spare hosts.
> This would mean that we would tolerate n consumer members leaving and
> rejoining the group and still provide low latency as long as n <= number of
> spare consumers.
> If there are no spare host available, we can get back to the 

Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2018-11-10 Thread Mayuresh Gharat
Hi Boyang,

Thanks for the reply.

Please find the replies inline below :
For having a consumer config at runtime, I think it's not necessary to
address in this KIP because most companies run sidecar jobs through daemon
software like puppet. It should be easy to change the config through script
or UI without actual code change. We still want to leave flexibility for
user to define member name as they like.
This might be a little different for companies that use configuration
management tools that do not allow the applications to define/change the
configs dynamically. For example, if we use something similar to Spring to
pull in the configs for the KafkaConsumer and pass it to the constructor to
create the KafkaConsumer object, it will be hard to specify a unique value
for the "MEMBER_NAME" config unless someone deploying the app generates a
unique string for this config outside the deployment workflow and copies it
statically before starting up each consumer instance. Unless we can loosen
the criteria for uniqueness of this config value for each consumer
instance in the consumer group, I am not sure of a better way of
addressing this. If we don't want to loosen the criteria, then providing a
dynamic way to pass this in at runtime would put the onus of having the
same unique value each time a consumer is restarted on to the application
that is running the consumer.
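One way an application could handle the runtime-generation point above is to derive the member name from host information at startup instead of baking it into a static config file. A small sketch follows; the config key is the placeholder name used in this thread and the "payments-consumer" prefix is invented:

    import java.net.InetAddress;
    import java.net.UnknownHostException;
    import java.util.Properties;

    public class RuntimeMemberNameSketch {
        public static Properties withMemberName(Properties props) {
            String host;
            try {
                host = InetAddress.getLocalHost().getHostName();
            } catch (UnknownHostException e) {
                host = "unknown-host";
            }
            // Placeholder config key from this thread; the point is only that the value
            // can be computed at startup rather than baked into a static config file.
            props.put("group.member.name", "payments-consumer-" + host);
            return props;
        }
    }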

I just updated the kip about having both "registration timeout" and
"session timeout". The benefit of having two configs instead of one is to
reduce the mental burden for operation, for example user just needs to
unset "member name" to cast back to dynamic membership without worrying
about tuning the "session timeout" back to a smaller value.
--- That is a good point. I was thinking that if both the configs are
specified, it would be confusing for the end user, without understanding the
internals of the consumer and its interaction with the group coordinator, to
know which takes precedence when and how it affects the consumer behavior. Just
my 2 cents.

Thanks,

Mayuresh

On Fri, Nov 9, 2018 at 8:27 PM Boyang Chen  wrote:

> Hey Mayuresh,
>
>
> thanks for the thoughtful questions! Let me try to answer your questions
> one by one.
>
>
> For having a consumer config at runtime, I think it's not necessary to
> address in this KIP because most companies run sidecar jobs through daemon
> software like puppet. It should be easy to change the config through script
> or UI without actual code change. We still want to leave flexibility for
> user to define member name as they like.
>
>
> I just updated the kip about having both "registration timeout" and
> "session timeout". The benefit of having two configs instead of one is to
> reduce the mental burden for operation, for example user just needs to
> unset "member name" to cast back to dynamic membership without worrying
> about tuning the "session timeout" back to a smaller value.
>
>
> For backup topic, I think it's a low-level detail which could be addressed
> in the implementation. I feel no preference of adding a new topic vs reuse
> consumer offsets topic. I will do more analysis and make a trade-off
> comparison. Nice catch!
>
>
> I hope the explanations make sense to you. I will keep polishing on the
> edge cases and details.
>
>
> Best,
>
> Boyang
>
> 
> From: Mayuresh Gharat 
> Sent: Saturday, November 10, 2018 10:25 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by
> specifying member id
>
> Hi Boyang,
>
> Thanks for the KIP and sorry for being late to the party. This KIP is
> really useful for us at Linkedin.
>
> I had a few questions :
>
> The idea of having static member name seems nice, but instead of a config,
> would it be possible for it to be passed in to the consumer at runtime?
> This is because an app might want to decide the config value at runtime
> using its host information for example, to generate the unique member name.
>
> Also the KIP talks about using the "REGISTRATION_TIMEOUT_MS". I was
> wondering if we can reuse the session timeout here. This might help us to
> have one less config on the consumer.
>
> The KIP also talks about adding another internal topic "static_member_map".
> Would the semantics (GroupCoordinator broker, topic configs) be the same as
> __consumer_offsets topic?
>
> Thanks,
>
> Mayuresh
>
>
> On Wed, Nov 7, 2018 at 12:17 AM Boyang Chen  wrote:
>
> > I took a quick pass of the proposal. First I would say it's a very
> > brilliant initiative from Konstantine and Confluent folks. To draft up a
> > proposal like this needs deep understand

Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2018-11-09 Thread Mayuresh Gharat
Hi Boyang,

Thanks for the KIP and sorry for being late to the party. This KIP is
really useful for us at Linkedin.

I had a few questions :

The idea of having static member name seems nice, but instead of a config,
would it be possible for it to be passed in to the consumer at runtime?
This is because an app might want to decide the config value at runtime
using its host information for example, to generate the unique member name.

Also the KIP talks about using the "REGISTRATION_TIMEOUT_MS". I was
wondering if we can reuse the session timeout here. This might help us to
have one less config on the consumer.

The KIP also talks about adding another internal topic "static_member_map".
Would the semantics (GroupCoordinator broker, topic configs) be the same as
__consumer_offsets topic?

Thanks,

Mayuresh


On Wed, Nov 7, 2018 at 12:17 AM Boyang Chen  wrote:

> I took a quick pass of the proposal. First I would say it's a very
> brilliant initiative from Konstantine and Confluent folks. To draft up a
> proposal like this needs deep understanding of the rebalance protocol! I
> summarized some thoughts here.
>
>
> Overall the motivations of the two proposals align on that:
>
>   1.  Both believe the invariant resource (belonging to the same process)
> should be preserved across rebalance.
>   2.  Transit failures (K8 thread death) shouldn't trigger resource
> redistribution. I don't use rebalance here since part one of the
> cooperative proposal could potentially introduce more rebalances but only
> on must-move resources.
>   3.  Scale up/down and rolling bounce are causing unnecessary resource
> shuffling that need to be mitigated.
>
>
> On motivation level, I think both approach could solve/mitigate the above
> issues. They are just different in design philosophy, or I would say the
> perspective difference between framework user and algorithm designer.
>
>
> Two proposals have different focuses. KIP-345 is trying to place more
> fine-grained control on the broker side to reduce the unnecessary
> rebalances, while keeping the client logic intact. This is pretty intuitive
> cause-effect for normal developers who are not very familiar with rebalance
> protocol. As a developer working with Kafka Streams daily, I'd be happy to
> see a simplified rebalance protocol and just focus on maintaining the
> stream/consumer jobs. Too many rebalances raised my concern on the job
> health. To be concise, static membership has the advantage of reducing
> mental burden.
>
>
> Cooperative proposal takes thoughtful approach on client side. We want to
> have fine-grained control on the join/exit group behaviors and make the
> current dynamic membership better to address above issues. I do feel our
> idea crossed on the delayed rebalance when we scale up/down, which could
> potentially reduce the state shuffling and decouple the behavior from
> session timeout which is already overloaded.  In this sense, I believe both
> approaches would serve well in making "reasonable rebalance" happen at the
> "right timing".
>
>
> However, based on my understanding, either 345 or cooperative rebalancing
> is not solving the problem Mike has proposed: could we do a better job at
> scaling up/down in ideal timing? My initial response was to introduce an
> admin API which now I feel is sub-optimal, in that the goal of smooth
> transition is to make sure the newly up hosts are actually "ready". For
> example:
>
>
> We have 4 instance reading from 8 topic partitions (= 8 tasks). At some
> time we would like to scale up to 8 hosts, with the current improvements we
> could reduce 4 potential rebalances to a single one. But the new hosts are
> yet unknown to be "ready" if they need to reconstruct the local state. To
> be actually ready, we need 4 standby tasks running on those empty hosts and
> leader needs to wait for the signal of "replay/reconstruct complete" to
> actually involve them into the main consumer group. Otherwise, rebalance
> just kills our performance since we need to wait indefinite long for task
> migration.
>
>
> The scale down is also tricky such that we are not able to define a "true"
> leave of a member. Rebalance immediately after "true" leaves are most
> optimal comparing with human intervention. Does this make sense?
>
>
> My intuition is that cooperative approach which was implemented on the
> client side could better handle scaling cases than KIP 345, since it
> involves a lot of algorithmic changes to define "replaying" stage, which I
> feel would over-complicate broker logic if implemented on coordinator. If
> we let 345 focus on reducing unnecessary rebalance, and let cooperative
> approach focus on judging best timing of scale up/down, the two efforts
> could be aligned. In long term, I feel the more complex improvement of
> consumer protocol should happen on client side instead of server side which
> is easier to test and has less global impact for the entire Kafka
> production cluster.
>
>
> Thanks again to 

Re: [DISCUSS] KIP-388 Add observer interface to record request and response

2018-11-09 Thread Mayuresh Gharat
Hi Lincong,

Thanks for the KIP.

As Colin pointed out, would it be better to expose certain specific pieces of
information from the request/response, like the API key, request headers, record
counts, and client ID, instead of the entire request/response objects? This
enables us to change the request/response APIs independently of this
pluggable public API in the future, unless you think we have a strong reason
to expose the request and response objects.

Also, it would be great if you could expand on:
"Add code to the broker (in KafkaApis) to allow Kafka servers to invoke any
observers defined. More specifically, change KafkaApis code to invoke all
defined observers, in the order in which they were defined, for every
request-response pair."
probably with an example of how you visualize it. It would make the KIP more
concrete and the end-to-end workflow easier to understand.
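To illustrate the suggestion, here is one hypothetical shape such a plugin interface could take. None of these names come from the KIP; they only show what "specific pieces of information" might mean in practice:

    import java.io.Closeable;
    import java.util.Map;

    /**
     * Hypothetical shape of the observer plugin being discussed; none of these names
     * come from the KIP. It exposes a narrow, stable view of each request/response
     * pair instead of the broker's internal request classes.
     */
    public interface RequestObserverSketch extends Closeable {

        void configure(Map<String, ?> configs);

        /** Called by the broker (e.g. from KafkaApis) once per completed request/response pair. */
        void onRequestCompleted(ObservedRequest request);

        interface ObservedRequest {
            short apiKey();                 // numeric API key, e.g. 0 = Produce, 1 = Fetch
            String clientId();
            String principalName();
            String listenerName();
            int requestSizeBytes();
            int responseSizeBytes();
            long requestDequeueTimeMs();
            long responseCompleteTimeMs();
        }
    }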

Thanks,

Mayuresh

On Thu, Nov 8, 2018 at 7:44 PM Ismael Juma  wrote:

> I agree, the current KIP doesn't discuss the public API that we would be
> exposing and it's extensive if the normal usage would allow for casting
> AbstractRequest into the various subclasses and potentially even accessing
> Records and related for produce request.
>
> There are many use cases where this could be useful, but it requires quite
> a bit of thinking around the APIs that we expose and the expected usage.
>
> Ismael
>
> On Thu, Nov 8, 2018, 6:09 PM Colin McCabe 
> > Hi Lincong Li,
> >
> > I agree that server-side instrumentation is helpful.  However, I don't
> > think this is the right approach.
> >
> > The problem is that RequestChannel.Request and AbstractResponse are
> > internal classes that should not be exposed.  These are implementation
> > details that we may change in the future.  Freezing these into a public
> API
> > would really hold back the project.  For example, for really large
> > responses, we might eventually want to avoid materializing the whole
> > response all at once.  It would make more sense to return it in a
> streaming
> > fashion.  But if we need to support this API forever, we can't do that.
> >
> > I think it's fair to say that this is, at best, half a solution to the
> > problem of tracing requests.  Users still need to write the plugin code
> and
> > arrange for it to be on their classpath to make this work.  I think the
> > alternative here is not client-side instrumentation, but simply making
> the
> > change to the broker without using a plugin interface.
> >
> > If a public interface is absolutely necessary here we should expose only
> > things like the API key, client ID, time, etc. that don't constrain the
> > implementation a lot in the future.  I think we should also use java here
> > to avoid the compatibility issues we have had with Scala APIs in the
> past.
> >
> > best,
> > Colin
> >
> >
> > On Thu, Nov 8, 2018, at 11:34, radai wrote:
> > > another downside to client instrumentation (beyond the number of
> > > client codebases one would need to cover) is that in a large
> > > environments you'll have a very long tail of applications using older
> > > clients to upgrade - it would be a long and disruptive process (as
> > > opposed to updating broker-side instrumentation)
> > > On Thu, Nov 8, 2018 at 11:04 AM Peter M. Elias 
> > wrote:
> > > >
> > > > I know we have a lot of use cases for this type of functionality at
> my
> > > > enterprise deployment. I think it's helpful for maintaining
> > reliability of
> > > > the cluster especially and identifying clients that are not properly
> > tuned
> > > > and therefore applying excessive load to the brokers. Additionally,
> > there
> > > > is a bit of a dark spot without something like as currently. For
> > example,
> > > > if a client is not using a consumer group, there is no direct way to
> > query
> > > > the state of the consumer without looking at raw network connections
> to
> > > > determine the extent of the traffic generated by that particular
> > consumer.
> > > >
> > > > While client instrumentation can certainly help with this currently,
> > given
> > > > that Kafka is intended to be a shared service across a potentially
> very
> > > > large surface area of clients, central observation of client activity
> > is in
> > > > my opinion an essential feature.
> > > >
> > > > Peter
> > > >
> > > > On Thu, Nov 8, 2018 at 12:13 PM radai 
> > wrote:
> > > >
> > > > > bump.
> > > > >
> > > > > I think the proposed API (Observer) is useful for any sort of
> > > > > multi-tenant environment for chargeback and reporting purposes.
> > > > >
> > > > > if no one wants to comment, can we initiate a vote?
> > > > > On Mon, Nov 5, 2018 at 6:31 PM Lincong Li  >
> > wrote:
> > > > > >
> > > > > > Hi everyone. Here
> > > > > > <
> > > > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-388%3A+Add+observer+interface+to+record+request+and+response
> > > > > >
> > > > > > is
> > > > > > my KIP. Any feedback is appreciated.
> > > > > >
> > > > > > Thanks,
> > > > 

Re: Apache Kafka blog on more partitions support

2018-11-02 Thread Mayuresh Gharat
Thanks Jun for sharing this. Looks nice !

Do we intend to shed light on how much time is required, on average, for
new leader election? Also, would it be good to add whether the controller waits
for the LeaderAndIsrResponses before sending shutDown_OK to the shutting-down
broker?

Thanks,

Mayuresh

On Fri, Nov 2, 2018 at 12:07 PM  wrote:

> Thanks Jun for sharing the post.
> Minor Nit: Date says  December 16, 2019.
>
> Did this test measured the replication affects on the overall cluster
> health and performance?
> It looks like we are suggesting with 200k partitions and 4k per broker max
> size of a cluster should be around 50 brokers?
>
> Thanks,
> Harsha
> On Nov 2, 2018, 11:50 AM -0700, Jun Rao , wrote:
> > Hi, Everyone,
> >
> > The follow is the preview of a blog on Kafka supporting more partitions.
> >
> > https://drive.google.com/file/d/122TK0oCoforc2cBWfW_yaEBjTMoX6yMt
> >
> > Please let me know if you have any comments by Tuesday.
> >
> > Thanks,
> >
> > Jun
>


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-385: Provide configuration allowing consumer to no throw away prefetched data

2018-10-31 Thread Mayuresh Gharat
Hi Colin, Zahari,

Wanted to check if you could review the patch and let me know if we need to
make any changes.

Thanks,

Mayuresh

On Fri, Oct 26, 2018 at 1:41 PM Zahari Dichev 
wrote:

> Thanks for participating the discussion. Indeed, I learned quite a lot.
> Will take a look at the patch as well and spend some time hunting for some
> other interesting issue to work on :)
>
> Cheers,
> Zahari
>
> On Fri, Oct 26, 2018 at 8:49 PM Colin McCabe  wrote:
>
> > Hi Zahari,
> >
> > I think we can retire the KIP, since the KAFKA-7548 patch should solve
> the
> > issue without any changes that require a KIP.  This is actually the best
> > thing we could do for our users, since things will "just work" more
> > efficiently without a lot of configuration knobs.
> >
> > I think you did an excellent job raising this issue and discussing it.
> > It's a very good contribution to the project even if you don't end up
> > writing the patch yourself.  I'm going to take a look at the patch today.
> > If you want to take a look, that would also be good.
> >
> > best,
> > Colin
> >
> >
> > On Thu, Oct 25, 2018, at 12:25, Zahari Dichev wrote:
> > > Hi there Mayuresh,
> > >
> > > Great to heat that this is actually working well in production for some
> > > time now. I have changed the details of the KIP to reflect the fact
> that
> > as
> > > already discussed - we do not really need any kind of configuration as
> > this
> > > data should not be thrown away at all.  Submitting a PR sounds great,
> > > although I feel a bit jealous you (LinkedIn) beat me to my first kafka
> > > commit  ;)  Not sure how things stand with the voting process ?
> > >
> > > Zahari
> > >
> > >
> > >
> > > On Thu, Oct 25, 2018 at 7:39 PM Mayuresh Gharat <
> > gharatmayures...@gmail.com>
> > > wrote:
> > >
> > > > Hi Colin/Zahari,
> > > >
> > > > I have created a ticket for the similar/same feature :
> > > > https://issues.apache.org/jira/browse/KAFKA-7548
> > > > We (Linkedin) had a use case in Samza at Linkedin when they moved
> from
> > the
> > > > SimpleConsumer to KafkaConsumer and they wanted to do this pause and
> > resume
> > > > pattern.
> > > > They realized there was performance degradation when they started
> using
> > > > KafkaConsumer.assign() and pausing and unPausing partitions. We
> > realized
> > > > that not throwing away the prefetched data for paused partitions
> might
> > > > improve the performance. We wrote a benchmark (I can share it if
> > needed) to
> > > > prove this. I have attached the findings in the ticket.
> > > > We have been running the hotfix internally for quite a while now.
> When
> > > > samza ran this fix in production, they realized 30% improvement in
> > there
> > > > app performance.
> > > > I have the patch ready on our internal branch and would like to
> submit
> > a PR
> > > > for this on the above ticket asap.
> > > > I am not sure, if we need a separate config for this as we haven't
> > seen a
> > > > lot of memory overhead due to this in our systems. We have had this
> > running
> > > > in production for a considerable amount of time without any issues.
> > > > It would be great if you guys can review the PR once its up and see
> if
> > that
> > > > satisfies your requirement. If it doesn't then we can think more on
> the
> > > > config driven approach.
> > > > Thoughts??
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > > >
> > > > On Thu, Oct 25, 2018 at 8:21 AM Colin McCabe 
> > wrote:
> > > >
> > > > > Hi Zahari,
> > > > >
> > > > > One question we didn't figure out earlier was who would actually
> want
> > > > this
> > > > > cached data to be thrown away.  If there's nobody who actually
> wants
> > > > this,
> > > > > then perhaps we can simplify the proposal by just unconditionally
> > > > retaining
> > > > > the cache until the partition is resumed, or we unsubscribe from
> the
> > > > > partition.  This would avoid adding a new configuration.
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > >
> > > > > On Sun, Oct 21, 2018, at 11:54, Zahari Dichev wrote:
> > > > > > Hi there, although it has been discussed briefly already in this
> > thread
> > > > > > <
> > > > >
> > > >
> >
> https://lists.apache.org/thread.html/fbb7e9ccc41084fc2ff8612e6edf307fb400f806126b644d383b4a64@%3Cdev.kafka.apache.org%3E
> > > > > >,
> > > > > > I decided to follow the process and initiate a DISCUSS thread.
> > Comments
> > > > > > and
> > > > > > suggestions are more than welcome.
> > > > > >
> > > > > >
> > > > > > Zahari Dichev
> > > > >
> > > >
> > > >
> > > > --
> > > > -Regards,
> > > > Mayuresh R. Gharat
> > > > (862) 250-7125
> > > >
> >
>


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-385: Provide configuration allowing consumer to no throw away prefetched data

2018-10-25 Thread Mayuresh Gharat
Hi Zahari,

Created the patch here : https://github.com/apache/kafka/pull/5844

Thanks,

Mayuresh

On Thu, Oct 25, 2018 at 4:42 PM Mayuresh Gharat 
wrote:

> Hi Zahari,
>
> Oops. We had planned to put this patch upstream but somehow slipped my
> mind. We were recently going over hotfixes that we have and this seemed
> something that had been due for sometime now. Glad to know that someone
> else apart from us might also benefit from this :)
>
> Thanks,
>
> Mayuresh
>
> On Thu, Oct 25, 2018 at 12:25 PM Zahari Dichev 
> wrote:
>
>> Hi there Mayuresh,
>>
>> Great to heat that this is actually working well in production for some
>> time now. I have changed the details of the KIP to reflect the fact that
>> as
>> already discussed - we do not really need any kind of configuration as
>> this
>> data should not be thrown away at all.  Submitting a PR sounds great,
>> although I feel a bit jealous you (LinkedIn) beat me to my first kafka
>> commit  ;)  Not sure how things stand with the voting process ?
>>
>> Zahari
>>
>>
>>
>> On Thu, Oct 25, 2018 at 7:39 PM Mayuresh Gharat <
>> gharatmayures...@gmail.com>
>> wrote:
>>
>> > Hi Colin/Zahari,
>> >
>> > I have created a ticket for the similar/same feature :
>> > https://issues.apache.org/jira/browse/KAFKA-7548
>> > We (Linkedin) had a use case in Samza at Linkedin when they moved from
>> the
>> > SimpleConsumer to KafkaConsumer and they wanted to do this pause and
>> resume
>> > pattern.
>> > They realized there was performance degradation when they started using
>> > KafkaConsumer.assign() and pausing and unPausing partitions. We realized
>> > that not throwing away the prefetched data for paused partitions might
>> > improve the performance. We wrote a benchmark (I can share it if
>> needed) to
>> > prove this. I have attached the findings in the ticket.
>> > We have been running the hotfix internally for quite a while now. When
>> > samza ran this fix in production, they realized 30% improvement in there
>> > app performance.
>> > I have the patch ready on our internal branch and would like to submit
>> a PR
>> > for this on the above ticket asap.
>> > I am not sure, if we need a separate config for this as we haven't seen
>> a
>> > lot of memory overhead due to this in our systems. We have had this
>> running
>> > in production for a considerable amount of time without any issues.
>> > It would be great if you guys can review the PR once its up and see if
>> that
>> > satisfies your requirement. If it doesn't then we can think more on the
>> > config driven approach.
>> > Thoughts??
>> >
>> > Thanks,
>> >
>> > Mayuresh
>> >
>> >
>> > On Thu, Oct 25, 2018 at 8:21 AM Colin McCabe 
>> wrote:
>> >
>> > > Hi Zahari,
>> > >
>> > > One question we didn't figure out earlier was who would actually want
>> > this
>> > > cached data to be thrown away.  If there's nobody who actually wants
>> > this,
>> > > then perhaps we can simplify the proposal by just unconditionally
>> > retaining
>> > > the cache until the partition is resumed, or we unsubscribe from the
>> > > partition.  This would avoid adding a new configuration.
>> > >
>> > > best,
>> > > Colin
>> > >
>> > >
>> > > On Sun, Oct 21, 2018, at 11:54, Zahari Dichev wrote:
>> > > > Hi there, although it has been discussed briefly already in this
>> thread
>> > > > <
>> > >
>> >
>> https://lists.apache.org/thread.html/fbb7e9ccc41084fc2ff8612e6edf307fb400f806126b644d383b4a64@%3Cdev.kafka.apache.org%3E
>> > > >,
>> > > > I decided to follow the process and initiate a DISCUSS thread.
>> Comments
>> > > > and
>> > > > suggestions are more than welcome.
>> > > >
>> > > >
>> > > > Zahari Dichev
>> > >
>> >
>> >
>> > --
>> > -Regards,
>> > Mayuresh R. Gharat
>> > (862) 250-7125
>> >
>>
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-385: Provide configuration allowing consumer to no throw away prefetched data

2018-10-25 Thread Mayuresh Gharat
Hi Zahari,

Oops. We had planned to put this patch upstream but somehow slipped my
mind. We were recently going over hotfixes that we have and this seemed
something that had been due for sometime now. Glad to know that someone
else apart from us might also benefit from this :)

Thanks,

Mayuresh

On Thu, Oct 25, 2018 at 12:25 PM Zahari Dichev 
wrote:

> Hi there Mayuresh,
>
> Great to heat that this is actually working well in production for some
> time now. I have changed the details of the KIP to reflect the fact that as
> already discussed - we do not really need any kind of configuration as this
> data should not be thrown away at all.  Submitting a PR sounds great,
> although I feel a bit jealous you (LinkedIn) beat me to my first kafka
> commit  ;)  Not sure how things stand with the voting process ?
>
> Zahari
>
>
>
> On Thu, Oct 25, 2018 at 7:39 PM Mayuresh Gharat <
> gharatmayures...@gmail.com>
> wrote:
>
> > Hi Colin/Zahari,
> >
> > I have created a ticket for the similar/same feature :
> > https://issues.apache.org/jira/browse/KAFKA-7548
> > We (Linkedin) had a use case in Samza at Linkedin when they moved from
> the
> > SimpleConsumer to KafkaConsumer and they wanted to do this pause and
> resume
> > pattern.
> > They realized there was performance degradation when they started using
> > KafkaConsumer.assign() and pausing and unPausing partitions. We realized
> > that not throwing away the prefetched data for paused partitions might
> > improve the performance. We wrote a benchmark (I can share it if needed)
> to
> > prove this. I have attached the findings in the ticket.
> > We have been running the hotfix internally for quite a while now. When
> > samza ran this fix in production, they realized 30% improvement in there
> > app performance.
> > I have the patch ready on our internal branch and would like to submit a
> PR
> > for this on the above ticket asap.
> > I am not sure, if we need a separate config for this as we haven't seen a
> > lot of memory overhead due to this in our systems. We have had this
> running
> > in production for a considerable amount of time without any issues.
> > It would be great if you guys can review the PR once its up and see if
> that
> > satisfies your requirement. If it doesn't then we can think more on the
> > config driven approach.
> > Thoughts??
> >
> > Thanks,
> >
> > Mayuresh
> >
> >
> > On Thu, Oct 25, 2018 at 8:21 AM Colin McCabe  wrote:
> >
> > > Hi Zahari,
> > >
> > > One question we didn't figure out earlier was who would actually want
> > this
> > > cached data to be thrown away.  If there's nobody who actually wants
> > this,
> > > then perhaps we can simplify the proposal by just unconditionally
> > retaining
> > > the cache until the partition is resumed, or we unsubscribe from the
> > > partition.  This would avoid adding a new configuration.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Sun, Oct 21, 2018, at 11:54, Zahari Dichev wrote:
> > > > Hi there, although it has been discussed briefly already in this
> thread
> > > > <
> > >
> >
> https://lists.apache.org/thread.html/fbb7e9ccc41084fc2ff8612e6edf307fb400f806126b644d383b4a64@%3Cdev.kafka.apache.org%3E
> > > >,
> > > > I decided to follow the process and initiate a DISCUSS thread.
> Comments
> > > > and
> > > > suggestions are more than welcome.
> > > >
> > > >
> > > > Zahari Dichev
> > >
> >
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125
> >
>


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-385: Provide configuration allowing consumer to no throw away prefetched data

2018-10-25 Thread Mayuresh Gharat
Hi Colin/Zahari,

I have created a ticket for the similar/same feature :
https://issues.apache.org/jira/browse/KAFKA-7548
We (LinkedIn) had a use case in Samza when they moved from the
SimpleConsumer to KafkaConsumer and wanted to use this pause and resume
pattern.
They realized there was performance degradation when they started using
KafkaConsumer.assign() and pausing and unpausing partitions. We realized
that not throwing away the prefetched data for paused partitions might
improve the performance. We wrote a benchmark (I can share it if needed) to
prove this. I have attached the findings in the ticket.
We have been running the hotfix internally for quite a while now. When
Samza ran this fix in production, they saw a 30% improvement in their
app performance.
I have the patch ready on our internal branch and would like to submit a PR
for this on the above ticket asap.
I am not sure if we need a separate config for this, as we haven't seen a
lot of memory overhead due to it in our systems. We have had this running
in production for a considerable amount of time without any issues.
It would be great if you guys can review the PR once it's up and see if it
satisfies your requirement. If it doesn't, then we can think more about the
config-driven approach.
Thoughts?
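For context, this is the pause/resume pattern in question, written against the standard consumer API; the topic name, back-pressure check and processing logic are placeholders:

    import java.time.Duration;
    import java.util.Arrays;
    import java.util.Collections;

    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class PauseResumeSketch {
        private final KafkaConsumer<byte[], byte[]> consumer;   // assumed configured elsewhere
        private volatile boolean running = true;

        public PauseResumeSketch(KafkaConsumer<byte[], byte[]> consumer) {
            this.consumer = consumer;
        }

        public void run() {
            TopicPartition p0 = new TopicPartition("events", 0); // placeholder topic
            TopicPartition p1 = new TopicPartition("events", 1);
            consumer.assign(Arrays.asList(p0, p1));
            while (running) {
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(100));
                process(records);
                if (partition0IsBackedUp()) {
                    // Stop delivering p0 for now. Before KAFKA-7548, any data already
                    // fetched for p0 was thrown away by the next poll() while paused and
                    // had to be re-fetched after resume().
                    consumer.pause(Collections.singleton(p0));
                } else {
                    consumer.resume(Collections.singleton(p0));
                }
            }
        }

        private void process(ConsumerRecords<byte[], byte[]> records) { /* application logic */ }

        private boolean partition0IsBackedUp() { return false; } // placeholder back-pressure signal
    }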

Thanks,

Mayuresh


On Thu, Oct 25, 2018 at 8:21 AM Colin McCabe  wrote:

> Hi Zahari,
>
> One question we didn't figure out earlier was who would actually want this
> cached data to be thrown away.  If there's nobody who actually wants this,
> then perhaps we can simplify the proposal by just unconditionally retaining
> the cache until the partition is resumed, or we unsubscribe from the
> partition.  This would avoid adding a new configuration.
>
> best,
> Colin
>
>
> On Sun, Oct 21, 2018, at 11:54, Zahari Dichev wrote:
> > Hi there, although it has been discussed briefly already in this thread
> > <
> https://lists.apache.org/thread.html/fbb7e9ccc41084fc2ff8612e6edf307fb400f806126b644d383b4a64@%3Cdev.kafka.apache.org%3E
> >,
> > I decided to follow the process and initiate a DISCUSS thread. Comments
> > and
> > suggestions are more than welcome.
> >
> >
> > Zahari Dichev
>


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


[jira] [Created] (KAFKA-7548) KafkaConsumer should not throw away already fetched data for paused partitions.

2018-10-25 Thread Mayuresh Gharat (JIRA)
Mayuresh Gharat created KAFKA-7548:
--

 Summary: KafkaConsumer should not throw away already fetched data 
for paused partitions.
 Key: KAFKA-7548
 URL: https://issues.apache.org/jira/browse/KAFKA-7548
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Mayuresh Gharat
Assignee: Mayuresh Gharat


In KafkaConsumer, when we do a poll, we fetch data asynchronously from Kafka 
brokers and buffer it in the completedFetches queue. If we pause a few 
partitions, the next call to poll removes the completedFetches 
for those paused partitions. Normally, if an application calls pause on 
topicPartitions, it is likely to return to those topicPartitions in the near 
future, and when it does, with the current design we would have to re-fetch that data.

At LinkedIn, we made a hotfix to see if NOT throwing away the prefetched data 
would improve the performance for stream applications like Samza. We ran a 
benchmark where we compared the throughput for different values of 
maxPollRecords.

We had a consumer subscribed to 10 partitions of a high-volume topic and paused 
a different number of partitions for every poll call. Here are the results (a 
rough sketch of the benchmark loop appears after the tables):

*Before fix (records consumed per ~60 second run)*

|Partitions paused |maxPollRecords=10 |maxPollRecords=5 |maxPollRecords=1 |
|0                 |8605320           |8337690          |6424753          |
|2                 |101910            |49350            |10495            |
|4                 |48420             |24850            |5004             |
|6                 |30420             |15385            |3152             |
|8                 |23390             |11390            |2237             |
|9                 |20230             |10355            |2087             |

 

*After fix (records consumed per ~60 second run)*

|Partitions paused |maxPollRecords=10 |maxPollRecords=5 |maxPollRecords=1 |
|0                 |8662740           |8203445          |5846512          |
|2                 |8257390           |7776150          |5269557          |
|4                 |7938510           |7510140          |5213496          |
|6                 |7100970           |6382845          |4519645          |
|8                 |6799956           |6482421          |4383300          |
|9                 |7045177           |6465839          |4884693          |
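For reference, here is a rough sketch of what such a benchmark loop could look like (this is not the actual LinkedIn harness; the 60-second window per data point and the per-poll rotation of which partitions are paused are assumptions):

    import java.time.Duration;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class PausedPartitionBenchmarkSketch {
        // Counts records delivered in a 60 second window while keeping numPaused of the
        // assigned partitions paused, rotating which partitions are paused on every poll.
        public static long run(KafkaConsumer<byte[], byte[]> consumer,
                               List<TopicPartition> partitions, int numPaused) {
            consumer.assign(partitions);
            long consumed = 0;
            int offset = 0;
            long deadline = System.currentTimeMillis() + 60_000;
            while (System.currentTimeMillis() < deadline) {
                consumer.resume(consumer.paused());
                List<TopicPartition> toPause = new ArrayList<>();
                for (int i = 0; i < numPaused; i++) {
                    toPause.add(partitions.get((offset + i) % partitions.size()));
                }
                consumer.pause(toPause);
                offset++;
                consumed += consumer.poll(Duration.ofMillis(100)).count();
            }
            return consumed;
        }
    }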



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-354 Time-based log compaction policy

2018-10-15 Thread Mayuresh Gharat
Hi Wesley,

Thanks for the KIP and sorry for being late to the party.
I wanted to understand the scenario you mentioned in Proposed Changes:

-
>
> Estimate the earliest message timestamp of an un-compacted log segment. we
> only need to estimate earliest message timestamp for un-compacted log
> segments to ensure timely compaction because the deletion requests that
> belong to compacted segments have already been processed.
>
>1.
>
>for the first (earliest) log segment:  The estimated earliest
>timestamp is set to the timestamp of the first message if timestamp is
>present in the message. Otherwise, the estimated earliest timestamp is set
>to "segment.largestTimestamp - maxSegmentMs”
> (segment.largestTimestamp is lastModified time of the log segment or max
>timestamp we see for the log segment.). In the later case, the actual
>timestamp of the first message might be later than the estimation, but it
>is safe to pick up the log for compaction earlier.
>
When we say "the actual timestamp of the first message might be later than the
estimation, but it is safe to pick up the log for compaction earlier", doesn't
that violate the assumption that we will consider a segment for compaction only
if the time of creation of the segment has crossed "now - maxCompactionLagMs"?
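A small worked example of the estimate being discussed, with invented numbers: if maxSegmentMs is 2 hours and the earliest un-compacted segment has segment.largestTimestamp = 12:00 but carries no message timestamps, the estimated earliest timestamp is 10:00. Since a segment covers at most roughly maxSegmentMs of message time before it is rolled, the true first-message timestamp can only be at or after 10:00, so this estimate can only make the segment eligible for compaction sooner than the true timestamp would, never later. Whether that early pickup is acceptable is exactly what the question above is asking.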

Thanks,

Mayuresh

On Mon, Sep 3, 2018 at 7:28 PM Brett Rann  wrote:

> Might also be worth moving to a vote thread? Discussion seems to have gone
> as far as it can.
>
> > On 4 Sep 2018, at 12:08, xiongqi wu  wrote:
> >
> > Brett,
> >
> > Yes, I will post PR tomorrow.
> >
> > Xiongqi (Wesley) Wu
> >
> >
> > On Sun, Sep 2, 2018 at 6:28 PM Brett Rann 
> wrote:
> >
> > > +1 (non-binding) from me on the interface. I'd like to see someone
> familiar
> > > with
> > > the code comment on the approach, and note there's a couple of
> different
> > > approaches: what's documented in the KIP, and what Xiaohe Dong was
> working
> > > on
> > > here:
> > >
> > >
> https://github.com/dongxiaohe/kafka/tree/dongxiaohe/log-cleaner-compaction-max-lifetime-2.0
> > >
> > > If you have code working already Xiongqi Wu could you share a PR? I'd
> be
> > > happy
> > > to start testing.
> > >
> > > On Tue, Aug 28, 2018 at 5:57 AM xiongqi wu 
> wrote:
> > >
> > > > Hi All,
> > > >
> > > > Do you have any additional comments on this KIP?
> > > >
> > > >
> > > > On Thu, Aug 16, 2018 at 9:17 PM, xiongqi wu 
> wrote:
> > > >
> > > > > on 2)
> > > > > The offsetmap is built starting from dirty segment.
> > > > > The compaction starts from the beginning of the log partition.
> That's
> > > how
> > > > > it ensure the deletion of tomb keys.
> > > > > I will double check tomorrow.
> > > > >
> > > > > Xiongqi (Wesley) Wu
> > > > >
> > > > >
> > > > > On Thu, Aug 16, 2018 at 6:46 PM Brett Rann
> 
> > > > > wrote:
> > > > >
> > > > >> To just clarify a bit on 1. whether there's an external storage/DB
> > > isn't
> > > > >> relevant here.
> > > > >> Compacted topics allow a tombstone record to be sent (a null value
> > > for a
> > > > >> key) which
> > > > >> currently will result in old values for that key being deleted if
> some
> > > > >> conditions are met.
> > > > >> There are existing controls to make sure the old values will stay
> > > around
> > > > >> for a minimum
> > > > >> time at least, but no dedicated control to ensure the tombstone
> will
> > > > >> delete
> > > > >> within a
> > > > >> maximum time.
> > > > >>
> > > > >> One popular reason that maximum time for deletion is desirable
> right
> > > now
> > > > >> is
> > > > >> GDPR with
> > > > >> PII. But we're not proposing any GDPR awareness in kafka, just
> being
> > > > able
> > > > >> to guarantee
> > > > >> a max time where a tombstoned key will be removed from the
> compacted
> > > > >> topic.
> > > > >>
> > > > >> on 2)
> > > > >> huh, i thought it kept track of the first dirty segment and didn't
> > > > >> recompact older "clean" ones.
> > > > >> But I didn't look at code or test for that.
> > > > >>
> > > > >> On Fri, Aug 17, 2018 at 10:57 AM xiongqi wu 
> > > > wrote:
> > > > >>
> > > > >> > 1, Owner of data (in this sense, kafka is the not the owner of
> data)
> > > > >> > should keep track of lifecycle of the data in some external
> > > > storage/DB.
> > > > >> > The owner determines when to delete the data and send the delete
> > > > >> request to
> > > > >> > kafka. Kafka doesn't know about the content of data but to
> provide a
> > > > >> mean
> > > > >> > for deletion.
> > > > >> >
> > > > >> > 2 , each time compaction runs, it will start from first
> segments (no
> > > > >> > matter if it is compacted or not). The time estimation here is
> only
> > > > used
> > > > >> > to determine whether we should run compaction on this log
> partition.
> > > > So
> > > > >> we
> > > > >> > only need to estimate uncompacted segments.
> > > > >> >
> > > > >> > On Thu, Aug 16, 2018 at 5:35 PM, Dong Lin 
> > > > wrote:
> > > > >> >
> > > > >> > > Hey Xiongqi,
> > > > >> > >

Re: [ANNOUNCE] New Kafka PMC member: Dong Lin

2018-08-20 Thread Mayuresh Gharat
Congrats Dong !!!

Thanks,

Mayuresh

On Mon, Aug 20, 2018 at 1:36 PM Gwen Shapira  wrote:

> Congrats Dong Lin! Well deserved!
>
> On Mon, Aug 20, 2018, 3:55 AM Ismael Juma  wrote:
>
> > Hi everyone,
> >
> > Dong Lin became a committer in March 2018. Since then, he has remained
> > active in the community and contributed a number of patches, reviewed
> > several pull requests and participated in numerous KIP discussions. I am
> > happy to announce that Dong is now a member of the
> > Apache Kafka PMC.
> >
> > Congratulation Dong! Looking forward to your future contributions.
> >
> > Ismael, on behalf of the Apache Kafka PMC
> >
>


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-07-24 Thread Mayuresh Gharat
Hi Lucas,
I agree: if we want to go forward with a separate controller plane and data
plane and completely isolate them, having a separate port for the controller
with a separate Acceptor and Processor sounds ideal to me.

Thanks,

Mayuresh


On Mon, Jul 23, 2018 at 11:04 PM Becket Qin  wrote:

> Hi Lucas,
>
> Yes, I agree that a dedicated end to end control flow would be ideal.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Tue, Jul 24, 2018 at 1:05 PM, Lucas Wang  wrote:
>
> > Thanks for the comment, Becket.
> > So far, we've been trying to avoid making any request handler thread
> > special.
> > But if we were to follow that path in order to make the two planes more
> > isolated,
> > what do you think about also having a dedicated processor thread,
> > and dedicated port for the controller?
> >
> > Today one processor thread can handle multiple connections, let's say 100
> > connections
> >
> > represented by connection0, ... connection99, among which connection0-98
> > are from clients, while connection99 is from
> >
> > the controller. Further let's say after one selector polling, there are
> > incoming requests on all connections.
> >
> > When the request queue is full, (either the data request being full in
> the
> > two queue design, or
> >
> > the one single queue being full in the deque design), the processor
> thread
> > will be blocked first
> >
> > when trying to enqueue the data request from connection0, then possibly
> > blocked for the data request
> >
> > from connection1, ... etc even though the controller request is ready to
> be
> > enqueued.
> >
> > To solve this problem, it seems we would need to have a separate port
> > dedicated to
> >
> > the controller, a dedicated processor thread, a dedicated controller
> > request queue,
> >
> > and pinning of one request handler thread for controller requests.
> >
> > Thanks,
> > Lucas
> >
> >
> > On Mon, Jul 23, 2018 at 6:00 PM, Becket Qin 
> wrote:
> >
> > > Personally I am not fond of the dequeue approach simply because it is
> > > against the basic idea of isolating the controller plane and data
> plane.
> > > With a single dequeue, theoretically speaking the controller requests
> can
> > > starve the clients requests. I would prefer the approach with a
> separate
> > > controller request queue and a dedicated controller request handler
> > thread.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Tue, Jul 24, 2018 at 8:16 AM, Lucas Wang 
> > wrote:
> > >
> > > > Sure, I can summarize the usage of correlation id. But before I do
> > that,
> > > it
> > > > seems
> > > > the same out-of-order processing can also happen to Produce requests
> > sent
> > > > by producers,
> > > > following the same example you described earlier.
> > > > If that's the case, I think this probably deserves a separate doc and
> > > > design independent of this KIP.
> > > >
> > > > Lucas
> > > >
> > > >
> > > >
> > > > On Mon, Jul 23, 2018 at 12:39 PM, Dong Lin 
> > wrote:
> > > >
> > > > > Hey Lucas,
> > > > >
> > > > > Could you update the KIP if you are confident with the approach
> which
> > > > uses
> > > > > correlation id? The idea around correlation id is kind of scattered
> > > > across
> > > > > multiple emails. It will be useful if other reviews can read the
> KIP
> > to
> > > > > understand the latest proposal.
> > > > >
> > > > > Thanks,
> > > > > Dong
> > > > >
> > > > > On Mon, Jul 23, 2018 at 12:32 PM, Mayuresh Gharat <
> > > > > gharatmayures...@gmail.com> wrote:
> > > > >
> > > > > > I like the idea of the dequeue implementation by Lucas. This will
> > > help
> > > > us
> > > > > > avoid additional queue for controller and additional configs in
> > > Kafka.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Mayuresh
> > > > > >
> > > > > > On Sun, Jul 22, 2018 at 2:58 AM Becket Qin  >
> > > > wrote:
> > > > > >
> > > > > > > Hi Jun,
> > > > > 

Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-07-23 Thread Mayuresh Gharat
I like the idea of the deque implementation by Lucas. This will help us
avoid an additional queue for the controller and additional configs in Kafka.

Thanks,

Mayuresh

On Sun, Jul 22, 2018 at 2:58 AM Becket Qin  wrote:

> Hi Jun,
>
> The usage of correlation ID might still be useful to address the cases
> that the controller epoch and leader epoch check are not sufficient to
> guarantee correct behavior. For example, if the controller sends a
> LeaderAndIsrRequest followed by a StopReplicaRequest, and the broker
> processes it in the reverse order, the replica may still be wrongly
> recreated, right?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> > On Jul 22, 2018, at 11:47 AM, Jun Rao  wrote:
> >
> > Hmm, since we already use controller epoch and leader epoch for properly
> > caching the latest partition state, do we really need correlation id for
> > ordering the controller requests?
> >
> > Thanks,
> >
> > Jun
> >
> > On Fri, Jul 20, 2018 at 2:18 PM, Becket Qin 
> wrote:
> >
> >> Lucas and Mayuresh,
> >>
> >> Good idea. The correlation id should work.
> >>
> >> In the ControllerChannelManager, a request will be resent until a
> response
> >> is received. So if the controller to broker connection disconnects after
> >> controller sends R1_a, but before the response of R1_a is received, a
> >> disconnection may cause the controller to resend R1_b. i.e. until R1 is
> >> acked, R2 won't be sent by the controller.
> >> This gives two guarantees:
> >> 1. Correlation id wise: R1_a < R1_b < R2.
> >> 2. On the broker side, when R2 is seen, R1 must have been processed at
> >> least once.
> >>
> >> So on the broker side, with a single thread controller request handler,
> the
> >> logic should be:
> >> 1. Process what ever request seen in the controller request queue
> >> 2. For the given epoch, drop request if its correlation id is smaller
> than
> >> that of the last processed request.
> >>
> >> Thanks,
> >>
> >> Jiangjie (Becket) Qin
> >>
> >> On Fri, Jul 20, 2018 at 8:07 AM, Jun Rao  wrote:
> >>
> >>> I agree that there is no strong ordering when there are more than one
> >>> socket connections. Currently, we rely on controllerEpoch and
> leaderEpoch
> >>> to ensure that the receiving broker picks up the latest state for each
> >>> partition.
> >>>
> >>> One potential issue with the dequeue approach is that if the queue is
> >> full,
> >>> there is no guarantee that the controller requests will be enqueued
> >>> quickly.
> >>>
> >>> Thanks,
> >>>
> >>> Jun
> >>>
> >>> On Fri, Jul 20, 2018 at 5:25 AM, Mayuresh Gharat <
> >>> gharatmayures...@gmail.com
> >>>> wrote:
> >>>
> >>>> Yea, the correlationId is only set to 0 in the NetworkClient
> >> constructor.
> >>>> Since we reuse the same NetworkClient between Controller and the
> >> broker,
> >>> a
> >>>> disconnection should not cause it to reset to 0, in which case it can
> >> be
> >>>> used to reject obsolete requests.
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Mayuresh
> >>>>
> >>>> On Thu, Jul 19, 2018 at 1:52 PM Lucas Wang 
> >>> wrote:
> >>>>
> >>>>> @Dong,
> >>>>> Great example and explanation, thanks!
> >>>>>
> >>>>> @All
> >>>>> Regarding the example given by Dong, it seems even if we use a queue,
> >>>> and a
> >>>>> dedicated controller request handling thread,
> >>>>> the same result can still happen because R1_a will be sent on one
> >>>>> connection, and R1_b & R2 will be sent on a different connection,
> >>>>> and there is no ordering between different connections on the broker
> >>>> side.
> >>>>> I was discussing with Mayuresh offline, and it seems correlation id
> >>>> within
> >>>>> the same NetworkClient object is monotonically increasing and never
> >>>> reset,
> >>>>> hence a broker can leverage that to properly reject obsolete
> >> requests.
> >>>>> Thoughts?
> >>>>>
> >>>>> Thanks,
> >>>>> Lucas
> >>

Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-07-19 Thread Mayuresh Gharat
Yea, the correlationId is only set to 0 in the NetworkClient constructor.
Since we reuse the same NetworkClient between the controller and the broker, a
disconnection should not cause it to reset to 0, in which case it can be
used to reject obsolete requests.
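
To make that concrete, the broker-side check could look roughly like the sketch
below. All names here are made up for illustration; this is not existing broker
code:

// Per-broker bookkeeping: for the current controller epoch, remember the highest
// correlation id processed so far and drop anything older than that.
class ControllerRequestOrdering {
    private int lastControllerEpoch = -1;
    private int lastCorrelationId = -1;

    synchronized boolean shouldProcess(int controllerEpoch, int correlationId) {
        if (controllerEpoch > lastControllerEpoch) {
            // A new controller generation resets the watermark.
            lastControllerEpoch = controllerEpoch;
            lastCorrelationId = correlationId;
            return true;
        }
        if (controllerEpoch < lastControllerEpoch)
            return false;                  // request from an older (zombie) controller
        if (correlationId <= lastCorrelationId)
            return false;                  // obsolete or duplicate request, drop it
        lastCorrelationId = correlationId;
        return true;
    }
}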

Thanks,

Mayuresh

On Thu, Jul 19, 2018 at 1:52 PM Lucas Wang  wrote:

> @Dong,
> Great example and explanation, thanks!
>
> @All
> Regarding the example given by Dong, it seems even if we use a queue, and a
> dedicated controller request handling thread,
> the same result can still happen because R1_a will be sent on one
> connection, and R1_b & R2 will be sent on a different connection,
> and there is no ordering between different connections on the broker side.
> I was discussing with Mayuresh offline, and it seems correlation id within
> the same NetworkClient object is monotonically increasing and never reset,
> hence a broker can leverage that to properly reject obsolete requests.
> Thoughts?
>
> Thanks,
> Lucas
>
> On Thu, Jul 19, 2018 at 12:11 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
> > Actually nvm, correlationId is reset in case of connection loss, I think.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Thu, Jul 19, 2018 at 11:11 AM Mayuresh Gharat <
> > gharatmayures...@gmail.com>
> > wrote:
> >
> > > I agree with Dong that out-of-order processing can happen with having 2
> > > separate queues as well and it can even happen today.
> > > Can we use the correlationId in the request from the controller to the
> > > broker to handle ordering ?
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > >
> > > On Thu, Jul 19, 2018 at 6:41 AM Becket Qin 
> wrote:
> > >
> > >> Good point, Joel. I agree that a dedicated controller request handling
> > >> thread would be a better isolation. It also solves the reordering
> issue.
> > >>
> > >> On Thu, Jul 19, 2018 at 2:23 PM, Joel Koshy 
> > wrote:
> > >>
> > >> > Good example. I think this scenario can occur in the current code as
> > >> well
> > >> > but with even lower probability given that there are other
> > >> non-controller
> > >> > requests interleaved. It is still sketchy though and I think a safer
> > >> > approach would be separate queues and pinning controller request
> > >> handling
> > >> > to one handler thread.
> > >> >
> > >> > On Wed, Jul 18, 2018 at 11:12 PM, Dong Lin 
> > wrote:
> > >> >
> > >> > > Hey Becket,
> > >> > >
> > >> > > I think you are right that there may be out-of-order processing.
> > >> However,
> > >> > > it seems that out-of-order processing may also happen even if we
> > use a
> > >> > > separate queue.
> > >> > >
> > >> > > Here is the example:
> > >> > >
> > >> > > - Controller sends R1 and got disconnected before receiving
> > response.
> > >> > Then
> > >> > > it reconnects and sends R2. Both requests now stay in the
> controller
> > >> > > request queue in the order they are sent.
> > >> > > - thread1 takes R1_a from the request queue and then thread2 takes
> > R2
> > >> > from
> > >> > > the request queue almost at the same time.
> > >> > > - So R1_a and R2 are processed in parallel. There is chance that
> > R2's
> > >> > > processing is completed before R1.
> > >> > >
> > >> > > If out-of-order processing can happen for both approaches with
> very
> > >> low
> > >> > > probability, it may not be worthwhile to add the extra queue. What
> > do
> > >> you
> > >> > > think?
> > >> > >
> > >> > > Thanks,
> > >> > > Dong
> > >> > >
> > >> > >
> > >> > > On Wed, Jul 18, 2018 at 6:17 PM, Becket Qin  >
> > >> > wrote:
> > >> > >
> > >> > > > Hi Mayuresh/Joel,
> > >> > > >
> > >> > > > Using the request channel as a dequeue was bright up some time
> ago
> > >> when
> > >> > > we
> > >> > > > initially thinking of prioritizing the request. The 

Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-07-19 Thread Mayuresh Gharat
Actually nvm, correlationId is reset in case of connection loss, I think.

Thanks,

Mayuresh

On Thu, Jul 19, 2018 at 11:11 AM Mayuresh Gharat 
wrote:

> I agree with Dong that out-of-order processing can happen with having 2
> separate queues as well and it can even happen today.
> Can we use the correlationId in the request from the controller to the
> broker to handle ordering ?
>
> Thanks,
>
> Mayuresh
>
>
> On Thu, Jul 19, 2018 at 6:41 AM Becket Qin  wrote:
>
>> Good point, Joel. I agree that a dedicated controller request handling
>> thread would be a better isolation. It also solves the reordering issue.
>>
>> On Thu, Jul 19, 2018 at 2:23 PM, Joel Koshy  wrote:
>>
>> > Good example. I think this scenario can occur in the current code as
>> well
>> > but with even lower probability given that there are other
>> non-controller
>> > requests interleaved. It is still sketchy though and I think a safer
>> > approach would be separate queues and pinning controller request
>> handling
>> > to one handler thread.
>> >
>> > On Wed, Jul 18, 2018 at 11:12 PM, Dong Lin  wrote:
>> >
>> > > Hey Becket,
>> > >
>> > > I think you are right that there may be out-of-order processing.
>> However,
>> > > it seems that out-of-order processing may also happen even if we use a
>> > > separate queue.
>> > >
>> > > Here is the example:
>> > >
>> > > - Controller sends R1 and got disconnected before receiving response.
>> > Then
>> > > it reconnects and sends R2. Both requests now stay in the controller
>> > > request queue in the order they are sent.
>> > > - thread1 takes R1_a from the request queue and then thread2 takes R2
>> > from
>> > > the request queue almost at the same time.
>> > > - So R1_a and R2 are processed in parallel. There is chance that R2's
>> > > processing is completed before R1.
>> > >
>> > > If out-of-order processing can happen for both approaches with very
>> low
>> > > probability, it may not be worthwhile to add the extra queue. What do
>> you
>> > > think?
>> > >
>> > > Thanks,
>> > > Dong
>> > >
>> > >
>> > > On Wed, Jul 18, 2018 at 6:17 PM, Becket Qin 
>> > wrote:
>> > >
>> > > > Hi Mayuresh/Joel,
>> > > >
>> > > > Using the request channel as a dequeue was bright up some time ago
>> when
>> > > we
>> > > > initially thinking of prioritizing the request. The concern was that
>> > the
>> > > > controller requests are supposed to be processed in order. If we can
>> > > ensure
>> > > > that there is one controller request in the request channel, the
>> order
>> > is
>> > > > not a concern. But in cases that there are more than one controller
>> > > request
>> > > > inserted into the queue, the controller request order may change and
>> > > cause
>> > > > problem. For example, think about the following sequence:
>> > > > 1. Controller successfully sent a request R1 to broker
>> > > > 2. Broker receives R1 and put the request to the head of the request
>> > > queue.
>> > > > 3. Controller to broker connection failed and the controller
>> > reconnected
>> > > to
>> > > > the broker.
>> > > > 4. Controller sends a request R2 to the broker
>> > > > 5. Broker receives R2 and add it to the head of the request queue.
>> > > > Now on the broker side, R2 will be processed before R1 is processed,
>> > > which
>> > > > may cause problem.
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Jiangjie (Becket) Qin
>> > > >
>> > > >
>> > > >
>> > > > On Thu, Jul 19, 2018 at 3:23 AM, Joel Koshy 
>> > wrote:
>> > > >
>> > > > > @Mayuresh - I like your idea. It appears to be a simpler less
>> > invasive
>> > > > > alternative and it should work. Jun/Becket/others, do you see any
>> > > > pitfalls
>> > > > > with this approach?
>> > > > >
>> > > > > On Wed, Jul 18, 2018 at 12:03 PM, Lucas Wang <
>> lucasatu...@gmail.com>
>> > > > > wrote:
>> > > 

Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-07-19 Thread Mayuresh Gharat
ise, it seems
> > > > > > the java class LinkedBlockingQueue can readily satisfy the
> > > requirement
> > > > > > by supporting a capacity, and also allowing inserting at both
> ends.
> > > > > >
> > > > > > My only concern is that this design is tied to the coincidence
> that
> > > > > > we have two request priorities and there are two ends to a deque.
> > > > > > Hence by using the proposed design, it seems the network layer is
> > > > > > more tightly coupled with upper layer logic, e.g. if we were to
> add
> > > > > > an extra priority level in the future for some reason, we would
> > > > probably
> > > > > > need to go back to the design of separate queues, one for each
> > > priority
> > > > > > level.
> > > > > >
> > > > > > In summary, I'm ok with both designs and lean toward your
> suggested
> > > > > > approach.
> > > > > > Let's hear what others think.
> > > > > >
> > > > > > @Becket,
> > > > > > In light of Mayuresh's suggested new design, I'm answering your
> > > > question
> > > > > > only in the context
> > > > > > of the current KIP design: I think your suggestion makes sense,
> and
> > > I'm
> > > > > ok
> > > > > > with removing the capacity config and
> > > > > > just relying on the default value of 20 being sufficient enough.
> > > > > >
> > > > > > Thanks,
> > > > > > Lucas
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Jul 18, 2018 at 9:57 AM, Mayuresh Gharat <
> > > > > > gharatmayures...@gmail.com
> > > > > > > wrote:
> > > > > >
> > > > > > > Hi Lucas,
> > > > > > >
> > > > > > > Seems like the main intent here is to prioritize the controller
> > > > request
> > > > > > > over any other requests.
> > > > > > > In that case, we can change the request queue to a dequeue,
> where
> > > you
> > > > > > > always insert the normal requests (produce, consume,..etc) to
> the
> > > end
> > > > > of
> > > > > > > the dequeue, but if its a controller request, you insert it to
> > the
> > > > head
> > > > > > of
> > > > > > > the queue. This ensures that the controller request will be
> given
> > > > > higher
> > > > > > > priority over other requests.
> > > > > > >
> > > > > > > Also since we only read one request from the socket and mute it
> > and
> > > > > only
> > > > > > > unmute it after handling the request, this would ensure that we
> > > don't
> > > > > > > handle controller requests out of order.
> > > > > > >
> > > > > > > With this approach we can avoid the second queue and the
> > additional
> > > > > > config
> > > > > > > for the size of the queue.
> > > > > > >
> > > > > > > What do you think ?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Mayuresh
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Jul 18, 2018 at 3:05 AM Becket Qin <
> becket@gmail.com
> > >
> > > > > wrote:
> > > > > > >
> > > > > > > > Hey Joel,
> > > > > > > >
> > > > > > > > Thank for the detail explanation. I agree the current design
> > > makes
> > > > > > sense.
> > > > > > > > My confusion is about whether the new config for the
> controller
> > > > queue
> > > > > > > > capacity is necessary. I cannot think of a case in which
> users
> > > > would
> > > > > > > change
> > > > > > > > it.
> >

Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-07-18 Thread Mayuresh Gharat
Hi Lucas,

Seems like the main intent here is to prioritize the controller request
over any other requests.
In that case, we can change the request queue to a deque, where you
always insert the normal requests (produce, consume, etc.) at the end of
the deque, but if it is a controller request, you insert it at the head of
the deque. This ensures that the controller request will be given higher
priority than other requests.

Also since we only read one request from the socket and mute it and only
unmute it after handling the request, this would ensure that we don't
handle controller requests out of order.

With this approach we can avoid the second queue and the additional config
for the size of the queue.
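
To illustrate, here is a minimal sketch of the deque idea using
java.util.concurrent.LinkedBlockingDeque. The Request interface and class names
are stand-ins, not the actual RequestChannel code:

import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;

public class PrioritizedRequestQueue {
    // Stand-in for the broker's internal request type.
    public interface Request {
        boolean isFromController();
    }

    private final BlockingDeque<Request> deque;

    public PrioritizedRequestQueue(int capacity) {
        this.deque = new LinkedBlockingDeque<>(capacity);
    }

    public void send(Request request) throws InterruptedException {
        if (request.isFromController())
            deque.putFirst(request);   // controller requests jump ahead of queued data requests
        else
            deque.putLast(request);    // produce/fetch/etc. keep FIFO order among themselves
    }

    public Request receive() throws InterruptedException {
        return deque.takeFirst();      // request handler threads consume exactly as before
    }
}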

What do you think ?

Thanks,

Mayuresh


On Wed, Jul 18, 2018 at 3:05 AM Becket Qin  wrote:

> Hey Joel,
>
> Thank for the detail explanation. I agree the current design makes sense.
> My confusion is about whether the new config for the controller queue
> capacity is necessary. I cannot think of a case in which users would change
> it.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Wed, Jul 18, 2018 at 6:00 PM, Becket Qin  wrote:
>
> > Hi Lucas,
> >
> > I guess my question can be rephrased to "do we expect user to ever change
> > the controller request queue capacity"? If we agree that 20 is already a
> > very generous default number and we do not expect user to change it, is
> it
> > still necessary to expose this as a config?
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Wed, Jul 18, 2018 at 2:29 AM, Lucas Wang 
> wrote:
> >
> >> @Becket
> >> 1. Thanks for the comment. You are right that normally there should be
> >> just
> >> one controller request because of muting,
> >> and I had NOT intended to say there would be many enqueued controller
> >> requests.
> >> I went through the KIP again, and I'm not sure which part conveys that
> >> info.
> >> I'd be happy to revise if you point it out the section.
> >>
> >> 2. Though it should not happen in normal conditions, the current design
> >> does not preclude multiple controllers running
> >> at the same time, hence if we don't have the controller queue capacity
> >> config and simply make its capacity to be 1,
> >> network threads handling requests from different controllers will be
> >> blocked during those troublesome times,
> >> which is probably not what we want. On the other hand, adding the extra
> >> config with a default value, say 20, guards us from issues in those
> >> troublesome times, and IMO there isn't much downside of adding the extra
> >> config.
> >>
> >> @Mayuresh
> >> Good catch, this sentence is an obsolete statement based on a previous
> >> design. I've revised the wording in the KIP.
> >>
> >> Thanks,
> >> Lucas
> >>
> >> On Tue, Jul 17, 2018 at 10:33 AM, Mayuresh Gharat <
> >> gharatmayures...@gmail.com> wrote:
> >>
> >> > Hi Lucas,
> >> >
> >> > Thanks for the KIP.
> >> > I am trying to understand why you think "The memory consumption can
> rise
> >> > given the total number of queued requests can go up to 2x" in the
> impact
> >> > section. Normally the requests from controller to a Broker are not
> high
> >> > volume, right ?
> >> >
> >> >
> >> > Thanks,
> >> >
> >> > Mayuresh
> >> >
> >> > On Tue, Jul 17, 2018 at 5:06 AM Becket Qin 
> >> wrote:
> >> >
> >> > > Thanks for the KIP, Lucas. Separating the control plane from the
> data
> >> > plane
> >> > > makes a lot of sense.
> >> > >
> >> > > In the KIP you mentioned that the controller request queue may have
> >> many
> >> > > requests in it. Will this be a common case? The controller requests
> >> still
> >> > > goes through the SocketServer. The SocketServer will mute the
> channel
> >> > once
> >> > > a request is read and put into the request channel. So assuming
> there
> >> is
> >> > > only one connection between controller and each broker, on the
> broker
> >> > side,
> >> > > there should be only one controller request in the controller
> request
> >> > queue
> >> > > at any given time. If that is the case, do we need a separate
> >> controller
> >> > > request queue capa

Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-07-17 Thread Mayuresh Gharat
Hi Lucas,

Thanks for the KIP.
I am trying to understand why you think "The memory consumption can rise
given the total number of queued requests can go up to 2x" in the impact
section. Normally the requests from the controller to a broker are not high
volume, right?


Thanks,

Mayuresh

On Tue, Jul 17, 2018 at 5:06 AM Becket Qin  wrote:

> Thanks for the KIP, Lucas. Separating the control plane from the data plane
> makes a lot of sense.
>
> In the KIP you mentioned that the controller request queue may have many
> requests in it. Will this be a common case? The controller requests still
> goes through the SocketServer. The SocketServer will mute the channel once
> a request is read and put into the request channel. So assuming there is
> only one connection between controller and each broker, on the broker side,
> there should be only one controller request in the controller request queue
> at any given time. If that is the case, do we need a separate controller
> request queue capacity config? The default value 20 means that we expect
> there are 20 controller switches to happen in a short period of time. I am
> not sure whether someone should increase the controller request queue
> capacity to handle such case, as it seems indicating something very wrong
> has happened.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
> On Fri, Jul 13, 2018 at 1:10 PM, Dong Lin  wrote:
>
> > Thanks for the update Lucas.
> >
> > I think the motivation section is intuitive. It will be good to learn
> more
> > about the comments from other reviewers.
> >
> > On Thu, Jul 12, 2018 at 9:48 PM, Lucas Wang 
> wrote:
> >
> > > Hi Dong,
> > >
> > > I've updated the motivation section of the KIP by explaining the cases
> > that
> > > would have user impacts.
> > > Please take a look at let me know your comments.
> > >
> > > Thanks,
> > > Lucas
> > >
> > > On Mon, Jul 9, 2018 at 5:53 PM, Lucas Wang 
> > wrote:
> > >
> > > > Hi Dong,
> > > >
> > > > The simulation of disk being slow is merely for me to easily
> construct
> > a
> > > > testing scenario
> > > > with a backlog of produce requests. In production, other than the
> disk
> > > > being slow, a backlog of
> > > > produce requests may also be caused by high produce QPS.
> > > > In that case, we may not want to kill the broker and that's when this
> > KIP
> > > > can be useful, both for JBOD
> > > > and non-JBOD setup.
> > > >
> > > > Going back to your previous question about each ProduceRequest
> covering
> > > 20
> > > > partitions that are randomly
> > > > distributed, let's say a LeaderAndIsr request is enqueued that tries
> to
> > > > switch the current broker, say broker0, from leader to follower
> > > > *for one of the partitions*, say *test-0*. For the sake of argument,
> > > > let's also assume the other brokers, say broker1, have *stopped*
> > fetching
> > > > from
> > > > the current broker, i.e. broker0.
> > > > 1. If the enqueued produce requests have acks =  -1 (ALL)
> > > >   1.1 without this KIP, the ProduceRequests ahead of LeaderAndISR
> will
> > be
> > > > put into the purgatory,
> > > > and since they'll never be replicated to other brokers
> (because
> > > of
> > > > the assumption made above), they will
> > > > be completed either when the LeaderAndISR request is
> processed
> > or
> > > > when the timeout happens.
> > > >   1.2 With this KIP, broker0 will immediately transition the
> partition
> > > > test-0 to become a follower,
> > > > after the current broker sees the replication of the
> remaining
> > 19
> > > > partitions, it can send a response indicating that
> > > > it's no longer the leader for the "test-0".
> > > >   To see the latency difference between 1.1 and 1.2, let's say there
> > are
> > > > 24K produce requests ahead of the LeaderAndISR, and there are 8 io
> > > threads,
> > > >   so each io thread will process approximately 3000 produce requests.
> > Now
> > > > let's investigate the io thread that finally processed the
> > LeaderAndISR.
> > > >   For the 3000 produce requests, if we model the time when their
> > > remaining
> > > > 19 partitions catch up as t0, t1, ...t2999, and the LeaderAndISR
> > request
> > > is
> > > > processed at time t3000.
> > > >   Without this KIP, the 1st produce request would have waited an
> extra
> > > > t3000 - t0 time in the purgatory, the 2nd an extra time of t3000 -
> t1,
> > > etc.
> > > >   Roughly speaking, the latency difference is bigger for the earlier
> > > > produce requests than for the later ones. For the same reason, the
> more
> > > > ProduceRequests queued
> > > >   before the LeaderAndISR, the bigger benefit we get (capped by the
> > > > produce timeout).
> > > > 2. If the enqueued produce requests have acks=0 or acks=1
> > > >   There will be no latency differences in this case, but
> > > >   2.1 without this KIP, the records of partition test-0 in the
> > > > ProduceRequests ahead of the LeaderAndISR will be appended to the
> local
> > > log,
> > > > 

[jira] [Created] (KAFKA-7096) Consumer should drop the data for unassigned topic partitions

2018-06-25 Thread Mayuresh Gharat (JIRA)
Mayuresh Gharat created KAFKA-7096:
--

 Summary: Consumer should drop the data for unassigned topic 
partitions
 Key: KAFKA-7096
 URL: https://issues.apache.org/jira/browse/KAFKA-7096
 Project: Kafka
  Issue Type: Improvement
Reporter: Mayuresh Gharat
Assignee: Mayuresh Gharat


Currently, if a client has assigned topics T1, T2, and T3 and calls poll(), the
poll might fetch data for partitions of all 3 topics. Now if the client unassigns
some topics (for example T3) and calls poll(), we still hold the data for T3 in
the completedFetches queue until we actually reach the buffered data for the
unassigned topics (T3 in our example) on subsequent poll() calls, at which point
we drop that data. Holding on to this data is unnecessary.

When a client creates a topic, it takes time for the broker to fetch the ACLs for
the topic. During this time, the client will issue fetch requests for the topic
and will get responses for the partitions of this topic. The response consists of
a TopicAuthorizationException for each of the partitions. The response for each
partition is wrapped in a completedFetch and added to the completedFetches queue.
Now when the client calls the next poll() it sees the TopicAuthorizationException
from the first buffered completedFetch. At this point the client chooses to sleep
for 1.5 min as a backoff (as per the design), hoping that the broker fetches the
ACLs from the ACL store in the meantime. In fact, the broker has already fetched
the ACLs by this time. When the client calls poll() after the sleep, it again sees
the TopicAuthorizationException from the second completedFetch and it sleeps
again. So it takes (1.5 * 60 * partitions) seconds before the client can see any
data. With this patch, when the client sees the first TopicAuthorizationException,
it can call assign(emptySet), which will get rid of the buffered completedFetches
(those with the TopicAuthorizationException), and it can then call
assign(topicPartitions) again before calling poll(). With this patch we found that
the client was able to get the records as soon as the broker fetched the ACLs from
the ACL store.
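
A sketch of that client-side sequence is below. It is illustrative only: it
assumes an already-configured KafkaConsumer<byte[], byte[]> named consumer, the
topic name is a placeholder, and imports from org.apache.kafka.clients.consumer.*
and org.apache.kafka.common.* are omitted.

// On the first TopicAuthorizationException after creating a topic, clear the
// assignment to drop the buffered completedFetches that carry the stale error,
// then re-assign and poll again instead of sleeping through the whole backoff.
List<TopicPartition> assignment = Arrays.asList(
        new TopicPartition("newly-created-topic", 0),
        new TopicPartition("newly-created-topic", 1));
consumer.assign(assignment);
try {
    ConsumerRecords<byte[], byte[]> records = consumer.poll(1000);
    // process records ...
} catch (TopicAuthorizationException e) {
    // The broker has very likely loaded the ACLs by now; drop the stale buffered
    // errors and retry immediately rather than sleeping for the full backoff.
    consumer.assign(Collections.<TopicPartition>emptySet());
    consumer.assign(assignment);
}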





Re: [VOTE] KIP-189 - Improve principal builder interface and add support for SASL

2017-09-13 Thread Mayuresh Gharat
Sure.

Thanks,

Mayuresh

On Wed, Sep 13, 2017 at 3:12 PM, Jun Rao  wrote:

> Hi, Mayuresh,
>
> Does this KIP obviate the need for KIP-111? If so, could you close that
> one?
>
> Thanks,
>
> Jun
>
> On Wed, Sep 13, 2017 at 8:43 AM, Jason Gustafson 
> wrote:
>
>> Hi All,
>>
>> I wanted to mention one minor change that came out of the code review.
>> We've added an additional method to AuthenticationContext to expose the
>> address of the authenticated client. This can be useful, for example, to
>> enforce host-based quotas. I've updated the KIP.
>>
>> Thanks,
>> Jason
>>
>> On Fri, Sep 8, 2017 at 1:12 AM, Edoardo Comar  wrote:
>>
>> > I am late to the party and my +1 vote is useless - but I took eventually
>> > the time to go through it and it's a great improvement.
>> > It'd enable us to carry along with the Principal a couple of additional
>> > attributes without the hacks we're doing today :-)
>> >
>> > cheers
>> > --
>> >
>> > Edoardo Comar
>> >
>> > IBM Message Hub
>> >
>> > IBM UK Ltd, Hursley Park, SO21 2JN
>> >
>> >
>> >
>> > From:   Jason Gustafson 
>> > To: dev@kafka.apache.org
>> > Date:   07/09/2017 17:23
>> > Subject:Re: [VOTE] KIP-189 - Improve principal builder interface
>> > and add support for SASL
>> >
>> >
>> >
>> > I am closing the vote. Here are the totals:
>> >
>> > Binding: Ismael, Rajini, Jun, (Me)
>> > Non-binding: Mayuresh, Manikumar, Mickael
>> >
>> > Thanks all for the reviews!
>> >
>> >
>> >
>> > On Wed, Sep 6, 2017 at 2:22 PM, Jason Gustafson 
>> > wrote:
>> >
>> > > Hi All,
>> > >
>> > > When implementing this, I found that the SecurityProtocol class has
>> some
>> > > internal details which we might not want to expose to users (in
>> > particular
>> > > to enable testing). Since it's still useful to know the security
>> > protocol
>> > > in use in some cases, and since the security protocol names are
>> already
>> > > exposed in configuration (and hence cannot easily change), I have
>> > modified
>> > > the method in AuthenticationContext to return the name of the security
>> > > protocol instead. Let me know if there are any concerns with this
>> > change.
>> > > Otherwise, I will close out the vote.
>> > >
>> > > Thanks,
>> > > Jason
>> > >
>> > > On Tue, Sep 5, 2017 at 11:10 AM, Ismael Juma 
>> wrote:
>> > >
>> > >> Thanks for the KIP, +1 (binding).
>> > >>
>> > >> Ismael
>> > >>
>> > >> On Wed, Aug 30, 2017 at 4:51 PM, Jason Gustafson > >
>> > >> wrote:
>> > >>
>> > >> > I'd like to open the vote for KIP-189:
>> > >> >
>> > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > >> > 189%3A+Improve+principal+builder+interface+and+add+support+for+SASL.
>> > >> > Thanks to everyone who helped review.
>> > >> >
>> > >> > -Jason
>> > >> >
>> > >>
>> > >
>> > >
>> >
>> >
>> >
>> > Unless stated otherwise above:
>> > IBM United Kingdom Limited - Registered in England and Wales with number
>> > 741598.
>> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
>> 3AU
>> >
>>
>
>


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [VOTE] KIP-189 - Improve principal builder interface and add support for SASL

2017-08-30 Thread Mayuresh Gharat
+1 (non-binding)

Thanks,

Mayuresh

On Wed, Aug 30, 2017 at 8:51 AM, Jason Gustafson  wrote:

> I'd like to open the vote for KIP-189:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 189%3A+Improve+principal+builder+interface+and+add+support+for+SASL.
> Thanks to everyone who helped review.
>
> -Jason
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-189: Improve principal builder interface and add support for SASL

2017-08-25 Thread Mayuresh Gharat
Perfect.
As long as there is a way we can access the originally created Principal in
the Authorizer, it would solve the KIP-111 issue.

This is really helpful, thanks again.

Thanks,

Mayuresh

On Fri, Aug 25, 2017 at 3:13 PM, Jason Gustafson <ja...@confluent.io> wrote:

> Hi Mayuresh,
>
> To clarify, the intention is to use the KafkaPrincipal object built by the
> KafkaPrincipalBuilder inside the Session. So we would remove the logic to
> construct a new KafkaPrincipal using only the name from the Principal. Then
> it should be possible to pass the `AuthzPrincipal` to the underlying
> library through the `Extended_Plugged_In_Class` as you've suggested above.
> Is that reasonable for this use case?
>
> Thanks,
> Jason
>
>
> On Fri, Aug 25, 2017 at 2:44 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi Jason,
> >
> > Thanks for the replies.
> >
> > I think it would be better to discuss with an example that we were trying
> > to address with KIP-111 and see if the current mentioned solution would
> > address it.
> >
> > Let's consider a third party library called authz_lib that is provided by
> > some Security team at  some company.
> >
> >- When we call authz_lib.createPrincipal(X509_cert), it would return
> an
> >AuthzPrincipal that implements Java.Security.Principal.
> >
> >
> >- The authz_lib also provides an checkAccess() call that takes in
> 3
> >parameters :
> >   - authz_principal
> >   - operation type ("Read", "Write"...)
> >   - resource (for simplicity lets consider it as a TopicName)
> >
> >
> >- The AuthzPrincipal looks like this :
> >
> > class AuthzPrincipal implements java.security.Principal
> > {
> > String name;
> > String field1;
> > Object field2;
> > Object field3;
> > .//Some third party logic..
> > }
> >
> >
> >- In PrincipalBuilder.buildPrincipal() would return AuthzPrincipal as
> >follows :
> >
> > public Principal buildPrincipal(...)
> > {
> > ..
> > X509Certificate x509Cert = session.getCert(..);
> > return authz_lib.createPrincipal(x509Cert);
> > }
> >
> >
> >- The custom Authorizer (lets call it CustomAuthzAuthorizer), we would
> >use the checkAccess() function provided by the authz_lib as follows :
> >
> > public class CustomAuthzAuthorizer implements Authorizer
> > {
> > .
> > public boolean authorize(.)
> > {
> >AuthzPrincipal authz_principal = (AuthzPrincipal)
> > session.getPrincipal();
> > return authz_lib.checkAccess(authz_principal, "Read", "topicX");
> > }
> > ..
> > }
> >
> >
> >- The issue with current implementation is that in
> >processCompletedReceives() in SocketServer we create a KafkaPrincipal
> >that just extracts the name from AuthzPrincipal as follows :
> >
> > session = RequestChannel.Session(new
> > KafkaPrincipal(KafkaPrincipal.USER_TYPE,
> > *openOrClosingChannel.principal.getName*),
> > openOrClosingChannel.socketAddress)
> >
> > So the "AuthzPrincipal authz_principal = (AuthzPrincipal)
> > session.getPrincipal()" call in the CustomAuthzAuthorizer would error
> > out because we are trying to cast a KafkaPrincipal to AuthzPrincipal.
> >
> >
> >
> > In your reply when you said that :
> >
> > The KIP says that a user can have a class that extends KafkaPrincipal.
> > Would this extended class be used when constructing the Session object
> > in the SocketServer instead of constructing a new KafkaPrincipal?
> >
> > Yes, that's correct. We want to allow the authorizer to be able to
> leverage
> > > additional information from the authentication layer.
> >
> >
> > Would it make sense to make this extended class pluggable and when
> > constructing the Session object in SocketServer check if a plugin is
> > defined and use it and if not use the default KafkaPrincipal something
> like
> > :
> >
> > if (getConfig("principal.pluggedIn.class").isDefined())
> > //"principal.pluggedIn.class"
> > is just an example name for the config
> > {
> > session = RequestChannel.Session(*Extended_Plugged_In_Class*,
> > openOrClosingChannel.socketAddress)
> > }
> > else
> > {
> > session = RequestChannel.Session(new KafkaPrincipal(KafkaPrincipal.
> > USER_TYPE,
> > *openOrClosingCha

Re: [DISCUSS] KIP-189: Improve principal builder interface and add support for SASL

2017-08-25 Thread Mayuresh Gharat
 >use this third party library to authorize using this custom Principal
> >object. The developer who is implementing the Kafka Authorizer should
> >not be caring about what the custom Principal would look like and its
> >details, since it will just pass it to the third party library in
> Kafka
> >Authorizer's authorize() call.
>
>
> I'm not sure I understand this. Are you saying that the authorizer and
> principal builder are implemented by separate individuals? If the
> authorizer doesn't understand how to identify the principal, then it
> wouldn't work, right? Maybe I'm missing something?
>
> Let me explain how I see this. The simple ACL authorizer that Kafka ships
> with understands user principals as consisting of a type and a name. Any
> principal builder that follows this assumption will work with the
> SimpleAclAuthorizer. In some cases, the principal builder may provide
> additional metadata in a KafkaPrincipal extension such as user groups or
> roles. This information is not needed to identify the user principal, so
> the builder is still compatible with the SimpleAclAuthorizer. It would also
> be compatible with a RoleBasedAuthorizer which understood how to use the
> role metadata provided by the KafkaPrincipal extension. Basically what we
> would have is a user principal which is related to one or more role
> principals through the KafkaPrincipal extension. Both user and role
> principals are identifiable with a type and a name, so the ACL command tool
> can then be used (perhaps with a custom authorizer) to define permissions
> in either case.
>
> On the other hand, if a user principal is identified by more than just its
> name, then it is not compatible with the SimpleAclAuthorizer. This doesn't
> necessarily rule out this use case. As long as the authorizer and the
> principal builder both agree on how user principals are identified, then
> they can still be used together. But I am explicitly leaving out support in
> the ACL command tool for this use case in this KIP. This is mostly about
> clarifying what is compatible with the authorization system that Kafka
> ships with. Of course we can always reconsider it in the future.
>
> Thanks,
> Jason
>
> On Fri, Aug 25, 2017 at 10:48 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
> > Hi Jason,
> >
> > Thanks a lot for the KIP and sorry for the delayed response.
> >
> > I had a few questions :
> >
> >
> >- The KIP says that a user can have a class that extends
> KafkaPrincipal.
> >Would this extended class be used when constructing the Session object
> > in
> >the SocketServer instead of constructing a new KafkaPrincipal?
> >
> >
> >- The KIP says "A principal is always identifiable by a principal type
> >and a name. Nothing else should ever be required." This might not be
> > true
> >always, right? For example, we might have a custom third party ACL
> > library
> >that creates a custom Principal from the passed in cert (this is done
> in
> >PrincipalBuilder/KafkaPrincipalBuilder) and the custom Authorizer
> might
> >use this third party library to authorize using this custom Principal
> >object. The developer who is implementing the Kafka Authorizer should
> >not be caring about what the custom Principal would look like and its
> >details, since it will just pass it to the third party library in
> Kafka
> >Authorizer's authorize() call.
> >
> >
> > Thanks,
> >
> > Mayuresh
> >
> >
> > On Thu, Aug 24, 2017 at 10:21 AM, Mayuresh Gharat <
> > gharatmayures...@gmail.com> wrote:
> >
> > > Sure.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Wed, Aug 23, 2017 at 5:07 PM, Jun Rao <j...@confluent.io> wrote:
> > >
> > >> Hi, Mayuresh,
> > >>
> > >> Since this KIP covers the requirement in KIP-111, could you review it
> > too?
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > >>
> > >> On Tue, Aug 22, 2017 at 3:04 PM, Jason Gustafson <ja...@confluent.io>
> > >> wrote:
> > >>
> > >>> Bump. I'll open a vote in a few days if there are no comments.
> > >>>
> > >>> Thanks,
> > >>> Jason
> > >>>
> > >>> On Sat, Aug 19, 2017 at 12:28 AM, Ismael Juma <ism...@juma.me.uk>
> > wrote:
> > >>>
> > >>> > Thanks for the KIP Jason. It 

Re: [DISCUSS] KIP-189: Improve principal builder interface and add support for SASL

2017-08-25 Thread Mayuresh Gharat
Hi Jason,

Thanks a lot for the KIP and sorry for the delayed response.

I had a few questions :


   - The KIP says that a user can have a class that extends KafkaPrincipal.
   Would this extended class be used when constructing the Session object in
   the SocketServer instead of constructing a new KafkaPrincipal?


   - The KIP says "A principal is always identifiable by a principal type
   and a name. Nothing else should ever be required." This might not be true
   always, right? For example, we might have a custom third party ACL library
   that creates a custom Principal from the passed in cert (this is done in
   PrincipalBuilder/KafkaPrincipalBuilder) and the custom Authorizer might
   use this third party library to authorize using this custom Principal
   object. The developer implementing the Kafka Authorizer should not have to
   care about what the custom Principal looks like or what its details are,
   since the Authorizer will just pass it on to the third party library in its
   authorize() call (rough sketch below).
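
Rough sketch of what I mean by the second bullet. The class names and the
simplified authorize() signature below are made up purely for illustration:

import java.security.Principal;
import org.apache.kafka.common.security.auth.KafkaPrincipal;

// Hypothetical KafkaPrincipal extension that carries the third-party principal
// produced by a custom PrincipalBuilder, so a custom Authorizer can get it back.
class AuthzKafkaPrincipal extends KafkaPrincipal {
    private final Principal thirdPartyPrincipal;

    AuthzKafkaPrincipal(String name, Principal thirdPartyPrincipal) {
        super(KafkaPrincipal.USER_TYPE, name);
        this.thirdPartyPrincipal = thirdPartyPrincipal;
    }

    Principal thirdPartyPrincipal() {
        return thirdPartyPrincipal;
    }
}

// Hypothetical authorizer with a simplified signature: it never interprets the
// principal itself, it just hands it back to the third-party ACL library.
class CustomAuthorizerSketch {
    boolean authorize(KafkaPrincipal principal, String operation, String resource) {
        Principal original = ((AuthzKafkaPrincipal) principal).thirdPartyPrincipal();
        // return thirdPartyAclLib.checkAccess(original, operation, resource);
        return false; // placeholder
    }
}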


Thanks,

Mayuresh


On Thu, Aug 24, 2017 at 10:21 AM, Mayuresh Gharat <
gharatmayures...@gmail.com> wrote:

> Sure.
>
> Thanks,
>
> Mayuresh
>
> On Wed, Aug 23, 2017 at 5:07 PM, Jun Rao <j...@confluent.io> wrote:
>
>> Hi, Mayuresh,
>>
>> Since this KIP covers the requirement in KIP-111, could you review it too?
>>
>> Thanks,
>>
>> Jun
>>
>>
>> On Tue, Aug 22, 2017 at 3:04 PM, Jason Gustafson <ja...@confluent.io>
>> wrote:
>>
>>> Bump. I'll open a vote in a few days if there are no comments.
>>>
>>> Thanks,
>>> Jason
>>>
>>> On Sat, Aug 19, 2017 at 12:28 AM, Ismael Juma <ism...@juma.me.uk> wrote:
>>>
>>> > Thanks for the KIP Jason. It seems reasonable and cleans up some
>>> > inconsistencies in that area. It would be great to get some feedback
>>> from
>>> > Mayuresh and others who worked on KIP-111.
>>> >
>>> > Ismael
>>> >
>>> > On Thu, Aug 17, 2017 at 1:21 AM, Jason Gustafson <ja...@confluent.io>
>>> > wrote:
>>> >
>>> > > Hi All,
>>> > >
>>> > > I've added a new KIP to improve and extend the principal building API
>>> > that
>>> > > Kafka exposes:
>>> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>> > > 189%3A+Improve+principal+builder+interface+and+add+support+for+SASL
>>> > > .
>>> > >
>>> > > As always, feedback is appreciated.
>>> > >
>>> > > Thanks,
>>> > > Jason
>>> > >
>>> >
>>>
>>
>>
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-189: Improve principal builder interface and add support for SASL

2017-08-24 Thread Mayuresh Gharat
Sure.

Thanks,

Mayuresh

On Wed, Aug 23, 2017 at 5:07 PM, Jun Rao  wrote:

> Hi, Mayuresh,
>
> Since this KIP covers the requirement in KIP-111, could you review it too?
>
> Thanks,
>
> Jun
>
>
> On Tue, Aug 22, 2017 at 3:04 PM, Jason Gustafson 
> wrote:
>
>> Bump. I'll open a vote in a few days if there are no comments.
>>
>> Thanks,
>> Jason
>>
>> On Sat, Aug 19, 2017 at 12:28 AM, Ismael Juma  wrote:
>>
>> > Thanks for the KIP Jason. It seems reasonable and cleans up some
>> > inconsistencies in that area. It would be great to get some feedback
>> from
>> > Mayuresh and others who worked on KIP-111.
>> >
>> > Ismael
>> >
>> > On Thu, Aug 17, 2017 at 1:21 AM, Jason Gustafson 
>> > wrote:
>> >
>> > > Hi All,
>> > >
>> > > I've added a new KIP to improve and extend the principal building API
>> > that
>> > > Kafka exposes:
>> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > > 189%3A+Improve+principal+builder+interface+and+add+support+for+SASL
>> > > .
>> > >
>> > > As always, feedback is appreciated.
>> > >
>> > > Thanks,
>> > > Jason
>> > >
>> >
>>
>
>


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: Kafka Read Data from All Partition Using Key or Timestamp

2017-05-25 Thread Mayuresh Gharat
Hi Senthil,

Kafka does allow searching for messages by timestamp since KIP-33:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index#KIP-33-Addatimebasedlogindex-Searchmessagebytimestamp

The new consumer does provide a way to get offsets by timestamp. You
can use these offsets to seek to that position and consume from there. So if
you want to consume within a time range, you can get the start and end offsets
based on the timestamps, seek to the start offset, and consume and process
the data till you reach the end offset.

But these timestamps are either CreateTime (when the message was created, which
you will have to specify when you do the send()) or LogAppendTime (when the
message was appended to the log on the Kafka broker):
https://kafka.apache.org/0101/javadoc/org/apache/kafka/clients/producer/ProducerRecord.html

Kafka does not look at the fields in your data (key/value) when giving the data
back to you. What I meant is that it will not look at a timestamp that you
specify inside the actual data payload.
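
A rough sketch of that pattern with the Java consumer (0.10.1+), assuming an
already-constructed KafkaConsumer<byte[], byte[]> named consumer; the topic name
is a placeholder and imports are omitted:

// Fetch everything whose broker-recorded timestamp lies in [startTs, endTs].
long endTs = System.currentTimeMillis();
long startTs = endTs - 2 * 60 * 1000L;                        // e.g. "last 2 minutes"

Map<TopicPartition, Long> timesToSearch = new HashMap<>();
for (PartitionInfo info : consumer.partitionsFor("raw-logs"))  // placeholder topic
    timesToSearch.put(new TopicPartition(info.topic(), info.partition()), startTs);

Map<TopicPartition, OffsetAndTimestamp> startOffsets = consumer.offsetsForTimes(timesToSearch);
consumer.assign(startOffsets.keySet());
for (Map.Entry<TopicPartition, OffsetAndTimestamp> e : startOffsets.entrySet())
    if (e.getValue() != null)
        consumer.seek(e.getKey(), e.getValue().offset());

// Simplified read loop: stop once a record newer than endTs shows up.
// (A real implementation would also handle partitions with no new data.)
boolean done = false;
while (!done) {
    for (ConsumerRecord<byte[], byte[]> record : consumer.poll(1000)) {
        if (record.timestamp() > endTs) { done = true; break; }
        // process(record) ...
    }
}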

Thanks,

Mayuresh

On Thu, May 25, 2017 at 12:43 PM, SenthilKumar K 
wrote:

> Hello Dev Team, Pls let me know if any option to read data from Kafka (all
> partition ) using timestamp . Also can we set custom offset value to
> messages ?
>
> Cheers,
> Senthil
>
> On Wed, May 24, 2017 at 7:33 PM, SenthilKumar K 
> wrote:
>
> > Hi All ,  We have been using Kafka for our Use Case which helps in
> > delivering real time raw logs.. I have a requirement to fetch data from
> > Kafka by using offset ..
> >
> > DataSet Example :
> > {"access_date":"2017-05-24 13:57:45.044","format":"json",
> > "start":"1490296463.031"}
> > {"access_date":"2017-05-24 13:57:46.044","format":"json",
> > "start":"1490296463.031"}
> > {"access_date":"2017-05-24 13:57:47.044","format":"json",
> > "start":"1490296463.031"}
> > {"access_date":"2017-05-24 13:58:02.042","format":"json",
> > "start":"1490296463.031"}
> >
> > Above JSON data will be stored in Kafka..
> >
> > Key --> acces_date in epoch format
> > Value --> whole JSON.
> >
> > Data Access Pattern:
> >   1) Get me last 2 minz data ?
> >2) Get me records between 2017-05-24 13:57:42:00 to 2017-05-24
> > 13:57:44:00 ?
> >
> > How to achieve this in Kafka ?
> >
> > I tried using SimpleConsumer , but it expects partition and not sure
> > SimpleConsumer would match our requirement...
> >
> > Appreciate you help !
> >
> > Cheers,
> > Senthil
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [VOTE] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-04-19 Thread Mayuresh Gharat
Hi Ismael,

I went ahead and created a patch so that we get on the same page regarding what
we want to do. We can make changes if you have any comments.

Thanks,

Mayuresh

On Mon, Apr 10, 2017 at 11:32 AM, Mayuresh Gharat <
gharatmayures...@gmail.com> wrote:

> Got it.
> We can probably extend the InvalidRecordException with a more specific
> exception this use case and make it first class for produce side OR we can
> add an error code for InvalidRecordException in the Errors class and make
> it first class. I am fine either ways.
> What do you prefer?
>
> Thanks,
>
> Mayuresh
>
> On Mon, Apr 10, 2017 at 10:16 AM, Ismael Juma <ism...@juma.me.uk> wrote:
>
>> Hi Mayuresh,
>>
>> I was suggesting that we introduce a new error code for non retriable
>> invalid record exceptions (not sure what's a good name). We would then
>> change LogValidator and Log to use this new exception wherever it makes
>> sense (errors that are not retriable). One of many such cases is
>> https://github.com/apache/kafka/blob/5cf64f06a877a181d12a2ae2390516
>> ba1a572135/core/src/main/scala/kafka/log/LogValidator.scala#L78
>> <https://github.com/apache/kafka/blob/5cf64f06a877a181d12a2ae2390516ba1a572135/core/src/main/scala/kafka/log/LogValidator.scala#L78>
>>
>> Does that make sense?
>>
>> Ismael
>>
>> On Thu, Apr 6, 2017 at 5:50 PM, Mayuresh Gharat <
>> gharatmayures...@gmail.com>
>> wrote:
>>
>> > Hi Ismael,
>> >
>> > Are you suggesting to use the InvalidRecordException when the key is
>> null?
>> >
>> > Thanks,
>> >
>> > Mayuresh
>> >
>> > On Thu, Apr 6, 2017 at 8:49 AM, Ismael Juma <ism...@juma.me.uk> wrote:
>> >
>> > > Hi Mayuresh,
>> > >
>> > > I took a closer look at the code and we seem to throw
>> > > `InvalidRecordException` in a number of cases where retrying doesn't
>> seem
>> > > to make sense. For example:
>> > >
>> > > throw new InvalidRecordException(s"Log record magic does not match
>> outer
>> > > magic ${batch.magic}")
>> > > throw new InvalidRecordException("Found invalid number of record
>> headers
>> > "
>> > > + numHeaders);
>> > > throw new InvalidRecordException("Found invalid record count " +
>> > numRecords
>> > > + " in magic v" + magic() + " batch");
>> > >
>> > > It seems like most of the usage of InvalidRecordException is for non
>> > > retriable errors. Maybe we need to introduce a non retriable version
>> of
>> > > this exception and use it in the various places where it makes sense.
>> > >
>> > > Ismael
>> > >
>> > > On Tue, Apr 4, 2017 at 12:22 AM, Mayuresh Gharat <
>> > > gharatmayures...@gmail.com
>> > > > wrote:
>> > >
>> > > > Hi All,
>> > > >
>> > > > It seems that there is no further concern with the KIP-135. At this
>> > point
>> > > > we would like to start the voting process. The KIP can be found at
>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > > > 135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+
>> > > > non-retriable+error+back+to+user
>> > > > <https://cwiki.apache.org/confluence/pages/viewpage.
>> > > action?pageId=67638388
>> > > > >
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Mayuresh
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > -Regards,
>> > Mayuresh R. Gharat
>> > (862) 250-7125
>> >
>>
>
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [VOTE] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-04-10 Thread Mayuresh Gharat
Got it.
We can probably extend InvalidRecordException with a more specific
exception for this use case and make it first class on the produce side, OR we can
add an error code for InvalidRecordException in the Errors class and make
it first class. I am fine either way.
What do you prefer?
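
For illustration only, a rough sketch of what the first option (a dedicated
non-retriable exception, backed by a new entry in Errors) could look like; the
class name and wording below are placeholders, not the final proposal:

    package org.apache.kafka.common.errors;

    /**
     * Placeholder sketch of a non-retriable error returned when a record is rejected
     * because it is invalid for the target topic (e.g. a null key sent to a
     * log-compacted topic). Extending ApiException rather than RetriableException
     * means the producer fails the send immediately instead of retrying until the
     * request times out. A matching error code would also be added to
     * org.apache.kafka.common.protocol.Errors.
     */
    public class InvalidRecordDataException extends ApiException {

        private static final long serialVersionUID = 1L;

        public InvalidRecordDataException(String message) {
            super(message);
        }
    }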

Thanks,

Mayuresh

On Mon, Apr 10, 2017 at 10:16 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> Hi Mayuresh,
>
> I was suggesting that we introduce a new error code for non retriable
> invalid record exceptions (not sure what's a good name). We would then
> change LogValidator and Log to use this new exception wherever it makes
> sense (errors that are not retriable). One of many such cases is
> https://github.com/apache/kafka/blob/5cf64f06a877a181d12a2ae2390516
> ba1a572135/core/src/main/scala/kafka/log/LogValidator.scala#L78
>
> Does that make sense?
>
> Ismael
>
> On Thu, Apr 6, 2017 at 5:50 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com>
> wrote:
>
> > Hi Ismael,
> >
> > Are you suggesting to use the InvalidRecordException when the key is
> null?
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Thu, Apr 6, 2017 at 8:49 AM, Ismael Juma <ism...@juma.me.uk> wrote:
> >
> > > Hi Mayuresh,
> > >
> > > I took a closer look at the code and we seem to throw
> > > `InvalidRecordException` in a number of cases where retrying doesn't
> seem
> > > to make sense. For example:
> > >
> > > throw new InvalidRecordException(s"Log record magic does not match
> outer
> > > magic ${batch.magic}")
> > > throw new InvalidRecordException("Found invalid number of record
> headers
> > "
> > > + numHeaders);
> > > throw new InvalidRecordException("Found invalid record count " +
> > numRecords
> > > + " in magic v" + magic() + " batch");
> > >
> > > It seems like most of the usage of InvalidRecordException is for non
> > > retriable errors. Maybe we need to introduce a non retriable version of
> > > this exception and use it in the various places where it makes sense.
> > >
> > > Ismael
> > >
> > > On Tue, Apr 4, 2017 at 12:22 AM, Mayuresh Gharat <
> > > gharatmayures...@gmail.com
> > > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > It seems that there is no further concern with the KIP-135. At this
> > point
> > > > we would like to start the voting process. The KIP can be found at
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+
> > > > non-retriable+error+back+to+user
> > > > <https://cwiki.apache.org/confluence/pages/viewpage.
> > > action?pageId=67638388
> > > > >
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > >
> >
> >
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [VOTE] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-04-10 Thread Mayuresh Gharat
Ping Ismael.

Thanks,

Mayuresh

On Thu, Apr 6, 2017 at 9:50 AM, Mayuresh Gharat <gharatmayures...@gmail.com>
wrote:

> Hi Ismael,
>
> Are you suggesting to use the InvalidRecordException when the key is null?
>
> Thanks,
>
> Mayuresh
>
> On Thu, Apr 6, 2017 at 8:49 AM, Ismael Juma <ism...@juma.me.uk> wrote:
>
>> Hi Mayuresh,
>>
>> I took a closer look at the code and we seem to throw
>> `InvalidRecordException` in a number of cases where retrying doesn't seem
>> to make sense. For example:
>>
>> throw new InvalidRecordException(s"Log record magic does not match outer
>> magic ${batch.magic}")
>> throw new InvalidRecordException("Found invalid number of record headers "
>> + numHeaders);
>> throw new InvalidRecordException("Found invalid record count " +
>> numRecords
>> + " in magic v" + magic() + " batch");
>>
>> It seems like most of the usage of InvalidRecordException is for non
>> retriable errors. Maybe we need to introduce a non retriable version of
>> this exception and use it in the various places where it makes sense.
>>
>> Ismael
>>
>> On Tue, Apr 4, 2017 at 12:22 AM, Mayuresh Gharat <
>> gharatmayures...@gmail.com
>> > wrote:
>>
>> > Hi All,
>> >
>> > It seems that there is no further concern with the KIP-135. At this
>> point
>> > we would like to start the voting process. The KIP can be found at
>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > 135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+
>> > non-retriable+error+back+to+user
>> > <https://cwiki.apache.org/confluence/pages/viewpage.action?
>> pageId=67638388
>> > >
>> >
>> > Thanks,
>> >
>> > Mayuresh
>> >
>>
>
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [VOTE] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-04-06 Thread Mayuresh Gharat
Hi Ismael,

Are you suggesting that we use the InvalidRecordException when the key is null?

Thanks,

Mayuresh

On Thu, Apr 6, 2017 at 8:49 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> Hi Mayuresh,
>
> I took a closer look at the code and we seem to throw
> `InvalidRecordException` in a number of cases where retrying doesn't seem
> to make sense. For example:
>
> throw new InvalidRecordException(s"Log record magic does not match outer
> magic ${batch.magic}")
> throw new InvalidRecordException("Found invalid number of record headers "
> + numHeaders);
> throw new InvalidRecordException("Found invalid record count " + numRecords
> + " in magic v" + magic() + " batch");
>
> It seems like most of the usage of InvalidRecordException is for non
> retriable errors. Maybe we need to introduce a non retriable version of
> this exception and use it in the various places where it makes sense.
>
> Ismael
>
> On Tue, Apr 4, 2017 at 12:22 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi All,
> >
> > It seems that there is no further concern with the KIP-135. At this point
> > we would like to start the voting process. The KIP can be found at
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+
> > non-retriable+error+back+to+user
> > <https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=67638388
> > >
> >
> > Thanks,
> >
> > Mayuresh
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [VOTE] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-04-05 Thread Mayuresh Gharat
Bumping up this thread.

On Mon, Apr 3, 2017 at 4:22 PM, Mayuresh Gharat <gharatmayures...@gmail.com>
wrote:

> Hi All,
>
> It seems that there is no further concern with the KIP-135. At this point
> we would like to start the voting process. The KIP can be found at
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+
> non-retriable+error+back+to+user
> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=67638388>
>
> Thanks,
>
> Mayuresh
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


[VOTE] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-04-03 Thread Mayuresh Gharat
Hi All,

It seems that there is no further concern with KIP-135. At this point
we would like to start the voting process. The KIP can be found at
https://cwiki.apache.org/confluence/display/KAFKA/KIP-135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+non-retriable+error+back+to+user


Thanks,

Mayuresh


Re: [DISCUSS] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-03-30 Thread Mayuresh Gharat
Hi Ismael,

I have updated the KIP. Let me know if everything looks fine; then I will
begin voting.

Thanks,

Mayuresh

On Wed, Mar 29, 2017 at 9:06 AM, Mayuresh Gharat <gharatmayures...@gmail.com
> wrote:

> Hi Ismael,
>
> I agree. I will change the compatibility para and start voting.
>
> Thanks,
>
> Mayuresh
>
> On Tue, Mar 28, 2017 at 6:40 PM, Ismael Juma <ism...@juma.me.uk> wrote:
>
>> Hi,
>>
>> I think error messages and error codes serve different purposes. Error
>> messages provide additional information about the error, but users should
>> never have to match on a message to handle an error/exception. For this
>> case, it seems like this is a fatal error so we could get away with just
>> using an error message. Having said that, InvalidKeyError is not too
>> specific and I'm OK with that too.
>>
>> As I said earlier, I do think that we need to change the following
>>
>> "It is recommended that we upgrade the clients before the broker is
>> upgraded, so that the clients would be able to understand the new
>> exception."
>>
>> This is problematic since we want older clients to work with newer
>> brokers.
>> That's why I recommended that we only throw this error if the
>> ProduceRequest is version 3 or higher.
>>
>> Ismael
>>
>> P.S. Note that we already send error messages back for the CreateTopics
>> protocol API (I added that in the previous release).
>>
>> On Tue, Mar 28, 2017 at 7:22 AM, Mayuresh Gharat <
>> gharatmayures...@gmail.com
>> > wrote:
>>
>> > I think, it's OK to do this right now.
>> > The other KIP will have a wider base to cover as it will include other
>> > exceptions as well and will take time.
>> >
>> > Thanks,
>> >
>> > Mayuresh
>> >
>> > On Mon, Mar 27, 2017 at 11:20 PM Dong Lin <lindon...@gmail.com> wrote:
>> >
>> > > Sorry, I forget that you have mentioned this idea in your previous
>> > reply. I
>> > > guess the question is, do we still need this KIP if we can have custom
>> > > error message specified in the exception via the other KIP?
>> > >
>> > >
>> > > On Mon, Mar 27, 2017 at 11:00 PM, Mayuresh Gharat <
>> > > gharatmayures...@gmail.com> wrote:
>> > >
>> > > > Hi Dong,
>> > > >
>> > > > I do agree with that as I said before the thought did cross my mind
>> > and I
>> > > > am working on getting another KIP ready to have error responses
>> > returned
>> > > > back to the client.
>> > > >
>> > > > In my opinion, it's OK to add a new error code if it justifies the
>> > need.
>> > > As
>> > > > Ismael, mentioned on the jira, we need a specific non retriable
>> error
>> > > code
>> > > > in this case, with specific message, at least until the other KIP is
>> > > ready.
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Mayuresh
>> > > > On Mon, Mar 27, 2017 at 10:55 PM Dong Lin <lindon...@gmail.com>
>> wrote:
>> > > >
>> > > > > Hey Mayuresh,
>> > > > >
>> > > > > I get that you want to provide a more specific error message to
>> user.
>> > > > Then
>> > > > > would it be more useful to have a KIP that allows custom error
>> > message
>> > > to
>> > > > > be returned to client together with the exception in the response?
>> > For
>> > > > > example, broker can include in the response
>> > > PolicyViolationException("key
>> > > > > can not be null for non-compact topic ${topic}") and client can
>> print
>> > > > this
>> > > > > error message in the log. My concern with current KIP that it is
>> not
>> > > very
>> > > > > scalable to always have a KIP and class for every new error we may
>> > see
>> > > in
>> > > > > the future. The list of error classes, and the errors that need
>> to be
>> > > > > caught and handled by the client code, will increase overtime and
>> > > become
>> > > > > harder to maintain.
>> > > > >
>> > > > > Thanks,
>> > > > > Dong
>> > > > >
>> > > > >
>> > > > >

Re: [DISCUSS] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-03-29 Thread Mayuresh Gharat
Hi Ismael,

I agree. I will change the compatibility paragraph and start voting.

Thanks,

Mayuresh

On Tue, Mar 28, 2017 at 6:40 PM, Ismael Juma <ism...@juma.me.uk> wrote:

> Hi,
>
> I think error messages and error codes serve different purposes. Error
> messages provide additional information about the error, but users should
> never have to match on a message to handle an error/exception. For this
> case, it seems like this is a fatal error so we could get away with just
> using an error message. Having said that, InvalidKeyError is not too
> specific and I'm OK with that too.
>
> As I said earlier, I do think that we need to change the following
>
> "It is recommended that we upgrade the clients before the broker is
> upgraded, so that the clients would be able to understand the new
> exception."
>
> This is problematic since we want older clients to work with newer brokers.
> That's why I recommended that we only throw this error if the
> ProduceRequest is version 3 or higher.
>
> Ismael
>
> P.S. Note that we already send error messages back for the CreateTopics
> protocol API (I added that in the previous release).
>
> On Tue, Mar 28, 2017 at 7:22 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > I think, it's OK to do this right now.
> > The other KIP will have a wider base to cover as it will include other
> > exceptions as well and will take time.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Mon, Mar 27, 2017 at 11:20 PM Dong Lin <lindon...@gmail.com> wrote:
> >
> > > Sorry, I forget that you have mentioned this idea in your previous
> > reply. I
> > > guess the question is, do we still need this KIP if we can have custom
> > > error message specified in the exception via the other KIP?
> > >
> > >
> > > On Mon, Mar 27, 2017 at 11:00 PM, Mayuresh Gharat <
> > > gharatmayures...@gmail.com> wrote:
> > >
> > > > Hi Dong,
> > > >
> > > > I do agree with that as I said before the thought did cross my mind
> > and I
> > > > am working on getting another KIP ready to have error responses
> > returned
> > > > back to the client.
> > > >
> > > > In my opinion, it's OK to add a new error code if it justifies the
> > need.
> > > As
> > > > Ismael, mentioned on the jira, we need a specific non retriable error
> > > code
> > > > in this case, with specific message, at least until the other KIP is
> > > ready.
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > > On Mon, Mar 27, 2017 at 10:55 PM Dong Lin <lindon...@gmail.com>
> wrote:
> > > >
> > > > > Hey Mayuresh,
> > > > >
> > > > > I get that you want to provide a more specific error message to
> user.
> > > > Then
> > > > > would it be more useful to have a KIP that allows custom error
> > message
> > > to
> > > > > be returned to client together with the exception in the response?
> > For
> > > > > example, broker can include in the response
> > > PolicyViolationException("key
> > > > > can not be null for non-compact topic ${topic}") and client can
> print
> > > > this
> > > > > error message in the log. My concern with current KIP that it is
> not
> > > very
> > > > > scalable to always have a KIP and class for every new error we may
> > see
> > > in
> > > > > the future. The list of error classes, and the errors that need to
> be
> > > > > caught and handled by the client code, will increase overtime and
> > > become
> > > > > harder to maintain.
> > > > >
> > > > > Thanks,
> > > > > Dong
> > > > >
> > > > >
> > > > > On Mon, Mar 27, 2017 at 7:20 PM, Mayuresh Gharat <
> > > > > gharatmayures...@gmail.com
> > > > > > wrote:
> > > > >
> > > > > > Hi Dong,
> > > > > >
> > > > > > I had thought about this before and wanted to do similar thing.
> But
> > > as
> > > > > was
> > > > > > pointed out in the jira ticket, we wanted something more specific
> > > than
> > > > > > general.
> > > > > > The main issue is that we do not propagate server side error
> > messages

Re: [DISCUSS] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-03-28 Thread Mayuresh Gharat
I think it's OK to do this right now.
The other KIP will have a wider base to cover, as it will include other
exceptions as well, and will take time.

Thanks,

Mayuresh

On Mon, Mar 27, 2017 at 11:20 PM Dong Lin <lindon...@gmail.com> wrote:

> Sorry, I forget that you have mentioned this idea in your previous reply. I
> guess the question is, do we still need this KIP if we can have custom
> error message specified in the exception via the other KIP?
>
>
> On Mon, Mar 27, 2017 at 11:00 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
> > Hi Dong,
> >
> > I do agree with that as I said before the thought did cross my mind and I
> > am working on getting another KIP ready to have error responses returned
> > back to the client.
> >
> > In my opinion, it's OK to add a new error code if it justifies the need.
> As
> > Ismael, mentioned on the jira, we need a specific non retriable error
> code
> > in this case, with specific message, at least until the other KIP is
> ready.
> >
> > Thanks,
> >
> > Mayuresh
> > On Mon, Mar 27, 2017 at 10:55 PM Dong Lin <lindon...@gmail.com> wrote:
> >
> > > Hey Mayuresh,
> > >
> > > I get that you want to provide a more specific error message to user.
> > Then
> > > would it be more useful to have a KIP that allows custom error message
> to
> > > be returned to client together with the exception in the response? For
> > > example, broker can include in the response
> PolicyViolationException("key
> > > can not be null for non-compact topic ${topic}") and client can print
> > this
> > > error message in the log. My concern with current KIP that it is not
> very
> > > scalable to always have a KIP and class for every new error we may see
> in
> > > the future. The list of error classes, and the errors that need to be
> > > caught and handled by the client code, will increase overtime and
> become
> > > harder to maintain.
> > >
> > > Thanks,
> > > Dong
> > >
> > >
> > > On Mon, Mar 27, 2017 at 7:20 PM, Mayuresh Gharat <
> > > gharatmayures...@gmail.com
> > > > wrote:
> > >
> > > > Hi Dong,
> > > >
> > > > I had thought about this before and wanted to do similar thing. But
> as
> > > was
> > > > pointed out in the jira ticket, we wanted something more specific
> than
> > > > general.
> > > > The main issue is that we do not propagate server side error messages
> > to
> > > > clients, right now. I am working on a KIP proposal to propose it.
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > > > On Mon, Mar 27, 2017 at 5:55 PM, Dong Lin <lindon...@gmail.com>
> wrote:
> > > >
> > > > > Hey Mayuresh,
> > > > >
> > > > > Thanks for the patch. I am wondering if it would be better to add a
> > > more
> > > > > general error, e.g. InvalidMessageException. The benefit is that we
> > can
> > > > > reuse this for other message level error instead of adding one
> > > exception
> > > > > class for each possible exception in the future. This is similar to
> > the
> > > > use
> > > > > of InvalidRequestException. For example, ListOffsetResponse may
> > return
> > > > > InvalidRequestException if duplicate partitions are found in the
> > > > > ListOffsetRequest. We don't return DuplicatedPartitionException in
> > this
> > > > > case.
> > > > >
> > > > > Thanks,
> > > > > Dong
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Mar 22, 2017 at 3:07 PM, Mayuresh Gharat <
> > > > > gharatmayures...@gmail.com
> > > > > > wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > We have created KIP-135 to propose that Kafka should return a
> > > > > non-retriable
> > > > > > error when the producer produces a message with null key to a log
> > > > > compacted
> > > > > > topic.
> > > > > >
> > > > > > Please find the KIP wiki in the link :
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > 135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+
> > > > > > non-retriable+error+back+to+user.
> > > > > >
> > > > > >
> > > > > > We would love to hear your comments and suggestions.
> > > > > >
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Mayuresh
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > -Regards,
> > > > Mayuresh R. Gharat
> > > > (862) 250-7125
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-03-28 Thread Mayuresh Gharat
Hi Dong,

I do agree with that. As I said before, the thought did cross my mind, and I
am working on getting another KIP ready to have error responses returned
back to the client.

In my opinion, it's OK to add a new error code if it justifies the need. As
Ismael mentioned on the JIRA, we need a specific non-retriable error code
in this case, with a specific message, at least until the other KIP is ready.

Thanks,

Mayuresh
On Mon, Mar 27, 2017 at 10:55 PM Dong Lin <lindon...@gmail.com> wrote:

> Hey Mayuresh,
>
> I get that you want to provide a more specific error message to user. Then
> would it be more useful to have a KIP that allows custom error message to
> be returned to client together with the exception in the response? For
> example, broker can include in the response PolicyViolationException("key
> can not be null for non-compact topic ${topic}") and client can print this
> error message in the log. My concern with current KIP that it is not very
> scalable to always have a KIP and class for every new error we may see in
> the future. The list of error classes, and the errors that need to be
> caught and handled by the client code, will increase overtime and become
> harder to maintain.
>
> Thanks,
> Dong
>
>
> On Mon, Mar 27, 2017 at 7:20 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi Dong,
> >
> > I had thought about this before and wanted to do similar thing. But as
> was
> > pointed out in the jira ticket, we wanted something more specific than
> > general.
> > The main issue is that we do not propagate server side error messages to
> > clients, right now. I am working on a KIP proposal to propose it.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Mon, Mar 27, 2017 at 5:55 PM, Dong Lin <lindon...@gmail.com> wrote:
> >
> > > Hey Mayuresh,
> > >
> > > Thanks for the patch. I am wondering if it would be better to add a
> more
> > > general error, e.g. InvalidMessageException. The benefit is that we can
> > > reuse this for other message level error instead of adding one
> exception
> > > class for each possible exception in the future. This is similar to the
> > use
> > > of InvalidRequestException. For example, ListOffsetResponse may return
> > > InvalidRequestException if duplicate partitions are found in the
> > > ListOffsetRequest. We don't return DuplicatedPartitionException in this
> > > case.
> > >
> > > Thanks,
> > > Dong
> > >
> > >
> > >
> > > On Wed, Mar 22, 2017 at 3:07 PM, Mayuresh Gharat <
> > > gharatmayures...@gmail.com
> > > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > We have created KIP-135 to propose that Kafka should return a
> > > non-retriable
> > > > error when the producer produces a message with null key to a log
> > > compacted
> > > > topic.
> > > >
> > > > Please find the KIP wiki in the link :
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+
> > > > non-retriable+error+back+to+user.
> > > >
> > > >
> > > > We would love to hear your comments and suggestions.
> > > >
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > >
> >
> >
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125
> >
>


Re: [DISCUSS] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-03-27 Thread Mayuresh Gharat
Hi Dong,

I had thought about this before and wanted to do a similar thing. But as was
pointed out in the JIRA ticket, we wanted something more specific rather than
general.
The main issue is that we do not propagate server-side error messages to
clients right now. I am working on a KIP to propose that.

Thanks,

Mayuresh

On Mon, Mar 27, 2017 at 5:55 PM, Dong Lin <lindon...@gmail.com> wrote:

> Hey Mayuresh,
>
> Thanks for the patch. I am wondering if it would be better to add a more
> general error, e.g. InvalidMessageException. The benefit is that we can
> reuse this for other message level error instead of adding one exception
> class for each possible exception in the future. This is similar to the use
> of InvalidRequestException. For example, ListOffsetResponse may return
> InvalidRequestException if duplicate partitions are found in the
> ListOffsetRequest. We don't return DuplicatedPartitionException in this
> case.
>
> Thanks,
> Dong
>
>
>
> On Wed, Mar 22, 2017 at 3:07 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi All,
> >
> > We have created KIP-135 to propose that Kafka should return a
> non-retriable
> > error when the producer produces a message with null key to a log
> compacted
> > topic.
> >
> > Please find the KIP wiki in the link :
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+
> > non-retriable+error+back+to+user.
> >
> >
> > We would love to hear your comments and suggestions.
> >
> >
> > Thanks,
> >
> > Mayuresh
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-03-27 Thread Mayuresh Gharat
Hi Ismael,

Sure, we can do that. I just wanted to check on the timeline for when this can
go in.
I can wait until the new ProduceRequest gets into trunk.
On the other hand, we can also support it in the existing code.
I am fine either way.

Should I start a vote on this, so that we can get this approved?

Thanks,

Mayuresh

On Wed, Mar 22, 2017 at 11:49 PM, Ismael Juma <ism...@juma.me.uk> wrote:

> Thanks for the KIP Mayuresh. I suggest we only throw this error for
> ProduceRequest version 3, which is being introduced with KIP-98
> (Exactly-once). That way, the compatibility story is clearer, in my
> opinion.
>
> Ismael
>
> On Wed, Mar 22, 2017 at 10:07 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
> > Hi All,
> >
> > We have created KIP-135 to propose that Kafka should return a
> non-retriable
> > error when the producer produces a message with null key to a log
> compacted
> > topic.
> >
> > Please find the KIP wiki in the link :
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+
> > non-retriable+error+back+to+user.
> >
> >
> > We would love to hear your comments and suggestions.
> >
> >
> > Thanks,
> >
> > Mayuresh
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-03-23 Thread Mayuresh Gharat
Hi James,

I meant that "it is recommended to upgrade clients before upgrading the
brokers".
Will update the KIP to reflect that.

Thanks,

Mayuresh

On Wed, Mar 22, 2017 at 4:42 PM, James Cheng <wushuja...@gmail.com> wrote:

> Mayuresh,
>
> The Compatibility/Migration section says to upgrade the clients first,
> before the brokers. Are you talking about implementation or deployment? Do
> you mean to implement the client changes before the broker changes? That
> would imply that it would take 2 Kafka releases to implement this KIP.
>
> Or, are you saying that when deploying this change, that you would
> recommend upgrading clients before upgrading brokers?
>
> Thanks,
> -James
>
>
> > On Mar 22, 2017, at 3:07 PM, Mayuresh Gharat <gharatmayures...@gmail.com>
> wrote:
> >
> > Hi All,
> >
> > We have created KIP-135 to propose that Kafka should return a
> non-retriable
> > error when the producer produces a message with null key to a log
> compacted
> > topic.
> >
> > Please find the KIP wiki in the link :
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+
> non-retriable+error+back+to+user.
> >
> >
> > We would love to hear your comments and suggestions.
> >
> >
> > Thanks,
> >
> > Mayuresh
>
>


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-03-22 Thread Mayuresh Gharat
Hi Jun,

Please find the replies inline.

One reason to have KafkaPrincipal in ACL is that we can extend it to
support group in the future. Have you thought about how to support that in
your new proposal?
---> This is a feature of PrincipalBuilder and Authorizer, which are
pluggable.
The type of Principal should be opaque to core Kafka. If we want to add
support for groups, we can add that to the KafkaPrincipal class and modify the
SimpleAclAuthorizer to add/modify/check the ACLs accordingly.


Another reason that we had KafkaPrincipal is simplicity. It can be
constructed from a simple string and makes matching easier. If we expose
java.security.Principal, then I guess that when an ACL is set, we have to be
able to construct a java.security.Principal from some string to match the
java.security.Principal generated from the SSL or SASL library. How do we
make sure that the same type of java.security.Principal can be created and
will match?
> Again this will be determined by the plugged in Authorizer and
PrincipalBuilder. Your PrincipalBuilder can make sure that it creates a
Principal whose name matches the string you specified while creating the
ACL. The Authorizer should make sure that it extracts the String from the
Principal and does the matching.
In our earlier discussions, we discussed having a PrincipalBuilder
class specifier as a command-line argument for kafka-acls.sh to handle
this case, but we decided that it would be overkill at this stage.
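
To make the name-matching contract concrete, here is a purely illustrative
sketch (ServicePrincipal and matchesAcl are made-up names, not part of the
proposal); the only thing a custom Authorizer really depends on is getName():

    import java.security.Principal;

    /**
     * Hypothetical principal produced by a custom PrincipalBuilder, e.g. derived
     * from a client certificate.
     */
    public class ServicePrincipal implements Principal {

        private final String serviceName;

        public ServicePrincipal(String serviceName) {
            this.serviceName = serviceName;
        }

        @Override
        public String getName() {
            // Must be exactly the string that was used when the ACL was created,
            // since a custom Authorizer will match on it.
            return serviceName;
        }

        /** What a custom Authorizer effectively does when checking an ACL entry. */
        public static boolean matchesAcl(Principal channelPrincipal, String aclPrincipalName) {
            return aclPrincipalName.equals(channelPrincipal.getName());
        }
    }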

Thanks,

Mayuresh

On Mon, Mar 20, 2017 at 7:42 AM, Jun Rao <j...@confluent.io> wrote:

> Hi, Mayuresh,
>
> One reason to have KafkaPrincipal in ACL is that we can extend it to
> support group in the future. Have you thought about how to support that in
> your new proposal?
>
> Another reason that we had KafkaPrincipal is simplicity. It can be
> constructed from a simple string and makes matching easier. If we
> expose java.security.Principal,
> then I guess that when an ACL is set, we have to be able to construct
> a java.security.Principal
> from some string to match the java.security.Principal generated from the
> SSL or SASL library. How do we make sure that the same type of
> java.security.Principal
> can be created and will match?
>
> Thanks,
>
> Jun
>
>
> On Wed, Mar 15, 2017 at 8:48 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi Jun,
> >
> > Sorry for the delayed reply.
> > I agree that the easiest thing will be to add an additional field in the
> > Session class and we should be OK.
> > But having a KafkaPrincipal and java Principal with in the same class
> looks
> > little weird.
> >
> > So we can do this and slowly deprecate the usage of KafkaPrincipal in
> > public api's.
> >
> > We add new apis and make changes to the existing apis as follows :
> >
> >
> >- Changes to Session class :
> >
> > @Deprecated
> > case class Session(principal: KafkaPrincipal, clientAddress:
> InetAddress) {
> > val sanitizedUser = QuotaId.sanitize(principal.getName)
> > }
> >
> >
> > *@Deprecated .. (NEW)*
> >
> >
> > *case class Session(principal: KafkaPrincipal, clientAddress:
> InetAddress,
> > channelPrincipal: Java.security.Principal) {val sanitizedUser =
> > QuotaId.sanitize(principal.getName)}*
> >
> > *(NEW)*
> >
> >
> > *case class Session(principal: Java.security.Principal, clientAddress:
> > InetAddress) {val sanitizedUser = QuotaId.sanitize(principal.get
> > Name)}*
> >
> >
> >- Changes to Authorizer Interface :
> >
> > @Deprecated
> > def getAcls(principal: KafkaPrincipal): Map[Resource, Set[Acl]]
> >
> > *(NEW)*
> > *def getAcls(principal: Java.security.Principal): Map[Resource,
> Set[Acl]]*
> >
> >
> >- Changes to Acl class :
> >
> > @Deprecated
> > case class Acl(principal: KafkaPrincipal, permissionType: PermissionType,
> > host: String, operation: Operation)
> >
> >*(NEW)*
> >
> >
> > *case class Acl(principal: Java.security.Principal, permissionType:
> > PermissionType, host: String, operation: Operation) *
> > The one in Bold are the new api's. We will remove them eventually,
> probably
> > in next major release.
> > We don't want to get rid of KafkaPrincipal class and it will be used in
> the
> > same way as it does right now for out of box authorizer and commandline
> > tool. We would only be removing its direct usage from public apis.
> > Doing the above deprecation will help us to support other implementation
> of
> > Java.security.Principal as well which seems necessary especially since

[DISCUSS] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-03-22 Thread Mayuresh Gharat
Hi All,

We have created KIP-135 to propose that Kafka should return a non-retriable
error when the producer produces a message with a null key to a log-compacted
topic.

Please find the KIP wiki in the link :

https://cwiki.apache.org/confluence/display/KAFKA/KIP-135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+non-retriable+error+back+to+user.


We would love to hear your comments and suggestions.


Thanks,

Mayuresh


[jira] [Comment Edited] (KAFKA-4808) send of null key to a compacted topic should throw error back to user

2017-03-21 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935327#comment-15935327
 ] 

Mayuresh Gharat edited comment on KAFKA-4808 at 3/21/17 9:00 PM:
-

[~ijuma] Please find the KIP  here : 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+non-retriable+error+back+to+user.



was (Author: mgharat):
[~ijuma] Please find the KIP  here : 
https://cwiki.apache.org/confluence/display/KAFKA/Send+of+null+key+to+a+compacted+topic+should+throw+non-retriable+error+back+to+user.


> send of null key to a compacted topic should throw error back to user
> -
>
> Key: KAFKA-4808
> URL: https://issues.apache.org/jira/browse/KAFKA-4808
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.10.2.0
>Reporter: Ismael Juma
>Assignee: Mayuresh Gharat
> Fix For: 0.11.0.0
>
>
> If a message with a null key is produced to a compacted topic, the broker 
> returns `CorruptRecordException`, which is a retriable exception. As such, 
> the producer keeps retrying until retries are exhausted or request.timeout.ms 
> expires and eventually throws a TimeoutException. This is confusing and not 
> user-friendly.
> We should throw a meaningful error back to the user. From an implementation 
> perspective, we would have to use a non retriable error code to avoid this 
> issue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4808) send of null key to a compacted topic should throw error back to user

2017-03-21 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935327#comment-15935327
 ] 

Mayuresh Gharat commented on KAFKA-4808:


[~ijuma] Please find the KIP  here : 
https://cwiki.apache.org/confluence/display/KAFKA/Send+of+null+key+to+a+compacted+topic+should+throw+non-retriable+error+back+to+user.


> send of null key to a compacted topic should throw error back to user
> -
>
> Key: KAFKA-4808
> URL: https://issues.apache.org/jira/browse/KAFKA-4808
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.10.2.0
>Reporter: Ismael Juma
>Assignee: Mayuresh Gharat
> Fix For: 0.11.0.0
>
>
> If a message with a null key is produced to a compacted topic, the broker 
> returns `CorruptRecordException`, which is a retriable exception. As such, 
> the producer keeps retrying until retries are exhausted or request.timeout.ms 
> expires and eventually throws a TimeoutException. This is confusing and not 
> user-friendly.
> We should throw a meaningful error back to the user. From an implementation 
> perspective, we would have to use a non retriable error code to avoid this 
> issue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-03-15 Thread Mayuresh Gharat
Hi Jun,

Sorry for the delayed reply.
I agree that the easiest thing would be to add an additional field in the
Session class and we should be OK.
But having a KafkaPrincipal and a Java Principal within the same class looks
a little weird.

So we can do this and slowly deprecate the usage of KafkaPrincipal in the
public APIs.

We add new APIs and make changes to the existing APIs as follows:


   - Changes to Session class:

     @Deprecated
     case class Session(principal: KafkaPrincipal, clientAddress: InetAddress) {
       val sanitizedUser = QuotaId.sanitize(principal.getName)
     }

     @Deprecated (NEW)
     case class Session(principal: KafkaPrincipal, clientAddress: InetAddress,
                        channelPrincipal: java.security.Principal) {
       val sanitizedUser = QuotaId.sanitize(principal.getName)
     }

     (NEW)
     case class Session(principal: java.security.Principal, clientAddress: InetAddress) {
       val sanitizedUser = QuotaId.sanitize(principal.getName)
     }

   - Changes to Authorizer Interface:

     @Deprecated
     def getAcls(principal: KafkaPrincipal): Map[Resource, Set[Acl]]

     (NEW)
     def getAcls(principal: java.security.Principal): Map[Resource, Set[Acl]]

   - Changes to Acl class:

     @Deprecated
     case class Acl(principal: KafkaPrincipal, permissionType: PermissionType,
                    host: String, operation: Operation)

     (NEW)
     case class Acl(principal: java.security.Principal, permissionType: PermissionType,
                    host: String, operation: Operation)

The ones marked (NEW) are the new APIs. We will remove them eventually, probably
in the next major release.
We don't want to get rid of the KafkaPrincipal class, and it will be used in the
same way as it is right now for the out-of-the-box authorizer and command-line
tool. We would only be removing its direct usage from the public APIs.
Doing the above deprecation will help us support other implementations of
java.security.Principal as well, which seems necessary especially since
Kafka provides a pluggable Authorizer and PrincipalBuilder.
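
As a purely illustrative example of why exposing java.security.Principal
matters (all names below are made up): a custom PrincipalBuilder could attach
extra attributes that a custom Authorizer can recover from the channel
principal, which is impossible once everything has been flattened into a
KafkaPrincipal string:

    import java.security.Principal;
    import java.util.Collections;
    import java.util.Set;

    /** Hypothetical rich principal built by a custom PrincipalBuilder. */
    public class CertificatePrincipal implements Principal {

        private final String commonName;
        private final Set<String> groups;  // information a KafkaPrincipal cannot carry

        public CertificatePrincipal(String commonName, Set<String> groups) {
            this.commonName = commonName;
            this.groups = Collections.unmodifiableSet(groups);
        }

        @Override
        public String getName() {
            return commonName;
        }

        public Set<String> groups() {
            return groups;
        }
    }

    // A custom Authorizer could then recover the extra information, e.g. (sketch):
    //   Principal p = session.channelPrincipal;  // the new field proposed above
    //   if (p instanceof CertificatePrincipal) {
    //       allowed = ((CertificatePrincipal) p).groups().contains(requiredGroup);
    //   }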

Let me know your thoughts on this.

Thanks,

Mayuresh

On Tue, Feb 28, 2017 at 2:33 PM, Mayuresh Gharat <gharatmayures...@gmail.com
> wrote:

> Hi Jun,
>
> Sure.
> I had an offline discussion with Joel on how we can deprecate the
> KafkaPrincipal from  Session and Authorizer.
> I will update the KIP to see if we can address all the concerns here. If
> not we can keep the KafkaPrincipal.
>
> Thanks,
>
> Mayuresh
>
> On Tue, Feb 28, 2017 at 1:53 PM, Jun Rao <j...@confluent.io> wrote:
>
>> Hi, Joel,
>>
>> Good point on the getAcls() method. KafkaPrincipal is also tied to ACL,
>> which is used in pretty much every method in Authorizer. Now, I am not
>> sure
>> if it's easy to deprecate KafkaPrincipal.
>>
>> Hi, Mayuresh,
>>
>> Given the above, it seems that the easiest thing is to add a new Principal
>> field in Session. We want to make it clear that it's ignored in the
>> default
>> implementation, but a customizer authorizer could take advantage of that.
>>
>> Thanks,
>>
>> Jun
>>
>> On Tue, Feb 28, 2017 at 10:52 AM, Joel Koshy <jjkosh...@gmail.com> wrote:
>>
>> > If we deprecate KafkaPrincipal, then the Authorizer interface will also
>> > need to change - i.e., deprecate the getAcls(KafkaPrincipal) method.
>> >
>> > On Tue, Feb 28, 2017 at 10:11 AM, Mayuresh Gharat <
>> > gharatmayures...@gmail.com> wrote:
>> >
>> > > Hi Jun/Ismael,
>> > >
>> > > Thanks for the comments.
>> > >
>> > > I agree.
>> > > What I was thinking was, we get the KIP passed now and wait till major
>> > > kafka version release. We can then make this change, but for now we
>> can
>> > > wait. Does that work?
>> > >
>> > > If there are concerns, we can make the addition of extra field of type
>> > > Principal to Session and then deprecate the KafkaPrincipal later.
>> > >
>> > > I am fine either ways. What do you think?
>> > >
>> > > Thanks,
>> > >
>> > > Mayuresh
>> > >
>> > > On Tue, Feb 28, 2017 at 9:53 AM, Jun Rao <j...@confluent.io> wrote:
>> > >
>> > > > Hi, Ismael,
>> > > >
>> > > > Good point on compatibility.
>> > > >
>> > > > Hi, Mayuresh,
>> > > >
>> > > > Given that, it seems that it's better to just add the raw principal
>> as
>> > a
>> > > > new field in Session for now and deprecate the KafkaPrincipal field
>> in
>> > > the
>> > > > future if needed?
>> > > >
>> > > > Thanks,
>> > > >
>> > 

Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-28 Thread Mayuresh Gharat
Hi Jun,

Sure.
I had an offline discussion with Joel on how we can deprecate the
KafkaPrincipal from Session and Authorizer.
I will update the KIP to see if we can address all the concerns here. If
not, we can keep the KafkaPrincipal.

Thanks,

Mayuresh

On Tue, Feb 28, 2017 at 1:53 PM, Jun Rao <j...@confluent.io> wrote:

> Hi, Joel,
>
> Good point on the getAcls() method. KafkaPrincipal is also tied to ACL,
> which is used in pretty much every method in Authorizer. Now, I am not sure
> if it's easy to deprecate KafkaPrincipal.
>
> Hi, Mayuresh,
>
> Given the above, it seems that the easiest thing is to add a new Principal
> field in Session. We want to make it clear that it's ignored in the default
> implementation, but a customizer authorizer could take advantage of that.
>
> Thanks,
>
> Jun
>
> On Tue, Feb 28, 2017 at 10:52 AM, Joel Koshy <jjkosh...@gmail.com> wrote:
>
> > If we deprecate KafkaPrincipal, then the Authorizer interface will also
> > need to change - i.e., deprecate the getAcls(KafkaPrincipal) method.
> >
> > On Tue, Feb 28, 2017 at 10:11 AM, Mayuresh Gharat <
> > gharatmayures...@gmail.com> wrote:
> >
> > > Hi Jun/Ismael,
> > >
> > > Thanks for the comments.
> > >
> > > I agree.
> > > What I was thinking was, we get the KIP passed now and wait till major
> > > kafka version release. We can then make this change, but for now we can
> > > wait. Does that work?
> > >
> > > If there are concerns, we can make the addition of extra field of type
> > > Principal to Session and then deprecate the KafkaPrincipal later.
> > >
> > > I am fine either ways. What do you think?
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Tue, Feb 28, 2017 at 9:53 AM, Jun Rao <j...@confluent.io> wrote:
> > >
> > > > Hi, Ismael,
> > > >
> > > > Good point on compatibility.
> > > >
> > > > Hi, Mayuresh,
> > > >
> > > > Given that, it seems that it's better to just add the raw principal
> as
> > a
> > > > new field in Session for now and deprecate the KafkaPrincipal field
> in
> > > the
> > > > future if needed?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Mon, Feb 27, 2017 at 5:05 PM, Ismael Juma <ism...@juma.me.uk>
> > wrote:
> > > >
> > > > > Breaking clients without a deprecation period is something we only
> do
> > > as
> > > > a
> > > > > last resort. Is there strong justification for doing it here?
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Mon, Feb 27, 2017 at 11:28 PM, Mayuresh Gharat <
> > > > > gharatmayures...@gmail.com> wrote:
> > > > >
> > > > > > Hi Ismael,
> > > > > >
> > > > > > Yeah. I agree that it might break the clients if the user is
> using
> > > the
> > > > > > kafkaPrincipal directly. But since KafkaPrincipal is also a Java
> > > > > Principal
> > > > > > and I think, it would be a right thing to do replace the
> > > kafkaPrincipal
> > > > > > with Java Principal at this stage than later.
> > > > > >
> > > > > > We can mention in the KIP, that it would break the clients that
> are
> > > > using
> > > > > > the KafkaPrincipal directly and they will have to use the
> > > PrincipalType
> > > > > > directly, if they are using it as its only one value and use the
> > name
> > > > > from
> > > > > > the Principal directly or create a KafkaPrincipal from Java
> > Principal
> > > > as
> > > > > we
> > > > > > are doing in SimpleAclAuthorizer with this KIP.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Mayuresh
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Feb 27, 2017 at 10:56 AM, Ismael Juma <ism...@juma.me.uk
> >
> > > > wrote:
> > > > > >
> > > > > > > Hi Mayuresh,
> > > > > > >
> > > > > > > Sorry for the delay. The updated KIP states that there is no
> > > > > > compatibility
> > > > >

Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-28 Thread Mayuresh Gharat
Hi Jun/Ismael,

Thanks for the comments.

I agree.
What I was thinking was, we get the KIP passed now and wait until the next
major Kafka release. We can then make this change, but for now we can
wait. Does that work?

If there are concerns, we can add an extra field of type
Principal to Session and then deprecate the KafkaPrincipal later.

I am fine either way. What do you think?

Thanks,

Mayuresh

On Tue, Feb 28, 2017 at 9:53 AM, Jun Rao <j...@confluent.io> wrote:

> Hi, Ismael,
>
> Good point on compatibility.
>
> Hi, Mayuresh,
>
> Given that, it seems that it's better to just add the raw principal as a
> new field in Session for now and deprecate the KafkaPrincipal field in the
> future if needed?
>
> Thanks,
>
> Jun
>
> On Mon, Feb 27, 2017 at 5:05 PM, Ismael Juma <ism...@juma.me.uk> wrote:
>
> > Breaking clients without a deprecation period is something we only do as
> a
> > last resort. Is there strong justification for doing it here?
> >
> > Ismael
> >
> > On Mon, Feb 27, 2017 at 11:28 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com> wrote:
> >
> > > Hi Ismael,
> > >
> > > Yeah. I agree that it might break the clients if the user is using the
> > > kafkaPrincipal directly. But since KafkaPrincipal is also a Java
> > Principal
> > > and I think, it would be a right thing to do replace the kafkaPrincipal
> > > with Java Principal at this stage than later.
> > >
> > > We can mention in the KIP, that it would break the clients that are
> using
> > > the KafkaPrincipal directly and they will have to use the PrincipalType
> > > directly, if they are using it as its only one value and use the name
> > from
> > > the Principal directly or create a KafkaPrincipal from Java Principal
> as
> > we
> > > are doing in SimpleAclAuthorizer with this KIP.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > >
> > >
> > > On Mon, Feb 27, 2017 at 10:56 AM, Ismael Juma <ism...@juma.me.uk>
> wrote:
> > >
> > > > Hi Mayuresh,
> > > >
> > > > Sorry for the delay. The updated KIP states that there is no
> > > compatibility
> > > > impact, but that doesn't seem right. The fact that we changed the
> type
> > of
> > > > Session.principal to `Principal` means that any code that expects it
> to
> > > be
> > > > `KafkaPrincipal` will break. Either because of declared types
> (likely)
> > or
> > > > if it accesses `getPrincipalType` (unlikely since the value is always
> > the
> > > > same). It's a bit annoying, but we should add a new field to
> `Session`
> > > with
> > > > the original principal. We can potentially deprecate the existing
> one,
> > if
> > > > we're sure we don't need it (or we can leave it for now).
> > > >
> > > > Ismael
> > > >
> > > > On Mon, Feb 27, 2017 at 6:40 PM, Mayuresh Gharat <
> > > > gharatmayures...@gmail.com
> > > > > wrote:
> > > >
> > > > > Hi Ismael, Joel, Becket
> > > > >
> > > > > Would you mind taking a look at this. We require 2 more binding
> votes
> > > for
> > > > > the KIP to pass.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Mayuresh
> > > > >
> > > > > On Thu, Feb 23, 2017 at 10:57 AM, Dong Lin <lindon...@gmail.com>
> > > wrote:
> > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > On Wed, Feb 22, 2017 at 10:52 PM, Manikumar <
> > > manikumar.re...@gmail.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > On Thu, Feb 23, 2017 at 3:27 AM, Mayuresh Gharat <
> > > > > > > gharatmayures...@gmail.com
> > > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Jun,
> > > > > > > >
> > > > > > > > Thanks a lot for the comments and reviews.
> > > > > > > > I agree we should log the username.
> > > > > > > > What I meant by creating KafkaPrincipal was, after this KIP
> we
> > > > would
> > > > >

Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-27 Thread Mayuresh Gharat
Hi Ismael,

Yeah, I agree that it might break clients if the user is using the
KafkaPrincipal directly. But since KafkaPrincipal is also a Java Principal,
I think it would be the right thing to replace the KafkaPrincipal
with the Java Principal at this stage rather than later.

We can mention in the KIP that it would break clients that are using
the KafkaPrincipal directly: they will have to use the PrincipalType
directly (it only has one value) and use the name from
the Principal directly, or create a KafkaPrincipal from the Java Principal, as we
are doing in SimpleAclAuthorizer with this KIP.
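
To illustrate how code that still wants a KafkaPrincipal can adapt (a rough
sketch only; the exact KafkaPrincipal class and constructor are assumptions to
be checked against the actual code, and "User" is currently the only principal
type in use):

    import java.security.Principal;

    import kafka.security.auth.KafkaPrincipal;  // assumption: the core KafkaPrincipal with (principalType, name)

    public final class PrincipalCompat {

        private PrincipalCompat() {
        }

        /**
         * Rebuild the KafkaPrincipal view from the raw channel principal, the way
         * SimpleAclAuthorizer is expected to do after this KIP.
         */
        public static KafkaPrincipal toKafkaPrincipal(Principal channelPrincipal) {
            return new KafkaPrincipal("User", channelPrincipal.getName());
        }
    }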

Thanks,

Mayuresh



On Mon, Feb 27, 2017 at 10:56 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> Hi Mayuresh,
>
> Sorry for the delay. The updated KIP states that there is no compatibility
> impact, but that doesn't seem right. The fact that we changed the type of
> Session.principal to `Principal` means that any code that expects it to be
> `KafkaPrincipal` will break. Either because of declared types (likely) or
> if it accesses `getPrincipalType` (unlikely since the value is always the
> same). It's a bit annoying, but we should add a new field to `Session` with
> the original principal. We can potentially deprecate the existing one, if
> we're sure we don't need it (or we can leave it for now).
>
> Ismael
>
> On Mon, Feb 27, 2017 at 6:40 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi Ismael, Joel, Becket
> >
> > Would you mind taking a look at this. We require 2 more binding votes for
> > the KIP to pass.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Thu, Feb 23, 2017 at 10:57 AM, Dong Lin <lindon...@gmail.com> wrote:
> >
> > > +1 (non-binding)
> > >
> > > On Wed, Feb 22, 2017 at 10:52 PM, Manikumar <manikumar.re...@gmail.com
> >
> > > wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > On Thu, Feb 23, 2017 at 3:27 AM, Mayuresh Gharat <
> > > > gharatmayures...@gmail.com
> > > > > wrote:
> > > >
> > > > > Hi Jun,
> > > > >
> > > > > Thanks a lot for the comments and reviews.
> > > > > I agree we should log the username.
> > > > > What I meant by creating KafkaPrincipal was, after this KIP we
> would
> > > not
> > > > be
> > > > > required to create KafkaPrincipal and if we want to maintain the
> old
> > > > > logging, we will have to create it as we do today.
> > > > > I will take care that we specify the Principal name in the log.
> > > > >
> > > > > Thanks again for all the reviews.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Mayuresh
> > > > >
> > > > > On Wed, Feb 22, 2017 at 1:45 PM, Jun Rao <j...@confluent.io> wrote:
> > > > >
> > > > > > Hi, Mayuresh,
> > > > > >
> > > > > > For logging the user name, we could do either way. We just need
> to
> > > make
> > > > > > sure the expected user name is logged. Also, currently, we are
> > > already
> > > > > > creating a KafkaPrincipal on every request. +1 on the latest KIP.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > >
> > > > > > On Tue, Feb 21, 2017 at 8:05 PM, Mayuresh Gharat <
> > > > > > gharatmayures...@gmail.com
> > > > > > > wrote:
> > > > > >
> > > > > > > Hi Jun,
> > > > > > >
> > > > > > > Thanks for the comments.
> > > > > > >
> > > > > > > I will mention in the KIP : how this change doesn't affect the
> > > > default
> > > > > > > authorizer implementation.
> > > > > > >
> > > > > > > Regarding, Currently, we log the principal name in the request
> > log
> > > in
> > > > > > > RequestChannel, which has the format of "principalType +
> > SEPARATOR
> > > +
> > > > > > > name;".
> > > > > > > It would be good if we can keep the same convention after this
> > KIP.
> > > > One
> > > > > > way
> > > > > > > to do that is to convert java.security.Principal to
> > KafkaPrincipal
> > &

[jira] [Commented] (KAFKA-4808) send of null key to a compacted topic should throw error back to user

2017-02-27 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886365#comment-15886365
 ] 

Mayuresh Gharat commented on KAFKA-4808:


[~ijuma] Sure, we can provide a better error message if we have a separate 
error code. It looks like a subclass of InvalidRequestException would fit, as it's indeed 
an invalid produce request for that topic.
I will work on the KIP.

Thanks,

Mayuresh

> send of null key to a compacted topic should throw error back to user
> -
>
> Key: KAFKA-4808
> URL: https://issues.apache.org/jira/browse/KAFKA-4808
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.10.2.0
>Reporter: Ismael Juma
>Assignee: Mayuresh Gharat
> Fix For: 0.10.3.0
>
>
> If a message with a null key is produced to a compacted topic, the broker 
> returns `CorruptRecordException`, which is a retriable exception. As such, 
> the producer keeps retrying until retries are exhausted or request.timeout.ms 
> expires and eventually throws a TimeoutException. This is confusing and not 
> user-friendly.
> We should throw a meaningful error back to the user. From an implementation 
> perspective, we would have to use a non retriable error code to avoid this 
> issue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

2017-02-27 Thread Mayuresh Gharat
Hi Becket,

Thanks for the explanation.
Regarding:
1) The batch would be split when a RecordTooLargeException is received.

Let's say we sent the batch over the wire and received a
RecordTooLargeException; how do we split it? Once we add a message to
the batch, we lose the message-level granularity. We would have to
decompress, do a deep iteration, split, and compress again, right? This
looks like a performance bottleneck for multi-topic producers like
MirrorMaker.
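
For reference, the estimator behaviour described in the quoted reply below
could be sketched roughly like this (the constants come from Becket's example;
the class and method names are illustrative only, not the actual producer
code):

    /**
     * Illustrative sketch of the estimated compression ratio (ECR) adjustment:
     * decay slowly while over-estimating, bump up quickly when the estimate proves
     * too low, and reset (plus split the batch) on RecordTooLargeException.
     */
    public class CompressionRatioEstimator {

        private static final float SLOW_DECAY = 0.001f;  // applied when actual ratio < estimate
        private static final float FAST_BUMP = 0.05f;    // applied when actual ratio > estimate

        private float estimatedRatio = 1.0f;             // start conservatively (assume no compression)

        /** Called with the observed compression ratio after a batch is compressed. */
        public synchronized void observe(float actualRatio) {
            if (actualRatio < estimatedRatio) {
                estimatedRatio = Math.max(actualRatio, estimatedRatio - SLOW_DECAY);
            } else {
                estimatedRatio = Math.min(1.0f, estimatedRatio + FAST_BUMP);
            }
        }

        /** Called when the broker rejects a batch with RecordTooLargeException; the caller splits the batch. */
        public synchronized void onRecordTooLarge() {
            estimatedRatio = 1.0f;
        }

        public synchronized float estimate() {
            return estimatedRatio;
        }
    }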


Thanks,

Mayuresh

On Mon, Feb 27, 2017 at 10:51 AM, Becket Qin <becket@gmail.com> wrote:

> Hey Mayuresh,
>
> 1) The batch would be split when an RecordTooLargeException is received.
> 2) Not lower the actual compression ratio, but lower the estimated
> compression ratio "according to" the Actual Compression Ratio(ACR).
>
> An example, let's start with Estimated Compression Ratio (ECR) = 1.0. Say
> the compression ratio of ACR is ~0.8, instead of letting the ECR dropped to
> 0.8 very quickly, we only drop 0.001 every time when ACR < ECR. However,
> once we see an ACR > ECR, we increment ECR by 0.05. If a
> RecordTooLargeException is received, we reset the ECR back to 1.0 and split
> the batch.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
>
> On Mon, Feb 27, 2017 at 10:30 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
> > Hi Becket,
> >
> > Seems like an interesting idea.
> > I had couple of questions :
> > 1) How do we decide when the batch should be split?
> > 2) What do you mean by slowly lowering the "actual" compression ratio?
> > An example would really help here.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Fri, Feb 24, 2017 at 3:17 PM, Becket Qin <becket@gmail.com>
> wrote:
> >
> > > Hi Jay,
> > >
> > > Yeah, I got your point.
> > >
> > > I think there might be a solution which do not require adding a new
> > > configuration. We can start from a very conservative compression ratio
> > say
> > > 1.0 and lower it very slowly according to the actual compression ratio
> > > until we hit a point that we have to split a batch. At that point, we
> > > exponentially back off on the compression ratio. The idea is somewhat
> > like
> > > TCP. This should help avoid frequent split.
> > >
> > > The upper bound of the batch size is also a little awkward today
> because
> > we
> > > say the batch size is based on compressed size, but users cannot set it
> > to
> > > the max message size because that will result in oversized messages.
> With
> > > this change we will be able to allow the users to set the message size
> to
> > > close to max message size.
> > >
> > > However the downside is that there could be latency spikes in the
> system
> > in
> > > this case due to the splitting, especially when there are many messages
> > > need to be split at the same time. That could potentially be an issue
> for
> > > some users.
> > >
> > > What do you think about this approach?
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > >
> > >
> > > On Thu, Feb 23, 2017 at 1:31 PM, Jay Kreps <j...@confluent.io> wrote:
> > >
> > > > Hey Becket,
> > > >
> > > > Yeah that makes sense.
> > > >
> > > > I agree that you'd really have to both fix the estimation (i.e. make
> it
> > > per
> > > > topic or make it better estimate the high percentiles) AND have the
> > > > recovery mechanism. If you are underestimating often and then paying
> a
> > > high
> > > > recovery price that won't fly.
> > > >
> > > > I think you take my main point though, which is just that I hate to
> > > exposes
> > > > these super low level options to users because it is so hard to
> explain
> > > to
> > > > people what it means and how they should set it. So if it is possible
> > to
> > > > make either some combination of better estimation and splitting or
> > better
> > > > tolerance of overage that would be preferrable.
> > > >
> > > > -Jay
> > > >
> > > > On Thu, Feb 23, 2017 at 11:51 AM, Becket Qin <becket@gmail.com>
> > > wrote:
> > > >
> > > > > @Dong,
> > > > >
> > > > > Thanks for the comments. The default behavior of the producer won't
> > > > change.
> >
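
To make the estimation scheme described in this thread concrete, here is a minimal
sketch in Java of the TCP-like adjustment Becket outlines: the estimated compression
ratio (ECR) starts at a conservative 1.0, drops by 0.001 whenever the actual
compression ratio (ACR) comes in lower, is bumped up by 0.05 when the ACR comes in
higher, and is reset to 1.0 whenever a batch has to be split. This is only an
illustration of the idea, not Kafka's producer code; the class and constant names
are invented.

/**
 * Illustrative sketch of the adaptive compression-ratio estimation scheme
 * discussed in this thread. Not Kafka's implementation.
 */
public class CompressionRatioEstimator {

    private static final double DECREASE_STEP = 0.001; // applied when ACR < ECR
    private static final double INCREASE_STEP = 0.05;  // applied when ACR > ECR

    private volatile double estimatedRatio = 1.0;      // start conservatively

    /** Called after a batch is compressed, with its observed (actual) ratio. */
    public void observe(double actualRatio) {
        if (actualRatio < estimatedRatio) {
            estimatedRatio = Math.max(actualRatio, estimatedRatio - DECREASE_STEP);
        } else if (actualRatio > estimatedRatio) {
            estimatedRatio = Math.min(1.0, estimatedRatio + INCREASE_STEP);
        }
    }

    /** Called when a batch turns out oversized and has to be split. */
    public void onBatchSplit() {
        estimatedRatio = 1.0; // fall back to the most conservative estimate
    }

    /** Estimated compressed size used when deciding whether a record still fits. */
    public int estimateCompressedSize(int uncompressedBytes) {
        return (int) Math.ceil(uncompressedBytes * estimatedRatio);
    }
}

Starting high and decaying slowly means the producer briefly under-fills batches
rather than overshooting the batch size, and the reset on a split is what keeps
splits rare.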

[jira] [Assigned] (KAFKA-4808) send of null key to a compacted topic should throw error back to user

2017-02-27 Thread Mayuresh Gharat (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayuresh Gharat reassigned KAFKA-4808:
--

Assignee: Mayuresh Gharat

> send of null key to a compacted topic should throw error back to user
> -
>
> Key: KAFKA-4808
> URL: https://issues.apache.org/jira/browse/KAFKA-4808
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.10.2.0
>Reporter: Ismael Juma
>Assignee: Mayuresh Gharat
> Fix For: 0.10.3.0
>
>
> If a message with a null key is produced to a compacted topic, the broker 
> returns `CorruptRecordException`, which is a retriable exception. As such, 
> the producer keeps retrying until retries are exhausted or request.timeout.ms 
> expires and eventually throws a TimeoutException. This is confusing and not 
> user-friendly.
> We should throw a meaningful error back to the user. From an implementation 
> perspective, we would have to use a non retriable error code to avoid this 
> issue.





[jira] [Commented] (KAFKA-4808) send of null key to a compacted topic should throw error back to user

2017-02-27 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886288#comment-15886288
 ] 

Mayuresh Gharat commented on KAFKA-4808:


[~ijuma] I was wondering whether returning an "INVALID_REQUEST" error would work 
here, or whether we need to add a new exception type (which would require a KIP)?

> send of null key to a compacted topic should throw error back to user
> -
>
> Key: KAFKA-4808
> URL: https://issues.apache.org/jira/browse/KAFKA-4808
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.10.2.0
>Reporter: Ismael Juma
> Fix For: 0.10.3.0
>
>
> If a message with a null key is produced to a compacted topic, the broker 
> returns `CorruptRecordException`, which is a retriable exception. As such, 
> the producer keeps retrying until retries are exhausted or request.timeout.ms 
> expires and eventually throws a TimeoutException. This is confusing and not 
> user-friendly.
> We should throw a meaningful error back to the user. From an implementation 
> perspective, we would have to use a non retriable error code to avoid this 
> issue.
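
For illustration, here is a small sketch of what the reported behavior looks like
from the client side; the topic name and bootstrap address are placeholders, and
the topic is assumed to have cleanup.policy=compact. Because the broker's
CorruptRecordException is retriable, the callback today typically only surfaces a
TimeoutException after retries are exhausted, rather than an error that names the
real problem (a null key on a compacted topic).

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class NullKeyCompactedTopicExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Null key sent to a topic that is assumed to be compacted.
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("compacted-topic", null, "some value");

            producer.send(record, (metadata, exception) -> {
                if (exception != null) {
                    // With the current behavior this is usually a TimeoutException,
                    // reported only after the retriable error has been retried
                    // until retries / request.timeout.ms run out.
                    System.err.println("Send failed: " + exception);
                }
            });
            producer.flush();
        }
    }
}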





Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-27 Thread Mayuresh Gharat
Hi Ismael, Joel, Becket

Would you mind taking a look at this? We need two more binding votes for the
KIP to pass.

Thanks,

Mayuresh

On Thu, Feb 23, 2017 at 10:57 AM, Dong Lin <lindon...@gmail.com> wrote:

> +1 (non-binding)
>
> On Wed, Feb 22, 2017 at 10:52 PM, Manikumar <manikumar.re...@gmail.com>
> wrote:
>
> > +1 (non-binding)
> >
> > On Thu, Feb 23, 2017 at 3:27 AM, Mayuresh Gharat <
> > gharatmayures...@gmail.com
> > > wrote:
> >
> > > Hi Jun,
> > >
> > > Thanks a lot for the comments and reviews.
> > > I agree we should log the username.
> > > What I meant by creating KafkaPrincipal was, after this KIP we would
> not
> > be
> > > required to create KafkaPrincipal and if we want to maintain the old
> > > logging, we will have to create it as we do today.
> > > I will take care that we specify the Principal name in the log.
> > >
> > > Thanks again for all the reviews.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Wed, Feb 22, 2017 at 1:45 PM, Jun Rao <j...@confluent.io> wrote:
> > >
> > > > Hi, Mayuresh,
> > > >
> > > > For logging the user name, we could do either way. We just need to
> make
> > > > sure the expected user name is logged. Also, currently, we are
> already
> > > > creating a KafkaPrincipal on every request. +1 on the latest KIP.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Tue, Feb 21, 2017 at 8:05 PM, Mayuresh Gharat <
> > > > gharatmayures...@gmail.com
> > > > > wrote:
> > > >
> > > > > Hi Jun,
> > > > >
> > > > > Thanks for the comments.
> > > > >
> > > > > I will mention in the KIP : how this change doesn't affect the
> > default
> > > > > authorizer implementation.
> > > > >
> > > > > Regarding, Currently, we log the principal name in the request log
> in
> > > > > RequestChannel, which has the format of "principalType + SEPARATOR
> +
> > > > > name;".
> > > > > It would be good if we can keep the same convention after this KIP.
> > One
> > > > way
> > > > > to do that is to convert java.security.Principal to KafkaPrincipal
> > for
> > > > > logging the requests.
> > > > > --- > This would mean we have to create a new KafkaPrincipal on
> each
> > > > > request. Would it be OK to just specify the name of the principal.
> > > > > Is there any major reason, we don't want to change the logging
> > format?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Mayuresh
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Feb 20, 2017 at 10:18 PM, Jun Rao <j...@confluent.io>
> wrote:
> > > > >
> > > > > > Hi, Mayuresh,
> > > > > >
> > > > > > Thanks for the updated KIP. A couple of more comments.
> > > > > >
> > > > > > 1. Do we convert java.security.Principal to KafkaPrincipal for
> > > > > > authorization check in SimpleAclAuthorizer? If so, it would be
> > useful
> > > > to
> > > > > > mention that in the wiki so that people can understand how this
> > > change
> > > > > > doesn't affect the default authorizer implementation.
> > > > > >
> > > > > > 2. Currently, we log the principal name in the request log in
> > > > > > RequestChannel, which has the format of "principalType +
> SEPARATOR
> > +
> > > > > > name;".
> > > > > > It would be good if we can keep the same convention after this
> KIP.
> > > One
> > > > > way
> > > > > > to do that is to convert java.security.Principal to
> KafkaPrincipal
> > > for
> > > > > > logging the requests.
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > >
> > > > > > On Fri, Feb 17, 2017 at 5:35 PM, Mayuresh Gharat <
> > > > > > gharatmayures...@gmail.com
> > > > > > > wrote:
> > > > > >
> > > > > > > Hi Jun,
> > > > > > &
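
As a hedged illustration of the logging point being discussed above (a sketch, not
the actual RequestChannel or authorizer code): the existing "principalType +
SEPARATOR + name" convention can be kept by wrapping the name of the channel's
java.security.Principal in a KafkaPrincipal at the point where the request is
logged or authorized, while the original (possibly richer) principal object is
preserved elsewhere.

import java.security.Principal;
import javax.security.auth.x500.X500Principal;
import org.apache.kafka.common.security.auth.KafkaPrincipal;

public class PrincipalLoggingSketch {

    // Builds the principal used for logging and for the default authorizer,
    // preserving the existing "User:<name>" convention.
    public static KafkaPrincipal toKafkaPrincipal(Principal channelPrincipal) {
        return new KafkaPrincipal(KafkaPrincipal.USER_TYPE, channelPrincipal.getName());
    }

    public static void main(String[] args) {
        // Stand-in for the principal an SSL channel would provide.
        Principal channelPrincipal = new X500Principal("CN=kafka-client, OU=Infra");
        // Prints something like: User:CN=kafka-client,OU=Infra
        System.out.println(toKafkaPrincipal(channelPrincipal));
    }
}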

Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

2017-02-27 Thread Mayuresh Gharat
Hi Becket,

This seems like an interesting idea.
I had a couple of questions:
1) How do we decide when the batch should be split?
2) What do you mean by slowly lowering the "actual" compression ratio?
An example would really help here.

Thanks,

Mayuresh

On Fri, Feb 24, 2017 at 3:17 PM, Becket Qin  wrote:

> Hi Jay,
>
> Yeah, I got your point.
>
> I think there might be a solution which do not require adding a new
> configuration. We can start from a very conservative compression ratio say
> 1.0 and lower it very slowly according to the actual compression ratio
> until we hit a point that we have to split a batch. At that point, we
> exponentially back off on the compression ratio. The idea is somewhat like
> TCP. This should help avoid frequent split.
>
> The upper bound of the batch size is also a little awkward today because we
> say the batch size is based on compressed size, but users cannot set it to
> the max message size because that will result in oversized messages. With
> this change we will be able to allow the users to set the message size to
> close to max message size.
>
> However the downside is that there could be latency spikes in the system in
> this case due to the splitting, especially when there are many messages
> need to be split at the same time. That could potentially be an issue for
> some users.
>
> What do you think about this approach?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
>
> On Thu, Feb 23, 2017 at 1:31 PM, Jay Kreps  wrote:
>
> > Hey Becket,
> >
> > Yeah that makes sense.
> >
> > I agree that you'd really have to both fix the estimation (i.e. make it
> per
> > topic or make it better estimate the high percentiles) AND have the
> > recovery mechanism. If you are underestimating often and then paying a
> high
> > recovery price that won't fly.
> >
> > I think you take my main point though, which is just that I hate to
> exposes
> > these super low level options to users because it is so hard to explain
> to
> > people what it means and how they should set it. So if it is possible to
> > make either some combination of better estimation and splitting or better
> > tolerance of overage that would be preferrable.
> >
> > -Jay
> >
> > On Thu, Feb 23, 2017 at 11:51 AM, Becket Qin 
> wrote:
> >
> > > @Dong,
> > >
> > > Thanks for the comments. The default behavior of the producer won't
> > change.
> > > If the users want to use the uncompressed message size, they probably
> > will
> > > also bump up the batch size to somewhere close to the max message size.
> > > This would be in the document. BTW the default batch size is 16K which
> is
> > > pretty small.
> > >
> > > @Jay,
> > >
> > > Yeah, we actually had debated quite a bit internally what is the best
> > > solution to this.
> > >
> > > I completely agree it is a bug. In practice we usually leave some
> > headroom
> > > to allow the compressed size to grow a little if the the original
> > messages
> > > are not compressible, for example, 1000 KB instead of exactly 1 MB. It
> is
> > > likely safe enough.
> > >
> > > The major concern for the rejected alternative is performance. It
> largely
> > > depends on how frequent we need to split a batch, i.e. how likely the
> > > estimation can go off. If we only need to the split work occasionally,
> > the
> > > cost would be amortized so we don't need to worry about it too much.
> > > However, it looks that for a producer with shared topics, the
> estimation
> > is
> > > always off. As an example, consider two topics, one with compression
> > ratio
> > > 0.6 the other 0.2, assuming exactly same traffic, the average
> compression
> > > ratio would be roughly 0.4, which is not right for either of the
> topics.
> > So
> > > almost half of the batches (of the topics with 0.6 compression ratio)
> > will
> > > end up larger than the configured batch size. When it comes to more
> > topics
> > > such as mirror maker, this becomes more unpredictable. To avoid
> frequent
> > > rejection / split of the batches, we need to configured the batch size
> > > pretty conservatively. This could actually hurt the performance because
> > we
> > > are shoehorn the messages that are highly compressible to a small batch
> > so
> > > that the other topics that are not that compressible will not become
> too
> > > large with the same batch size. At LinkedIn, our batch size is
> configured
> > > to 64 KB because of this. I think we may actually have better batching
> if
> > > we just use the uncompressed message size and 800 KB batch size.
> > >
> > > We did not think about loosening the message size restriction, but that
> > > sounds a viable solution given that the consumer now can fetch
> oversized
> > > messages. One concern would be that on the broker side oversized
> messages
> > > will bring more memory pressure. With KIP-92, we may mitigate that, but
> > the
> > > memory allocation for large messages may not be very GC friendly. I
> need
> > 
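
Purely as an illustration of the estimation problem described above, using the
numbers from this thread (two topics with compression ratios of roughly 0.6 and
0.2 sharing one estimate of about 0.4), here is a toy sketch showing why roughly
half the batches end up oversized; none of this is Kafka code.

/** Toy illustration of why a single shared compression-ratio estimate misbehaves. */
public class SharedEstimateExample {
    public static void main(String[] args) {
        int batchSizeBytes = 16 * 1024;        // producer batch.size (compressed target)
        double sharedEstimate = 0.4;           // average of the two topics below

        double[] actualRatios = {0.6, 0.2};    // topic A compresses poorly, topic B well
        for (double actual : actualRatios) {
            // Uncompressed bytes the producer would admit into the batch,
            // believing they will shrink to batchSizeBytes after compression.
            int admittedUncompressed = (int) (batchSizeBytes / sharedEstimate);
            int actualCompressed = (int) (admittedUncompressed * actual);
            System.out.printf("actual ratio %.1f -> compressed batch %d bytes (target %d)%n",
                    actual, actualCompressed, batchSizeBytes);
        }
        // The 0.6 topic overshoots the target by ~50%, the 0.2 topic undershoots
        // by half, so about half of all batches come out larger than configured.
    }
}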

Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-22 Thread Mayuresh Gharat
Hi Jun,

Thanks a lot for the comments and reviews.
I agree we should log the username.
What I meant about creating a KafkaPrincipal was that, after this KIP, we would
no longer be required to create one; if we want to maintain the old logging
format, we will have to create it as we do today.
I will make sure that we specify the principal name in the log.

Thanks again for all the reviews.

Thanks,

Mayuresh

On Wed, Feb 22, 2017 at 1:45 PM, Jun Rao <j...@confluent.io> wrote:

> Hi, Mayuresh,
>
> For logging the user name, we could do either way. We just need to make
> sure the expected user name is logged. Also, currently, we are already
> creating a KafkaPrincipal on every request. +1 on the latest KIP.
>
> Thanks,
>
> Jun
>
>
> On Tue, Feb 21, 2017 at 8:05 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi Jun,
> >
> > Thanks for the comments.
> >
> > I will mention in the KIP : how this change doesn't affect the default
> > authorizer implementation.
> >
> > Regarding, Currently, we log the principal name in the request log in
> > RequestChannel, which has the format of "principalType + SEPARATOR +
> > name;".
> > It would be good if we can keep the same convention after this KIP. One
> way
> > to do that is to convert java.security.Principal to KafkaPrincipal for
> > logging the requests.
> > --- > This would mean we have to create a new KafkaPrincipal on each
> > request. Would it be OK to just specify the name of the principal.
> > Is there any major reason, we don't want to change the logging format?
> >
> > Thanks,
> >
> > Mayuresh
> >
> >
> >
> > On Mon, Feb 20, 2017 at 10:18 PM, Jun Rao <j...@confluent.io> wrote:
> >
> > > Hi, Mayuresh,
> > >
> > > Thanks for the updated KIP. A couple of more comments.
> > >
> > > 1. Do we convert java.security.Principal to KafkaPrincipal for
> > > authorization check in SimpleAclAuthorizer? If so, it would be useful
> to
> > > mention that in the wiki so that people can understand how this change
> > > doesn't affect the default authorizer implementation.
> > >
> > > 2. Currently, we log the principal name in the request log in
> > > RequestChannel, which has the format of "principalType + SEPARATOR +
> > > name;".
> > > It would be good if we can keep the same convention after this KIP. One
> > way
> > > to do that is to convert java.security.Principal to KafkaPrincipal for
> > > logging the requests.
> > >
> > > Jun
> > >
> > >
> > > On Fri, Feb 17, 2017 at 5:35 PM, Mayuresh Gharat <
> > > gharatmayures...@gmail.com
> > > > wrote:
> > >
> > > > Hi Jun,
> > > >
> > > > I have updated the KIP. Would you mind taking another look?
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > > > On Fri, Feb 17, 2017 at 4:42 PM, Mayuresh Gharat <
> > > > gharatmayures...@gmail.com
> > > > > wrote:
> > > >
> > > > > Hi Jun,
> > > > >
> > > > > Sure sounds good to me.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Mayuresh
> > > > >
> > > > > On Fri, Feb 17, 2017 at 1:54 PM, Jun Rao <j...@confluent.io> wrote:
> > > > >
> > > > >> Hi, Mani,
> > > > >>
> > > > >> Good point on using PrincipalBuilder for SASL. It seems that
> > > > >> PrincipalBuilder already has access to Authenticator. So, we could
> > > just
> > > > >> enable that in SaslChannelBuilder. We probably could do that in a
> > > > separate
> > > > >> KIP?
> > > > >>
> > > > >> Hi, Mayuresh,
> > > > >>
> > > > >> If you don't think there is a concrete use case for using
> > > > >> PrincipalBuilder in
> > > > >> kafka-acls.sh, perhaps we could do the simpler approach for now?
> > > > >>
> > > > >> Thanks,
> > > > >>
> > > > >> Jun
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Fri, Feb 17, 2017 at 12:23 PM, Mayuresh Gharat <
> > > > >> gharatmayures...@gmail.com> wrote:
> > > > >>
> &

Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-21 Thread Mayuresh Gharat
Hi Jun,

Thanks for the comments.

I will mention in the KIP how this change doesn't affect the default
authorizer implementation.

Regarding: "Currently, we log the principal name in the request log in
RequestChannel, which has the format of "principalType + SEPARATOR + name;".
It would be good if we can keep the same convention after this KIP. One way
to do that is to convert java.security.Principal to KafkaPrincipal for
logging the requests."
---> This would mean we have to create a new KafkaPrincipal on each
request. Would it be OK to just specify the name of the principal?
Is there any major reason we don't want to change the logging format?

Thanks,

Mayuresh



On Mon, Feb 20, 2017 at 10:18 PM, Jun Rao <j...@confluent.io> wrote:

> Hi, Mayuresh,
>
> Thanks for the updated KIP. A couple of more comments.
>
> 1. Do we convert java.security.Principal to KafkaPrincipal for
> authorization check in SimpleAclAuthorizer? If so, it would be useful to
> mention that in the wiki so that people can understand how this change
> doesn't affect the default authorizer implementation.
>
> 2. Currently, we log the principal name in the request log in
> RequestChannel, which has the format of "principalType + SEPARATOR +
> name;".
> It would be good if we can keep the same convention after this KIP. One way
> to do that is to convert java.security.Principal to KafkaPrincipal for
> logging the requests.
>
> Jun
>
>
> On Fri, Feb 17, 2017 at 5:35 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi Jun,
> >
> > I have updated the KIP. Would you mind taking another look?
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Fri, Feb 17, 2017 at 4:42 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com
> > > wrote:
> >
> > > Hi Jun,
> > >
> > > Sure sounds good to me.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Fri, Feb 17, 2017 at 1:54 PM, Jun Rao <j...@confluent.io> wrote:
> > >
> > >> Hi, Mani,
> > >>
> > >> Good point on using PrincipalBuilder for SASL. It seems that
> > >> PrincipalBuilder already has access to Authenticator. So, we could
> just
> > >> enable that in SaslChannelBuilder. We probably could do that in a
> > separate
> > >> KIP?
> > >>
> > >> Hi, Mayuresh,
> > >>
> > >> If you don't think there is a concrete use case for using
> > >> PrincipalBuilder in
> > >> kafka-acls.sh, perhaps we could do the simpler approach for now?
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > >>
> > >>
> > >> On Fri, Feb 17, 2017 at 12:23 PM, Mayuresh Gharat <
> > >> gharatmayures...@gmail.com> wrote:
> > >>
> > >> > @Manikumar,
> > >> >
> > >> > Can you give an example how you are planning to use
> PrincipalBuilder?
> > >> >
> > >> > @Jun
> > >> > Yes, that is right. To give a brief overview, we just extract the
> cert
> > >> and
> > >> > hand it over to a third party library for creating a Principal. So
> we
> > >> > cannot create a Principal from just a string.
> > >> > The main motive behind adding the PrincipalBuilder for kafk-acls.sh
> > was
> > >> > that someone else (who can generate a Principal from map of
> propertie,
> > >> > <String, String> for example) can use it.
> > >> > As I said, Linkedin is fine with not making any changes to
> > Kafka-acls.sh
> > >> > for now. But we thought that it would be a good improvement to the
> > tool
> > >> and
> > >> > it makes it more flexible and usable.
> > >> >
> > >> > Let us know your thoughts, if you would like us to make
> kafka-acls.sh
> > >> more
> > >> > flexible and usable and not limited to Authorizer coming out of the
> > box.
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Mayuresh
> > >> >
> > >> >
> > >> > On Thu, Feb 16, 2017 at 10:18 PM, Manikumar <
> > manikumar.re...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Hi Jun,
> > >> > >
> > >> > > yes, we can just customize rules to send full principal name.  I
> was
> > >> > > just thi

Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

2017-02-21 Thread Mayuresh Gharat
Apurva has a good point that can be documented for this config.

Overall, LGTM. +1.

Thanks,

Mayuresh

On Tue, Feb 21, 2017 at 6:41 PM, Becket Qin  wrote:

> Hi Apurva,
>
> Yes, it is true that the request size might be much smaller if the batching
> is based on uncompressed size. I will let the users know about this. That
> said, in practice, this is probably fine. For example, at LinkedIn, our max
> message size is 1 MB, typically the compressed size would be 100 KB or
> larger, given that in most cases, there are many partitions, the request
> size would not be too small (typically around a few MB).
>
> At LinkedIn we do have some topics has various compression ratio. Those are
> usually topics shared by different services so the data may differ a lot
> although they are in the same topic and similar fields.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
> On Tue, Feb 21, 2017 at 6:17 PM, Apurva Mehta  wrote:
>
> > Hi Becket, Thanks for the kip.
> >
> > I think one of the risks here is that when compression estimation is
> > disabled, you could have much smaller batches than expected, and
> throughput
> > could be hurt. It would be worth adding this to the documentation of this
> > setting.
> >
> > Also, one of the rejected alternatives states that per topic estimations
> > would not work when the compression of individual messages is variable.
> > This is true in theory, but in practice one would expect Kafka topics to
> > have fairly homogenous data, and hence should compress evenly. I was
> > curious if you have data which shows otherwise.
> >
> > Thanks,
> > Apurva
> >
> > On Tue, Feb 21, 2017 at 12:30 PM, Becket Qin 
> wrote:
> >
> > > Hi folks,
> > >
> > > I would like to start the discussion thread on KIP-126. The KIP propose
> > > adding a new configuration to KafkaProducer to allow batching based on
> > > uncompressed message size.
> > >
> > > Comments are welcome.
> > >
> > > The KIP wiki is following:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 126+-+Allow+KafkaProducer+to+batch+based+on+uncompressed+size
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> >
>





Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-17 Thread Mayuresh Gharat
Hi Jun,

I have updated the KIP. Would you mind taking another look?

Thanks,

Mayuresh

On Fri, Feb 17, 2017 at 4:42 PM, Mayuresh Gharat <gharatmayures...@gmail.com
> wrote:

> Hi Jun,
>
> Sure sounds good to me.
>
> Thanks,
>
> Mayuresh
>
> On Fri, Feb 17, 2017 at 1:54 PM, Jun Rao <j...@confluent.io> wrote:
>
>> Hi, Mani,
>>
>> Good point on using PrincipalBuilder for SASL. It seems that
>> PrincipalBuilder already has access to Authenticator. So, we could just
>> enable that in SaslChannelBuilder. We probably could do that in a separate
>> KIP?
>>
>> Hi, Mayuresh,
>>
>> If you don't think there is a concrete use case for using
>> PrincipalBuilder in
>> kafka-acls.sh, perhaps we could do the simpler approach for now?
>>
>> Thanks,
>>
>> Jun
>>
>>
>>
>> On Fri, Feb 17, 2017 at 12:23 PM, Mayuresh Gharat <
>> gharatmayures...@gmail.com> wrote:
>>
>> > @Manikumar,
>> >
>> > Can you give an example how you are planning to use PrincipalBuilder?
>> >
>> > @Jun
>> > Yes, that is right. To give a brief overview, we just extract the cert
>> and
>> > hand it over to a third party library for creating a Principal. So we
>> > cannot create a Principal from just a string.
>> > The main motive behind adding the PrincipalBuilder for kafk-acls.sh was
>> > that someone else (who can generate a Principal from map of propertie,
>> > <String, String> for example) can use it.
>> > As I said, Linkedin is fine with not making any changes to Kafka-acls.sh
>> > for now. But we thought that it would be a good improvement to the tool
>> and
>> > it makes it more flexible and usable.
>> >
>> > Let us know your thoughts, if you would like us to make kafka-acls.sh
>> more
>> > flexible and usable and not limited to Authorizer coming out of the box.
>> >
>> > Thanks,
>> >
>> > Mayuresh
>> >
>> >
>> > On Thu, Feb 16, 2017 at 10:18 PM, Manikumar <manikumar.re...@gmail.com>
>> > wrote:
>> >
>> > > Hi Jun,
>> > >
>> > > yes, we can just customize rules to send full principal name.  I was
>> > > just thinking to
>> > > use PrinciplaBuilder interface for implementing SASL rules also. So
>> that
>> > > the interface
>> > > will be consistent across protocols.
>> > >
>> > > Thanks
>> > >
>> > > On Fri, Feb 17, 2017 at 1:07 AM, Jun Rao <j...@confluent.io> wrote:
>> > >
>> > > > Hi, Radai, Mayuresh,
>> > > >
>> > > > Thanks for the explanation. Good point on a pluggable authorizer can
>> > > > customize how acls are added. However, earlier, Mayuresh was saying
>> > that
>> > > in
>> > > > LinkedIn's customized authorizer, it's not possible to create a
>> > principal
>> > > > from string. If that's the case, will adding the principal builder
>> in
>> > > > kafka-acl.sh help? If the principal can be constructed from a
>> string,
>> > > > wouldn't it be simpler to just let kafka-acl.sh do authorization
>> based
>> > on
>> > > > that string name and not be aware of the principal builder? If you
>> > still
>> > > > think there is a need, perhaps you can add a more concrete use case
>> > that
>> > > > can't be done otherwise?
>> > > >
>> > > >
>> > > > Hi, Mani,
>> > > >
>> > > > For SASL, if the authorizer needs the full kerberos principal name,
>> > > > currently, the user can just customize "sasl.kerberos.principal.to.
>> > > > local.rules"
>> > > > to return the full principal name as the name for authorization,
>> right?
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Jun
>> > > >
>> > > > On Wed, Feb 15, 2017 at 10:25 AM, Mayuresh Gharat <
>> > > > gharatmayures...@gmail.com> wrote:
>> > > >
>> > > > > @Jun thanks for the comments.Please see the replies inline.
>> > > > >
>> > > > > Currently kafka-acl.sh just creates an ACL path in ZK with the
>> > > principal
>> > > > > name string.
>> > > > > > Yes, the kafka-acl.sh calls the addAcl() on the inbuilt
>

Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-17 Thread Mayuresh Gharat
Hi Jun,

Sure, sounds good to me.

Thanks,

Mayuresh

On Fri, Feb 17, 2017 at 1:54 PM, Jun Rao <j...@confluent.io> wrote:

> Hi, Mani,
>
> Good point on using PrincipalBuilder for SASL. It seems that
> PrincipalBuilder already has access to Authenticator. So, we could just
> enable that in SaslChannelBuilder. We probably could do that in a separate
> KIP?
>
> Hi, Mayuresh,
>
> If you don't think there is a concrete use case for using PrincipalBuilder
> in
> kafka-acls.sh, perhaps we could do the simpler approach for now?
>
> Thanks,
>
> Jun
>
>
>
> On Fri, Feb 17, 2017 at 12:23 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
> > @Manikumar,
> >
> > Can you give an example how you are planning to use PrincipalBuilder?
> >
> > @Jun
> > Yes, that is right. To give a brief overview, we just extract the cert
> and
> > hand it over to a third party library for creating a Principal. So we
> > cannot create a Principal from just a string.
> > The main motive behind adding the PrincipalBuilder for kafk-acls.sh was
> > that someone else (who can generate a Principal from map of propertie,
> > <String, String> for example) can use it.
> > As I said, Linkedin is fine with not making any changes to Kafka-acls.sh
> > for now. But we thought that it would be a good improvement to the tool
> and
> > it makes it more flexible and usable.
> >
> > Let us know your thoughts, if you would like us to make kafka-acls.sh
> more
> > flexible and usable and not limited to Authorizer coming out of the box.
> >
> > Thanks,
> >
> > Mayuresh
> >
> >
> > On Thu, Feb 16, 2017 at 10:18 PM, Manikumar <manikumar.re...@gmail.com>
> > wrote:
> >
> > > Hi Jun,
> > >
> > > yes, we can just customize rules to send full principal name.  I was
> > > just thinking to
> > > use PrinciplaBuilder interface for implementing SASL rules also. So
> that
> > > the interface
> > > will be consistent across protocols.
> > >
> > > Thanks
> > >
> > > On Fri, Feb 17, 2017 at 1:07 AM, Jun Rao <j...@confluent.io> wrote:
> > >
> > > > Hi, Radai, Mayuresh,
> > > >
> > > > Thanks for the explanation. Good point on a pluggable authorizer can
> > > > customize how acls are added. However, earlier, Mayuresh was saying
> > that
> > > in
> > > > LinkedIn's customized authorizer, it's not possible to create a
> > principal
> > > > from string. If that's the case, will adding the principal builder in
> > > > kafka-acl.sh help? If the principal can be constructed from a string,
> > > > wouldn't it be simpler to just let kafka-acl.sh do authorization
> based
> > on
> > > > that string name and not be aware of the principal builder? If you
> > still
> > > > think there is a need, perhaps you can add a more concrete use case
> > that
> > > > can't be done otherwise?
> > > >
> > > >
> > > > Hi, Mani,
> > > >
> > > > For SASL, if the authorizer needs the full kerberos principal name,
> > > > currently, the user can just customize "sasl.kerberos.principal.to.
> > > > local.rules"
> > > > to return the full principal name as the name for authorization,
> right?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Wed, Feb 15, 2017 at 10:25 AM, Mayuresh Gharat <
> > > > gharatmayures...@gmail.com> wrote:
> > > >
> > > > > @Jun thanks for the comments.Please see the replies inline.
> > > > >
> > > > > Currently kafka-acl.sh just creates an ACL path in ZK with the
> > > principal
> > > > > name string.
> > > > > > Yes, the kafka-acl.sh calls the addAcl() on the inbuilt
> > > > > SimpleAclAuthorizer which in turn creates an ACL in ZK with the
> > > Principal
> > > > > name string. This is because we supply the SimpleAclAuthorizer as a
> > > > > commandline argument in the Kafka-acls.sh command.
> > > > >
> > > > > The authorizer module in the broker reads the principal name
> > > > > string from the acl path in ZK and creates the expected
> > KafkaPrincipal
> > > > for
> > > > > matching. As you can see, the expected principal is created on the
> > > broker
> > > > > s

Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-17 Thread Mayuresh Gharat
@Manikumar,

Can you give an example of how you are planning to use PrincipalBuilder?

@Jun
Yes, that is right. To give a brief overview, we just extract the cert and
hand it over to a third-party library for creating a Principal, so we
cannot create a Principal from just a string.
The main motive behind adding the PrincipalBuilder for kafka-acls.sh was
that someone else (who can generate a Principal from a map of properties,
<String, String> for example) can use it.
As I said, LinkedIn is fine with not making any changes to kafka-acls.sh
for now. But we thought that it would be a good improvement to the tool and
that it would make it more flexible and usable.

Let us know your thoughts on whether you would like us to make kafka-acls.sh
more flexible and usable, and not limited to the Authorizer that comes out of
the box.

Thanks,

Mayuresh


On Thu, Feb 16, 2017 at 10:18 PM, Manikumar <manikumar.re...@gmail.com>
wrote:

> Hi Jun,
>
> yes, we can just customize rules to send full principal name.  I was
> just thinking to
> use PrinciplaBuilder interface for implementing SASL rules also. So that
> the interface
> will be consistent across protocols.
>
> Thanks
>
> On Fri, Feb 17, 2017 at 1:07 AM, Jun Rao <j...@confluent.io> wrote:
>
> > Hi, Radai, Mayuresh,
> >
> > Thanks for the explanation. Good point on a pluggable authorizer can
> > customize how acls are added. However, earlier, Mayuresh was saying that
> in
> > LinkedIn's customized authorizer, it's not possible to create a principal
> > from string. If that's the case, will adding the principal builder in
> > kafka-acl.sh help? If the principal can be constructed from a string,
> > wouldn't it be simpler to just let kafka-acl.sh do authorization based on
> > that string name and not be aware of the principal builder? If you still
> > think there is a need, perhaps you can add a more concrete use case that
> > can't be done otherwise?
> >
> >
> > Hi, Mani,
> >
> > For SASL, if the authorizer needs the full kerberos principal name,
> > currently, the user can just customize "sasl.kerberos.principal.to.
> > local.rules"
> > to return the full principal name as the name for authorization, right?
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Feb 15, 2017 at 10:25 AM, Mayuresh Gharat <
> > gharatmayures...@gmail.com> wrote:
> >
> > > @Jun thanks for the comments.Please see the replies inline.
> > >
> > > Currently kafka-acl.sh just creates an ACL path in ZK with the
> principal
> > > name string.
> > > > Yes, the kafka-acl.sh calls the addAcl() on the inbuilt
> > > SimpleAclAuthorizer which in turn creates an ACL in ZK with the
> Principal
> > > name string. This is because we supply the SimpleAclAuthorizer as a
> > > commandline argument in the Kafka-acls.sh command.
> > >
> > > The authorizer module in the broker reads the principal name
> > > string from the acl path in ZK and creates the expected KafkaPrincipal
> > for
> > > matching. As you can see, the expected principal is created on the
> broker
> > > side, not by the kafka-acl.sh tool.
> > > > This is considering the fact that the user is using the
> > > SimpleAclAuthorizer on the broker side and not his own custom
> Authorizer.
> > > The SimpleAclAuthorizer will take the Principal it gets from the
> Session
> > > class . Currently the Principal is KafkaPrincipal. This KafkaPrincipal
> is
> > > generated from the name of the actual channel Principal, in
> SocketServer
> > > class when processing completed receives.
> > > With this KIP, this will no longer be the case as the Session class
> will
> > > store a java.security.Principal instead of specific KafkaPrincipal. So
> > the
> > > SimpleAclAuthorizer will construct the KafkaPrincipal from the channel
> > > Principal it gets from the Session class.
> > > User might not want to use the SimpleAclAuthorizer but use his/her own
> > > custom Authorizer.
> > >
> > > The broker already has the ability to
> > > configure PrincipalBuilder. That's why I am not sure if there is a need
> > for
> > > kafka-acl.sh to customize PrincipalBuilder.
> > > > This is exactly the reason why we want to propose a
> > PrincipalBuilder
> > > in kafka-acls.sh so that the Principal generated by the
> PrincipalBuilder
> > on
> > > broker is consistent with that generated while creating ACLs using the
> > > kafka-acls.sh command line tool.
> > >
> > >
> > > *To su
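
To make the use case above concrete, here is a hedged sketch of a principal that
carries more than a name, which is exactly the information that gets dropped today
when only the name is kept. All names here are invented for illustration; the point
is that a custom authorizer could read the extra attributes back from the preserved
channel principal instead of having to reconstruct them from a string.

import java.security.Principal;
import java.util.Map;

// Illustrative only: the kind of principal a custom PrincipalBuilder might
// produce from an SSL certificate via a third-party library. The attribute
// map stands in for whatever extra fields that library extracts.
public class CertDerivedPrincipal implements Principal {
    private final String name;
    private final Map<String, String> attributes;

    public CertDerivedPrincipal(String name, Map<String, String> attributes) {
        this.name = name;
        this.attributes = attributes;
    }

    @Override
    public String getName() {
        return name;
    }

    public Map<String, String> attributes() {
        return attributes;
    }

    // Example of a check a custom authorizer could perform once the channel
    // principal is preserved end to end instead of being reduced to a name.
    public static boolean belongsToOrg(Principal channelPrincipal, String orgName) {
        return channelPrincipal instanceof CertDerivedPrincipal
                && orgName.equals(((CertDerivedPrincipal) channelPrincipal).attributes().get("org"));
    }
}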

Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-15 Thread Mayuresh Gharat
string. The authorizer module in the broker reads the principal
> name
> > > string from the acl path in ZK and creates the expected KafkaPrincipal
> > for
> > > matching. As you can see, the expected principal is created on the
> broker
> > > side, not by the kafka-acl.sh tool. The broker already has the ability
> to
> > > configure PrincipalBuilder. That's why I am not sure if there is a need
> > for
> > > kafka-acl.sh to customize PrincipalBuilder.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > > On Mon, Feb 13, 2017 at 7:01 PM, radai <radai.rosenbl...@gmail.com>
> > wrote:
> > >
> > > > if i understand correctly, kafka-acls.sh spins up an instance of (the
> > > > custom, in our case) Authorizer, and calls things like addAcls(acls:
> > > > Set[Acl], resource: Resource) on it, which are defined in the
> > interface,
> > > > hence expected to be "extensible".
> > > >
> > > > (side note: if Authorizer and PrincipalBuilder are defined as
> > extensible
> > > > interfaces, why doesnt class Acl, which is in the signature for
> > > Authorizer
> > > > calls, use java.security.Principal?)
> > > >
> > > > we would like to be able to use the standard kafka-acl command line
> for
> > > > defining ACLs even when replacing the vanilla Authorizer and
> > > > PrincipalBuilder (even though we have a management UI for these
> > > operations
> > > > within linkedin) - simply because thats the correct thing to do from
> an
> > > > extensibility point of view.
> > > >
> > > > On Mon, Feb 13, 2017 at 1:39 PM, Jun Rao <j...@confluent.io> wrote:
> > > >
> > > > > Hi, Mayuresh,
> > > > >
> > > > > I seems to me that there are two common use cases of authorizer.
> (1)
> > > Use
> > > > > the default SimpleAuthorizer and the kafka-acl to do authorization.
> > (2)
> > > > Use
> > > > > a customized authorizer and an external tool for authorization. Do
> > you
> > > > > think there is a use case for a customized authorizer and kafka-acl
> > at
> > > > the
> > > > > same time? If not, it's better not to complicate the kafka-acl api.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Feb 13, 2017 at 10:35 AM, Mayuresh Gharat <
> > > > > gharatmayures...@gmail.com> wrote:
> > > > >
> > > > > > Hi Jun,
> > > > > >
> > > > > > Thanks for the review and comments. Please find the replies
> inline
> > :
> > > > > >
> > > > > > This is so that in the future, we can extend to types like group.
> > > > > > ---> Yep, I did think the same. But since the SocketServer was
> > always
> > > > > > creating User type, it wasn't actually used. If we go ahead with
> > > > changes
> > > > > in
> > > > > > this KIP, we will give this power of creating different Principal
> > > types
> > > > > to
> > > > > > the PrincipalBuilder (which users can define there own). In that
> > way
> > > > > Kafka
> > > > > > will not have to deal with handling this. So the Principal
> building
> > > and
> > > > > > Authorization will be opaque to Kafka which seems like an
> expected
> > > > > > behavior.
> > > > > >
> > > > > >
> > > > > > Hmm, normally, the configurations you specify for plug-ins refer
> to
> > > > those
> > > > > > needed to construct the plug-in object. So, it's kind of weird to
> > use
> > > > > that
> > > > > > to call a method. For example, why can't
> > > principalBuilderService.rest.
> > > > > url
> > > > > > be passed in through the configure() method and the
> implementation
> > > can
> > > > > use
> > > > > > that to build principal. This way, there is only a single method
> to
> > > > > compute
> > > > > > the principal in a consistent way in the broker and in the
> > kafka-acl
> > > > > tool.
> > >

Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-13 Thread Mayuresh Gharat
Hi Jun,

Thanks for the review and comments. Please find the replies inline:

This is so that in the future, we can extend to types like group.
---> Yep, I did think the same. But since the SocketServer was always
creating the User type, it wasn't actually used. If we go ahead with the changes
in this KIP, we will give this power of creating different Principal types to
the PrincipalBuilder (which users can define on their own). That way Kafka
will not have to deal with handling this, so the Principal building and
authorization will be opaque to Kafka, which seems like the expected behavior.


Hmm, normally, the configurations you specify for plug-ins refer to those
needed to construct the plug-in object. So, it's kind of weird to use that
to call a method. For example, why can't principalBuilderService.rest.url
be passed in through the configure() method and the implementation can use
that to build principal. This way, there is only a single method to compute
the principal in a consistent way in the broker and in the kafka-acl tool.
> We can do that as well. But since the REST URL is not related to the
Principal, it seems out of place to me to pass it every time we have to
create a Principal. I should replace "principalConfigs" with
"principalProperties".
I was trying to differentiate the configs/properties that are used to
create the PrincipalBuilder class from those used to create the Principal(s)
themselves.


For LinkedIn's use case, do you actually use the kafka-acl tool? My
understanding is that LinkedIn does authorization through an external tool.
> For LinkedIn's use case we don't actually use the kafka-acl tool
right now. As per the discussion that we had on
https://issues.apache.org/jira/browse/KAFKA-4454, we thought that it would
be good to make the kafka-acl tool changes to make it flexible, and we might
even be able to use it in the future.

It seems it's simpler if kafka-acl doesn't to need to understand the
principal builder. The tool does authorization based on a string name,
which is expected to match the principal name. So, I am wondering why the
tool needs to know the principal builder.
> If we don't make this change, I am not sure how clients/end users
will be able to use this tool if they have their own Authorizer that does
authorization based on a Principal that has more information than just a name
and type.

What if we only make the following changes: pass the java principal in
session and in
SimpleAuthorizer, construct KafkaPrincipal from java principal name. Will
that work for LinkedIn?
> This can work for LinkedIn but, as explained above, it does not seem
like a complete design from an open-source point of view.

Thanks,

Mayuresh


On Thu, Feb 9, 2017 at 11:29 AM, Jun Rao <j...@confluent.io> wrote:

> Hi, Mayuresh,
>
> Thanks for the reply. A few more comments below.
>
> On Wed, Feb 8, 2017 at 9:14 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com>
> wrote:
>
> > Hi Jun,
> >
> > Thanks for the review. Please find the responses inline.
> >
> > 1. It seems the problem that you are trying to address is that java
> > principal returned from KafkaChannel may have additional fields than name
> > that are needed during authorization. Have you considered a customized
> > PrincipleBuilder that extracts all needed fields from java principal and
> > squeezes them as a json in the name of the returned principal? Then, the
> > authorizer can just parse the json and extract needed fields.
> > ---> Yes we had thought about this. We use a third party library that
> takes
> > in the passed in cert and creates the Principal. This Principal is then
> > used by the library to make the decision (ALLOW/DENY) when we call it in
> > the Authorizer. It does not have an API to create the Principal from a
> > String. If it did support, still we would have to be aware of the
> internal
> > details of the library, like the field values it creates from the certs,
> > defaults and so on.
> >
> > 2. Could you explain how the default authorizer works now? Currently, the
> > code just compares the two principal objects. Are we converting the java
> > principal to a KafkaPrincipal there?
> > ---> The SimpleAclAuthorizer currently expects that, the Principal it
> > fetches from the Session object is an instance of KafkaPrincipal. It then
> > uses it compare with the KafkaPrincipal extracted from the stored ACLs.
> In
> > this case, we can construct the KafkaPrincipal object on the fly by using
> > the name of the Principal as follows :
> >
> > *val principal = session.principal*
> > *val kafkaPrincipal = new KafkaPrincipal(KafkaPrincipal.USER_TYPE,
> > principal.getName)*
> > I was also planning to get rid of the principalType field in
> > KafkaPrincipal as
&
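
For readers following the configure() vs. buildPrincipal(Map) question in this
thread, here is a hedged sketch of the two shapes being debated. The
buildPrincipal(Map<String, ?> principalConfigs) method is only what the KIP
proposes at this point, not an existing Kafka API, and the property names are
invented.

import java.security.Principal;
import java.util.Map;
import org.apache.kafka.common.Configurable;
import org.apache.kafka.common.security.auth.KafkaPrincipal;

// Sketch of a custom principal builder. configure() receives builder-level
// settings (for example the URL of an external service), while the method
// proposed in the KIP would receive per-principal properties such as a name.
public class SketchPrincipalBuilder implements Configurable {

    private Map<String, ?> builderConfigs;

    @Override
    public void configure(Map<String, ?> configs) {
        // Builder-wide settings, e.g. "principal.builder.service.url" (invented name).
        this.builderConfigs = configs;
    }

    // Shape of the method proposed in KIP-111; not part of Kafka today.
    public Principal buildPrincipal(Map<String, ?> principalConfigs) {
        String name = String.valueOf(principalConfigs.get("name"));
        return new KafkaPrincipal(KafkaPrincipal.USER_TYPE, name);
    }
}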

Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-08 Thread Mayuresh Gharat
 compares the two principal objects. Are we converting the java
> principal to a KafkaPrincipal there?
>
> 3. Do we need to add the following method in PrincipalBuilder? The configs
> are already passed in through configure() and an implementation can cache
> it and use it in buildPrincipal(). It's also not clear to me where we call
> the new and the old method, and whether both will be called or one of them
> will be called.
> Principal buildPrincipal(Map<String, ?> principalConfigs);
>
> 4. The KIP has "If users use there custom PrincipalBuilder, they will have
> to implement there custom Authorizer as the out of box Authorizer that
> Kafka provides uses KafkaPrincipal." This is not ideal for existing users.
> Could we avoid that?
>
> Thanks,
>
> Jun
>
>
> On Fri, Feb 3, 2017 at 11:25 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi All,
> >
> > It seems that there is no further concern with the KIP-111. At this point
> > we would like to start the voting process. The KIP can be found at
> > https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=67638388
> >
> > Thanks,
> >
> > Mayuresh
> >
>





Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-07 Thread Mayuresh Gharat
Bumping up this thread.

Thanks,

Mayuresh

On Fri, Feb 3, 2017 at 5:09 PM, radai <radai.rosenbl...@gmail.com> wrote:

> +1
>
> On Fri, Feb 3, 2017 at 11:25 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi All,
> >
> > It seems that there is no further concern with the KIP-111. At this point
> > we would like to start the voting process. The KIP can be found at
> > https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=67638388
> >
> > Thanks,
> >
> > Mayuresh
> >
>





[VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-03 Thread Mayuresh Gharat
Hi All,

It seems that there are no further concerns with KIP-111. At this point
we would like to start the voting process. The KIP can be found at
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=67638388

Thanks,

Mayuresh


Re: [DISCUSS] KIP-111 : Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-03 Thread Mayuresh Gharat
Hi All,

If there are no more concerns, I would like to start the vote for this KIP.

Thanks,

Mayuresh

On Wed, Feb 1, 2017 at 8:38 PM, Mayuresh Gharat <gharatmayures...@gmail.com>
wrote:

> Hi Dong,
>
> What I meant was "Right now Kafka just extracts the name out of the
> Principal that is generated by the PrincipalBuilder. Instead of doing that
> if it preserves the Principal itself, this issue can be addressed".
>
> May be I should have used the word "preserve" instead of "stores". I have
> updated the wording in the KIP.
>
> Thanks,
>
> Mayuresh
>
> On Wed, Feb 1, 2017 at 8:30 PM, Dong Lin <lindon...@gmail.com> wrote:
>
>> The last paragraph of the motivation section is a bit confusing. I guess
>> you want to say "This issue can be addressed if the Session class stores
>> the Principal object extracted from a request".
>>
>> I like the approach of changing Session class to be case class
>> *Session(principal:
>> KafkaPrincipal, clientAddress: InetAddress)* under the assumption that the
>> Session class doesn't really need principalType of the KafkaPrincipal. I
>> am
>> wondering if anyone in the open source mailing list knows why we need to
>> have principalType in KafkaPrincipal.
>>
>> For the record, I actually prefer that we use the existing configure() to
>> provide properties to PrincipalBuilder instead of adding the method
>> *buildPrincipal(Map<String,
>> ?> principalConfigs)* in the PrincipalBuilder interface. But this is not a
>> blocking issue for me.
>>
>>
>>
>>
>> On Wed, Feb 1, 2017 at 2:54 PM, Mayuresh Gharat <
>> gharatmayures...@gmail.com>
>> wrote:
>>
>> > Hi All,
>> >
>> > I have updated the KIP as per our discussion here.
>> > It would be great if you can take another look and let me know if there
>> are
>> > any concerns.
>> >
>> > Thanks,
>> >
>> > Mayuresh
>> >
>> > On Sat, Jan 28, 2017 at 6:10 PM, Mayuresh Gharat <
>> > gharatmayures...@gmail.com
>> > > wrote:
>> >
>> > > I had offline discussions with Joel, Dong and Radai.
>> > >
>> > > I agree that we can replace the KafkaPrincipal in Session with the
>> > > ChannelPrincipal.
>> > > KafkaPrincipal can be provided as an out of box implementation.
>> > >
>> > > The only gotcha will be users will have to implement there own
>> > Authorizer,
>> > > if they decide to use there own PrincipalBuilder in kafka-acls.sh.
>> > >
>> > > I will update the KIP accordingly.
>> > >
>> > > Thanks,
>> > >
>> > > Mayuresh
>> > >
>> > > On Thu, Jan 26, 2017 at 6:01 PM, Mayuresh Gharat <
>> > > gharatmayures...@gmail.com> wrote:
>> > >
>> > >> Hi Dong,
>> > >>
>> > >> Thanks for the review. Please see the replies inline.
>> > >>
>> > >>
>> > >> 1. I am not sure we need to add the method
>> buildPrincipal(Map<String, ?>
>> > >> principalConfigs). It seems that user can simply do
>> > >> principalBuilder.configure(...).buildPrincipal(...) without using
>> that
>> > >> method.
>> > >> -> I am not sure if I understand the question.
>> > >> buildPrincipal(Map<String, ?> principalConfigs) will be used to build
>> > >> individual Principals from the passed in configs. Each Principal can
>> be
>> > >> different type and the PrincipalBuilder is responsible for handling
>> > those
>> > >> configs correctly and build those Principals.
>> > >>
>> > >> 2. Is there any reason specific reason that we should put the
>> > >> channelPrincipal in KafkaPrincipal class instead of the Session
>> class?
>> > If
>> > >> they work equally well to serve the use-case of this KIP, then it
>> seems
>> > >> better to put this field in the Session class to avoid changing
>> > interface
>> > >> that needs to be implemented by custom principal.
>> > >> -> Doing this might be backwards incompatible as we need to
>> > >> preserve the existing behavior of kafka-acls.sh. Also as we have
>> field
>> > of
>> > >> PrincipalType which can be used in future if Kafka decides to support
>> > >> different Principal typ

Re: [DISCUSS] KIP-111 : Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-01 Thread Mayuresh Gharat
Hi Dong,

What I meant was "Right now Kafka just extracts the name out of the
Principal that is generated by the PrincipalBuilder. Instead of doing that,
if it preserves the Principal itself, this issue can be addressed".

Maybe I should have used the word "preserve" instead of "stores". I have
updated the wording in the KIP.

Thanks,

Mayuresh

On Wed, Feb 1, 2017 at 8:30 PM, Dong Lin <lindon...@gmail.com> wrote:

> The last paragraph of the motivation section is a bit confusing. I guess
> you want to say "This issue can be addressed if the Session class stores
> the Principal object extracted from a request".
>
> I like the approach of changing Session class to be case class
> *Session(principal:
> KafkaPrincipal, clientAddress: InetAddress)* under the assumption that the
> Session class doesn't really need principalType of the KafkaPrincipal. I am
> wondering if anyone in the open source mailing list knows why we need to
> have principalType in KafkaPrincipal.
>
> For the record, I actually prefer that we use the existing configure() to
> provide properties to PrincipalBuilder instead of adding the method
> *buildPrincipal(Map<String,
> ?> principalConfigs)* in the PrincipalBuilder interface. But this is not a
> blocking issue for me.
>
>
>
>
> On Wed, Feb 1, 2017 at 2:54 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I have updated the KIP as per our discussion here.
> > It would be great if you can take another look and let me know if there
> are
> > any concerns.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Sat, Jan 28, 2017 at 6:10 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com
> > > wrote:
> >
> > > I had offline discussions with Joel, Dong and Radai.
> > >
> > > I agree that we can replace the KafkaPrincipal in Session with the
> > > ChannelPrincipal.
> > > KafkaPrincipal can be provided as an out of box implementation.
> > >
> > > The only gotcha will be users will have to implement there own
> > Authorizer,
> > > if they decide to use there own PrincipalBuilder in kafka-acls.sh.
> > >
> > > I will update the KIP accordingly.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Thu, Jan 26, 2017 at 6:01 PM, Mayuresh Gharat <
> > > gharatmayures...@gmail.com> wrote:
> > >
> > >> Hi Dong,
> > >>
> > >> Thanks for the review. Please see the replies inline.
> > >>
> > >>
> > >> 1. I am not sure we need to add the method buildPrincipal(Map<String,
> ?>
> > >> principalConfigs). It seems that user can simply do
> > >> principalBuilder.configure(...).buildPrincipal(...) without using
> that
> > >> method.
> > >> -> I am not sure if I understand the question.
> > >> buildPrincipal(Map<String, ?> principalConfigs) will be used to build
> > >> individual Principals from the passed in configs. Each Principal can
> be
> > >> different type and the PrincipalBuilder is responsible for handling
> > those
> > >> configs correctly and build those Principals.
> > >>
> > >> 2. Is there any reason specific reason that we should put the
> > >> channelPrincipal in KafkaPrincipal class instead of the Session class?
> > If
> > >> they work equally well to serve the use-case of this KIP, then it
> seems
> > >> better to put this field in the Session class to avoid changing
> > interface
> > >> that needs to be implemented by custom principal.
> > >> -> Doing this might be backwards incompatible as we need to
> > >> preserve the existing behavior of kafka-acls.sh. Also as we have field
> > of
> > >> PrincipalType which can be used in future if Kafka decides to support
> > >> different Principal types (currently it just says "User"), we might
> > loose
> > >> that functionality.
> > >>
> > >> Thanks,
> > >>
> > >> Mayuresh
> > >>
> > >>
> > >> On Tue, Jan 24, 2017 at 3:35 PM, Dong Lin <lindon...@gmail.com>
> wrote:
> > >>
> > >>> Hey Mayuresh,
> > >>>
> > >>> Thanks for the KIP. I actually like the suggestions by Ismael and
> Jun.
> > >>> Here
> > >>> are my comments:
> > >>>
> > >>> 1. I am not sur

Re: [DISCUSS] KIP-111 : Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-02-01 Thread Mayuresh Gharat
Hi All,

I have updated the KIP as per our discussion here.
It would be great if you could take another look and let me know if there are
any concerns.

Thanks,

Mayuresh

On Sat, Jan 28, 2017 at 6:10 PM, Mayuresh Gharat <gharatmayures...@gmail.com
> wrote:

> I had offline discussions with Joel, Dong and Radai.
>
> I agree that we can replace the KafkaPrincipal in Session with the
> ChannelPrincipal.
> KafkaPrincipal can be provided as an out of box implementation.
>
> The only gotcha will be users will have to implement there own Authorizer,
> if they decide to use there own PrincipalBuilder in kafka-acls.sh.
>
> I will update the KIP accordingly.
>
> Thanks,
>
> Mayuresh
>
> On Thu, Jan 26, 2017 at 6:01 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
>> Hi Dong,
>>
>> Thanks for the review. Please see the replies inline.
>>
>>
>> 1. I am not sure we need to add the method buildPrincipal(Map<String, ?>
>> principalConfigs). It seems that user can simply do
>> principalBuilder.configure(...).buildPrincipal(...) without using that
>> method.
>> -> I am not sure if I understand the question.
>> buildPrincipal(Map<String, ?> principalConfigs) will be used to build
>> individual Principals from the passed in configs. Each Principal can be
>> different type and the PrincipalBuilder is responsible for handling those
>> configs correctly and build those Principals.
>>
>> 2. Is there any reason specific reason that we should put the
>> channelPrincipal in KafkaPrincipal class instead of the Session class? If
>> they work equally well to serve the use-case of this KIP, then it seems
>> better to put this field in the Session class to avoid changing interface
>> that needs to be implemented by custom principal.
>> -> Doing this might be backwards incompatible as we need to
>> preserve the existing behavior of kafka-acls.sh. Also as we have field of
>> PrincipalType which can be used in future if Kafka decides to support
>> different Principal types (currently it just says "User"), we might loose
>> that functionality.
>>
>> Thanks,
>>
>> Mayuresh
>>
>>
>> On Tue, Jan 24, 2017 at 3:35 PM, Dong Lin <lindon...@gmail.com> wrote:
>>
>>> Hey Mayuresh,
>>>
>>> Thanks for the KIP. I actually like the suggestions by Ismael and Jun.
>>> Here
>>> are my comments:
>>>
>>> 1. I am not sure we need to add the method buildPrincipal(Map<String, ?>
>>> principalConfigs). It seems that user can simply do
>>> principalBuilder.configure(...).buildPrincipal(...) without using that
>>> method.
>>>
>>> 2. Is there any reason specific reason that we should put the
>>> channelPrincipal in KafkaPrincipal class instead of the Session class? If
>>> they work equally well to serve the use-case of this KIP, then it seems
>>> better to put this field in the Session class to avoid changing interface
>>> that needs to be implemented by custom principal.
>>>
>>> Dong
>>>
>>>
>>> On Mon, Jan 23, 2017 at 5:55 PM, Mayuresh Gharat <
>>> gharatmayures...@gmail.com
>>> > wrote:
>>>
>>> > Hi Rajini,
>>> >
>>> > Thanks a lot for the review. Please see the comments inline :
>>> >
>>> > It feels like the goal is to expose custom Principal as an
>>> > opaque object between PrincipalBuilder and Authorizer so that Kafka
>>> doesn't
>>> > really need to know anything about additional stuff added for
>>> > customization. But kafka-acls.sh is expecting a key-value map from
>>> which
>>> > Principal is constructed. This is a breaking change to the
>>> PrincipalBuilder
>>> > interface - and I am not sure what it achieves.
>>> > -> kafka-acls is a commandline tool where in currently we just
>>> specify
>>> > the "names" of the principal that are allowed or denied.
>>> > The Principal generated by PrincipalBuilder is still opaque and Kafka
>>> as
>>> > such does not need to know the details.
>>> > The key-value map that is been passed in, will be used specifically by
>>> the
>>> > user PrincipalBuilder to create the Principal. The main motivation of
>>> the
>>> > KIP is that, the Principal built by the PrincipalBuilder can have other
>>> > fields apart from the "name", which are ignored currently. Allowing a
>

[jira] (KAFKA-1610) Local modifications to collections generated from mapValues will be lost

2017-01-31 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847355#comment-15847355
 ] 

Mayuresh Gharat commented on KAFKA-1610:


[~jozi-k] sure. The patch has been available but we somehow missed this.

> Local modifications to collections generated from mapValues will be lost
> 
>
> Key: KAFKA-1610
> URL: https://issues.apache.org/jira/browse/KAFKA-1610
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>        Assignee: Mayuresh Gharat
>  Labels: newbie
> Attachments: KAFKA-1610_2014-08-29_09:51:51.patch, 
> KAFKA-1610_2014-08-29_10:03:55.patch, KAFKA-1610_2014-09-03_11:27:50.patch, 
> KAFKA-1610_2014-09-16_13:08:17.patch, KAFKA-1610_2014-09-16_15:23:27.patch, 
> KAFKA-1610_2014-09-30_23:21:46.patch, KAFKA-1610_2014-10-02_12:07:01.patch, 
> KAFKA-1610_2014-10-02_12:09:46.patch, KAFKA-1610.patch
>
>
> In our current Scala code base we have 40+ usages of mapValues. However, it 
> has an important semantic difference from map: "map" creates a 
> new map collection instance, while "mapValues" just creates a view of the 
> original map, and hence any further changes to the values of that view will be 
> effectively lost.
> Example code:
> {code}
> scala> case class Test(i: Int, var j: Int) {}
> defined class Test
> scala> val a = collection.mutable.Map(1 -> 1)
> a: scala.collection.mutable.Map[Int,Int] = Map(1 -> 1)
> scala> val b = a.mapValues(v => Test(v, v))
> b: scala.collection.Map[Int,Test] = Map(1 -> Test(1,1))
> scala> val c = a.map(v => v._1 -> Test(v._2, v._2))
> c: scala.collection.mutable.Map[Int,Test] = Map(1 -> Test(1,1))
> scala> b.foreach(kv => kv._2.j = kv._2.j + 1)
> scala> b
> res1: scala.collection.Map[Int,Test] = Map(1 -> Test(1,1))
> scala> c.foreach(kv => kv._2.j = kv._2.j + 1)
> scala> c
> res3: scala.collection.mutable.Map[Int,Test] = Map(1 -> Test(1,2))
> scala> a.put(1,3)
> res4: Option[Int] = Some(1)
> scala> b
> res5: scala.collection.Map[Int,Test] = Map(1 -> Test(3,3))
> scala> c
> res6: scala.collection.mutable.Map[Int,Test] = Map(1 -> Test(1,2))
> {code}
> We need to go through all these mapValues usages to see if they should be 
> changed to map.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-111 : Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-01-28 Thread Mayuresh Gharat
I had offline discussions with Joel, Dong and Radai.

I agree that we can replace the KafkaPrincipal in Session with the
ChannelPrincipal.
KafkaPrincipal can be provided as an out-of-the-box implementation.

The only gotcha is that users will have to implement their own Authorizer
if they decide to use their own PrincipalBuilder with kafka-acls.sh.
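
To make the idea concrete, the Session would end up carrying whatever Principal
the configured builder produces, roughly along the following lines. This is
only an illustrative sketch, not the code that will go into the KIP:

import java.net.InetAddress
import java.security.Principal

// Illustrative sketch: Session holds the Principal produced by whichever
// PrincipalBuilder is configured. The out-of-the-box builder would keep
// returning a KafkaPrincipal, so default deployments behave as before;
// only custom builders need a matching custom Authorizer.
case class Session(principal: Principal, clientAddress: InetAddress)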

I will update the KIP accordingly.

Thanks,

Mayuresh

On Thu, Jan 26, 2017 at 6:01 PM, Mayuresh Gharat <gharatmayures...@gmail.com
> wrote:

> Hi Dong,
>
> Thanks for the review. Please see the replies inline.
>
>
> 1. I am not sure we need to add the method buildPrincipal(Map<String, ?>
> principalConfigs). It seems that user can simply do
> principalBuilder.configure(...).buildPrincipal(...) without using that
> method.
> -> I am not sure if I understand the question.
> buildPrincipal(Map<String, ?> principalConfigs) will be used to build
> individual Principals from the passed in configs. Each Principal can be
> different type and the PrincipalBuilder is responsible for handling those
> configs correctly and build those Principals.
>
> 2. Is there any reason specific reason that we should put the
> channelPrincipal in KafkaPrincipal class instead of the Session class? If
> they work equally well to serve the use-case of this KIP, then it seems
> better to put this field in the Session class to avoid changing interface
> that needs to be implemented by custom principal.
> -> Doing this might be backwards incompatible as we need to
> preserve the existing behavior of kafka-acls.sh. Also as we have field of
> PrincipalType which can be used in future if Kafka decides to support
> different Principal types (currently it just says "User"), we might loose
> that functionality.
>
> Thanks,
>
> Mayuresh
>
>
> On Tue, Jan 24, 2017 at 3:35 PM, Dong Lin <lindon...@gmail.com> wrote:
>
>> Hey Mayuresh,
>>
>> Thanks for the KIP. I actually like the suggestions by Ismael and Jun.
>> Here
>> are my comments:
>>
>> 1. I am not sure we need to add the method buildPrincipal(Map<String, ?>
>> principalConfigs). It seems that user can simply do
>> principalBuilder.configure(...).buildPrincipal(...) without using that
>> method.
>>
>> 2. Is there any reason specific reason that we should put the
>> channelPrincipal in KafkaPrincipal class instead of the Session class? If
>> they work equally well to serve the use-case of this KIP, then it seems
>> better to put this field in the Session class to avoid changing interface
>> that needs to be implemented by custom principal.
>>
>> Dong
>>
>>
>> On Mon, Jan 23, 2017 at 5:55 PM, Mayuresh Gharat <
>> gharatmayures...@gmail.com
>> > wrote:
>>
>> > Hi Rajini,
>> >
>> > Thanks a lot for the review. Please see the comments inline :
>> >
>> > It feels like the goal is to expose custom Principal as an
>> > opaque object between PrincipalBuilder and Authorizer so that Kafka
>> doesn't
>> > really need to know anything about additional stuff added for
>> > customization. But kafka-acls.sh is expecting a key-value map from which
>> > Principal is constructed. This is a breaking change to the
>> PrincipalBuilder
>> > interface - and I am not sure what it achieves.
>> > -> kafka-acls is a commandline tool where in currently we just
>> specify
>> > the "names" of the principal that are allowed or denied.
>> > The Principal generated by PrincipalBuilder is still opaque and Kafka as
>> > such does not need to know the details.
>> > The key-value map that is been passed in, will be used specifically by
>> the
>> > user PrincipalBuilder to create the Principal. The main motivation of
>> the
>> > KIP is that, the Principal built by the PrincipalBuilder can have other
>> > fields apart from the "name", which are ignored currently. Allowing a
>> > key-value pair to be passed in will enable the PrincipalBuilder to
>> create
>> > such type of Principal.
>> >
>> > 1. A custom Principal is (a) created during authentication using custom
>> > PrincipalBuilder (b) checked during authorization using
>> Principal.equals()
>> > and (c) stored in Zookeeper using Principal.toString(). Is that correct?
>> > -> The authorization will be done as per the user supplied
>> Authorizer.
>> > As not everyone might be using zookeeper for storing ACLs, its storage
>> is
>> > again Authorizer  implementation dependent.
>> >

Re: [DISCUSS] KIP-111 : Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-01-26 Thread Mayuresh Gharat
Hi Dong,

Thanks for the review. Please see the replies inline.


1. I am not sure we need to add the method buildPrincipal(Map<String, ?>
principalConfigs). It seems that user can simply do
principalBuilder.configure(...).buildPrincipal(...) without using that
method.
-> I am not sure if I understand the question.
buildPrincipal(Map<String, ?> principalConfigs) will be used to build
individual Principals from the passed-in configs. Each Principal can be of a
different type, and the PrincipalBuilder is responsible for handling those
configs correctly and building those Principals.
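
As a standalone illustration (just a sketch against the proposed method, not
code from the KIP, and the config keys used here are made up), a custom
builder could look something like this:

import java.security.Principal
import java.util.{Map => JMap}

// Sketch of a custom Principal that carries more than just a name.
case class ServicePrincipal(name: String, team: String) extends Principal {
  override def getName: String = name
}

// Sketch of the proposed buildPrincipal(Map<String, ?>) method. The keys
// "principal.name" and "principal.team" are hypothetical; each builder
// decides which configs it understands and which Principal type it returns.
class SamplePrincipalBuilder {
  def buildPrincipal(principalConfigs: JMap[String, _]): Principal = {
    val name = principalConfigs.get("principal.name").toString
    val team = principalConfigs.get("principal.team").toString
    ServicePrincipal(name, team)
  }
}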

2. Is there any specific reason that we should put the
channelPrincipal in KafkaPrincipal class instead of the Session class? If
they work equally well to serve the use-case of this KIP, then it seems
better to put this field in the Session class to avoid changing interface
that needs to be implemented by custom principal.
-> Doing this might be backwards incompatible, as we need to
preserve the existing behavior of kafka-acls.sh. Also, KafkaPrincipal has a
PrincipalType field that could be used in the future if Kafka decides to support
different Principal types (currently it just says "User"), and we might lose
that functionality.
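
Roughly, the shape I have in mind is the following (a simplified sketch, not
the exact class in the KIP): KafkaPrincipal keeps its existing principalType
and name, so kafka-acls.sh and the current string form keep working, and it
additionally carries the Principal built by the PrincipalBuilder.

import java.security.Principal

// Simplified sketch: principalType (currently always "User") and name are
// preserved for kafka-acls.sh compatibility, and the opaque Principal from
// the PrincipalBuilder rides along so a custom Authorizer can reach its
// extra fields.
class SketchKafkaPrincipal(val principalType: String,
                           val name: String,
                           val channelPrincipal: Principal) extends Principal {
  override def getName: String = name
  override def toString: String = principalType + ":" + name
}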

Thanks,

Mayuresh


On Tue, Jan 24, 2017 at 3:35 PM, Dong Lin <lindon...@gmail.com> wrote:

> Hey Mayuresh,
>
> Thanks for the KIP. I actually like the suggestions by Ismael and Jun. Here
> are my comments:
>
> 1. I am not sure we need to add the method buildPrincipal(Map<String, ?>
> principalConfigs). It seems that user can simply do
> principalBuilder.configure(...).buildPrincipal(...) without using that
> method.
>
> 2. Is there any reason specific reason that we should put the
> channelPrincipal in KafkaPrincipal class instead of the Session class? If
> they work equally well to serve the use-case of this KIP, then it seems
> better to put this field in the Session class to avoid changing interface
> that needs to be implemented by custom principal.
>
> Dong
>
>
> On Mon, Jan 23, 2017 at 5:55 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi Rajini,
> >
> > Thanks a lot for the review. Please see the comments inline :
> >
> > It feels like the goal is to expose custom Principal as an
> > opaque object between PrincipalBuilder and Authorizer so that Kafka
> doesn't
> > really need to know anything about additional stuff added for
> > customization. But kafka-acls.sh is expecting a key-value map from which
> > Principal is constructed. This is a breaking change to the
> PrincipalBuilder
> > interface - and I am not sure what it achieves.
> > -> kafka-acls is a commandline tool where in currently we just
> specify
> > the "names" of the principal that are allowed or denied.
> > The Principal generated by PrincipalBuilder is still opaque and Kafka as
> > such does not need to know the details.
> > The key-value map that is been passed in, will be used specifically by
> the
> > user PrincipalBuilder to create the Principal. The main motivation of the
> > KIP is that, the Principal built by the PrincipalBuilder can have other
> > fields apart from the "name", which are ignored currently. Allowing a
> > key-value pair to be passed in will enable the PrincipalBuilder to create
> > such type of Principal.
> >
> > 1. A custom Principal is (a) created during authentication using custom
> > PrincipalBuilder (b) checked during authorization using
> Principal.equals()
> > and (c) stored in Zookeeper using Principal.toString(). Is that correct?
> > -> The authorization will be done as per the user supplied
> Authorizer.
> > As not everyone might be using zookeeper for storing ACLs, its storage is
> > again Authorizer  implementation dependent.
> >
> > 2. Is the reason for the new parameters in kafka-acls.sh and the breaking
> > change in PrincipalBuilder interface to enable users to specify a
> Principal
> > using properties rather than create the String in 1c) themselves?
> > -> Please see the explanation above.
> >
> > 3. Since the purpose of the new PrincipalBuilder method
> > buildPrincipal(Map<String,
> > ?> principalConfigs) is to create a new Principal from command line
> > parameters, perhaps Properties or Map<String, String> would be more
> > appropriate?
> > -> Yes we can, but I actually prefer to keep it similar to
> > configure(Map<String, ?> configs) API.
> >
> >
> > Hi Ismael,
> >
> > Thanks a lot for the review. Please see the comments inline.
> >
> > 1. PrincipalBuilder implements Configurable and ge

Re: [DISCUSS] KIP-111 : Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-01-23 Thread Mayuresh Gharat
Hi Rajini,

Thanks a lot for the review. Please see the comments inline :

It feels like the goal is to expose custom Principal as an
opaque object between PrincipalBuilder and Authorizer so that Kafka doesn't
really need to know anything about additional stuff added for
customization. But kafka-acls.sh is expecting a key-value map from which
Principal is constructed. This is a breaking change to the PrincipalBuilder
interface - and I am not sure what it achieves.
-> kafka-acls.sh is a command-line tool in which we currently just specify
the "names" of the principals that are allowed or denied.
The Principal generated by the PrincipalBuilder is still opaque, and Kafka as
such does not need to know the details.
The key-value map that is being passed in will be used specifically by the
user's PrincipalBuilder to create the Principal. The main motivation of the
KIP is that the Principal built by the PrincipalBuilder can have other
fields apart from the "name", which are currently ignored. Allowing
key-value pairs to be passed in will enable the PrincipalBuilder to create
such Principals.

1. A custom Principal is (a) created during authentication using custom
PrincipalBuilder (b) checked during authorization using Principal.equals()
and (c) stored in Zookeeper using Principal.toString(). Is that correct?
-> The authorization will be done by the user-supplied Authorizer.
Since not everyone might be using Zookeeper for storing ACLs, the storage is
again dependent on the Authorizer implementation.
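
For illustration only (a standalone sketch, not the real Authorizer trait,
with a purely in-memory ACL store), an Authorizer paired with a custom
PrincipalBuilder could key its decisions off the extra fields of its own
Principal type:

import java.security.Principal

// Hypothetical custom Principal with an extra "group" field.
case class GroupPrincipal(name: String, group: String) extends Principal {
  override def getName: String = name
}

// Standalone sketch of the authorization check. The ACLs live in a plain
// in-memory set only for this example; the KIP leaves storage up to the
// Authorizer implementation (Zookeeper is just one option).
object SketchGroupAuthorizer {
  private val groupsAllowedToWrite = Set("data-infra", "payments")

  def authorize(principal: Principal, operation: String): Boolean =
    principal match {
      case GroupPrincipal(_, group) =>
        operation != "Write" || groupsAllowedToWrite.contains(group)
      case _ => false
    }
}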

2. Is the reason for the new parameters in kafka-acls.sh and the breaking
change in PrincipalBuilder interface to enable users to specify a Principal
using properties rather than create the String in 1c) themselves?
-> Please see the explanation above.

3. Since the purpose of the new PrincipalBuilder method
buildPrincipal(Map<String,
?> principalConfigs) is to create a new Principal from command line
parameters, perhaps Properties or Map<String, String> would be more
appropriate?
-> Yes, we could, but I actually prefer to keep it similar to the
configure(Map<String, ?> configs) API.


Hi Ismael,

Thanks a lot for the review. Please see the comments inline.

1. PrincipalBuilder implements Configurable and gets a map of properties
via the `configure` method. Do we really need a new `buildPrincipal` method
given that?
--> The configure() API will actually be used to configure the
PrincipalBuilder in the same way as the Authorizer. The buildPrincipal()
API will be used by the PrincipalBuilder to build individual principals.
Each of these principals can be of different custom types like
GroupPrincipals, ServicePrincipals and so on, based on the Map<String, ?>
principalConfigs provided to the buildPrincipal() API.
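
To sketch the intended split (illustrative only; all config keys below are
made up for the example): configure() sets up the builder once, and
buildPrincipal() is then called with per-principal key-value configs.

import java.security.Principal
import java.util.{Map => JMap}

// Minimal Principal used only for this sketch.
case class SimplePrincipal(principalType: String, name: String) extends Principal {
  override def getName: String = name
}

// Sketch of the configure() / buildPrincipal() split: configure() is called
// once with the builder's own configs (just like the Authorizer), while
// buildPrincipal() is called per principal with its key-value configs.
class SketchPrincipalBuilder {
  private var defaultType: String = "User"

  def configure(configs: JMap[String, _]): Unit = {
    val t = configs.get("principal.default.type")
    if (t != null) defaultType = t.toString
  }

  def buildPrincipal(principalConfigs: JMap[String, _]): Principal = {
    val declared = principalConfigs.get("principal.type")
    val principalType = if (declared != null) declared.toString else defaultType
    SimplePrincipal(principalType, principalConfigs.get("principal.name").toString)
  }
}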

2. Jun suggested in the JIRA that it may make sense to pass the
`channelPrincipal` as a field in `Session` instead of `KafkaPrincipal`. It
would be good to understand why this was rejected.
-> Now I understand what Jun meant by "Perhaps, we could extend the
Session object with channelPrincipal instead." Thinking more about this,
there is a PrincipalType field in KafkaPrincipal that was added for a specific
purpose when the class was first created, I think, and I thought we should
preserve it in case it is useful in the future.

Thanks,

Mayuresh





On Mon, Jan 23, 2017 at 8:56 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> Hi Mayuresh,
>
> Thanks for updating the KIP. A couple of questions:
>
> 1. PrincipalBuilder implements Configurable and gets a map of properties
> via the `configure` method. Do we really need a new `buildPrincipal` method
> given that?
>
> 2. Jun suggested in the JIRA that it may make sense to pass the
> `channelPrincipal` as a field in `Session` instead of `KafkaPrincipal`. It
> would be good to understand why this was rejected.
>
> Ismael
>
> On Thu, Jan 12, 2017 at 7:07 PM, Ismael Juma <ism...@juma.me.uk> wrote:
>
> > Hi Mayuresh,
> >
> > Thanks for the KIP. A quick comment before I do a more detailed analysis,
> > the KIP says:
> >
> > `This KIP is a pure addition to existing functionality and does not
> > include any backward incompatible changes.`
> >
> > However, the KIP is proposing the addition of a method to the
> > PrincipalBuilder pluggable interface, which is not a compatible change.
> > Existing implementations would no longer compile, for example. It would
> be
> > good to make this clear in the KIP.
> >
> > Ismael
> >
> > On Thu, Jan 12, 2017 at 5:44 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com> wrote:
> >
> >> Hi all.
> >>
> >> We created KIP-111 to propose that Kafka should preserve the Principal
> >> generated by the PrincipalBuilder while processing the request 

Re: [DISCUSS] KIP-111 : Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-01-20 Thread Mayuresh Gharat
Hi,

Just wanted to check whether anyone still has any concerns with this KIP.
If there are none, I would like to put it to a vote soon.

Thanks,

Mayuresh

On Thu, Jan 12, 2017 at 11:21 AM, Mayuresh Gharat <
gharatmayures...@gmail.com> wrote:

> Hi Ismael,
>
> Fair point. I will update it.
>
> Thanks,
>
> Mayuresh
>
> On Thu, Jan 12, 2017 at 11:07 AM, Ismael Juma <ism...@juma.me.uk> wrote:
>
>> Hi Mayuresh,
>>
>> Thanks for the KIP. A quick comment before I do a more detailed analysis,
>> the KIP says:
>>
>> `This KIP is a pure addition to existing functionality and does not
>> include
>> any backward incompatible changes.`
>>
>> However, the KIP is proposing the addition of a method to the
>> PrincipalBuilder pluggable interface, which is not a compatible change.
>> Existing implementations would no longer compile, for example. It would be
>> good to make this clear in the KIP.
>>
>> Ismael
>>
>> On Thu, Jan 12, 2017 at 5:44 PM, Mayuresh Gharat <
>> gharatmayures...@gmail.com
>> > wrote:
>>
>> > Hi all.
>> >
>> > We created KIP-111 to propose that Kafka should preserve the Principal
>> > generated by the PrincipalBuilder while processing the request received
>> on
>> > socket channel, on the broker.
>> >
>> > Please find the KIP wiki in the link
>> > https://cwiki.apache.org/confluence/pages/viewpage.action?
>> pageId=67638388.
>> > We would love to hear your comments and suggestions.
>> >
>> >
>> > Thanks,
>> >
>> > Mayuresh
>> >
>>
>
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-111 : Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-01-12 Thread Mayuresh Gharat
Hi Ismael,

Fair point. I will update it.

Thanks,

Mayuresh

On Thu, Jan 12, 2017 at 11:07 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> Hi Mayuresh,
>
> Thanks for the KIP. A quick comment before I do a more detailed analysis,
> the KIP says:
>
> `This KIP is a pure addition to existing functionality and does not include
> any backward incompatible changes.`
>
> However, the KIP is proposing the addition of a method to the
> PrincipalBuilder pluggable interface, which is not a compatible change.
> Existing implementations would no longer compile, for example. It would be
> good to make this clear in the KIP.
>
> Ismael
>
> On Thu, Jan 12, 2017 at 5:44 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi all.
> >
> > We created KIP-111 to propose that Kafka should preserve the Principal
> > generated by the PrincipalBuilder while processing the request received
> on
> > socket channel, on the broker.
> >
> > Please find the KIP wiki in the link
> > https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=67638388.
> > We would love to hear your comments and suggestions.
> >
> >
> > Thanks,
> >
> > Mayuresh
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [VOTE] KIP-107 Add purgeDataBefore() API in AdminClient

2017-01-12 Thread Mayuresh Gharat
+1 (non-binding).

Thanks,

Mayuresh

On Wed, Jan 11, 2017 at 10:11 PM, radai  wrote:

> LGTM, +1
>
> On Wed, Jan 11, 2017 at 1:01 PM, Dong Lin  wrote:
>
> > Hi all,
> >
> > It seems that there is no further concern with the KIP-107. At this point
> > we would like to start the voting process. The KIP can be found at
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-107
> > %3A+Add+purgeDataBefore%28%29+API+in+AdminClient.
> >
> > Thanks,
> > Dong
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


[DISCUSS] KIP-111 : Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-01-12 Thread Mayuresh Gharat
Hi all.

We created KIP-111 to propose that Kafka should preserve the Principal
generated by the PrincipalBuilder while processing the request received on
socket channel, on the broker.

Please find the KIP wiki in the link
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=67638388.
We would love to hear your comments and suggestions.


Thanks,

Mayuresh


Re: [VOTE] KIP-107: Add purgeDataBefore() API in AdminClient

2017-01-12 Thread Mayuresh Gharat
+1 (non-binding)

Thanks,

Mayuresh

On Wed, Jan 11, 2017 at 1:03 PM, Dong Lin  wrote:

> Sorry for the duplicated email. It seems that gmail will put the voting
> email in this thread if I simply replace DISCUSS with VOTE in the subject.
>
> On Wed, Jan 11, 2017 at 12:57 PM, Dong Lin  wrote:
>
> > Hi all,
> >
> > It seems that there is no further concern with the KIP-107. At this point
> > we would like to start the voting process. The KIP can be found at
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-107
> > %3A+Add+purgeDataBefore%28%29+API+in+AdminClient.
> >
> > Thanks,
> > Dong
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: [DISCUSS] KIP-107: Add purgeDataBefore() API in AdminClient

2017-01-11 Thread Mayuresh Gharat
Hi Dong,

Regarding "If the message's offset is below low_watermark, then it should
have been deleted by the log retention policy."
---> I am not sure if I understand this correctly. Do you mean that the
low_watermark will be updated only when log retention fires on the
broker?

Thanks,

Mayuresh

On Tue, Jan 10, 2017 at 2:56 PM, Dong Lin <lindon...@gmail.com> wrote:

> Bump up. I am going to initiate the vote If there is no further concern
> with the KIP.
>
> On Fri, Jan 6, 2017 at 11:23 PM, Dong Lin <lindon...@gmail.com> wrote:
>
> > Hey Mayuresh,
> >
> > Thanks for the comment. If the message's offset is below low_watermark,
> > then it should have been deleted by log retention policy. Thus it is OK
> not
> > to expose this message to consumer. Does this answer your question?
> >
> > Thanks,
> > Dong
> >
> > On Fri, Jan 6, 2017 at 4:21 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com> wrote:
> >
> >> Hi Dong,
> >>
> >> Thanks for the KIP.
> >>
> >> I had a question (which might have been answered before).
> >>
> >> 1) The KIP says that the low_water_mark will be updated periodically by
> >> the
> >> broker like high_water_mark.
> >> Essentially we want to use low_water_mark for cases where an entire
> >> segment
> >> cannot be deleted because may be the segment_start_offset < PurgeOffset
> <
> >> segment_end_offset, in which case we will set the low_water_mark to
> >> PurgeOffset+1.
> >>
> >> 2) The KIP also says that messages below low_water_mark will not be
> >> exposed
> >> for consumers, which does make sense since we want say that data below
> >> low_water_mark is purged.
> >>
> >> Looking at above conditions, does it make sense not to update the
> >> low_water_mark periodically but only on PurgeRequest?
> >> The reason being, if we update it periodically then as per 2) we will
> not
> >> be allowing consumers to re-consume data that is not purged but is below
> >> low_water_mark.
> >>
> >> Thanks,
> >>
> >> Mayuresh
> >>
> >>
> >> On Fri, Jan 6, 2017 at 11:18 AM, Dong Lin <lindon...@gmail.com> wrote:
> >>
> >> > Hey Jun,
> >> >
> >> > Thanks for reviewing the KIP!
> >> >
> >> > 1. The low_watermark will be checkpointed in a new file named
> >> >  "replication-low-watermark-checkpoint". It will have the same format
> >> as
> >> > the existing replication-offset-checkpoint file. This allows us the
> keep
> >> > the existing format of checkpoint files which maps TopicPartition to
> >> Long.
> >> > I just updated the "Public Interface" section in the KIP wiki to
> explain
> >> > this file.
> >> >
> >> > 2. I think using low_watermark from leader to trigger log retention in
> >> the
> >> > follower will work correctly in the sense that all messages with
> offset
> >> <
> >> > low_watermark can be deleted. But I am not sure that the efficiency is
> >> the
> >> > same, i.e. offset of messages which should be deleted (i.e. due to
> time
> >> or
> >> > size-based log retention policy) will be smaller than low_watermark
> from
> >> > the leader.
> >> >
> >> > For example, say both the follower and the leader have messages with
> >> > offsets in range [0, 2000]. If the follower does log rolling slightly
> >> later
> >> > than leader, the segments on follower would be [0, 1001], [1002, 2000]
> >> and
> >> > segments on leader would be [0, 1000], [1001, 2000]. After leader
> >> deletes
> >> > the first segment, the low_watermark would be 1001. Thus the first
> >> segment
> >> > would stay on follower's disk unnecessarily which may double disk
> usage
> >> at
> >> > worst.
> >> >
> >> > Since this approach doesn't save us much, I am inclined to not include
> >> this
> >> > change to keep the KIP simple.
> >> >
> >> > Dong
> >> >
> >> >
> >> >
> >> > On Fri, Jan 6, 2017 at 10:05 AM, Jun Rao <j...@confluent.io> wrote:
> >> >
> >> > > Hi, Dong,
> >> > >
> >> > > Thanks for the proposal. Looks good overall. A couple of comments.
> >> > >
> >>
