Re: [DISCUSS] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-07-23 Thread Jason Gustafson
Hey Vahid,

Thanks for the updates. Just to clarify, I was suggesting that we disable
enable.auto.commit only if no explicit group.id is configured. If an
explicit empty string is configured for the group id, then maybe we keep
the current behavior for compatibility. We can log a warning mentioning the
deprecation and we can use the old version of the OffsetCommit API that
allows the empty group id. In a later release, we can drop this support in
the client. Does that seem reasonable?
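To make the intended end state concrete, here is a rough sketch (illustrative
only; the broker address, topic, and class name are placeholders, and the
exact failure mode is whatever we settle on) of a consumer that never
configures group.id and therefore neither joins a group nor commits offsets:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class GrouplessConsumerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            // No group.id and no enable.auto.commit: the consumer assigns partitions
            // manually and falls back to auto.offset.reset when there is no committed offset.
            props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.assign(Collections.singletonList(new TopicPartition("my-topic", 0)));
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
                // Under the proposal, an explicit consumer.commitSync() here would fail
                // with an error rather than silently committing under the empty group id.
            }
        }
    }

In other words, reading without a group stays a one-line opt-out, while
committing without a group fails loudly instead of quietly using "".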

By the way, instead of using the new ILLEGAL_OFFSET_COMMIT error code,
couldn't we use INVALID_GROUP_ID?

Thanks,
Jason



On Mon, Jul 23, 2018 at 5:14 PM, Stanislav Kozlovski  wrote:

> Hey Vahid,
>
> No, I don't see an issue with it. I believe it to be the best approach.
>
> Best,
> Stanislav
>
> On Mon, Jul 23, 2018 at 12:41 PM Vahid S Hashemian <
> vahidhashem...@us.ibm.com> wrote:
>
> > Hi Stanislav,
> >
> > Thanks for the feedback.
> > Do you see an issue with using `null` as the default group id (as
> > addressed by Jason in his response)?
> > This default group id would not support offset commits and consumers
> would
> > use `auto.offset.reset` config when there is no current offset.
> >
> > Thanks.
> > --Vahid
> >
> >
> >
> >
> > From:   Stanislav Kozlovski 
> > To: dev@kafka.apache.org
> > Date:   07/20/2018 11:09 AM
> > Subject:Re: [DISCUSS] KIP-289: Improve the default group id
> > behavior in KafkaConsumer
> >
> >
> >
> > I agree with Jason's notion that
> > >  implicit use of the empty group.id to commit offsets is more likely
> to
> > be causing users unexpected problems than actually providing a useful
> > capability.
> > I was initially confused that this is the behavior when investigating a
> > new-ish JIRA issue <
> > https://issues.apache.org/jira/browse/KAFKA-6758
> > > about
> > the same topic.
> > So, +1 to deprecating "" as a group.id
> >
> > The question after that becomes what the *default* value should be -
> > should
> > we:
> > a) treat an unconfigured group.id consumer as a sort of intermittent
> > consumer where you don't store offsets at all (thereby making the user
> > explicitly sign up for them)
> > b) have a default value which makes use of them? I sort of like the
> > former.
> >
> > @Dhruvil, thinking about it at a high-level - yes. I can't think of a
> > situation where it makes sense to name something an empty string as far
> as
> > I'm aware - to me it seems like potential for confusion
> >
> >
> > On Fri, Jul 20, 2018 at 10:22 AM Rajini Sivaram  >
> > wrote:
> >
> > > +1 to deprecate use of "" as group.id since it is odd to have a
> resource
> > > name that you cannot set ACLs for. Agree, we have to support older
> > clients
> > > though.
> > >
> > > Thanks,
> > >
> > > Rajini
> > >
> > > On Fri, Jul 20, 2018 at 5:25 PM, Jason Gustafson 
> > > wrote:
> > >
> > > > Hi Vahid,
> > > >
> > > > Sorry for getting to this so late. I think there are two things here:
> > > >
> > > > 1. The use of "" as a groupId has always been a dubious practice at
> > best.
> > > > We definitely ought to deprecate its use in the client. Perhaps in
> the
> > > next
> > > > major release, we can remove support completely. However, since older
> > > > clients depend on it, we may have to continue letting the broker
> > support
> > > it
> > > > to some extent. Perhaps we just need to bump the OffsetCommit request
> > API
> > > > and only accept the offset commit for older versions. You probably
> > have
> > > to
> > > > do this anyway if you want to introduce the new error code since old
> > > > clients will not expect it.
> > > >
> > > > 2. There should be a way for the consumer to indicate that it has no
> > > group
> > > > id and will not commit offsets. This is an explicit instruction that
> > the
> > > > consumer should not bother with coordinator lookup and such. We
> > currently
> > > > have some brittle logic in place to let users avoid the coordinator
> > > lookup,
> > > > but it is a bit error-prone. I was hoping that we could change the
> > > default
> > > > value of group.id to be null so that the user had to take an
> explicit
> > > > action to opt into coordinator management (groups or offsets).
> > However,
> > > it
> > > > is true that some users may be unknowingly depending on offset
> storage
> > if
> > > > they are using both the default group.id and the default
> > > > enable.auto.commit. Perhaps one option is to disable
> > enable.auto.commit
> > > > automatically if no group.id is specified? I am not sure if there
> are
> > > any
> > > > drawbacks, but my feeling is that implicit use of the empty group.id
> > to
> > > > commit offsets is more likely to be causing users unexpected problems
> > > than
> > > > actually providing a useful capability.
> > > >
> > > > Thoughts?
> > > >
> > > > Thanks,
> > > > Jason
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, May 28, 2018 at 9:50 AM, Vahid S Hashemian <
> > > > vahidhashem...@us.ibm.com> wrote:
> > > >
> > > > > Hi Viktor,
> > > > 

Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-07-23 Thread Lucas Wang
Thanks for the comment, Becket.
So far, we've been trying to avoid making any request handler thread
special.
But if we were to follow that path in order to make the two planes more
isolated,
what do you think about also having a dedicated processor thread,
and dedicated port for the controller?

Today one processor thread can handle multiple connections, let's say 100
connections represented by connection0, ... connection99, among which
connection0-98 are from clients, while connection99 is from the controller.
Further, let's say that after one selector poll there are incoming requests
on all connections.

When the request queue is full (either the data request queue being full in
the two-queue design, or the one single queue being full in the deque
design), the processor thread will be blocked first when trying to enqueue
the data request from connection0, then possibly blocked for the data
request from connection1, etc., even though the controller request is ready
to be enqueued.

To solve this problem, it seems we would need a separate port dedicated to
the controller, a dedicated processor thread, a dedicated controller
request queue,
and pinning of one request handler thread for controller requests.
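To make the head-of-line blocking above easier to see, here is a toy model
(not actual broker code; the class and method names are made up) of a
processor thread funneling all of its connections into one bounded queue:

    import java.util.List;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Toy model of a single processor thread: completed receives from all of its
    // connections go into one bounded queue. When the queue is full, put() blocks
    // on the first data request, so a controller request sitting behind it on
    // connection99 has to wait as well, regardless of how it is classified later.
    class ProcessorModel {
        private final BlockingQueue<String> requestQueue = new ArrayBlockingQueue<>(500);

        void enqueueCompletedReceives(List<String> completedReceives) throws InterruptedException {
            for (String request : completedReceives) {   // connection0 ... connection99
                requestQueue.put(request);               // blocks while the queue is full
            }
        }
    }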

Thanks,
Lucas


On Mon, Jul 23, 2018 at 6:00 PM, Becket Qin  wrote:

> Personally I am not fond of the dequeue approach simply because it is
> against the basic idea of isolating the controller plane and data plane.
> With a single dequeue, theoretically speaking the controller requests can
> starve the client requests. I would prefer the approach with a separate
> controller request queue and a dedicated controller request handler thread.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Tue, Jul 24, 2018 at 8:16 AM, Lucas Wang  wrote:
>
> > Sure, I can summarize the usage of correlation id. But before I do that,
> it
> > seems
> > the same out-of-order processing can also happen to Produce requests sent
> > by producers,
> > following the same example you described earlier.
> > If that's the case, I think this probably deserves a separate doc and
> > design independent of this KIP.
> >
> > Lucas
> >
> >
> >
> > On Mon, Jul 23, 2018 at 12:39 PM, Dong Lin  wrote:
> >
> > > Hey Lucas,
> > >
> > > Could you update the KIP if you are confident with the approach which
> > uses
> > > correlation id? The idea around correlation id is kind of scattered
> > across
> > > multiple emails. It will be useful if other reviewers can read the KIP to
> > > understand the latest proposal.
> > >
> > > Thanks,
> > > Dong
> > >
> > > On Mon, Jul 23, 2018 at 12:32 PM, Mayuresh Gharat <
> > > gharatmayures...@gmail.com> wrote:
> > >
> > > > I like the idea of the dequeue implementation by Lucas. This will
> help
> > us
> > > > avoid additional queue for controller and additional configs in
> Kafka.
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > > > On Sun, Jul 22, 2018 at 2:58 AM Becket Qin 
> > wrote:
> > > >
> > > > > Hi Jun,
> > > > >
> > > > > The usage of correlation ID might still be useful to address the
> > cases
> > > > > that the controller epoch and leader epoch check are not sufficient
> > to
> > > > > guarantee correct behavior. For example, if the controller sends a
> > > > > LeaderAndIsrRequest followed by a StopReplicaRequest, and the
> broker
> > > > > processes it in the reverse order, the replica may still be wrongly
> > > > > recreated, right?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > > On Jul 22, 2018, at 11:47 AM, Jun Rao  wrote:
> > > > > >
> > > > > > Hmm, since we already use controller epoch and leader epoch for
> > > > properly
> > > > > > caching the latest partition state, do we really need correlation
> > id
> > > > for
> > > > > > ordering the controller requests?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Fri, Jul 20, 2018 at 2:18 PM, Becket Qin <
> becket@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > >> Lucas and Mayuresh,
> > > > > >>
> > > > > >> Good idea. The correlation id should work.
> > > > > >>
> > > > > >> In the ControllerChannelManager, a request will be resent until
> a
> > > > > response
> > > > > >> is received. So if the controller to broker connection
> disconnects
> > > > after
> > > > > >> controller sends R1_a, but before the response of R1_a is
> > received,
> > > a
> > > > > >> disconnection may cause the controller to resend R1_b. i.e.
> until
> > R1
> > > > is
> > > > > >> acked, R2 won't be sent by the controller.
> > > > > >> This gives two guarantees:
> > > > > >> 1. Correlation id wise: R1_a < R1_b < R2.
> > > > > >> 2. On the broker side, when R2 is seen, R1 must have been
> > processed
> > > at
> > > > > >> least once.
> > > > > >>
> > > > > >> So on the broker side, with a single thread controller request
> > > > handler,
> > > > > the
> > > > > >> logic should be:
> > > > > >> 1. Process what ever request seen in the controller request
> queue
> 

Re: [DISCUSS] KIP-346 - Limit blast radius of log compaction failure

2018-07-23 Thread Stanislav Kozlovski
Hi Ted,

Yes, absolutely. Thanks for pointing that out!

On Mon, Jul 23, 2018 at 6:12 PM Ted Yu  wrote:

> For `uncleanable-partitions`, should the example include topic name(s) ?
>
> Cheers
>
> On Mon, Jul 23, 2018 at 5:46 PM Stanislav Kozlovski <
> stanis...@confluent.io>
> wrote:
>
> > I renamed the KIP and that changed the link. Sorry about that. Here is
> the
> > new link:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-346+-+Improve+LogCleaner+behavior+on+error
> >
> > On Mon, Jul 23, 2018 at 5:11 PM Stanislav Kozlovski <
> > stanis...@confluent.io>
> > wrote:
> >
> > > Hey group,
> > >
> > > I created a new KIP about making log compaction more fault-tolerant.
> > > Please give it a look here and please share what you think, especially
> in
> > > regards to the points in the "Needs Discussion" paragraph.
> > >
> > > KIP: KIP-346
> > > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-346+-+Limit+blast+radius+of+log+compaction+failure
> > >
> > > --
> > > Best,
> > > Stanislav
> > >
> >
> >
> > --
> > Best,
> > Stanislav
> >
>


-- 
Best,
Stanislav


Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-07-23 Thread Nishanth Pradeep
Sounds good to me too.

As far as deprecating goes -- should the topics() method be removed completely
or should it have a @deprecated annotation for removal in some future
version?
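For illustration, this is roughly the shape Source could take if we keep
topics() but mark it deprecated (a sketch only; whether the new accessor is
topicSet() or topicNames() is exactly what is still open):

    import java.util.Set;

    // Sketch of the TopologyDescription.Source change under discussion; not final API.
    public interface Source {
        /** Existing accessor, kept for compatibility but marked for removal. */
        @Deprecated
        String topics();

        /** Newly added accessor returning the source topic names as a set. */
        Set<String> topicSet();
    }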

Best,
Nishanth Pradeep

On Sun, Jul 22, 2018 at 1:32 PM Matthias J. Sax 
wrote:

> Works for me.
>
> On 7/22/18 9:48 AM, Guozhang Wang wrote:
> > I think I can be convinced with deprecating topics() to keep API minimal.
> >
> > About renaming the others with `XXNames()`: well, to me it feels still
> not
> > very worthy since although it is not a big burden, it seems also not a
> big
> > "return" if we name the newly added function `topicSet()`.
> >
> >
> > Guozhang
> >
> >
> > On Fri, Jul 20, 2018 at 7:38 PM, Nishanth Pradeep  >
> > wrote:
> >
> >> I definitely agree with you on deprecating topics().
> >>
> >> I also think changing the method names for consistency is reasonable,
> since
> >> there is no functionality change. Although, I can be convinced either
> way
> >> on this one.
> >>
> >> Best,
> >> Nishanth Pradeep
> >> On Fri, Jul 20, 2018 at 12:15 PM Matthias J. Sax  >
> >> wrote:
> >>
> >>> I would still deprecate existing `topics()` method. If users need a
> >>> String, they can call `topicSet().toString()`.
> >>>
> >>> It's just a personal preference, because I believe it's good to keep
> the
> >>> API "minimal".
> >>>
> >>> About renaming the other methods: I think it's a very small burden to
> >>> deprecate the existing methods and add them with new names. Also just
> my
> >>> 2 cents.
> >>>
> >>> Would be good to see what others think.
> >>>
> >>>
> >>> -Matthias
> >>>
> >>> On 7/19/18 6:20 PM, Nishanth Pradeep wrote:
>  Understood, Guozhang.
> 
>  Thanks for the help, everyone! I have updated the KIP. Let me know if
> >> you
>  any other thoughts or suggestions.
> 
>  Best,
>  Nishanth Pradeep
> 
>  On Thu, Jul 19, 2018 at 7:33 PM Guozhang Wang 
> >>> wrote:
> 
> > I see.
> >
> > Well, I think if we add a new function like topicSet() it is less
> >>> needed to
> > deprecate topics() as it returns "{topic1, topic2, ..}" which is sort
> >> of
> > non-overlapping in usage with the new API.
> >
> >
> > Guozhang
> >
> > On Thu, Jul 19, 2018 at 5:31 PM, Nishanth Pradeep <
> >>> nishanth...@gmail.com>
> > wrote:
> >
> >> That is what I meant. I will add topicSet() instead of changing the
> >> signature of topics() for compatibility reasons. But should we not
> >> add
> >>> a
> >> @deprecated flag for topics() or do you want to keep it around for
> >> the
> > long
> >> run?
> >>
> >> On Thu, Jul 19, 2018 at 7:27 PM Guozhang Wang 
> > wrote:
> >>
> >>> We cannot change the signature of the function named "topics" from
> >> "String"
> >>> to "Set<String>", as Matthias mentioned it is a compatibility
> >> breaking
> >>> change.
> >>>
> >>> That's why I was proposing to add a new function like "Set<String>
> >>> topicSet()", while keeping "String topics()" as is.
> >>>
> >>> Guozhang
> >>>
> >>> On Thu, Jul 19, 2018 at 5:22 PM, Nishanth Pradeep <
> > nishanth...@gmail.com
> >>>
> >>> wrote:
> >>>
>  Right, adding topicNames() instead of changing the return type of
> >>> topics()
>  in order to preserve backwards compatibility is a good idea. But is
> it
> > not
>  better to deprecate topics() because it would be redundant? In
> our
> >> case,
>  it would only be calling topicNames/topicSet#toString().
> 
>  I still agree that perhaps changing the other API's might be
> >> unnecessary
>  since it's only a name change.
> 
>  I have made the change to the KIP to only add, not change,
> > preexisting
>  APIs. But where do we stand on deprecating topics()?
> 
>  Best,
>  Nishanth Pradeep
> 
>  On Thu, Jul 19, 2018 at 1:44 PM Guozhang Wang  >
> >>> wrote:
> 
> > Personally I'd prefer to keep the deprecation-related changes as
> >> small
> >>> as
> > possible unless they are really necessary, and hence I'd prefer
> to
> >> just
>  add
> >
> > List<String> topicList()  /* or Set<String> topicSet() */
> >
> > in addition to topicPattern to Source, in addition to
>  `topicNameExtractor`
> > to Sink, and leaving the current APIs as-is.
> >
> > Guozhang
> >
> > On Thu, Jul 19, 2018 at 10:36 AM, Matthias J. Sax <
> >>> matth...@confluent.io
> >
> > wrote:
> >
> >> Thanks for updating the KIP.
> >>
> >> The current `Source` interface has a method `String topics()`
> > atm.
>  Thus,
> >> we cannot just add `Set<String> Source#topics()` because this
> > would
> >> replace the existing method and would be an incompatible change.
> >>
> >> I 

Re: Seeing old tombstones in compacted topic

2018-07-23 Thread Ted Yu
Looking at some recent JIRAs, such as KAFKA-6568, which came in after the
release of 0.11.0

Would that possibly be related to what you observed ?

Cheers

On Mon, Jul 23, 2018 at 6:23 PM Mitch Seymour 
wrote:

> Hi all,
>
> We're using version 0.11.0 of Kafka (broker and client), and our Kafka
> Streams app uses a compacted topic for storing its state. Here's the
> output of kafka-topics.sh --describe:
>
> Topic:mytopic
> PartitionCount:32
> ReplicationFactor:2
> Configs:retention.ms=43200,cleanup.policy=compact
>
> The app will write tombstones to this topic when it's finished with a
> certain key. I can see the tombstones using kafkacat
>
> kafkacat -q -b ... -t mytopic -o beginning -c 2 -f"Time: %T, Key: %k,
> Message: %s\n" -Z
>
> Output:
>
> Time: 1530559667357, Key: key1, Message: NULL
> Time: 1530559667466, Key: key2, Message: NULL
>
> Note: the -Z flag in kafkacat prints null values as NULL to make it easier
> to see the tombstones. Anyways, the timestamps on these topics are from
> GMT: Monday, July 2, 2018, so I'm not sure why the tombstones still exist
> in this topic.
>
> Furthermore, it looks like compaction is being triggered because I'm seeing
> this in the logs:
>
> discarding tombstones prior to Thu Jul 19 14:15:17 GMT 2018.
>
> However, I still see tombstones in this topic from Jul 2, so it doesn't add
> up. Another side note: I'm not explicitly setting delete.retention.ms, and
> since the default value is 86400000 ms, or 1 day, I'm not too sure why the
> tombstones are sticking around.
>
> Anyways, has anyone experienced this before? I'm not sure if this is a
> known bug, or if there's something peculiar with our own setup. Thanks,
>
> Mitch
>


Seeing old tombstones in compacted topic

2018-07-23 Thread Mitch Seymour
Hi all,

We're using version 0.11.0 of Kafka (broker and client), and our Kafka
Streams app uses a compacted topic for storing its state. Here's the
output of kafka-topics.sh --describe:

Topic:mytopic
PartitionCount:32
ReplicationFactor:2
Configs:retention.ms=43200,cleanup.policy=compact

The app will write tombstones to this topic when it's finished with a
certain key. I can see the tombstones using kafkacat

kafkacat -q -b ... -t mytopic -o beginning -c 2 -f"Time: %T, Key: %k,
Message: %s\n" -Z

Output:

Time: 1530559667357, Key: key1, Message: NULL
Time: 1530559667466, Key: key2, Message: NULL

Note: the -Z flag in kafkacat prints null values as NULL to make it easier
to see the tombstones. Anyways, the timestamps on these topics are from
GMT: Monday, July 2, 2018, so I'm not sure why the tombstones still exist
in this topic.

Furthermore, it looks like compaction is being triggered because I'm seeing
this in the logs:

discarding tombstones prior to Thu Jul 19 14:15:17 GMT 2018.

However, I still see tombstones in this topic from Jul 2, so it doesn't add
up. Another side note: I'm not explicitly setting delete.retention.ms, and
since the default value is 86400000 ms, or 1 day, I'm not too sure why the
tombstones are sticking around.
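For what it's worth, here is a small sketch of how the effective topic-level
settings (including delete.retention.ms) could be double-checked with the
Java AdminClient; the broker address is a placeholder and the listed config
names are just the ones relevant here:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.Config;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class CheckTopicConfig {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            try (AdminClient admin = AdminClient.create(props)) {
                ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "mytopic");
                Config config = admin.describeConfigs(Collections.singletonList(topic))
                                     .all().get().get(topic);
                for (String name : new String[]{"cleanup.policy", "retention.ms",
                                                "delete.retention.ms", "min.cleanable.dirty.ratio"}) {
                    ConfigEntry entry = config.get(name);
                    System.out.println(name + " = " + (entry == null ? "<not set>" : entry.value()));
                }
            }
        }
    }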

Anyways, has anyone experienced this before? I'm not sure if this is a
known bug, or if there's something peculiar with our own setup. Thanks,

Mitch


Re: [DISCUSS] KIP-346 - Limit blast radius of log compaction failure

2018-07-23 Thread Ted Yu
For `uncleanable-partitions`, should the example include topic name(s) ?

Cheers

On Mon, Jul 23, 2018 at 5:46 PM Stanislav Kozlovski 
wrote:

> I renamed the KIP and that changed the link. Sorry about that. Here is the
> new link:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-346+-+Improve+LogCleaner+behavior+on+error
>
> On Mon, Jul 23, 2018 at 5:11 PM Stanislav Kozlovski <
> stanis...@confluent.io>
> wrote:
>
> > Hey group,
> >
> > I created a new KIP about making log compaction more fault-tolerant.
> > Please give it a look here and please share what you think, especially in
> > regards to the points in the "Needs Discussion" paragraph.
> >
> > KIP: KIP-346
> > <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-346+-+Limit+blast+radius+of+log+compaction+failure
> >
> > --
> > Best,
> > Stanislav
> >
>
>
> --
> Best,
> Stanislav
>


Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-07-23 Thread Becket Qin
Personally I am not fond of the dequeue approach simply because it is
against the basic idea of isolating the controller plane and data plane.
With a single dequeue, theoretically speaking the controller requests can
starve the client requests. I would prefer the approach with a separate
controller request queue and a dedicated controller request handler thread.
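Just to illustrate what I mean (purely a toy sketch, not broker code, and all
names are made up): with a dedicated queue and a dedicated handler thread, a
full data-request queue can never delay controller requests, and vice versa:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Toy sketch of the separate-queue approach: controller requests get their own
    // bounded queue and their own handler thread, independent of the data plane.
    class ControllerPlaneSketch {
        final BlockingQueue<Runnable> dataRequestQueue = new ArrayBlockingQueue<>(500);
        final BlockingQueue<Runnable> controllerRequestQueue = new ArrayBlockingQueue<>(20);

        void start() {
            Thread controllerHandler = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        controllerRequestQueue.take().run();   // only controller requests
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            }, "controller-request-handler");
            controllerHandler.start();
            // the existing pool of request handler threads keeps draining
            // dataRequestQueue in the same way
        }
    }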

Thanks,

Jiangjie (Becket) Qin

On Tue, Jul 24, 2018 at 8:16 AM, Lucas Wang  wrote:

> Sure, I can summarize the usage of correlation id. But before I do that, it
> seems
> the same out-of-order processing can also happen to Produce requests sent
> by producers,
> following the same example you described earlier.
> If that's the case, I think this probably deserves a separate doc and
> design independent of this KIP.
>
> Lucas
>
>
>
> On Mon, Jul 23, 2018 at 12:39 PM, Dong Lin  wrote:
>
> > Hey Lucas,
> >
> > Could you update the KIP if you are confident with the approach which
> uses
> > correlation id? The idea around correlation id is kind of scattered
> across
> > multiple emails. It will be useful if other reviewers can read the KIP to
> > understand the latest proposal.
> >
> > Thanks,
> > Dong
> >
> > On Mon, Jul 23, 2018 at 12:32 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com> wrote:
> >
> > > I like the idea of the dequeue implementation by Lucas. This will help
> us
> > > avoid additional queue for controller and additional configs in Kafka.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Sun, Jul 22, 2018 at 2:58 AM Becket Qin 
> wrote:
> > >
> > > > Hi Jun,
> > > >
> > > > The usage of correlation ID might still be useful to address the
> cases
> > > > that the controller epoch and leader epoch check are not sufficient
> to
> > > > guarantee correct behavior. For example, if the controller sends a
> > > > LeaderAndIsrRequest followed by a StopReplicaRequest, and the broker
> > > > processes it in the reverse order, the replica may still be wrongly
> > > > recreated, right?
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > > On Jul 22, 2018, at 11:47 AM, Jun Rao  wrote:
> > > > >
> > > > > Hmm, since we already use controller epoch and leader epoch for
> > > properly
> > > > > caching the latest partition state, do we really need correlation
> id
> > > for
> > > > > ordering the controller requests?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Fri, Jul 20, 2018 at 2:18 PM, Becket Qin 
> > > > wrote:
> > > > >
> > > > >> Lucas and Mayuresh,
> > > > >>
> > > > >> Good idea. The correlation id should work.
> > > > >>
> > > > >> In the ControllerChannelManager, a request will be resent until a
> > > > response
> > > > >> is received. So if the controller to broker connection disconnects
> > > after
> > > > >> controller sends R1_a, but before the response of R1_a is
> received,
> > a
> > > > >> disconnection may cause the controller to resend R1_b. i.e. until
> R1
> > > is
> > > > >> acked, R2 won't be sent by the controller.
> > > > >> This gives two guarantees:
> > > > >> 1. Correlation id wise: R1_a < R1_b < R2.
> > > > >> 2. On the broker side, when R2 is seen, R1 must have been
> processed
> > at
> > > > >> least once.
> > > > >>
> > > > >> So on the broker side, with a single thread controller request
> > > handler,
> > > > the
> > > > >> logic should be:
> > > > >> 1. Process what ever request seen in the controller request queue
> > > > >> 2. For the given epoch, drop request if its correlation id is
> > smaller
> > > > than
> > > > >> that of the last processed request.
> > > > >>
> > > > >> Thanks,
> > > > >>
> > > > >> Jiangjie (Becket) Qin
> > > > >>
> > > > >> On Fri, Jul 20, 2018 at 8:07 AM, Jun Rao 
> wrote:
> > > > >>
> > > > >>> I agree that there is no strong ordering when there are more than
> > one
> > > > >>> socket connections. Currently, we rely on controllerEpoch and
> > > > leaderEpoch
> > > > >>> to ensure that the receiving broker picks up the latest state for
> > > each
> > > > >>> partition.
> > > > >>>
> > > > >>> One potential issue with the dequeue approach is that if the
> queue
> > is
> > > > >> full,
> > > > >>> there is no guarantee that the controller requests will be
> enqueued
> > > > >>> quickly.
> > > > >>>
> > > > >>> Thanks,
> > > > >>>
> > > > >>> Jun
> > > > >>>
> > > > >>> On Fri, Jul 20, 2018 at 5:25 AM, Mayuresh Gharat <
> > > > >>> gharatmayures...@gmail.com
> > > >  wrote:
> > > > >>>
> > > >  Yea, the correlationId is only set to 0 in the NetworkClient
> > > > >> constructor.
> > > >  Since we reuse the same NetworkClient between Controller and the
> > > > >> broker,
> > > > >>> a
> > > >  disconnection should not cause it to reset to 0, in which case
> it
> > > can
> > > > >> be
> > > >  used to reject obsolete requests.
> > > > 
> > > >  Thanks,
> > > > 
> > > >  Mayuresh
> > > > 
> > > >  On Thu, Jul 19, 2018 at 1:52 PM Lucas Wang <
> lucasatu...@gmail.com
> > >
> > > > >>> 

Re: [DISCUSS] KIP-346 - Limit blast radius of log compaction failure

2018-07-23 Thread Stanislav Kozlovski
I renamed the KIP and that changed the link. Sorry about that. Here is the
new link:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-346+-+Improve+LogCleaner+behavior+on+error

On Mon, Jul 23, 2018 at 5:11 PM Stanislav Kozlovski 
wrote:

> Hey group,
>
> I created a new KIP about making log compaction more fault-tolerant.
> Please give it a look here and please share what you think, especially in
> regards to the points in the "Needs Discussion" paragraph.
>
> KIP: KIP-346
> 
> --
> Best,
> Stanislav
>


-- 
Best,
Stanislav


Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-07-23 Thread Lucas Wang
Sure, I can summarize the usage of correlation id. But before I do that, it
seems
the same out-of-order processing can also happen to Produce requests sent
by producers,
following the same example you described earlier.
If that's the case, I think this probably deserves a separate doc and
design independent of this KIP.

Lucas



On Mon, Jul 23, 2018 at 12:39 PM, Dong Lin  wrote:

> Hey Lucas,
>
> Could you update the KIP if you are confident with the approach which uses
> correlation id? The idea around correlation id is kind of scattered across
> multiple emails. It will be useful if other reviewers can read the KIP to
> understand the latest proposal.
>
> Thanks,
> Dong
>
> On Mon, Jul 23, 2018 at 12:32 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>
> > I like the idea of the dequeue implementation by Lucas. This will help us
> > avoid additional queue for controller and additional configs in Kafka.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Sun, Jul 22, 2018 at 2:58 AM Becket Qin  wrote:
> >
> > > Hi Jun,
> > >
> > > The usage of correlation ID might still be useful to address the cases
> > > that the controller epoch and leader epoch check are not sufficient to
> > > guarantee correct behavior. For example, if the controller sends a
> > > LeaderAndIsrRequest followed by a StopReplicaRequest, and the broker
> > > processes it in the reverse order, the replica may still be wrongly
> > > recreated, right?
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > > On Jul 22, 2018, at 11:47 AM, Jun Rao  wrote:
> > > >
> > > > Hmm, since we already use controller epoch and leader epoch for
> > properly
> > > > caching the latest partition state, do we really need correlation id
> > for
> > > > ordering the controller requests?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Fri, Jul 20, 2018 at 2:18 PM, Becket Qin 
> > > wrote:
> > > >
> > > >> Lucas and Mayuresh,
> > > >>
> > > >> Good idea. The correlation id should work.
> > > >>
> > > >> In the ControllerChannelManager, a request will be resent until a
> > > response
> > > >> is received. So if the controller to broker connection disconnects
> > after
> > > >> controller sends R1_a, but before the response of R1_a is received,
> a
> > > >> disconnection may cause the controller to resend R1_b. i.e. until R1
> > is
> > > >> acked, R2 won't be sent by the controller.
> > > >> This gives two guarantees:
> > > >> 1. Correlation id wise: R1_a < R1_b < R2.
> > > >> 2. On the broker side, when R2 is seen, R1 must have been processed
> at
> > > >> least once.
> > > >>
> > > >> So on the broker side, with a single thread controller request
> > handler,
> > > the
> > > >> logic should be:
> > > >> 1. Process what ever request seen in the controller request queue
> > > >> 2. For the given epoch, drop request if its correlation id is
> smaller
> > > than
> > > >> that of the last processed request.
> > > >>
> > > >> Thanks,
> > > >>
> > > >> Jiangjie (Becket) Qin
> > > >>
> > > >> On Fri, Jul 20, 2018 at 8:07 AM, Jun Rao  wrote:
> > > >>
> > > >>> I agree that there is no strong ordering when there are more than
> one
> > > >>> socket connections. Currently, we rely on controllerEpoch and
> > > leaderEpoch
> > > >>> to ensure that the receiving broker picks up the latest state for
> > each
> > > >>> partition.
> > > >>>
> > > >>> One potential issue with the dequeue approach is that if the queue
> is
> > > >> full,
> > > >>> there is no guarantee that the controller requests will be enqueued
> > > >>> quickly.
> > > >>>
> > > >>> Thanks,
> > > >>>
> > > >>> Jun
> > > >>>
> > > >>> On Fri, Jul 20, 2018 at 5:25 AM, Mayuresh Gharat <
> > > >>> gharatmayures...@gmail.com
> > >  wrote:
> > > >>>
> > >  Yea, the correlationId is only set to 0 in the NetworkClient
> > > >> constructor.
> > >  Since we reuse the same NetworkClient between Controller and the
> > > >> broker,
> > > >>> a
> > >  disconnection should not cause it to reset to 0, in which case it
> > can
> > > >> be
> > >  used to reject obsolete requests.
> > > 
> > >  Thanks,
> > > 
> > >  Mayuresh
> > > 
> > >  On Thu, Jul 19, 2018 at 1:52 PM Lucas Wang  >
> > > >>> wrote:
> > > 
> > > > @Dong,
> > > > Great example and explanation, thanks!
> > > >
> > > > @All
> > > > Regarding the example given by Dong, it seems even if we use a
> > queue,
> > >  and a
> > > > dedicated controller request handling thread,
> > > > the same result can still happen because R1_a will be sent on one
> > > > connection, and R1_b & R2 will be sent on a different connection,
> > > > and there is no ordering between different connections on the
> > broker
> > >  side.
> > > > I was discussing with Mayuresh offline, and it seems correlation
> id
> > >  within
> > > > the same NetworkClient object is monotonically increasing and
> never
> > >  reset,
> > > > hence a broker can leverage 

Re: [DISCUSS] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-07-23 Thread Stanislav Kozlovski
Hey Vahid,

No, I don't see an issue with it. I believe it to be the best approach.

Best,
Stanislav

On Mon, Jul 23, 2018 at 12:41 PM Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> Hi Stanislav,
>
> Thanks for the feedback.
> Do you see an issue with using `null` as the default group id (as
> addressed by Jason in his response)?
> This default group id would not support offset commits and consumers would
> use `auto.offset.reset` config when there is no current offset.
>
> Thanks.
> --Vahid
>
>
>
>
> From:   Stanislav Kozlovski 
> To: dev@kafka.apache.org
> Date:   07/20/2018 11:09 AM
> Subject:Re: [DISCUSS] KIP-289: Improve the default group id
> behavior in KafkaConsumer
>
>
>
> I agree with Jason's notion that
> >  implicit use of the empty group.id to commit offsets is more likely to
> be causing users unexpected problems than actually providing a useful
> capability.
> I was initially confused that this is the behavior when investigating a
> new-ish JIRA issue <
> https://issues.apache.org/jira/browse/KAFKA-6758
> > about
> the same topic.
> So, +1 to deprecating "" as a group.id
>
> The question after that becomes what the *default* value should be -
> should
> we:
> a) treat an unconfigured group.id consumer as a sort of intermittent
> consumer where you don't store offsets at all (thereby making the user
> explicitly sign up for them)
> b) have a default value which makes use of them? I sort of like the
> former.
>
> @Dhruvil, thinking about it at a high-level - yes. I can't think of a
> situation where it makes sense to name something an empty string as far as
> I'm aware - to me it seems like potential for confusion
>
>
> On Fri, Jul 20, 2018 at 10:22 AM Rajini Sivaram 
> wrote:
>
> > +1 to deprecate use of "" as group.id since it is odd to have a resource
> > name that you cannot set ACLs for. Agree, we have to support older
> clients
> > though.
> >
> > Thanks,
> >
> > Rajini
> >
> > On Fri, Jul 20, 2018 at 5:25 PM, Jason Gustafson 
> > wrote:
> >
> > > Hi Vahid,
> > >
> > > Sorry for getting to this so late. I think there are two things here:
> > >
> > > 1. The use of "" as a groupId has always been a dubious practice at
> best.
> > > We definitely ought to deprecate its use in the client. Perhaps in the
> > next
> > > major release, we can remove support completely. However, since older
> > > clients depend on it, we may have to continue letting the broker
> support
> > it
> > > to some extent. Perhaps we just need to bump the OffsetCommit request
> API
> > > and only accept the offset commit for older versions. You probably
> have
> > to
> > > do this anyway if you want to introduce the new error code since old
> > > clients will not expect it.
> > >
> > > 2. There should be a way for the consumer to indicate that it has no
> > group
> > > id and will not commit offsets. This is an explicit instruction that
> the
> > > consumer should not bother with coordinator lookup and such. We
> currently
> > > have some brittle logic in place to let users avoid the coordinator
> > lookup,
> > > but it is a bit error-prone. I was hoping that we could change the
> > default
> > > value of group.id to be null so that the user had to take an explicit
> > > action to opt into coordinator management (groups or offsets).
> However,
> > it
> > > is true that some users may be unknowingly depending on offset storage
> if
> > > they are using both the default group.id and the default
> > > enable.auto.commit. Perhaps one option is to disable
> enable.auto.commit
> > > automatically if no group.id is specified? I am not sure if there are
> > any
> > > drawbacks, but my feeling is that implicit use of the empty group.id
> to
> > > commit offsets is more likely to be causing users unexpected problems
> > than
> > > actually providing a useful capability.
> > >
> > > Thoughts?
> > >
> > > Thanks,
> > > Jason
> > >
> > >
> > >
> > >
> > > On Mon, May 28, 2018 at 9:50 AM, Vahid S Hashemian <
> > > vahidhashem...@us.ibm.com> wrote:
> > >
> > > > Hi Viktor,
> > > >
> > > > Thanks for sharing your opinion.
> > > > So you're in favor of disallowing the empty ("") group id altogether
> > > (even
> > > > for fetching).
> > > > Given that ideally no one should be using the empty group id (at
> least
> > in
> > > > a production setting) I think the impact would be minimal in either
> > case.
> > > >
> > > > But as you said, let's hear what others think and I'd be happy to
> > modify
> > > > the KIP if needed.
> > > >
> > > > Regards.
> > > > --Vahid
> > > >
> > > >
> > > >
> > > >
> > > > From:   Viktor Somogyi 
> > > > To: dev 
> > > > Date:   05/28/2018 05:18 AM
> > > > Subject:Re: [DISCUSS] KIP-289: Improve the default group id
> > > > behavior in KafkaConsumer
> > > >
> > > >
> > > >
> > > > Hi Vahid,
> > > >
> > > > (with the argument that using the default group id for offset commit
> > > > should not be the user's intention in practice).
> > > >
> > > > Yea, so in my opinion too this 

[DISCUSS] KIP-346 - Limit blast radius of log compaction failure

2018-07-23 Thread Stanislav Kozlovski
Hey group,

I created a new KIP about making log compaction more fault-tolerant. Please
give it a look here and please share what you think, especially in regards
to the points in the "Needs Discussion" paragraph.

KIP: KIP-346

-- 
Best,
Stanislav


Build failed in Jenkins: kafka-2.0-jdk8 #88

2018-07-23 Thread Apache Jenkins Server
See 

--
[...truncated 2.48 MB...]
org.apache.kafka.streams.TopologyTest > 
shouldNotAllowToAddStateStoreToNonExistingProcessor STARTED

org.apache.kafka.streams.TopologyTest > 
shouldNotAllowToAddStateStoreToNonExistingProcessor PASSED

org.apache.kafka.streams.TopologyTest > 
sourceAndProcessorShouldHaveSingleSubtopology STARTED

org.apache.kafka.streams.TopologyTest > 
sourceAndProcessorShouldHaveSingleSubtopology PASSED

org.apache.kafka.streams.TopologyTest > 
tableNamedMaterializedCountShouldPreserveTopologyStructure STARTED

org.apache.kafka.streams.TopologyTest > 
tableNamedMaterializedCountShouldPreserveTopologyStructure PASSED

org.apache.kafka.streams.TopologyTest > 
shouldNotAllowNullStoreNameWhenConnectingProcessorAndStateStores STARTED

org.apache.kafka.streams.TopologyTest > 
shouldNotAllowNullStoreNameWhenConnectingProcessorAndStateStores PASSED

org.apache.kafka.streams.TopologyTest > 
shouldNotAllowNullNameWhenAddingSourceWithTopic STARTED

org.apache.kafka.streams.TopologyTest > 
shouldNotAllowNullNameWhenAddingSourceWithTopic PASSED

org.apache.kafka.streams.TopologyTest > 
sourceAndProcessorWithStateShouldHaveSingleSubtopology STARTED

org.apache.kafka.streams.TopologyTest > 
sourceAndProcessorWithStateShouldHaveSingleSubtopology PASSED

org.apache.kafka.streams.TopologyTest > shouldNotAllowNullTopicWhenAddingSink 
STARTED

org.apache.kafka.streams.TopologyTest > shouldNotAllowNullTopicWhenAddingSink 
PASSED

org.apache.kafka.streams.TopologyTest > 
shouldNotAllowToAddGlobalStoreWithSourceNameEqualsProcessorName STARTED

org.apache.kafka.streams.TopologyTest > 
shouldNotAllowToAddGlobalStoreWithSourceNameEqualsProcessorName PASSED

org.apache.kafka.streams.TopologyTest > 
singleSourceWithListOfTopicsShouldHaveSingleSubtopology STARTED

org.apache.kafka.streams.TopologyTest > 
singleSourceWithListOfTopicsShouldHaveSingleSubtopology PASSED

org.apache.kafka.streams.TopologyTest > 
sourceWithMultipleProcessorsShouldHaveSingleSubtopology STARTED

org.apache.kafka.streams.TopologyTest > 
sourceWithMultipleProcessorsShouldHaveSingleSubtopology PASSED

org.apache.kafka.streams.TopologyTest > 
shouldNotAllowZeroStoreNameWhenConnectingProcessorAndStateStores STARTED

org.apache.kafka.streams.TopologyTest > 
shouldNotAllowZeroStoreNameWhenConnectingProcessorAndStateStores PASSED

org.apache.kafka.streams.TopologyTest > 
kTableNamedMaterializedMapValuesShouldPreserveTopologyStructure STARTED

org.apache.kafka.streams.TopologyTest > 
kTableNamedMaterializedMapValuesShouldPreserveTopologyStructure PASSED

org.apache.kafka.streams.TopologyTest > 
timeWindowZeroArgCountShouldPreserveTopologyStructure STARTED

org.apache.kafka.streams.TopologyTest > 
timeWindowZeroArgCountShouldPreserveTopologyStructure PASSED

org.apache.kafka.streams.TopologyTest > 
multipleSourcesWithProcessorsShouldHaveDistinctSubtopologies STARTED

org.apache.kafka.streams.TopologyTest > 
multipleSourcesWithProcessorsShouldHaveDistinctSubtopologies PASSED

org.apache.kafka.streams.TopologyTest > shouldThrowOnUnassignedStateStoreAccess 
STARTED

org.apache.kafka.streams.TopologyTest > shouldThrowOnUnassignedStateStoreAccess 
PASSED

org.apache.kafka.streams.TopologyTest > 
shouldNotAllowNullProcessorNameWhenConnectingProcessorAndStateStores STARTED

org.apache.kafka.streams.TopologyTest > 
shouldNotAllowNullProcessorNameWhenConnectingProcessorAndStateStores PASSED

org.apache.kafka.streams.TopologyTest > 
kTableAnonymousMaterializedMapValuesShouldPreserveTopologyStructure STARTED

org.apache.kafka.streams.TopologyTest > 
kTableAnonymousMaterializedMapValuesShouldPreserveTopologyStructure PASSED

org.apache.kafka.streams.TopologyTest > 
sessionWindowZeroArgCountShouldPreserveTopologyStructure STARTED

org.apache.kafka.streams.TopologyTest > 
sessionWindowZeroArgCountShouldPreserveTopologyStructure PASSED

org.apache.kafka.streams.TopologyTest > 
tableAnonymousMaterializedCountShouldPreserveTopologyStructure STARTED

org.apache.kafka.streams.TopologyTest > 
tableAnonymousMaterializedCountShouldPreserveTopologyStructure PASSED

org.apache.kafka.streams.TopologyTest > shouldDescribeGlobalStoreTopology 
STARTED

org.apache.kafka.streams.TopologyTest > shouldDescribeGlobalStoreTopology PASSED

org.apache.kafka.streams.TopologyTest > 
kTableNonMaterializedMapValuesShouldPreserveTopologyStructure STARTED

org.apache.kafka.streams.TopologyTest > 
kTableNonMaterializedMapValuesShouldPreserveTopologyStructure PASSED

org.apache.kafka.streams.TopologyTest > 
kGroupedStreamAnonymousMaterializedCountShouldPreserveTopologyStructure STARTED

org.apache.kafka.streams.TopologyTest > 
kGroupedStreamAnonymousMaterializedCountShouldPreserveTopologyStructure PASSED

org.apache.kafka.streams.TopologyTest > 
shouldDescribeMultipleGlobalStoreTopology STARTED

org.apache.kafka.streams.TopologyTest > 
shouldDescribeMultipleGlobalStoreTopology PASSED


Re: Discussion: New components in JIRA?

2018-07-23 Thread Guozhang Wang
I've just updated the web docs on http://kafka.apache.org/contributing
accordingly.

On Mon, Jul 23, 2018 at 3:30 PM, khaireddine Rezgui <
khaireddine...@gmail.com> wrote:

> Good job Ray for the wiki, it's clear enough.
>
> > On 23 Jul 2018 at 10:17 PM, "Ray Chiang"  wrote:
>
> Okay, I've created a wiki page Reporting Issues in Apache Kafka
> <
> https://cwiki.apache.org/confluence/display/KAFKA/
> Reporting+Issues+in+Apache+Kafka>.
>
> I'd appreciate any feedback.  If this is good enough, I can file a JIRA
> to change the link under "Bugs" in the "Project information" page.
>
>
> -Ray
>
>
> On 7/23/18 11:28 AM, Ray Chiang wrote:
> > Good point.  I'll look into adding some JIRA guidelines to the
> > documentation/wiki.
> >
> > -Ray
> >
> > On 7/22/18 10:23 AM, Guozhang Wang wrote:
> >> Hello Ray,
> >>
> >> Thanks for bringing this up. I'm generally +1 on the first two, while for
> >> the last category, personally I felt leaving them as part of `tools` is
> >> fine, but I'm also open for other opinions.
> >>
> >> A more general question though, is that today we do not have any
> >> guidelines
> >> to ask JIRA reporters to set the right component, i.e. it is purely
> >> best-effort, and we cannot disallow reporters to add any new component
> >> names. And so far the project does not really have a tradition to manage
> >> JIRA reports per-component, as the goal is to not "separate" the project
> >> into silos but recommending everyone to get hands on every aspect of the
> >> project.
> >>
> >>
> >> Guozhang
> >>
> >>
> >> On Fri, Jul 20, 2018 at 2:44 PM, Ray Chiang  wrote:
> >>
> >>> I've been doing a little bit of component cleanup in JIRA.  What do
> >>> people
> >>> think of adding
> >>> one or more of the following components?
> >>>
> >>> - logging: For any consumer/producer/broker logging (i.e. log4j). This
> >>> should help disambiguate from the "log" component (i.e. Kafka
> >>> messages).
> >>>
> >>> - mirrormaker: There are enough requests specific to MirrorMaker
> >>> that it
> >>> could be put into its own component.
> >>>
> >>> - scripts: I'm a little more ambivalent about this one, but any of the
> >>> bin/*.sh script fixes could belong in their own category.  I'm not
> >>> sure if
> >>> other people feel strongly for how the "tools" component should be used
> >>> w.r.t. the run scripts.
> >>>
> >>> Any thoughts?
> >>>
> >>> -Ray
> >>>
> >>>
> >>
> >
>



-- 
-- Guozhang


[jira] [Created] (KAFKA-7196) Remove heartbeat delayed operation for those removed consumers at the end of each rebalance

2018-07-23 Thread Lincong Li (JIRA)
Lincong Li created KAFKA-7196:
-

 Summary: Remove heartbeat delayed operation for those removed 
consumers at the end of each rebalance
 Key: KAFKA-7196
 URL: https://issues.apache.org/jira/browse/KAFKA-7196
 Project: Kafka
  Issue Type: Bug
  Components: core, purgatory
Reporter: Lincong Li


During the consumer group rebalance, when the joining group phase finishes, the 
heartbeat delayed operation of any consumer that failed to rejoin the group 
should be removed from the purgatory. Otherwise, even though the member ID of 
the consumer has been removed from the group, its heartbeat delayed operation 
remains registered in the purgatory; that operation will eventually time out 
and trigger another unnecessary rebalance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Discussion: New components in JIRA?

2018-07-23 Thread khaireddine Rezgui
Good job Ray for the wiki, it's clear enough.

On 23 Jul 2018 at 10:17 PM, "Ray Chiang"  wrote:

Okay, I've created a wiki page Reporting Issues in Apache Kafka
<
https://cwiki.apache.org/confluence/display/KAFKA/Reporting+Issues+in+Apache+Kafka>.

I'd appreciate any feedback.  If this is good enough, I can file a JIRA
to change the link under "Bugs" in the "Project information" page.


-Ray


On 7/23/18 11:28 AM, Ray Chiang wrote:
> Good point.  I'll look into adding some JIRA guidelines to the
> documentation/wiki.
>
> -Ray
>
> On 7/22/18 10:23 AM, Guozhang Wang wrote:
>> Hello Ray,
>>
>> Thanks for bringing this up. I'm generally +1 on the first two, while for
>> the last category, personally I felt leaving them as part of `tools` is
>> fine, but I'm also open for other opinions.
>>
>> A more general question though, is that today we do not have any
>> guidelines
>> to ask JIRA reporters to set the right component, i.e. it is purely
>> best-effort, and we cannot disallow reporters to add any new component
>> names. And so far the project does not really have a tradition to manage
>> JIRA reports per-component, as the goal is to not "separate" the project
>> into silos but recommending everyone to get hands on every aspect of the
>> project.
>>
>>
>> Guozhang
>>
>>
>> On Fri, Jul 20, 2018 at 2:44 PM, Ray Chiang  wrote:
>>
>>> I've been doing a little bit of component cleanup in JIRA.  What do
>>> people
>>> think of adding
>>> one or more of the following components?
>>>
>>> - logging: For any consumer/producer/broker logging (i.e. log4j). This
>>> should help disambiguate from the "log" component (i.e. Kafka
>>> messages).
>>>
>>> - mirrormaker: There are enough requests specific to MirrorMaker
>>> that it
>>> could be put into its own component.
>>>
>>> - scripts: I'm a little more ambivalent about this one, but any of the
>>> bin/*.sh script fixes could belong in their own category.  I'm not
>>> sure if
>>> other people feel strongly for how the "tools" component should be used
>>> w.r.t. the run scripts.
>>>
>>> Any thoughts?
>>>
>>> -Ray
>>>
>>>
>>
>


Build failed in Jenkins: kafka-trunk-jdk10 #310

2018-07-23 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-7193: Use ZooKeeper IP address in streams tests to avoid timeouts

--
[...truncated 1.54 MB...]

kafka.admin.TopicCommandTest > testCreateAlterTopicWithRackAware PASSED

kafka.admin.TopicCommandTest > testTopicDeletion STARTED

kafka.admin.TopicCommandTest > testTopicDeletion PASSED

kafka.admin.TopicCommandTest > testInvalidTopicLevelConfig STARTED

kafka.admin.TopicCommandTest > testInvalidTopicLevelConfig PASSED

kafka.admin.TopicCommandTest > testConfigPreservationAcrossPartitionAlteration 
STARTED

kafka.admin.TopicCommandTest > testConfigPreservationAcrossPartitionAlteration 
PASSED

kafka.admin.TopicCommandTest > testAlterIfExists STARTED

kafka.admin.TopicCommandTest > testAlterIfExists PASSED

kafka.admin.TopicCommandTest > testDescribeAndListTopicsMarkedForDeletion 
STARTED

kafka.admin.TopicCommandTest > testDescribeAndListTopicsMarkedForDeletion PASSED

kafka.admin.TopicCommandTest > testDeleteIfExists STARTED

kafka.admin.TopicCommandTest > testDeleteIfExists PASSED

kafka.admin.PreferredReplicaElectionCommandTest > 
testBasicPreferredReplicaElection STARTED

kafka.admin.PreferredReplicaElectionCommandTest > 
testBasicPreferredReplicaElection PASSED

kafka.admin.PreferredReplicaElectionCommandTest > testPreferredReplicaJsonData 
STARTED

kafka.admin.PreferredReplicaElectionCommandTest > testPreferredReplicaJsonData 
PASSED

kafka.admin.DelegationTokenCommandTest > testDelegationTokenRequests STARTED

kafka.admin.DelegationTokenCommandTest > testDelegationTokenRequests PASSED

kafka.admin.BrokerApiVersionsCommandTest > checkBrokerApiVersionCommandOutput 
STARTED

kafka.admin.BrokerApiVersionsCommandTest > checkBrokerApiVersionCommandOutput 
PASSED

kafka.admin.DeleteTopicTest > testDeleteTopicWithCleaner STARTED

kafka.admin.DeleteTopicTest > testDeleteTopicWithCleaner PASSED

kafka.admin.DeleteTopicTest > testResumeDeleteTopicOnControllerFailover STARTED

kafka.admin.DeleteTopicTest > testResumeDeleteTopicOnControllerFailover PASSED

kafka.admin.DeleteTopicTest > testResumeDeleteTopicWithRecoveredFollower STARTED

kafka.admin.DeleteTopicTest > testResumeDeleteTopicWithRecoveredFollower PASSED

kafka.admin.DeleteTopicTest > testDeleteTopicAlreadyMarkedAsDeleted STARTED

kafka.admin.DeleteTopicTest > testDeleteTopicAlreadyMarkedAsDeleted PASSED

kafka.admin.DeleteTopicTest > testIncreasePartitionCountDuringDeleteTopic 
STARTED

kafka.admin.DeleteTopicTest > testIncreasePartitionCountDuringDeleteTopic PASSED

kafka.admin.DeleteTopicTest > testPartitionReassignmentDuringDeleteTopic STARTED

kafka.admin.DeleteTopicTest > testPartitionReassignmentDuringDeleteTopic PASSED

kafka.admin.DeleteTopicTest > testDeleteNonExistingTopic STARTED

kafka.admin.DeleteTopicTest > testDeleteNonExistingTopic PASSED

kafka.admin.DeleteTopicTest > testRecreateTopicAfterDeletion STARTED

kafka.admin.DeleteTopicTest > testRecreateTopicAfterDeletion PASSED

kafka.admin.DeleteTopicTest > testDisableDeleteTopic STARTED

kafka.admin.DeleteTopicTest > testDisableDeleteTopic PASSED

kafka.admin.DeleteTopicTest > testAddPartitionDuringDeleteTopic STARTED

kafka.admin.DeleteTopicTest > testAddPartitionDuringDeleteTopic PASSED

kafka.admin.DeleteTopicTest > testDeleteTopicWithAllAliveReplicas STARTED

kafka.admin.DeleteTopicTest > testDeleteTopicWithAllAliveReplicas PASSED

kafka.admin.DeleteTopicTest > testDeleteTopicDuringAddPartition STARTED

kafka.admin.DeleteTopicTest > testDeleteTopicDuringAddPartition PASSED

kafka.admin.DeleteTopicTest > testDeletingPartiallyDeletedTopic STARTED

kafka.admin.DeleteTopicTest > testDeletingPartiallyDeletedTopic PASSED

kafka.admin.AdminTest > testGetBrokerMetadatas STARTED

kafka.admin.AdminTest > testGetBrokerMetadatas PASSED

kafka.admin.AdminTest > testBootstrapClientIdConfig STARTED

kafka.admin.AdminTest > testBootstrapClientIdConfig PASSED

kafka.admin.AdminTest > testManualReplicaAssignment STARTED

kafka.admin.AdminTest > testManualReplicaAssignment PASSED

kafka.admin.AdminTest > testConcurrentTopicCreation STARTED

kafka.admin.AdminTest > testConcurrentTopicCreation PASSED

kafka.admin.AdminTest > testTopicCreationWithCollision STARTED

kafka.admin.AdminTest > testTopicCreationWithCollision PASSED

kafka.admin.AdminTest > testTopicCreationInZK STARTED

kafka.admin.AdminTest > testTopicCreationInZK PASSED

kafka.admin.AclCommandTest > testInvalidAuthorizerProperty STARTED

kafka.admin.AclCommandTest > testInvalidAuthorizerProperty PASSED

kafka.admin.AclCommandTest > testAclsOnPrefixedResources STARTED

kafka.admin.AclCommandTest > testAclsOnPrefixedResources PASSED

kafka.admin.AclCommandTest > testAclCli STARTED

kafka.admin.AclCommandTest > testAclCli PASSED

kafka.admin.AclCommandTest > testProducerConsumerCli STARTED

kafka.admin.AclCommandTest > testProducerConsumerCli PASSED

kafka.admin.ConfigCommandTest > 

Re: Discussion: New components in JIRA?

2018-07-23 Thread Ray Chiang
Okay, I've created a wiki page Reporting Issues in Apache Kafka
<https://cwiki.apache.org/confluence/display/KAFKA/Reporting+Issues+in+Apache+Kafka>.
I'd appreciate any feedback.  If this is good enough, I can file a JIRA 
to change the link under "Bugs" in the "Project information" page.


-Ray

On 7/23/18 11:28 AM, Ray Chiang wrote:
Good point.  I'll look into adding some JIRA guidelines to the 
documentation/wiki.


-Ray

On 7/22/18 10:23 AM, Guozhang Wang wrote:

Hello Ray,

Thanks for bringing this up. I'm generally +1 on the first two, while for
the last category, personally I felt leaving them as part of `tools` is
fine, but I'm also open for other opinions.

A more general question though, is that today we do not have any guidelines
to ask JIRA reporters to set the right component, i.e. it is purely
best-effort, and we cannot disallow reporters to add any new component
names. And so far the project does not really have a tradition to manage
JIRA reports per-component, as the goal is to not "separate" the project
into silos but recommending everyone to get hands on every aspect of the
project.


Guozhang


On Fri, Jul 20, 2018 at 2:44 PM, Ray Chiang  wrote:

I've been doing a little bit of component cleanup in JIRA.  What do people
think of adding one or more of the following components?

- logging: For any consumer/producer/broker logging (i.e. log4j). This
should help disambiguate from the "log" component (i.e. Kafka messages).

- mirrormaker: There are enough requests specific to MirrorMaker that it
could be put into its own component.

- scripts: I'm a little more ambivalent about this one, but any of the
bin/*.sh script fixes could belong in their own category.  I'm not sure if
other people feel strongly for how the "tools" component should be used
w.r.t. the run scripts.

Any thoughts?

-Ray


[jira] [Created] (KAFKA-7195) StreamStreamJoinIntegrationTest fails in 2.0 Jenkins

2018-07-23 Thread Ted Yu (JIRA)
Ted Yu created KAFKA-7195:
-

 Summary: StreamStreamJoinIntegrationTest fails in 2.0 Jenkins
 Key: KAFKA-7195
 URL: https://issues.apache.org/jira/browse/KAFKA-7195
 Project: Kafka
  Issue Type: Test
Reporter: Ted Yu


From 
https://builds.apache.org/job/kafka-2.0-jdk8/87/testReport/junit/org.apache.kafka.streams.integration/StreamStreamJoinIntegrationTest/testOuter_caching_enabled___false_/ :
{code}
java.lang.AssertionError: 
Expected: is <[A-null]>
 but: was <[A-a, A-b, A-c, A-d]>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
at 
org.apache.kafka.streams.integration.AbstractJoinIntegrationTest.checkResult(AbstractJoinIntegrationTest.java:171)
at 
org.apache.kafka.streams.integration.AbstractJoinIntegrationTest.runTest(AbstractJoinIntegrationTest.java:212)
at 
org.apache.kafka.streams.integration.AbstractJoinIntegrationTest.runTest(AbstractJoinIntegrationTest.java:184)
at 
org.apache.kafka.streams.integration.StreamStreamJoinIntegrationTest.testOuter(StreamStreamJoinIntegrationTest.java:198)
{code}
However, some test output was missing:
{code}
[2018-07-23 20:51:36,363] INFO Socket c
...[truncated 1627692 chars]...
671)
{code}
I ran the test locally which passed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: kafka-2.0-jdk8 #87

2018-07-23 Thread Apache Jenkins Server
See 


Changes:

[rajinisivaram] KAFKA-7193: Use ZooKeeper IP address in streams tests to avoid 
timeouts

--
[...truncated 2.48 MB...]
org.apache.kafka.streams.TopologyTest > 
shouldNotAllowToAddStateStoreToNonExistingProcessor STARTED

org.apache.kafka.streams.TopologyTest > 
shouldNotAllowToAddStateStoreToNonExistingProcessor PASSED

[... truncated: the remaining org.apache.kafka.streams.TopologyTest STARTED/PASSED 
entries; every test listed before the log was cut off passed ...]

Request for permission to create KIP

2018-07-23 Thread Afshartous, Nick

Hi all,


Requesting permission to create a KIP regarding


  KAFKA-6690 Priorities for Source Topics


My Wiki ID is nafshartous.


Cheers,

--

Nick


Re: [DISCUSS] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-07-23 Thread Vahid S Hashemian
Hi Stanislav,

Thanks for the feedback.
Do you see an issue with using `null` as the default group id (as 
addressed by Jason in his response)?
This default group id would not support offset commits and consumers would 
use `auto.offset.reset` config when there is no current offset.

Thanks.
--Vahid




From:   Stanislav Kozlovski 
To: dev@kafka.apache.org
Date:   07/20/2018 11:09 AM
Subject:Re: [DISCUSS] KIP-289: Improve the default group id 
behavior in KafkaConsumer



I agree with Jason's notion that
>  implicit use of the empty group.id to commit offsets is more likely to
be causing users unexpected problems than actually providing a useful
capability.
I was initially confused that this is the behavior when investigating a
new-ish JIRA issue <
https://issues.apache.org/jira/browse/KAFKA-6758
> about
the same topic.
So, +1 to deprecating "" as a group.id

The question after that becomes what the *default* value should be - 
should
we:
a) treat an unconfigured group.id consumer as a sort of intermittent
consumer where you don't store offsets at all (thereby making the user
explicitly sign up for them)
b) have a default value which makes use of them? I sort of like the 
former.

@Dhruvil, thinking about it at a high-level - yes. I can't think of a
situation where it makes sense to name something an empty string as far as
I'm aware - to me it seems like potential for confusion


On Fri, Jul 20, 2018 at 10:22 AM Rajini Sivaram 
wrote:

> +1 to deprecate use of "" as group.id since it is odd to have a resource
> name that you cannot set ACLs for. Agree, we have to support older 
clients
> though.
>
> Thanks,
>
> Rajini
>
> On Fri, Jul 20, 2018 at 5:25 PM, Jason Gustafson 
> wrote:
>
> > Hi Vahid,
> >
> > Sorry for getting to this so late. I think there are two things here:
> >
> > 1. The use of "" as a groupId has always been a dubious practice at 
best.
> > We definitely ought to deprecate its use in the client. Perhaps in the
> next
> > major release, we can remove support completely. However, since older
> > clients depend on it, we may have to continue letting the broker 
support
> it
> > to some extent. Perhaps we just need to bump the OffsetCommit request 
API
> > and only accept the offset commit for older versions. You probably 
have
> to
> > do this anyway if you want to introduce the new error code since old
> > clients will not expect it.
> >
> > 2. There should be a way for the consumer to indicate that it has no
> group
> > id and will not commit offsets. This is an explicit instruction that 
the
> > consumer should not bother with coordinator lookup and such. We 
currently
> > have some brittle logic in place to let users avoid the coordinator
> lookup,
> > but it is a bit error-prone. I was hoping that we could change the
> default
> > value of group.id to be null so that the user had to take an explicit
> > action to opt into coordinator management (groups or offsets). 
However,
> it
> > is true that some users may be unknowingly depending on offset storage 
if
> > they are using both the default group.id and the default
> > enable.auto.commit. Perhaps one option is to disable 
enable.auto.commit
> > automatically if no group.id is specified? I am not sure if there are
> any
> > drawbacks, but my feeling is that implicit use of the empty group.id 
to
> > commit offsets is more likely to be causing users unexpected problems
> than
> > actually providing a useful capability.
> >
> > Thoughts?
> >
> > Thanks,
> > Jason
> >
> >
> >
> >
> > On Mon, May 28, 2018 at 9:50 AM, Vahid S Hashemian <
> > vahidhashem...@us.ibm.com> wrote:
> >
> > > Hi Viktor,
> > >
> > > Thanks for sharing your opinion.
> > > So you're in favor of disallowing the empty ("") group id altogether
> > (even
> > > for fetching).
> > > Given that ideally no one should be using the empty group id (at 
least
> in
> > > a production setting) I think the impact would be minimal in either
> case.
> > >
> > > But as you said, let's hear what others think and I'd be happy to
> modify
> > > the KIP if needed.
> > >
> > > Regards.
> > > --Vahid
> > >
> > >
> > >
> > >
> > > From:   Viktor Somogyi 
> > > To: dev 
> > > Date:   05/28/2018 05:18 AM
> > > Subject:Re: [DISCUSS] KIP-289: Improve the default group id
> > > behavior in KafkaConsumer
> > >
> > >
> > >
> > > Hi Vahid,
> > >
> > > (with the argument that using the default group id for offset commit
> > > should not be the user's intention in practice).
> > >
> > > Yea, so in my opinion too this use case doesn't seem too practical.
> Also
> > I
> > > think breaking the offset commit is not smaller from this 
perspective
> > than
> > > breaking fetch and offset fetch. If we suppose that someone uses the
> > > default group id and we break the offset commit then that might be
> harder
> > > to detect than breaking the whole thing altogether. (If we think 
about
> an
> > > upgrade situation.)
> > > So since we think it is not a practical use case, I think it 

Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-07-23 Thread Dong Lin
Hey Lucas,

Could you update the KIP if you are confident with the approach which uses
correlation id? The idea around correlation id is kind of scattered across
multiple emails. It will be useful if other reviewers can read the KIP to
understand the latest proposal.

Thanks,
Dong
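
For illustration, the broker-side check Becket sketches in the quoted discussion below (process the controller request queue in order and drop any request whose correlation id is older than the last one handled for the same controller epoch) could look roughly like the following sketch; the class and method names are assumptions for this example, not code from the KIP:

{code}
// Illustrative sketch of per-epoch correlation-id filtering in a
// single-threaded controller request handler. Names are assumptions.
import java.util.HashMap;
import java.util.Map;

public class ControllerRequestFilter {

    // Highest correlation id processed so far, tracked per controller epoch.
    private final Map<Integer, Integer> lastCorrelationIdByEpoch = new HashMap<>();

    /** Returns true if the request should be processed, false if it is obsolete. */
    public synchronized boolean shouldProcess(int controllerEpoch, int correlationId) {
        Integer last = lastCorrelationIdByEpoch.get(controllerEpoch);
        if (last != null && correlationId < last) {
            return false; // a newer request for this epoch has already been handled
        }
        lastCorrelationIdByEpoch.put(controllerEpoch, correlationId);
        return true;
    }
}
{code}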

On Mon, Jul 23, 2018 at 12:32 PM, Mayuresh Gharat <
gharatmayures...@gmail.com> wrote:

> I like the idea of the dequeue implementation by Lucas. This will help us
> avoid an additional queue for the controller and additional configs in Kafka.
>
> Thanks,
>
> Mayuresh
>
> On Sun, Jul 22, 2018 at 2:58 AM Becket Qin  wrote:
>
> > Hi Jun,
> >
> > The usage of correlation ID might still be useful to address the cases
> > that the controller epoch and leader epoch check are not sufficient to
> > guarantee correct behavior. For example, if the controller sends a
> > LeaderAndIsrRequest followed by a StopReplicaRequest, and the broker
> > processes it in the reverse order, the replica may still be wrongly
> > recreated, right?
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > > On Jul 22, 2018, at 11:47 AM, Jun Rao  wrote:
> > >
> > > Hmm, since we already use controller epoch and leader epoch for
> properly
> > > caching the latest partition state, do we really need correlation id
> for
> > > ordering the controller requests?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Fri, Jul 20, 2018 at 2:18 PM, Becket Qin 
> > wrote:
> > >
> > >> Lucas and Mayuresh,
> > >>
> > >> Good idea. The correlation id should work.
> > >>
> > >> In the ControllerChannelManager, a request will be resent until a
> > response
> > >> is received. So if the controller to broker connection disconnects
> after
> > >> controller sends R1_a, but before the response of R1_a is received, a
> > >> disconnection may cause the controller to resend R1_b. i.e. until R1
> is
> > >> acked, R2 won't be sent by the controller.
> > >> This gives two guarantees:
> > >> 1. Correlation id wise: R1_a < R1_b < R2.
> > >> 2. On the broker side, when R2 is seen, R1 must have been processed at
> > >> least once.
> > >>
> > >> So on the broker side, with a single thread controller request
> handler,
> > the
> > >> logic should be:
> > >> 1. Process whatever request is seen in the controller request queue
> > >> 2. For the given epoch, drop request if its correlation id is smaller
> > than
> > >> that of the last processed request.
> > >>
> > >> Thanks,
> > >>
> > >> Jiangjie (Becket) Qin
> > >>
> > >> On Fri, Jul 20, 2018 at 8:07 AM, Jun Rao  wrote:
> > >>
> > >>> I agree that there is no strong ordering when there are more than one
> > >>> socket connections. Currently, we rely on controllerEpoch and
> > leaderEpoch
> > >>> to ensure that the receiving broker picks up the latest state for
> each
> > >>> partition.
> > >>>
> > >>> One potential issue with the dequeue approach is that if the queue is
> > >> full,
> > >>> there is no guarantee that the controller requests will be enqueued
> > >>> quickly.
> > >>>
> > >>> Thanks,
> > >>>
> > >>> Jun
> > >>>
> > >>> On Fri, Jul 20, 2018 at 5:25 AM, Mayuresh Gharat <
> > >>> gharatmayures...@gmail.com
> >  wrote:
> > >>>
> >  Yea, the correlationId is only set to 0 in the NetworkClient
> > >> constructor.
> >  Since we reuse the same NetworkClient between Controller and the
> > >> broker,
> > >>> a
> >  disconnection should not cause it to reset to 0, in which case it
> can
> > >> be
> >  used to reject obsolete requests.
> > 
> >  Thanks,
> > 
> >  Mayuresh
> > 
> >  On Thu, Jul 19, 2018 at 1:52 PM Lucas Wang 
> > >>> wrote:
> > 
> > > @Dong,
> > > Great example and explanation, thanks!
> > >
> > > @All
> > > Regarding the example given by Dong, it seems even if we use a
> queue,
> >  and a
> > > dedicated controller request handling thread,
> > > the same result can still happen because R1_a will be sent on one
> > > connection, and R1_b & R2 will be sent on a different connection,
> > > and there is no ordering between different connections on the
> broker
> >  side.
> > > I was discussing with Mayuresh offline, and it seems correlation id
> >  within
> > > the same NetworkClient object is monotonically increasing and never
> >  reset,
> > > hence a broker can leverage that to properly reject obsolete
> > >> requests.
> > > Thoughts?
> > >
> > > Thanks,
> > > Lucas
> > >
> > > On Thu, Jul 19, 2018 at 12:11 PM, Mayuresh Gharat <
> > > gharatmayures...@gmail.com> wrote:
> > >
> > >> Actually nvm, correlationId is reset in case of connection loss, I
> >  think.
> > >>
> > >> Thanks,
> > >>
> > >> Mayuresh
> > >>
> > >> On Thu, Jul 19, 2018 at 11:11 AM Mayuresh Gharat <
> > >> gharatmayures...@gmail.com>
> > >> wrote:
> > >>
> > >>> I agree with Dong that out-of-order processing can happen with
> >  having 2
> > >>> separate queues as well and 

Re: [DISCUSS] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-07-23 Thread Vahid S Hashemian
Hi Dhruvil,

Could you please share the reasoning behind your suggestion?
My understanding is that there are more restrictions around topic names 
because they appear in folder names, and there are limitations by file 
systems on what can be used in file/folder names or how long they can be.

Thanks.
--Vahid



From:   Dhruvil Shah 
To: dev@kafka.apache.org
Date:   07/20/2018 11:02 AM
Subject:Re: [DISCUSS] KIP-289: Improve the default group id 
behavior in KafkaConsumer



If we are looking into deprecating the empty group id, would it also make
sense to have the same character restriction for it as that for topic
names? We have stricter validation for topic names but none for group id
and transaction id. I think we should (eventually) make character
restriction the same across all entities. We may not necessarily want to 
do
this as part of the proposed KIP but I wanted to get an opinion on it
anyway.

On Fri, Jul 20, 2018 at 10:22 AM Rajini Sivaram 
wrote:

> +1 to deprecate use of "" as group.id since it is odd to have a resource
> name that you cannot set ACLs for. Agree, we have to support older 
clients
> though.
>
> Thanks,
>
> Rajini
>
> On Fri, Jul 20, 2018 at 5:25 PM, Jason Gustafson 
> wrote:
>
> > Hi Vahid,
> >
> > Sorry for getting to this so late. I think there are two things here:
> >
> > 1. The use of "" as a groupId has always been a dubious practice at 
best.
> > We definitely ought to deprecate its use in the client. Perhaps in the
> next
> > major release, we can remove support completely. However, since older
> > clients depend on it, we may have to continue letting the broker 
support
> it
> > to some extent. Perhaps we just need to bump the OffsetCommit request 
API
> > and only accept the offset commit for older versions. You probably 
have
> to
> > do this anyway if you want to introduce the new error code since old
> > clients will not expect it.
> >
> > 2. There should be a way for the consumer to indicate that it has no
> group
> > id and will not commit offsets. This is an explicit instruction that 
the
> > consumer should not bother with coordinator lookup and such. We 
currently
> > have some brittle logic in place to let users avoid the coordinator
> lookup,
> > but it is a bit error-prone. I was hoping that we could change the
> default
> > value of group.id to be null so that the user had to take an explicit
> > action to opt into coordinator management (groups or offsets). 
However,
> it
> > is true that some users may be unknowingly depending on offset storage 
if
> > they are using both the default group.id and the default
> > enable.auto.commit. Perhaps one option is to disable 
enable.auto.commit
> > automatically if no group.id is specified? I am not sure if there are
> any
> > drawbacks, but my feeling is that implicit use of the empty group.id 
to
> > commit offsets is more likely to be causing users unexpected problems
> than
> > actually providing a useful capability.
> >
> > Thoughts?
> >
> > Thanks,
> > Jason
> >
> >
> >
> >
> > On Mon, May 28, 2018 at 9:50 AM, Vahid S Hashemian <
> > vahidhashem...@us.ibm.com> wrote:
> >
> > > Hi Viktor,
> > >
> > > Thanks for sharing your opinion.
> > > So you're in favor of disallowing the empty ("") group id altogether
> > (even
> > > for fetching).
> > > Given that ideally no one should be using the empty group id (at 
least
> in
> > > a production setting) I think the impact would be minimal in either
> case.
> > >
> > > But as you said, let's hear what others think and I'd be happy to
> modify
> > > the KIP if needed.
> > >
> > > Regards.
> > > --Vahid
> > >
> > >
> > >
> > >
> > > From:   Viktor Somogyi 
> > > To: dev 
> > > Date:   05/28/2018 05:18 AM
> > > Subject:Re: [DISCUSS] KIP-289: Improve the default group id
> > > behavior in KafkaConsumer
> > >
> > >
> > >
> > > Hi Vahid,
> > >
> > > (with the argument that using the default group id for offset commit
> > > should not be the user's intention in practice).
> > >
> > > Yea, so in my opinion too this use case doesn't seem too practical.
> Also
> > I
> > > think breaking the offset commit is not smaller from this 
perspective
> > than
> > > breaking fetch and offset fetch. If we suppose that someone uses the
> > > default group id and we break the offset commit then that might be
> harder
> > > to detect than breaking the whole thing altogether. (If we think 
about
> an
> > > upgrade situation.)
> > > So since we think it is not a practical use case, I think it would 
be
> > > better to break altogether but ofc that's just my 2 cents :). Let's
> > gather
> > > other's input as well.
> > >
> > > Cheers,
> > > Viktor
> > >
> > > On Fri, May 25, 2018 at 5:43 PM, Vahid S Hashemian <
> > > vahidhashem...@us.ibm.com> wrote:
> > >
> > > > Hi Victor,
> > > >
> > > > Thanks for reviewing the KIP.
> > > >
> > > > Yes, to minimize the backward compatibility impact, there would be 
no
> > > harm
> > > > in letting a stand-alone consumer consume 

Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-07-23 Thread Mayuresh Gharat
I like the idea of the dequeue implementation by Lucas. This will help us
avoid an additional queue for the controller and additional configs in Kafka.

Thanks,

Mayuresh

On Sun, Jul 22, 2018 at 2:58 AM Becket Qin  wrote:

> Hi Jun,
>
> The usage of correlation ID might still be useful to address the cases
> that the controller epoch and leader epoch check are not sufficient to
> guarantee correct behavior. For example, if the controller sends a
> LeaderAndIsrRequest followed by a StopReplicaRequest, and the broker
> processes it in the reverse order, the replica may still be wrongly
> recreated, right?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> > On Jul 22, 2018, at 11:47 AM, Jun Rao  wrote:
> >
> > Hmm, since we already use controller epoch and leader epoch for properly
> > caching the latest partition state, do we really need correlation id for
> > ordering the controller requests?
> >
> > Thanks,
> >
> > Jun
> >
> > On Fri, Jul 20, 2018 at 2:18 PM, Becket Qin 
> wrote:
> >
> >> Lucas and Mayuresh,
> >>
> >> Good idea. The correlation id should work.
> >>
> >> In the ControllerChannelManager, a request will be resent until a
> response
> >> is received. So if the controller to broker connection disconnects after
> >> controller sends R1_a, but before the response of R1_a is received, a
> >> disconnection may cause the controller to resend R1_b. i.e. until R1 is
> >> acked, R2 won't be sent by the controller.
> >> This gives two guarantees:
> >> 1. Correlation id wise: R1_a < R1_b < R2.
> >> 2. On the broker side, when R2 is seen, R1 must have been processed at
> >> least once.
> >>
> >> So on the broker side, with a single thread controller request handler,
> the
> >> logic should be:
> >> 1. Process whatever request is seen in the controller request queue
> >> 2. For the given epoch, drop request if its correlation id is smaller
> than
> >> that of the last processed request.
> >>
> >> Thanks,
> >>
> >> Jiangjie (Becket) Qin
> >>
> >> On Fri, Jul 20, 2018 at 8:07 AM, Jun Rao  wrote:
> >>
> >>> I agree that there is no strong ordering when there are more than one
> >>> socket connections. Currently, we rely on controllerEpoch and
> leaderEpoch
> >>> to ensure that the receiving broker picks up the latest state for each
> >>> partition.
> >>>
> >>> One potential issue with the dequeue approach is that if the queue is
> >> full,
> >>> there is no guarantee that the controller requests will be enqueued
> >>> quickly.
> >>>
> >>> Thanks,
> >>>
> >>> Jun
> >>>
> >>> On Fri, Jul 20, 2018 at 5:25 AM, Mayuresh Gharat <
> >>> gharatmayures...@gmail.com
>  wrote:
> >>>
>  Yea, the correlationId is only set to 0 in the NetworkClient
> >> constructor.
>  Since we reuse the same NetworkClient between Controller and the
> >> broker,
> >>> a
>  disconnection should not cause it to reset to 0, in which case it can
> >> be
>  used to reject obsolete requests.
> 
>  Thanks,
> 
>  Mayuresh
> 
>  On Thu, Jul 19, 2018 at 1:52 PM Lucas Wang 
> >>> wrote:
> 
> > @Dong,
> > Great example and explanation, thanks!
> >
> > @All
> > Regarding the example given by Dong, it seems even if we use a queue,
>  and a
> > dedicated controller request handling thread,
> > the same result can still happen because R1_a will be sent on one
> > connection, and R1_b & R2 will be sent on a different connection,
> > and there is no ordering between different connections on the broker
>  side.
> > I was discussing with Mayuresh offline, and it seems correlation id
>  within
> > the same NetworkClient object is monotonically increasing and never
>  reset,
> > hence a broker can leverage that to properly reject obsolete
> >> requests.
> > Thoughts?
> >
> > Thanks,
> > Lucas
> >
> > On Thu, Jul 19, 2018 at 12:11 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com> wrote:
> >
> >> Actually nvm, correlationId is reset in case of connection loss, I
>  think.
> >>
> >> Thanks,
> >>
> >> Mayuresh
> >>
> >> On Thu, Jul 19, 2018 at 11:11 AM Mayuresh Gharat <
> >> gharatmayures...@gmail.com>
> >> wrote:
> >>
> >>> I agree with Dong that out-of-order processing can happen with
>  having 2
> >>> separate queues as well and it can even happen today.
> >>> Can we use the correlationId in the request from the controller
> >> to
>  the
> >>> broker to handle ordering ?
> >>>
> >>> Thanks,
> >>>
> >>> Mayuresh
> >>>
> >>>
> >>> On Thu, Jul 19, 2018 at 6:41 AM Becket Qin  >>>
> > wrote:
> >>>
>  Good point, Joel. I agree that a dedicated controller request
>  handling
>  thread would be a better isolation. It also solves the
> >> reordering
> > issue.
> 
>  On Thu, Jul 19, 2018 at 2:23 PM, Joel Koshy <
> >> jjkosh...@gmail.com>
> >> wrote:
> 
> > Good example. I think 

Build failed in Jenkins: kafka-trunk-jdk10 #309

2018-07-23 Thread Apache Jenkins Server
See 

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H20 (ubuntu xenial) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/kafka.git # timeout=10
Fetching upstream changes from https://github.com/apache/kafka.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
ERROR: Error fetching remote repo 'origin'
hudson.plugins.git.GitException: Failed to fetch from 
https://github.com/apache/kafka.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:888)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1155)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
at hudson.scm.SCM.checkout(SCM.java:504)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1794)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags 
--progress https://github.com/apache/kafka.git 
+refs/heads/*:refs/remotes/origin/*" returned status code 128:
stdout: 
stderr: error: Could not read 0f3affc0f40751dc8fd064b36b6e859728f63e37
error: Could not read 95fbb2e03f4fe79737c71632e0ef2dfdcfb85a69
error: Could not read 08c465028d057ac23cdfe6d57641fe40240359dd
remote: Counting objects: 6128, done.
remote: Compressing objects: 100% (19/19), done.
Receiving objects: ... [truncated git fetch progress output]

1.1.1 and KAFKA-6929

2018-07-23 Thread Jordan Pilat
It looks like even though the fix for this¹ was backported to the 1.1 line² as 
part of the PR³, the ticket didn't get updated to include a fix version of 
1.1.1, and thus missed inclusion in the 1.1.1 release notes.

1 - https://issues.apache.org/jira/browse/KAFKA-6929
2 - 
https://github.com/apache/kafka/commit/a346590895f1702799d2c165c1a301de90a1d649
3 - https://github.com/apache/kafka/pull/5060

- Jordan Pilat


Re: Discussion: New components in JIRA?

2018-07-23 Thread Ray Chiang
Good point.  I'll look into adding some JIRA guidelines to the 
documentation/wiki.


-Ray

On 7/22/18 10:23 AM, Guozhang Wang wrote:

Hello Ray,

Thanks for bringing this up. I'm generally +1 on the first two, while for
the last category, personally I felt leaving them as part of `tools` is
fine, but I'm also open for other opinions.

A more general question though, is that today we do not have any guidelines
to ask JIRA reporters to set the right component, i.e. it is purely
best-effort, and we cannot disallow reporters to add any new component
names. And so far the project does not really have a tradition to manage
JIRA reports per-component, as the goal is to not "separate" the project
into silos but recommending everyone to get hands on every aspect of the
project.


Guozhang


On Fri, Jul 20, 2018 at 2:44 PM, Ray Chiang  wrote:


I've been doing a little bit of component cleanup in JIRA.  What do people
think of adding
one or more of the following components?

- logging: For any consumer/producer/broker logging (i.e. log4j). This
should help disambiguate from the "log" component (i.e. Kafka messages).

- mirrormaker: There are enough requests specific to MirrorMaker that it
could be put into its own component.

- scripts: I'm a little more ambivalent about this one, but any of the
bin/*.sh script fixes could belong in their own category.  I'm not sure if
other people feel strongly for how the "tools" component should be used
w.r.t. the run scripts.

Any thoughts?

-Ray








Re: [DISCUSS] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-07-23 Thread Vahid S Hashemian
Hi Jason,

Thanks for the feedback.

1. I think the current KIP follows your first bullet suggestion, except 
for the addition of an OffsetCommit API version bump and continuing to support old 
versions (I will add these).
2. Regarding your suggestion, using `null` as default group id seems 
reasonable. Could you please elaborate on disabling `enable.auto.commit`? 
They way I see it, the new version of the OffsetCommit will reject 
committing offsets for empty ("") and null group ids. Are you suggesting 
to avoid committing offsets (by default) on the client side for these 
group ids?

Thanks!
--Vahid
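
As a rough illustration of the client-side behavior being discussed, a consumer that runs without a group could look like the sketch below. Note that the null default for group.id is the proposal under discussion, not current behavior, and the broker address and topic name are placeholders:

{code}
// Sketch of a "group-less" consumer: no group.id, no offset commits,
// starting position controlled by auto.offset.reset and/or explicit seeks.
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class GrouplessConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");                 // placeholder
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("enable.auto.commit", "false");   // no offset commits at all
        props.put("auto.offset.reset", "earliest"); // used since there is never a committed offset
        // Deliberately no group.id: the consumer never needs the group coordinator.

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("my-topic", 0);        // placeholder topic
            consumer.assign(Collections.singletonList(tp));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records)
                System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
        }
    }
}
{code}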




From:   Jason Gustafson 
To: dev 
Date:   07/20/2018 09:25 AM
Subject:Re: [DISCUSS] KIP-289: Improve the default group id 
behavior in KafkaConsumer



Hi Vahid,

Sorry for getting to this so late. I think there are two things here:

1. The use of "" as a groupId has always been a dubious practice at best.
We definitely ought to deprecate its use in the client. Perhaps in the 
next
major release, we can remove support completely. However, since older
clients depend on it, we may have to continue letting the broker support 
it
to some extent. Perhaps we just need to bump the OffsetCommit request API
and only accept the offset commit for older versions. You probably have to
do this anyway if you want to introduce the new error code since old
clients will not expect it.

2. There should be a way for the consumer to indicate that it has no group
id and will not commit offsets. This is an explicit instruction that the
consumer should not bother with coordinator lookup and such. We currently
have some brittle logic in place to let users avoid the coordinator 
lookup,
but it is a bit error-prone. I was hoping that we could change the default
value of group.id to be null so that the user had to take an explicit
action to opt into coordinator management (groups or offsets). However, it
is true that some users may be unknowingly depending on offset storage if
they are using both the default group.id and the default
enable.auto.commit. Perhaps one option is to disable enable.auto.commit
automatically if no group.id is specified? I am not sure if there are any
drawbacks, but my feeling is that implicit use of the empty group.id to
commit offsets is more likely to be causing users unexpected problems than
actually providing a useful capability.

Thoughts?

Thanks,
Jason




On Mon, May 28, 2018 at 9:50 AM, Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> Hi Viktor,
>
> Thanks for sharing your opinion.
> So you're in favor of disallowing the empty ("") group id altogether 
(even
> for fetching).
> Given that ideally no one should be using the empty group id (at least 
in
> a production setting) I think the impact would be minimal in either 
case.
>
> But as you said, let's hear what others think and I'd be happy to modify
> the KIP if needed.
>
> Regards.
> --Vahid
>
>
>
>
> From:   Viktor Somogyi 
> To: dev 
> Date:   05/28/2018 05:18 AM
> Subject:Re: [DISCUSS] KIP-289: Improve the default group id
> behavior in KafkaConsumer
>
>
>
> Hi Vahid,
>
> (with the argument that using the default group id for offset commit
> should not be the user's intention in practice).
>
> Yea, so in my opinion too this use case doesn't seem too practical. Also 
I
> think breaking the offset commit is not smaller from this perspective 
than
> breaking fetch and offset fetch. If we suppose that someone uses the
> default group id and we break the offset commit then that might be 
harder
> to detect than breaking the whole thing altogether. (If we think about 
an
> upgrade situation.)
> So since we think it is not a practical use case, I think it would be
> better to break altogether but ofc that's just my 2 cents :). Let's 
gather
> other's input as well.
>
> Cheers,
> Viktor
>
> On Fri, May 25, 2018 at 5:43 PM, Vahid S Hashemian <
> vahidhashem...@us.ibm.com> wrote:
>
> > Hi Victor,
> >
> > Thanks for reviewing the KIP.
> >
> > Yes, to minimize the backward compatibility impact, there would be no
> harm
> > in letting a stand-alone consumer consume messages under a "" group id
> (as
> > long as there is no offset commit).
> > It would have to knowingly seek to an offset or rely on the
> > auto.offset.reset config for the starting offset.
> > This way the existing functionality would be preserved for the most 
part
> > (with the argument that using the default group id for offset commit
> > should not be the user's intention in practice).
> >
> > Does it seem reasonable?
> >
> > Thanks.
> > --Vahid
> >
> >
> >
> >
> > From:   Viktor Somogyi 
> > To: dev 
> > Date:   05/25/2018 04:56 AM
> > Subject:Re: [DISCUSS] KIP-289: Improve the default group id
> > behavior in KafkaConsumer
> >
> >
> >
> > Hi Vahid,
> >
> > When reading your KIP I couldn't fully understand why you decided on
> > failing with "offset_commit" in case #2? Can't we fail with an empty
> > group
> > id even in "fetch" or "fetch_offset"? 

[jira] [Resolved] (KAFKA-7193) ZooKeeper client times out with localhost due to random choice of ipv4/ipv6

2018-07-23 Thread Rajini Sivaram (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram resolved KAFKA-7193.
---
   Resolution: Fixed
 Reviewer: Ismael Juma
Fix Version/s: 2.0.0

> ZooKeeper client times out with localhost due to random choice of ipv4/ipv6
> ---
>
> Key: KAFKA-7193
> URL: https://issues.apache.org/jira/browse/KAFKA-7193
> Project: Kafka
>  Issue Type: Bug
>  Components: zkclient
>Affects Versions: 2.0.0
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
>Priority: Major
> Fix For: 2.0.0
>
>
> ZooKeeper client from version 3.4.13 doesn't handle connections to 
> `localhost` very well. If ZooKeeper is started on 127.0.0.1 on a machine that 
> has both ipv4 and ipv6 and a client is created using `localhost` rather than 
> the IP address in the connection string, ZooKeeper client attempts to connect 
> to ipv4 or ipv6 randomly with a fixed one second backoff if connection fails. 
> With the default 6 second connection timeout in Kafka, this can result in 
> client connection failures if ipv6 is chosen in consecutive address 
> selections.
> Streams tests are failing intermittently as a result of this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-6918) Kafka server fails to start with IBM JAVA

2018-07-23 Thread Ray Chiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang resolved KAFKA-6918.
---
   Resolution: Fixed
Fix Version/s: 1.1.1

> Kafka server fails to start with IBM JAVA
> -
>
> Key: KAFKA-6918
> URL: https://issues.apache.org/jira/browse/KAFKA-6918
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Nayana Thorat
>Priority: Critical
> Fix For: 1.1.1
>
>
> Kafka server start fails with below error:
> bin/kafka-server-start.sh -daemon config/server.properties
> ERROR:
> (kafka.server.KafkaConfig)
>  FATAL  (kafka.Kafka$)
> java.lang.IllegalArgumentException: Signal already used by VM: INT
>     at 
> com.ibm.misc.SignalDispatcher.registerSignal(SignalDispatcher.java:127)
>     at sun.misc.Signal.handle(Signal.java:184)
>     at kafka.Kafka$.registerHandler$1(Kafka.scala:67)
>     at kafka.Kafka$.registerLoggingSignalHandler(Kafka.scala:74)
>     at kafka.Kafka$.main(Kafka.scala:85)
>     at kafka.Kafka.main(Kafka.scala)
>  
> Tried with binaries as well as Apache Kafka (v1.0.0) built from source.
> Installed  IBM SDK on Ubuntu 16.04. 
> IBM java link:
> wget 
> http://public.dhe.ibm.com/ibmdl/export/pub/systems/cloud/runtimes/java/8.0.5.10/linux/x86_64/ibm-java-sdk-8.0-5.10-x86_64-archive.bin
>  
>  
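
For illustration only, one defensive pattern for this kind of failure -- not necessarily the change that resolved this ticket -- is to skip signal registration when the JVM reserves the signal:

{code}
// Sketch: register a logging signal handler only when the JVM allows it.
// Some JVMs (e.g. IBM J9 without -Xrs) reserve INT/TERM and throw
// IllegalArgumentException, as in the stack trace above.
public class SignalHandlerSketch {
    static void registerHandlerIfPossible(String signalName) {
        try {
            sun.misc.Signal.handle(new sun.misc.Signal(signalName),
                    signal -> System.out.println("Caught signal " + signal.getName()));
        } catch (IllegalArgumentException e) {
            // Signal already used by the VM or OS; skip registration.
            System.err.println("Skipping handler for " + signalName + ": " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        registerHandlerIfPossible("INT");
        registerHandlerIfPossible("TERM");
    }
}
{code}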



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-342 Add Customizable SASL extensions to OAuthBearer authentication

2018-07-23 Thread Ron Dagostino
Hi Rajini.  I think a class is fine as long as we make sure the semantics
of immutability are clear -- it would have to be a value class, and any
constructor that accepts a Map as input would have to copy that Map rather
than store it in a member variable.  Similarly, any Map that it might
return would have to be unmodifiable.

Ron
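
Concretely, those value-class semantics could look something like the sketch below (just an illustration of the copy-in / unmodifiable-out idea, not the class proposed in the KIP):

{code}
// Sketch of value-class semantics: defensive copy of the incoming map,
// and only an unmodifiable view is ever handed out.
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public final class SaslExtensionsSketch {
    private final Map<String, String> extensions;

    public SaslExtensionsSketch(Map<String, String> extensions) {
        // Copy rather than hold the caller's map, so later mutations cannot leak in.
        this.extensions = Collections.unmodifiableMap(new HashMap<>(extensions));
    }

    /** @return an immutable view of the SASL extensions */
    public Map<String, String> map() {
        return extensions;
    }
}
{code}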

On Mon, Jul 23, 2018 at 12:24 PM Rajini Sivaram 
wrote:

> Hi Ron, Stanislav,
>
> I agree with Stanislav that it would be better to leave `SaslExtensions` as
> a class rather than make it an interface. We don't really expect users to
> extend this class, so it is convenient to have an implementation since
> users need to create an instance. The class provided by the public API
> should be sufficient in the vast majority of the cases. Ron, do you agree?
>
> On Mon, Jul 23, 2018 at 11:35 AM, Ron Dagostino  wrote:
>
> > Hi Stanislav.  See https://tools.ietf.org/html/rfc7628#section-3.1, and
> > that section refers to the core ABNF productions defined in
> > https://tools.ietf.org/html/rfc5234#appendix-B.
> >
> > Ron
> >
> > > On Jul 23, 2018, at 1:30 AM, Stanislav Kozlovski <
> stanis...@confluent.io>
> > wrote:
> > >
> > > Hey Ron and Rajini,
> > >
> > > Here are my thoughts:
> > > Regarding separators in SaslExtensions - Agreed, that was a bad move.
> > > Should definitely not be a concern of CallbackHandler and LoginModule
> > > implementors.
> > > SaslExtensions interface - Wouldn't implementing it as an interface
> mean
> > > that users will have to make sure they're passing in an unmodifiable
> map
> > > themselves. I believe it would be better if we enforced that through
> > class
> > > constructors instead.
> > > SaslExtensions#map() - I'd also prefer this. The reason I went with
> > > `extensionValue` and `extensionNames` was because I figured it made
> sense
> > > to have `ScramExtensions` extend `SaslExtensions` and therefore have
> > their
> > > API be similar. In the end, do you think that it is worth it to have
> > > `ScramExtensions` extend `SaslExtensions`?
> > > @Ron, could you point me to the SASL OAuth mechanism specific regular
> > > expressions for keys/values you mentioned are in RFC 7628 (
> > > https://tools.ietf.org/html/rfc7628) ? I could not find any while
> > > originally implementing this.
> > >
> > > Best,
> > > Stanislav
> > >
> > >> On Sun, Jul 22, 2018 at 6:46 PM Ron Dagostino 
> > wrote:
> > >>
> > >> Hi again, Rajini and Stanislav.  I wonder if making SaslExtensions an
> > >> interface rather than a class might be a good solution.  For example:
> > >>
> > >> public interface SaslExtensions {
> > >>   /**
> > >>* @return an immutable map view of the SASL extensions
> > >>*/
> > >>Map<String, String> map();
> > >> }
> > >>
> > >> This solves the issue of lack of clarity on immutability, and it also
> > >> eliminates copying, like this:
> > >>
> > >> SaslExtensions myMethod() {
> > >>Map<String, String> myRetval = getUnmodifiableSaslExtensionsMap();
> > >>return new SaslExtensions() {
> > >>public Map<String, String> map() {
> > >>return myRetval;
> > >>}
> > >>}
> > >> }
> > >>
> > >> Alternatively, we could do it like this:
> > >>
> > >> /**
> > >> * Supplier that returns immutable map view of SASL Extensions
> > >> */
> > >> public interface SaslExtensions extends Supplier<Map<String, String>> {
> > >>// empty
> > >> }
> > >>
> > >> Then we could simply return the instance like this, again without
> > copying:
> > >>
> > >> SaslExtensions myMethod() {
> > >>Map<String, String> myRetval = getUnmodifiableSaslExtensionsMap();
> > >>return () -> myRetval;
> > >> }
> > >>
> > >> I think the main reason for making SaslExtensions part of the public
> > >> interface is to avoid adding a Map to the Subject's public
> credentials.
> > >> Making SaslExtensions an interface meets that requirement and then
> > allows
> > >> us to be free to implement whatever we want internally.
> > >>
> > >> Thoughts?
> > >>
> > >> Ron
> > >>
> > >>> On Sun, Jul 22, 2018 at 12:45 PM Ron Dagostino 
> > wrote:
> > >>>
> > >>> Hi Rajini.  The SaslServer is going to have to validate the
> extensions,
> > >>> too, but I’m okay with keeping the validation logic elsewhere as long
> > as
> > >> it
> > >>> can be reused in both the client and the secret.
> > >>>
> > >>> I strongly prefer exposing a map() method as opposed to
> > extensionNames()
> > >>> and extensionValue(String) methods. It is a smaller API (1 method
> > >> instead
> > >>> of 2), and it gives clients of the API full map-related functionality
> > >>> (there’s a lot of support for dealing with maps in a variety of
> ways).
> > >>>
> > >>> Regardless of whether we go with a map() method or extensionNames()
> and
> > >>> extensionValue(String) methods, the semantics of mutability need to
> be
> > >>> clear.  I think either way we should never share a map that anyone
> else
> > >>> could possibly mutate — either a map that someone gives us or a map
> > that
> > >> we
> > >>> might expose.
> > >>>
> > >>> Thoughts?
> > >>>
> > >>> Ron
> 

[jira] [Created] (KAFKA-7194) Error deserializing assignment after rebalance

2018-07-23 Thread Konstantine Karantasis (JIRA)
Konstantine Karantasis created KAFKA-7194:
-

 Summary: Error deserializing assignment after rebalance
 Key: KAFKA-7194
 URL: https://issues.apache.org/jira/browse/KAFKA-7194
 Project: Kafka
  Issue Type: Bug
Reporter: Konstantine Karantasis
Assignee: Jason Gustafson


A simple sink connector task is failing in a test with the following exception: 
{noformat}
[2018-07-02 12:31:13,200] ERROR WorkerSinkTask{id=verifiable-sink-0} Task threw 
an uncaught and unrecoverable exception 
(org.apache.kafka.connect.runtime.WorkerTask)

org.apache.kafka.common.protocol.types.SchemaException: Error reading field 
'version': java.nio.BufferUnderflowException

        at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:77)

        at 
org.apache.kafka.clients.consumer.internals.ConsumerProtocol.deserializeAssignment(ConsumerProtocol.java:105)

        at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:243)

        at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:421)

        at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:353)

        at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:338)

        at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:333)

        at 
org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1218)

        at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1181)

        at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1115)

        at 
org.apache.kafka.connect.runtime.WorkerSinkTask.pollConsumer(WorkerSinkTask.java:444)

        at 
org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:317)

        at 
org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:225)

        at 
org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:193)

        at 
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)

        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)

        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

        at java.util.concurrent.FutureTask.run(FutureTask.java:266)

        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

        at java.lang.Thread.run(Thread.java:748){noformat}
 

After dumping the consumer offsets on the partition that this consumer group is 
writing with: 
{noformat}
bin/kafka-dump-log.sh --offsets-decoder --files ./.log 
{noformat}
we get: 
{noformat}
Dumping ./.log

Starting offset: 0

offset: 0 position: 0 CreateTime: 1530534673177 isvalid: true keysize: 27 
valuesize: 217 magic: 2 compresscodec: NONE producerId: -1 producerEpoch: -1 
sequence: -1 isTransactional: false headerKeys: [] key: 
{"metadata":"connect-verifiable-sink"} payload: 
{"protocolType":"consumer","protocol":"range","generationId":1,"assignment":"{consumer-4-bad84955-e702-44fe-a018-677bd3b3a9d4=[test-0]}"}

offset: 1 position: 314 CreateTime: 1530534673206 isvalid: true keysize: 27 
valuesize: 32 magic: 2 compresscodec: NONE producerId: -1 producerEpoch: -1 
sequence: -1 isTransactional: false headerKeys: [] key: 
{"metadata":"connect-verifiable-sink"} payload: 
{"protocolType":"consumer","protocol":null,"generationId":2,"assignment":"{}"}{noformat}
 

Since the broker seems to send a non-empty response to the consumer, there's a 
chance that the response buffer is consumed more than once at some point when 
parsing the response in the client. 

Here's what the kafka-request.log shows it sends to the client with the 
`SYNC_GROUP` response that throws the error: 
{noformat}
[2018-07-02 12:31:13,185] DEBUG Completed 
request:RequestHeader(apiKey=SYNC_GROUP, apiVersion=2, clientId=consumer-4, 
correlationId=5) -- 
{group_id=connect-verifiable-sink,generation_id=1,member_id=consumer-4-bad84955-e702-44fe-a018-677bd3b3a9d4,group_assignment=[{member_id=consumer-4-bad84955-e702-44fe-a018-677bd3b3a9d4,member_assignment=java.nio.HeapByteBuffer[pos=0
 lim=24 
cap=24]}]},response:{throttle_time_ms=0,error_code=0,member_assignment=java.nio.HeapByteBuffer[pos=0
 lim=24 cap=24]} from connection 
172.31.40.44:9092-172.31.35.189:49191-25;totalTime:8.904,requestQueueTime:0.063,localTime:8.558,remoteTime:0.0,throttleTime:0.03,responseQueueTime:0.037,sendTime:0.245,securityProtocol:PLAINTEXT,principal:User:ANONYMOUS,listener:PLAINTEXT
 (kafka.request.logger){noformat}
 



--
This 

Re: Question about issues of Kafka release version 1.1.1

2018-07-23 Thread Ismael Juma
Seems like you're right Lambdaliu. Rajini/Jason, can you please check and
update the JIRAs?

Ismael

On Mon, Jul 23, 2018 at 7:09 AM lambdaliu(刘少波) 
wrote:

> Hi team,
>
> I have downloaded the source release of Kafka version 1.1.1 and found the
> JIRA
> issues KAFKA-6911 and KAFKA-6809 listed in the release notes, but their PRs
> don't appear to be included in the source release. Is this a valid situation?
> Should we
> create a JIRA issue to track it?
>
> Regards,
> Lambdaliu(Shaobo Liu)
> 
>


Re: [DISCUSS] KIP-342 Add Customizable SASL extensions to OAuthBearer authentication

2018-07-23 Thread Rajini Sivaram
Hi Ron, Stanislav,

I agree with Stanislav that it would be better to leave `SaslExtensions` as
a class rather than make it an interface. We don't really expect users to
extend this class, so it is convenient to have an implementation since
users need to create an instance. The class provided by the public API
should be sufficient in the vast majority of the cases. Ron, do you agree?

On Mon, Jul 23, 2018 at 11:35 AM, Ron Dagostino  wrote:

> Hi Stanislav.  See https://tools.ietf.org/html/rfc7628#section-3.1, and
> that section refers to the core ABNF productions defined in
> https://tools.ietf.org/html/rfc5234#appendix-B.
>
> Ron
>
> > On Jul 23, 2018, at 1:30 AM, Stanislav Kozlovski 
> wrote:
> >
> > Hey Ron and Rajini,
> >
> > Here are my thoughts:
> > Regarding separators in SaslExtensions - Agreed, that was a bad move.
> > Should definitely not be a concern of CallbackHandler and LoginModule
> > implementors.
> > SaslExtensions interface - Wouldn't implementing it as an interface mean
> > that users will have to make sure they're passing in an unmodifiable map
> > themselves. I believe it would be better if we enforced that through
> class
> > constructors instead.
> > SaslExtensions#map() - I'd also prefer this. The reason I went with
> > `extensionValue` and `extensionNames` was because I figured it made sense
> > to have `ScramExtensions` extend `SaslExtensions` and therefore have
> their
> > API be similar. In the end, do you think that it is worth it to have
> > `ScramExtensions` extend `SaslExtensions`?
> > @Ron, could you point me to the SASL OAuth mechanism specific regular
> > expressions for keys/values you mentioned are in RFC 7628 (
> > https://tools.ietf.org/html/rfc7628) ? I could not find any while
> > originally implementing this.
> >
> > Best,
> > Stanislav
> >
> >> On Sun, Jul 22, 2018 at 6:46 PM Ron Dagostino 
> wrote:
> >>
> >> Hi again, Rajini and Stanislav.  I wonder if making SaslExtensions an
> >> interface rather than a class might be a good solution.  For example:
> >>
> >> public interface SaslExtensions {
> >>   /**
> >>* @return an immutable map view of the SASL extensions
> >>*/
> >>Map<String, String> map();
> >> }
> >>
> >> This solves the issue of lack of clarity on immutability, and it also
> >> eliminates copying, like this:
> >>
> >> SaslExtensions myMethod() {
> >>Map<String, String> myRetval = getUnmodifiableSaslExtensionsMap();
> >>return new SaslExtensions() {
> >>public Map<String, String> map() {
> >>return myRetval;
> >>}
> >>}
> >> }
> >>
> >> Alternatively, we could do it like this:
> >>
> >> /**
> >> * Supplier that returns immutable map view of SASL Extensions
> >> */
> >> public interface SaslExtensions extends Supplier<Map<String, String>> {
> >>// empty
> >> }
> >>
> >> Then we could simply return the instance like this, again without
> copying:
> >>
> >> SaslExtensions myMethod() {
> >>Map<String, String> myRetval = getUnmodifiableSaslExtensionsMap();
> >>return () -> myRetval;
> >> }
> >>
> >> I think the main reason for making SaslExtensions part of the public
> >> interface is to avoid adding a Map to the Subject's public credentials.
> >> Making SaslExtensions an interface meets that requirement and then
> allows
> >> us to be free to implement whatever we want internally.
> >>
> >> Thoughts?
> >>
> >> Ron
> >>
> >>> On Sun, Jul 22, 2018 at 12:45 PM Ron Dagostino 
> wrote:
> >>>
> >>> Hi Rajini.  The SaslServer is going to have to validate the extensions,
> >>> too, but I’m okay with keeping the validation logic elsewhere as long
> as
> >> it
> >>> can be reused in both the client and the secret.
> >>>
> >>> I strongly prefer exposing a map() method as opposed to
> extensionNames()
> >>> and extensionValue(String) methods. It is a smaller API (1 method
> >> instead
> >>> of 2), and it gives clients of the API full map-related functionality
> >>> (there’s a lot of support for dealing with maps in a variety of ways).
> >>>
> >>> Regardless of whether we go with a map() method or extensionNames() and
> >>> extensionValue(String) methods, the semantics of mutability need to be
> >>> clear.  I think either way we should never share a map that anyone else
> >>> could possibly mutate — either a map that someone gives us or a map
> that
> >> we
> >>> might expose.
> >>>
> >>> Thoughts?
> >>>
> >>> Ron
> >>>
>  On Jul 22, 2018, at 11:23 AM, Rajini Sivaram  >
> >>> wrote:
> 
>  Hmm I think we need a much simpler SaslExtensions class if we are
>  making it part of the public API.
> 
>  1. I don't see the point of including separator anywhere in
> >>> SaslExtensions.
>  Extensions provide a map and we propagate the map from client to
> server
>  using the protocol associated with the mechanism in use. The separator
> >> is
>  not configurable and should not be a concern of the implementor of
>  SaslExtensionsCallback interface that provides an instance of
> >>> SaslExtensions
>  .
> 
>  2. I agree with Ron that we need 

[DISCUSS] KIP-332: Update AclCommand to use AdminClient API

2018-07-23 Thread Manikumar
Hi all,

I have created a KIP to use AdminClient API in AclCommand (kafka-acls.sh)

*https://cwiki.apache.org/confluence/display/KAFKA/KIP-332%3A+Update+AclCommand+to+use+AdminClient+API*


Please take a look.

Thanks,


Re: Plan for new Kafka Connect Transform

2018-07-23 Thread Randall Hauch
How about changing the ReplaceField SMT to be able to support nested
fields? If we come up with a unified way to identify nested fields, then we
could add support for nested fields to other SMTs, too.

Best regards,

Randall
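
For what it's worth, a recursive rename over Connect's Struct/Schema types could be sketched roughly as below. The helper names and the path-array convention are illustrative assumptions, not part of the existing ReplaceField SMT:

{code}
// Sketch of renaming a nested field (e.g. the path {"Header", "General Info", "Source_id"}
// renamed to "Source") by rebuilding the schema and value recursively.
import org.apache.kafka.connect.data.Field;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;

public class NestedRenameSketch {

    // Rebuild the schema, renaming the field found at path[depth..].
    static Schema renameSchema(Schema schema, String[] path, int depth, String newName) {
        SchemaBuilder builder = SchemaBuilder.struct();
        if (schema.name() != null)
            builder.name(schema.name());
        for (Field field : schema.fields()) {
            if (field.name().equals(path[depth]) && depth < path.length - 1)
                builder.field(field.name(), renameSchema(field.schema(), path, depth + 1, newName));
            else if (field.name().equals(path[depth]))
                builder.field(newName, field.schema());          // rename the leaf field
            else
                builder.field(field.name(), field.schema());     // copy everything else as-is
        }
        return builder.build();
    }

    // Copy the value into the new schema, moving the renamed field along the path.
    static Struct renameValue(Struct value, Schema newSchema, String[] path, int depth, String newName) {
        Struct updated = new Struct(newSchema);
        for (Field field : value.schema().fields()) {
            Object fieldValue = value.get(field);
            if (field.name().equals(path[depth]) && depth < path.length - 1)
                updated.put(field.name(), renameValue((Struct) fieldValue,
                        newSchema.field(field.name()).schema(), path, depth + 1, newName));
            else if (field.name().equals(path[depth]))
                updated.put(newName, fieldValue);
            else
                updated.put(field.name(), fieldValue);
        }
        return updated;
    }
}
{code}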

On Thu, Jul 19, 2018 at 2:18 PM, karri saisatish kumar reddy <
ksskreddy2...@gmail.com> wrote:

> Respected Sir,
>
> I have recently worked on Kafka as part of my project. I have used the
> Debezium connector and SMTs for changing the schema in my project.
>
>
> I need to rename the following nested field but found no suitable SMT.
> My schema has the following structure (-> implies subfield):
> Header -> General Info -> Source_id
>
> Need to rename Source_id to Source.
> The current relevant SMT "*ReplaceField*" can only rename the top level
> field.
>
> I want to write an SMT for renaming nested fields using recursion. Is it a
> good idea to contribute? Are there any possible blockers for it?
> Sorry if I asked this in the wrong place.
>
> Awaiting your reply.
>
>
>
>
> Thank you,
> K.S.Satish Kumar Reddy
> 4th year under graduate student,
> IIT Kharagpur.
>


Question about issues of Kafka release version 1.1.1

2018-07-23 Thread 刘少波
Hi team,

I have downloaded the source release of Kafka version 1.1.1 and found the JIRA
issues KAFKA-6911 and KAFKA-6809 listed in the release notes, but their PRs
don't appear to be included in the source release. Is this a valid situation?  Should 
we
create a JIRA issue to track it?

Regards,
Lambdaliu(Shaobo Liu)



KIP-213 - Scalable/Usable Foreign-Key KTable joins - Rebooted.

2018-07-23 Thread Adam Bellemare
Here is the new discussion thread for KIP-213. I picked back up on the KIP
as this is something that we too at Flipp are now running in production.
Jan started this last year, and I know that Trivago is also using something
similar in production, at least in terms of APIs and functionality.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable

I do have an implementation of the code for Kafka 1.0.2 (our local
production version) but I won't post it yet as I would like to focus on the
workflow and design first. That being said, I also need to add some clearer
integration tests (I did a lot of testing using a non-Kafka Streams
framework) and clean up the code a bit more before putting it in a PR
against trunk (I can do so later this week likely).

Please take a look,

Thanks

Adam Bellemare


[jira] [Created] (KAFKA-7193) ZooKeeper client times out with localhost due to random choice of ipv4/ipv6

2018-07-23 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-7193:
-

 Summary: ZooKeeper client times out with localhost due to random 
choice of ipv4/ipv6
 Key: KAFKA-7193
 URL: https://issues.apache.org/jira/browse/KAFKA-7193
 Project: Kafka
  Issue Type: Bug
  Components: zkclient
Affects Versions: 2.0.0
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram


The ZooKeeper client from version 3.4.13 doesn't handle connections to `localhost` 
very well. If ZooKeeper is started on 127.0.0.1 on a machine that has both IPv4 
and IPv6, and a client is created using `localhost` rather than the IP address 
in the connection string, the ZooKeeper client attempts to connect over IPv4 or 
IPv6 at random, with a fixed one-second backoff if the connection fails. With the 
default 6-second connection timeout in Kafka, this can result in client connection 
failures if IPv6 is chosen in consecutive address selections.

Streams tests are failing intermittently as a result of this.
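
As an illustrative workaround (assuming ZooKeeper is listening on 127.0.0.1:2181),
broker and test configs can pin the IPv4 address and give the client more headroom
than the default:

# Pin the IPv4 address rather than relying on `localhost` resolution.
zookeeper.connect=127.0.0.1:2181
# Optionally allow more time than the 6-second default to establish the connection.
zookeeper.connection.timeout.ms=18000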





Re: [DISCUSS] KIP-342 Add Customizable SASL extensions to OAuthBearer authentication

2018-07-23 Thread Ron Dagostino
Hi Stanislav.  See https://tools.ietf.org/html/rfc7628#section-3.1, and that 
section refers to the core ABNF productions defined in 
https://tools.ietf.org/html/rfc5234#appendix-B.
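
For concreteness, my reading of those productions (key = 1*(ALPHA), value =
*(VCHAR / SP / HTAB / CR / LF)) could be expressed roughly as the following
regexes; please double-check against the RFCs:

import java.util.regex.Pattern;

public class ExtensionGrammarSketch {
    // ALPHA and VCHAR come from the RFC 5234 core ABNF.
    static final Pattern KEY = Pattern.compile("[A-Za-z]+");
    static final Pattern VALUE = Pattern.compile("[\\x21-\\x7E \\t\\r\\n]*");
}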

Ron

> On Jul 23, 2018, at 1:30 AM, Stanislav Kozlovski  
> wrote:
> 
> Hey Ron and Rajini,
> 
> Here are my thoughts:
> Regarding separators in SaslExtensions - Agreed, that was a bad move.
> Should definitely not be a concern of CallbackHandler and LoginModule
> implementors.
> SaslExtensions interface - Wouldn't implementing it as an interface mean
> that users will have to make sure they're passing in an unmodifiable map
> themselves? I believe it would be better if we enforced that through class
> constructors instead.
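> For illustration, constructor-enforced immutability could look roughly like
> this (just a sketch, not a proposed final API):
>
> import java.util.Collections;
> import java.util.HashMap;
> import java.util.Map;
>
> public class SaslExtensions {
>     private final Map<String, String> extensionMap;
>
>     public SaslExtensions(Map<String, String> extensionMap) {
>         // Defensive copy plus unmodifiable view, so neither the caller nor
>         // API users can mutate the extensions after construction.
>         this.extensionMap = Collections.unmodifiableMap(new HashMap<>(extensionMap));
>     }
>
>     public Map<String, String> map() {
>         return extensionMap;
>     }
> }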
> SaslExtensions#map() - I'd also prefer this. The reason I went with
> `extensionValue` and `extensionNames` was because I figured it made sense
> to have `ScramExtensions` extend `SaslExtensions` and therefore have their
> API be similar. In the end, do you think that it is worth it to have
> `ScramExtensions` extend `SaslExtensions`?
> @Ron, could you point me to the SASL OAuth mechanism specific regular
> expressions for keys/values you mentioned are in RFC 7628 (
> https://tools.ietf.org/html/rfc7628) ? I could not find any while
> originally implementing this.
> 
> Best,
> Stanislav
> 
>> On Sun, Jul 22, 2018 at 6:46 PM Ron Dagostino  wrote:
>> 
>> Hi again, Rajini and Stanislav.  I wonder if making SaslExtensions an
>> interface rather than a class might be a good solution.  For example:
>> 
>> public interface SaslExtensions {
>>    /**
>>     * @return an immutable map view of the SASL extensions
>>     */
>>    Map<String, String> map();
>> }
>> 
>> This solves the issue of lack of clarity on immutability, and it also
>> eliminates copying, like this:
>> 
>> SaslExtensions myMethod() {
>>     Map<String, String> myRetval = getUnmodifiableSaslExtensionsMap();
>>     return new SaslExtensions() {
>>         public Map<String, String> map() {
>>             return myRetval;
>>         }
>>     };
>> }
>> 
>> Alternatively, we could do it like this:
>> 
>> /**
>>  * Supplier that returns immutable map view of SASL Extensions
>>  */
>> public interface SaslExtensions extends Supplier<Map<String, String>> {
>>     // empty
>> }
>> 
>> Then we could simply return the instance like this, again without copying:
>> 
>> SaslExtensions myMethod() {
>>     Map<String, String> myRetval = getUnmodifiableSaslExtensionsMap();
>>     return () -> myRetval;
>> }
>> 
>> I think the main reason for making SaslExtensions part of the public
>> interface is to avoid adding a Map to the Subject's public credentials.
>> Making SaslExtensions an interface meets that requirement and then allows
>> us to be free to implement whatever we want internally.
>> 
>> Thoughts?
>> 
>> Ron
>> 
>>> On Sun, Jul 22, 2018 at 12:45 PM Ron Dagostino  wrote:
>>> 
>>> Hi Rajini.  The SaslServer is going to have to validate the extensions,
>>> too, but I’m okay with keeping the validation logic elsewhere as long as
>>> it can be reused in both the client and the server.
>>> 
>>> I strongly prefer exposing a map() method as opposed to extensionNames()
>>> and extensionValue(String) methods. It is a smaller API (1 method instead
>>> of 2), and it gives clients of the API full map-related functionality
>>> (there’s a lot of support for dealing with maps in a variety of ways).
>>> 
>>> Regardless of whether we go with a map() method or extensionNames() and
>>> extensionValue(String) methods, the semantics of mutability need to be
>>> clear.  I think either way we should never share a map that anyone else
>>> could possibly mutate — either a map that someone gives us or a map that
>>> we might expose.
>>> 
>>> Thoughts?
>>> 
>>> Ron
>>> 
 On Jul 22, 2018, at 11:23 AM, Rajini Sivaram wrote:
 
 Hmm I think we need a much simpler SaslExtensions class if we are
 making it part of the public API.
 
 1. I don't see the point of including separator anywhere in SaslExtensions.
 Extensions provide a map and we propagate the map from client to server
 using the protocol associated with the mechanism in use. The separator is
 not configurable and should not be a concern of the implementor of
 SaslExtensionsCallback interface that provides an instance of SaslExtensions.
 
 2. I agree with Ron that we need mechanism-specific validation of the
 values from SaslExtensions. But I think we could do the validation in the
 appropriate `SaslClient` implementation of that mechanism.
 
 I think we could just have a very simple extensions class and move
 everything else to appropriate internal classes of the mechanisms using
 extensions. What do you think?
 
 public class SaslExtensions {
     private final Map<String, String> extensionMap;

     public SaslExtensions(Map<String, String> extensionMap) {
         this.extensionMap = extensionMap;
     }

     public String extensionValue(String name) {
         return extensionMap.get(name);
     }
 
  

Jenkins build is back to normal : kafka-trunk-jdk8 #2834

2018-07-23 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-7192) State-store can desynchronise with changelog

2018-07-23 Thread Jon Bates (JIRA)
Jon Bates created KAFKA-7192:


 Summary: State-store can desynchronise with changelog
 Key: KAFKA-7192
 URL: https://issues.apache.org/jira/browse/KAFKA-7192
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 1.1.1
Reporter: Jon Bates


n.b. this bug has been verified with exactly-once processing enabled

Consider the following scenario:
 * A record, N is read into a Kafka topology
 * the state store is updated
 * the topology crashes

h3. *Expected behaviour:*
 # Node is restarted
 # Offset was never updated, so record N is reprocessed
 # State-store is reset to position N-1
 # Record is reprocessed

h3. *Actual behaviour:*
 # Node is restarted
 # Record N is reprocessed (good)
 # The state store has the state from the previous processing

I'd consider this a corruption of the state-store, hence the critical Priority, 
although High may be more appropriate.

I wrote a proof-of-concept here, which demonstrates the problem on Linux:

https://github.com/spadger/kafka-streams-sad-state-store


