[jira] [Commented] (KAFKA-4074) Deleting a topic can make it unavailable even if delete.topic.enable is false

2016-09-06 Thread Manikumar Reddy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469613#comment-15469613
 ] 

Manikumar Reddy commented on KAFKA-4074:


[~junrao]  Yes, both issues are the same. I can merge my changes into the 
KAFKA-3175 PR.

> Deleting a topic can make it unavailable even if delete.topic.enable is false
> -
>
> Key: KAFKA-4074
> URL: https://issues.apache.org/jira/browse/KAFKA-4074
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Reporter: Joel Koshy
>Assignee: Manikumar Reddy
> Fix For: 0.10.1.0
>
>
> The {{delete.topic.enable}} configuration does not completely block the 
> effects of delete topic since the controller may (indirectly) query the list 
> of topics under the delete-topic znode.
> To reproduce:
> * Delete topic X
> * Force a controller move (either by bouncing or removing the controller 
> znode)
> * The new controller will send out UpdateMetadataRequests with leader=-2 for 
> the partitions of X
> * Producers eventually stop producing to that topic
> The reason for this is that when ControllerChannelManager adds 
> UpdateMetadataRequests for brokers, we directly use the partitionsToBeDeleted 
> field of the DeleteTopicManager (which is set to the partitions of the topics 
> under the delete-topic znode on controller startup).
> In order to get out of the situation you have to remove X from the znode and 
> then force another controller move.
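
For anyone hitting this before a fix lands, here is a minimal sketch of that 
manual recovery (a hypothetical helper, not part of the report), assuming direct 
ZooKeeper access and the standard {{/admin/delete_topics/}} and {{/controller}} 
paths; adjust the connect string and topic name to your environment.

{code}
import org.apache.zookeeper.ZooKeeper;

// Hypothetical recovery helper: remove topic X from the delete-topic znode and
// then force a controller move by deleting the controller znode.
public final class AbortTopicDeletion {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper connect string and topic name.
        ZooKeeper zk = new ZooKeeper("ZK-HOST:2181", 30000, event -> { });
        try {
            if (zk.exists("/admin/delete_topics/X", false) != null) {
                zk.delete("/admin/delete_topics/X", -1);   // X is no longer pending deletion
            }
            if (zk.exists("/controller", false) != null) {
                zk.delete("/controller", -1);              // triggers a new controller election
            }
        } finally {
            zk.close();
        }
    }
}
{code}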



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4111) broker compress data of certain size instead on a produce request

2016-09-06 Thread Manikumar Reddy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469580#comment-15469580
 ] 

Manikumar Reddy commented on KAFKA-4111:


OK, got it. The broker handles each produce request separately, so it is 
difficult to merge produce requests on the broker side.

In general, we want producers to compress the data. We can tune the 
batch.size/linger.ms config params to adjust the producer batch size.
It is also advisable to use a single producer per JVM/app to get the full 
benefits of batching.
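
For illustration, a minimal sketch of that producer-side tuning (not part of 
the original comment), assuming placeholder broker addresses and a topic named 
"test"; the batch.size and linger.ms values are arbitrary examples.

{code}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public final class BatchedCompressingProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "HOST-1:9092"); // placeholder
        // Compress on the producer side instead of relying on the broker.
        props.setProperty(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");
        // Allow larger batches and give the producer time to fill them.
        props.setProperty(ProducerConfig.BATCH_SIZE_CONFIG, "65536");  // 64 KB per partition batch
        props.setProperty(ProducerConfig.LINGER_MS_CONFIG, "50");      // wait up to 50 ms for more records

        try (Producer<String, String> producer =
                 new KafkaProducer<>(props, new StringSerializer(), new StringSerializer())) {
            for (int i = 0; i < 1000; i++) {
                producer.send(new ProducerRecord<>("test", "key-" + i, "value-" + i));
            }
        } // close() flushes any remaining batched records
    }
}
{code}

With linger.ms > 0, records for the same partition are grouped into a single 
compressed batch on the producer, which addresses the many-small-requests 
problem on the producer side rather than on the broker.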

> broker compress data of certain size instead on a produce request
> -
>
> Key: KAFKA-4111
> URL: https://issues.apache.org/jira/browse/KAFKA-4111
> Project: Kafka
>  Issue Type: Improvement
>  Components: compression
>Affects Versions: 0.10.0.1
>Reporter: julien1987
>
> When "compression.type" is set in the broker config, the broker compresses 
> data on every produce request. But in our scenario, produce requests are very 
> frequent and each request carries little data, so the compression result is 
> not good. Can the broker compress data of a certain size accumulated from 
> many produce requests?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk8 #864

2016-09-06 Thread Apache Jenkins Server
See 

Changes:

[ismael] MINOR: Reduce the log level when the peer isn't authenticated but is

--
[...truncated 12235 lines...]
org.apache.kafka.streams.kstream.internals.KTableKTableLeftJoinTest > testJoin 
PASSED

org.apache.kafka.streams.kstream.internals.KTableKTableLeftJoinTest > 
testNotSendingOldValue STARTED

org.apache.kafka.streams.kstream.internals.KTableKTableLeftJoinTest > 
testNotSendingOldValue PASSED

org.apache.kafka.streams.kstream.internals.KeyValuePrinterProcessorTest > 
testPrintKeyValueWithProvidedSerde STARTED

org.apache.kafka.streams.kstream.internals.KeyValuePrinterProcessorTest > 
testPrintKeyValueWithProvidedSerde PASSED

org.apache.kafka.streams.kstream.internals.KeyValuePrinterProcessorTest > 
testPrintKeyValuesWithName STARTED

org.apache.kafka.streams.kstream.internals.KeyValuePrinterProcessorTest > 
testPrintKeyValuesWithName PASSED

org.apache.kafka.streams.kstream.internals.KeyValuePrinterProcessorTest > 
testPrintKeyValueDefaultSerde STARTED

org.apache.kafka.streams.kstream.internals.KeyValuePrinterProcessorTest > 
testPrintKeyValueDefaultSerde PASSED

org.apache.kafka.streams.kstream.internals.KTableSourceTest > 
testSendingOldValue STARTED

org.apache.kafka.streams.kstream.internals.KTableSourceTest > 
testSendingOldValue PASSED

org.apache.kafka.streams.kstream.internals.KTableSourceTest > 
testNotSendingOldValue STARTED

org.apache.kafka.streams.kstream.internals.KTableSourceTest > 
testNotSendingOldValue PASSED

org.apache.kafka.streams.kstream.internals.KTableSourceTest > testKTable STARTED

org.apache.kafka.streams.kstream.internals.KTableSourceTest > testKTable PASSED

org.apache.kafka.streams.kstream.internals.KTableSourceTest > testValueGetter 
STARTED

org.apache.kafka.streams.kstream.internals.KTableSourceTest > testValueGetter 
PASSED

org.apache.kafka.streams.kstream.internals.KTableMapKeysTest > 
testMapKeysConvertingToStream STARTED

org.apache.kafka.streams.kstream.internals.KTableMapKeysTest > 
testMapKeysConvertingToStream PASSED

org.apache.kafka.streams.kstream.internals.KStreamForeachTest > testForeach 
STARTED

org.apache.kafka.streams.kstream.internals.KStreamForeachTest > testForeach 
PASSED

org.apache.kafka.streams.kstream.internals.KTableKTableOuterJoinTest > 
testSendingOldValue STARTED

org.apache.kafka.streams.kstream.internals.KTableKTableOuterJoinTest > 
testSendingOldValue PASSED

org.apache.kafka.streams.kstream.internals.KTableKTableOuterJoinTest > testJoin 
STARTED

org.apache.kafka.streams.kstream.internals.KTableKTableOuterJoinTest > testJoin 
PASSED

org.apache.kafka.streams.kstream.internals.KTableKTableOuterJoinTest > 
testNotSendingOldValue STARTED

org.apache.kafka.streams.kstream.internals.KTableKTableOuterJoinTest > 
testNotSendingOldValue PASSED

org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > 
testOuterJoin STARTED

org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > 
testOuterJoin PASSED

org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > testJoin 
STARTED

org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > testJoin 
PASSED

org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > 
testWindowing STARTED

org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > 
testWindowing PASSED

org.apache.kafka.streams.kstream.internals.KStreamFlatMapValuesTest > 
testFlatMapValues STARTED

org.apache.kafka.streams.kstream.internals.KStreamFlatMapValuesTest > 
testFlatMapValues PASSED

org.apache.kafka.streams.kstream.internals.KTableKTableJoinTest > testJoin 
STARTED

org.apache.kafka.streams.kstream.internals.KTableKTableJoinTest > testJoin 
PASSED

org.apache.kafka.streams.kstream.internals.KTableKTableJoinTest > 
testNotSendingOldValues STARTED

org.apache.kafka.streams.kstream.internals.KTableKTableJoinTest > 
testNotSendingOldValues PASSED

org.apache.kafka.streams.kstream.internals.KTableKTableJoinTest > 
testSendingOldValues STARTED

org.apache.kafka.streams.kstream.internals.KTableKTableJoinTest > 
testSendingOldValues PASSED

org.apache.kafka.streams.kstream.internals.KTableAggregateTest > testAggBasic 
STARTED

org.apache.kafka.streams.kstream.internals.KTableAggregateTest > testAggBasic 
PASSED

org.apache.kafka.streams.kstream.internals.KTableAggregateTest > testCount 
STARTED

org.apache.kafka.streams.kstream.internals.KTableAggregateTest > testCount 
PASSED

org.apache.kafka.streams.kstream.internals.KTableAggregateTest > 
testAggRepartition STARTED

org.apache.kafka.streams.kstream.internals.KTableAggregateTest > 
testAggRepartition PASSED

org.apache.kafka.streams.kstream.internals.KTableAggregateTest > 
testRemoveOldBeforeAddNew STARTED

org.apache.kafka.streams.kstream.internals.KTableAggregateTest > 
testRemoveOldBeforeAddNew PASSED

org.apache.kafka.streams.kstream.internals.KStreamFilterTest > testFilterNot 
STARTED


[jira] [Created] (KAFKA-4137) Refactor multi-threaded consumer for safer network layer access

2016-09-06 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-4137:
--

 Summary: Refactor multi-threaded consumer for safer network layer 
access
 Key: KAFKA-4137
 URL: https://issues.apache.org/jira/browse/KAFKA-4137
 Project: Kafka
  Issue Type: Improvement
  Components: consumer
Reporter: Jason Gustafson


In KIP-62, we added a background thread to send heartbeats while the user is 
processing fetched data from a call to poll(). In the implementation, we 
elected to share the instance of {{NetworkClient}} between the foreground 
thread and this background thread. After working with the system test failure 
in KAFKA-3807, we've realized that this probably wasn't a good decision. It is 
very tricky to get the synchronization correct with respect to response 
callbacks and reasoning about the multi-threaded behavior is very difficult. 
For example, a common pattern is to send a request and then call 
{{NetworkClient.poll()}} to await its return. With another thread also 
potentially calling poll(), the response can actually return before the sending 
thread itself invokes poll(). This can cause unnecessary (and potentially 
unbounded) blocking, and avoiding it is quite complex. 

A different approach we've discussed would be to use two instances of 
NetworkClient, one dedicated to fetching, and one dedicated to coordinator 
communication. The fetching NetworkClient can continue to work exclusively in 
the foreground thread and we can confine the coordinator NetworkClient to the 
background thread. This provides much better isolation and avoids all of the 
race conditions with calling poll() from two threads. The main complication is 
in how to expose blocking APIs to interact with the background thread. For 
example, in the current consumer API, rebalances are completed in the foreground 
thread, so we would need to coordinate with the background thread to preserve 
this (e.g. by using a Future abstraction).
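
For illustration, a rough sketch of that hand-off pattern (hypothetical names, 
not the actual consumer internals): the background thread is the only one that 
touches the coordinator NetworkClient, and the foreground thread blocks on a 
future until the coordinator work completes.

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: only the background thread ever polls the coordinator client.
public final class CoordinatorThreadSketch {

    /** A unit of coordinator work plus a future the foreground thread can block on. */
    static final class CoordinatorTask {
        final Runnable work;                         // e.g. send JoinGroup and handle the response
        final CompletableFuture<Void> completion = new CompletableFuture<>();
        CoordinatorTask(Runnable work) { this.work = work; }
    }

    private final BlockingQueue<CoordinatorTask> tasks = new LinkedBlockingQueue<>();

    /** Called from the foreground thread, e.g. to complete a rebalance synchronously. */
    public void awaitRebalance(Runnable rebalanceWork) throws Exception {
        CoordinatorTask task = new CoordinatorTask(rebalanceWork);
        tasks.put(task);
        task.completion.get();                       // block until the background thread finishes
    }

    /** Loop run by the single background thread that owns the coordinator client. */
    public void backgroundLoop() throws InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
            CoordinatorTask task = tasks.take();
            try {
                task.work.run();                     // only this thread touches the coordinator client
                task.completion.complete(null);
            } catch (Throwable t) {
                task.completion.completeExceptionally(t);
            }
        }
    }
}
{code}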



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [ANNOUNCE] New committer: Jason Gustafson

2016-09-06 Thread Ashish Singh
Congrats, Jason!

On Tuesday, September 6, 2016, Jason Gustafson  wrote:

> Thanks all!
>
> On Tue, Sep 6, 2016 at 5:13 PM, Becket Qin  > wrote:
>
> > Congrats, Jason!
> >
> > On Tue, Sep 6, 2016 at 5:09 PM, Onur Karaman
>  > >
> > wrote:
> >
> > > congrats jason!
> > >
> > > On Tue, Sep 6, 2016 at 4:12 PM, Sriram Subramanian  >
> > > wrote:
> > >
> > > > Congratulations Jason!
> > > >
> > > > On Tue, Sep 6, 2016 at 3:40 PM, Vahid S Hashemian <
> > > > vahidhashem...@us.ibm.com 
> > > > > wrote:
> > > >
> > > > > Congratulations Jason on this very well deserved recognition.
> > > > >
> > > > > --Vahid
> > > > >
> > > > >
> > > > >
> > > > > From:   Neha Narkhede >
> > > > > To: "dev@kafka.apache.org " <
> dev@kafka.apache.org >,
> > > > > "us...@kafka.apache.org "  >
> > > > > Cc: "priv...@kafka.apache.org " <
> priv...@kafka.apache.org >
> > > > > Date:   09/06/2016 03:26 PM
> > > > > Subject:[ANNOUNCE] New committer: Jason Gustafson
> > > > >
> > > > >
> > > > >
> > > > > The PMC for Apache Kafka has invited Jason Gustafson to join as a
> > > > > committer and
> > > > > we are pleased to announce that he has accepted!
> > > > >
> > > > > Jason has contributed numerous patches to a wide range of areas,
> > > notably
> > > > > within the new consumer and the Kafka Connect layers. He has
> > displayed
> > > > > great taste and judgement which has been apparent through his
> > > involvement
> > > > > across the board from mailing lists, JIRA, code reviews to
> > contributing
> > > > > features, bug fixes and code and documentation improvements.
> > > > >
> > > > > Thank you for your contribution and welcome to Apache Kafka, Jason!
> > > > > --
> > > > > Thanks,
> > > > > Neha
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>


-- 
Ashish h


Re: [DISCUSS] Time-based releases for Apache Kafka

2016-09-06 Thread Sriram Subramanian
Sounds good to me. 

> On Sep 6, 2016, at 8:22 PM, Jason Gustafson  wrote:
> 
> Hey All,
> 
> It sounds like the general consensus is in favor of time-based releases. We
> can continue the discussion about LTS, but I wanted to go ahead and get
> things moving forward by volunteering to manage the next release, which is
> currently slated for October. If that sounds OK, I'll draft a release plan
> and send it out to the community for feedback and a vote.
> 
> Thanks,
> Jason
> 
>> On Thu, Aug 25, 2016 at 2:03 PM, Ofir Manor  wrote:
>> 
>> I happily agree that Kafka is a solid and the community is great :)
>> But I think there is a gap in perception here.
>> For me, LTS means that someone is actively taking care of a release -
>> actively backporting critical fixes (security, stability, data loss,
>> corruption, hangs etc) from trunk to that LTS version periodically for an
>> extended period of time, for example 18-36 months... So people can really
>> rely on the same Kafka version for a long time.
>> Is someone doing it today for 0.9.0? When is 0.9.0.2 expected? When is
>> 0.8.2.3 expected? Will they cover all known critical issues for whoever
>> relies on them in production?
>> In other words, what is the scope of support that the community wants to
>> commit for older versions? (upgrade compatibility? investigating bug
>> reports? proactively backporting fixes?)
>> BTW, another legit option is that the Apache Kafka project won't commit to
>> LTS releases. It could let commercial vendors compete on supporting very
>> old versions. I find that actually quite reasonable as well.
>> 
>> Ofir Manor
>> 
>> Co-Founder & CTO | Equalum
>> 
>> Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io
>> 
>> On Thu, Aug 25, 2016 at 8:19 PM, Andrew Schofield <
>> andrew_schofield_j...@outlook.com> wrote:
>> 
>>> I agree that the Kafka community has managed to maintain a very high
>>> quality level, so I'm not concerned
>>> about the quality of non-LTS releases. If the principle is that every
>>> release is supported for 2 years, that
>>> would be good. I suppose that if the burden of having that many
>> in-support
>>> releases proves too heavy,
>>> as you say we could reconsider.
>>> 
>>> Andrew Schofield
>>> 
>>> 
 From: g...@confluent.io
 Date: Thu, 25 Aug 2016 09:57:30 -0700
 Subject: Re: [DISCUSS] Time-based releases for Apache Kafka
 To: dev@kafka.apache.org
 
 I prefer Ismael's suggestion for supporting 2-years (6 releases)
 rather than have designated LTS releases.
 
 The LTS model seems to work well when some releases are high quality
 (LTS) and the rest are a bit more questionable. It is great for
 companies like Redhat, where they have to invest less to support few
 releases and let the community deal with everything else.
 
 Until now the Kafka community has managed to maintain very high
 quality level. Not just for releases, our trunk is often of better
 quality than other project's releases - we don't think of stability as
 something you tuck into a release (and just some releases) but rather
 as an on-going concern. There are costs to doing things that way, but
 in general, I think it has served us well - allowing even conservative
 companies to run on the latest released version.
 
 I hope we can agree to at least try maintaining last 6 releases as LTS
 (i.e. every single release is supported for 2 years) rather than
 designate some releases as better than others. Of course, if this
 totally fails, we can reconsider.
 
 Gwen
 
 On Thu, Aug 25, 2016 at 9:51 AM, Andrew Schofield
  wrote:
> The proposal sounds pretty good, but the main thing currently missing
>>> is a proper long-term support release.
> 
> Having 3 releases a year sounds OK, but if they're all equivalent and
>>> bugfix releases are produced for the most
> recent 2 or 3 releases, anyone wanting to run on an "in support"
>>> release of Kafka has to upgrade every 8-12 months.
> If you don't actually want anything specific from the newer releases,
>>> it's just unnecessary churn.
> 
> Wouldn't it be better to designate one release every 12-18 months as a
>>> long-term support release with bugfix releases
> produced for those for a longer period of say 24 months. That halves
>>> the upgrade work for people just wanting to keep
> "in support". Now that adoption is increasing, there are plenty of
>>> users that just want a dependable messaging system
> without having to be deeply knowledgeable about its innards.
> 
> LTS works nicely for plenty of open-source projects. I think it would
>>> work well for Kafka too.
> 
> Andrew Schofield
> 
> 
>> From: ofir.ma...@equalum.io
>> Date: Thu, 25 Aug 2016 

[jira] [Commented] (KAFKA-4111) broker compress data of certain size instead on a produce request

2016-09-06 Thread julien1987 (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469377#comment-15469377
 ] 

julien1987 commented on KAFKA-4111:
---

Sorry, my wording was not clear. I mean that all messages from multiple 
producers go to the same topic and partition. The total count of all messages 
is large, but the count of messages from each producer is very small.

For example, there are 1k producers and every producer sends 5 messages, all to 
the same topic-partition. I want the broker to be able to compress the 5k 
messages once instead of compressing 5 messages at a time (1k times).

> broker compress data of certain size instead on a produce request
> -
>
> Key: KAFKA-4111
> URL: https://issues.apache.org/jira/browse/KAFKA-4111
> Project: Kafka
>  Issue Type: Improvement
>  Components: compression
>Affects Versions: 0.10.0.1
>Reporter: julien1987
>
> When "compression.type" is set in the broker config, the broker compresses 
> data on every produce request. But in our scenario, produce requests are very 
> frequent and each request carries little data, so the compression result is 
> not good. Can the broker compress data of a certain size accumulated from 
> many produce requests?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] Time-based releases for Apache Kafka

2016-09-06 Thread Jason Gustafson
Hey All,

It sounds like the general consensus is in favor of time-based releases. We
can continue the discussion about LTS, but I wanted to go ahead and get
things moving forward by volunteering to manage the next release, which is
currently slated for October. If that sounds OK, I'll draft a release plan
and send it out to the community for feedback and a vote.

Thanks,
Jason

On Thu, Aug 25, 2016 at 2:03 PM, Ofir Manor  wrote:

> I happily agree that Kafka is a solid and the community is great :)
> But I think there is a gap in perception here.
> For me, LTS means that someone is actively taking care of a release -
> actively backporting critical fixes (security, stability, data loss,
> corruption, hangs etc) from trunk to that LTS version periodically for an
> extended period of time, for example 18-36 months... So people can really
> rely on the same Kafka version for a long time.
> Is someone doing it today for 0.9.0? When is 0.9.0.2 expected? When is
> 0.8.2.3 expected? Will they cover all known critical issues for whoever
> relies on them in production?
> In other words, what is the scope of support that the community wants to
> commit for older versions? (upgrade compatibility? investigating bug
> reports? proactively backporting fixes?)
> BTW, another legit option is that the Apache Kafka project won't commit to
> LTS releases. It could let commercial vendors compete on supporting very
> old versions. I find that actually quite reasonable as well.
>
> Ofir Manor
>
> Co-Founder & CTO | Equalum
>
> Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io
>
> On Thu, Aug 25, 2016 at 8:19 PM, Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
>
> > I agree that the Kafka community has managed to maintain a very high
> > quality level, so I'm not concerned
> > about the quality of non-LTS releases. If the principle is that every
> > release is supported for 2 years, that
> > would be good. I suppose that if the burden of having that many
> in-support
> > releases proves too heavy,
> > as you say we could reconsider.
> >
> > Andrew Schofield
> >
> > 
> > > From: g...@confluent.io
> > > Date: Thu, 25 Aug 2016 09:57:30 -0700
> > > Subject: Re: [DISCUSS] Time-based releases for Apache Kafka
> > > To: dev@kafka.apache.org
> > >
> > > I prefer Ismael's suggestion for supporting 2-years (6 releases)
> > > rather than have designated LTS releases.
> > >
> > > The LTS model seems to work well when some releases are high quality
> > > (LTS) and the rest are a bit more questionable. It is great for
> > > companies like Redhat, where they have to invest less to support few
> > > releases and let the community deal with everything else.
> > >
> > > Until now the Kafka community has managed to maintain very high
> > > quality level. Not just for releases, our trunk is often of better
> > > quality than other project's releases - we don't think of stability as
> > > something you tuck into a release (and just some releases) but rather
> > > as an on-going concern. There are costs to doing things that way, but
> > > in general, I think it has served us well - allowing even conservative
> > > companies to run on the latest released version.
> > >
> > > I hope we can agree to at least try maintaining last 6 releases as LTS
> > > (i.e. every single release is supported for 2 years) rather than
> > > designate some releases as better than others. Of course, if this
> > > totally fails, we can reconsider.
> > >
> > > Gwen
> > >
> > > On Thu, Aug 25, 2016 at 9:51 AM, Andrew Schofield
> > >  wrote:
> > >> The proposal sounds pretty good, but the main thing currently missing
> > is a proper long-term support release.
> > >>
> > >> Having 3 releases a year sounds OK, but if they're all equivalent and
> > bugfix releases are produced for the most
> > >> recent 2 or 3 releases, anyone wanting to run on an "in support"
> > release of Kafka has to upgrade every 8-12 months.
> > >> If you don't actually want anything specific from the newer releases,
> > it's just unnecessary churn.
> > >>
> > >> Wouldn't it be better to designate one release every 12-18 months as a
> > long-term support release with bugfix releases
> > >> produced for those for a longer period of say 24 months. That halves
> > the upgrade work for people just wanting to keep
> > >> "in support". Now that adoption is increasing, there are plenty of
> > users that just want a dependable messaging system
> > >> without having to be deeply knowledgeable about its innards.
> > >>
> > >> LTS works nicely for plenty of open-source projects. I think it would
> > work well for Kafka too.
> > >>
> > >> Andrew Schofield
> > >>
> > >> 
> > >>> From: ofir.ma...@equalum.io
> > >>> Date: Thu, 25 Aug 2016 16:07:07 +0300
> > >>> Subject: Re: [DISCUSS] Time-based releases for Apache Kafka
> > >>> To: dev@kafka.apache.org
> > >>>
> > 

Re: [VOTE] KIP-78 Cluster Id (second attempt)

2016-09-06 Thread Sriram Subramanian
+1 binding

> On Sep 6, 2016, at 7:46 PM, Ismael Juma  wrote:
> 
> Hi all,
> 
> I would like to (re)initiate[1] the voting process for KIP-78 Cluster Id:
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-78%3A+Cluster+Id
> 
> As explained in the KIP and discussion thread, we see this as a good first
> step that can serve as a foundation for future improvements.
> 
> Thanks,
> Ismael
> 
> [1] Even though I created a new vote thread, Gmail placed the messages in
> the discuss thread, making it not as visible as required. It's important to
> mention that two +1s were cast by Gwen and Sriram:
> 
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201609.mbox/%3CCAD5tkZbLv7fvH4q%2BKe%2B%3DJMgGq%2BZT2t34e0WRUsCT1ErhtKOg1w%40mail.gmail.com%3E


Build failed in Jenkins: kafka-trunk-jdk8 #863

2016-09-06 Thread Apache Jenkins Server
See 

Changes:

[ismael] MINOR: Include TopicPartition in warning when log cleaner resets dirty

[ismael] KAFKA-4129; Processor throw exception when getting channel remote

[cshapi] MINOR: More graceful handling of buffers that are too small in Record's

--
[...truncated 3466 lines...]

kafka.api.AuthorizerIntegrationTest > 
testCreatePermissionNeededForWritingToNonExistentTopic STARTED

kafka.api.AuthorizerIntegrationTest > 
testCreatePermissionNeededForWritingToNonExistentTopic PASSED

kafka.api.AuthorizerIntegrationTest > 
testSimpleConsumeWithOffsetLookupAndNoGroupAccess STARTED

kafka.api.AuthorizerIntegrationTest > 
testSimpleConsumeWithOffsetLookupAndNoGroupAccess PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicDescribe STARTED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoTopicAccess STARTED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoTopicAccess STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > testProduceWithNoTopicAccess STARTED

kafka.api.AuthorizerIntegrationTest > testProduceWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicWrite STARTED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicWrite PASSED

kafka.api.AuthorizerIntegrationTest > testAuthorization STARTED

kafka.api.AuthorizerIntegrationTest > testAuthorization PASSED

kafka.api.AuthorizerIntegrationTest > testUnauthorizedDeleteWithDescribe STARTED

kafka.api.AuthorizerIntegrationTest > testUnauthorizedDeleteWithDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicAndGroupRead STARTED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicAndGroupRead PASSED

kafka.api.AuthorizerIntegrationTest > testUnauthorizedDeleteWithoutDescribe 
STARTED

kafka.api.AuthorizerIntegrationTest > testUnauthorizedDeleteWithoutDescribe 
PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicWrite STARTED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicWrite PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoGroupAccess STARTED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoGroupAccess PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoAccess STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoAccess PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoGroupAccess STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoGroupAccess PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithNoAccess STARTED

kafka.api.AuthorizerIntegrationTest > testConsumeWithNoAccess PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithTopicAndGroupRead 
STARTED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithTopicAndGroupRead 
PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicDescribe STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicDescribe STARTED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchTopicDescribe STARTED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicAndGroupRead STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicAndGroupRead PASSED

kafka.api.AuthorizerIntegrationTest > 
testSimpleConsumeWithExplicitSeekAndNoGroupAccess STARTED

kafka.api.AuthorizerIntegrationTest > 
testSimpleConsumeWithExplicitSeekAndNoGroupAccess PASSED

kafka.api.SaslPlainPlaintextConsumerTest > testCoordinatorFailover STARTED

kafka.api.SaslPlainPlaintextConsumerTest > testCoordinatorFailover PASSED

kafka.api.SaslPlainPlaintextConsumerTest > testSimpleConsumption STARTED

kafka.api.SaslPlainPlaintextConsumerTest > testSimpleConsumption PASSED

kafka.api.RequestResponseSerializationTest > 
testSerializationAndDeserialization STARTED

kafka.api.RequestResponseSerializationTest > 
testSerializationAndDeserialization PASSED

kafka.api.RequestResponseSerializationTest > testFetchResponseVersion STARTED

kafka.api.RequestResponseSerializationTest > testFetchResponseVersion PASSED

kafka.api.RequestResponseSerializationTest > testProduceResponseVersion STARTED

kafka.api.RequestResponseSerializationTest > testProduceResponseVersion PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoConsumeAcl STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoConsumeAcl PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testProduceConsumeViaAssign 
STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testProduceConsumeViaAssign 
PASSED


Re: [ANNOUNCE] New committer: Jason Gustafson

2016-09-06 Thread Jason Gustafson
Thanks all!

On Tue, Sep 6, 2016 at 5:13 PM, Becket Qin  wrote:

> Congrats, Jason!
>
> On Tue, Sep 6, 2016 at 5:09 PM, Onur Karaman  >
> wrote:
>
> > congrats jason!
> >
> > On Tue, Sep 6, 2016 at 4:12 PM, Sriram Subramanian 
> > wrote:
> >
> > > Congratulations Jason!
> > >
> > > On Tue, Sep 6, 2016 at 3:40 PM, Vahid S Hashemian <
> > > vahidhashem...@us.ibm.com
> > > > wrote:
> > >
> > > > Congratulations Jason on this very well deserved recognition.
> > > >
> > > > --Vahid
> > > >
> > > >
> > > >
> > > > From:   Neha Narkhede 
> > > > To: "dev@kafka.apache.org" ,
> > > > "us...@kafka.apache.org" 
> > > > Cc: "priv...@kafka.apache.org" 
> > > > Date:   09/06/2016 03:26 PM
> > > > Subject:[ANNOUNCE] New committer: Jason Gustafson
> > > >
> > > >
> > > >
> > > > The PMC for Apache Kafka has invited Jason Gustafson to join as a
> > > > committer and
> > > > we are pleased to announce that he has accepted!
> > > >
> > > > Jason has contributed numerous patches to a wide range of areas,
> > notably
> > > > within the new consumer and the Kafka Connect layers. He has
> displayed
> > > > great taste and judgement which has been apparent through his
> > involvement
> > > > across the board from mailing lists, JIRA, code reviews to
> contributing
> > > > features, bug fixes and code and documentation improvements.
> > > >
> > > > Thank you for your contribution and welcome to Apache Kafka, Jason!
> > > > --
> > > > Thanks,
> > > > Neha
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
>


[VOTE] KIP-78 Cluster Id (second attempt)

2016-09-06 Thread Ismael Juma
Hi all,

I would like to (re)initiate[1] the voting process for KIP-78 Cluster Id:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-78%3A+Cluster+Id

As explained in the KIP and discussion thread, we see this as a good first
step that can serve as a foundation for future improvements.

Thanks,
Ismael

[1] Even though I created a new vote thread, Gmail placed the messages in
the discuss thread, making it not as visible as required. It's important to
mention that two +1s were cast by Gwen and Sriram:

http://mail-archives.apache.org/mod_mbox/kafka-dev/201609.mbox/%3CCAD5tkZbLv7fvH4q%2BKe%2B%3DJMgGq%2BZT2t34e0WRUsCT1ErhtKOg1w%40mail.gmail.com%3E


[GitHub] kafka pull request #1825: MINOR: Reduce the log level when the peer isn't au...

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1825


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk8 #862

2016-09-06 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] MINOR: changes embedded broker time to MockTime

--
[...truncated 12223 lines...]
org.apache.kafka.streams.processor.internals.assignment.AssignmentInfoTest > 
testEncodeDecode PASSED

org.apache.kafka.streams.processor.internals.assignment.AssignmentInfoTest > 
shouldDecodePreviousVersion STARTED

org.apache.kafka.streams.processor.internals.assignment.AssignmentInfoTest > 
shouldDecodePreviousVersion PASSED

org.apache.kafka.streams.processor.internals.assignment.SubscriptionInfoTest > 
testEncodeDecode STARTED

org.apache.kafka.streams.processor.internals.assignment.SubscriptionInfoTest > 
testEncodeDecode PASSED

org.apache.kafka.streams.processor.internals.assignment.SubscriptionInfoTest > 
shouldEncodeDecodeWithUserEndPoint STARTED

org.apache.kafka.streams.processor.internals.assignment.SubscriptionInfoTest > 
shouldEncodeDecodeWithUserEndPoint PASSED

org.apache.kafka.streams.processor.internals.assignment.SubscriptionInfoTest > 
shouldBeBackwardCompatible STARTED

org.apache.kafka.streams.processor.internals.assignment.SubscriptionInfoTest > 
shouldBeBackwardCompatible PASSED

org.apache.kafka.streams.processor.internals.StreamThreadTest > 
testInjectClients STARTED

org.apache.kafka.streams.processor.internals.StreamThreadTest > 
testInjectClients PASSED

org.apache.kafka.streams.processor.internals.StreamThreadTest > testMaybeCommit 
STARTED

org.apache.kafka.streams.processor.internals.StreamThreadTest > testMaybeCommit 
PASSED

org.apache.kafka.streams.processor.internals.StreamThreadTest > 
testPartitionAssignmentChange STARTED

org.apache.kafka.streams.processor.internals.StreamThreadTest > 
testPartitionAssignmentChange PASSED

org.apache.kafka.streams.processor.internals.StreamThreadTest > testMaybeClean 
STARTED

org.apache.kafka.streams.processor.internals.StreamThreadTest > testMaybeClean 
PASSED

org.apache.kafka.streams.processor.internals.PunctuationQueueTest > 
testPunctuationInterval STARTED

org.apache.kafka.streams.processor.internals.PunctuationQueueTest > 
testPunctuationInterval PASSED

org.apache.kafka.streams.processor.internals.RecordQueueTest > testTimeTracking 
STARTED

org.apache.kafka.streams.processor.internals.RecordQueueTest > testTimeTracking 
PASSED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testRegisterNonPersistentStore STARTED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testRegisterNonPersistentStore PASSED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testGetStore STARTED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testGetStore PASSED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testClose STARTED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testClose PASSED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testChangeLogOffsets STARTED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testChangeLogOffsets PASSED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testRegisterPersistentStore STARTED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testRegisterPersistentStore PASSED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testNoTopic STARTED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testNoTopic PASSED

org.apache.kafka.streams.processor.internals.StreamTaskTest > testProcessOrder 
STARTED

org.apache.kafka.streams.processor.internals.StreamTaskTest > testProcessOrder 
PASSED

org.apache.kafka.streams.processor.internals.StreamTaskTest > testPauseResume 
STARTED

org.apache.kafka.streams.processor.internals.StreamTaskTest > testPauseResume 
PASSED

org.apache.kafka.streams.processor.internals.StreamTaskTest > 
testMaybePunctuate STARTED

org.apache.kafka.streams.processor.internals.StreamTaskTest > 
testMaybePunctuate PASSED

org.apache.kafka.streams.processor.internals.QuickUnionTest > testUnite STARTED

org.apache.kafka.streams.processor.internals.QuickUnionTest > testUnite PASSED

org.apache.kafka.streams.processor.internals.QuickUnionTest > testUniteMany 
STARTED

org.apache.kafka.streams.processor.internals.QuickUnionTest > testUniteMany 
PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullProcessorSupplier STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullProcessorSupplier PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotSetApplicationIdToNull STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotSetApplicationIdToNull PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > testSourceTopics 
STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 

Build failed in Jenkins: kafka-trunk-jdk7 #1519

2016-09-06 Thread Apache Jenkins Server
See 

Changes:

[ismael] MINOR: Include TopicPartition in warning when log cleaner resets dirty

[ismael] KAFKA-4129; Processor throw exception when getting channel remote

[cshapi] MINOR: More graceful handling of buffers that are too small in Record's

--
[...truncated 1697 lines...]
kafka.api.SslProducerSendTest > testSendOffset PASSED

kafka.api.SslProducerSendTest > testAutoCreateTopic STARTED

kafka.api.SslProducerSendTest > testAutoCreateTopic PASSED

kafka.api.SslProducerSendTest > testSendWithInvalidCreateTime STARTED

kafka.api.SslProducerSendTest > testSendWithInvalidCreateTime PASSED

kafka.api.SslProducerSendTest > testSendCompressedMessageWithCreateTime STARTED

kafka.api.SslProducerSendTest > testSendCompressedMessageWithCreateTime PASSED

kafka.api.SslProducerSendTest > testCloseWithZeroTimeoutFromCallerThread STARTED

kafka.api.SslProducerSendTest > testCloseWithZeroTimeoutFromCallerThread PASSED

kafka.api.SslProducerSendTest > testCloseWithZeroTimeoutFromSenderThread STARTED

kafka.api.SslProducerSendTest > testCloseWithZeroTimeoutFromSenderThread PASSED

kafka.api.SslProducerSendTest > testSendNonCompressedMessageWithLogApendTime 
STARTED

kafka.api.SslProducerSendTest > testSendNonCompressedMessageWithLogApendTime 
PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicWrite STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicWrite PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithNoTopicAccess STARTED

kafka.api.AuthorizerIntegrationTest > testConsumeWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > 
testCreatePermissionNeededToReadFromNonExistentTopic STARTED

kafka.api.AuthorizerIntegrationTest > 
testCreatePermissionNeededToReadFromNonExistentTopic PASSED

kafka.api.AuthorizerIntegrationTest > testDeleteWithWildCardAuth STARTED

kafka.api.AuthorizerIntegrationTest > testDeleteWithWildCardAuth PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoAccess STARTED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoAccess PASSED

kafka.api.AuthorizerIntegrationTest > testListOfsetsWithTopicDescribe STARTED

kafka.api.AuthorizerIntegrationTest > testListOfsetsWithTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicRead STARTED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicRead PASSED

kafka.api.AuthorizerIntegrationTest > testListOffsetsWithNoTopicAccess STARTED

kafka.api.AuthorizerIntegrationTest > testListOffsetsWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > 
testCreatePermissionNeededForWritingToNonExistentTopic STARTED

kafka.api.AuthorizerIntegrationTest > 
testCreatePermissionNeededForWritingToNonExistentTopic PASSED

kafka.api.AuthorizerIntegrationTest > 
testSimpleConsumeWithOffsetLookupAndNoGroupAccess STARTED

kafka.api.AuthorizerIntegrationTest > 
testSimpleConsumeWithOffsetLookupAndNoGroupAccess PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicDescribe STARTED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoTopicAccess STARTED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoTopicAccess STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > testProduceWithNoTopicAccess STARTED

kafka.api.AuthorizerIntegrationTest > testProduceWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicWrite STARTED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicWrite PASSED

kafka.api.AuthorizerIntegrationTest > testAuthorization STARTED

kafka.api.AuthorizerIntegrationTest > testAuthorization PASSED

kafka.api.AuthorizerIntegrationTest > testUnauthorizedDeleteWithDescribe STARTED

kafka.api.AuthorizerIntegrationTest > testUnauthorizedDeleteWithDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicAndGroupRead STARTED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicAndGroupRead PASSED

kafka.api.AuthorizerIntegrationTest > testUnauthorizedDeleteWithoutDescribe 
STARTED

kafka.api.AuthorizerIntegrationTest > testUnauthorizedDeleteWithoutDescribe 
PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicWrite STARTED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicWrite PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoGroupAccess STARTED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoGroupAccess PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoAccess STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoAccess PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoGroupAccess STARTED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoGroupAccess PASSED


[jira] [Commented] (KAFKA-4074) Deleting a topic can make it unavailable even if delete.topic.enable is false

2016-09-06 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469233#comment-15469233
 ] 

Jun Rao commented on KAFKA-4074:


[~jjkoshy], is this the same issue as 
https://issues.apache.org/jira/browse/KAFKA-3175 ?

> Deleting a topic can make it unavailable even if delete.topic.enable is false
> -
>
> Key: KAFKA-4074
> URL: https://issues.apache.org/jira/browse/KAFKA-4074
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Reporter: Joel Koshy
>Assignee: Manikumar Reddy
> Fix For: 0.10.1.0
>
>
> The {{delete.topic.enable}} configuration does not completely block the 
> effects of delete topic since the controller may (indirectly) query the list 
> of topics under the delete-topic znode.
> To reproduce:
> * Delete topic X
> * Force a controller move (either by bouncing or removing the controller 
> znode)
> * The new controller will send out UpdateMetadataRequests with leader=-2 for 
> the partitions of X
> * Producers eventually stop producing to that topic
> The reason for this is that when ControllerChannelManager adds 
> UpdateMetadataRequests for brokers, we directly use the partitionsToBeDeleted 
> field of the DeleteTopicManager (which is set to the partitions of the topics 
> under the delete-topic znode on controller startup).
> In order to get out of the situation you have to remove X from the znode and 
> then force another controller move.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-78: Cluster Id

2016-09-06 Thread Ismael Juma
Hi Jun,

This is a new [VOTE] thread, you can see it in the archives:

http://mail-archives.apache.org/mod_mbox/kafka-dev/201609.mbox/%3CCAD5tkZbLv7fvH4q%2BKe%2B%3DJMgGq%2BZT2t34e0WRUsCT1ErhtKOg1w%40mail.gmail.com%3E

I suspect Gmail is incorrectly placing it in the wrong thread (it did for
me too, annoyingly). I will try again and hopefully Gmail will do the right
thing this time.

Ismael

On Wed, Sep 7, 2016 at 2:34 AM, Jun Rao  wrote:

> Ismael,
>
> Could you move this to a new [VOTE] thread to make the voting clear?
>
> Thanks,
>
> Jun
>
> On Tue, Sep 6, 2016 at 4:11 PM, Ismael Juma  wrote:
>
> > Hi all,
> >
> > I would like to initiate the voting process for KIP-78: Cluster Id:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-78%3A+Cluster+Id
> >
> > As explained in the KIP and discussion thread, we see this as a good
> first
> > step that can serve as a foundation for future improvements.
> >
> > Thanks,
> > Ismael
> >
>


[jira] [Comment Edited] (KAFKA-4024) First metadata update always take retry.backoff.ms milliseconds to complete

2016-09-06 Thread Yuto Kawamura (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15458639#comment-15458639
 ] 

Yuto Kawamura edited comment on KAFKA-4024 at 9/7/16 1:46 AM:
--

I reconsidered this issue and found that it is much worse than I explained 
before.

IIUC, in short, setting {{retry.backoff.ms}} to a larger value can delay the 
KafkaProducer's update of outdated metadata.
That is, when we set {{retry.backoff.ms}} to 1 second for example, and a 
partition leadership failover happens, the producer can take 1 second to fire 
the metadata request in the worst case, even though it has already detected the 
broker disconnection or the outdated partition leadership information.

Here's the result of my experiment. I modified 
{{KafkaProducerMetadataUpdateDurationTest}} and observed DEBUG logs of 
NetworkClient and Metadata.

clients/src/main/java/KafkaProducerMetadataUpdateDurationTest.java:
{code}
import java.util.Properties;
import java.util.concurrent.TimeUnit;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

public final class KafkaProducerMetadataUpdateDurationTest {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
                          "HOST-1:9092,HOST-2:9092,HOST-3:9092");
        props.setProperty(ProducerConfig.RECONNECT_BACKOFF_MS_CONFIG, "1000");
        props.setProperty(ProducerConfig.RETRIES_CONFIG,
                          String.valueOf(Integer.MAX_VALUE));
        String retryBackoffMs = System.getProperty("retry.backoff.ms");
        System.err.println("Experimenting with retry.backoff.ms = " + retryBackoffMs);
        props.setProperty(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, retryBackoffMs);

        Producer<String, String> producer =
            new KafkaProducer<>(props, new StringSerializer(), new StringSerializer());

        try {
            int i = 0;
            while (true) {
                final int produceSeq = i++;
                final long t0 = System.nanoTime();
                // Measure how long it takes until the callback fires for each record.
                producer.send(new ProducerRecord<>("test", produceSeq % 3, "key", "value"),
                              new Callback() {
                                  @Override
                                  public void onCompletion(RecordMetadata metadata, Exception exception) {
                                      long produceDuration =
                                          TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - t0);
                                      System.err.printf("Produce[%d]: duration=%d, exception=%s\n",
                                                        produceSeq, produceDuration, exception);
                                  }
                              });
                // Measure how long the (potentially blocking) send() call itself took.
                long sendDuration = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - t0);
                System.err.printf("Send[%d]: duration=%d\n", produceSeq, sendDuration);
                Thread.sleep(1000);
            }
        } finally {
            producer.close();
        }
    }
}
{code}

log4j.properties:
{code}
log4j.rootLogger=INFO, stdout

log4j.logger.org.apache.kafka.clients.Metadata=DEBUG, stdout
log4j.additivity.org.apache.kafka.clients.Metadata=false
log4j.logger.org.apache.kafka.clients.NetworkClient=DEBUG, stdout
log4j.additivity.org.apache.kafka.clients.NetworkClient=false
log4j.logger.org.apache.kafka.clients.producer.internals.Sender=DEBUG, stdout
log4j.additivity.org.apache.kafka.clients.producer.internals.Sender=false

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
{code}

Topic "test" has 3 replicas and 3 partitions.
Then I started KafkaProducerMetadataUpdateDurationTest, and stopped broker 1 
manually at (*2). Here's the log:

{code}
./bin/kafka-run-class.sh -Dlog4j.configuration=file:./log4j.properties -Dretry.backoff.ms=1 KafkaProducerMetadataUpdateDurationTest
Experimenting with retry.backoff.ms = 1
...
[2016-09-02 22:36:29,839] INFO Kafka version : 0.10.1.0-SNAPSHOT 
(org.apache.kafka.common.utils.AppInfoParser)
[2016-09-02 22:36:29,839] INFO Kafka commitId : 8f3462552fa4d6a6 
(org.apache.kafka.common.utils.AppInfoParser)
[2016-09-02 22:36:39,826] DEBUG Initialize connection to node -2 for sending 
metadata request (org.apache.kafka.clients.NetworkClient)
[2016-09-02 22:36:39,826] DEBUG Initiating connection to node -2 at 
HOST-2:9092. (org.apache.kafka.clients.NetworkClient)
[2016-09-02 22:36:39,883] DEBUG Completed connection to node -2 
(org.apache.kafka.clients.NetworkClient)

# *1 The 

Re: [VOTE] KIP-78: Cluster Id

2016-09-06 Thread Jun Rao
Ismael,

Could you move this to a new [VOTE] thread to make the voting clear?

Thanks,

Jun

On Tue, Sep 6, 2016 at 4:11 PM, Ismael Juma  wrote:

> Hi all,
>
> I would like to initiate the voting process for KIP-78: Cluster Id:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-78%3A+Cluster+Id
>
> As explained in the KIP and discussion thread, we see this as a good first
> step that can serve as a foundation for future improvements.
>
> Thanks,
> Ismael
>


[jira] [Commented] (KAFKA-4033) KIP-70: Revise Partition Assignment Semantics on New Consumer's Subscription Change

2016-09-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469156#comment-15469156
 ] 

ASF GitHub Bot commented on KAFKA-4033:
---

GitHub user vahidhashemian reopened a pull request:

https://github.com/apache/kafka/pull/1726

KAFKA-4033: KIP-70: Revise Partition Assignment Semantics on New Consumer's 
Subscription Change

This PR changes topic subscription semantics so a change in subscription 
does not immediately cause a rebalance.
Instead, the next poll or the next scheduled metadata refresh will update 
the assigned partitions.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vahidhashemian/kafka KAFKA-4033

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1726.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1726


commit 09d35a5f8b5b91c9d71de345c14ccf8c8ef21876
Author: Vahid Hashemian 
Date:   2016-08-22T18:45:52Z

KAFKA-4033: KIP-70: Revise Partition Assignment Semantics on New Consumer's 
Subscription Change

This PR changes topic subscription semantics so a change in subscription 
does not immediately cause a rebalance.
Instead, the next poll or the next scheduled metadata refresh will update 
the assigned partitions.

commit e26a42fff83ea8a3d997ad97d5cc2b5704f1e62b
Author: Vahid Hashemian 
Date:   2016-08-22T23:29:48Z

Unit test for subscription change

commit cc9c4c4750addbc7a99588ff69e669b96a18d361
Author: Vahid Hashemian 
Date:   2016-08-24T06:22:05Z

Clean up KafkaConsumerTest.java and add reusable methods

commit f60fb52a702a6bcb854e9560548174fc5ad86730
Author: Vahid Hashemian 
Date:   2016-08-26T00:23:22Z

Fix unsubscribe semantics and update unit tests




> KIP-70: Revise Partition Assignment Semantics on New Consumer's Subscription 
> Change
> ---
>
> Key: KAFKA-4033
> URL: https://issues.apache.org/jira/browse/KAFKA-4033
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vahid Hashemian
>Assignee: Vahid Hashemian
>
> Modify the new consumer's implementation of topics subscribe and unsubscribe 
> interfaces so that they do not cause an immediate assignment update (this is 
> how the regex subscribe interface is implemented). Instead, the assignment 
> remains valid until it has been revoked in the next rebalance.
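
For illustration, a minimal usage sketch of the revised semantics (not part of 
the PR), assuming a running broker at a placeholder address and two existing 
topics: after this change, the second subscribe() call only records the new 
subscription, and the assignment is revised on the next poll() or scheduled 
metadata refresh rather than immediately.

{code}
import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public final class SubscriptionChangeExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "HOST-1:9092"); // placeholder
        props.setProperty(ConsumerConfig.GROUP_ID_CONFIG, "example-group");

        try (KafkaConsumer<String, String> consumer =
                 new KafkaConsumer<>(props, new StringDeserializer(), new StringDeserializer())) {
            consumer.subscribe(Arrays.asList("topic-a"));
            consumer.poll(1000);                       // joins the group, gets partitions of topic-a

            // Changing the subscription no longer triggers an immediate rebalance...
            consumer.subscribe(Arrays.asList("topic-b"));
            // ...the assignment is only revised on the next poll / metadata refresh.
            consumer.poll(1000);
        }
    }
}
{code}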



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4033) KIP-70: Revise Partition Assignment Semantics on New Consumer's Subscription Change

2016-09-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469155#comment-15469155
 ] 

ASF GitHub Bot commented on KAFKA-4033:
---

Github user vahidhashemian closed the pull request at:

https://github.com/apache/kafka/pull/1726


> KIP-70: Revise Partition Assignment Semantics on New Consumer's Subscription 
> Change
> ---
>
> Key: KAFKA-4033
> URL: https://issues.apache.org/jira/browse/KAFKA-4033
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vahid Hashemian
>Assignee: Vahid Hashemian
>
> Modify the new consumer's implementation of topics subscribe and unsubscribe 
> interfaces so that they do not cause an immediate assignment update (this is 
> how the regex subscribe interface is implemented). Instead, the assignment 
> remains valid until it has been revoked in the next rebalance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1726: KAFKA-4033: KIP-70: Revise Partition Assignment Se...

2016-09-06 Thread vahidhashemian
GitHub user vahidhashemian reopened a pull request:

https://github.com/apache/kafka/pull/1726

KAFKA-4033: KIP-70: Revise Partition Assignment Semantics on New Consumer's 
Subscription Change

This PR changes topic subscription semantics so a change in subscription 
does not immediately cause a rebalance.
Instead, the next poll or the next scheduled metadata refresh will update 
the assigned partitions.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vahidhashemian/kafka KAFKA-4033

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1726.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1726


commit 09d35a5f8b5b91c9d71de345c14ccf8c8ef21876
Author: Vahid Hashemian 
Date:   2016-08-22T18:45:52Z

KAFKA-4033: KIP-70: Revise Partition Assignment Semantics on New Consumer's 
Subscription Change

This PR changes topic subscription semantics so a change in subscription 
does not immediately cause a rebalance.
Instead, the next poll or the next scheduled metadata refresh will update 
the assigned partitions.

commit e26a42fff83ea8a3d997ad97d5cc2b5704f1e62b
Author: Vahid Hashemian 
Date:   2016-08-22T23:29:48Z

Unit test for subscription change

commit cc9c4c4750addbc7a99588ff69e669b96a18d361
Author: Vahid Hashemian 
Date:   2016-08-24T06:22:05Z

Clean up KafkaConsumerTest.java and add reusable methods

commit f60fb52a702a6bcb854e9560548174fc5ad86730
Author: Vahid Hashemian 
Date:   2016-08-26T00:23:22Z

Fix unsubscribe semantics and update unit tests




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #1726: KAFKA-4033: KIP-70: Revise Partition Assignment Se...

2016-09-06 Thread vahidhashemian
Github user vahidhashemian closed the pull request at:

https://github.com/apache/kafka/pull/1726


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #1672: MINOR: More graceful handling of buffers that are ...

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1672


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3937) Kafka Clients Leak Native Memory For Longer Than Needed With Compressed Messages

2016-09-06 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469078#comment-15469078
 ] 

Ismael Juma commented on KAFKA-3937:


[~williamyu], we occasionally commit to the `0.9.0` branch, but there is no set 
plan for a 0.9.0.2 release. It doesn't mean it won't happen, but resources at 
the moment are mostly focused on 0.10.1.0.

> Kafka Clients Leak Native Memory For Longer Than Needed With Compressed 
> Messages
> 
>
> Key: KAFKA-3937
> URL: https://issues.apache.org/jira/browse/KAFKA-3937
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8.2.2, 0.9.0.1, 0.10.0.0
> Environment: Linux, latest oracle java-8
>Reporter: Tom Crayford
>Assignee: William Yu
>Priority: Minor
> Fix For: 0.10.1.0
>
>
> In https://issues.apache.org/jira/browse/KAFKA-3933, we discovered that 
> brokers can crash when performing log recovery, as they leak native memory 
> whilst decompressing compressed segments, and that native memory isn't 
> cleaned up rapidly enough by garbage collection and finalizers. The work to 
> fix that in the brokers is taking part in 
> https://github.com/apache/kafka/pull/1598. As part of that PR, Ismael Juma 
> asked me to fix similar issues in the client. Rather than have one large PR 
> that fixes everything, I'd rather break this work up into separate things, so 
> I'm filing this JIRA to track the followup work. I should get to a PR on this 
> at some point relatively soon, once the other PR has landed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
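
As background for readers, here is a small, generic Java sketch (not the actual fix in the PR referenced above) of the general direction: close decompression streams eagerly with try-with-resources instead of relying on GC and finalizers to release the native buffers they hold.

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

public class EagerDecompressorClose {
    // Counts the decompressed bytes; the inflater's native memory is released
    // as soon as the try-with-resources block exits, not when the GC runs.
    static int decompressedSize(byte[] compressed) throws IOException {
        int n = 0;
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            while (in.read() != -1) {
                n++;
            }
        }
        return n;
    }
}
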


[jira] [Commented] (KAFKA-3937) Kafka Clients Leak Native Memory For Longer Than Needed With Compressed Messages

2016-09-06 Thread William Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469051#comment-15469051
 ] 

William Yu commented on KAFKA-3937:
---

[~ijuma] So, we don't plan on upgrading to kafka 0.10.x anytime in the near 
future as we still have to get all our services onto the kafka 0.9.x client 
from 0.8.2. Is Kafka 0.9.x still being patched and released? If it is, then 
I can submit a patch for 0.9. Thanks. 

> Kafka Clients Leak Native Memory For Longer Than Needed With Compressed 
> Messages
> 
>
> Key: KAFKA-3937
> URL: https://issues.apache.org/jira/browse/KAFKA-3937
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.8.2.2, 0.9.0.1, 0.10.0.0
> Environment: Linux, latest oracle java-8
>Reporter: Tom Crayford
>Assignee: William Yu
>Priority: Minor
> Fix For: 0.10.1.0
>
>
> In https://issues.apache.org/jira/browse/KAFKA-3933, we discovered that 
> brokers can crash when performing log recovery, as they leak native memory 
> whilst decompressing compressed segments, and that native memory isn't 
> cleaned up rapidly enough by garbage collection and finalizers. The work to 
> fix that in the brokers is taking part in 
> https://github.com/apache/kafka/pull/1598. As part of that PR, Ismael Juma 
> asked me to fix similar issues in the client. Rather than have one large PR 
> that fixes everything, I'd rather break this work up into separate things, so 
> I'm filing this JIRA to track the followup work. I should get to a PR on this 
> at some point relatively soon, once the other PR has landed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1826: KAFKA-4129: Processor throw exception when getting...

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1826


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-4129) Processor throw exception when getting channel remote address after closing the channel

2016-09-06 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4129:
---
   Resolution: Fixed
Fix Version/s: 0.10.1.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 1826
[https://github.com/apache/kafka/pull/1826]

> Processor throw exception when getting channel remote address after closing 
> the channel
> ---
>
> Key: KAFKA-4129
> URL: https://issues.apache.org/jira/browse/KAFKA-4129
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.1
>Reporter: TAO XIAO
>Assignee: TAO XIAO
> Fix For: 0.10.1.0
>
>
> In Processor {{configureNewConnections()}} catch block, it explicitly closes 
> {{channel}} before calling {{channel.getRemoteAddress}} which results in 
> {{ClosedChannelException}} being thrown. This is due to the Java implementation, 
> in which no remote address can be returned after the channel is closed
> {code}
> case NonFatal(e) =>
>  // need to close the channel here to avoid a socket leak.
>  close(channel)
>  error(s"Processor $id closed connection from 
> ${channel.getRemoteAddress}", e)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
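
To illustrate the fix direction in plain Java NIO (the actual change is in the Scala Processor code via pull request 1826; the helper name below is hypothetical): read the remote address before closing the channel, since getRemoteAddress() cannot be called on a closed channel.

import java.io.IOException;
import java.net.SocketAddress;
import java.nio.channels.SocketChannel;

public class SafeChannelClose {
    static void closeAndLog(SocketChannel channel, Throwable cause) {
        SocketAddress remote = null;
        try {
            remote = channel.getRemoteAddress(); // must happen before close()
        } catch (IOException e) {
            // the channel may already be unusable; keep remote as null
        }
        try {
            channel.close();
        } catch (IOException e) {
            // ignore: we are already on an error path
        }
        System.err.println("Processor closed connection from " + remote + ": " + cause);
    }
}
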


[jira] [Commented] (KAFKA-4129) Processor throw exception when getting channel remote address after closing the channel

2016-09-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469049#comment-15469049
 ] 

ASF GitHub Bot commented on KAFKA-4129:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1826


> Processor throw exception when getting channel remote address after closing 
> the channel
> ---
>
> Key: KAFKA-4129
> URL: https://issues.apache.org/jira/browse/KAFKA-4129
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.1
>Reporter: TAO XIAO
>Assignee: TAO XIAO
> Fix For: 0.10.1.0
>
>
> In Processor {{configureNewConnections()}} catch block, it explicitly closes 
> {{channel}} before calling {{channel.getRemoteAddress}} which results in 
> {{ClosedChannelException}} being thrown. This is due to the Java implementation, 
> in which no remote address can be returned after the channel is closed
> {code}
> case NonFatal(e) =>
>  // need to close the channel here to avoid a socket leak.
>  close(channel)
>  error(s"Processor $id closed connection from 
> ${channel.getRemoteAddress}", e)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1801: MINOR: Include TopicPartition in warning when log ...

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1801


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [ANNOUNCE] New committer: Jason Gustafson

2016-09-06 Thread Becket Qin
Congrats, Jason!

On Tue, Sep 6, 2016 at 5:09 PM, Onur Karaman 
wrote:

> congrats jason!
>
> On Tue, Sep 6, 2016 at 4:12 PM, Sriram Subramanian 
> wrote:
>
> > Congratulations Jason!
> >
> > On Tue, Sep 6, 2016 at 3:40 PM, Vahid S Hashemian <
> > vahidhashem...@us.ibm.com
> > > wrote:
> >
> > > Congratulations Jason on this very well deserved recognition.
> > >
> > > --Vahid
> > >
> > >
> > >
> > > From:   Neha Narkhede 
> > > To: "dev@kafka.apache.org" ,
> > > "us...@kafka.apache.org" 
> > > Cc: "priv...@kafka.apache.org" 
> > > Date:   09/06/2016 03:26 PM
> > > Subject:[ANNOUNCE] New committer: Jason Gustafson
> > >
> > >
> > >
> > > The PMC for Apache Kafka has invited Jason Gustafson to join as a
> > > committer and
> > > we are pleased to announce that he has accepted!
> > >
> > > Jason has contributed numerous patches to a wide range of areas,
> notably
> > > within the new consumer and the Kafka Connect layers. He has displayed
> > > great taste and judgement which has been apparent through his
> involvement
> > > across the board from mailing lists, JIRA, code reviews to contributing
> > > features, bug fixes and code and documentation improvements.
> > >
> > > Thank you for your contribution and welcome to Apache Kafka, Jason!
> > > --
> > > Thanks,
> > > Neha
> > >
> > >
> > >
> > >
> > >
> >
>


Build failed in Jenkins: kafka-trunk-jdk7 #1518

2016-09-06 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] MINOR: changes embedded broker time to MockTime

--
[...truncated 1484 lines...]
kafka.api.SslEndToEndAuthorizationTest > testNoProduceAcl STARTED

kafka.api.SslEndToEndAuthorizationTest > testNoProduceAcl PASSED

kafka.api.SslEndToEndAuthorizationTest > testNoGroupAcl STARTED

kafka.api.SslEndToEndAuthorizationTest > testNoGroupAcl PASSED

kafka.api.SslEndToEndAuthorizationTest > testProduceConsumeViaSubscribe STARTED

kafka.api.SslEndToEndAuthorizationTest > testProduceConsumeViaSubscribe PASSED

kafka.api.SaslSslConsumerTest > testCoordinatorFailover STARTED

kafka.api.SaslSslConsumerTest > testCoordinatorFailover PASSED

kafka.api.SaslSslConsumerTest > testSimpleConsumption STARTED

kafka.api.SaslSslConsumerTest > testSimpleConsumption PASSED

kafka.api.test.ProducerCompressionTest > testCompression[0] STARTED

kafka.api.test.ProducerCompressionTest > testCompression[0] PASSED

kafka.api.test.ProducerCompressionTest > testCompression[1] STARTED

kafka.api.test.ProducerCompressionTest > testCompression[1] PASSED

kafka.api.test.ProducerCompressionTest > testCompression[2] STARTED

kafka.api.test.ProducerCompressionTest > testCompression[2] PASSED

kafka.api.test.ProducerCompressionTest > testCompression[3] STARTED

kafka.api.test.ProducerCompressionTest > testCompression[3] PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoConsumeAcl STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoConsumeAcl PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testProduceConsumeViaAssign 
STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testProduceConsumeViaAssign 
PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoProduceAcl STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoProduceAcl PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoGroupAcl STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > testNoGroupAcl PASSED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testProduceConsumeViaSubscribe STARTED

kafka.api.SaslPlainSslEndToEndAuthorizationTest > 
testProduceConsumeViaSubscribe PASSED

kafka.api.SslConsumerTest > testCoordinatorFailover STARTED

kafka.api.SslConsumerTest > testCoordinatorFailover PASSED

kafka.api.SslConsumerTest > testSimpleConsumption STARTED

kafka.api.SslConsumerTest > testSimpleConsumption PASSED

kafka.api.ProducerFailureHandlingTest > testCannotSendToInternalTopic STARTED

kafka.api.ProducerFailureHandlingTest > testCannotSendToInternalTopic PASSED

kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckOne STARTED

kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckOne PASSED

kafka.api.ProducerFailureHandlingTest > testWrongBrokerList STARTED

kafka.api.ProducerFailureHandlingTest > testWrongBrokerList PASSED

kafka.api.ProducerFailureHandlingTest > testNotEnoughReplicas STARTED

kafka.api.ProducerFailureHandlingTest > testNotEnoughReplicas PASSED

kafka.api.ProducerFailureHandlingTest > testNonExistentTopic STARTED

kafka.api.ProducerFailureHandlingTest > testNonExistentTopic PASSED

kafka.api.ProducerFailureHandlingTest > testInvalidPartition STARTED

kafka.api.ProducerFailureHandlingTest > testInvalidPartition PASSED

kafka.api.ProducerFailureHandlingTest > testSendAfterClosed STARTED

kafka.api.ProducerFailureHandlingTest > testSendAfterClosed PASSED

kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckZero STARTED

kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckZero PASSED

kafka.api.ProducerFailureHandlingTest > 
testNotEnoughReplicasAfterBrokerShutdown STARTED

kafka.api.ProducerFailureHandlingTest > 
testNotEnoughReplicasAfterBrokerShutdown PASSED

kafka.api.ProducerBounceTest > testBrokerFailure STARTED

kafka.api.ProducerBounceTest > testBrokerFailure PASSED

kafka.api.SaslPlaintextConsumerTest > testCoordinatorFailover STARTED

kafka.api.SaslPlaintextConsumerTest > testCoordinatorFailover PASSED

kafka.api.SaslPlaintextConsumerTest > testSimpleConsumption STARTED

kafka.api.SaslPlaintextConsumerTest > testSimpleConsumption PASSED

kafka.api.SslProducerSendTest > testSendNonCompressedMessageWithCreateTime 
STARTED

kafka.api.SslProducerSendTest > testSendNonCompressedMessageWithCreateTime 
PASSED

kafka.api.SslProducerSendTest > testSendCompressedMessageWithLogAppendTime 
STARTED

kafka.api.SslProducerSendTest > testSendCompressedMessageWithLogAppendTime 
PASSED

kafka.api.SslProducerSendTest > testClose STARTED

kafka.api.SslProducerSendTest > testClose PASSED

kafka.api.SslProducerSendTest > testFlush STARTED

kafka.api.SslProducerSendTest > testFlush PASSED

kafka.api.SslProducerSendTest > testSendToPartition STARTED

kafka.api.SslProducerSendTest > testSendToPartition PASSED

kafka.api.SslProducerSendTest > testSendOffset STARTED

kafka.api.SslProducerSendTest > testSendOffset PASSED


Re: [ANNOUNCE] New committer: Jason Gustafson

2016-09-06 Thread Onur Karaman
congrats jason!

On Tue, Sep 6, 2016 at 4:12 PM, Sriram Subramanian  wrote:

> Congratulations Jason!
>
> On Tue, Sep 6, 2016 at 3:40 PM, Vahid S Hashemian <
> vahidhashem...@us.ibm.com
> > wrote:
>
> > Congratulations Jason on this very well deserved recognition.
> >
> > --Vahid
> >
> >
> >
> > From:   Neha Narkhede 
> > To: "dev@kafka.apache.org" ,
> > "us...@kafka.apache.org" 
> > Cc: "priv...@kafka.apache.org" 
> > Date:   09/06/2016 03:26 PM
> > Subject:[ANNOUNCE] New committer: Jason Gustafson
> >
> >
> >
> > The PMC for Apache Kafka has invited Jason Gustafson to join as a
> > committer and
> > we are pleased to announce that he has accepted!
> >
> > Jason has contributed numerous patches to a wide range of areas, notably
> > within the new consumer and the Kafka Connect layers. He has displayed
> > great taste and judgement which has been apparent through his involvement
> > across the board from mailing lists, JIRA, code reviews to contributing
> > features, bug fixes and code and documentation improvements.
> >
> > Thank you for your contribution and welcome to Apache Kafka, Jason!
> > --
> > Thanks,
> > Neha
> >
> >
> >
> >
> >
>


[jira] [Resolved] (KAFKA-4048) Connect does not support RetriableException consistently for sinks

2016-09-06 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan resolved KAFKA-4048.

Resolution: Not A Problem

Turns out all exceptions from {{task.flush()}} are treated as retriable (see 
{{WorkerSinkTask.commitOffsets()}}), so there is nothing to do here.

> Connect does not support RetriableException consistently for sinks
> --
>
> Key: KAFKA-4048
> URL: https://issues.apache.org/jira/browse/KAFKA-4048
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Shikhar Bhushan
>Assignee: Shikhar Bhushan
>
> We only allow for handling {{RetriableException}} from calls to 
> {{SinkTask.put()}}, but this is something we should support also for 
> {{flush()}}  and arguably also {{open()}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4136) KafkaBasedLog should include offsets it is trying to/successfully read to in log messages

2016-09-06 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-4136:


 Summary: KafkaBasedLog should include offsets it is trying 
to/successfully read to in log messages
 Key: KAFKA-4136
 URL: https://issues.apache.org/jira/browse/KAFKA-4136
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.10.0.1
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
Priority: Minor


Including the offset information will aid in debugging. It will need to be logged 
for all partitions, but the number of partitions is usually not too large, and it 
helps in understanding, after an issue occurs, which values were being used (e.g. 
for configs you can normally get this from DistributedHerder, but for the offsets 
topic it lets you dump the topic and figure out which offset was being used even 
if the connector doesn't log any information itself).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-63: Unify store and downstream caching in streams

2016-09-06 Thread Guozhang Wang
Hi Matthias,

I agree with your concerns of coupling with record forwarding with record
storing in the state store, and my understanding is that this can (and
should) be resolved with the current interface. Here are my thoughts:

1. The global cache, MemoryLRUCacheBytes, although currently defined as an
internal class, is exposed in ProcessorContext anyway, so it should really be a
public class that users can access (I have some other comments about the names,
but will rather leave them in the PR).

2. In the Processor API, users can choose to store intermediate results in the
cache and register the flush listener via addDirtyEntryFlushListener (again,
some naming suggestions are in the PR, but let's use it for the discussion for
now). As a result, if the old processor code looks like this:



process(...) {

  state.put(...);
  context.forward(...);
}


Users can now leverage the cache on some of the processors by modifying the
code as follows:



init(...) {

  context.getCache().addDirtyEntryFlushListener(processorName,
{state.put(...); context.forward(...)})
}

process(...) {

  context.getCache().put(processorName, ..);
}



3. Note that whether or not to apply caching is now optional for each processor
node, and is decoupled from its logic of forwarding / storing in persistent
state stores.

One may argue that users who want to make use of the cache will now need to
make code changes; but I think this is a reasonable requirement, since 1)
currently we do one update per incoming record, and without code changes this
behavior will be preserved, and 2) for the DSL implementation, we can just
follow the above pattern to abstract it from users, so they can pick up these
changes automatically.
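
To make that pattern concrete, a self-contained Java sketch follows; the class and method names are hypothetical and not the actual KIP-63 API. Later puts for the same key overwrite earlier ones (the dedup), and the dirty-entry flush listener is where the state.put(...) and context.forward(...) calls from the pattern above would run.

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

class HypotheticalProcessorCache<K, V> {
    private final int maxDirtyEntries;
    private final BiConsumer<K, V> dirtyEntryFlushListener; // e.g. state.put(...) then context.forward(...)
    private final LinkedHashMap<K, V> dirty = new LinkedHashMap<>();

    HypotheticalProcessorCache(int maxDirtyEntries, BiConsumer<K, V> dirtyEntryFlushListener) {
        this.maxDirtyEntries = maxDirtyEntries;
        this.dirtyEntryFlushListener = dirtyEntryFlushListener;
    }

    void put(K key, V value) {
        dirty.put(key, value);                // a later update for the same key overwrites the earlier one
        if (dirty.size() > maxDirtyEntries) {
            flush();                          // bound memory by flushing everything downstream
        }
    }

    void flush() {                            // also called on commit
        for (Map.Entry<K, V> entry : dirty.entrySet()) {
            dirtyEntryFlushListener.accept(entry.getKey(), entry.getValue());
        }
        dirty.clear();
    }
}
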


Guozhang


On Tue, Sep 6, 2016 at 7:41 AM, Eno Thereska  wrote:

> A small update to the KIP: the deduping of records using the cache does
> not affect the .to operator since we'd have already deduped the KTable
> before the operator. Adjusting KIP.
>
> Thanks
> Eno
>
> > On 5 Sep 2016, at 12:43, Eno Thereska  wrote:
> >
> > Hi Matthias,
> >
> > The motivation for KIP-63 was primarily aggregates and reducing the load
> on "both" state stores and downstream. I think there is agreement that for
> the DSL the motivation and design make sense.
> >
> > For the Processor API: caching is a major component in any system, and
> it is difficult to continue to operate as before, without fully
> understanding the consequences. Hence, I think this is mostly a case of
> educating users to understand the boundaries of the solution.
> >
> > Introducing a cache, either for the state store only, or for downstream
> forwarding only, or for both, leads to moving from a model where we process
> each request end-to-end (today) to one where a request is temporarily
> buffered in a cache. In all the cases, this opens up the question of what
> to do next once the request then leaves the cache, and how to express that
> (future) behaviour. E.g., even when the cache is just for downstream
> forwarding (i.e., decoupled from any state store), the processor API user
> might be surprised that context.forward() does not immediately do anything.
> >
> > I agree that for ultra-flexibility, a processor API user should be able
> to choose whether the dedup cache is put 1) on top of a store only, 2) on
> forward only, 3) on both store and forward, but given the motivation for
> KIP-63 (aggregates), I believe a decoupled store-forward dedup cache is a
> reasonable choice that provides good default behaviour, without prodding
> the user to specify the combinations.
> >
> > We need to educate users that if a cache is used in the Processor API,
> the forwarding will happen in the future.
> >
> > -Eno
> >
> >
> >
> >> On 4 Sep 2016, at 19:11, Matthias J. Sax  wrote:
> >>
> >>> Processor code should always work; independently if caching is enabled
> >> or not.
> >>
> >> If we want to get this, I guess we need a quite different design (see
> (1)).
> >>
> >> The point is, that we want to dedup the output, and not state updates.
> >>
> >> It just happens that our starting point was KTable, for which state
> >> updates and downstream changelog output is the same thing. Thus, we can
> >> just use the internal KTable state to do the deduplication for the
> >> downstream changelog.
> >>
> >> However, from a general point of view (Processor API view), if we dedup
> >> the output, we want dedup/caching for the processor (and not for a state
> >> store). Of course, we need a state to do the dedup. For KTable, both
> >> things merge into a single abstraction, and we use only a single state
> >> instead of two. From a general point of view, we would need two states
> >> though (one for the actual state, and one for dedup -- think Processor
> >> API -- not DSL).
> >>
> >>
> >> Alternative proposal 1:
> >> 

Re: [VOTE] KIP-78: Cluster Id

2016-09-06 Thread Gwen Shapira
+1 (binding)

Thank you for the KIP and discussion. Looking forward to the PR and
future improvements.

On Tue, Sep 6, 2016 at 4:11 PM, Ismael Juma  wrote:
> Hi all,
>
> I would like to initiate the voting process for KIP-78: Cluster Id:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-78%3A+Cluster+Id
>
> As explained in the KIP and discussion thread, we see this as a good first
> step that can serve as a foundation for future improvements.
>
> Thanks,
> Ismael



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog


Agenda item for next KIP meeting

2016-09-06 Thread Vahid S Hashemian
Hi,

I'd like to add KIP-54 to the agenda for next KIP meeting.
This KIP has been in discussion phase for a long time, and it would be 
nice to have an online discussion about it, collect additional feedback, 
and move forward, if possible.

Thanks.
 
Regards,
--Vahid



Re: [VOTE] KIP-78: Cluster Id

2016-09-06 Thread Sriram Subramanian
+1 (binding)

On Tue, Sep 6, 2016 at 4:11 PM, Ismael Juma  wrote:

> Hi all,
>
> I would like to initiate the voting process for KIP-78: Cluster Id:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-78%3A+Cluster+Id
>
> As explained in the KIP and discussion thread, we see this as a good first
> step that can serve as a foundation for future improvements.
>
> Thanks,
> Ismael
>


Re: [ANNOUNCE] New committer: Jason Gustafson

2016-09-06 Thread Sriram Subramanian
Congratulations Jason!

On Tue, Sep 6, 2016 at 3:40 PM, Vahid S Hashemian  wrote:

> Congratulations Jason on this very well deserved recognition.
>
> --Vahid
>
>
>
> From:   Neha Narkhede 
> To: "dev@kafka.apache.org" ,
> "us...@kafka.apache.org" 
> Cc: "priv...@kafka.apache.org" 
> Date:   09/06/2016 03:26 PM
> Subject:[ANNOUNCE] New committer: Jason Gustafson
>
>
>
> The PMC for Apache Kafka has invited Jason Gustafson to join as a
> committer and
> we are pleased to announce that he has accepted!
>
> Jason has contributed numerous patches to a wide range of areas, notably
> within the new consumer and the Kafka Connect layers. He has displayed
> great taste and judgement which has been apparent through his involvement
> across the board from mailing lists, JIRA, code reviews to contributing
> features, bug fixes and code and documentation improvements.
>
> Thank you for your contribution and welcome to Apache Kafka, Jason!
> --
> Thanks,
> Neha
>
>
>
>
>


[VOTE] KIP-78: Cluster Id

2016-09-06 Thread Ismael Juma
Hi all,

I would like to initiate the voting process for KIP-78: Cluster Id:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-78%3A+Cluster+Id

As explained in the KIP and discussion thread, we see this as a good first
step that can serve as a foundation for future improvements.

Thanks,
Ismael


Re: [DISCUSS] KIP-78: Cluster Id

2016-09-06 Thread Ismael Juma
On Mon, Sep 5, 2016 at 10:41 PM, Ismael Juma  wrote:

> I will update the KIP to make this point clearer.
>

Done.

Ismael


Re: [ANNOUNCE] New committer: Jason Gustafson

2016-09-06 Thread Vahid S Hashemian
Congratulations Jason on this very well deserved recognition.

--Vahid



From:   Neha Narkhede 
To: "dev@kafka.apache.org" , 
"us...@kafka.apache.org" 
Cc: "priv...@kafka.apache.org" 
Date:   09/06/2016 03:26 PM
Subject:[ANNOUNCE] New committer: Jason Gustafson



The PMC for Apache Kafka has invited Jason Gustafson to join as a 
committer and
we are pleased to announce that he has accepted!

Jason has contributed numerous patches to a wide range of areas, notably
within the new consumer and the Kafka Connect layers. He has displayed
great taste and judgement which has been apparent through his involvement
across the board from mailing lists, JIRA, code reviews to contributing
features, bug fixes and code and documentation improvements.

Thank you for your contribution and welcome to Apache Kafka, Jason!
-- 
Thanks,
Neha






[jira] [Commented] (KAFKA-4132) Make our upgrade docs more factual and less scary

2016-09-06 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468842#comment-15468842
 ] 

Gwen Shapira commented on KAFKA-4132:
-

Assigning to [~iwrigley], as he volunteered off-list. I hope no one was working 
on it yet...

> Make our upgrade docs more factual and less scary
> -
>
> Key: KAFKA-4132
> URL: https://issues.apache.org/jira/browse/KAFKA-4132
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Gwen Shapira
>Assignee: Ian Wrigley
>
> I got some feedback that our upgrade docs currently scare people away from 
> upgrading Kafka. One reason is that the first line is "We have breaking 
> changes!".
> This was partially intentional (we were trying to scare people into reading 
> docs), but I think we scared people into staying on older versions.
> I think we need to break upgrade into two sections:
> 1. Upgrade section of design docs: should include our awesome backward 
> compatibility story and highlight changes, new features and compelling bug 
> fixes (breaking and non-breaking). It should also emphasize the importance of 
> following the upgrade docs to the letter.
> 2. The actual upgrade instructions should move to the Ops section of the docs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4132) Make our upgrade docs more factual and less scary

2016-09-06 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated KAFKA-4132:

Assignee: Ian Wrigley

> Make our upgrade docs more factual and less scary
> -
>
> Key: KAFKA-4132
> URL: https://issues.apache.org/jira/browse/KAFKA-4132
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Gwen Shapira
>Assignee: Ian Wrigley
>
> I got some feedback that our upgrade docs currently scare people away from 
> upgrading Kafka. One reason is that the first line is "We have breaking 
> changes!".
> This was partially intentional (we were trying to scare people into reading 
> docs), but I think we scared people into staying on older versions.
> I think we need to break upgrade into two sections:
> 1. Upgrade section of design docs: should include our awesome backward 
> compatibility story and highlight changes, new features and compelling bug 
> fixes (breaking and non-breaking). It should also emphasize the importance of 
> following the upgrade docs to the letter.
> 2. The actual upgrade instructions should move to the Ops section of the docs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1808: MINOR: changes embedded broker time to MockTime

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1808


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [ANNOUNCE] New committer: Jason Gustafson

2016-09-06 Thread Ismael Juma
Congratulations Jason! Well-deserved. :)

Ismael

On Tue, Sep 6, 2016 at 11:25 PM, Neha Narkhede  wrote:

> The PMC for Apache Kafka has invited Jason Gustafson to join as a
> committer and
> we are pleased to announce that he has accepted!
>
> Jason has contributed numerous patches to a wide range of areas, notably
> within the new consumer and the Kafka Connect layers. He has displayed
> great taste and judgement which has been apparent through his involvement
> across the board from mailing lists, JIRA, code reviews to contributing
> features, bug fixes and code and documentation improvements.
>
> Thank you for your contribution and welcome to Apache Kafka, Jason!
> --
> Thanks,
> Neha
>


[jira] [Commented] (KAFKA-3478) Finer Stream Flow Control

2016-09-06 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468825#comment-15468825
 ] 

Guozhang Wang commented on KAFKA-3478:
--

[~mjsax] Since separate JIRAs have been created for these two separate goals, 
could we close this ticket then, or do you think there are still some 
additional feature requests that are covered only in this ticket?

> Finer Stream Flow Control
> -
>
> Key: KAFKA-3478
> URL: https://issues.apache.org/jira/browse/KAFKA-3478
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Guozhang Wang
>  Labels: user-experience
> Fix For: 0.10.1.0
>
>
> Today we have a event-time based flow control mechanism in order to 
> synchronize multiple input streams in a best effort manner:
> http://docs.confluent.io/3.0.0/streams/architecture.html#flow-control-with-timestamps
> However, there are some use cases where users would like to have finer 
> control of the input streams, for example, with two input streams, one of 
> them always reading from offset 0 upon (re)-starting, and the other reading 
> for log end offset.
> Today we only have one consumer config "offset.auto.reset" to control that 
> behavior, which means all streams are read either from "earliest" or "latest".
> We should consider how to improve this settings to allow users have finer 
> control over these frameworks.
> =
> A finer flow control could also be used to allow for populating a {{KTable}} 
> (with an "initial" state) before starting the actual processing (this feature 
> was ask for in the mailing list multiple times already). Even if it is quite 
> hard to define, *when* the initial populating phase should end, this might 
> still be useful. There would be the following possibilities:
>  1) an initial fixed time period for populating
>(it might be hard for a user to estimate the correct value)
>  2) an "idle" period, ie, if no update to a KTable for a certain time is
> done, we consider it as populated
>  3) a timestamp cut off point, ie, all records with an older timestamp
> belong to the initial populating phase
>  4) a throughput threshold, ie, if the populating frequency falls below
> the threshold, the KTable is considered "finished"
>  5) maybe something else ??
> The API might look something like this
> {noformat}
> KTable table = builder.table("topic", 1000); // populate the table without 
> reading any other topics until see one record with timestamp 1000.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [ANNOUNCE] New committer: Jason Gustafson

2016-09-06 Thread Gwen Shapira
Well deserved, Jason. Looking forward to your future contributions :)

On Tue, Sep 6, 2016 at 3:29 PM, Guozhang Wang  wrote:
> Welcome, and really happy to have you onboard Jason!
>
>
> Guozhang
>
> On Tue, Sep 6, 2016 at 3:25 PM, Neha Narkhede  wrote:
>
>> The PMC for Apache Kafka has invited Jason Gustafson to join as a
>> committer and
>> we are pleased to announce that he has accepted!
>>
>> Jason has contributed numerous patches to a wide range of areas, notably
>> within the new consumer and the Kafka Connect layers. He has displayed
>> great taste and judgement which has been apparent through his involvement
>> across the board from mailing lists, JIRA, code reviews to contributing
>> features, bug fixes and code and documentation improvements.
>>
>> Thank you for your contribution and welcome to Apache Kafka, Jason!
>> --
>> Thanks,
>> Neha
>>
>
>
>
> --
> -- Guozhang


Re: [ANNOUNCE] New committer: Jason Gustafson

2016-09-06 Thread Jun Rao
Congratulations, Jason!

Thanks,

Jun

On Tue, Sep 6, 2016 at 3:25 PM, Neha Narkhede  wrote:

> The PMC for Apache Kafka has invited Jason Gustafson to join as a
> committer and
> we are pleased to announce that he has accepted!
>
> Jason has contributed numerous patches to a wide range of areas, notably
> within the new consumer and the Kafka Connect layers. He has displayed
> great taste and judgement which has been apparent through his involvement
> across the board from mailing lists, JIRA, code reviews to contributing
> features, bug fixes and code and documentation improvements.
>
> Thank you for your contribution and welcome to Apache Kafka, Jason!
> --
> Thanks,
> Neha
>


Re: [ANNOUNCE] New committer: Jason Gustafson

2016-09-06 Thread Guozhang Wang
Welcome, and really happy to have you onboard Jason!


Guozhang

On Tue, Sep 6, 2016 at 3:25 PM, Neha Narkhede  wrote:

> The PMC for Apache Kafka has invited Jason Gustafson to join as a
> committer and
> we are pleased to announce that he has accepted!
>
> Jason has contributed numerous patches to a wide range of areas, notably
> within the new consumer and the Kafka Connect layers. He has displayed
> great taste and judgement which has been apparent through his involvement
> across the board from mailing lists, JIRA, code reviews to contributing
> features, bug fixes and code and documentation improvements.
>
> Thank you for your contribution and welcome to Apache Kafka, Jason!
> --
> Thanks,
> Neha
>



-- 
-- Guozhang


[jira] [Commented] (KAFKA-4113) Allow KTable bootstrap

2016-09-06 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468817#comment-15468817
 ] 

Guozhang Wang commented on KAFKA-4113:
--

More thoughts regarding this issue after discussing with [~jkreps] offline:

1. Within a single stream, records may not be ordered by their timestamps. And 
since Kafka Streams always tries to process records in the source topic's 
log-append order, this results in so-called {{late arriving records}}.

2. When a single Kafka Streams task has multiple input streams, it will try to 
"synchronize" these streams based on their timestamps in a best-effort manner. 
Therefore we may process a record from one stream with higher timestamp before 
processing another record from another stream with lower timestamp.

3. For table-stream joins, ideally we want to process records strictly 
following the timestamp ordering across multiple streams; in that case, I 
believe users do NOT really need to {{bootstrap a KTable}}, since as long as the 
timestamps of the stream and the table changelog are defined correctly, we are 
guaranteed to have the correct table snapshot when joining with the stream. For 
example, say your table is updated only once a week, with the update's 
timestamp defined as the end of the week (EOW); then when you are (re-)starting 
the application, it is guaranteed that any stream records whose timestamps fall 
before the EOW time will be joined against the "old" table snapshot, and any 
stream records whose timestamps fall after the EOW time will be joined against 
the "updated" table snapshot, which is the right behavior.

4. However, because of 1) and 2) above we are not strictly following the 
timestamp ordering, so the join result is not guaranteed to be "correct" or 
deterministic. In addition, note that the above reasoning assumes that the 
timestamps on both streams are defined in a consistent manner, for example if 
both records are generated by the same application using the same clock to set 
the timestamps; otherwise, for example if the joining streams are not produced 
by the same application or service and there is a time drift between their 
clocks, then even {{strictly following the timestamp-based ordering}} may still 
not generate the correct result, and scenarios like the one [~mjsax] mentioned 
can occur, where a KTable record may not be available when the KStream record 
arrives and is to be enriched with the KTable, if its timestamp is indeed 
defined to be smaller than the corresponding KTable record's timestamp.

5. Therefore, users propose to {{bootstrap a KTable}} mainly as a way to give 
the table's changelog stream a bit of an "advantage" in the timestamp-based 
ordering, so that its records are more likely to be processed before records 
from the other stream with similar timestamps. On the other hand, because of 4) 
above I think it is very hard, or even impossible, to get absolutely "correct" 
answers; at best we can get deterministic answers (that is also the motivation 
for using a window retention period in Kafka Streams, and for watermarks / 
hints indicating there are no more late records, along with triggering 
mechanisms, in other frameworks I think).

Following these arguments, here is a list of proposals I'm thinking about to 
tackle this requirement:

1. Give users the flexibility to define ordering priorities across multiple 
streams in determining what is the next record to process (i.e. "synchronize" 
them). There are different ways to expose this API; for example, Samza uses a 
{{MessageChooser}} user-customizable interface.

2. More restrictive than 1) above: we only allow users to specify an amount of 
time by which one stream should run a little "in advance" of the other streams, 
such that its records with timestamp {{t + delta}}, where {{delta}} is 
configurable, are considered at the same time as the other streams' records 
with time {{t}}. I think most of the options proposed in the description of 
this ticket fall into this category.

3. Different from proposals 1) / 2) above: we change the implementation of 
KTable-KStream joins to also materialize the KStream based on a sliding window, 
so that when a record from the KStream arrives, it tries to join with the 
current snapshot of the KTable via its backing state store; and when a record 
from the KTable arrives, it tries to join with the KStream's materialized 
window store against any matching records whose timestamps are smaller than 
that of the KTable's update record.

4. This is complementary to 2) / 3) above: if we do not make the stream 
synchronization mechanism customizable as proposed in 1), then we can at least 
consider making it deterministic, so that all join types will generate 
deterministic results as well.

Thoughts [~mjsax] [~enothereska] [~damianguy]?
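
As a rough, non-authoritative illustration of proposal 2) in plain Java (all names and the delta value are hypothetical): when choosing the next buffered record to process, subtract a configurable delta from the changelog's timestamps so its records win against stream records with similar timestamps.

import java.util.PriorityQueue;

public class StreamSyncSketch {
    static final long DELTA_MS = 500L; // hypothetical "advance" given to the table changelog

    static class Buffered {
        final String stream;
        final long timestamp;
        Buffered(String stream, long timestamp) { this.stream = stream; this.timestamp = timestamp; }
        long effectiveTimestamp() {
            return "table-changelog".equals(stream) ? timestamp - DELTA_MS : timestamp;
        }
        @Override public String toString() { return stream + "@" + timestamp; }
    }

    public static void main(String[] args) {
        PriorityQueue<Buffered> queue = new PriorityQueue<>(
                (a, b) -> Long.compare(a.effectiveTimestamp(), b.effectiveTimestamp()));
        queue.add(new Buffered("stream", 1000L));
        queue.add(new Buffered("table-changelog", 1300L)); // effectively 800, so it is chosen first
        while (!queue.isEmpty()) {
            System.out.println(queue.poll()); // table-changelog@1300, then stream@1000
        }
    }
}
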

> Allow KTable bootstrap
> --
>
> Key: KAFKA-4113
> URL: 

Re: [ANNOUNCE] New committer: Jason Gustafson

2016-09-06 Thread Matthias J. Sax
Congrats Jason!!!

-Matthias

On 09/07/2016 12:25 AM, Neha Narkhede wrote:
> The PMC for Apache Kafka has invited Jason Gustafson to join as a committer 
> and
> we are pleased to announce that he has accepted!
> 
> Jason has contributed numerous patches to a wide range of areas, notably
> within the new consumer and the Kafka Connect layers. He has displayed
> great taste and judgement which has been apparent through his involvement
> across the board from mailing lists, JIRA, code reviews to contributing
> features, bug fixes and code and documentation improvements.
> 
> Thank you for your contribution and welcome to Apache Kafka, Jason!
> 



signature.asc
Description: OpenPGP digital signature


[ANNOUNCE] New committer: Jason Gustafson

2016-09-06 Thread Neha Narkhede
The PMC for Apache Kafka has invited Jason Gustafson to join as a committer and
we are pleased to announce that he has accepted!

Jason has contributed numerous patches to a wide range of areas, notably
within the new consumer and the Kafka Connect layers. He has displayed
great taste and judgement which has been apparent through his involvement
across the board from mailing lists, JIRA, code reviews to contributing
features, bug fixes and code and documentation improvements.

Thank you for your contribution and welcome to Apache Kafka, Jason!
-- 
Thanks,
Neha


[jira] [Created] (KAFKA-4135) Inconsistent javadoc for KafkaConsumer.poll behavior when there is no subscription

2016-09-06 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-4135:
--

 Summary: Inconsistent javadoc for KafkaConsumer.poll behavior when 
there is no subscription
 Key: KAFKA-4135
 URL: https://issues.apache.org/jira/browse/KAFKA-4135
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Reporter: Jason Gustafson
Priority: Minor


Currently, the javadoc for {{KafkaConsumer.poll}} says the following: 
"It is an error to not have subscribed to any topics or partitions before 
polling for data." However, we don't actually raise an exception if this is the 
case. Perhaps we should?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (KAFKA-3779) Add the LRU cache for KTable.to() operator

2016-09-06 Thread Eno Thereska (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eno Thereska reopened KAFKA-3779:
-

> Add the LRU cache for KTable.to() operator
> --
>
> Key: KAFKA-3779
> URL: https://issues.apache.org/jira/browse/KAFKA-3779
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Eno Thereska
> Fix For: 0.10.1.0
>
>
> The KTable.to operator currently does not use a cache. We can add a cache to 
> this operator to deduplicate and reduce data traffic as well. This is to be 
> done after KAFKA-3777.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3779) Add the LRU cache for KTable.to() operator

2016-09-06 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468615#comment-15468615
 ] 

Eno Thereska commented on KAFKA-3779:
-

Sure. Thanks.

> Add the LRU cache for KTable.to() operator
> --
>
> Key: KAFKA-3779
> URL: https://issues.apache.org/jira/browse/KAFKA-3779
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Eno Thereska
> Fix For: 0.10.1.0
>
>
> The KTable.to operator currently does not use a cache. We can add a cache to 
> this operator to deduplicate and reduce data traffic as well. This is to be 
> done after KAFKA-3777.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Jenkins build is back to normal : kafka-trunk-jdk8 #861

2016-09-06 Thread Apache Jenkins Server
See 



[jira] [Comment Edited] (KAFKA-3779) Add the LRU cache for KTable.to() operator

2016-09-06 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468453#comment-15468453
 ] 

Guozhang Wang edited comment on KAFKA-3779 at 9/6/16 8:29 PM:
--

Thanks for the explanation, sounds reasonable to me and we do not need to add 
it now. Just for keeping track  in case we want to add it later, could we keep 
it as open?


was (Author: guozhang):
Thanks for the explanation, LGTM. Just for keeping track  in case we want to 
add it later, could we keep it as open?

> Add the LRU cache for KTable.to() operator
> --
>
> Key: KAFKA-3779
> URL: https://issues.apache.org/jira/browse/KAFKA-3779
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Eno Thereska
> Fix For: 0.10.1.0
>
>
> The KTable.to operator currently does not use a cache. We can add a cache to 
> this operator to deduplicate and reduce data traffic as well. This is to be 
> done after KAFKA-3777.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3779) Add the LRU cache for KTable.to() operator

2016-09-06 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468453#comment-15468453
 ] 

Guozhang Wang commented on KAFKA-3779:
--

Thanks for the explanation, LGTM. Just for keeping track  in case we want to 
add it later, could we keep it as open?

> Add the LRU cache for KTable.to() operator
> --
>
> Key: KAFKA-3779
> URL: https://issues.apache.org/jira/browse/KAFKA-3779
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Eno Thereska
> Fix For: 0.10.1.0
>
>
> The KTable.to operator currently does not use a cache. We can add a cache to 
> this operator to deduplicate and reduce data traffic as well. This is to be 
> done after KAFKA-3777.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3769) KStream job spending 60% of time writing metrics

2016-09-06 Thread Greg Fodor (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468374#comment-15468374
 ] 

Greg Fodor commented on KAFKA-3769:
---

It's just the sensor calls inside of Selector, not Kafka Streams
specific. I'll verify as much as I can from the profiler snapshot that
it's the same issue and will open a jira.



> KStream job spending 60% of time writing metrics
> 
>
> Key: KAFKA-3769
> URL: https://issues.apache.org/jira/browse/KAFKA-3769
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>Assignee: Guozhang Wang
>Priority: Critical
> Fix For: 0.10.1.0
>
>
> I've been profiling a complex streams job, and found two major hotspots when 
> writing metrics, which take up about 60% of the CPU time of the job. (!) A PR 
> is attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3962) ConfigDef support for resource-specific configuration

2016-09-06 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468352#comment-15468352
 ] 

Shikhar Bhushan commented on KAFKA-3962:


This is also realizable today by using {{ConfigDef.originalsWithPrefix()}}, if 
the template format is {{some.property.$resource}} rather than 
{{$resource.some.property}} as suggested above. There isn't framework-level 
validation support when doing this, though.

> ConfigDef support for resource-specific configuration
> -
>
> Key: KAFKA-3962
> URL: https://issues.apache.org/jira/browse/KAFKA-3962
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Shikhar Bhushan
>
> It often comes up with connectors that you want some piece of configuration 
> that should be overridable at the topic-level, table-level, etc.
> The ConfigDef API should allow for defining these resource-overridable config 
> properties and we should have getter variants that accept a resource 
> argument, and return the more specific config value (falling back to the 
> default).
> There are a couple of possible ways to allow for this:
> 1. Support for map-style config properties "resource1:v1,resource2:v2". There 
> are escaping considerations to think through here. Also, how should the user 
> override fallback/default values -- perhaps {{*}} as a special resource?
> 2. Templatized configs -- so you would define {{$resource.some.property}}. 
> The default value is more naturally overridable here, by the user setting 
> {{some.property}} without the {{$resource}} prefix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
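
As a generic illustration of the prefix idea (plain Java maps only, no Kafka classes, so nothing here is the actual ConfigDef API): a resource-specific key such as some.property.<resource> overrides the plain some.property default.

import java.util.HashMap;
import java.util.Map;

public class ResourceOverrideSketch {
    // resource-specific key first, then fall back to the generic default
    static String get(Map<String, String> props, String property, String resource) {
        return props.getOrDefault(property + "." + resource, props.get(property));
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("some.property", "default-value");
        props.put("some.property.topicA", "override-for-topicA");

        System.out.println(get(props, "some.property", "topicA")); // override-for-topicA
        System.out.println(get(props, "some.property", "topicB")); // default-value
    }
}
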


[jira] [Comment Edited] (KAFKA-3962) ConfigDef support for resource-specific configuration

2016-09-06 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468352#comment-15468352
 ] 

Shikhar Bhushan edited comment on KAFKA-3962 at 9/6/16 7:49 PM:


This is also realizable today by using {{ConfigDef.originalsWithPrefix()}}, if 
the template format is {{some.property.$resource}} rather than 
{{$resource.some.property}} as suggested above. There isn't library-level 
validation support when doing this, though.


was (Author: shikhar):
This is also realizable today by using {{ConfigDef.originalsWithPrefix()}}, if 
the template format is {{some.property.$resource}} rather than 
{{$resource.some.property}} as suggested above. There isn't framework-level 
validation support when doing this, though.

> ConfigDef support for resource-specific configuration
> -
>
> Key: KAFKA-3962
> URL: https://issues.apache.org/jira/browse/KAFKA-3962
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Shikhar Bhushan
>
> It often comes up with connectors that you want some piece of configuration 
> that should be overridable at the topic-level, table-level, etc.
> The ConfigDef API should allow for defining these resource-overridable config 
> properties and we should have getter variants that accept a resource 
> argument, and return the more specific config value (falling back to the 
> default).
> There are a couple of possible ways to allow for this:
> 1. Support for map-style config properties "resource1:v1,resource2:v2". There 
> are escaping considerations to think through here. Also, how should the user 
> override fallback/default values -- perhaps {{*}} as a special resource?
> 2. Templatized configs -- so you would define {{$resource.some.property}}. 
> The default value is more naturally overridable here, by the user setting 
> {{some.property}} without the {{$resource}} prefix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3779) Add the LRU cache for KTable.to() operator

2016-09-06 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468313#comment-15468313
 ] 

Eno Thereska commented on KAFKA-3779:
-

I suggest we wait on adding dedup for (4) until we have obtained more 
experience with KIP-63 deployment, and that we keep KIP-63 primarily focused on 
(2) above. For (2) the semantics and expected gains are clear. I'm worried that 
this will be confusing for (4). [~damianguy] what do you think?

> Add the LRU cache for KTable.to() operator
> --
>
> Key: KAFKA-3779
> URL: https://issues.apache.org/jira/browse/KAFKA-3779
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Eno Thereska
> Fix For: 0.10.1.0
>
>
> The KTable.to operator currently does not use a cache. We can add a cache to 
> this operator to deduplicate and reduce data traffic as well. This is to be 
> done after KAFKA-3777.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3779) Add the LRU cache for KTable.to() operator

2016-09-06 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468302#comment-15468302
 ] 

Guozhang Wang commented on KAFKA-3779:
--

Do all KTable changelog streams contain deduped data even after KAFKA-3776? 
Generally speaking, a KTable changelog will only be deduped if it has a 
corresponding state store upon creation. Currently we have the following 
scenarios for creating a KTable.

1. {{builder.table()}} to read from a source topic.
2. aggregation operators that generate a windowed / non-windowed KTable.
3. KTable's non-stateful operators such as {{filter}} that generates a new 
KTable.
4. KTable-KTable join that generate a new KTable.

Today, 1) and 2) above have a state store for the generated KTable, and hence it 
is deduped; for 3), as long as the original KTable is deduped, it will be 
deduped as well; for 4), the resulting KTable is not backed by a state store, 
so it may not be deduped.

Hence the new function {{KTable.getStoreName()}} may still return a null value; 
in this case does it still make sense to add the cache for its {{KTable.to()}} 
function?

> Add the LRU cache for KTable.to() operator
> --
>
> Key: KAFKA-3779
> URL: https://issues.apache.org/jira/browse/KAFKA-3779
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Eno Thereska
> Fix For: 0.10.1.0
>
>
> The KTable.to operator currently does not use a cache. We can add a cache to 
> this operator to deduplicate and reduce data traffic as well. This is to be 
> done after KAFKA-3777.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4043) User-defined handler for topology restart

2016-09-06 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468278#comment-15468278
 ] 

Guozhang Wang commented on KAFKA-4043:
--

Hi [~gfodor], when a rebalance happens and some tasks (with their embedded 
topology) are shutting down because their partitions are reassigned to other 
instances, the library will close the topology by calling `Processor.close()` 
in topology DAG order. As a user you can customize the `close()` function to 
shut down any created background threads, etc. Does this resolve the issue?
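
A hedged sketch of that pattern, assuming the 0.10.x Processor interface (the background work itself is a hypothetical placeholder): close() stops the worker threads so they are not duplicated when the task is re-created after a rebalance.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;

public class BackgroundWorkProcessor implements Processor<String, String> {
    private ExecutorService worker;

    @Override
    public void init(ProcessorContext context) {
        worker = Executors.newSingleThreadExecutor();
    }

    @Override
    public void process(String key, String value) {
        worker.submit(() -> {
            // hypothetical background work driven by the incoming record
        });
    }

    @Override
    public void punctuate(long timestamp) { }

    @Override
    public void close() {
        // called on rebalance or shutdown: stop background threads so they
        // are not duplicated when the task is re-created elsewhere
        worker.shutdownNow();
    }
}
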

> User-defined handler for topology restart
> -
>
> Key: KAFKA-4043
> URL: https://issues.apache.org/jira/browse/KAFKA-4043
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.0.1
>Reporter: Greg Fodor
>Assignee: Guozhang Wang
>
> Since Kafka Streams is just a library, there's a lot of cool stuff we've been 
> able to do that would be trickier if it were part of a larger 
> cluster-oriented job execution system that had assumptions about the 
> semantics of a job. One of the jobs we have uses Kafka Streams to do top 
> level data flow, and then one of our processors actually will kick off 
> background threads to do work based upon the data flow state. Happy to fill 
> in more details of our use-case, but fundamentally the model is that we have 
> a Kafka Streams data flow that is reading state from upstream, and that state 
> dictates that work needs to be done, which results in a dedicated work thread 
> to be spawned by our job.
> This works great, but we're running into an issue when there is partition 
> reassignment, since we have no way to detect this and cleanly shut down these 
> threads. In our case, we'd like to shut down the background worker threads if 
> there is a partition rebalance or if the job raises an exception and attempts 
> to restart. In practice what is happening is we are getting duplicate threads 
> for the same work on a partition rebalance.
> Implementation-wise, this seems like some type of event handler that can be 
> attached to the topology at build time that can will be called when the data 
> flow needs to rebalance or rebuild its task threads in general (ideally 
> passing as much information about the reason along.) I could imagine this 
> being factored similarly to the KafkaStreams#setUncaughtExceptionHandler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4134) Transparently notify users of "Connection Refused" for client to broker connections

2016-09-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468274#comment-15468274
 ] 

ASF GitHub Bot commented on KAFKA-4134:
---

GitHub user cotedm opened a pull request:

https://github.com/apache/kafka/pull/1829

KAFKA-4134: log ConnectException at WARN

Simply log the connection refused instance.  If we're worried about 
spamming users, I can add a flag to make sure we only log this exception once, 
but the initial change is to simply log what we're given.  @ijuma looks like 
you were last to touch this code, would you mind having a look?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cotedm/kafka connectexceptiondebug

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1829.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1829


commit 4ad69909b71db6b8f28b8879bfc508e86b124af8
Author: Dustin Cote 
Date:   2016-09-06T19:15:53Z

log ConnectException at WARN




> Transparently notify users of "Connection Refused" for client to broker 
> connections
> ---
>
> Key: KAFKA-4134
> URL: https://issues.apache.org/jira/browse/KAFKA-4134
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer, producer 
>Affects Versions: 0.10.0.1
>Reporter: Dustin Cote
>Assignee: Dustin Cote
>Priority: Minor
>
> Currently, Producers and Consumers log at the WARN level if the bootstrap 
> server disconnects and if there is an unexpected exception in the network 
> Selector.  However, we log at DEBUG level if an IOException occurs in order 
> to prevent spamming the user with every network hiccup.  This has the side 
> effect of users making initial connections to brokers not getting any 
> feedback if the bootstrap server list is invalid.  For example, if one starts 
> the console producer or consumer up without any brokers running, nothing 
> indicates messages are not being received until the socket timeout is hit.
> I propose we be more granular and log the ConnectException to let the user 
> know their broker(s) are not reachable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1829: KAFKA-4134: log ConnectException at WARN

2016-09-06 Thread cotedm
GitHub user cotedm opened a pull request:

https://github.com/apache/kafka/pull/1829

KAFKA-4134: log ConnectException at WARN

Simply log the connection refused instance.  If we're worried about 
spamming users, I can add a flag to make sure we only log this exception once, 
but the initial change is to simply log what we're given.  @ijuma looks like 
you were last to touch this code, would you mind having a look?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cotedm/kafka connectexceptiondebug

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1829.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1829


commit 4ad69909b71db6b8f28b8879bfc508e86b124af8
Author: Dustin Cote 
Date:   2016-09-06T19:15:53Z

log ConnectException at WARN




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-4134) Transparently notify users of "Connection Refused" for client to broker connections

2016-09-06 Thread Dustin Cote (JIRA)
Dustin Cote created KAFKA-4134:
--

 Summary: Transparently notify users of "Connection Refused" for 
client to broker connections
 Key: KAFKA-4134
 URL: https://issues.apache.org/jira/browse/KAFKA-4134
 Project: Kafka
  Issue Type: Improvement
  Components: consumer, producer 
Affects Versions: 0.10.0.1
Reporter: Dustin Cote
Assignee: Dustin Cote
Priority: Minor


Currently, Producers and Consumers log at the WARN level if the bootstrap 
server disconnects and if there is an unexpected exception in the network 
Selector.  However, we log at DEBUG level if an IOException occurs in order to 
prevent spamming the user with every network hiccup.  This has the side effect 
of users making initial connections to brokers not getting any feedback if the 
bootstrap server list is invalid.  For example, if one starts the console 
producer or consumer up without any brokers running, nothing indicates messages 
are not being received until the socket timeout is hit.

I propose we be more granular and log the ConnectException to let the user know 
their broker(s) are not reachable.
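
For illustration, the proposed distinction sketched with a plain SocketChannel 
rather than Kafka's Selector (class and method names below are illustrative, 
not the actual patch):

import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ConnectLogging {
    private static final Logger log = LoggerFactory.getLogger(ConnectLogging.class);

    static void connect(String host, int port) {
        try (SocketChannel channel = SocketChannel.open()) {
            channel.connect(new InetSocketAddress(host, port));
        } catch (ConnectException e) {
            // "Connection refused": the broker is not reachable at all -- tell the user
            log.warn("Connection to {}:{} could not be established: {}", host, port, e.getMessage());
        } catch (IOException e) {
            // other transient network hiccups stay at DEBUG to avoid log spam
            log.debug("I/O error while connecting to {}:{}", host, port, e);
        }
    }
}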



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4023) Add thread id as prefix in Kafka Streams thread logging

2016-09-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468167#comment-15468167
 ] 

ASF GitHub Bot commented on KAFKA-4023:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1803


> Add thread id as prefix in Kafka Streams thread logging
> ---
>
> Key: KAFKA-4023
> URL: https://issues.apache.org/jira/browse/KAFKA-4023
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Bill Bejeck
>  Labels: newbie++
> Fix For: 0.10.1.0
>
>
> A single Kafka Streams instance can include multiple stream threads, and 
> hence without a logging prefix it is difficult to determine which thread is 
> producing which log entries.
> We should
> 1) add the log-prefix as thread id in StreamThread logger, as well as its 
> contained StreamPartitionAssignor.
> 2) add the log-prefix as task id in StreamTask / StandbyTask, as well as its 
> contained RecordCollector and ProcessorStateManager.
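
For illustration, the general idea is just to build a prefix once per 
thread/task and prepend it to every log message; a minimal sketch (not the 
actual patch):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class PrefixedLogging {
    private static final Logger log = LoggerFactory.getLogger(PrefixedLogging.class);

    public static void main(String[] args) {
        // build the prefix once and prepend it to every message from this thread
        final String logPrefix = String.format("stream-thread [%s] ", Thread.currentThread().getName());
        log.info("{}Starting", logPrefix);
        log.info("{}Committing all tasks", logPrefix);
    }
}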



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1803: KAFKA-4023: add thread/task id for logging prefix

2016-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1803


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-4023) Add thread id as prefix in Kafka Streams thread logging

2016-09-06 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-4023:
-
   Resolution: Fixed
Fix Version/s: 0.10.1.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 1803
[https://github.com/apache/kafka/pull/1803]

> Add thread id as prefix in Kafka Streams thread logging
> ---
>
> Key: KAFKA-4023
> URL: https://issues.apache.org/jira/browse/KAFKA-4023
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Bill Bejeck
>  Labels: newbie++
> Fix For: 0.10.1.0
>
>
> A single Kafka Streams instance can include multiple stream threads, and 
> hence without a logging prefix it is difficult to determine which thread is 
> producing which log entries.
> We should
> 1) add the log-prefix as thread id in StreamThread logger, as well as its 
> contained StreamPartitionAssignor.
> 2) add the log-prefix as task id in StreamTask / StandbyTask, as well as its 
> contained RecordCollector and ProcessorStateManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3769) KStream job spending 60% of time writing metrics

2016-09-06 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468157#comment-15468157
 ] 

Guozhang Wang commented on KAFKA-3769:
--

Yes please :) Also is it a general issue in {{Selector}} class, or a specific 
issue in Kafka Streams?

> KStream job spending 60% of time writing metrics
> 
>
> Key: KAFKA-3769
> URL: https://issues.apache.org/jira/browse/KAFKA-3769
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Greg Fodor
>Assignee: Guozhang Wang
>Priority: Critical
> Fix For: 0.10.1.0
>
>
> I've been profiling a complex streams job, and found two major hotspots when 
> writing metrics, which take up about 60% of the CPU time of the job. (!) A PR 
> is attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4133) Provide a configuration to control consumer max in-flight fetches

2016-09-06 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-4133:
--

 Summary: Provide a configuration to control consumer max in-flight 
fetches
 Key: KAFKA-4133
 URL: https://issues.apache.org/jira/browse/KAFKA-4133
 Project: Kafka
  Issue Type: Improvement
  Components: consumer
Reporter: Jason Gustafson


With KIP-74, we now have a good way to limit the size of fetch responses, but 
it may still be difficult for users to control overall memory since the 
consumer will send fetches in parallel to all the brokers which own partitions 
that it is subscribed to. To give users finer control, it might make sense to 
add a `max.in.flight.fetches` setting to limit the total number of concurrent 
fetches at any time. This would require a KIP since it's a new configuration.
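
If such a setting were added, it would presumably be set like any other 
consumer config; the property name below is the hypothetical one from this 
proposal and does not exist today:

import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MaxInFlightFetchesExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        // hypothetical, per this proposal: cap the number of concurrent fetch requests
        props.put("max.in.flight.fetches", "2");
        KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
        consumer.close();
    }
}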



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4001) Improve Kafka Streams Join Semantics (KIP-77)

2016-09-06 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-4001:
---
Status: Patch Available  (was: In Progress)

> Improve Kafka Streams Join Semantics (KIP-77)
> -
>
> Key: KAFKA-4001
> URL: https://issues.apache.org/jira/browse/KAFKA-4001
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.10.1.0
>
>
> Kafka Streams supports three types of joins:
> * KStream-KStream
> * KStream-KTable
> * KTable-KTable
> Furthermore, Kafka Streams supports the join variants, namely
> * inner join
> * left join
> * outer join
> Not all combinations of "type" and "variant" are supported.
> *The problem is that the different joins use different semantics (and are 
> thus inconsistent).*
> With this ticket, we want to
> * introduce unique semantics over all joins
> * improve handling of "null"
> * add missing inner KStream-KTable join
> See KIP-76 for more details: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-76%3A+Improve+Kafka+Streams+Join+Semantics
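
For reference, the three variants as they look today for the KTable-KTable 
case of the DSL (a sketch; `left` and `right` are assumed to be 
KTable<String, String> instances, serdes and config omitted):

import org.apache.kafka.streams.kstream.KTable;

// inner, left and outer join of two KTables with a simple string-concatenating joiner
KTable<String, String> innerJoined = left.join(right, (l, r) -> l + "," + r);
KTable<String, String> leftJoined  = left.leftJoin(right, (l, r) -> l + "," + r);
KTable<String, String> outerJoined = left.outerJoin(right, (l, r) -> l + "," + r);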



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4127) Possible data loss

2016-09-06 Thread Shikhar Bhushan (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468012#comment-15468012
 ] 

Shikhar Bhushan commented on KAFKA-4127:


Dupe of KAFKA-3968

> Possible data loss
> --
>
> Key: KAFKA-4127
> URL: https://issues.apache.org/jira/browse/KAFKA-4127
> Project: Kafka
>  Issue Type: Bug
> Environment: Normal three node Kafka cluster. All machines running 
> linux.
>Reporter: Ramnatthan Alagappan
>
> I am running a three node Kafka cluster. ZooKeeper runs in standalone mode.
> When I create a new message topic, I see the following sequence of system 
> calls:
> mkdir("/appdir/my-topic1-0")
> creat("/appdir/my-topic1-0/.log")
> I have configured Kafka to write the messages persistently to the disk before 
> acknowledging the client. Specifically, I have set flush.interval_messages to 
> 1, min_insync_replicas to 3, and disabled dirty election.  Now, I insert a 
> new message into the created topic.
> I see that Kafka writes the message to the log file and flushes the data down 
> to disk by carefully fsync'ing the log file. I get an acknowledgment back 
> from the cluster after the message is safely persisted on all three replicas 
> and written to disk. 
> Unfortunately, Kafka can still lose data since it does not explicitly fsync 
> the directory to persist the directory entries of the topic directory and the 
> log file. If a crash happens after acknowledging the client, it is possible 
> for Kafka to lose the directory entry for the topic directory or the log file. 
> Many systems carefully issue fsync to the parent directory when a new file or 
> directory is created. This is required for the file to be completely 
> persisted to the disk.   
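
For reference, the parent-directory fsync pattern the reporter describes looks 
roughly like this in Java NIO (an illustration of the technique, not a patch; 
opening a directory for READ and forcing it works on Linux but may throw on 
other platforms):

import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class DirFsync {
    // flush the directory entry of a newly created file or sub-directory to disk
    static void fsyncDirectory(Path dir) throws IOException {
        try (FileChannel channel = FileChannel.open(dir, StandardOpenOption.READ)) {
            channel.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        fsyncDirectory(Paths.get("/appdir"));
    }
}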



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-4127) Possible data loss

2016-09-06 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan resolved KAFKA-4127.

Resolution: Duplicate

> Possible data loss
> --
>
> Key: KAFKA-4127
> URL: https://issues.apache.org/jira/browse/KAFKA-4127
> Project: Kafka
>  Issue Type: Bug
> Environment: Normal three node Kafka cluster. All machines running 
> linux.
>Reporter: Ramnatthan Alagappan
>
> I am running a three node Kafka cluster. ZooKeeper runs in standalone mode.
> When I create a new message topic, I see the following sequence of system 
> calls:
> mkdir("/appdir/my-topic1-0")
> creat("/appdir/my-topic1-0/.log")
> I have configured Kafka to write the messages persistently to the disk before 
> acknowledging the client. Specifically, I have set flush.interval_messages to 
> 1, min_insync_replicas to 3, and disabled dirty election.  Now, I insert a 
> new message into the created topic.
> I see that Kafka writes the message to the log file and flushes the data down 
> to disk by carefully fsync'ing the log file. I get an acknowledgment back 
> from the cluster after the message is safely persisted on all three replicas 
> and written to disk. 
> Unfortunately, Kafka can still lose data since it does not explicitly fsync 
> the directory to persist the directory entries of the topic directory and the 
> log file. If a crash happens after acknowledging the client, it is possible 
> for Kafka to lose the directory entry for the topic directory or the log file. 
> Many systems carefully issue fsync to the parent directory when a new file or 
> directory is created. This is required for the file to be completely 
> persisted to the disk.   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (KAFKA-4127) Possible data loss

2016-09-06 Thread Shikhar Bhushan (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shikhar Bhushan closed KAFKA-4127.
--

> Possible data loss
> --
>
> Key: KAFKA-4127
> URL: https://issues.apache.org/jira/browse/KAFKA-4127
> Project: Kafka
>  Issue Type: Bug
> Environment: Normal three node Kafka cluster. All machines running 
> linux.
>Reporter: Ramnatthan Alagappan
>
> I am running a three node Kafka cluster. ZooKeeper runs in standalone mode.
> When I create a new message topic, I see the following sequence of system 
> calls:
> mkdir("/appdir/my-topic1-0")
> creat("/appdir/my-topic1-0/.log")
> I have configured Kafka to write the messages persistently to the disk before 
> acknowledging the client. Specifically, I have set flush.interval_messages to 
> 1, min_insync_replicas to 3, and disabled dirty election.  Now, I insert a 
> new message into the created topic.
> I see that Kafka writes the message to the log file and flushes the data down 
> to disk by carefully fsync'ing the log file. I get an acknowledgment back 
> from the cluster after the message is safely persisted on all three replicas 
> and written to disk. 
> Unfortunately, Kafka can still lose data since it does not explicitly fsync 
> the directory to persist the directory entries of the topic directory and the 
> log file. If a crash happens after acknowledging the client, it is possible 
> for Kafka to lose the directory entry for the topic directory or the log file. 
> Many systems carefully issue fsync to the parent directory when a new file or 
> directory is created. This is required for the file to be completely 
> persisted to the disk.   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1828: MINOR: allow creation of statestore without loggin...

2016-09-06 Thread norwood
GitHub user norwood opened a pull request:

https://github.com/apache/kafka/pull/1828

MINOR: allow creation of statestore without loggingenabled or explicit 
sourcetopic

@guozhangwang 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/norwood/kafka manual-store

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1828.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1828


commit 643a4a59826a6383fd5c1408f21eefe6cf994752
Author: dan norwood 
Date:   2016-09-02T17:36:13Z

allow creation of statestore without loggingenabled or explicit sourcetopic




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4081) Consumer API consumer new interface commitSyn does not verify the validity of offset

2016-09-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467915#comment-15467915
 ] 

ASF GitHub Bot commented on KAFKA-4081:
---

GitHub user mimaison opened a pull request:

https://github.com/apache/kafka/pull/1827

KAFKA-4081: Consumer API consumer new interface commitSync does not v…

…erify the validity of offset

Commit throws InvalidOffsetException if the offset is negative

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mimaison/kafka KAFKA-4081

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1827.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1827


commit 334f82f60416c511665519362ce23d4ab22cc5fd
Author: Mickael Maison 
Date:   2016-09-01T16:41:16Z

KAFKA-4081: Consumer API consumer new interface commitSync does not verify 
the validity of offset

Commit throws InvalidOffsetException if the offset is negative




> Consumer API consumer new interface commitSyn does not verify the validity of 
> offset
> 
>
> Key: KAFKA-4081
> URL: https://issues.apache.org/jira/browse/KAFKA-4081
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.1
>Reporter: lifeng
>Assignee: Mickael Maison
>
> The new consumer API's commitSync synchronously updates the offset, but it 
> returns successfully even for illegal offsets (offset < 0 or offset > HW).
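
For illustration, the call being validated looks like the sketch below; with 
the patch above, the negative offset would be rejected up front instead of 
being accepted silently (broker address and topic are placeholders):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class NegativeCommitExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("my-topic", 0);
            // today this is accepted; with the proposed change it would throw
            consumer.commitSync(Collections.singletonMap(tp, new OffsetAndMetadata(-1L)));
        }
    }
}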



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1827: KAFKA-4081: Consumer API consumer new interface co...

2016-09-06 Thread mimaison
GitHub user mimaison opened a pull request:

https://github.com/apache/kafka/pull/1827

KAFKA-4081: Consumer API consumer new interface commitSync does not v…

…erify the validity of offset

Commit throws InvalidOffsetException if the offset is negative

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mimaison/kafka KAFKA-4081

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1827.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1827


commit 334f82f60416c511665519362ce23d4ab22cc5fd
Author: Mickael Maison 
Date:   2016-09-01T16:41:16Z

KAFKA-4081: Consumer API consumer new interface commitSync does not verify 
the validity of offset

Commit throws InvalidOffsetException if the offset is negative




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-79 - ListOffsetRequest v1 and offsetForTime() method in new consumer.

2016-09-06 Thread Becket Qin
Hi Magnus,

Thanks for the comments. I agree that querying messages within a time range
is a valid use case (it is actually an example use case in my previous
email). The current proposal can achieve this with two ListOffsetRequests,
right? So the current API already supports the use cases that require
offsets for multiple timestamps. The question is whether it is worth adding
more complexity to the protocol to make multiple-timestamp queries easier.
Personally, given that querying multiple timestamps is likely an infrequent
operation, I think there is no need to optimize for it and complicate the
protocol.
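
For example, the time-range case would be two lookups plus a seek, roughly as
in the sketch below (written against the proposed API, using the
offsetsForTimes name suggested later in this thread; `consumer`, `tp`,
`rangeStartMs` and `rangeEndMs` are assumed to exist, and the return shape
follows the KIP draft, so the details may change):

Map<TopicPartition, Long> startQuery = Collections.singletonMap(tp, rangeStartMs);
Map<TopicPartition, Long> endQuery = Collections.singletonMap(tp, rangeEndMs);
long startOffset = consumer.offsetsForTimes(startQuery).get(tp).offset;
long endOffset = consumer.offsetsForTimes(endQuery).get(tp).offset;
consumer.seek(tp, startOffset);
// then poll() and process records until the position reaches endOffset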

Thanks,

Jiangjie (Becket) Qin

On Mon, Sep 5, 2016 at 11:21 PM, Magnus Edenhill  wrote:

> Good write-up Qin, the API looks promising.
>
> I have one comment:
>
> 2016-09-03 5:20 GMT+02:00 Becket Qin :
>
> > The currently offsetsForTimes() API obviously does not support querying
> > multiple timestamps for the same partition. It doesn't seems a feature
> for
> > ListOffsetRequest v0 either (sounds more like a bug). My intuition is
> that
> > it's a rare use case. Given it does not exist before and we don't see a
> > strong need from the community either, maybe it is better to keep it
> simple
> > for ListOffsetRequest v1. We can add it later if it turns out to be a
> > useful feature (that may need a interface change, but I honestly do not
> > think people would frequently query many different timestamps for the
> same
> > partition)
> >
>
> I argue that the current behaviour of OffsetRequest with regards to
> duplicate partitions is a bug
> and think it would be a mistake to move the same semantics over to the new
> ListOffset API.
> One use case is that an application may want to know the offset range
> between two timestamps,
> e.g., for reprocessing, batching, searching, etc.
>
>
> Thanks,
> Magnus
>
>
>
> >
> > Have a good long weekend!
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> >
> >
> >
> > On Fri, Sep 2, 2016 at 6:10 PM, Ismael Juma  wrote:
> >
> > > Thanks for the proposal Becket. Looks good overall, a few comments:
> > >
> > > ListOffsetResponse => [TopicName [PartitionOffsets]]
> > > >   PartitionOffsets => Partition ErrorCode Timestamp [Offset]
> > > >   Partition => int32
> > > >   ErrorCode => int16
> > > >   Timestamp => int64
> > > >   Offset => int
> > >
> > >
> > > It should be int64 for `Offset` right?
> > >
> > > Implementation wise, we will migrate to o.a.k.common.requests.
> > > ListOffsetRequest
> > > > class on the broker side.
> > >
> > >
> > > Could you clarify what you mean here? We already
> > > use o.a.k.common.requests.ListOffsetRequest in KafkaApis.
> > >
> > > long offset = consumer.offsetForTime(Collections.singletonMap(
> > > topicPartition,
> > > > targetTime)).offset;
> > >
> > >
> > > The result of `offsetForTime` is a Map, so we can't just call `offset`
> on
> > > it. You probably meant something like:
> > >
> > > long offset = consumer.offsetForTime(Collections.singletonMap(
> > > topicPartition,
> > > targetTime)).get(topicPartition).offset;
> > >
> > > Test searchByTimestamp with CreateTime and LogAppendTime
> > > >
> > >
> > > Do you mean `Test offsetForTime`?
> > >
> > > And:
> > >
> > > 1. In KAFKA-1588, the following issue was described "When performing an
> > > OffsetRequest, if you request the same topic and partition combination
> > in a
> > > single request more than once (for example, if you want to get both the
> > > head and tail offsets for a partition in the same request), you will
> get
> > a
> > > response for both, but they will be the same offset". Will the new
> > request
> > > version support the use case where multiple timestamps are passed for
> the
> > > same topic partition? And if we do support it at the protocol level, do
> > we
> > > also want to support it at the API level or do we think the additional
> > > complexity is not worth it?
> > >
> > > 2. Is `offsetForTime` the right method name given that we are getting
> > > multiple offsets? Maybe it should be `offsetsForTimes` or something
> like
> > > that.
> > >
> > > Ismael
> > >
> > > On Wed, Aug 31, 2016 at 4:38 AM, Becket Qin 
> > wrote:
> > >
> > > > Hi Kafka devs,
> > > >
> > > > I created KIP-79 to allow consumer to precisely query the offsets
> based
> > > on
> > > > timestamp.
> > > >
> > > > In short we propose to :
> > > > 1. add a ListOffsetRequest/ListOffsetResponse v1, and
> > > > 2. add an offsetForTime() method in new consumer.
> > > >
> > > > The KIP wiki is the following:
> > > > https://cwiki.apache.org/confluence/pages/viewpage.
> > > action?pageId=65868090
> > > >
> > > > Comments are welcome.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > >
> >
>


[jira] [Created] (KAFKA-4132) Make our upgrade docs more factual and less scary

2016-09-06 Thread Gwen Shapira (JIRA)
Gwen Shapira created KAFKA-4132:
---

 Summary: Make our upgrade docs more factual and less scary
 Key: KAFKA-4132
 URL: https://issues.apache.org/jira/browse/KAFKA-4132
 Project: Kafka
  Issue Type: Improvement
  Components: documentation
Reporter: Gwen Shapira


I got some feedback that our upgrade docs currently scare people away from 
upgrading Kafka. One reason is that the first line is "We have breaking 
changes!".

This was partially intentional (we were trying to scare people into reading 
docs), but I think we scared people into staying on older versions.

I think we need to break upgrade into two sections:

1. Upgrade section of design docs: should include our awesome backward 
compatibility story and highlight changes, new features and compelling bug 
fixes (breaking and non-breaking). It should also emphasize the importance of 
following the upgrade docs to the letter.

2. The actual upgrade instructions should move to the Ops section of the docs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-2729) Cached zkVersion not equal to that in zookeeper, broker not recovering.

2016-09-06 Thread Charly Molter (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467691#comment-15467691
 ] 

Charly Molter edited comment on KAFKA-2729 at 9/6/16 3:38 PM:
--

Hi,

We had this issue on a test cluster running 0.10.0.0 so I took time to 
investigate some more.

We had a bunch of disconnections to Zookeeper and we had 2 changes of 
controller in a short time.

Broker 103 was controller with epoch 44
Broker 104 was controller with epoch 45

I looked at one specific partitions and found the following pattern:

101 was the broker which thought was leader but kept failing shrink the ISR 
with:
Partition [verifiable-test-topic,0] on broker 101: Shrinking ISR for partition 
[verifiable-test-topic,0] from 101,301,201 to 101,201
Partition [verifiable-test-topic,0] on broker 101: Cached zkVersion [185] not 
equal to that in zookeeper, skip updating ISR

Looking at ZK we have:
get /brokers/topics/verifiable-test-topic/partitions/0/state
{"controller_epoch":44,"leader":301,"version":1,"leader_epoch":96,"isr":[301]}

And metadata (to a random broker) is saying:
Topic: verifiable-test-topic    Partition: 0    Leader: 301    Replicas: 101,201,301    Isr: 301

Digging in the logs here’s what we think happened:

1. 103 sends becomeFollower to 301 with epoch 44 and leaderEpoch 95
2. 104 sends becomeLeader to 101 with epoch 45 and leaderEpoch 95 (after update 
zk!)
3. 103 sends becomeLeader to 301 with epoch 44 and leaderEpoch 96 (after 
updating zk!)
4. 104 sends becomeFollower to 301 with epoch 45 and leaderEpoch 95

4) Is ignored by 301 as the leaderEpoch is older than the current one.

We are missing a request: 103 sends becomeFollower to 101 with epoch 44 and 
leaderEpoch 95

I believe this happened because when the controller steps down it empties its 
request queue so this request never left the controller: 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L53-L57

So we ended up in a case where 301 and 101 think they are both leaders. 
Obviously 101 wants to update the state in ZK to remove 301 as it’s not even 
fetching from 101.

Does this seem correct to you?

It seems impossible to avoid having no Controller overlap, which could make it 
quite hard to avoid having 2 leaders for a short time. Though there should be a 
way for this situation to get back to a good state.

I believe the impact of this would be:
- writes = -1 unavailability
- writes != -1 possible log divergence (I’m unsure about this).

Hope this helps. While I had to fix the cluster by bouncing a node I kept most 
of the logs so let me know if you need more info.


was (Author: cmolter):
Hi,

We had this issue on a test cluster running 0.10.0.0 so I took time to 
investigate some more.

We had a bunch of disconnections to Zookeeper and we had 2 changes of 
controller in a short time.

Broker 103 was leader with epoch 44
Broker 104 was leader with epoch 45

I looked at one specific partitions and found the following pattern:

101 was the broker which thought was leader but kept failing shrink the ISR 
with:
Partition [verifiable-test-topic,0] on broker 101: Shrinking ISR for partition 
[verifiable-test-topic,0] from 101,301,201 to 101,201
Partition [verifiable-test-topic,0] on broker 101: Cached zkVersion [185] not 
equal to that in zookeeper, skip updating ISR

Looking at ZK we have:
get /brokers/topics/verifiable-test-topic/partitions/0/state
{"controller_epoch":44,"leader":301,"version":1,"leader_epoch":96,"isr":[301]}

And metadata (to a random broker) is saying:
Topic: verifiable-test-topic    Partition: 0    Leader: 301    Replicas: 101,201,301    Isr: 301

Digging in the logs here’s what we think happened:

1. 103 sends becomeFollower to 301 with epoch 44 and leaderEpoch 95
2. 104 sends becomeLeader to 101 with epoch 45 and leaderEpoch 95 (after update 
zk!)
3. 103 sends becomeLeader to 301 with epoch 44 and leaderEpoch 96 (after 
updating zk!)
4. 104 sends becomeFollower to 301 with epoch 45 and leaderEpoch 95

4) Is ignored by 301 as the leaderEpoch is older than the current one.

We are missing a request: 103 sends becomeFollower to 101 with epoch 44 and 
leaderEpoch 95

I believe this happened because when the controller steps down it empties its 
request queue so this request never left the controller: 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L53-L57

So we ended up in a case where 301 and 101 think they are both leaders. 
Obviously 101 wants to update the state in ZK to remove 301 as it’s not even 
fetching from 101.

Does this seem correct to you?

It seems impossible to avoid having no Controller overlap, which could make it 
quite hard to avoid having 2 leaders for a short time. Though there should be a 
way for this situation to get back to a good state.

I believe the impact of 

[jira] [Comment Edited] (KAFKA-2729) Cached zkVersion not equal to that in zookeeper, broker not recovering.

2016-09-06 Thread Charly Molter (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467691#comment-15467691
 ] 

Charly Molter edited comment on KAFKA-2729 at 9/6/16 3:37 PM:
--

Hi,

We had this issue on a test cluster running 0.10.0.0 so I took time to 
investigate some more.

We had a bunch of disconnections to Zookeeper and we had 2 changes of 
controller in a short time.

Broker 103 was leader with epoch 44
Broker 104 was leader with epoch 45

I looked at one specific partitions and found the following pattern:

101 was the broker which thought was leader but kept failing shrink the ISR 
with:
Partition [verifiable-test-topic,0] on broker 101: Shrinking ISR for partition 
[verifiable-test-topic,0] from 101,301,201 to 101,201
Partition [verifiable-test-topic,0] on broker 101: Cached zkVersion [185] not 
equal to that in zookeeper, skip updating ISR

Looking at ZK we have:
get /brokers/topics/verifiable-test-topic/partitions/0/state
{"controller_epoch":44,"leader":301,"version":1,"leader_epoch":96,"isr":[301]}

And metadata (to a random broker) is saying:
Topic: verifiable-test-topic    Partition: 0    Leader: 301    Replicas: 101,201,301    Isr: 301

Digging in the logs here’s what we think happened:

1. 103 sends becomeFollower to 301 with epoch 44 and leaderEpoch 95
2. 104 sends becomeLeader to 101 with epoch 45 and leaderEpoch 95 (after update 
zk!)
3. 103 sends becomeLeader to 301 with epoch 44 and leaderEpoch 96 (after 
updating zk!)
4. 104 sends becomeFollower to 301 with epoch 45 and leaderEpoch 95

4) Is ignored by 301 as the leaderEpoch is older than the current one.

We are missing a request: 103 sends becomeFollower to 101 with epoch 44 and 
leaderEpoch 95

I believe this happened because when the controller steps down it empties its 
request queue so this request never left the controller: 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L53-L57

So we ended up in a case where 301 and 101 think they are both leaders. 
Obviously 101 wants to update the state in ZK to remove 301 as it’s not even 
fetching from 101.

Does this seem correct to you?

It seems impossible to avoid having no Controller overlap, which could make it 
quite hard to avoid having 2 leaders for a short time. Though there should be a 
way for this situation to get back to a good state.

I believe the impact of this would be:
- writes = -1 unavailability
- writes != -1 possible log divergence (I’m unsure about this).

Hope this helps. While I had to fix the cluster by bouncing a node I kept most 
of the logs so let me know if you need more info.


was (Author: cmolter):
Hi,

We had this issue on a test cluster running 0.10.0.0 so I took time to 
investigate some more.

We had a bunch of disconnections to Zookeeper and we had 2 changes of 
controller in a short time.

Broker 103 was leader with epoch 44
Broker 104 was leader with epoch 45

I looked at one specific partitions and found the following pattern:

101 was the broker which thought was leader but kept failing shrink the ISR 
with:
Partition [verifiable-test-topic,0] on broker 101: Shrinking ISR for partition 
[verifiable-test-topic,0] from 101,301,201 to 101,201
Partition [verifiable-test-topic,0] on broker 101: Cached zkVersion [185] not 
equal to that in zookeeper, skip updating ISR

Looking at ZK we have:
get /brokers/topics/verifiable-test-topic/partitions/0/state
{"controller_epoch":44,"leader":301,"version":1,"leader_epoch":96,"isr":[301]}

And metadata (to a random broker) is saying:
Topic: verifiable-test-topic    Partition: 0    Leader: 301    Replicas: 101,201,301    Isr: 301

Digging in the logs here’s what we think happened:

1. 103 sends becomeFollower to 301 with epoch 44 and leaderEpoch 95
2. 104 sends becomeLeader to 101 with epoch 45 and leaderEpoch 95 (after update 
zk!)
3. 103 sends becomeLeader to 301 with epoch 44 and leaderEpoch 96 (after 
updating zk!)
4. 104 sends becomeFollower to 301 with epoch 45 and leaderEpoch 95

4) Is ignored by 301 as the leaderEpoch is older than the current one.

We are missing a request: 103 sends becomeFollower to 101 with epoch 44 and 
leaderEpoch 95

I believe this happened because when the controller steps down it empties its 
request queue so this request never left the controller: 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L53-L57

So we ended up in a case where 301 and 101 think they are both leaders. 
Obviously 101 wants to update the state in ZK to remove 301 as it’s not even 
fetching from 101.

Does this seem correct to you?

It seems impossible to avoid having no Controller overlap, which could make it 
quite hard to avoid having 2 leaders for a short time. Though there should be a 
way for this situation to get back to a good state.

I believe the impact of this 

[jira] [Comment Edited] (KAFKA-2729) Cached zkVersion not equal to that in zookeeper, broker not recovering.

2016-09-06 Thread Charly Molter (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467691#comment-15467691
 ] 

Charly Molter edited comment on KAFKA-2729 at 9/6/16 3:32 PM:
--

Hi,

We had this issue on a test cluster running 0.10.0.0 so I took time to 
investigate some more.

We had a bunch of disconnections to Zookeeper and we had 2 changes of 
controller in a short time.

Broker 103 was leader with epoch 44
Broker 104 was leader with epoch 45

I looked at one specific partitions and found the following pattern:

101 was the broker which thought was leader but kept failing shrink the ISR 
with:
Partition [verifiable-test-topic,0] on broker 101: Shrinking ISR for partition 
[verifiable-test-topic,0] from 101,301,201 to 101,201
Partition [verifiable-test-topic,0] on broker 101: Cached zkVersion [185] not 
equal to that in zookeeper, skip updating ISR

Looking at ZK we have:
get /brokers/topics/verifiable-test-topic/partitions/0/state
{"controller_epoch":44,"leader":301,"version":1,"leader_epoch":96,"isr":[301]}

And metadata (to a random broker) is saying:
Topic: verifiable-test-topic    Partition: 0    Leader: 301    Replicas: 101,201,301    Isr: 301

Digging in the logs here’s what we think happened:

1. 103 sends becomeFollower to 301 with epoch 44 and leaderEpoch 95
2. 104 sends becomeLeader to 101 with epoch 45 and leaderEpoch 95 (after update 
zk!)
3. 103 sends becomeLeader to 301 with epoch 44 and leaderEpoch 96 (after 
updating zk!)
4. 104 sends becomeFollower to 301 with epoch 45 and leaderEpoch 95

4) Is ignored by 301 as the leaderEpoch is older than the current one.

We are missing a request: 103 sends becomeFollower to 101 with epoch 44 and 
leaderEpoch 95

I believe this happened because when the controller steps down it empties its 
request queue so this request never left the controller: 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L53-L57

So we ended up in a case where 301 and 101 think they are both leaders. 
Obviously 101 wants to update the state in ZK to remove 301 as it’s not even 
fetching from 101.

Does this seem correct to you?

It seems impossible to avoid having no Controller overlap, which could make it 
quite hard to avoid having 2 leaders for a short time. Though there should be a 
way for this situation to get back to a good state.

I believe the impact of this would be:
- writes = -1 unavailability
- writes != -1 possible log divergence depending on min in-sync replicas (I’m 
unsure about this).

Hope this helps. While I had to fix the cluster by bouncing a node I kept most 
of the logs so let me know if you need more info.


was (Author: cmolter):
Hi,

We had this issue on a test cluster so I took time to investigate some more.

We had a bunch of disconnections to Zookeeper and we had 2 changes of 
controller in a short time.

Broker 103 was leader with epoch 44
Broker 104 was leader with epoch 45

I looked at one specific partitions and found the following pattern:

101 was the broker which thought was leader but kept failing shrink the ISR 
with:
Partition [verifiable-test-topic,0] on broker 101: Shrinking ISR for partition 
[verifiable-test-topic,0] from 101,301,201 to 101,201
Partition [verifiable-test-topic,0] on broker 101: Cached zkVersion [185] not 
equal to that in zookeeper, skip updating ISR

Looking at ZK we have:
get /brokers/topics/verifiable-test-topic/partitions/0/state
{"controller_epoch":44,"leader":301,"version":1,"leader_epoch":96,"isr":[301]}

And metadata (to a random broker) is saying:
Topic: verifiable-test-topic    Partition: 0    Leader: 301    Replicas: 101,201,301    Isr: 301

Digging in the logs here’s what we think happened:

1. 103 sends becomeFollower to 301 with epoch 44 and leaderEpoch 95
2. 104 sends becomeLeader to 101 with epoch 45 and leaderEpoch 95 (after update 
zk!)
3. 103 sends becomeLeader to 301 with epoch 44 and leaderEpoch 96 (after 
updating zk!)
4. 104 sends becomeFollower to 301 with epoch 45 and leaderEpoch 95

4) Is ignored by 301 as the leaderEpoch is older than the current one.

We are missing a request: 103 sends becomeFollower to 101 with epoch 44 and 
leaderEpoch 95

I believe this happened because when the controller steps down it empties its 
request queue so this request never left the controller: 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L53-L57

So we ended up in a case where 301 and 101 think they are both leaders. 
Obviously 101 wants to update the state in ZK to remove 301 as it’s not even 
fetching from 101.

Does this seem correct to you?

It seems impossible to avoid having no Controller overlap, which could make it 
quite hard to avoid having 2 leaders for a short time. Though there should be a 
way for this situation to get back to a good state.

I believe the 

[jira] [Commented] (KAFKA-4131) Multiple Regex KStream-Consumers cause Null pointer exception in addRawRecords in RecordQueue class

2016-09-06 Thread Bill Bejeck (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467695#comment-15467695
 ] 

Bill Bejeck commented on KAFKA-4131:


If it's ok, I'll pick this one up

> Multiple Regex KStream-Consumers cause Null pointer exception in 
> addRawRecords in RecordQueue class
> ---
>
> Key: KAFKA-4131
> URL: https://issues.apache.org/jira/browse/KAFKA-4131
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.0
> Environment: Servers: Confluent Distribution 3.0.0 (i.e. kafka 0.10.0 
> release)
> Client: Kafka-streams and Kafka-client... commit: 
> 6fb33afff976e467bfa8e0b29eb82770a2a3aaec
>Reporter: David J. Garcia
>Assignee: Guozhang Wang
>
> When you start two consumer processes with a regex topic (with 2 or more
> partitions for the matching topics), the second (i.e. nonleader) consumer
> will fail with a null pointer exception.
> Exception in thread "StreamThread-4" java.lang.NullPointerException
>  at org.apache.kafka.streams.processor.internals.
> RecordQueue.addRawRecords(RecordQueue.java:78)
>  at org.apache.kafka.streams.processor.internals.
> PartitionGroup.addRawRecords(PartitionGroup.java:117)
>  at org.apache.kafka.streams.processor.internals.
> StreamTask.addRecords(StreamTask.java:139)
>  at org.apache.kafka.streams.processor.internals.
> StreamThread.runLoop(StreamThread.java:299)
>  at org.apache.kafka.streams.processor.internals.
> StreamThread.run(StreamThread.java:208)
> The issue may be in the TopologyBuilder line 832:
> String[] topics = (sourceNodeFactory.pattern != null) ?
> sourceNodeFactory.getTopics(subscriptionUpdates.getUpdates()) :
> sourceNodeFactory.getTopics();
> Because the 2nd consumer joins as a follower, “getUpdates” returns an
> empty collection and the regular expression doesn’t get applied to any
> topics.
> Steps to Reproduce:
> 1.) Create at least two topics with at least 2 partitions each.  And start 
> sending messages to them.
> 2.) Start a single threaded Regex KStream-Consumer (i.e. becomes the leader)
> 3)  Start a new instance of this consumer (i.e. it should receive some of the 
> partitions)
> The second consumer will die with the above exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-4131) Multiple Regex KStream-Consumers cause Null pointer exception in addRawRecords in RecordQueue class

2016-09-06 Thread Bill Bejeck (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck reassigned KAFKA-4131:
--

Assignee: Bill Bejeck  (was: Guozhang Wang)

> Multiple Regex KStream-Consumers cause Null pointer exception in 
> addRawRecords in RecordQueue class
> ---
>
> Key: KAFKA-4131
> URL: https://issues.apache.org/jira/browse/KAFKA-4131
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.0
> Environment: Servers: Confluent Distribution 3.0.0 (i.e. kafka 0.10.0 
> release)
> Client: Kafka-streams and Kafka-client... commit: 
> 6fb33afff976e467bfa8e0b29eb82770a2a3aaec
>Reporter: David J. Garcia
>Assignee: Bill Bejeck
>
> When you start two consumer processes with a regex topic (with 2 or more
> partitions for the matching topics), the second (i.e. nonleader) consumer
> will fail with a null pointer exception.
> Exception in thread "StreamThread-4" java.lang.NullPointerException
>  at org.apache.kafka.streams.processor.internals.
> RecordQueue.addRawRecords(RecordQueue.java:78)
>  at org.apache.kafka.streams.processor.internals.
> PartitionGroup.addRawRecords(PartitionGroup.java:117)
>  at org.apache.kafka.streams.processor.internals.
> StreamTask.addRecords(StreamTask.java:139)
>  at org.apache.kafka.streams.processor.internals.
> StreamThread.runLoop(StreamThread.java:299)
>  at org.apache.kafka.streams.processor.internals.
> StreamThread.run(StreamThread.java:208)
> The issue may be in the TopologyBuilder line 832:
> String[] topics = (sourceNodeFactory.pattern != null) ?
> sourceNodeFactory.getTopics(subscriptionUpdates.getUpdates()) :
> sourceNodeFactory.getTopics();
> Because the 2nd consumer joins as a follower, “getUpdates” returns an
> empty collection and the regular expression doesn’t get applied to any
> topics.
> Steps to Reproduce:
> 1.) Create at least two topics with at least 2 partitions each.  And start 
> sending messages to them.
> 2.) Start a single threaded Regex KStream-Consumer (i.e. becomes the leader)
> 3)  Start a new instance of this consumer (i.e. it should receive some of the 
> partitions)
> The second consumer will die with the above exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2729) Cached zkVersion not equal to that in zookeeper, broker not recovering.

2016-09-06 Thread Charly Molter (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467691#comment-15467691
 ] 

Charly Molter commented on KAFKA-2729:
--

Hi,

We had this issue on a test cluster so I took time to investigate some more.

We had a bunch of disconnections to Zookeeper and we had 2 changes of 
controller in a short time.

Broker 103 was leader with epoch 44
Broker 104 was leader with epoch 45

I looked at one specific partitions and found the following pattern:

101 was the broker which thought was leader but kept failing shrink the ISR 
with:
Partition [verifiable-test-topic,0] on broker 101: Shrinking ISR for partition 
[verifiable-test-topic,0] from 101,301,201 to 101,201
Partition [verifiable-test-topic,0] on broker 101: Cached zkVersion [185] not 
equal to that in zookeeper, skip updating ISR

Looking at ZK we have:
get /brokers/topics/verifiable-test-topic/partitions/0/state
{"controller_epoch":44,"leader":301,"version":1,"leader_epoch":96,"isr":[301]}

And metadata (to a random broker) is saying:
Topic: verifiable-test-topic    Partition: 0    Leader: 301    Replicas: 101,201,301    Isr: 301

Digging in the logs here’s what we think happened:

1. 103 sends becomeFollower to 301 with epoch 44 and leaderEpoch 95
2. 104 sends becomeLeader to 101 with epoch 45 and leaderEpoch 95 (after update 
zk!)
3. 103 sends becomeLeader to 301 with epoch 44 and leaderEpoch 96 (after 
updating zk!)
4. 104 sends becomeFollower to 301 with epoch 45 and leaderEpoch 95

4) Is ignored by 301 as the leaderEpoch is older than the current one.

We are missing a request: 103 sends becomeFollower to 101 with epoch 44 and 
leaderEpoch 95

I believe this happened because when the controller steps down it empties its 
request queue so this request never left the controller: 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L53-L57

So we ended up in a case where 301 and 101 think they are both leaders. 
Obviously 101 wants to update the state in ZK to remove 301 as it’s not even 
fetching from 101.

Does this seem correct to you?

It seems impossible to avoid having no Controller overlap, which could make it 
quite hard to avoid having 2 leaders for a short time. Though there should be a 
way for this situation to get back to a good state.

I believe the impact of this would be:
- writes = -1 unavailability
- writes != -1 possible log divergence depending on min in-sync replicas (I’m 
unsure about this).

Hope this helps. While I had to fix the cluster by bouncing a node I kept most 
of the logs so let me know if you need more info.

> Cached zkVersion not equal to that in zookeeper, broker not recovering.
> ---
>
> Key: KAFKA-2729
> URL: https://issues.apache.org/jira/browse/KAFKA-2729
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
>Reporter: Danil Serdyuchenko
>
> After a small network wobble where zookeeper nodes couldn't reach each other, 
> we started seeing a large number of undereplicated partitions. The zookeeper 
> cluster recovered, however we continued to see a large number of 
> undereplicated partitions. Two brokers in the kafka cluster were showing this 
> in the logs:
> {code}
> [2015-10-27 11:36:00,888] INFO Partition 
> [__samza_checkpoint_event-creation_1,3] on broker 5: Shrinking ISR for 
> partition [__samza_checkpoint_event-creation_1,3] from 6,5 to 5 
> (kafka.cluster.Partition)
> [2015-10-27 11:36:00,891] INFO Partition 
> [__samza_checkpoint_event-creation_1,3] on broker 5: Cached zkVersion [66] 
> not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
> {code}
> For all of the topics on the effected brokers. Both brokers only recovered 
> after a restart. Our own investigation yielded nothing, I was hoping you 
> could shed some light on this issue. Possibly if it's related to: 
> https://issues.apache.org/jira/browse/KAFKA-1382 , however we're using 
> 0.8.2.1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4130) [docs] Link to Varnish architect notes is broken

2016-09-06 Thread Stevo Slavic (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stevo Slavic updated KAFKA-4130:

Description: 
Paragraph in Kafka documentation
{quote}
This style of pagecache-centric design is described in an article on the design 
of Varnish here (along with a healthy dose of arrogance). 
{quote}
contains a broken link.

Should probably link to http://varnish-cache.org/wiki/ArchitectNotes

  was:
Paraagraph in Kafka documentation
{quote}
This style of pagecache-centric design is described in an article on the design 
of Varnish here (along with a healthy dose of arrogance). 
{quote}
contains a broken link.

Should probably link to http://varnish-cache.org/wiki/ArchitectNotes


> [docs] Link to Varnish architect notes is broken
> 
>
> Key: KAFKA-4130
> URL: https://issues.apache.org/jira/browse/KAFKA-4130
> Project: Kafka
>  Issue Type: Bug
>  Components: website
>Affects Versions: 0.9.0.1, 0.10.0.1
>Reporter: Stevo Slavic
>Priority: Trivial
>
> Paragraph in Kafka documentation
> {quote}
> This style of pagecache-centric design is described in an article on the 
> design of Varnish here (along with a healthy dose of arrogance). 
> {quote}
> contains a broken link.
> Should probably link to http://varnish-cache.org/wiki/ArchitectNotes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-4131) Multiple Regex KStream-Consumers cause Null pointer exception in addRawRecords in RecordQueue class

2016-09-06 Thread David J. Garcia (JIRA)
David J. Garcia created KAFKA-4131:
--

 Summary: Multiple Regex KStream-Consumers cause Null pointer 
exception in addRawRecords in RecordQueue class
 Key: KAFKA-4131
 URL: https://issues.apache.org/jira/browse/KAFKA-4131
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.10.1.0
 Environment: Servers: Confluent Distribution 3.0.0 (i.e. kafka 0.10.0 
release)
Client: Kafka-streams and Kafka-client... commit: 6fb33afff976e467bfa8e0b29eb82770a2a3aaec
Reporter: David J. Garcia
Assignee: Guozhang Wang


When you start two consumer processes with a regex topic (with 2 or more
partitions for the matching topics), the second (i.e. nonleader) consumer
will fail with a null pointer exception.

Exception in thread "StreamThread-4" java.lang.NullPointerException
 at org.apache.kafka.streams.processor.internals.
RecordQueue.addRawRecords(RecordQueue.java:78)
 at org.apache.kafka.streams.processor.internals.
PartitionGroup.addRawRecords(PartitionGroup.java:117)
 at org.apache.kafka.streams.processor.internals.
StreamTask.addRecords(StreamTask.java:139)
 at org.apache.kafka.streams.processor.internals.
StreamThread.runLoop(StreamThread.java:299)
 at org.apache.kafka.streams.processor.internals.
StreamThread.run(StreamThread.java:208)

The issue may be in the TopologyBuilder line 832:
String[] topics = (sourceNodeFactory.pattern != null) ?
sourceNodeFactory.getTopics(subscriptionUpdates.getUpdates()) :
sourceNodeFactory.getTopics();

Because the 2nd consumer joins as a follower, “getUpdates” returns an
empty collection and the regular expression doesn’t get applied to any
topics.

Steps to Reproduce:
1.) Create at least two topics with at least 2 partitions each.  And start 
sending messages to them.
2.) Start a single threaded Regex KStream-Consumer (i.e. becomes the leader)
3)  Start a new instance of this consumer (i.e. it should receive some of the 
partitions)

The second consumer will die with the above exception.
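
A minimal sketch of such a regex consumer (illustrative only; the topic 
pattern and config values are placeholders):

import java.util.Properties;
import java.util.regex.Pattern;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class RegexStreamApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "regex-repro");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        KStreamBuilder builder = new KStreamBuilder();
        // subscribe by pattern; start two instances of this app to hit the reported NPE
        builder.stream(Pattern.compile("my-topic-.*")).print();
        new KafkaStreams(builder, props).start();
    }
}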





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-63: Unify store and downstream caching in streams

2016-09-06 Thread Eno Thereska
A small update to the KIP: the deduping of records using the cache does not 
affect the .to operator since we'd have already deduped the KTable before the 
operator. Adjusting KIP.

Thanks
Eno

> On 5 Sep 2016, at 12:43, Eno Thereska  wrote:
> 
> Hi Matthias,
> 
> The motivation for KIP-63 was primarily aggregates and reducing the load on 
> "both" state stores and downstream. I think there is agreement that for the 
> DSL the motivation and design make sense.
> 
> For the Processor API: caching is a major component in any system, and it is 
> difficult to continue to operate as before, without fully understanding the 
> consequences. Hence, I think this is mostly a case of educating users to 
> understand the boundaries of the solution. 
> 
> Introducing a cache, either for the state store only, or for downstream 
> forwarding only, or for both, leads to moving from a model where we process 
> each request end-to-end (today) to one where a request is temporarily 
> buffered in a cache. In all the cases, this opens up the question of what to 
> do next once the request then leaves the cache, and how to express that 
> (future) behaviour. E.g., even when the cache is just for downstream 
> forwarding (i.e., decoupled from any state store), the processor API user 
> might be surprised that context.forward() does not immediately do anything.
> 
> I agree that for ultra-flexibility, a processor API user should be able to 
> choose whether the dedup cache is put 1) on top of a store only, 2) on 
> forward only, 3) on both store and forward, but given the motivation for 
> KIP-63 (aggregates), I believe a decoupled store-forward dedup cache is a 
> reasonable choice that provides good default behaviour, without prodding the 
> user to specify the combinations.
> 
> We need to educate users that if a cache is used in the Processor API, the 
> forwarding will happen in the future. 
> 
> -Eno
> 
> 
> 
>> On 4 Sep 2016, at 19:11, Matthias J. Sax  wrote:
>> 
>>> Processor code should always work; independently if caching is enabled
>> or not.
>> 
>> If we want to get this, I guess we need a quite different design (see (1)).
>> 
>> The point is, that we want to dedup the output, and not state updates.
>> 
>> It just happens that our starting point was KTable, for which state
>> updates and downstream changelog output is the same thing. Thus, we can
>> just use the internal KTable state to do the deduplication for the
>> downstream changelog.
>> 
>> However, from a general point of view (Processor API view), if we dedup
>> the output, we want dedup/caching for the processor (and not for a state
>> store). Of course, we need a state to do the dedup. For KTable, both
>> things merge into a single abstraction, and we use only a single state
>> instead of two. From a general point of view, we would need two states
>> though (one for the actual state, and one for dedup -- think Processor
>> API -- not DSL).
>> 
>> 
>> Alternative proposal 1:
>> (see also (2) -- which might be better than this one)
>> 
>> Thus, it might be a cleaner design to decouple user-states and
>> dedup-state from each other. If a user enables dedup/caching (for a
>> processor) we add an additional state to do the dedup and this
>> dedup-state is independent from all user states and context.forward()
>> works as always. The dedup state could be hidden from the user and could
>> be a pure in-memory state (no need for any recovery -- only flush on
>> commit). Internally, a context.forward() would call dedupState.put() and
>> trigger actual output if dedup state needs to evict records.
>> 
>> The disadvantage would be that we end up with two states for KTable.
>> The advantage is that deduplication can be switched off/on without any
>> Processor code change.
>> 
>> 
>> Alternative proposal 2:
>> 
>> We basically keep the current KIP design, including not to disable
>> context.forward() if a cached state is used. Additionally, for cached
>> state, we rename put() into putAndForward() which is only available for
>> cached states. Thus, in processor code, a state must be explicitly cast
>> into a cached state. We also make the user aware that an update/put to
>> a state results in downstream output and that context.forward() would be
>> a "direct/non-cached" output.
>> 
>> The disadvantage of this is that processor code is not independent from
>> caching and thus, caching cannot just be switched on/off (ie, we do not
>> follow the initial statement of this mail). The advantage is, we can
>> keep a single state for KTable and this design is just small changes to
>> the current KIP.
>> 
>> 
>> 
>> -Matthias
>> 
>> 
>> On 09/04/2016 07:10 PM, Matthias J. Sax wrote:
>>> Sure, you can use a non-cached state. However, if you write code like the
>>> snippet below for a non-cached state, later learn about caching and decide
>>> it is a cool feature you want to use, you would simply
>>> want to enable caching 
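
To make the deferred-forwarding point in this thread concrete, here is a
minimal Processor API sketch; the store name "counts" and the counting logic
are purely illustrative and not taken from the KIP:

import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

public class CountingProcessor extends AbstractProcessor<String, String> {
    private KeyValueStore<String, Long> counts;

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        super.init(context);
        // "counts" is an illustrative store name registered on the topology
        counts = (KeyValueStore<String, Long>) context.getStateStore("counts");
    }

    @Override
    public void process(String key, String value) {
        Long current = counts.get(key);
        counts.put(key, current == null ? 1L : current + 1L);
        // Under the cached-store design discussed here, this put does not
        // necessarily produce immediate downstream output; the deduplicated
        // record is emitted when the cache evicts or flushes (e.g. on commit).
        // A direct context().forward(key, value) would be the "non-cached" emit.
    }
}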

[jira] [Created] (KAFKA-4130) [docs] Link to Varnish architect notes is broken

2016-09-06 Thread Stevo Slavic (JIRA)
Stevo Slavic created KAFKA-4130:
---

 Summary: [docs] Link to Varnish architect notes is broken
 Key: KAFKA-4130
 URL: https://issues.apache.org/jira/browse/KAFKA-4130
 Project: Kafka
  Issue Type: Bug
  Components: website
Affects Versions: 0.10.0.1, 0.9.0.1
Reporter: Stevo Slavic
Priority: Trivial


Paragraph in Kafka documentation
{quote}
This style of pagecache-centric design is described in an article on the design 
of Varnish here (along with a healthy dose of arrogance). 
{quote}
contains a broken link.

Should probably link to http://varnish-cache.org/wiki/ArchitectNotes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4129) Processor throw exception when getting channel remote address after closing the channel

2016-09-06 Thread TAO XIAO (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

TAO XIAO updated KAFKA-4129:

Status: Patch Available  (was: Open)

> Processor throw exception when getting channel remote address after closing 
> the channel
> ---
>
> Key: KAFKA-4129
> URL: https://issues.apache.org/jira/browse/KAFKA-4129
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.1
>Reporter: TAO XIAO
>Assignee: TAO XIAO
>
> In the Processor {{configureNewConnections()}} catch block, the code explicitly 
> closes {{channel}} before calling {{channel.getRemoteAddress}}, which results in 
> {{ClosedChannelException}} being thrown. This is because the Java implementation 
> cannot return a remote address once the channel has been closed.
> {code}
> case NonFatal(e) =>
>  // need to close the channel here to avoid a socket leak.
>  close(channel)
>  error(s"Processor $id closed connection from 
> ${channel.getRemoteAddress}", e)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4129) Processor throw exception when getting channel remote address after closing the channel

2016-09-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467403#comment-15467403
 ] 

ASF GitHub Bot commented on KAFKA-4129:
---

GitHub user xiaotao183 opened a pull request:

https://github.com/apache/kafka/pull/1826

KAFKA-4129: Processor throw exception when getting channel remote address 
after closing the channel

Get channel remote address before calling ```channel.close```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xiaotao183/kafka KAFKA-4129

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1826.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1826


commit 3d4a4b01789c46ac940caba6be35f73c3db715fa
Author: Tao Xiao 
Date:   2016-09-06T13:12:35Z

get channel remote address before calling channel.close




> Processor throw exception when getting channel remote address after closing 
> the channel
> ---
>
> Key: KAFKA-4129
> URL: https://issues.apache.org/jira/browse/KAFKA-4129
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.1
>Reporter: TAO XIAO
>Assignee: TAO XIAO
>
> In the Processor {{configureNewConnections()}} catch block, the code explicitly 
> closes {{channel}} before calling {{channel.getRemoteAddress}}, which results in 
> {{ClosedChannelException}} being thrown. This is because the Java implementation 
> cannot return a remote address once the channel has been closed.
> {code}
> case NonFatal(e) =>
>  // need to close the channel here to avoid a socket leak.
>  close(channel)
>  error(s"Processor $id closed connection from 
> ${channel.getRemoteAddress}", e)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #1826: KAFKA-4129: Processor throw exception when getting...

2016-09-06 Thread xiaotao183
GitHub user xiaotao183 opened a pull request:

https://github.com/apache/kafka/pull/1826

KAFKA-4129: Processor throw exception when getting channel remote address 
after closing the channel

Get channel remote address before calling ```channel.close```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xiaotao183/kafka KAFKA-4129

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1826.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1826


commit 3d4a4b01789c46ac940caba6be35f73c3db715fa
Author: Tao Xiao 
Date:   2016-09-06T13:12:35Z

get channel remote address before calling channel.close




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-4129) Processor throw exception when getting channel remote address after closing the channel

2016-09-06 Thread TAO XIAO (JIRA)
TAO XIAO created KAFKA-4129:
---

 Summary: Processor throw exception when getting channel remote 
address after closing the channel
 Key: KAFKA-4129
 URL: https://issues.apache.org/jira/browse/KAFKA-4129
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.10.0.1
Reporter: TAO XIAO
Assignee: TAO XIAO


In the Processor {{configureNewConnections()}} catch block, the code explicitly 
closes {{channel}} before calling {{channel.getRemoteAddress}}, which results in 
{{ClosedChannelException}} being thrown. This is because the Java implementation 
cannot return a remote address once the channel has been closed.

{code}
case NonFatal(e) =>
 // need to close the channel here to avoid a socket leak.
 close(channel)
 error(s"Processor $id closed connection from 
${channel.getRemoteAddress}", e)
{code}
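
For context, the failure mode is reproducible with plain NIO outside Kafka (the
address below is a placeholder); the fix proposed in PR #1826 above simply reads
the remote address before closing the channel:

{code}
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;

public class ClosedChannelDemo {
    public static void main(String[] args) throws Exception {
        // placeholder address, used only to open a real channel
        SocketChannel channel = SocketChannel.open(new InetSocketAddress("localhost", 9092));
        System.out.println(channel.getRemoteAddress()); // fine while the channel is open
        channel.close();
        System.out.println(channel.getRemoteAddress()); // throws ClosedChannelException
    }
}
{code}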



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Plans to improve SSL performance in Kafka, for 0.10.x?

2016-09-06 Thread Todd Palino
Yeah, that's why I mentioned it with a caveat :) Someone (I can't recall
who, but it was someone I consider reasonably knowledgeable, as I actually
gave it some weight) mentioned it, but I haven't looked into it further
than that. I agree that I don't see how this is going to help us at the app
layer.

-Todd

On Tuesday, September 6, 2016, Ismael Juma  wrote:

> Hi Todd,
>
> Thanks for sharing your experience enabling TLS in your clusters. Very
> helpful. One comment below.
>
> On Sun, Sep 4, 2016 at 6:28 PM, Todd Palino  > wrote:
> >
> > Right now, we're specifically avoiding moving consume traffic to SSL, due
> > to the zero copy send issue. Now I've been told (but I have not
> > investigated) that OpenSSL can solve this. It would probably be a good
> use
> > of time to look into that further.
> >
>
> As far as I know, OpenSSL can reduce the TLS overhead, but we will still
> lose the zero-copy optimisation. There are some attempts at making it
> possible to retain zero-copy with TLS in the kernel[1][2], but it's
> probably too early for us to consider that for Kafka.
>
> Ismael
>
> [1] https://lwn.net/Articles/666509/
> [2]
> http://techblog.netflix.com/2016/08/protecting-netflix-
> viewing-privacy-at.html
>


-- 
*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming



linkedin.com/in/toddpalino
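
For reference, the zero-copy send being discussed is the sendfile-style
FileChannel.transferTo() path sketched below (segment file name and peer
address are placeholders); with TLS the bytes have to be pulled into user
space and encrypted via SSLEngine, so this shortcut is lost:

import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ZeroCopySketch {
    public static void main(String[] args) throws Exception {
        // placeholder log segment and peer address
        FileChannel segment = FileChannel.open(
                Paths.get("00000000000000000000.log"), StandardOpenOption.READ);
        SocketChannel peer = SocketChannel.open(new InetSocketAddress("localhost", 9092));

        // PLAINTEXT path: bytes move file -> socket inside the kernel (sendfile),
        // never copied into user space
        long sent = segment.transferTo(0, segment.size(), peer);
        System.out.println("sent " + sent + " bytes via zero-copy");
    }
}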


[jira] [Updated] (KAFKA-4128) Kafka broker losses messages when zookeeper session times out

2016-09-06 Thread Mazhar Shaikh (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mazhar Shaikh updated KAFKA-4128:
-
Description: 
When pumping 30k msgs/second, after some 6-8 hours of running the logs below 
are printed and messages are lost.

[More than 5k messages are lost on every partition]

Below are few logs:

[2016-09-06 05:00:42,595] INFO Client session timed out, have not heard from 
server in 20903ms for sessionid 0x256fabec47c0003, closing socket connection 
and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2016-09-06 05:00:42,696] INFO zookeeper state changed (Disconnected) 
(org.I0Itec.zkclient.ZkClient)
[2016-09-06 05:00:42,753] INFO Partition [topic,62] on broker 4: Shrinking ISR 
for partition [topic,62] from 4,2 to 4 (kafka.cluster.Partition)
[2016-09-06 05:00:43,585] INFO Opening socket connection to server 
b0/169.254.2.1:2182. Will not attempt to authenticate using SASL (unknown 
error) (org.apache.zookeeper.ClientCnxn)
[2016-09-06 05:00:43,586] INFO Socket connection established to 
b0/169.254.2.1:2182, initiating session (org.apache.zookeeper.ClientCnxn)
[2016-09-06 05:00:43,587] INFO Unable to read additional data from server 
sessionid 0x256fabec47c0003, likely server has closed socket, closing socket 
connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2016-09-06 05:00:44,644] INFO Opening socket connection to server 
b1/169.254.2.116:2181. Will not attempt to authenticate using SASL (unknown 
error) (org.apache.zookeeper.ClientCnxn)
[2016-09-06 05:00:44,651] INFO Socket connection established to 
b1/169.254.2.116:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2016-09-06 05:00:44,658] INFO zookeeper state changed (Expired) 
(org.I0Itec.zkclient.ZkClient)
[2016-09-06 05:00:44,659] INFO Initiating client connection, 
connectString=b2.broker.com:2181,b1.broker.com:2181,zoo3.broker.com:2182 
sessionTimeout=15000 watcher=org.I0Itec.zkclient.ZkClient@37b8e86a 
(org.apache.zookeeper.ZooKeeper)
[2016-09-06 05:00:44,659] INFO Unable to reconnect to ZooKeeper service, 
session 0x256fabec47c0003 has expired, closing socket connection 
(org.apache.zookeeper.ClientCnxn)
[2016-09-06 05:00:44,661] INFO EventThread shut down 
(org.apache.zookeeper.ClientCnxn)
[2016-09-06 05:00:44,662] INFO Opening socket connection to server 
b2/169.254.2.216:2181. Will not attempt to authenticate using SASL (unknown 
error) (org.apache.zookeeper.ClientCnxn)
[2016-09-06 05:00:44,662] INFO Socket connection established to 
b2/169.254.2.216:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2016-09-06 05:00:44,665] ERROR Error handling event ZkEvent[New session event 
sent to kafka.controller.KafkaController$SessionExpirationListener@33b7dedc] 
(org.I0Itec.zkclient.ZkEventThread)
java.lang.IllegalStateException: Kafka scheduler has not been started
at kafka.utils.KafkaScheduler.ensureStarted(KafkaScheduler.scala:114)
at kafka.utils.KafkaScheduler.shutdown(KafkaScheduler.scala:86)
at 
kafka.controller.KafkaController.onControllerResignation(KafkaController.scala:350)
at 
kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSession$1.apply$mcZ$sp(KafkaController.scala:1108)
at 
kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSession$1.apply(KafkaController.scala:1107)
at 
kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSession$1.apply(KafkaController.scala:1107)
at kafka.utils.Utils$.inLock(Utils.scala:535)
at 
kafka.controller.KafkaController$SessionExpirationListener.handleNewSession(KafkaController.scala:1107)
at org.I0Itec.zkclient.ZkClient$4.run(ZkClient.java:472)
at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
[2016-09-06 05:00:44,666] INFO re-registering broker info in ZK for broker 4 
(kafka.server.KafkaHealthcheck)
[2016-09-06 05:00:44,801] INFO Session establishment complete on server 
b2/169.254.2.216:2181, sessionid = 0x256fabec47c0005, negotiated timeout = 
15000 (org.apache.zookeeper.ClientCnxn)
[2016-09-06 05:00:44,802] INFO zookeeper state changed (SyncConnected) 
(org.I0Itec.zkclient.ZkClient)
[2016-09-06 05:00:44,812] INFO Registered broker 4 at path /brokers/ids/4 with 
address b5.broker.com:9092. (kafka.utils.ZkUtils$)
[2016-09-06 05:00:44,813] INFO done re-registering broker 
(kafka.server.KafkaHealthcheck)
[2016-09-06 05:00:44,814] INFO Subscribing to /brokers/topics path to watch for 
new topics (kafka.server.KafkaHealthcheck)
[2016-09-06 05:00:44,831] INFO Partition [topic,62] on broker 4: Expanding ISR 
for partition [topic,62] from 4 to 4,2 (kafka.cluster.Partition)
[2016-09-06 05:00:44,865] INFO New leader is 1 
(kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2016-09-06 05:00:45,762] INFO [ReplicaFetcherManager on broker 4] Removed 
fetcher for partitions 

[jira] [Updated] (KAFKA-4128) Kafka broker losses messages when zookeeper session times out

2016-09-06 Thread Mazhar Shaikh (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mazhar Shaikh updated KAFKA-4128:
-
Priority: Critical  (was: Major)

> Kafka broker losses messages when zookeeper session times out
> -
>
> Key: KAFKA-4128
> URL: https://issues.apache.org/jira/browse/KAFKA-4128
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1, 0.9.0.1
>Reporter: Mazhar Shaikh
>Priority: Critical
>
> When pumping 30k msgs/second, after some 6-8 hours of running the logs below 
> are printed and messages are lost.
> [More than 5k messages are lost on every partition]
> Below are few logs:
> [2016-09-06 05:00:42,595] INFO Client session timed out, have not heard from 
> server in 20903ms for sessionid 0x256fabec47c0003, closing socket connection 
> and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2016-09-06 05:00:42,696] INFO zookeeper state changed (Disconnected) 
> (org.I0Itec.zkclient.ZkClient)
> [2016-09-06 05:00:42,753] INFO Partition [topic,62] on broker 4: Shrinking 
> ISR for partition [topic,62] from 4,2 to 4 (kafka.cluster.Partition)
> [2016-09-06 05:00:43,585] INFO Opening socket connection to server 
> sysctrl/169.254.2.1:2182. Will not attempt to authenticate using SASL 
> (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2016-09-06 05:00:43,586] INFO Socket connection established to 
> sysctrl/169.254.2.1:2182, initiating session (org.apache.zookeeper.ClientCnxn)
> [2016-09-06 05:00:43,587] INFO Unable to read additional data from server 
> sessionid 0x256fabec47c0003, likely server has closed socket, closing socket 
> connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2016-09-06 05:00:44,644] INFO Opening socket connection to server 
> b1/169.254.2.116:2181. Will not attempt to authenticate using SASL (unknown 
> error) (org.apache.zookeeper.ClientCnxn)
> [2016-09-06 05:00:44,651] INFO Socket connection established to 
> b1/169.254.2.116:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2016-09-06 05:00:44,658] INFO zookeeper state changed (Expired) 
> (org.I0Itec.zkclient.ZkClient)
> [2016-09-06 05:00:44,659] INFO Initiating client connection, 
> connectString=b2.broker.com:2181,b1.broker.com:2181,zoo3.broker.com:2182 
> sessionTimeout=15000 watcher=org.I0Itec.zkclient.ZkClient@37b8e86a 
> (org.apache.zookeeper.ZooKeeper)
> [2016-09-06 05:00:44,659] INFO Unable to reconnect to ZooKeeper service, 
> session 0x256fabec47c0003 has expired, closing socket connection 
> (org.apache.zookeeper.ClientCnxn)
> [2016-09-06 05:00:44,661] INFO EventThread shut down 
> (org.apache.zookeeper.ClientCnxn)
> [2016-09-06 05:00:44,662] INFO Opening socket connection to server 
> b2/169.254.2.216:2181. Will not attempt to authenticate using SASL (unknown 
> error) (org.apache.zookeeper.ClientCnxn)
> [2016-09-06 05:00:44,662] INFO Socket connection established to 
> b2/169.254.2.216:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2016-09-06 05:00:44,665] ERROR Error handling event ZkEvent[New session 
> event sent to 
> kafka.controller.KafkaController$SessionExpirationListener@33b7dedc] 
> (org.I0Itec.zkclient.ZkEventThread)
> java.lang.IllegalStateException: Kafka scheduler has not been started
> at kafka.utils.KafkaScheduler.ensureStarted(KafkaScheduler.scala:114)
> at kafka.utils.KafkaScheduler.shutdown(KafkaScheduler.scala:86)
> at 
> kafka.controller.KafkaController.onControllerResignation(KafkaController.scala:350)
> at 
> kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSession$1.apply$mcZ$sp(KafkaController.scala:1108)
> at 
> kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSession$1.apply(KafkaController.scala:1107)
> at 
> kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSession$1.apply(KafkaController.scala:1107)
> at kafka.utils.Utils$.inLock(Utils.scala:535)
> at 
> kafka.controller.KafkaController$SessionExpirationListener.handleNewSession(KafkaController.scala:1107)
> at org.I0Itec.zkclient.ZkClient$4.run(ZkClient.java:472)
> at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> [2016-09-06 05:00:44,666] INFO re-registering broker info in ZK for broker 4 
> (kafka.server.KafkaHealthcheck)
> [2016-09-06 05:00:44,801] INFO Session establishment complete on server 
> b2/169.254.2.216:2181, sessionid = 0x256fabec47c0005, negotiated timeout = 
> 15000 (org.apache.zookeeper.ClientCnxn)
> [2016-09-06 05:00:44,802] INFO zookeeper state changed (SyncConnected) 
> (org.I0Itec.zkclient.ZkClient)
> [2016-09-06 05:00:44,812] INFO Registered broker 4 at path /brokers/ids/4 
> with address b5.broker.com:9092. (kafka.utils.ZkUtils$)
> [2016-09-06 
