[DISCUSS] JIRA issue required even for minor/hotfix pull requests?

2015-07-13 Thread Ismael Juma
Hi all,

Guozhang raised this topic in the "[DISCUSS] Using GitHub Pull Requests for
contributions and code review" thread and suggested starting a new thread
for it.

In the Spark project, they say:

If the change is new, then it usually needs a new JIRA. However, trivial
changes, where "what should change" is virtually the same as "how it should
change", do not require a JIRA.
Example: "Fix typos in Foo scaladoc."

In such cases, the commit message would be prefixed with [MINOR] or
[HOTFIX] instead of [KAFKA-xxx].

I can see the pros and cons for each approach.

Always requiring a JIRA ticket makes it more consistent and makes it
possible to use JIRA as the place to prioritise what needs attention
(although this is imperfect as code review will take place in the pull
request and it's likely that JIRA won't always be fully in sync for
in-progress items).

Skipping JIRA tickets for minor/hotfix pull requests (where the JIRA ticket
just duplicates the information in the pull request) eliminates redundant
work and reduces the barrier to contribution (it is likely that people will
occasionally submit PRs without a JIRA even when the change is too big for
that though).

Guozhang suggested in the original thread:

Personally I think it is better not to enforce a JIRA ticket for minor /
hotfix commits; for example, we can format the title with [MINOR] or [HOTFIX]
etc. as in Spark

What do others think?

Best,
Ismael


Re: Question about sub-projects and project merging

2015-07-13 Thread Greg Stein
Hi Jay,

Looking at your question, I see the Apache Samza and Apache Kafka
*communities* have little overlap(*). The Board looks at communities, and
their overlap or lack thereof. Smushing two communities under one TLP is
what we have historically called an umbrella TLP, and we discourage that.
Communities should be allowed to operate independently.

If you have *one* community, then one TLP makes sense.

If you have *two* communities, then increase the overlap. When they look
like one community, and that one community votes to merge TLPs ... then ask
for that.

Cheers,
-g

(*) 2 common PMC members, 3 common committers.


On Mon, Jul 13, 2015 at 12:37 AM, Jay Kreps jay.kr...@gmail.com wrote:

 Hey board members,

 There is a longish thread on the Apache Samza mailing list on the
 relationship between Kafka and Samza and whether they wouldn't make a lot
 more sense as a single project. This raised some questions I was hoping to
 get advice on.

 Discussion thread (warning: super long, I attempt to summarize relevant
 bits below):

 http://mail-archives.apache.org/mod_mbox/samza-dev/201507.mbox/%3ccabyby7d_-jcxj7fizsjuebjedgbep33flyx3nrozt0yeox9...@mail.gmail.com%3E

 Anyhow, some people thought that, since Apache has lots of sub-projects, that
 would be a graceful way to step in the right direction. At that point others
 popped up and said that sub-projects are discouraged by the board.

 I'm not sure if we understand technically what a subproject is, but I
 think it means a second repo/committership under the same PMC.

 A few questions:
 - Is that what a sub-project is?
 - Are they discouraged? If so, why?
 - Assuming it makes sense in this case what is the process for making one?
 - Putting aside sub-projects as a mechanism what are examples where
 communities merged successfully? We were pointed towards Lucene/SOLR. Are
 there others?

 Relevant background info:
 - Samza depends on Kafka, but not vice versa
 - There is some overlap in committers but not extensive (3/11 Samza
 committers are also Kafka committers)

 Thanks for the advice!

 -Jay






Re: [DISCUSS] JIRA issue required even for minor/hotfix pull requests?

2015-07-13 Thread Joe Stein
Ismael,

If you create a pull request on GitHub today then a JIRA is created so
folks can see and respond and such. The JIRA hooks also provide in-comment
updates too.

What issue are you having or looking to do?

~ Joe Stein

On Mon, Jul 13, 2015 at 6:52 AM, Ismael Juma ism...@juma.me.uk wrote:

 Hi all,

 Guozhang raised this topic in the [DISCUSS] Using GitHub Pull Requests for
 contributions and code review thread and suggested starting a new thread
 for it.

 In the Spark project, they say:

 If the change is new, then it usually needs a new JIRA. However, trivial
 changes, where what should change is virtually the same as how it should
 change do not require a JIRA.
 Example: Fix typos in Foo scaladoc.

 In such cases, the commit message would be prefixed with [MINOR] or
 [HOTFIX] instead of [KAFKA-xxx].

 I can see the pros and cons for each approach.

 Always requiring a JIRA ticket makes it more consistent and makes it
 possible to use JIRA as the place to prioritise what needs attention
 (although this is imperfect as code review will take place in the pull
 request and it's likely that JIRA won't always be fully in sync for
 in-progress items).

 Skipping JIRA tickets for minor/hotfix pull requests (where the JIRA ticket
 just duplicates the information in the pull request) eliminates redundant
 work and reduces the barrier to contribution (it is likely that people will
 occasionally submit PRs without a JIRA even when the change is too big for
 that though).

 Guozhang suggested in the original thread:

 Personally I think it is better to not enforcing a JIRA ticket for minor /
 hotfix commits, for example, we can format the title with [MINOR] [HOTFIX]
 etc as in Spark

 What do others think?

 Best,
 Ismael



[jira] [Commented] (KAFKA-1835) Kafka new producer needs options to make blocking behavior explicit

2015-07-13 Thread Stefan Miklosovic (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624382#comment-14624382
 ] 

Stefan Miklosovic commented on KAFKA-1835:
--

[~becket_qin] Anything that would unblock me would be awesome. I am in the 
process of transitioning to newer Kafka releases (I am currently on 0.8.1.1 
because of this); if this is not resolved, I would have to stick with the old 
Scala producer but rewrite the consumers to use the new 0.8.3 API. While this 
could work, I do not like this approach: I want to get the new stuff on both 
the producer and consumer side and get rid of Scala dependencies all over the 
project.

 Kafka new producer needs options to make blocking behavior explicit
 ---

 Key: KAFKA-1835
 URL: https://issues.apache.org/jira/browse/KAFKA-1835
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 0.8.2.0, 0.8.3, 0.9.0
Reporter: Paul Pearcy
 Fix For: 0.8.3

 Attachments: KAFKA-1835-New-producer--blocking_v0.patch, 
 KAFKA-1835.patch

   Original Estimate: 504h
  Remaining Estimate: 504h

 The new (0.8.2 standalone) producer will block the first time it attempts to 
 retrieve metadata for a topic. This is not the desired behavior in some use 
 cases where async non-blocking guarantees are required and message loss is 
 acceptable in known cases. Also, most developers will assume an API that 
 returns a future is safe to call in a critical request path. 
 Discussing on the mailing list, the most viable option is to have the 
 following settings:
  pre.initialize.topics=x,y,z
  pre.initialize.timeout=x
  
 This moves potential blocking to the init of the producer and outside of some 
 random request. The potential will still exist for blocking in a corner case 
 where connectivity with Kafka is lost and a topic not included in pre-init 
 has a message sent for the first time. 
 There is the question of what to do when initialization fails. There are a 
 couple of options that I'd like available:
 - Fail creation of the client 
 - Fail all sends until the meta is available 
 Open to input on how the above option should be expressed. 
 It is also worth noting more nuanced solutions exist that could work without 
 the extra settings, they just end up having extra complications and at the 
 end of the day not adding much value. For instance, the producer could accept 
 and queue messages(note: more complicated than I am making it sound due to 
 storing all accepted messages in pre-partitioned compact binary form), but 
 you're still going to be forced to choose to either start blocking or 
 dropping messages at some point. 
 I have some test cases I am going to port over to the Kafka producer 
 integration ones and start from there. My current impl is in scala, but 
 porting to Java shouldn't be a big deal (was using a promise to track init 
 status, but will likely need to make that an atomic bool). 
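
 As an aside (not part of the attached patches), the effect of the proposed 
 pre.initialize.topics setting can be approximated today by warming the metadata 
 cache explicitly at startup, so the blocking fetch happens at init time rather 
 than inside the first send(). A rough sketch with placeholder broker and topic names:
 {code}
 // Sketch only: pre-fetch metadata for known topics at startup so that later
 // send() calls on the request path do not block on the first metadata fetch.
 // Broker address, serializers and topic names below are placeholders.
 import java.util.Properties
 import org.apache.kafka.clients.producer.KafkaProducer

 object PreInitSketch {
   def main(args: Array[String]): Unit = {
     val props = new Properties()
     props.put("bootstrap.servers", "broker1:9092")
     props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer")
     props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer")
     val producer = new KafkaProducer[Array[Byte], Array[Byte]](props)

     // The potentially blocking metadata fetch happens here, at init time.
     Seq("x", "y", "z").foreach(topic => producer.partitionsFor(topic))

     producer.close()
   }
 }
 {code}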



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1835) Kafka new producer needs options to make blocking behavior explicit

2015-07-13 Thread Stefan Miklosovic (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624382#comment-14624382
 ] 

Stefan Miklosovic edited comment on KAFKA-1835 at 7/13/15 8:52 AM:
---

[~becket_qin] Anything which would unblock me would be awesome. In process of 
the transition to newer Kafka releases (I am currently on 0.8.1.1 because of 
this), if this is not resolved, I would have to stick with old Scala producer 
but I would rewrite consumers to use new API of 0.8.3. While this could work, I 
do not like this approach, I want to get new stuff both on producer and 
consumer side and get rid of Scala dependencies all over the project.


was (Author: smiklosovic):
[~becket_qin] Anything which would unblock me would be awesome. In process of 
the transition to newer Kafka releases (I am currently on 0.8.1.1 because of 
this), if this will not be resolved, I would have to stick with old Scala 
producer but I would rewrite consumers to use new API of 0.8.3. While this 
could work, I do not like this approach, I want to get new stuff both on 
producer and consumer side and get rid of Scala dependencies all over the 
project.

 Kafka new producer needs options to make blocking behavior explicit
 ---

 Key: KAFKA-1835
 URL: https://issues.apache.org/jira/browse/KAFKA-1835
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 0.8.2.0, 0.8.3, 0.9.0
Reporter: Paul Pearcy
 Fix For: 0.8.3

 Attachments: KAFKA-1835-New-producer--blocking_v0.patch, 
 KAFKA-1835.patch

   Original Estimate: 504h
  Remaining Estimate: 504h

 The new (0.8.2 standalone) producer will block the first time it attempts to 
 retrieve metadata for a topic. This is not the desired behavior in some use 
 cases where async non-blocking guarantees are required and message loss is 
 acceptable in known cases. Also, most developers will assume an API that 
 returns a future is safe to call in a critical request path. 
 Discussing on the mailing list, the most viable option is to have the 
 following settings:
  pre.initialize.topics=x,y,z
  pre.initialize.timeout=x
  
 This moves potential blocking to the init of the producer and outside of some 
 random request. The potential will still exist for blocking in a corner case 
 where connectivity with Kafka is lost and a topic not included in pre-init 
 has a message sent for the first time. 
 There is the question of what to do when initialization fails. There are a 
 couple of options that I'd like available:
 - Fail creation of the client 
 - Fail all sends until the meta is available 
 Open to input on how the above option should be expressed. 
 It is also worth noting more nuanced solutions exist that could work without 
 the extra settings, they just end up having extra complications and at the 
 end of the day not adding much value. For instance, the producer could accept 
 and queue messages(note: more complicated than I am making it sound due to 
 storing all accepted messages in pre-partitioned compact binary form), but 
 you're still going to be forced to choose to either start blocking or 
 dropping messages at some point. 
 I have some test cases I am going to port over to the Kafka producer 
 integration ones and start from there. My current impl is in scala, but 
 porting to Java shouldn't be a big deal (was using a promise to track init 
 status, but will likely need to make that an atomic bool). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] JIRA issue required even for minor/hotfix pull requests?

2015-07-13 Thread Joe Stein
Sorry, I meant to say 'an email to the dev list' instead of 'a JIRA' below. I
have seen the JIRA comment hooks working recently.

~ Joe Stein

On Mon, Jul 13, 2015 at 8:42 AM, Joe Stein joe.st...@stealth.ly wrote:

 Ismael,

 If you create a pull request on github today then a JIRA is created so
 folks can see and respond and such. The JIRA hooks also provide in comment
 updates too.

 What issue are you having or looking to-do?

 ~ Joe Stein

 On Mon, Jul 13, 2015 at 6:52 AM, Ismael Juma ism...@juma.me.uk wrote:

 Hi all,

 Guozhang raised this topic in the [DISCUSS] Using GitHub Pull Requests
 for
 contributions and code review thread and suggested starting a new thread
 for it.

 In the Spark project, they say:

 If the change is new, then it usually needs a new JIRA. However, trivial
 changes, where what should change is virtually the same as how it
 should
 change do not require a JIRA.
 Example: Fix typos in Foo scaladoc.

 In such cases, the commit message would be prefixed with [MINOR] or
 [HOTFIX] instead of [KAFKA-xxx].

 I can see the pros and cons for each approach.

 Always requiring a JIRA ticket makes it more consistent and makes it
 possible to use JIRA as the place to prioritise what needs attention
 (although this is imperfect as code review will take place in the pull
 request and it's likely that JIRA won't always be fully in sync for
 in-progress items).

 Skipping JIRA tickets for minor/hotfix pull requests (where the JIRA
 ticket
 just duplicates the information in the pull request) eliminates redundant
 work and reduces the barrier to contribution (it is likely that people
 will
 occasionally submit PRs without a JIRA even when the change is too big for
 that though).

 Guozhang suggested in the original thread:

 Personally I think it is better to not enforcing a JIRA ticket for minor
 /
 hotfix commits, for example, we can format the title with [MINOR] [HOTFIX]
 etc as in Spark

 What do others think?

 Best,
 Ismael





[jira] [Commented] (KAFKA-2275) Add a ListTopic() API to the new consumer

2015-07-13 Thread Stevo Slavic (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624399#comment-14624399
 ] 

Stevo Slavic commented on KAFKA-2275:
-

[~guozhang] so it's {{listTopics()}} not {{listTopic()}}?

 Add a ListTopic() API to the new consumer
 -

 Key: KAFKA-2275
 URL: https://issues.apache.org/jira/browse/KAFKA-2275
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Priority: Critical
 Fix For: 0.8.3


 One use case for this API is for consumers that want specific partition 
 assignment with regex subscription. Implementation-wise, it involves sending a 
 TopicMetadataRequest to a random broker and parsing the response.
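
 For illustration only (the exact name and return type were still being settled, 
 as the comment above asks), a shape along these lines would cover the 
 regex-plus-manual-assignment use case; the trait and helper below are 
 hypothetical, not the committed API:
 {code}
 // Hypothetical sketch of the API shape under discussion, not the actual
 // consumer signature. Exposing the full topic list lets a caller pick the
 // partitions of regex-matched topics and assign them manually.
 import java.util.regex.Pattern

 object RegexAssignmentSketch {
   trait TopicLister {
     // topic -> partition ids, e.g. built from a TopicMetadataRequest
     def listTopics(): Map[String, Seq[Int]]
   }

   def partitionsMatching(consumer: TopicLister, regex: Pattern): Seq[(String, Int)] =
     consumer.listTopics().toSeq
       .filter { case (topic, _) => regex.matcher(topic).matches() }
       .flatMap { case (topic, partitions) => partitions.map(topic -> _) }
 }
 {code}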



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-2331) Kafka does not spread partitions in a topic among all consumers evenly

2015-07-13 Thread Stefan Miklosovic (JIRA)
Stefan Miklosovic created KAFKA-2331:


 Summary: Kafka does not spread partitions in a topic among all 
consumers evenly
 Key: KAFKA-2331
 URL: https://issues.apache.org/jira/browse/KAFKA-2331
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Stefan Miklosovic


I want to have 1 topic with 10 partitions. I am using the default configuration 
of Kafka. I create 1 topic with 10 partitions with the helper script, and now I 
am about to produce messages to it.

The problem is that even though all partitions are indeed consumed, some 
consumers have more than 1 partition assigned, even though the number of 
consumer threads equals the number of partitions in the topic, hence some 
threads are idle.

Let's describe it in more detail.

I know the common advice that you need one consumer thread per partition. I 
want to be able to commit offsets per partition, and this is possible only when 
I have 1 thread per consumer connector per partition (I am using the high-level 
consumer).

So I create 10 threads; in each thread I call 
Consumer.createJavaConsumerConnector() where I do this

topicCountMap.put("mytopic", 1);

and in the end I have 1 iterator which consumes messages from 1 partition.
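
For concreteness, a minimal sketch of the setup described above (one 
ConsumerConnector with a single stream per thread, 10 threads for the 
10-partition topic; the ZooKeeper address and group id are placeholders):

{code}
// Sketch of the reported setup: one high-level ConsumerConnector per thread,
// each with a single stream, so offsets can be committed per partition.
import java.util.Properties
import java.util.concurrent.Executors
import kafka.consumer.{Consumer, ConsumerConfig}

object OneConnectorPerPartitionSketch {
  def main(args: Array[String]): Unit = {
    val executor = Executors.newFixedThreadPool(10)
    for (i <- 0 until 10) {
      executor.submit(new Runnable {
        def run(): Unit = {
          val props = new Properties()
          props.put("zookeeper.connect", "zk:2181")
          props.put("group.id", "mygroup")
          val connector = Consumer.create(new ConsumerConfig(props))
          // One stream per connector; commit via connector.commitOffsets per partition.
          val stream = connector.createMessageStreams(Map("mytopic" -> 1))("mytopic").head
          for (msg <- stream) {
            // process msg.message()
          }
        }
      })
    }
  }
}
{code}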

When I do this 10 times, I have 10 consumers, one per thread per partition, 
where I can commit offsets independently per partition. If I put a number other 
than 1 in the topic map, I would end up with more than 1 consumer thread for 
that topic for a given consumer instance, so if I committed offsets with that 
consumer instance, it would commit them for all of its threads, which is not 
desired.

But when I run the consumers, only 7 of them are involved; the other consumer 
threads seem to be idle and I do not know why.

I am creating these consumer threads in a loop: I start the first thread 
(submit it to an executor service), then another, then another, and so on.

So the scenario is that the first consumer gets all 10 partitions; then the 2nd 
connects, so the partitions are split between the two, 5 and 5 (or something 
similar); then the other threads connect.

I understand this as partition rebalancing among all consumers, and it behaves 
well in the sense that as more consumers are created, rebalancing occurs 
between them so that every consumer should have some partitions to operate 
upon.

But from the results I see that only 7 consumers are used, and judging by the 
consumed messages the partitions seem to be split 3,2,1,1,1,1,1 across them. 
Yes, these 7 consumers cover all 10 partitions, but why don't the consumers 
with more than 1 partition give partitions up to the remaining 3 consumers?

I am wondering what is happening with the remaining 3 threads and why they do 
not grab partitions from the consumers that have more than 1 partition 
assigned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2145) An option to add topic owners.

2015-07-13 Thread Neelesh Srinivas Salian (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624578#comment-14624578
 ] 

Neelesh Srinivas Salian commented on KAFKA-2145:


I don't think I can get to this JIRA at the moment.
[~singhashish], do you have the time to look at this?

 An option to add topic owners. 
 ---

 Key: KAFKA-2145
 URL: https://issues.apache.org/jira/browse/KAFKA-2145
 Project: Kafka
  Issue Type: Improvement
Reporter: Parth Brahmbhatt
Assignee: Neelesh Srinivas Salian

 We need to expose a way for users to identify the users/groups that share 
 ownership of a topic. We discussed adding this as part of 
 https://issues.apache.org/jira/browse/KAFKA-2035 and agreed that it would be 
 simpler to add the owner as a log config. 
 The owner field can be used for auditing and also by the authorization layer to 
 grant access without having to explicitly configure ACLs. 
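
 As a sketch of what this could look like operationally (note: "owner" is the 
 key proposed in this ticket, not an existing LogConfig property, and the value 
 is made up), a topic-level config would be set the same way other per-topic 
 overrides are:
 {code}
 // Sketch only: "owner" is the per-topic config proposed in this ticket, not an
 // existing LogConfig key; the ZooKeeper address, topic and value are placeholders.
 import java.util.Properties
 import org.I0Itec.zkclient.ZkClient
 import kafka.admin.AdminUtils
 import kafka.utils.ZKStringSerializer

 object TopicOwnerSketch {
   def main(args: Array[String]): Unit = {
     val zkClient = new ZkClient("zk:2181", 30000, 30000, ZKStringSerializer)
     val configs = new Properties()
     configs.put("owner", "team-data-infra")  // hypothetical key and value
     AdminUtils.changeTopicConfig(zkClient, "mytopic", configs)
     zkClient.close()
   }
 }
 {code}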



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 35615: Patch for KAFKA-1782

2015-07-13 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35615/#review91575
---


Some general comments:

1. Regarding the @Before and @After annotations, one suggestion from the JIRA 
was that we remove any annotations other than @Test itself and use scalatest 
features (for example, 
http://doc.scalatest.org/2.2.4/#org.scalatest.BeforeAndAfter) instead. I 
cannot remember a strong motivation for this move now, so it may also be OK 
to use the JUnit tags as you chose to do.
2. Regarding org.junit.Assert and org.scalatest.Assertions in imports, if we 
decide to be junit-heavy instead of scalatest-heavy for our unit tests, we 
should use the former most of the time and only the latter for 
intercept[..] since it is not supported in the former. There seem to be a few 
places where both of them are used for fail / assert, etc.
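
As a concrete illustration of the convention suggested in point 2 
(org.junit.Assert for assertions and fail(), scalatest only for intercept), a 
test could look like the following sketch:

{code}
// Sketch of the suggested convention: org.junit.Assert for assertEquals/fail,
// org.scalatest.Assertions.intercept only where an exception is expected.
import java.io.IOException
import org.junit.Assert._
import org.junit.{Before, Test}
import org.scalatest.Assertions.intercept

class ExampleTest {
  private var fixture: String = _

  @Before
  def setUp(): Unit = { fixture = "ready" }

  @Test
  def testSomething(): Unit = {
    assertEquals("ready", fixture)
    intercept[IOException] {
      throw new IOException("expected")
    }
  }
}
{code}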


core/src/test/scala/integration/kafka/api/ConsumerTest.scala (line 33)
https://reviews.apache.org/r/35615/#comment145012

Seems this import is not used inside the class?



core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
(line 22)
https://reviews.apache.org/r/35615/#comment145015

Seems this import is not used either. BTW I have another general comment 
regarding this issue.



core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
(lines 263 - 266)
https://reviews.apache.org/r/35615/#comment145013

Is this intentional, as we already import org.junit.Assert._?



core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala 
(lines 289 - 294)
https://reviews.apache.org/r/35615/#comment145014

Same as above.



core/src/test/scala/integration/kafka/api/ProducerSendTest.scala (line 34)
https://reviews.apache.org/r/35615/#comment145016

Sometimes we use org.scalatest.Assertions.fail() and sometimes we use 
org.junit.Assert.fail(); it would be better to be consistent and use one of them, 
and personally I recommend following the org.junit package since it is more 
general.



core/src/test/scala/unit/kafka/admin/AdminTest.scala (lines 29 - 30)
https://reviews.apache.org/r/35615/#comment145017

Since we already imported Assert._ we do not need to import the other 
Assertions._ for this class.



core/src/test/scala/unit/kafka/network/SocketServerTest.scala (line 35)
https://reviews.apache.org/r/35615/#comment145026

Shall we import org.junit.{After, Before, Test} instead of 
org.scalatest.junit.JUnitSuite?



core/src/test/scala/unit/kafka/producer/ProducerTest.scala (lines 26 - 34)
https://reviews.apache.org/r/35615/#comment145027

Could you group the imports of org.*, java.*, kafka.*, etc. together? Same 
for some other places.



core/src/test/scala/unit/kafka/producer/ProducerTest.scala (line 118)
https://reviews.apache.org/r/35615/#comment145028

I think we can just use org.junit.Assert.fail here.



core/src/test/scala/unit/kafka/producer/ProducerTest.scala (line 131)
https://reviews.apache.org/r/35615/#comment145029

same above.



core/src/test/scala/unit/kafka/producer/ProducerTest.scala (line 144)
https://reviews.apache.org/r/35615/#comment145030

same above.



core/src/test/scala/unit/kafka/producer/ProducerTest.scala (line 203)
https://reviews.apache.org/r/35615/#comment145031

same above.



core/src/test/scala/unit/kafka/producer/ProducerTest.scala (line 263)
https://reviews.apache.org/r/35615/#comment145032

same above



core/src/test/scala/unit/kafka/producer/ProducerTest.scala (line 297)
https://reviews.apache.org/r/35615/#comment145033

same above



core/src/test/scala/unit/kafka/producer/ProducerTest.scala (line 311)
https://reviews.apache.org/r/35615/#comment145034

same above


- Guozhang Wang


On June 18, 2015, 6:53 p.m., Alexander Pakulov wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/35615/
 ---
 
 (Updated June 18, 2015, 6:53 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1782
 https://issues.apache.org/jira/browse/KAFKA-1782
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1782; Junit3 Misusage
 
 
 Diffs
 -
 
   core/src/test/scala/integration/kafka/api/ConsumerBounceTest.scala 
 f56096b826bdbf760411a54ba067a6a83eca8a10 
   core/src/test/scala/integration/kafka/api/ConsumerTest.scala 
 17b17b9b1520c7cc2e2ba96cdb1f9ff06e47bcad 
   core/src/test/scala/integration/kafka/api/IntegrationTestHarness.scala 
 07b1ff47bfc3cd3f948c9533c8dc977fa36d996f 
   core/src/test/scala/integration/kafka/api/ProducerBounceTest.scala 
 ce70a0a449883723a9b59ea48da34ba30b3f6daf 
   core/src/test/scala/integration/kafka/api/ProducerCompressionTest.scala 
 83de81cb3f79a6966dd5ef462733d8a22cd6d467 
   

[jira] [Commented] (KAFKA-2198) kafka-topics.sh exits with 0 status on failures

2015-07-13 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625827#comment-14625827
 ] 

Gwen Shapira commented on KAFKA-2198:
-

Thanks for the patch! Pushed to trunk.

 kafka-topics.sh exits with 0 status on failures
 ---

 Key: KAFKA-2198
 URL: https://issues.apache.org/jira/browse/KAFKA-2198
 Project: Kafka
  Issue Type: Bug
  Components: admin
Affects Versions: 0.8.2.1
Reporter: Bob Halley
Assignee: Manikumar Reddy
 Attachments: KAFKA-2198.patch, KAFKA-2198_2015-05-19_18:27:01.patch, 
 KAFKA-2198_2015-05-19_18:41:25.patch, KAFKA-2198_2015-07-10_22:02:02.patch, 
 KAFKA-2198_2015-07-10_23:11:23.patch, KAFKA-2198_2015-07-13_19:24:46.patch


 In the two failure cases below, kafka-topics.sh exits with status 0.  You 
 shouldn't need to parse output from the command to know if it failed or not.
 Case 1: Forgetting to add Kafka zookeeper chroot path to zookeeper spec
 $ kafka-topics.sh --alter --topic foo --config min.insync.replicas=2 
 --zookeeper 10.0.0.1 && echo succeeded
 succeeded
 Case 2: Bad config option.  (Also, do we really need the java backtrace?  
 It's a lot of noise most of the time.)
 $ kafka-topics.sh --alter --topic foo --config min.insync.replicasTYPO=2 
 --zookeeper 10.0.0.1/kafka && echo succeeded
 Error while executing topic command requirement failed: Unknown configuration 
 min.insync.replicasTYPO.
 java.lang.IllegalArgumentException: requirement failed: Unknown configuration 
 min.insync.replicasTYPO.
 at scala.Predef$.require(Predef.scala:233)
 at kafka.log.LogConfig$$anonfun$validateNames$1.apply(LogConfig.scala:183)
 at kafka.log.LogConfig$$anonfun$validateNames$1.apply(LogConfig.scala:182)
 at scala.collection.Iterator$class.foreach(Iterator.scala:727)
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
 at kafka.log.LogConfig$.validateNames(LogConfig.scala:182)
 at kafka.log.LogConfig$.validate(LogConfig.scala:190)
 at 
 kafka.admin.TopicCommand$.parseTopicConfigsToBeAdded(TopicCommand.scala:205)
 at 
 kafka.admin.TopicCommand$$anonfun$alterTopic$1.apply(TopicCommand.scala:103)
 at 
 kafka.admin.TopicCommand$$anonfun$alterTopic$1.apply(TopicCommand.scala:100)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
 at kafka.admin.TopicCommand$.alterTopic(TopicCommand.scala:100)
 at kafka.admin.TopicCommand$.main(TopicCommand.scala:57)
 at kafka.admin.TopicCommand.main(TopicCommand.scala)
 succeeded
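
 The committed fix is not reproduced in this digest, but the general shape of 
 such a change is to catch command failures and exit with a non-zero status 
 instead of just printing the error. A rough sketch (placeholder logic, not the 
 actual patch):
 {code}
 // Sketch of the general idea behind the fix (not the actual patch): report the
 // error and exit non-zero so callers do not have to parse the command output.
 object TopicCommandExitSketch {
   def main(args: Array[String]): Unit = {
     var exitCode = 0
     try {
       runTopicCommand(args)  // placeholder for the real command logic
     } catch {
       case e: Throwable =>
         println("Error while executing topic command: " + e.getMessage)
         exitCode = 1
     } finally {
       System.exit(exitCode)
     }
   }

   private def runTopicCommand(args: Array[String]): Unit =
     require(args.nonEmpty, "no arguments given")
 }
 {code}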



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-git-pr #2

2015-07-13 Thread Apache Jenkins Server
See https://builds.apache.org/job/kafka-trunk-git-pr/2/changes

Changes:

[cshapi] KAFKA-2198: kafka-topics.sh exits with 0 status on failures; patched 
by Manikumar Reddy reviewed by Gwen Shapira

--
[...truncated 1442 lines...]
kafka.api.ProducerBounceTest  testBrokerFailure PASSED

kafka.api.ApiUtilsTest  testShortStringNonASCII PASSED

kafka.api.ApiUtilsTest  testShortStringASCII PASSED

kafka.api.test.ProducerCompressionTest  testCompression[0] PASSED

kafka.api.test.ProducerCompressionTest  testCompression[1] PASSED

kafka.api.test.ProducerCompressionTest  testCompression[2] PASSED

kafka.api.test.ProducerCompressionTest  testCompression[3] PASSED

kafka.cluster.BrokerEndPointTest  testHashAndEquals PASSED

kafka.cluster.BrokerEndPointTest  testFromOldJSON PASSED

kafka.cluster.BrokerEndPointTest  testBrokerEndpointFromURI PASSED

kafka.cluster.BrokerEndPointTest  testEndpointFromURI PASSED

kafka.cluster.BrokerEndPointTest  testFromJSON PASSED

kafka.cluster.BrokerEndPointTest  testSerDe PASSED

kafka.javaapi.consumer.ZookeeperConsumerConnectorTest  testBasic PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  testEquals PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  
testIteratorIsConsistentWithCompression PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  testIteratorIsConsistent PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  testSizeInBytes PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  testWrittenEqualsRead PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  testEqualsWithCompression 
PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  testSizeInBytesWithCompression 
PASSED

kafka.integration.UncleanLeaderElectionTest  
testCleanLeaderElectionDisabledByTopicOverride PASSED

kafka.integration.UncleanLeaderElectionTest  
testUncleanLeaderElectionInvalidTopicOverride PASSED

kafka.integration.UncleanLeaderElectionTest  testUncleanLeaderElectionEnabled 
PASSED

kafka.integration.UncleanLeaderElectionTest  testUncleanLeaderElectionDisabled 
PASSED

kafka.integration.UncleanLeaderElectionTest  
testUncleanLeaderElectionEnabledByTopicOverride PASSED

kafka.integration.FetcherTest  testFetcher PASSED

kafka.integration.RollingBounceTest  testRollingBounce PASSED

kafka.integration.MinIsrConfigTest  testDefaultKafkaConfig PASSED

kafka.integration.PrimitiveApiTest  testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest  testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest  testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest  testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest  
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest  testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest  testEmptyFetchRequest PASSED

kafka.integration.PrimitiveApiTest  testMultiProduce PASSED

kafka.integration.AutoOffsetResetTest  testResetToLatestWhenOffsetTooHigh 
PASSED

kafka.integration.AutoOffsetResetTest  testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.AutoOffsetResetTest  testResetToEarliestWhenOffsetTooLow 
PASSED

kafka.integration.AutoOffsetResetTest  testResetToEarliestWhenOffsetTooHigh 
PASSED

kafka.integration.TopicMetadataTest  testIsrAfterBrokerShutDownAndJoinsBack 
PASSED

kafka.integration.TopicMetadataTest  testAliveBrokerListWithNoTopics PASSED

kafka.integration.TopicMetadataTest  testBasicTopicMetadata PASSED

kafka.integration.TopicMetadataTest  
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.TopicMetadataTest  
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.TopicMetadataTest  testAutoCreateTopic PASSED

kafka.integration.TopicMetadataTest  testGetAllTopicMetadata PASSED

kafka.integration.TopicMetadataTest  testTopicMetadataRequest PASSED

kafka.metrics.KafkaTimerTest  testKafkaTimer PASSED

kafka.utils.UtilsTest  testCsvMap PASSED

kafka.utils.UtilsTest  testCircularIterator PASSED

kafka.utils.UtilsTest  testReplaceSuffix PASSED

kafka.utils.UtilsTest  testAbs PASSED

kafka.utils.UtilsTest  testReadInt PASSED

kafka.utils.UtilsTest  testInLock PASSED

kafka.utils.UtilsTest  testCsvList PASSED

kafka.utils.UtilsTest  testReadBytes PASSED

kafka.utils.UtilsTest  testDoublyLinkedList PASSED

kafka.utils.UtilsTest  testSwallow PASSED

kafka.utils.SchedulerTest  testMockSchedulerNonPeriodicTask PASSED

kafka.utils.SchedulerTest  testRestart PASSED

kafka.utils.SchedulerTest  testReentrantTaskInMockScheduler PASSED

kafka.utils.SchedulerTest  testNonPeriodicTask PASSED

kafka.utils.SchedulerTest  testPeriodicTask PASSED

kafka.utils.SchedulerTest  testMockSchedulerPeriodicTask PASSED

kafka.utils.ByteBoundedBlockingQueueTest  testByteBoundedBlockingQueue PASSED

kafka.utils.CommandLineUtilsTest  testParseSingleArg PASSED

kafka.utils.CommandLineUtilsTest  testParseEmptyArg PASSED


Build failed in Jenkins: KafkaPreCommit #147

2015-07-13 Thread Apache Jenkins Server
See https://builds.apache.org/job/KafkaPreCommit/147/changes

Changes:

[cshapi] KAFKA-2198: kafka-topics.sh exits with 0 status on failures; patched 
by Manikumar Reddy reviewed by Gwen Shapira

--
[...truncated 3035 lines...]
kafka.consumer.MetricsTest  testMetricsReporterAfterDeletingTopic PASSED

kafka.consumer.ZookeeperConsumerConnectorTest  testCompressionSetConsumption 
PASSED

kafka.consumer.ZookeeperConsumerConnectorTest  testBasic PASSED

kafka.consumer.ZookeeperConsumerConnectorTest  testLeaderSelectionForPartition 
PASSED

kafka.consumer.ZookeeperConsumerConnectorTest  testConsumerDecoder PASSED

kafka.consumer.ZookeeperConsumerConnectorTest  testConsumerRebalanceListener 
PASSED

kafka.consumer.ZookeeperConsumerConnectorTest  testCompression PASSED

kafka.consumer.ConsumerIteratorTest  testConsumerIteratorDecodingFailure PASSED

kafka.consumer.ConsumerIteratorTest  
testConsumerIteratorDeduplicationDeepIterator PASSED

kafka.javaapi.consumer.ZookeeperConsumerConnectorTest  testBasic PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  testSizeInBytes PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  testWrittenEqualsRead PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  testEquals PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  testSizeInBytesWithCompression 
PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  
testIteratorIsConsistentWithCompression PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  testIteratorIsConsistent PASSED

kafka.javaapi.message.ByteBufferMessageSetTest  testEqualsWithCompression 
PASSED

kafka.server.LogOffsetTest  testGetOffsetsBeforeNow PASSED

kafka.server.LogOffsetTest  testGetOffsetsForUnknownTopic PASSED

kafka.server.LogOffsetTest  testGetOffsetsBeforeEarliestTime PASSED

kafka.server.LogOffsetTest  testEmptyLogsGetOffsets PASSED

kafka.server.LogOffsetTest  testGetOffsetsBeforeLatestTime PASSED

kafka.server.HighwatermarkPersistenceTest  
testHighWatermarkPersistenceSinglePartition PASSED

kafka.server.HighwatermarkPersistenceTest  
testHighWatermarkPersistenceMultiplePartitions PASSED

kafka.server.KafkaConfigTest  testInvalidCompressionType PASSED

kafka.server.KafkaConfigTest  testLogRetentionTimeMinutesProvided PASSED

kafka.server.KafkaConfigTest  testAdvertisePortDefault PASSED

kafka.server.KafkaConfigTest  testAdvertiseHostNameDefault PASSED

kafka.server.KafkaConfigTest  testDuplicateListeners PASSED

kafka.server.KafkaConfigTest  testBadListenerProtocol PASSED

kafka.server.KafkaConfigTest  testLogRetentionTimeBothMinutesAndMsProvided 
PASSED

kafka.server.KafkaConfigTest  testLogRollTimeMsProvided PASSED

kafka.server.KafkaConfigTest  testLogRollTimeBothMsAndHoursProvided PASSED

kafka.server.KafkaConfigTest  testListenerDefaults PASSED

kafka.server.KafkaConfigTest  testUncleanElectionDisabled PASSED

kafka.server.KafkaConfigTest  testUncleanElectionEnabled PASSED

kafka.server.KafkaConfigTest  testLogRetentionValid PASSED

kafka.server.KafkaConfigTest  testLogRetentionTimeMsProvided PASSED

kafka.server.KafkaConfigTest  testAdvertiseDefaults PASSED

kafka.server.KafkaConfigTest  testAdvertiseConfigured PASSED

kafka.server.KafkaConfigTest  testDefaultCompressionType PASSED

kafka.server.KafkaConfigTest  testValidCompressionType PASSED

kafka.server.KafkaConfigTest  testLogRetentionTimeNoConfigProvided PASSED

kafka.server.KafkaConfigTest  testUncleanElectionInvalid PASSED

kafka.server.KafkaConfigTest  testVersionConfiguration PASSED

kafka.server.KafkaConfigTest  testUncleanLeaderElectionDefault PASSED

kafka.server.KafkaConfigTest  testLogRetentionTimeHoursProvided PASSED

kafka.server.KafkaConfigTest  testLogRetentionTimeBothMinutesAndHoursProvided 
PASSED

kafka.server.KafkaConfigTest  testLogRetentionUnlimited PASSED

kafka.server.KafkaConfigTest  testLogRollTimeNoConfigProvided PASSED

kafka.server.LogRecoveryTest  testHWCheckpointWithFailuresSingleLogSegment 
PASSED

kafka.server.LogRecoveryTest  testHWCheckpointNoFailuresSingleLogSegment PASSED

kafka.server.LogRecoveryTest  testHWCheckpointNoFailuresMultipleLogSegments 
PASSED

kafka.server.LogRecoveryTest  testHWCheckpointWithFailuresMultipleLogSegments 
PASSED

kafka.server.DelayedOperationTest  testRequestExpiry PASSED

kafka.server.DelayedOperationTest  testRequestSatisfaction PASSED

kafka.server.DelayedOperationTest  testRequestPurge PASSED

kafka.server.AdvertiseBrokerTest  testBrokerAdvertiseToZK PASSED

kafka.server.ReplicaManagerTest  testHighWaterMarkDirectoryMapping PASSED

kafka.server.ReplicaManagerTest  testIllegalRequiredAcks PASSED

kafka.server.ReplicaManagerTest  testHighwaterMarkRelativeDirectoryMapping 
PASSED

kafka.server.ServerGenerateBrokerIdTest  testAutoGenerateBrokerId PASSED

kafka.server.ServerGenerateBrokerIdTest  testUserConfigAndGeneratedBrokerId 
PASSED

kafka.server.ServerGenerateBrokerIdTest  testMultipleLogDirsMetaProps PASSED

kafka.server.ServerGenerateBrokerIdTest  

[jira] [Updated] (KAFKA-1782) Junit3 Misusage

2015-07-13 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1782:
-
Status: In Progress  (was: Patch Available)

 Junit3 Misusage
 ---

 Key: KAFKA-1782
 URL: https://issues.apache.org/jira/browse/KAFKA-1782
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Alexander Pakulov
  Labels: newbie
 Fix For: 0.8.3

 Attachments: KAFKA-1782.patch, KAFKA-1782_2015-06-18_11:52:49.patch


 This was found while I was working on KAFKA-1580: in many of the cases where 
 we explicitly extend from JUnit3Suite (e.g. ProducerFailureHandlingTest), we 
 are actually misusing a bunch of features that only exist in JUnit4, such as 
 @Test(expected = classOf[...]). For example, the following code
 {code}
 import org.scalatest.junit.JUnit3Suite
 import org.junit.Test
 import java.io.IOException
 class MiscTest extends JUnit3Suite {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 will actually pass even though IOException was not thrown since this 
 annotation is not supported in Junit3. Whereas
 {code}
 import org.junit._
 import java.io.IOException
 class MiscTest extends JUnit3Suite {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 or
 {code}
 import org.scalatest.junit.JUnitSuite
 import org.junit._
 import java.io.IOException
  class MiscTest extends JUnitSuite {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 or
 {code}
 import org.junit._
 import java.io.IOException
 class MiscTest {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 will fail.
 I would propose not to rely on JUnit annotations other than @Test itself but 
 to use scalatest assertions (e.g. intercept) instead, for example:
 {code}
  import org.junit._
  import org.scalatest.Assertions.intercept
  import java.io.IOException
 class MiscTest {
   @Test
   def testSendOffset() {
 intercept[IOException] {
   //nothing
 }
   }
 }
 {code}
 will fail with a clearer stacktrace.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Inquiry regarding unreviewed patch (KAFKA-1614)

2015-07-13 Thread Jisoo Kim
To whom it may concern,

My coworker submitted a patch
https://issues.apache.org/jira/browse/KAFKA-1614 about a year ago, which
enables JMX to report segment information, so that the amount of data each
broker has can be calculated through JMX polling.

May I ask about the progress of reviewing the patch? I'd like to add a new
feature that does a similar thing to Yahoo's Kafka Manager, and would
greatly appreciate it if the patch could be applied to the repo, so that Yahoo's
Kafka Manager can display the information when used with a Kafka version that
includes the patch. Thanks!

Regards,
Jisoo
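
For reference, polling this kind of per-log metric from another program over JMX
is straightforward once the broker exposes it. A rough sketch follows; the JMX
port and the MBean name pattern ("kafka.log" domain, gauges whose name contains
"Size", attribute "Value") are assumptions that depend on the broker version and
on the KAFKA-1614 patch:

{code}
// Rough sketch of polling a broker's JMX endpoint for log size metrics from an
// external program. The MBean naming below is an assumption, not a guarantee.
import javax.management.ObjectName
import javax.management.remote.{JMXConnectorFactory, JMXServiceURL}
import scala.collection.JavaConverters._

object BrokerLogSizeSketch {
  def main(args: Array[String]): Unit = {
    val url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://broker1:9999/jmxrmi")
    val connector = JMXConnectorFactory.connect(url)
    try {
      val mbeans = connector.getMBeanServerConnection
      val sizeBeans = mbeans.queryNames(new ObjectName("kafka.log:*"), null).asScala
        .filter(_.getCanonicalName.contains("Size"))
      val totalBytes = sizeBeans.map { name =>
        mbeans.getAttribute(name, "Value").asInstanceOf[Number].longValue()
      }.sum
      println("Approximate data held by broker: " + totalBytes + " bytes")
    } finally {
      connector.close()
    }
  }
}
{code}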


Re: Inquiry regarding unreviewed patch (KAFKA-1614)

2015-07-13 Thread Jisoo Kim
Also, please let me know if there's a way for another program to know the
amount of data each broker currently holds.

Thanks,
Jisoo

On Mon, Jul 13, 2015 at 4:59 PM, Jisoo Kim jisoo@metamarkets.com
wrote:

 To whom may it concern,

 My coworker submitted a patch
 https://issues.apache.org/jira/browse/KAFKA-1614 about a year ago,
 which enables JMX to report segment information, so the amount of data each
 broker has can be calculated through JMX polling.

 May I ask the progress on reviewing the patch? I'd like to add a new
 feature that does the similar thing to Yahoo's Kafka Manager, and would
 greatly appreciate if the patch can be applied to the repo, so Yahoo's
 Kafka Manager can display the information when using Kafka with the version
 with the patch. Thanks!

 Regards,
 Jisoo



[jira] [Commented] (KAFKA-2214) kafka-reassign-partitions.sh --verify should return non-zero exit codes when reassignment is not completed yet

2015-07-13 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625844#comment-14625844
 ] 

Gwen Shapira commented on KAFKA-2214:
-

Thank you!

Can you also address [~miguno]'s comment? 
I think your suggestion to return different error codes for failed and 
in-progress is reasonable.
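
A sketch of what "different error codes for failed and in progress" could look 
like (illustrative only; the status names are placeholders, not classes from the 
patch under review):

{code}
// Illustrative sketch: distinct exit codes for the --verify outcome
// (0 = all completed, 1 = some reassignment failed, 2 = still in progress).
object VerifyExitCodeSketch {
  sealed trait ReassignmentStatus
  case object Completed extends ReassignmentStatus
  case object InProgress extends ReassignmentStatus
  case object Failed extends ReassignmentStatus

  def exitCode(statuses: Seq[ReassignmentStatus]): Int =
    if (statuses.contains(Failed)) 1
    else if (statuses.contains(InProgress)) 2
    else 0

  def main(args: Array[String]): Unit =
    System.exit(exitCode(Seq(Completed, InProgress)))  // exits with 2
}
{code}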

 kafka-reassign-partitions.sh --verify should return non-zero exit codes when 
 reassignment is not completed yet
 --

 Key: KAFKA-2214
 URL: https://issues.apache.org/jira/browse/KAFKA-2214
 Project: Kafka
  Issue Type: Improvement
  Components: admin
Affects Versions: 0.8.1.1, 0.8.2.0
Reporter: Michael Noll
Assignee: Manikumar Reddy
Priority: Minor
 Attachments: KAFKA-2214.patch, KAFKA-2214_2015-07-10_21:56:04.patch, 
 KAFKA-2214_2015-07-13_21:10:58.patch


 h4. Background
 The admin script {{kafka-reassign-partitions.sh}} should integrate better 
 with automation tools such as Ansible, which rely on scripts adhering to Unix 
 best practices such as appropriate exit codes on success/failure.
 h4. Current behavior (incorrect)
 When reassignments are still in progress {{kafka-reassign-partitions.sh}} 
 prints {{ERROR}} messages but returns an exit code of zero, which indicates 
 success.  This behavior makes it a bit cumbersome to integrate the script 
 into automation tools such as Ansible.
 {code}
 $ kafka-reassign-partitions.sh --zookeeper zookeeper1:2181 
 --reassignment-json-file partitions-to-move.json --verify
 Status of partition reassignment:
 ERROR: Assigned replicas (316,324,311) don't match the list of replicas for 
 reassignment (316,324) for partition [mytopic,2]
 Reassignment of partition [mytopic,0] completed successfully
 Reassignment of partition [myothertopic,1] completed successfully
 Reassignment of partition [myothertopic,3] completed successfully
 ...
 $ echo $?
 0
 # But preferably the exit code in the presence of ERRORs should be, say, 1.
 {code}
 h3. How to improve
 I'd suggest that, using the above as the running example, if there are any 
 {{ERROR}} entries in the output (i.e. if there are any assignments remaining 
 that don't match the desired assignments), then the 
 {{kafka-reassign-partitions.sh}}  should return a non-zero exit code.
 h3. Notes
 In Kafka 0.8.2 the output is a bit different: The ERROR messages are now 
 phrased differently.
 Before:
 {code}
 ERROR: Assigned replicas (316,324,311) don't match the list of replicas for 
 reassignment (316,324) for partition [mytopic,2]
 {code}
 Now:
 {code}
 Reassignment of partition [mytopic,2] is still in progress
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1835) Kafka new producer needs options to make blocking behavior explicit

2015-07-13 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625781#comment-14625781
 ] 

Jiangjie Qin commented on KAFKA-1835:
-

[~ewencp] I agree that handling exceptions is something users have to do. But 
telling users they are guaranteed to receive an exception for a valid 
configuration sounds a bit awkward to me. I think it would be better to only 
give an exception to the user when something really went wrong instead of 
asking the user to handle false alarms.
WRT the stale metadata, I agree with you that we should let the user know 
immediately if a metadata refresh failed (actually, from this point of view, we 
should try to fetch metadata from the bootstrap servers upon client 
instantiation instead of doing it later, because the bootstrap servers might 
not even be connectable), but we might want to be very careful about failing 
sends if we can still send them. This looks more like a design decision than a 
bug to me. One argument is that we should let the user know immediately if 
something goes wrong. On the other hand, we want to deliver the messages if 
possible instead of simply dropping them on the floor. So maybe we can append 
the messages but throw an exception saying that the metadata is outdated.
Also, I think it might be worth thinking about what kind of exception we want 
to expose to the user. For instance, if a partition of a topic is offline, 
should we throw an exception in send() or should we just send messages to the 
other available partitions? If the user were sending keyed messages, the answer 
would be obvious; what if they are sending non-keyed messages?
Thanks for the feedback [~stevenz3wu]. I guess in your case you are producing 
messages to a changing topic set. In that case, it is necessary to deal with 
the exception during producing if the metadata timeout is set to 0. But for 
people who are producing to a single fixed topic, the metadata supposedly 
should not be lost after the first successful metadata fetch. If it is lost, 
then that would indicate a bigger problem, such as a partition going offline.

 Kafka new producer needs options to make blocking behavior explicit
 ---

 Key: KAFKA-1835
 URL: https://issues.apache.org/jira/browse/KAFKA-1835
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 0.8.2.0, 0.8.3, 0.9.0
Reporter: Paul Pearcy
 Fix For: 0.8.3

 Attachments: KAFKA-1835-New-producer--blocking_v0.patch, 
 KAFKA-1835.patch

   Original Estimate: 504h
  Remaining Estimate: 504h

 The new (0.8.2 standalone) producer will block the first time it attempts to 
 retrieve metadata for a topic. This is not the desired behavior in some use 
 cases where async non-blocking guarantees are required and message loss is 
 acceptable in known cases. Also, most developers will assume an API that 
 returns a future is safe to call in a critical request path. 
 Discussing on the mailing list, the most viable option is to have the 
 following settings:
  pre.initialize.topics=x,y,z
  pre.initialize.timeout=x
  
 This moves potential blocking to the init of the producer and outside of some 
 random request. The potential will still exist for blocking in a corner case 
 where connectivity with Kafka is lost and a topic not included in pre-init 
 has a message sent for the first time. 
 There is the question of what to do when initialization fails. There are a 
 couple of options that I'd like available:
 - Fail creation of the client 
 - Fail all sends until the meta is available 
 Open to input on how the above option should be expressed. 
 It is also worth noting more nuanced solutions exist that could work without 
 the extra settings, they just end up having extra complications and at the 
 end of the day not adding much value. For instance, the producer could accept 
 and queue messages(note: more complicated than I am making it sound due to 
 storing all accepted messages in pre-partitioned compact binary form), but 
 you're still going to be forced to choose to either start blocking or 
 dropping messages at some point. 
 I have some test cases I am going to port over to the Kafka producer 
 integration ones and start from there. My current impl is in scala, but 
 porting to Java shouldn't be a big deal (was using a promise to track init 
 status, but will likely need to make that an atomic bool). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1835) Kafka new producer needs options to make blocking behavior explicit

2015-07-13 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625781#comment-14625781
 ] 

Jiangjie Qin edited comment on KAFKA-1835 at 7/14/15 4:06 AM:
--

[~ewencp] I agree that handling exception is something users have to do. But 
telling user they are guaranteed to receive exception for a valid configuration 
sounds a bit awkward to me. I think it would be better to only give exception 
to user when there is really something went wrong instead of asking user to 
handle false alarms.

WRT the stale metadata, I agree with you we should let user know immediately if 
a metadata refresh failed (actually from this point of view, we should try to 
fetch metadata from bootstrap servers up on clients instantiation instead of 
doing it later because bootstrap servers might even not connectable), but we 
might want to be very careful on failing send if we can still send them. This 
looks more of a design decision rather than a bug to me. One argument is that 
we should let user know immediately if something goes wrong. On the other hand, 
we want to deliver the message if possible instead of simply dropping them on 
the floor. So maybe we can append the messages but throw an exception saying 
that metadata is outdated.

Also, I think it might worth thinking what kind of exception we want to expose 
to user. For instance, if a partition of a topic is offline, should we throw 
exception in send() or should we just send messages to other available 
partitions. If user were sending keyed messages, the answer would be obvious, 
what if it is sending non-keyed messages?

Thanks for the feedback [~stevenz3wu], I guess in your case you are producing 
messages to a changing topic set. In that case, it is necessary to deal with 
the exception during producing if matadata timeout is set to 0. But for people 
who are producing to a single fixed topic, supposedly metadata should not be 
lost after the first successful metadata fetch. If it is lost then that would 
be a big problem such as partition gets offline.


was (Author: becket_qin):
[~ewencp] I agree that handling exception is something users have to do. But 
telling user they are guaranteed to receive exception for a valid configuration 
sounds a bit awkward to me. I think it would be better to only give exception 
to user when there is really something went wrong instead of asking user to 
handle false alarms.
WRT the stale metadata, I agree with you we should let user know immediately if 
a metadata refresh failed (actually from this point of view, we should try to 
fetch metadata from bootstrap servers up on clients instantiation instead of 
doing it later because bootstrap servers might even not connectable), but we 
might want to be very careful on failing send if we can still send them. This 
looks more of a design decision rather than a bug to me. One argument is that 
we should let user know immediately if something goes wrong. On the other hand, 
we want to deliver the message if possible instead of simply dropping them on 
the floor. So maybe we can append the messages but throw an exception saying 
that metadata is outdated.
Also, I think it might worth thinking what kind of exception we want to expose 
to user. For instance, if a partition of a topic is offline, should we throw 
exception in send() or should we just send messages to other available 
partitions. If user were sending keyed messages, the answer would be obvious, 
what if it is sending non-keyed messages?
Thanks for the feedback [~stevenz3wu], I guess in your case you are producing 
messages to a changing topic set. In that case, it is necessary to deal with 
the exception during producing if matadata timeout is set to 0. But for people 
who are producing to a single fixed topic, supposedly metadata should not be 
lost after the first successful metadata fetch. If it is lost then that would 
be a big problem such as partition gets offline.

 Kafka new producer needs options to make blocking behavior explicit
 ---

 Key: KAFKA-1835
 URL: https://issues.apache.org/jira/browse/KAFKA-1835
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 0.8.2.0, 0.8.3, 0.9.0
Reporter: Paul Pearcy
 Fix For: 0.8.3

 Attachments: KAFKA-1835-New-producer--blocking_v0.patch, 
 KAFKA-1835.patch

   Original Estimate: 504h
  Remaining Estimate: 504h

 The new (0.8.2 standalone) producer will block the first time it attempts to 
 retrieve metadata for a topic. This is not the desired behavior in some use 
 cases where async non-blocking guarantees are required and message loss is 
 acceptable in known cases. Also, most developers will assume an API that 
 returns a future is safe to call in 

[jira] [Commented] (KAFKA-1835) Kafka new producer needs options to make blocking behavior explicit

2015-07-13 Thread Steven Zhen Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625763#comment-14625763
 ] 

Steven Zhen Wu commented on KAFKA-1835:
---

[~ewencp] As a user, I don't mind handling the error if metadata is not 
available or the buffer is full, but fail fast and don't block my thread, 
because the API is advertised as non-blocking/async.

[~becket_qin] We did a workaround exactly as you described. Our goal is to 
never block the caller thread: if we have seen metadata for a topic, we go to 
the fast lane and call send directly; otherwise, we put the msg into a queue 
and a background thread drains the queue, checking partitionsFor(...), which can block.
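
For readers following the thread, a minimal sketch of the workaround described 
above (application-level code around the producer, with the returned 
futures/callbacks omitted for brevity):

{code}
// Sketch of the described workaround: send directly ("fast lane") for topics
// whose metadata has already been seen; otherwise queue the record and let a
// background thread call partitionsFor() (which may block) before sending.
import java.util.concurrent.{ConcurrentHashMap, LinkedBlockingQueue}
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

class NonBlockingSender(producer: KafkaProducer[Array[Byte], Array[Byte]]) {
  private val knownTopics = new ConcurrentHashMap[String, java.lang.Boolean]()
  private val pending = new LinkedBlockingQueue[ProducerRecord[Array[Byte], Array[Byte]]]()

  def send(record: ProducerRecord[Array[Byte], Array[Byte]]): Unit = {
    if (knownTopics.containsKey(record.topic)) producer.send(record)  // fast lane
    else pending.put(record)                                          // slow lane
  }

  // Background thread: may block on the metadata fetch, but never the caller.
  private val drainer = new Thread(new Runnable {
    def run(): Unit = while (true) {
      val record = pending.take()
      producer.partitionsFor(record.topic)  // blocks until metadata is available
      knownTopics.put(record.topic, java.lang.Boolean.TRUE)
      producer.send(record)
    }
  })
  drainer.setDaemon(true)
  drainer.start()
}
{code}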

 Kafka new producer needs options to make blocking behavior explicit
 ---

 Key: KAFKA-1835
 URL: https://issues.apache.org/jira/browse/KAFKA-1835
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 0.8.2.0, 0.8.3, 0.9.0
Reporter: Paul Pearcy
 Fix For: 0.8.3

 Attachments: KAFKA-1835-New-producer--blocking_v0.patch, 
 KAFKA-1835.patch

   Original Estimate: 504h
  Remaining Estimate: 504h

 The new (0.8.2 standalone) producer will block the first time it attempts to 
 retrieve metadata for a topic. This is not the desired behavior in some use 
 cases where async non-blocking guarantees are required and message loss is 
 acceptable in known cases. Also, most developers will assume an API that 
 returns a future is safe to call in a critical request path. 
 Discussing on the mailing list, the most viable option is to have the 
 following settings:
  pre.initialize.topics=x,y,z
  pre.initialize.timeout=x
  
 This moves potential blocking to the init of the producer and outside of some 
 random request. The potential will still exist for blocking in a corner case 
 where connectivity with Kafka is lost and a topic not included in pre-init 
 has a message sent for the first time. 
 There is the question of what to do when initialization fails. There are a 
 couple of options that I'd like available:
 - Fail creation of the client 
 - Fail all sends until the meta is available 
 Open to input on how the above option should be expressed. 
 It is also worth noting more nuanced solutions exist that could work without 
 the extra settings, they just end up having extra complications and at the 
 end of the day not adding much value. For instance, the producer could accept 
 and queue messages(note: more complicated than I am making it sound due to 
 storing all accepted messages in pre-partitioned compact binary form), but 
 you're still going to be forced to choose to either start blocking or 
 dropping messages at some point. 
 I have some test cases I am going to port over to the Kafka producer 
 integration ones and start from there. My current impl is in scala, but 
 porting to Java shouldn't be a big deal (was using a promise to track init 
 status, but will likely need to make that an atomic bool). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-242) Subsequent calls of ConsumerConnector.createMessageStreams cause Consumer offset to be incorrect

2015-07-13 Thread Stefan Miklosovic (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624630#comment-14624630
 ] 

Stefan Miklosovic commented on KAFKA-242:
-

[~jkreps] I am hitting the same issue. 

I opened a JIRA here: https://issues.apache.org/jira/browse/KAFKA-2331

I am calling ConsumerConnector.createMessageStreams in rapid succession and it 
seems that it is not handling rebalancing of partitions correctly.

 Subsequent calls of ConsumerConnector.createMessageStreams cause Consumer 
 offset to be incorrect
 

 Key: KAFKA-242
 URL: https://issues.apache.org/jira/browse/KAFKA-242
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.7
Reporter: David Arthur
 Attachments: kafka.log


 When calling ConsumerConnector.createMessageStreams in rapid succession, the 
 Consumer offset is incorrectly advanced causing the consumer to lose 
 messages. This seems to happen when createMessageStreams is called before the 
 rebalancing triggered by the previous call to createMessageStreams has 
 completed. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] JIRA issue required even for minor/hotfix pull requests?

2015-07-13 Thread Ismael Juma
Hi Joe,

Yes, I am aware of the emails and automatic JIRA updates.

The question is whether a contributor who wants to make a simple change (eg
fix a typo, improve a scaladoc, make a small code improvement) should have
to create a JIRA for it and then submit the PR or if they can just skip the
JIRA step. I will update the following wiki page accordingly once we decide
one way or another:

https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes

Best,
Ismael

On Mon, Jul 13, 2015 at 1:46 PM, Joe Stein joe.st...@stealth.ly wrote:

 Sorry, meant to say 'an email to dev list' instead of 'a JIRA' below. The
 hooks in JIRA comments I have seen working recently.

 ~ Joe Stein

 On Mon, Jul 13, 2015 at 8:42 AM, Joe Stein joe.st...@stealth.ly wrote:

  Ismael,
 
  If you create a pull request on github today then a JIRA is created so
  folks can see and respond and such. The JIRA hooks also provide in
 comment
  updates too.
 
  What issue are you having or looking to-do?
 
  ~ Joe Stein
 
  On Mon, Jul 13, 2015 at 6:52 AM, Ismael Juma ism...@juma.me.uk wrote:
 
  Hi all,
 
  Guozhang raised this topic in the [DISCUSS] Using GitHub Pull Requests
  for
  contributions and code review thread and suggested starting a new
 thread
  for it.
 
  In the Spark project, they say:
 
  If the change is new, then it usually needs a new JIRA. However,
 trivial
  changes, where what should change is virtually the same as how it
  should
  change do not require a JIRA.
  Example: Fix typos in Foo scaladoc.
 
  In such cases, the commit message would be prefixed with [MINOR] or
  [HOTFIX] instead of [KAFKA-xxx].
 
  I can see the pros and cons for each approach.
 
  Always requiring a JIRA ticket makes it more consistent and makes it
  possible to use JIRA as the place to prioritise what needs attention
  (although this is imperfect as code review will take place in the pull
  request and it's likely that JIRA won't always be fully in sync for
  in-progress items).
 
  Skipping JIRA tickets for minor/hotfix pull requests (where the JIRA
  ticket
  just duplicates the information in the pull request) eliminates
 redundant
  work and reduces the barrier to contribution (it is likely that people
  will
  occasionally submit PRs without a JIRA even when the change is too big
 for
  that though).
 
  Guozhang suggested in the original thread:
 
  Personally I think it is better to not enforcing a JIRA ticket for
 minor
  /
  hotfix commits, for example, we can format the title with [MINOR]
 [HOTFIX]
  etc as in Spark
 
  What do others think?
 
  Best,
  Ismael
 
 
 



Re: [DISCUSS] JIRA issue required even for minor/hotfix pull requests?

2015-07-13 Thread Joe Stein
Ismael,

If the patch lives on a pull request and is a simple hotfix a committer
could +1 and commit it. I don't see anything in the
https://cwiki.apache.org/confluence/display/KAFKA/Bylaws preventing this
already now. I guess I am still struggling to understand what is not set up that
you think we need to get set up, or what changes you are looking to make
differently. What are we trying to discuss and decide on with regard to this?

~ Joe Stein

On Mon, Jul 13, 2015 at 8:51 AM, Ismael Juma ism...@juma.me.uk wrote:

 Hi Joe,

 Yes, I am aware of the emails and automatic JIRA updates.

 The question is whether a contributor who wants to make a simple change (eg
 fix a typo, improve a scaladoc, make a small code improvement) should have
 to create a JIRA for it and then submit the PR or if they can just skip the
 JIRA step. I will update the following wiki page accordingly once we decide
 one way or another:

 https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes

 Best,
 Ismael

 On Mon, Jul 13, 2015 at 1:46 PM, Joe Stein joe.st...@stealth.ly wrote:

  Sorry, meant to say 'an email to dev list' instead of 'a JIRA' below. The
  hooks in JIRA comments I have seen working recently.
 
  ~ Joe Stein
 
  On Mon, Jul 13, 2015 at 8:42 AM, Joe Stein joe.st...@stealth.ly wrote:
 
   Ismael,
  
   If you create a pull request on github today then a JIRA is created so
   folks can see and respond and such. The JIRA hooks also provide in
  comment
   updates too.
  
   What issue are you having or looking to-do?
  
   ~ Joe Stein
  
   On Mon, Jul 13, 2015 at 6:52 AM, Ismael Juma ism...@juma.me.uk
 wrote:
  
   Hi all,
  
   Guozhang raised this topic in the [DISCUSS] Using GitHub Pull
 Requests
   for
   contributions and code review thread and suggested starting a new
  thread
   for it.
  
   In the Spark project, they say:
  
   If the change is new, then it usually needs a new JIRA. However,
  trivial
   changes, where what should change is virtually the same as how it
   should
   change do not require a JIRA.
   Example: Fix typos in Foo scaladoc.
  
   In such cases, the commit message would be prefixed with [MINOR] or
   [HOTFIX] instead of [KAFKA-xxx].
  
   I can see the pros and cons for each approach.
  
   Always requiring a JIRA ticket makes it more consistent and makes it
   possible to use JIRA as the place to prioritise what needs attention
   (although this is imperfect as code review will take place in the pull
   request and it's likely that JIRA won't always be fully in sync for
   in-progress items).
  
   Skipping JIRA tickets for minor/hotfix pull requests (where the JIRA
   ticket
   just duplicates the information in the pull request) eliminates
  redundant
   work and reduces the barrier to contribution (it is likely that people
   will
   occasionally submit PRs without a JIRA even when the change is too big
  for
   that though).
  
   Guozhang suggested in the original thread:
  
   Personally I think it is better to not enforcing a JIRA ticket for
  minor
   /
   hotfix commits, for example, we can format the title with [MINOR]
  [HOTFIX]
   etc as in Spark
  
   What do others think?
  
   Best,
   Ismael
  
  
  
 



[jira] [Commented] (KAFKA-2198) kafka-topics.sh exits with 0 status on failures

2015-07-13 Thread Manikumar Reddy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624674#comment-14624674
 ] 

Manikumar Reddy commented on KAFKA-2198:


Updated reviewboard https://reviews.apache.org/r/34403/diff/
 against branch origin/trunk

 kafka-topics.sh exits with 0 status on failures
 ---

 Key: KAFKA-2198
 URL: https://issues.apache.org/jira/browse/KAFKA-2198
 Project: Kafka
  Issue Type: Bug
  Components: admin
Affects Versions: 0.8.2.1
Reporter: Bob Halley
Assignee: Manikumar Reddy
 Attachments: KAFKA-2198.patch, KAFKA-2198_2015-05-19_18:27:01.patch, 
 KAFKA-2198_2015-05-19_18:41:25.patch, KAFKA-2198_2015-07-10_22:02:02.patch, 
 KAFKA-2198_2015-07-10_23:11:23.patch, KAFKA-2198_2015-07-13_19:24:46.patch


 In the two failure cases below, kafka-topics.sh exits with status 0.  You 
 shouldn't need to parse output from the command to know if it failed or not.
 Case 1: Forgetting to add Kafka zookeeper chroot path to zookeeper spec
 $ kafka-topics.sh --alter --topic foo --config min.insync.replicas=2 
 --zookeeper 10.0.0.1 && echo succeeded
 succeeded
 Case 2: Bad config option.  (Also, do we really need the java backtrace?  
 It's a lot of noise most of the time.)
 $ kafka-topics.sh --alter --topic foo --config min.insync.replicasTYPO=2 
 --zookeeper 10.0.0.1/kafka && echo succeeded
 Error while executing topic command requirement failed: Unknown configuration 
 min.insync.replicasTYPO.
 java.lang.IllegalArgumentException: requirement failed: Unknown configuration 
 min.insync.replicasTYPO.
 at scala.Predef$.require(Predef.scala:233)
 at kafka.log.LogConfig$$anonfun$validateNames$1.apply(LogConfig.scala:183)
 at kafka.log.LogConfig$$anonfun$validateNames$1.apply(LogConfig.scala:182)
 at scala.collection.Iterator$class.foreach(Iterator.scala:727)
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
 at kafka.log.LogConfig$.validateNames(LogConfig.scala:182)
 at kafka.log.LogConfig$.validate(LogConfig.scala:190)
 at 
 kafka.admin.TopicCommand$.parseTopicConfigsToBeAdded(TopicCommand.scala:205)
 at 
 kafka.admin.TopicCommand$$anonfun$alterTopic$1.apply(TopicCommand.scala:103)
 at 
 kafka.admin.TopicCommand$$anonfun$alterTopic$1.apply(TopicCommand.scala:100)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
 at kafka.admin.TopicCommand$.alterTopic(TopicCommand.scala:100)
 at kafka.admin.TopicCommand$.main(TopicCommand.scala:57)
 at kafka.admin.TopicCommand.main(TopicCommand.scala)
 succeeded



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2198) kafka-topics.sh exits with 0 status on failures

2015-07-13 Thread Manikumar Reddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar Reddy updated KAFKA-2198:
---
Attachment: KAFKA-2198_2015-07-13_19:24:46.patch

 kafka-topics.sh exits with 0 status on failures
 ---

 Key: KAFKA-2198
 URL: https://issues.apache.org/jira/browse/KAFKA-2198
 Project: Kafka
  Issue Type: Bug
  Components: admin
Affects Versions: 0.8.2.1
Reporter: Bob Halley
Assignee: Manikumar Reddy
 Attachments: KAFKA-2198.patch, KAFKA-2198_2015-05-19_18:27:01.patch, 
 KAFKA-2198_2015-05-19_18:41:25.patch, KAFKA-2198_2015-07-10_22:02:02.patch, 
 KAFKA-2198_2015-07-10_23:11:23.patch, KAFKA-2198_2015-07-13_19:24:46.patch


 In the two failure cases below, kafka-topics.sh exits with status 0.  You 
 shouldn't need to parse output from the command to know if it failed or not.
 Case 1: Forgetting to add Kafka zookeeper chroot path to zookeeper spec
 $ kafka-topics.sh --alter --topic foo --config min.insync.replicas=2 
 --zookeeper 10.0.0.1 && echo succeeded
 succeeded
 Case 2: Bad config option.  (Also, do we really need the java backtrace?  
 It's a lot of noise most of the time.)
 $ kafka-topics.sh --alter --topic foo --config min.insync.replicasTYPO=2 
 --zookeeper 10.0.0.1/kafka && echo succeeded
 Error while executing topic command requirement failed: Unknown configuration 
 min.insync.replicasTYPO.
 java.lang.IllegalArgumentException: requirement failed: Unknown configuration 
 min.insync.replicasTYPO.
 at scala.Predef$.require(Predef.scala:233)
 at kafka.log.LogConfig$$anonfun$validateNames$1.apply(LogConfig.scala:183)
 at kafka.log.LogConfig$$anonfun$validateNames$1.apply(LogConfig.scala:182)
 at scala.collection.Iterator$class.foreach(Iterator.scala:727)
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
 at kafka.log.LogConfig$.validateNames(LogConfig.scala:182)
 at kafka.log.LogConfig$.validate(LogConfig.scala:190)
 at 
 kafka.admin.TopicCommand$.parseTopicConfigsToBeAdded(TopicCommand.scala:205)
 at 
 kafka.admin.TopicCommand$$anonfun$alterTopic$1.apply(TopicCommand.scala:103)
 at 
 kafka.admin.TopicCommand$$anonfun$alterTopic$1.apply(TopicCommand.scala:100)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
 at kafka.admin.TopicCommand$.alterTopic(TopicCommand.scala:100)
 at kafka.admin.TopicCommand$.main(TopicCommand.scala:57)
 at kafka.admin.TopicCommand.main(TopicCommand.scala)
 succeeded



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] JIRA issue required even for minor/hotfix pull requests?

2015-07-13 Thread Ismael Juma
On Mon, Jul 13, 2015 at 2:41 PM, Joe Stein joe.st...@stealth.ly wrote:

 If the patch lives on a pull request and is a simple hotfix a committer
 could +1 and commit it. I don't see anything in the
 https://cwiki.apache.org/confluence/display/KAFKA/Bylaws preventing this
 already now.


Good.


 I guess I am still struggling to understand what is not set up that
 you think we need to get set up, or what changes you are looking to make
 differently. What are we trying to discuss and decide on with regard to
 this?


Nothing needs to be set up. It's just a matter of agreeing on the process so
that the new (in-progress) page for contributing code changes can be
accurate (
https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes).
If you look at http://kafka.apache.org/contributing.html, it says that a
JIRA needs to be created, for example. Also, in the original pull request
thread, Guozhang said the same. If you think differently, it's even more
reason to clarify our position. :)

Best,
Ismael


Re: [Discussion] Limitations on topic names

2015-07-13 Thread Jun Rao
Magnus,

Converting dot to _ is essentially our way of escaping in the scope part of
the metric name. The issue is that your options for escaping are limited due
to the constraints in the reporters. For example, the Ganglia reporter
replaces anything other than alpha-numeric, -, _ and dot to _ in the metric
name. Not sure how well Graphite deals with \ either. For details, take a
look at the discussion in KAFKA-1902. Note that the replacement of dots
only affects the reporters. Dots are preserved in the mbean names.
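
To make the collision concrete, here is a tiny illustration (not Kafka's actual
code, just the effect of the dot-to-underscore replacement on two distinct topic
names):

{code}
public class MetricScopeCollision {
    // roughly what the reporter-facing scope does to a topic name today
    static String toMetricScope(String topic) {
        return topic.replace('.', '_');
    }

    public static void main(String[] args) {
        System.out.println(toMetricScope("page.views"));  // page_views
        System.out.println(toMetricScope("page_views"));  // page_views -- same scope, different topic
    }
}
{code}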

Thanks,

Jun

On Sun, Jul 12, 2015 at 10:58 PM, Magnus Edenhill mag...@edenhill.se
wrote:

 Hi,

 since dots seem to be a problem on the metrics side, why not let the
 metrics side handle it
 by escaping troublesome characters? E.g. foo.my\.topic.feh
 Let's not push the problem upstream.

 Replacing . with another set of allowed characters __ seems like a bad
 idea since it
 is ambiguous: __consumer_offsets == .consumer_offsets?

 I'm guessing the same problem arises if broker names are part of the
 metrics name,
 e.g., broker.192.168.0.2.rxbytes, do we want to push the exclusion of
 dots in IP addresses
 upstream as well? :)

 Magnus


 2015-07-13 2:06 GMT+02:00 Jun Rao j...@confluent.io:

  First, a couple of clarifications on this.
 
  1. Currently, we allow Kafka topic to have dots, except that we disallow
  topic names that are exactly . or .. (which can cause weird problems
  when mapping to file directories and ZK paths as Gwen pointed out).
 
  2. When creating the Coda Hale metrics, currently, we only replace dot
 with
   _ in the scope of the metric name. The actual jmx bean name still
  preserves dot. This is because the Graphite reporter uses scope when
  forming the metric names and assumes dots are component separators (see
  KAFKA-1902 for details). So, if one uses tools like jmxtrans to export
 the
  metrics from the mbeans directly, the original topic name is preserved.
  However, I am not sure how well this maps to Graphite. We thought about
  making the replacing character configurable. However, the difficulty is
  that the logic of doing the replacement is in a singleton
  class KafkaMetricsGroup and I am not sure if we can pass in an external
  config.
 
  Given the above, I'd suggest that the customer try the jmxtrans to Graphite
  path and see if that helps. I agree that it's too disruptive to restrict
  the current topic naming convention.
 
  Also, since we plan to replace Coda Hale metrics with Kafka metrics in
 the
  future, we can try to address this issue better then.
 
  Thanks,
 
  Jun
 
 
 
 
  On Sun, Jul 12, 2015 at 10:26 AM, Gwen Shapira gshap...@cloudera.com
  wrote:
 
   I like the let's warn people of conflicts when creating the topic
   suggestion. IMO, automatic topic creation as currently done is buggy
   either way (Send data and hope the topic is ready before retries run
   out, potentially failing with the super helpful NO_LEADER error), so I
   don't mind leaving it broken a bit more. I think the right behavior is
   that conflicts will cause auto creation to fail, the same way we
   currently do when the default number of replicas is higher than number
   of brokers.
  
   One thing that is left confusing is that people in the . camp need
   to know about the conversion or they will fail to find their topics in
   their monitoring tools. Not very nice to them, but I can't think of
   alternatives.
  
   I'll start with the doc patch :)
  
   On Sat, Jul 11, 2015 at 12:54 AM, Ewen Cheslack-Postava
   e...@confluent.io wrote:
On Fri, Jul 10, 2015 at 4:41 PM, Gwen Shapira gshap...@cloudera.com
 
   wrote:
   
Yeah, I have an actual customer who ran into this. Unfortunately,
inconsistencies in the way things are named are pretty common - just
look at Kafka's many CLI options.
   
I don't think that supporting both and pointing at the docs with I
told you so when our metrics break is a good solution.
   
   
I agree, especially since we don't *already* have something in the
 docs
indicating this will be an issue. I was flippant about the situation
because I *wish* there was more careful consideration + naming policy
  in
place, but I realize that doesn't always happen in practice. I guess
 I
   need
to take Compatibility Czar more seriously :)
   
I think the obvious practical options are as follows:
   
1. Kill support for _. Piss off the entire set of people who
  currently
use _ anywhere in topic names.
2. Kill support for .. Piss off the entire set of people who
  currently
use . anywhere in topic names.
3. Tell people they need to be careful about this issue. Piss off the
  set
of people who use both _ and . *and* happen to have conflicting
  topic
names. They will have some pain when they discover the issue and have
  to
figure out how to move one of those topics over to a non-conflicting
   name.
I'm going to claim that this group must be an *extremely* small
  

[jira] [Commented] (KAFKA-1835) Kafka new producer needs options to make blocking behavior explicit

2015-07-13 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625837#comment-14625837
 ] 

Ewen Cheslack-Postava commented on KAFKA-1835:
--

[~becket_qin] Agreed that guaranteeing an error on first send is awkward. 
That's why I said that behavior would be perversely good behavior, only 
because it forces them to handle that type of error :) Then again, if you do 
something like start a metadata fetch upon instantiation, the time between 
instantiation and first send could be arbitrary, and often times might be 
extremely small. So even starting a fetch then may still result in the same 
error very commonly and wouldn't significantly change the behavior.

Your response to the stale metadata question is interesting because the end 
result is enqueue, but notify of error. I think that is behavior that 
[~stevenz3wu] would probably also be happy with in the case of first send -- 
enqueue the data without partitioning, but notify of the error. Not saying 
that's the *right* solution, just that it's a solution that would be symmetric 
in both cases and satisfy the non-blocking constraint.

The point about unkeyed messages is really interesting -- it's a good point 
that there's really no good reason to indefinitely delay those messages just 
because we chose their partitions arbitrarily and that partition happens to be 
offline. But I'm not sure tracking that subset of messages and separately 
re-partitioning them so they can get sent out is worth the overhead and 
complexity of tracking all that extra info. Then again, if your application is 
only sending unkeyed messages, it could be pretty beneficial to enable 
resending to other partitions (and support a random partitioner that ignores 
known-unavailable partitions). In any case, this is a giant tangent (my bad...).

Coming back to the original issue, I think with the proper explanation, the 
behavior of failing on the first send isn't that unintuitive. The short version 
is:
* KafkaProducer will only queue records when it knows the partition (and 
therefore, indirectly, the broker) the data is destined for. When it starts, 
the producer has no information about any topics and therefore cannot enqueue 
any data. Initial requests to send records will fail, but trigger requests for 
this metadata, and after it is received all subsequent send() calls will 
succeed assuming there is enough queue space.

The long version requires explaining that:
* Figuring out which partition a message should be sent to requires some 
information about the topic (such as number of partitions).
* By setting a 0 or very small max.block.ms, you have given us basically no 
time to look this information up.
* Queuing records before we know what partition they are destined for adds an 
extra layer of queuing and complexity.
* If you just created the producer, we've had little time to get the info we 
need. Therefore, to avoid an extra layer of queuing, you will see an error. If 
you are willing to accept a small *potential* delay, which might average XX ms 
for common configurations, you would not normally see this error. If you 
absolutely need to not block for XX ms, then you should handle this error.

I think that in practice, this is probably a good compromise. People who 
*really* understand what's going on can get the behavior they want, but have to 
jump through a couple of extra hoops, including setting the right configs and 
handling errors that most users would be unlikely to see. The vast majority of 
users who don't care about blocking a bit just leave the default settings and 
never notice that the producer blocks on the first send unless they have a 
really long outage where they can't fetch metadata. In other words, while the 
completely non-blocking case isn't ideal, I think since it would require a very 
specific configuration change, it won't affect most users and so the somewhat 
odd behavior is acceptable given clear documentation.
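
To spell out what that looks like for the application, here is a rough sketch 
(under the semantics discussed above, not a definitive API contract; depending on 
the version, the failure may also surface through the returned Future/Callback 
rather than as a synchronous exception):

{code}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.TimeoutException;

public class NonBlockingFirstSend {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // The config discussed above: give the producer essentially no time to block.
        props.put("max.block.ms", "0");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        try {
            producer.send(new ProducerRecord<String, String>("my-topic", "value"));
        } catch (TimeoutException e) {
            // Expected on the very first send: metadata for "my-topic" is not yet
            // available and we told the producer not to wait. Drop, buffer, or retry
            // later as appropriate for the application.
        } finally {
            producer.close();
        }
    }
}
{code}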


 Kafka new producer needs options to make blocking behavior explicit
 ---

 Key: KAFKA-1835
 URL: https://issues.apache.org/jira/browse/KAFKA-1835
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 0.8.2.0, 0.8.3, 0.9.0
Reporter: Paul Pearcy
 Fix For: 0.8.3

 Attachments: KAFKA-1835-New-producer--blocking_v0.patch, 
 KAFKA-1835.patch

   Original Estimate: 504h
  Remaining Estimate: 504h

 The new (0.8.2 standalone) producer will block the first time it attempts to 
 retrieve metadata for a topic. This is not the desired behavior in some use 
 cases where async non-blocking guarantees are required and message loss is 
 acceptable in known cases. Also, most developers will assume an API that 
 returns a future is safe to call 

Re: Review Request 35867: Patch for KAFKA-1901

2015-07-13 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35867/#review91574
---



clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
(line 207)
https://reviews.apache.org/r/35867/#comment145018

`jmxPrefix, clientId`



clients/src/main/java/org/apache/kafka/common/metrics/JmxReporter.java (line 
152)
https://reviews.apache.org/r/35867/#comment145010

`log.warn(error..., e)` - also, whitespace before ``



clients/src/main/java/org/apache/kafka/common/metrics/JmxReporter.java (line 
164)
https://reviews.apache.org/r/35867/#comment145011

same



clients/src/main/java/org/apache/kafka/common/metrics/JmxReporter.java (line 
246)
https://reviews.apache.org/r/35867/#comment145023

In 
https://issues.apache.org/jira/browse/KAFKA-1901?focusedCommentId=14294803page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14294803
 I was wondering if we could have a commit fingerprint - i.e., the long value 
of the most-significant eight bytes of the commit modulo 10k or something like 
that. This would make it convenient to register as a measurable `KafkaMetric` 
that people can then use in their deployment monitoring. i.e., instantly look 
at a graph and say whether all brokers/clients are running the same version or 
not.
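
A quick sketch of that fingerprint idea, purely illustrative: take the 
most-significant eight bytes of the hex commit id as an unsigned long, modulo 10k 
(the commit id below is made up).

{code}
public class CommitFingerprint {
    static long fingerprint(String commitId) {
        // first 16 hex chars == most-significant 8 bytes of the sha-1
        long msb = Long.parseUnsignedLong(commitId.substring(0, 16), 16);
        return Long.remainderUnsigned(msb, 10000);
    }

    public static void main(String[] args) {
        System.out.println(fingerprint("2c2a3f5e1b8d94a07c6e5d4f3b2a1908f7e6d5c4"));
    }
}
{code}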



clients/src/main/java/org/apache/kafka/common/utils/AppInfoParser.java (line 27)
https://reviews.apache.org/r/35867/#comment145019

(For consistency) should we make this 40 chars wide, as a standard commit 
id is? Or we can just go with an eight-char or 16-char wide id for both this and 
the actual commit id.



clients/src/main/java/org/apache/kafka/common/utils/AppInfoParser.java (line 33)
https://reviews.apache.org/r/35867/#comment145020

should probably reference the above constants (ln 26, 27) here instead of 
hardcoding again.



core/src/main/scala/kafka/common/AppInfo.scala (line 24)
https://reviews.apache.org/r/35867/#comment145021

Per the comment in the previous diff, I think this can go now, right? i.e., 
kafka server depends on clients so if you browse mbeans you will see two 
app-infos registered (under `kafka.server` and `kafka.common`) which is weird. 
The server will also expose app-info via the clients package since it already 
uses kafka metrics and the associated jmx reporter.


- Joel Koshy


On July 10, 2015, 11:15 a.m., Manikumar Reddy O wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/35867/
 ---
 
 (Updated July 10, 2015, 11:15 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1901
 https://issues.apache.org/jira/browse/KAFKA-1901
 
 
 Repository: kafka
 
 
 Description
 ---
 
 patch after rebase
 
 
 Diffs
 -
 
   build.gradle d86f1a8b25197d53f11e16c54a6854487e175649 
   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
 7aa076084c894bb8f47b9df2c086475b06f47060 
   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
 03b8dd23df63a8d8a117f02eabcce4a2d48c44f7 
   clients/src/main/java/org/apache/kafka/common/metrics/JmxReporter.java 
 6b9590c418aedd2727544c5dd23c017b4b72467a 
   clients/src/main/java/org/apache/kafka/common/utils/AppInfoParser.java 
 PRE-CREATION 
   clients/src/test/java/org/apache/kafka/common/metrics/JmxReporterTest.java 
 07b1b60d3a9cb1a399a2fe95b87229f64f539f3b 
   clients/src/test/java/org/apache/kafka/common/metrics/MetricsTest.java 
 544e120594de78c43581a980b1e4087b4fb98ccb 
   core/src/main/scala/kafka/common/AppInfo.scala 
 d642ca555f83c41451d4fcaa5c01a1f86eff0a1c 
   core/src/main/scala/kafka/server/KafkaServer.scala 
 18917bc4464b9403b16d85d20c3fd4c24893d1d3 
 
 Diff: https://reviews.apache.org/r/35867/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Manikumar Reddy O
 




Re: Review Request 35867: Patch for KAFKA-1901

2015-07-13 Thread Joel Koshy


 On June 25, 2015, 7:01 p.m., Joel Koshy wrote:
  build.gradle, line 386
  https://reviews.apache.org/r/35867/diff/1/?file=991942#file991942line386
 
  I was originally interested in this because it would be a quick way to 
  determine what version someone is running/testing with. However, it is 
  obviously not foolproof since you could have local changes that are 
  staged/unstaged but not committed.
  
  This is probably good enough, but can you think about whether it is 
  possible to easily add a tainted boolean field? i.e., if there are any 
  additional source files that are untracked or staged but not committed?
 
 Manikumar Reddy O wrote:
 It should be possible with some git commands. But do we really need this? 
 Most of us will be running a stable release or some trunk point.

It may be of marginal use, but the obvious advantage is if someone reports some 
Kafka issue and _happens_ to be doing some local testing on a stable release 
along with some untracked changes.


- Joel


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35867/#review89379
---


On July 10, 2015, 11:15 a.m., Manikumar Reddy O wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/35867/
 ---
 
 (Updated July 10, 2015, 11:15 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1901
 https://issues.apache.org/jira/browse/KAFKA-1901
 
 
 Repository: kafka
 
 
 Description
 ---
 
 patch after rebase
 
 
 Diffs
 -
 
   build.gradle d86f1a8b25197d53f11e16c54a6854487e175649 
   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
 7aa076084c894bb8f47b9df2c086475b06f47060 
   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
 03b8dd23df63a8d8a117f02eabcce4a2d48c44f7 
   clients/src/main/java/org/apache/kafka/common/metrics/JmxReporter.java 
 6b9590c418aedd2727544c5dd23c017b4b72467a 
   clients/src/main/java/org/apache/kafka/common/utils/AppInfoParser.java 
 PRE-CREATION 
   clients/src/test/java/org/apache/kafka/common/metrics/JmxReporterTest.java 
 07b1b60d3a9cb1a399a2fe95b87229f64f539f3b 
   clients/src/test/java/org/apache/kafka/common/metrics/MetricsTest.java 
 544e120594de78c43581a980b1e4087b4fb98ccb 
   core/src/main/scala/kafka/common/AppInfo.scala 
 d642ca555f83c41451d4fcaa5c01a1f86eff0a1c 
   core/src/main/scala/kafka/server/KafkaServer.scala 
 18917bc4464b9403b16d85d20c3fd4c24893d1d3 
 
 Diff: https://reviews.apache.org/r/35867/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Manikumar Reddy O
 




Re: Review Request 34403: Patch for KAFKA-2198

2015-07-13 Thread Gwen Shapira

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34403/#review91579
---

Ship it!


Ship It!

- Gwen Shapira


On July 13, 2015, 1:57 p.m., Manikumar Reddy O wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34403/
 ---
 
 (Updated July 13, 2015, 1:57 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-2198
 https://issues.apache.org/jira/browse/KAFKA-2198
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Addressing Gwen's comments
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/admin/TopicCommand.scala 
 a2ecb9620d647bf8f957a1f00f52896438e804a7 
 
 Diff: https://reviews.apache.org/r/34403/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Manikumar Reddy O
 




[jira] [Updated] (KAFKA-2260) Allow specifying expected offset on produce

2015-07-13 Thread Ben Kirwin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Kirwin updated KAFKA-2260:
--
Status: Patch Available  (was: Open)

Worked up a draft of this over the weekend, implementing the 
cas-on-partition-offset feature outlined in the original post. This is enough 
to support many cases of 'idempotent producer', along with a bunch of other fun 
stuff.

I'm attaching the diff here -- if folks are interested in moving this forward, 
I'll post it to reviewboard as well?

 Allow specifying expected offset on produce
 ---

 Key: KAFKA-2260
 URL: https://issues.apache.org/jira/browse/KAFKA-2260
 Project: Kafka
  Issue Type: Improvement
Reporter: Ben Kirwin
Priority: Minor

 I'd like to propose a change that adds a simple CAS-like mechanism to the 
 Kafka producer. This update has a small footprint, but enables a bunch of 
 interesting uses in stream processing or as a commit log for process state.
 h4. Proposed Change
 In short:
 - Allow the user to attach a specific offset to each message produced.
 - The server assigns offsets to messages in the usual way. However, if the 
 expected offset doesn't match the actual offset, the server should fail the 
 produce request instead of completing the write.
 This is a form of optimistic concurrency control, like the ubiquitous 
 check-and-set -- but instead of checking the current value of some state, it 
 checks the current offset of the log.
 h4. Motivation
 Much like check-and-set, this feature is only useful when there's very low 
 contention. Happily, when Kafka is used as a commit log or as a 
 stream-processing transport, it's common to have just one producer (or a 
 small number) for a given partition -- and in many of these cases, predicting 
 offsets turns out to be quite useful.
 - We get the same benefits as the 'idempotent producer' proposal: a producer 
 can retry a write indefinitely and be sure that at most one of those attempts 
 will succeed; and if two producers accidentally write to the end of the 
 partition at once, we can be certain that at least one of them will fail.
 - It's possible to 'bulk load' Kafka this way -- you can write a list of n 
 messages consecutively to a partition, even if the list is much larger than 
 the buffer size or the producer has to be restarted.
 - If a process is using Kafka as a commit log -- reading from a partition to 
 bootstrap, then writing any updates to that same partition -- it can be sure 
 that it's seen all of the messages in that partition at the moment it does 
 its first (successful) write.
 There's a bunch of other similar use-cases here, but they all have roughly 
 the same flavour.
 h4. Implementation
 The major advantage of this proposal over other suggested transaction / 
 idempotency mechanisms is its minimality: it gives the 'obvious' meaning to a 
 currently-unused field, adds no new APIs, and requires very little new code 
 or additional work from the server.
 - Produced messages already carry an offset field, which is currently ignored 
 by the server. This field could be used for the 'expected offset', with a 
 sigil value for the current behaviour. (-1 is a natural choice, since it's 
 already used to mean 'next available offset'.)
 - We'd need a new error and error code for a 'CAS failure'.
 - The server assigns offsets to produced messages in 
 {{ByteBufferMessageSet.validateMessagesAndAssignOffsets}}. After this 
 change, this method would assign offsets in the same way -- but if they 
 don't match the offset in the message, we'd return an error instead of 
 completing the write.
 - To avoid breaking existing clients, this behaviour would need to live 
 behind some config flag. (Possibly global, but probably more useful 
 per-topic?)
 I understand all this is unsolicited and possibly strange: happy to answer 
 questions, and if this seems interesting, I'd be glad to flesh this out into 
 a full KIP or patch. (And apologies if this is the wrong venue for this sort 
 of thing!)
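 To make the proposed semantics concrete, a conceptual sketch of the offset check 
 (the real change would live in the Scala 
 ByteBufferMessageSet.validateMessagesAndAssignOffsets path and use a proper error 
 code rather than an exception; the names below are made up):
 {code}
 public class ExpectedOffsetCheck {
     static final long ANY_OFFSET = -1L;  // sigil: keep today's behaviour

     /** Returns the assigned offset, or fails if the CAS-style expectation does not hold. */
     static long assign(long nextOffset, long expectedOffset) {
         if (expectedOffset != ANY_OFFSET && expectedOffset != nextOffset) {
             throw new IllegalStateException(
                 "Expected offset " + expectedOffset + " but next offset is " + nextOffset);
         }
         return nextOffset;  // offsets are assigned exactly as before
     }

     public static void main(String[] args) {
         System.out.println(assign(42L, -1L));  // 42: no expectation, current behaviour
         System.out.println(assign(42L, 42L));  // 42: expectation met
         // assign(42L, 40L) would fail, mimicking the proposed 'CAS failure' error
     }
 }
 {code}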



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2260) Allow specifying expected offset on produce

2015-07-13 Thread Ben Kirwin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Kirwin updated KAFKA-2260:
--
Attachment: expected-offsets.patch

 Allow specifying expected offset on produce
 ---

 Key: KAFKA-2260
 URL: https://issues.apache.org/jira/browse/KAFKA-2260
 Project: Kafka
  Issue Type: Improvement
Reporter: Ben Kirwin
Priority: Minor
 Attachments: expected-offsets.patch


 I'd like to propose a change that adds a simple CAS-like mechanism to the 
 Kafka producer. This update has a small footprint, but enables a bunch of 
 interesting uses in stream processing or as a commit log for process state.
 h4. Proposed Change
 In short:
 - Allow the user to attach a specific offset to each message produced.
 - The server assigns offsets to messages in the usual way. However, if the 
 expected offset doesn't match the actual offset, the server should fail the 
 produce request instead of completing the write.
 This is a form of optimistic concurrency control, like the ubiquitous 
 check-and-set -- but instead of checking the current value of some state, it 
 checks the current offset of the log.
 h4. Motivation
 Much like check-and-set, this feature is only useful when there's very low 
 contention. Happily, when Kafka is used as a commit log or as a 
 stream-processing transport, it's common to have just one producer (or a 
 small number) for a given partition -- and in many of these cases, predicting 
 offsets turns out to be quite useful.
 - We get the same benefits as the 'idempotent producer' proposal: a producer 
 can retry a write indefinitely and be sure that at most one of those attempts 
 will succeed; and if two producers accidentally write to the end of the 
 partition at once, we can be certain that at least one of them will fail.
 - It's possible to 'bulk load' Kafka this way -- you can write a list of n 
 messages consecutively to a partition, even if the list is much larger than 
 the buffer size or the producer has to be restarted.
 - If a process is using Kafka as a commit log -- reading from a partition to 
 bootstrap, then writing any updates to that same partition -- it can be sure 
 that it's seen all of the messages in that partition at the moment it does 
 its first (successful) write.
 There's a bunch of other similar use-cases here, but they all have roughly 
 the same flavour.
 h4. Implementation
 The major advantage of this proposal over other suggested transaction / 
 idempotency mechanisms is its minimality: it gives the 'obvious' meaning to a 
 currently-unused field, adds no new APIs, and requires very little new code 
 or additional work from the server.
 - Produced messages already carry an offset field, which is currently ignored 
 by the server. This field could be used for the 'expected offset', with a 
 sigil value for the current behaviour. (-1 is a natural choice, since it's 
 already used to mean 'next available offset'.)
 - We'd need a new error and error code for a 'CAS failure'.
 - The server assigns offsets to produced messages in 
 {{ByteBufferMessageSet.validateMessagesAndAssignOffsets}}. After this 
 change, this method would assign offsets in the same way -- but if they 
 don't match the offset in the message, we'd return an error instead of 
 completing the write.
 - To avoid breaking existing clients, this behaviour would need to live 
 behind some config flag. (Possibly global, but probably more useful 
 per-topic?)
 I understand all this is unsolicited and possibly strange: happy to answer 
 questions, and if this seems interesting, I'd be glad to flesh this out into 
 a full KIP or patch. (And apologies if this is the wrong venue for this sort 
 of thing!)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 36244: Patch for KAFKA-2312

2015-07-13 Thread Jason Gustafson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36244/#review91488
---

Ship it!


LGTM

- Jason Gustafson


On July 7, 2015, 5 a.m., Tim Brooks wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36244/
 ---
 
 (Updated July 7, 2015, 5 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-2312
 https://issues.apache.org/jira/browse/KAFKA-2312
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Use an atomic long for the 'light lock' as opposed to an atomic reference.
 
 
 Diffs
 -
 
   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
 1f0e51557c4569f0980b72652846b250d00e05d6 
 
 Diff: https://reviews.apache.org/r/36244/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Tim Brooks
 




[jira] [Updated] (KAFKA-2275) Add a ListTopics() API to the new consumer

2015-07-13 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-2275:
-
Description: 
With regex subscription like

{code}
consumer.subscribe(topic*)
{code}

The partition assignment is automatically done on the Kafka side, while there 
are some use cases where consumers want regex subscriptions but not Kafka-side 
partition assignment, preferring their own specific partition assignment instead. With 
ListTopics() they can periodically check for topic list changes and 
specifically subscribe to the partitions of the new topics.

For implementation, it involves sending a TopicMetadataRequest to a random 
broker and parsing the response.

  was:One use case for this API is for consumers that want specific partition 
assignment with regex subscription. For implementation, it involves sending a 
TopicMetadataRequest to a random broker and parsing the response.


 Add a ListTopics() API to the new consumer
 --

 Key: KAFKA-2275
 URL: https://issues.apache.org/jira/browse/KAFKA-2275
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Ashish K Singh
Priority: Critical
 Fix For: 0.8.3


 With regex subscription like
 {code}
 consumer.subscribe(topic*)
 {code}
 The partition assignment is automatically done on the Kafka side, while there 
 are some use cases where consumers want regex subscriptions but not 
 Kafka-side partition assignment, preferring their own specific partition 
 assignment instead. With ListTopics() they can periodically check for topic list 
 changes and specifically subscribe to the partitions of the new topics.
 For implementation, it involves sending a TopicMetadataRequest to a random 
 broker and parsing the response.
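 A hypothetical sketch of the use case (it assumes ListTopics() eventually surfaces 
 as a consumer method returning a map of topic to partition info, and that manual 
 assignment is done with assign(); both are assumptions, not the final API):
 {code}
 import java.util.ArrayList;
 import java.util.List;
 import java.util.Map;
 import java.util.regex.Pattern;

 import org.apache.kafka.clients.consumer.KafkaConsumer;
 import org.apache.kafka.common.PartitionInfo;
 import org.apache.kafka.common.TopicPartition;

 public class RegexDiscoverySketch {
     // Called periodically: discover topics matching the pattern and do our own assignment.
     static void refreshAssignment(KafkaConsumer<String, String> consumer, Pattern pattern) {
         Map<String, List<PartitionInfo>> topics = consumer.listTopics();
         List<TopicPartition> mine = new ArrayList<>();
         for (Map.Entry<String, List<PartitionInfo>> e : topics.entrySet()) {
             if (!pattern.matcher(e.getKey()).matches())
                 continue;
             for (PartitionInfo p : e.getValue()) {
                 // the application's own assignment logic goes here; this sketch takes everything
                 mine.add(new TopicPartition(p.topic(), p.partition()));
             }
         }
         consumer.assign(mine);  // client-chosen partition assignment, not Kafka-side
     }
 }
 {code}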



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [Discussion] Limitations on topic names

2015-07-13 Thread Joel Koshy
One way to get around this conflict could be to replace . with _ and _ with __

On Sat, Jul 11, 2015 at 10:33 AM, Todd Palino tpal...@gmail.com wrote:
 I tend to agree with this as a compromise at this point. The reality is that 
 this is technical debt that has built up in the project, and it does not go 
 away by documenting it, and it will only get worse.

 As pointed out, eliminating either character at this point is going to cause 
 problems for someone. And unfortunately, Guozhang, converting to __ doesn't 
 really solve the problem either because that is still a valid topic name that 
 could collide. It's less likely, but all it does is move the debt around a 
 little.

 -Todd

 On Jul 11, 2015, at 10:16 AM, Brock Noland br...@apache.org wrote:

 On Sat, Jul 11, 2015 at 12:54 AM, Ewen Cheslack-Postava
 e...@confluent.io wrote:
 On Fri, Jul 10, 2015 at 4:41 PM, Gwen Shapira gshap...@cloudera.com wrote:

 Yeah, I have an actual customer who ran into this. Unfortunately,
 inconsistencies in the way things are named are pretty common - just
 look at Kafka's many CLI options.

 I don't think that supporting both and pointing at the docs with I
 told you so when our metrics break is a good solution.

 I agree, especially since we don't *already* have something in the docs
 indicating this will be an issue. I was flippant about the situation
 because I *wish* there was more careful consideration + naming policy in
 place, but I realize that doesn't always happen in practice. I guess I need
 to take Compatibility Czar more seriously :)

 I think the obvious practical options are as follows:

 1. Kill support for _. Piss off the entire set of people who currently
 use _ anywhere in topic names.
 2. Kill support for .. Piss off the entire set of people who currently
 use . anywhere in topic names.
 3. Tell people they need to be careful about this issue. Piss off the set
 of people who use both _ and . *and* happen to have conflicting topic
 names. They will have some pain when they discover the issue and have to
 figure out how to move one of those topics over to a non-conflicting name.
 I'm going to claim that this group must be an *extremely* small fraction of
 users, which doesn't make it better to allow things to break for them, but
 at least gives us an idea of the scale of impact.

 (One other alternative suggested earlier was encoding metric names to
 account for differences; given the metric renaming mess in the last
 release, I'm extremely hesitant to suggest anything of the sort...)

 None of the options are ideal, but to me, 3 seems like the least painful.
 Both for us, and for the vast majority of users. It seems to me that the
 number of users that would complain about (1) or (2) drastically outweigh
 (3).

 At this point, I don't think it's practical to keep switching the rules
 about which characters are allowed and which aren't because the previous
 attempts haven't been successful -- it seems the rules have changed
 multiple times, whether intentionally or accidentally, such that any more
 changes will cause problems. At this point, I think we just need to accept
 being liberal in accepting the range of topic names that have been
 permitted so far and make the best of the situation, even if it means only
 being able to warn people of conflicts.

 Here's another alternative: how about being liberal with topic name
 characters, but upon topic creation we convert the name to the metric name
 and fail if there's a conflict with another topic? This is relatively
 expensive (requires getting the metric name of all other topics), but it
 avoids the bad situation we're encountering here (conflicting metrics),
 avoids getting into a persistent conflict (we kill topic creation when we
 detect the issue rather than noticing it when the metrics conflict
 happens), and keeps the vast majority of existing users happy (both _ and .
 work in topic names as long as you don't create topics with conflicting
 metric names).

 There are definitely details to be worked out (auto topic creation?), but
 it seems like a more realistic solution than to start disallowing _ or . in
 topic names.

 I was thinking the same. Allow a.b or a_b but not a.b and a_b. This
 seems like it will impact a trivial number of users and keep both the
 . and _ camps happy.


 -Ewen



 On Fri, Jul 10, 2015 at 4:33 PM, Ewen Cheslack-Postava
 e...@confluent.io wrote:
 I figure you'll probably see complaints no matter what change you make.
 Gwen, given that you raised this, another important question might be how
 many people you see using *both*. I'm guessing this question came up
 because you actually saw a conflict? But I'd imagine (or at least hope)
 that most organizations are mostly consistent about naming topics -- they
 standardize on one or the other.

 Since there's no right way to name them, I'd just leave it supporting
 both and document the potential conflict in metrics. And if people use
 both
 naming schemes, they 

[jira] [Commented] (KAFKA-2162) Kafka Auditing functionality

2015-07-13 Thread Parth Brahmbhatt (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625176#comment-14625176
 ] 

Parth Brahmbhatt commented on KAFKA-2162:
-

[~gwenshap] [~harsha_ch]
I don't completely agree with the need to audit session/connection establishment 
and termination. In a secure system with authorization, connecting/establishing 
a session with a server does not buy a client anything unless they have 
authorization on operations, so auditing those events doesn't seem useful to 
me. DDoSing based on authentication seems like a different story, and I don't think 
auditing can really help much in that situation; we should rather rely on 
quotas to prevent something like that from happening to begin with. 

Ticket renewals: given that the server is going to use keytabs (or should use 
keytabs), I think this is also not very useful, but I know very little about 
Kerberos and it never ceases to surprise me, so maybe we do need this.

If we want to audit anything more than the authorizer operations, we will have 
to provide a pluggable auditor just like the authorizer, which means another config 
and another interface.


 Kafka Auditing functionality
 

 Key: KAFKA-2162
 URL: https://issues.apache.org/jira/browse/KAFKA-2162
 Project: Kafka
  Issue Type: Bug
Reporter: Sriharsha Chintalapani
Assignee: Parth Brahmbhatt

 During the Kafka authorization discussion thread, there were concerns raised 
 about not having auditing. Auditing is important functionality but it's not 
 part of the authorizer. This jira will track adding audit functionality to Kafka.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [Discussion] Limitations on topic names

2015-07-13 Thread Joel Koshy
This did come up in the discussion in KAFKA-1902. It is somewhat
concerning that something very specific - in this case (what I think
is a limitation [1]) in certain metric reporters should drive the
decision on what constitutes a legal topic name in Kafka - especially
when all the characters in question actually seem reasonable in a
topic name.

I'm guessing this is not a popular choice simply because these metric
systems are actually popular, but my preference would be to do nothing
here and these users should just avoid such characters in topics.

[1] 
https://issues.apache.org/jira/browse/KAFKA-1902?focusedCommentId=14294733page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14294733

On Mon, Jul 13, 2015 at 07:40:17AM -0700, Jun Rao wrote:
 Magnus,
 
 Converting dot to _ is essentially our way of escaping in the scope part of
 the metric name. The issue is that your options for escaping are limited due
 to the constraints in the reporters. For example, the Ganglia reporter
 replaces anything other than alpha-numeric, -, _ and dot to _ in the metric
 name. Not sure how well Graphite deals with \ either. For details, take a
 look at the discussion in KAFKA-1902. Note that the replacement of dots
 only affects the reporters. Dots are preserved in the mbean names.
 
 Thanks,
 
 Jun
 
 On Sun, Jul 12, 2015 at 10:58 PM, Magnus Edenhill mag...@edenhill.se
 wrote:
 
  Hi,
 
  since dots seem to be a problem on the metrics side, why not let the
  metrics side handle it
  by escaping troublesome characters? E.g. foo.my\.topic.feh
  Let's not push the problem upstream.
 
  Replacing . with another set of allowed characters __ seems like a bad
  idea since it
  is ambiguous: __consumer_offsets == .consumer_offsets?
 
  I'm guessing the same problem arises if broker names are part of the
  metrics name,
  e.g., broker.192.168.0.2.rxbytes, do we want to push the exclusion of
  dots in IP addresses
  upstream as well? :)
 
  Magnus
 
 
  2015-07-13 2:06 GMT+02:00 Jun Rao j...@confluent.io:
 
   First, a couple of clarifications on this.
  
   1. Currently, we allow Kafka topic to have dots, except that we disallow
   topic names that are exactly . or .. (which can cause weird problems
   when mapping to file directories and ZK paths as Gwen pointed out).
  
   2. When creating the Coda Hale metrics, currently, we only replace dot
  with
    _ in the scope of the metric name. The actual jmx bean name still
   preserves dot. This is because the Graphite reporter uses scope when
   forming the metric names and assumes dots are component separators (see
   KAFKA-1902 for details). So, if one uses tools like jmxtrans to export
  the
   metrics from the mbeans directly, the original topic name is preserved.
   However, I am not sure how well this maps to Graphite. We thought about
   making the replacing character configurable. However, the difficulty is
   that the logic of doing the replacement is in a singleton
   class KafkaMetricsGroup and I am not sure if we can pass in an external
   config.
  
    Given the above, I'd suggest that the customer try the jmxtrans to Graphite
   path and see if that helps. I agree that it's too disruptive to restrict
   the current topic naming convention.
  
   Also, since we plan to replace Coda Hale metrics with Kafka metrics in
  the
   future, we can try to address this issue better then.
  
   Thanks,
  
   Jun
  
  
  
  
   On Sun, Jul 12, 2015 at 10:26 AM, Gwen Shapira gshap...@cloudera.com
   wrote:
  
I like the let's warn people of conflicts when creating the topic
suggestion. IMO, automatic topic creation as currently done is buggy
either way (Send data and hope the topic is ready before retries run
out, potentially failing with the super helpful NO_LEADER error), so I
don't mind leaving it broken a bit more. I think the right behavior is
that conflicts will cause auto creation to fail, the same way we
currently do when the default number of replicas is higher than number
of brokers.
   
One thing that is left confusing is that people in the . camp need
to know about the conversion or they will fail to find their topics in
their monitoring tools. Not very nice to them, but I can't think of
alternatives.
   
I'll start with the doc patch :)
   
On Sat, Jul 11, 2015 at 12:54 AM, Ewen Cheslack-Postava
e...@confluent.io wrote:
 On Fri, Jul 10, 2015 at 4:41 PM, Gwen Shapira gshap...@cloudera.com
  
wrote:

 Yeah, I have an actual customer who ran into this. Unfortunately,
 inconsistencies in the way things are named are pretty common - just
 look at Kafka's many CLI options.

 I don't think that supporting both and pointing at the docs with I
 told you so when our metrics break is a good solution.


 I agree, especially since we don't *already* have something in the
  docs
 indicating this will be an issue. I was flippant about the situation
 

[jira] [Commented] (KAFKA-2275) Add a ListTopics() API to the new consumer

2015-07-13 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625095#comment-14625095
 ] 

Joel Koshy commented on KAFKA-2275:
---

[~onurkaraman] has also been doing some thinking on this - there is _some_ 
overlap (not entirely) with the producer and admin tools, and we were considering 
refactoring some of this sort of functionality into an admin client layer. It 
may make sense for you guys to sync-up either on this jira or offline (and 
follow-up on this jira afterward).

 Add a ListTopics() API to the new consumer
 --

 Key: KAFKA-2275
 URL: https://issues.apache.org/jira/browse/KAFKA-2275
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Ashish K Singh
Priority: Critical
 Fix For: 0.8.3


 With regex subscription like
 {code}
 consumer.subscribe(topic*)
 {code}
 The partition assignment is automatically done on the Kafka side, while there 
 are some use cases where consumers want regex subscriptions but not 
 Kafka-side partition assignment, preferring their own specific partition 
 assignment instead. With ListTopics() they can periodically check for topic list 
 changes and specifically subscribe to the partitions of the new topics.
 For implementation, it involves sending a TopicMetadataRequest to a random 
 broker and parsing the response.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2275) Add a ListTopic() API to the new consumer

2015-07-13 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624955#comment-14624955
 ] 

Guozhang Wang commented on KAFKA-2275:
--

[~sslavic] Yes you are right, changed the name.

[~singhashish] Thanks, and please feel free to take it. One note though: 
[~hachikuji] is working on KAFKA-2123, which is refactoring the consumer-side 
code a lot, so your work may need to be rebased on top of that ticket later on.

 Add a ListTopic() API to the new consumer
 -

 Key: KAFKA-2275
 URL: https://issues.apache.org/jira/browse/KAFKA-2275
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Ashish K Singh
Priority: Critical
 Fix For: 0.8.3


 One use case for this API is for consumers that want specific partition 
 assignment with regex subscription. For implementation, it involves sending a 
 TopicMetadataRequest to a random broker and parsing the response.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2145) An option to add topic owners.

2015-07-13 Thread Parth Brahmbhatt (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624975#comment-14624975
 ] 

Parth Brahmbhatt commented on KAFKA-2145:
-

@singhashish Given that I am the original reporter, I have some context on this. Do 
you mind if I take this one over? I have a few more jiras assigned to me, most 
of which are blocked for one reason or another, so I have some time that I could 
allocate to this one.

 An option to add topic owners. 
 ---

 Key: KAFKA-2145
 URL: https://issues.apache.org/jira/browse/KAFKA-2145
 Project: Kafka
  Issue Type: Improvement
Reporter: Parth Brahmbhatt
Assignee: Ashish K Singh

 We need to expose a way so users can identify the users/groups that share 
 ownership of a topic. We discussed adding this as part of 
 https://issues.apache.org/jira/browse/KAFKA-2035 and agreed that it will be 
 simpler to add the owner as a log config. 
 The owner field can be used for auditing and also by the authorization layer to 
 grant access without having to explicitly configure acls. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1788) producer record can stay in RecordAccumulator forever if leader is not available

2015-07-13 Thread Parth Brahmbhatt (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624873#comment-14624873
 ] 

Parth Brahmbhatt commented on KAFKA-1788:
-

[~becket_qin] So is this jira irrelevant at this point? If yes, can I resolve 
it? If not, can you describe what needs to be done? I know you had a KIP and 
multiple discussions, but I am not sure whether you are taking care of it as part 
of KAFKA-2142. I will be happy to continue working on this jira if you can 
describe what remains to be done.

 producer record can stay in RecordAccumulator forever if leader is not 
 available
 ---

 Key: KAFKA-1788
 URL: https://issues.apache.org/jira/browse/KAFKA-1788
 Project: Kafka
  Issue Type: Bug
  Components: core, producer 
Affects Versions: 0.8.2.0
Reporter: Jun Rao
Assignee: Parth Brahmbhatt
  Labels: newbie++
 Fix For: 0.8.3

 Attachments: KAFKA-1788.patch, KAFKA-1788_2015-01-06_13:42:37.patch, 
 KAFKA-1788_2015-01-06_13:44:41.patch


 In the new producer, when a partition has no leader for a long time (e.g., 
 all replicas are down), the records for that partition will stay in the 
 RecordAccumulator until the leader is available. This may cause the 
 bufferpool to be full and the callback for the produced message to block for 
 a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1788) producer record can stay in RecordAccumulator forever if leader is not available

2015-07-13 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624933#comment-14624933
 ] 

Mayuresh Gharat commented on KAFKA-1788:


Hi [~parth.brahmbhatt], this is being handled as a part of KIP-19.
Jira: https://issues.apache.org/jira/browse/KAFKA-2120


Thanks,

Mayuresh

 producer record can stay in RecordAccumulator forever if leader is not 
 available
 ---

 Key: KAFKA-1788
 URL: https://issues.apache.org/jira/browse/KAFKA-1788
 Project: Kafka
  Issue Type: Bug
  Components: core, producer 
Affects Versions: 0.8.2.0
Reporter: Jun Rao
Assignee: Parth Brahmbhatt
  Labels: newbie++
 Fix For: 0.8.3

 Attachments: KAFKA-1788.patch, KAFKA-1788_2015-01-06_13:42:37.patch, 
 KAFKA-1788_2015-01-06_13:44:41.patch


 In the new producer, when a partition has no leader for a long time (e.g., 
 all replicas are down), the records for that partition will stay in the 
 RecordAccumulator until the leader is available. This may cause the 
 bufferpool to be full and the callback for the produced message to block for 
 a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-2145) An option to add topic owners.

2015-07-13 Thread Parth Brahmbhatt (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624975#comment-14624975
 ] 

Parth Brahmbhatt edited comment on KAFKA-2145 at 7/13/15 5:18 PM:
--

[~singhashish] Given I am the original reporter I have some context on this. Do 
you mind if I take this one over? I have a few more jiras assigned to me most 
of which are blocked for one reason or another so I have some time that I could 
allocate to this one. 


was (Author: parth.brahmbhatt):
@singhashish Given I am the original reporter I have some context on this. Do 
you mind if I take this one over? I have a few more jiras assigned to me most 
of which are blocked for one reason or another so I have some time that I could 
allocate to this one. 

 An option to add topic owners. 
 ---

 Key: KAFKA-2145
 URL: https://issues.apache.org/jira/browse/KAFKA-2145
 Project: Kafka
  Issue Type: Improvement
Reporter: Parth Brahmbhatt
Assignee: Ashish K Singh

 We need to expose a way for users to identify the users/groups that share 
 ownership of a topic. We discussed adding this as part of 
 https://issues.apache.org/jira/browse/KAFKA-2035 and agreed that it will be 
 simpler to add the owner as a log config. 
 The owner field can be used for auditing and also by the authorization layer to 
 grant access without having to explicitly configure ACLs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] JIRA issue required even for minor/hotfix pull requests?

2015-07-13 Thread Guozhang Wang
Joe,

I think the issue that Ismael wants to raise for discussion is that today
we are unofficially sticking with JIRA tickets for all of our commits
(i.e. it is not enforced in the bylaws, but we are doing it anyway). For
example, following today's RB-based review process, people are creating
JIRAs for typo fixes as well:

https://issues.apache.org/jira/browse/KAFKA-1957?jql=project%20%3D%20KAFKA%20AND%20text%20~%20%22typo%22

Now we are trying to migrate from RB to PRs. In the proposed wiki (
https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes)
it is suggested that people create their PR with [KAFKA-XXX] as the title prefix,
effectively suggesting we will enforce it, while on the same page we are
also following Spark's statement that if the change is minor you do not need to
create a JIRA. So I was discussing with Ismael that we should clear up this
confusion and clarify which approach we really want to pursue: either change
the statement in the wiki so that you can create a PR titled with [KAFKA-XXX],
[MINOR], [HOTFIX], etc., or stick with always creating a JIRA and making the PR
title accordingly.

Guozhang


On Mon, Jul 13, 2015 at 6:54 AM, Ismael Juma ism...@juma.me.uk wrote:

 On Mon, Jul 13, 2015 at 2:41 PM, Joe Stein joe.st...@stealth.ly wrote:

  If the patch lives on a pull request and is a simple hotfix a committer
  could +1 and commit it. I don't see anything in the
  https://cwiki.apache.org/confluence/display/KAFKA/Bylaws preventing this
  already now.


 Good.


  I guess I am still struggling between what is not setup that
  you think we need to get setup or changes that you are looking to make
  differently? What are we trying to discuss and decide up in regards to
  this?
 

 Nothing needs to be set up. It's just a matter of agreeing on the process so
 that the new (in-progress) page for contributing code changes can be
 accurate (
 https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes
 ).
 If you look at http://kafka.apache.org/contributing.html, it says that a
 JIRA needs to be created, for example. Also, in the original pull request
 thread, Guozhang said the same. If you think differently, it's even more
 reason to clarify our position. :)

 Best,
 Ismael




-- 
-- Guozhang


[jira] [Updated] (KAFKA-2275) Add a ListTopics() API to the new consumer

2015-07-13 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-2275:
-
Summary: Add a ListTopics() API to the new consumer  (was: Add a 
ListTopic() API to the new consumer)

 Add a ListTopics() API to the new consumer
 --

 Key: KAFKA-2275
 URL: https://issues.apache.org/jira/browse/KAFKA-2275
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Ashish K Singh
Priority: Critical
 Fix For: 0.8.3


 One use case for this API is consumers that want specific partition 
 assignment with regex subscription. For implementation, it involves sending a 
 TopicMetadataRequest to a random broker and parsing the response.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2275) Add a ListTopic() API to the new consumer

2015-07-13 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624793#comment-14624793
 ] 

Ashish K Singh commented on KAFKA-2275:
---

If no one is already working on this, I can take it. Assigning it to myself for 
now. Feel free to re-assign if someone has started working on this.

 Add a ListTopic() API to the new consumer
 -

 Key: KAFKA-2275
 URL: https://issues.apache.org/jira/browse/KAFKA-2275
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Priority: Critical
 Fix For: 0.8.3


 One use case for this API is consumers that want specific partition 
 assignment with regex subscription. For implementation, it involves sending a 
 TopicMetadataRequest to a random broker and parsing the response.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-2275) Add a ListTopic() API to the new consumer

2015-07-13 Thread Ashish K Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish K Singh reassigned KAFKA-2275:
-

Assignee: Ashish K Singh

 Add a ListTopic() API to the new consumer
 -

 Key: KAFKA-2275
 URL: https://issues.apache.org/jira/browse/KAFKA-2275
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Ashish K Singh
Priority: Critical
 Fix For: 0.8.3


 One use case for this API is consumers that want specific partition 
 assignment with regex subscription. For implementation, it involves sending a 
 TopicMetadataRequest to a random broker and parsing the response.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2214) kafka-reassign-partitions.sh --verify should return non-zero exit codes when reassignment is not completed yet

2015-07-13 Thread Manikumar Reddy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624826#comment-14624826
 ] 

Manikumar Reddy commented on KAFKA-2214:


Sample output:

{code}
 sh kafka-reassign-partitions.sh --zookeeper localhost:2181 
--reassignment-json-file /tmp/expand.json --verify && echo succeeded
Status of partition reassignment:
ERROR: Assigned replicas (1) don't match the list of replicas for reassignment 
(11) for partition [EVENT,0]
Reassignment of partition [EVENT,0] failed
Partitions reassignment failed due to : Reassignment failed for some partitions
[2015-07-13 21:14:32,226] ERROR kafka.common.AdminCommandFailedException: 
Reassignment failed for some partitions
at 
kafka.admin.ReassignPartitionsCommand$.verifyAssignment(ReassignPartitionsCommand.scala:85)
at 
kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:45)
at 
kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
 (kafka.admin.ReassignPartitionsCommand$)
{code}

 kafka-reassign-partitions.sh --verify should return non-zero exit codes when 
 reassignment is not completed yet
 --

 Key: KAFKA-2214
 URL: https://issues.apache.org/jira/browse/KAFKA-2214
 Project: Kafka
  Issue Type: Improvement
  Components: admin
Affects Versions: 0.8.1.1, 0.8.2.0
Reporter: Michael Noll
Assignee: Manikumar Reddy
Priority: Minor
 Attachments: KAFKA-2214.patch, KAFKA-2214_2015-07-10_21:56:04.patch, 
 KAFKA-2214_2015-07-13_21:10:58.patch


 h4. Background
 The admin script {{kafka-reassign-partitions.sh}} should integrate better 
 with automation tools such as Ansible, which rely on scripts adhering to Unix 
 best practices such as appropriate exit codes on success/failure.
 h4. Current behavior (incorrect)
 When reassignments are still in progress {{kafka-reassign-partitions.sh}} 
 prints {{ERROR}} messages but returns an exit code of zero, which indicates 
 success.  This behavior makes it a bit cumbersome to integrate the script 
 into automation tools such as Ansible.
 {code}
 $ kafka-reassign-partitions.sh --zookeeper zookeeper1:2181 
 --reassignment-json-file partitions-to-move.json --verify
 Status of partition reassignment:
 ERROR: Assigned replicas (316,324,311) don't match the list of replicas for 
 reassignment (316,324) for partition [mytopic,2]
 Reassignment of partition [mytopic,0] completed successfully
 Reassignment of partition [myothertopic,1] completed successfully
 Reassignment of partition [myothertopic,3] completed successfully
 ...
 $ echo $?
 0
 # But preferably the exit code in the presence of ERRORs should be, say, 1.
 {code}
 h3. How to improve
 I'd suggest that, using the above as the running example, if there are any 
 {{ERROR}} entries in the output (i.e. if there are any assignments remaining 
 that don't match the desired assignments), then the 
 {{kafka-reassign-partitions.sh}}  should return a non-zero exit code.
 h3. Notes
 In Kafka 0.8.2 the output is a bit different: The ERROR messages are now 
 phrased differently.
 Before:
 {code}
 ERROR: Assigned replicas (316,324,311) don't match the list of replicas for 
 reassignment (316,324) for partition [mytopic,2]
 {code}
 Now:
 {code}
 Reassignment of partition [mytopic,2] is still in progress
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-2145) An option to add topic owners.

2015-07-13 Thread Ashish K Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish K Singh reassigned KAFKA-2145:
-

Assignee: Ashish K Singh  (was: Neelesh Srinivas Salian)

 An option to add topic owners. 
 ---

 Key: KAFKA-2145
 URL: https://issues.apache.org/jira/browse/KAFKA-2145
 Project: Kafka
  Issue Type: Improvement
Reporter: Parth Brahmbhatt
Assignee: Ashish K Singh

 We need to expose a way for users to identify the users/groups that share 
 ownership of a topic. We discussed adding this as part of 
 https://issues.apache.org/jira/browse/KAFKA-2035 and agreed that it will be 
 simpler to add the owner as a log config. 
 The owner field can be used for auditing and also by the authorization layer to 
 grant access without having to explicitly configure ACLs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34641: Patch for KAFKA-2214

2015-07-13 Thread Manikumar Reddy O

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34641/
---

(Updated July 13, 2015, 3:43 p.m.)


Review request for kafka.


Bugs: KAFKA-2214
https://issues.apache.org/jira/browse/KAFKA-2214


Repository: kafka


Description (updated)
---

Addressing Gwen's comments


Diffs (updated)
-

  core/src/main/scala/kafka/admin/ReassignPartitionsCommand.scala 
ea345895a52977c25bff57e95e12b8662331d7fe 

Diff: https://reviews.apache.org/r/34641/diff/


Testing
---


Thanks,

Manikumar Reddy O



[jira] [Updated] (KAFKA-2214) kafka-reassign-partitions.sh --verify should return non-zero exit codes when reassignment is not completed yet

2015-07-13 Thread Manikumar Reddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar Reddy updated KAFKA-2214:
---
Attachment: KAFKA-2214_2015-07-13_21:10:58.patch

 kafka-reassign-partitions.sh --verify should return non-zero exit codes when 
 reassignment is not completed yet
 --

 Key: KAFKA-2214
 URL: https://issues.apache.org/jira/browse/KAFKA-2214
 Project: Kafka
  Issue Type: Improvement
  Components: admin
Affects Versions: 0.8.1.1, 0.8.2.0
Reporter: Michael Noll
Assignee: Manikumar Reddy
Priority: Minor
 Attachments: KAFKA-2214.patch, KAFKA-2214_2015-07-10_21:56:04.patch, 
 KAFKA-2214_2015-07-13_21:10:58.patch


 h4. Background
 The admin script {{kafka-reassign-partitions.sh}} should integrate better 
 with automation tools such as Ansible, which rely on scripts adhering to Unix 
 best practices such as appropriate exit codes on success/failure.
 h4. Current behavior (incorrect)
 When reassignments are still in progress {{kafka-reassign-partitions.sh}} 
 prints {{ERROR}} messages but returns an exit code of zero, which indicates 
 success.  This behavior makes it a bit cumbersome to integrate the script 
 into automation tools such as Ansible.
 {code}
 $ kafka-reassign-partitions.sh --zookeeper zookeeper1:2181 
 --reassignment-json-file partitions-to-move.json --verify
 Status of partition reassignment:
 ERROR: Assigned replicas (316,324,311) don't match the list of replicas for 
 reassignment (316,324) for partition [mytopic,2]
 Reassignment of partition [mytopic,0] completed successfully
 Reassignment of partition [myothertopic,1] completed successfully
 Reassignment of partition [myothertopic,3] completed successfully
 ...
 $ echo $?
 0
 # But preferably the exit code in the presence of ERRORs should be, say, 1.
 {code}
 h3. How to improve
 I'd suggest that, using the above as the running example, if there are any 
 {{ERROR}} entries in the output (i.e. if there are any assignments remaining 
 that don't match the desired assignments), then the 
 {{kafka-reassign-partitions.sh}}  should return a non-zero exit code.
 h3. Notes
 In Kafka 0.8.2 the output is a bit different: The ERROR messages are now 
 phrased differently.
 Before:
 {code}
 ERROR: Assigned replicas (316,324,311) don't match the list of replicas for 
 reassignment (316,324) for partition [mytopic,2]
 {code}
 Now:
 {code}
 Reassignment of partition [mytopic,2] is still in progress
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34403: Patch for KAFKA-2198

2015-07-13 Thread Manikumar Reddy O


 On July 10, 2015, 4:46 p.m., Gwen Shapira wrote:
  core/src/main/scala/kafka/admin/TopicCommand.scala, lines 72-73
  https://reviews.apache.org/r/34403/diff/4/?file=1008271#file1008271line72
 
  This is a bit unclean. I think it's more idiomatic when the catch block 
  includes the System.exit(1).
  
  Also, I'm afraid that printing the entire stack trace is intimidating 
  to non-developers who use the CLI. Perhaps the stack trace should go under 
  log.error(...)?
 
 Manikumar Reddy O wrote:
 Calling System.exit(1) in the catch block results in the finally block not being executed. 
 
 
 http://stackoverflow.com/questions/1410951/how-does-javas-system-exit-work-with-try-catch-finally-blocks
 
 http://stackoverflow.com/questions/15264076/regarding-excuting-finally-block-in-system-exit-case-also-by-adding-shutdownhook?lq=1
 
 log.error() is used for printing the stack trace.
 
 Gwen Shapira wrote:
 Good point. The problem is that the code here is very explicit about what 
 we do when an exception occurred, but doesn't show what we do when an 
 exception doesn't occur. Putting if ... else ... at this point duplicates 
 the try ... catch ... logic right above it.
 
 How about modifying these lines to System.exit(exitCode) in the finally 
 clause and setting the value of exitCode in the try and catch clauses? This 
 will also allow us to support multiple exit codes cleanly in the future.

Thanks for the suggestion. Uploaded a new patch with the relevant changes.
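
For reference, a minimal sketch of the structure being suggested (the helper names are hypothetical; this is not the actual TopicCommand code):

{code}
object CommandSkeleton {
  def main(args: Array[String]): Unit = {
    var exitCode = 0
    try {
      runCommand(args)                 // the normal path leaves exitCode at 0
    } catch {
      case e: Throwable =>
        println("Error while executing command: " + e.getMessage)
        exitCode = 1                   // leaves room for more specific codes later
    } finally {
      cleanup()                        // still runs, unlike an exit inside catch
      System.exit(exitCode)
    }
  }

  private def runCommand(args: Array[String]): Unit = ()
  private def cleanup(): Unit = ()
}
{code}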


- Manikumar Reddy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34403/#review91310
---


On July 13, 2015, 1:57 p.m., Manikumar Reddy O wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/34403/
 ---
 
 (Updated July 13, 2015, 1:57 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-2198
 https://issues.apache.org/jira/browse/KAFKA-2198
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Addressing Gwen's comments
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/admin/TopicCommand.scala 
 a2ecb9620d647bf8f957a1f00f52896438e804a7 
 
 Diff: https://reviews.apache.org/r/34403/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Manikumar Reddy O
 




[jira] [Commented] (KAFKA-2214) kafka-reassign-partitions.sh --verify should return non-zero exit codes when reassignment is not completed yet

2015-07-13 Thread Manikumar Reddy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624819#comment-14624819
 ] 

Manikumar Reddy commented on KAFKA-2214:


Updated reviewboard https://reviews.apache.org/r/34641/diff/
 against branch origin/trunk

 kafka-reassign-partitions.sh --verify should return non-zero exit codes when 
 reassignment is not completed yet
 --

 Key: KAFKA-2214
 URL: https://issues.apache.org/jira/browse/KAFKA-2214
 Project: Kafka
  Issue Type: Improvement
  Components: admin
Affects Versions: 0.8.1.1, 0.8.2.0
Reporter: Michael Noll
Assignee: Manikumar Reddy
Priority: Minor
 Attachments: KAFKA-2214.patch, KAFKA-2214_2015-07-10_21:56:04.patch, 
 KAFKA-2214_2015-07-13_21:10:58.patch


 h4. Background
 The admin script {{kafka-reassign-partitions.sh}} should integrate better 
 with automation tools such as Ansible, which rely on scripts adhering to Unix 
 best practices such as appropriate exit codes on success/failure.
 h4. Current behavior (incorrect)
 When reassignments are still in progress {{kafka-reassign-partitions.sh}} 
 prints {{ERROR}} messages but returns an exit code of zero, which indicates 
 success.  This behavior makes it a bit cumbersome to integrate the script 
 into automation tools such as Ansible.
 {code}
 $ kafka-reassign-partitions.sh --zookeeper zookeeper1:2181 
 --reassignment-json-file partitions-to-move.json --verify
 Status of partition reassignment:
 ERROR: Assigned replicas (316,324,311) don't match the list of replicas for 
 reassignment (316,324) for partition [mytopic,2]
 Reassignment of partition [mytopic,0] completed successfully
 Reassignment of partition [myothertopic,1] completed successfully
 Reassignment of partition [myothertopic,3] completed successfully
 ...
 $ echo $?
 0
 # But preferably the exit code in the presence of ERRORs should be, say, 1.
 {code}
 h3. How to improve
 I'd suggest that, using the above as the running example, if there are any 
 {{ERROR}} entries in the output (i.e. if there are any assignments remaining 
 that don't match the desired assignments), then the 
 {{kafka-reassign-partitions.sh}}  should return a non-zero exit code.
 h3. Notes
 In Kafka 0.8.2 the output is a bit different: The ERROR messages are now 
 phrased differently.
 Before:
 {code}
 ERROR: Assigned replicas (316,324,311) don't match the list of replicas for 
 reassignment (316,324) for partition [mytopic,2]
 {code}
 Now:
 {code}
 Reassignment of partition [mytopic,2] is still in progress
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2077) Add ability to specify a TopicPicker class for KafkaLog4jAppender

2015-07-13 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624895#comment-14624895
 ] 

Jason Gustafson commented on KAFKA-2077:


[~benoyantony], I think KafkaLog4jAppender was moved to a separate module in 
KAFKA-2132. Can you update your patch?

 Add ability to specify a TopicPicker class for KafkaLog4jAppender
 

 Key: KAFKA-2077
 URL: https://issues.apache.org/jira/browse/KAFKA-2077
 Project: Kafka
  Issue Type: Improvement
  Components: producer 
Reporter: Benoy Antony
Assignee: Jun Rao
 Attachments: KAFKA-2077.patch, kafka-2077-001.patch


 KafkaLog4jAppender is the Log4j Appender to publish messages to Kafka. 

 Currently, a topic name has to be passed as a parameter. In some use cases, 
 it may be required to use different topics for the same appender instance. 

 So it may be beneficial to enable KafkaLog4jAppender to accept a TopicPicker 
 class which will provide a topic for a given message.
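 As a sketch only (neither the TopicPicker trait nor its wiring into the appender 
 exist today; this is one possible shape for the improvement, not the attached 
 patch):
 {code}
 import org.apache.log4j.spi.LoggingEvent

 trait TopicPicker {
   // Decide which topic a given logging event should be published to.
   def topicFor(event: LoggingEvent): String
 }

 class LevelBasedTopicPicker extends TopicPicker {
   // Example policy: route ERROR events to a dedicated topic, everything else
   // to a default one. Topic names here are placeholders.
   override def topicFor(event: LoggingEvent): String =
     if (event.getLevel.toString == "ERROR") "app-logs-errors" else "app-logs"
 }
 {code}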



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2312) Use AtomicLong opposed to AtomicReference to store currentThread in consumer

2015-07-13 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625257#comment-14625257
 ] 

Guozhang Wang commented on KAFKA-2312:
--

Thanks for the patch, committed to trunk.

 Use AtomicLong opposed to AtomicReference to store currentThread in consumer
 

 Key: KAFKA-2312
 URL: https://issues.apache.org/jira/browse/KAFKA-2312
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Tim Brooks
Assignee: Tim Brooks
Priority: Minor
 Fix For: 0.8.3

 Attachments: KAFKA-2312.patch


 The thread id returned by Thread.currentThread().getId() is a 
 primitive. Storing it in an AtomicReference requires boxing and additional 
 indirection.
 An AtomicLong seems more natural for storing a long. 
 The current implementation relies on knowing that null means no owner. Since 
 thread ids are always positive (specified in the javadoc), it is possible to 
 create a constant NO_CURRENT_THREAD with the value -1, which allows the use of 
 an AtomicLong and makes the functionality explicit.
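 A minimal sketch of the idea (not the actual KafkaConsumer code):
 {code}
 import java.util.concurrent.atomic.AtomicLong

 object LightLockSketch {
   private val NO_CURRENT_THREAD = -1L
   private val currentThread = new AtomicLong(NO_CURRENT_THREAD)

   // Record the calling thread as the owner, or fail fast if another thread
   // already holds the "light lock".
   def acquire(): Unit = {
     val threadId = Thread.currentThread().getId
     if (!currentThread.compareAndSet(NO_CURRENT_THREAD, threadId))
       throw new IllegalStateException("Consumer is not safe for multi-threaded access")
   }

   def release(): Unit = currentThread.set(NO_CURRENT_THREAD)
 }
 {code}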



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2312) Use AtomicLong opposed to AtomicReference to store currentThread in consumer

2015-07-13 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-2312:
-
   Resolution: Fixed
Fix Version/s: 0.8.3
   Status: Resolved  (was: Patch Available)

 Use AtomicLong opposed to AtomicReference to store currentThread in consumer
 

 Key: KAFKA-2312
 URL: https://issues.apache.org/jira/browse/KAFKA-2312
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Tim Brooks
Assignee: Tim Brooks
Priority: Minor
 Fix For: 0.8.3

 Attachments: KAFKA-2312.patch


 The thread id returned by Thread.currentThread().getId() is a 
 primitive. Storing it in an AtomicReference requires boxing and additional 
 indirection.
 An AtomicLong seems more natural for storing a long. 
 The current implementation relies on knowing that null means no owner. Since 
 thread ids are always positive (specified in the javadoc), it is possible to 
 create a constant NO_CURRENT_THREAD with the value -1, which allows the use of 
 an AtomicLong and makes the functionality explicit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33378: Patch for KAFKA-2136

2015-07-13 Thread Aditya Auradkar


 On July 10, 2015, 5:49 p.m., Joel Koshy wrote:
  core/src/main/scala/kafka/server/DelayedFetch.scala, line 135
  https://reviews.apache.org/r/33378/diff/9/?file=996359#file996359line135
 
  For these, I'm wondering if we should put in the actual delay and in 
  KAFKA-2136 just add a config to enable/disable quotas altogether.

Hey Joel.. can you elaborate? The actual delay isn't being computed in this 
patch.


- Aditya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33378/#review91319
---


On July 1, 2015, 2:44 a.m., Aditya Auradkar wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33378/
 ---
 
 (Updated July 1, 2015, 2:44 a.m.)
 
 
 Review request for kafka, Joel Koshy and Jun Rao.
 
 
 Bugs: KAFKA-2136
 https://issues.apache.org/jira/browse/KAFKA-2136
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Changes are
 - protocol changes to the fetch request and response to return the 
 throttle_time_ms to clients
 - New producer/consumer metrics to expose the avg and max delay time for a 
 client
 - Test cases.
 - Addressed Joel's comments
 
 For now the patch will publish a zero delay and return a response
 
 
 Diffs
 -
 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
  56281ee15cc33dfc96ff64d5b1e596047c7132a4 
   
 clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
 07e65d4a54ba4eef5b787eba3e71cbe9f6a920bd 
   clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
 3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
   clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java 
 8686d83aa52e435c6adafbe9ff4bd1602281072a 
   clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java 
 eb8951fba48c335095cc43fc3672de1c733e07ff 
   clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java 
 fabeae3083a8ea55cdacbb9568f3847ccd85bab4 
   clients/src/main/java/org/apache/kafka/common/requests/ProduceResponse.java 
 37ec0b79beafcf5735c386b066eb319fb697eff5 
   
 clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
  419541011d652becf0cda7a5e62ce813cddb1732 
   
 clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
  8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
   
 clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java
  e3cc1967e407b64cc734548c19e30de700b64ba8 
   core/src/main/scala/kafka/api/FetchRequest.scala 
 5b38f8554898e54800abd65a7415dd0ac41fd958 
   core/src/main/scala/kafka/api/FetchResponse.scala 
 0b6b33ab6f7a732ff1322b6f48abd4c43e0d7215 
   core/src/main/scala/kafka/api/ProducerRequest.scala 
 c866180d3680da03e48d374415f10220f6ca68c4 
   core/src/main/scala/kafka/api/ProducerResponse.scala 
 5d1fac4cb8943f5bfaa487f8e9d9d2856efbd330 
   core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
 c16f7edd322709060e54c77eb505c44cbd77a4ec 
   core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
 83fc47417dd7fe4edf030217fa7fd69d99b170b0 
   core/src/main/scala/kafka/server/DelayedFetch.scala 
 de6cf5bdaa0e70394162febc63b50b55ca0a92db 
   core/src/main/scala/kafka/server/DelayedProduce.scala 
 05078b24ef28f2f4e099afa943e43f1d00359fda 
   core/src/main/scala/kafka/server/KafkaApis.scala 
 d63bc18d795a6f0e6994538ca55a9a46f7fb8ffd 
   core/src/main/scala/kafka/server/OffsetManager.scala 
 5cca85cf727975f6d3acb2223fd186753ad761dc 
   core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 
 b31b432a226ba79546dd22ef1d2acbb439c2e9a3 
   core/src/main/scala/kafka/server/ReplicaManager.scala 
 59c9bc3ac3a8afc07a6f8c88c5871304db588d17 
   core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
 5717165f2344823fabe8f7cfafae4bb8af2d949a 
   core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala 
 00d59337a99ac135e8689bd1ecd928f7b1423d79 
 
 Diff: https://reviews.apache.org/r/33378/diff/
 
 
 Testing
 ---
 
 New tests added
 
 
 Thanks,
 
 Aditya Auradkar
 




Re: Review Request 34965: Patch for KAFKA-2241

2015-07-13 Thread Dong Lin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34965/
---

(Updated July 13, 2015, 8:30 p.m.)


Review request for kafka.


Bugs: KAFKA-2241
https://issues.apache.org/jira/browse/KAFKA-2241


Repository: kafka


Description (updated)
---

KAFKA-2241; AbstractFetcherThread.shutdown() should not block


Diffs (updated)
-

  core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
c16f7edd322709060e54c77eb505c44cbd77a4ec 
  core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
83fc47417dd7fe4edf030217fa7fd69d99b170b0 

Diff: https://reviews.apache.org/r/34965/diff/


Testing
---


Thanks,

Dong Lin



[jira] [Commented] (KAFKA-2241) AbstractFetcherThread.shutdown() should not block on ReadableByteChannel.read(buffer)

2015-07-13 Thread Dong Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625284#comment-14625284
 ] 

Dong Lin commented on KAFKA-2241:
-

Updated reviewboard https://reviews.apache.org/r/34965/diff/
 against branch origin/trunk

 AbstractFetcherThread.shutdown() should not block on 
 ReadableByteChannel.read(buffer)
 -

 Key: KAFKA-2241
 URL: https://issues.apache.org/jira/browse/KAFKA-2241
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin
  Labels: quotas
 Attachments: KAFKA-2241.patch, KAFKA-2241_2015-06-03_15:30:35.patch, 
 KAFKA-2241_2015-07-09_15:35:49.patch, KAFKA-2241_2015-07-13_13:30:07.patch, 
 client.java, server.java


 This is likely a bug in Java. It affects Kafka, and here is the patch to 
 fix it.
 Here is a description of the bug. According to the description of SocketChannel 
 in the Java 7 documentation, if another thread interrupts the current thread 
 while the read operation is in progress, it should close the channel and throw a 
 ClosedByInterruptException. However, we find that interrupting the thread 
 will not unblock the channel immediately. Instead, it waits for a response or 
 socket timeout before throwing an exception.
 This will cause a problem in the following scenario. Suppose 
 console_consumer_1 is reading from a topic and, due to quota delay or 
 whatever reason, it blocks on channel.read(buffer). At this moment, 
 console_consumer_2 joins and triggers a rebalance at console_consumer_1. But 
 consumer_1 will block waiting on the channel.read before it can release 
 partition ownership, causing consumer_2 to fail after a number of failed 
 attempts to obtain partition ownership.
 In other words, AbstractFetcherThread.shutdown() is not guaranteed to 
 shut down due to this bug.
 The problem is confirmed with Java 1.7 and Java 1.6. To check it yourself, 
 you can use the attached server.java and client.java -- start the server 
 before the client and see whether the client unblocks after the interruption.
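 For readers without the attachments, a standalone sketch of the same experiment 
 (this is not the attached server.java/client.java): a reader thread blocks on 
 SocketChannel.read() against a server that never replies, the main thread 
 interrupts it, and we report how long the read takes to unblock, if it does.
 {code}
 import java.net.{InetSocketAddress, ServerSocket}
 import java.nio.ByteBuffer
 import java.nio.channels.SocketChannel

 object InterruptedReadDemo {
   def main(args: Array[String]): Unit = {
     val server = new ServerSocket(0)
     val silentServer = new Thread(new Runnable {
       def run(): Unit = { server.accept(); Thread.sleep(30000) } // accept, then stay silent
     })
     silentServer.setDaemon(true)
     silentServer.start()

     val reader = new Thread(new Runnable {
       def run(): Unit = {
         val channel = SocketChannel.open(new InetSocketAddress("localhost", server.getLocalPort))
         val start = System.currentTimeMillis()
         try channel.read(ByteBuffer.allocate(16)) // blocks: the server never sends anything
         catch {
           case t: Throwable =>
             println(s"read unblocked after ${System.currentTimeMillis() - start} ms with $t")
         }
       }
     })
     reader.start()
     Thread.sleep(1000)
     reader.interrupt() // per the SocketChannel contract this should close the channel promptly
     reader.join(10000)
     if (reader.isAlive) println("reader is still blocked in read() 10s after the interrupt")
   }
 }
 {code}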



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2136) Client side protocol changes to return quota delays

2015-07-13 Thread Aditya A Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625288#comment-14625288
 ] 

Aditya A Auradkar commented on KAFKA-2136:
--

Updated reviewboard https://reviews.apache.org/r/33378/diff/
 against branch trunk

 Client side protocol changes to return quota delays
 ---

 Key: KAFKA-2136
 URL: https://issues.apache.org/jira/browse/KAFKA-2136
 Project: Kafka
  Issue Type: Sub-task
Reporter: Aditya Auradkar
Assignee: Aditya Auradkar
  Labels: quotas
 Attachments: KAFKA-2136.patch, KAFKA-2136_2015-05-06_18:32:48.patch, 
 KAFKA-2136_2015-05-06_18:35:54.patch, KAFKA-2136_2015-05-11_14:50:56.patch, 
 KAFKA-2136_2015-05-12_14:40:44.patch, KAFKA-2136_2015-06-09_10:07:13.patch, 
 KAFKA-2136_2015-06-09_10:10:25.patch, KAFKA-2136_2015-06-30_19:43:55.patch, 
 KAFKA-2136_2015-07-13_13:34:03.patch


 As described in KIP-13, evolve the protocol to return a throttle_time_ms in 
 the Fetch and the ProduceResponse objects. Add client side metrics on the new 
 producer and consumer to expose the delay time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33378: Patch for KAFKA-2136

2015-07-13 Thread Aditya Auradkar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33378/
---

(Updated July 13, 2015, 8:34 p.m.)


Review request for kafka, Joel Koshy and Jun Rao.


Bugs: KAFKA-2136
https://issues.apache.org/jira/browse/KAFKA-2136


Repository: kafka


Description (updated)
---

Addressing Joel's comments


Merging


Changing variable name


Addressing Joel's comments


Addressing Joel's comments


Diffs (updated)
-

  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java 
56281ee15cc33dfc96ff64d5b1e596047c7132a4 
  clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
07e65d4a54ba4eef5b787eba3e71cbe9f6a920bd 
  clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
  clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java 
8686d83aa52e435c6adafbe9ff4bd1602281072a 
  clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java 
eb8951fba48c335095cc43fc3672de1c733e07ff 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java 
fabeae3083a8ea55cdacbb9568f3847ccd85bab4 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceResponse.java 
37ec0b79beafcf5735c386b066eb319fb697eff5 
  
clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
 419541011d652becf0cda7a5e62ce813cddb1732 
  
clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
 8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
  
clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java 
e3cc1967e407b64cc734548c19e30de700b64ba8 
  core/src/main/scala/kafka/api/FetchRequest.scala 
5b38f8554898e54800abd65a7415dd0ac41fd958 
  core/src/main/scala/kafka/api/FetchResponse.scala 
0b6b33ab6f7a732ff1322b6f48abd4c43e0d7215 
  core/src/main/scala/kafka/api/ProducerRequest.scala 
c866180d3680da03e48d374415f10220f6ca68c4 
  core/src/main/scala/kafka/api/ProducerResponse.scala 
5d1fac4cb8943f5bfaa487f8e9d9d2856efbd330 
  core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
c16f7edd322709060e54c77eb505c44cbd77a4ec 
  core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
83fc47417dd7fe4edf030217fa7fd69d99b170b0 
  core/src/main/scala/kafka/server/DelayedFetch.scala 
de6cf5bdaa0e70394162febc63b50b55ca0a92db 
  core/src/main/scala/kafka/server/DelayedProduce.scala 
05078b24ef28f2f4e099afa943e43f1d00359fda 
  core/src/main/scala/kafka/server/KafkaApis.scala 
d63bc18d795a6f0e6994538ca55a9a46f7fb8ffd 
  core/src/main/scala/kafka/server/OffsetManager.scala 
5cca85cf727975f6d3acb2223fd186753ad761dc 
  core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 
b31b432a226ba79546dd22ef1d2acbb439c2e9a3 
  core/src/main/scala/kafka/server/ReplicaManager.scala 
59c9bc3ac3a8afc07a6f8c88c5871304db588d17 
  core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
5717165f2344823fabe8f7cfafae4bb8af2d949a 
  core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala 
00d59337a99ac135e8689bd1ecd928f7b1423d79 

Diff: https://reviews.apache.org/r/33378/diff/


Testing
---

New tests added


Thanks,

Aditya Auradkar



[jira] [Updated] (KAFKA-2136) Client side protocol changes to return quota delays

2015-07-13 Thread Aditya A Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya A Auradkar updated KAFKA-2136:
-
Attachment: KAFKA-2136_2015-07-13_13:34:03.patch

 Client side protocol changes to return quota delays
 ---

 Key: KAFKA-2136
 URL: https://issues.apache.org/jira/browse/KAFKA-2136
 Project: Kafka
  Issue Type: Sub-task
Reporter: Aditya Auradkar
Assignee: Aditya Auradkar
  Labels: quotas
 Attachments: KAFKA-2136.patch, KAFKA-2136_2015-05-06_18:32:48.patch, 
 KAFKA-2136_2015-05-06_18:35:54.patch, KAFKA-2136_2015-05-11_14:50:56.patch, 
 KAFKA-2136_2015-05-12_14:40:44.patch, KAFKA-2136_2015-06-09_10:07:13.patch, 
 KAFKA-2136_2015-06-09_10:10:25.patch, KAFKA-2136_2015-06-30_19:43:55.patch, 
 KAFKA-2136_2015-07-13_13:34:03.patch


 As described in KIP-13, evolve the protocol to return a throttle_time_ms in 
 the Fetch and the ProduceResponse objects. Add client side metrics on the new 
 producer and consumer to expose the delay time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1782) Junit3 Misusage

2015-07-13 Thread Alexander Pakulov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625341#comment-14625341
 ] 

Alexander Pakulov commented on KAFKA-1782:
--

[~guozhang] [~junrao] is this ticket still relevant?

 Junit3 Misusage
 ---

 Key: KAFKA-1782
 URL: https://issues.apache.org/jira/browse/KAFKA-1782
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Alexander Pakulov
  Labels: newbie
 Fix For: 0.8.3

 Attachments: KAFKA-1782.patch, KAFKA-1782_2015-06-18_11:52:49.patch


 This was found while I was working on KAFKA-1580: in many of the cases where 
 we explicitly extend JUnit3Suite (e.g. ProducerFailureHandlingTest), we 
 are actually misusing a bunch of features that only exist in JUnit 4, such as 
 the (expected = classOf[...]) annotation parameter. For example, the following code
 {code}
 import org.scalatest.junit.JUnit3Suite
 import org.junit.Test
 import java.io.IOException
 class MiscTest extends JUnit3Suite {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 will actually pass even though IOException was not thrown since this 
 annotation is not supported in Junit3. Whereas
 {code}
 import org.junit._
 import java.io.IOException
 class MiscTest extends JUnit3Suite {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 or
 {code}
 import org.scalatest.junit.JUnitSuite
 import org.junit._
 import java.io.IOException
 class MiscTest extends JUnitSuite {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 or
 {code}
 import org.junit._
 import java.io.IOException
 class MiscTest {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 will fail.
 I would propose not relying on JUnit annotations other than @Test itself, and 
 instead using ScalaTest assertions such as intercept, for example:
 {code}
 import org.junit._
 import org.scalatest.Assertions._  // provides intercept
 import java.io.IOException
 class MiscTest {
   @Test
   def testSendOffset() {
     intercept[IOException] {
       //nothing
     }
   }
 }
 {code}
 will fail with a clearer stacktrace.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 36244: Patch for KAFKA-2312

2015-07-13 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36244/#review91519
---

Ship it!


Ship It!

- Guozhang Wang


On July 7, 2015, 5 a.m., Tim Brooks wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36244/
 ---
 
 (Updated July 7, 2015, 5 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-2312
 https://issues.apache.org/jira/browse/KAFKA-2312
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Use an atomic long for the 'light lock' opposed to an atomic reference.
 
 
 Diffs
 -
 
   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
 1f0e51557c4569f0980b72652846b250d00e05d6 
 
 Diff: https://reviews.apache.org/r/36244/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Tim Brooks
 




[jira] [Commented] (KAFKA-1835) Kafka new producer needs options to make blocking behavior explicit

2015-07-13 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625234#comment-14625234
 ] 

Jiangjie Qin commented on KAFKA-1835:
-

[~guozhang][~jkreps] What do you think about this? The scenario we want to solve 
is that the user does not want send() to be blocked in any way. Assuming KIP-19 is 
done, the user will set max.block.ms to 0. In that case, the problem becomes how 
the first send() can get through. 
I am thinking maybe we can do a metadata refresh when the client gets 
instantiated. There might be some overhead because this will fetch the metadata 
for all topics, but I don't think this will be a big issue. Any thoughts?
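
One way to approximate the pre-initialization idea with the current client API is to warm up metadata at start-up for the topics the application already knows about, so that a later send() configured not to block (per the proposed KIP-19 max.block.ms) finds the metadata already cached. A rough sketch; the topic names and configs below are placeholders:

{code}
import java.util.Properties
import org.apache.kafka.clients.producer.KafkaProducer

object ProducerWarmUp {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092")
    props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer")

    val producer = new KafkaProducer[Array[Byte], Array[Byte]](props)
    // partitionsFor() fetches and caches metadata for a topic; doing it here moves
    // the potential blocking to start-up instead of the first send().
    Seq("orders", "payments").foreach(topic => producer.partitionsFor(topic))
    producer.close()
  }
}
{code}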

 Kafka new producer needs options to make blocking behavior explicit
 ---

 Key: KAFKA-1835
 URL: https://issues.apache.org/jira/browse/KAFKA-1835
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 0.8.2.0, 0.8.3, 0.9.0
Reporter: Paul Pearcy
 Fix For: 0.8.3

 Attachments: KAFKA-1835-New-producer--blocking_v0.patch, 
 KAFKA-1835.patch

   Original Estimate: 504h
  Remaining Estimate: 504h

 The new (0.8.2 standalone) producer will block the first time it attempts to 
 retrieve metadata for a topic. This is not the desired behavior in some use 
 cases where async non-blocking guarantees are required and message loss is 
 acceptable in known cases. Also, most developers will assume an API that 
 returns a future is safe to call in a critical request path. 
 Discussing on the mailing list, the most viable option is to have the 
 following settings:
  pre.initialize.topics=x,y,z
  pre.initialize.timeout=x
  
 This moves potential blocking to the init of the producer and outside of some 
 random request. The potential will still exist for blocking in a corner case 
 where connectivity with Kafka is lost and a topic not included in pre-init 
 has a message sent for the first time. 
 There is the question of what to do when initialization fails. There are a 
 couple of options that I'd like available:
 - Fail creation of the client 
 - Fail all sends until the meta is available 
 Open to input on how the above option should be expressed. 
 It is also worth noting more nuanced solutions exist that could work without 
 the extra settings, they just end up having extra complications and at the 
 end of the day not adding much value. For instance, the producer could accept 
 and queue messages(note: more complicated than I am making it sound due to 
 storing all accepted messages in pre-partitioned compact binary form), but 
 you're still going to be forced to choose to either start blocking or 
 dropping messages at some point. 
 I have some test cases I am going to port over to the Kafka producer 
 integration ones and start from there. My current impl is in scala, but 
 porting to Java shouldn't be a big deal (was using a promise to track init 
 status, but will likely need to make that an atomic bool). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 36244: Patch for KAFKA-2312

2015-07-13 Thread Ismael Juma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36244/#review91514
---

Ship it!


Ship It!

- Ismael Juma


On July 7, 2015, 5 a.m., Tim Brooks wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36244/
 ---
 
 (Updated July 7, 2015, 5 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-2312
 https://issues.apache.org/jira/browse/KAFKA-2312
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Use an atomic long for the 'light lock' opposed to an atomic reference.
 
 
 Diffs
 -
 
   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
 1f0e51557c4569f0980b72652846b250d00e05d6 
 
 Diff: https://reviews.apache.org/r/36244/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Tim Brooks
 




[jira] [Updated] (KAFKA-2241) AbstractFetcherThread.shutdown() should not block on ReadableByteChannel.read(buffer)

2015-07-13 Thread Dong Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-2241:

Attachment: KAFKA-2241_2015-07-13_13:30:07.patch

 AbstractFetcherThread.shutdown() should not block on 
 ReadableByteChannel.read(buffer)
 -

 Key: KAFKA-2241
 URL: https://issues.apache.org/jira/browse/KAFKA-2241
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin
  Labels: quotas
 Attachments: KAFKA-2241.patch, KAFKA-2241_2015-06-03_15:30:35.patch, 
 KAFKA-2241_2015-07-09_15:35:49.patch, KAFKA-2241_2015-07-13_13:30:07.patch, 
 client.java, server.java


 This is likely a bug in Java. It affects Kafka, and here is the patch to 
 fix it.
 Here is a description of the bug. According to the description of SocketChannel 
 in the Java 7 documentation, if another thread interrupts the current thread 
 while the read operation is in progress, it should close the channel and throw a 
 ClosedByInterruptException. However, we find that interrupting the thread 
 will not unblock the channel immediately. Instead, it waits for a response or 
 socket timeout before throwing an exception.
 This will cause a problem in the following scenario. Suppose 
 console_consumer_1 is reading from a topic and, due to quota delay or 
 whatever reason, it blocks on channel.read(buffer). At this moment, 
 console_consumer_2 joins and triggers a rebalance at console_consumer_1. But 
 consumer_1 will block waiting on the channel.read before it can release 
 partition ownership, causing consumer_2 to fail after a number of failed 
 attempts to obtain partition ownership.
 In other words, AbstractFetcherThread.shutdown() is not guaranteed to 
 shut down due to this bug.
 The problem is confirmed with Java 1.7 and Java 1.6. To check it yourself, 
 you can use the attached server.java and client.java -- start the server 
 before the client and see whether the client unblocks after the interruption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33378: Patch for KAFKA-2136

2015-07-13 Thread Aditya Auradkar


 On June 25, 2015, 10:55 p.m., Joel Koshy wrote:
  core/src/main/scala/kafka/server/AbstractFetcherThread.scala, line 40
  https://reviews.apache.org/r/33378/diff/8/?file=981582#file981582line40
 
  I think we should add throttle time metrics to the old producer and 
  consumer as well. What do you think?
 
 Aditya Auradkar wrote:
 I think that sounds reasonable.. I initially decided against it in my 
 patch because I thought of this as an incentive to upgrade. Any concerns if I 
 submit a subsequent RB for this immediately after this is committed?
 
 Joel Koshy wrote:
 I think it is definitely something that we will need (for users that are 
 still on old clients). So can you either create a separate jira labeled as 
 quotas or do that as part of this patch?

Here you go. I'll work on it ASAP
https://issues.apache.org/jira/browse/KAFKA-2332


- Aditya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33378/#review89429
---


On July 1, 2015, 2:44 a.m., Aditya Auradkar wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33378/
 ---
 
 (Updated July 1, 2015, 2:44 a.m.)
 
 
 Review request for kafka, Joel Koshy and Jun Rao.
 
 
 Bugs: KAFKA-2136
 https://issues.apache.org/jira/browse/KAFKA-2136
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Changes are
 - protocol changes to the fetch request and response to return the 
 throttle_time_ms to clients
 - New producer/consumer metrics to expose the avg and max delay time for a 
 client
 - Test cases.
 - Addressed Joel's comments
 
 For now the patch will publish a zero delay and return a response
 
 
 Diffs
 -
 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
  56281ee15cc33dfc96ff64d5b1e596047c7132a4 
   
 clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
 07e65d4a54ba4eef5b787eba3e71cbe9f6a920bd 
   clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
 3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
   clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java 
 8686d83aa52e435c6adafbe9ff4bd1602281072a 
   clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java 
 eb8951fba48c335095cc43fc3672de1c733e07ff 
   clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java 
 fabeae3083a8ea55cdacbb9568f3847ccd85bab4 
   clients/src/main/java/org/apache/kafka/common/requests/ProduceResponse.java 
 37ec0b79beafcf5735c386b066eb319fb697eff5 
   
 clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
  419541011d652becf0cda7a5e62ce813cddb1732 
   
 clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
  8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
   
 clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java
  e3cc1967e407b64cc734548c19e30de700b64ba8 
   core/src/main/scala/kafka/api/FetchRequest.scala 
 5b38f8554898e54800abd65a7415dd0ac41fd958 
   core/src/main/scala/kafka/api/FetchResponse.scala 
 0b6b33ab6f7a732ff1322b6f48abd4c43e0d7215 
   core/src/main/scala/kafka/api/ProducerRequest.scala 
 c866180d3680da03e48d374415f10220f6ca68c4 
   core/src/main/scala/kafka/api/ProducerResponse.scala 
 5d1fac4cb8943f5bfaa487f8e9d9d2856efbd330 
   core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
 c16f7edd322709060e54c77eb505c44cbd77a4ec 
   core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
 83fc47417dd7fe4edf030217fa7fd69d99b170b0 
   core/src/main/scala/kafka/server/DelayedFetch.scala 
 de6cf5bdaa0e70394162febc63b50b55ca0a92db 
   core/src/main/scala/kafka/server/DelayedProduce.scala 
 05078b24ef28f2f4e099afa943e43f1d00359fda 
   core/src/main/scala/kafka/server/KafkaApis.scala 
 d63bc18d795a6f0e6994538ca55a9a46f7fb8ffd 
   core/src/main/scala/kafka/server/OffsetManager.scala 
 5cca85cf727975f6d3acb2223fd186753ad761dc 
   core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 
 b31b432a226ba79546dd22ef1d2acbb439c2e9a3 
   core/src/main/scala/kafka/server/ReplicaManager.scala 
 59c9bc3ac3a8afc07a6f8c88c5871304db588d17 
   core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
 5717165f2344823fabe8f7cfafae4bb8af2d949a 
   core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala 
 00d59337a99ac135e8689bd1ecd928f7b1423d79 
 
 Diff: https://reviews.apache.org/r/33378/diff/
 
 
 Testing
 ---
 
 New tests added
 
 
 Thanks,
 
 Aditya Auradkar
 




[jira] [Created] (KAFKA-2332) Add quota metrics to old producer and consumer

2015-07-13 Thread Aditya Auradkar (JIRA)
Aditya Auradkar created KAFKA-2332:
--

 Summary: Add quota metrics to old producer and consumer
 Key: KAFKA-2332
 URL: https://issues.apache.org/jira/browse/KAFKA-2332
 Project: Kafka
  Issue Type: Improvement
Reporter: Aditya Auradkar
Assignee: Dong Lin


Quota metrics have only been added to the new producer and consumer. It may be 
beneficial to add these to the existing consumer and old producer also for 
clients using the older versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33378: Patch for KAFKA-2136

2015-07-13 Thread Aditya Auradkar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33378/
---

(Updated July 13, 2015, 8:36 p.m.)


Review request for kafka, Joel Koshy and Jun Rao.


Bugs: KAFKA-2136
https://issues.apache.org/jira/browse/KAFKA-2136


Repository: kafka


Description (updated)
---

Changes are
- Addressing Joel's comments
- protocol changes to the fetch request and response to return the 
throttle_time_ms to clients
- New producer/consumer metrics to expose the avg and max delay time for a 
client
- Test cases.
- Addressed Joel's comments

For now the patch will publish a zero delay and return a response


Diffs
-

  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java 
56281ee15cc33dfc96ff64d5b1e596047c7132a4 
  clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
07e65d4a54ba4eef5b787eba3e71cbe9f6a920bd 
  clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
  clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java 
8686d83aa52e435c6adafbe9ff4bd1602281072a 
  clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java 
eb8951fba48c335095cc43fc3672de1c733e07ff 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java 
fabeae3083a8ea55cdacbb9568f3847ccd85bab4 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceResponse.java 
37ec0b79beafcf5735c386b066eb319fb697eff5 
  
clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
 419541011d652becf0cda7a5e62ce813cddb1732 
  
clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
 8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
  
clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java 
e3cc1967e407b64cc734548c19e30de700b64ba8 
  core/src/main/scala/kafka/api/FetchRequest.scala 
5b38f8554898e54800abd65a7415dd0ac41fd958 
  core/src/main/scala/kafka/api/FetchResponse.scala 
0b6b33ab6f7a732ff1322b6f48abd4c43e0d7215 
  core/src/main/scala/kafka/api/ProducerRequest.scala 
c866180d3680da03e48d374415f10220f6ca68c4 
  core/src/main/scala/kafka/api/ProducerResponse.scala 
5d1fac4cb8943f5bfaa487f8e9d9d2856efbd330 
  core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
c16f7edd322709060e54c77eb505c44cbd77a4ec 
  core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
83fc47417dd7fe4edf030217fa7fd69d99b170b0 
  core/src/main/scala/kafka/server/DelayedFetch.scala 
de6cf5bdaa0e70394162febc63b50b55ca0a92db 
  core/src/main/scala/kafka/server/DelayedProduce.scala 
05078b24ef28f2f4e099afa943e43f1d00359fda 
  core/src/main/scala/kafka/server/KafkaApis.scala 
d63bc18d795a6f0e6994538ca55a9a46f7fb8ffd 
  core/src/main/scala/kafka/server/OffsetManager.scala 
5cca85cf727975f6d3acb2223fd186753ad761dc 
  core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 
b31b432a226ba79546dd22ef1d2acbb439c2e9a3 
  core/src/main/scala/kafka/server/ReplicaManager.scala 
59c9bc3ac3a8afc07a6f8c88c5871304db588d17 
  core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
5717165f2344823fabe8f7cfafae4bb8af2d949a 
  core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala 
00d59337a99ac135e8689bd1ecd928f7b1423d79 

Diff: https://reviews.apache.org/r/33378/diff/


Testing
---

New tests added


Thanks,

Aditya Auradkar



[jira] [Commented] (KAFKA-1595) Remove deprecated and slower scala JSON parser from kafka.consumer.TopicCount

2015-07-13 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625346#comment-14625346
 ] 

Ismael Juma commented on KAFKA-1595:


[~gwenshap], I started a thread in the mailing list as you requested. I also 
implemented the change using Jackson for comparison:

https://github.com/apache/kafka/compare/trunk...ijuma:kafka-1595-remove-deprecated-json-parser-jackson?expand=1

 Remove deprecated and slower scala JSON parser from kafka.consumer.TopicCount
 -

 Key: KAFKA-1595
 URL: https://issues.apache.org/jira/browse/KAFKA-1595
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.8.1.1
Reporter: Jagbir
Assignee: Ismael Juma
  Labels: newbie
 Fix For: 0.8.3

 Attachments: KAFKA-1595.patch


 The following issue is created as a follow up suggested by Jun Rao
 in a kafka news group message with the Subject
 Blocking Recursive parsing from 
 kafka.consumer.TopicCount$.constructTopicCount
 SUMMARY:
 An issue was detected in a typical cluster of 3 kafka instances backed
 by 3 zookeeper instances (kafka version 0.8.1.1, scala version 2.10.3,
 java version 1.7.0_65). On consumer end, when consumers get recycled,
 there is a troubling JSON parsing recursion which takes a busy lock and
 blocks consumers thread pool.
 In 0.8.1.1 scala client library ZookeeperConsumerConnector.scala:355 takes
 a global lock (0xd3a7e1d0) during the rebalance, and fires an
 expensive JSON parsing, while keeping the other consumers from shutting
 down, see, e.g,
 at 
 kafka.consumer.ZookeeperConsumerConnector.shutdown(ZookeeperConsumerConnector.scala:161)
 The deep recursive JSON parsing should be deprecated in favor
 of a better JSON parser, see, e.g,
 http://engineering.ooyala.com/blog/comparing-scala-json-libraries?
 DETAILS:
 The first dump is for a recursive blocking thread holding the lock for 
 0xd3a7e1d0
 and the subsequent dump is for a waiting thread.
 (Please grep for 0xd3a7e1d0 to see the locked object.)
 -8-
 Sa863f22b1e5hjh6788991800900b34545c_profile-a-prod1-s-140789080845312-c397945e8_watcher_executor
 prio=10 tid=0x7f24dc285800 nid=0xda9 runnable [0x7f249e40b000]
 java.lang.Thread.State: RUNNABLE
 at 
 scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.p$7(Parsers.scala:722)
 at 
 scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.continue$1(Parsers.scala:726)
 at 
 scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.apply(Parsers.scala:737)
 at 
 scala.util.parsing.combinator.Parsers$$anonfun$rep1$1.apply(Parsers.scala:721)
 at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
 at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
 at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
 at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
 at 
 scala.util.parsing.combinator.Parsers$Success.flatMapWithNext(Parsers.scala:142)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:239)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:239)
 at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:239)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$flatMap$1.apply(Parsers.scala:239)
 at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
 at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
 at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
 at 
 scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
 at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)
 at 
 

[jira] [Created] (KAFKA-2333) Add rename topic support

2015-07-13 Thread Grant Henke (JIRA)
Grant Henke created KAFKA-2333:
--

 Summary: Add rename topic support
 Key: KAFKA-2333
 URL: https://issues.apache.org/jira/browse/KAFKA-2333
 Project: Kafka
  Issue Type: New Feature
Reporter: Grant Henke


Add the ability to change the name of existing topics. 

This likely needs an associated KIP. This Jira will be updated when one is 
created.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-1893) Allow regex subscriptions in the new consumer

2015-07-13 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson reassigned KAFKA-1893:
--

Assignee: Jason Gustafson

 Allow regex subscriptions in the new consumer
 -

 Key: KAFKA-1893
 URL: https://issues.apache.org/jira/browse/KAFKA-1893
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Jay Kreps
Assignee: Jason Gustafson
Priority: Critical
 Fix For: 0.8.3


 The consumer needs to handle subscribing to regular expressions. Presumably 
 this would be done as a new api,
 {code}
   void subscribe(java.util.regex.Pattern pattern);
 {code}
 Some questions/thoughts to work out:
  - It should not be possible to mix pattern subscription with partition 
 subscription.
  - Is it allowable to mix this with normal topic subscriptions? Logically 
 this is okay but a bit complex to implement.
  - We need to ensure we regularly update the metadata and recheck our regexes 
 against the metadata to update subscriptions for new topics that are created 
 or old topics that are deleted.
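
For illustration, a minimal usage sketch of the proposed pattern-based API (the exact
signature is still open, so subscribe(Pattern) here is an assumption based on the
description above, and the configuration values are placeholders):
{code}
import java.util.Properties;
import java.util.regex.Pattern;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class RegexSubscribeSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "regex-demo");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);

        // Proposed API: subscribe to every topic whose name matches the pattern.
        // Per the notes above, the consumer would have to periodically re-check the
        // pattern against fresh metadata so newly created topics are picked up.
        consumer.subscribe(Pattern.compile("metrics\\..*"));

        while (true) {
            ConsumerRecords<byte[], byte[]> records = consumer.poll(1000);
            for (ConsumerRecord<byte[], byte[]> record : records)
                System.out.printf("%s-%d@%d%n", record.topic(), record.partition(), record.offset());
        }
    }
}
{code}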



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2332) Add quota metrics to old producer and consumer

2015-07-13 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar updated KAFKA-2332:
---
Issue Type: Sub-task  (was: Improvement)
Parent: KAFKA-2083

 Add quota metrics to old producer and consumer
 --

 Key: KAFKA-2332
 URL: https://issues.apache.org/jira/browse/KAFKA-2332
 Project: Kafka
  Issue Type: Sub-task
Reporter: Aditya Auradkar
Assignee: Dong Lin
  Labels: quotas

 Quota metrics have only been added to the new producer and consumer. It may 
 be beneficial to add these to the existing consumer and old producer also for 
 clients using the older versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[DISCUSS] Json libraries for Kafka

2015-07-13 Thread Ismael Juma
Hi all,

Kafka currently uses scala.util.parsing.json.JSON as its JSON parser, and it
has a number of issues:

* It encourages unsafe casts (returns `Option[Any]`)
* It's slow (it relies on parser combinators under the hood)
* It's not thread-safe (so external locks are needed to use it in a
concurrent environment)
* It's deprecated (it should have never been included in the standard
library in the first place)

KAFKA-1595[1] has been filed to track this issue.

I initially proposed a change using spray-json's AST with the jawn
parser[2]. Gwen expressed some reservations about the choice (a previous
discussion had concluded that Jackson should be used instead) and asked me
to raise the issue in the mailing list[3].

In order to have a fair comparison, I implemented the change using Jackson
as well[4]. I paste part of the commit message:

A thin wrapper over Jackson's Tree Model API is used as the replacement.
This wrapper
increases safety while providing a simple, but powerful API through the
usage of the
`DecodeJson` type class. Even though this has a maintenance cost, it makes
the API
much more convenient from Scala. A number of tests were added to verify the
behaviour of this wrapper. The Scala module for Jackson doesn't provide any
help for our current usage, so we don't
depend on it.
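
For readers less familiar with Jackson, here is a minimal, self-contained Java sketch
of the Tree Model style of parsing that the proposed wrapper builds on (the JSON shape
and names are made up for illustration; the Scala `DecodeJson` type-class layer
described above is not shown):
{code}
import java.io.IOException;
import java.util.Iterator;
import java.util.Map;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JacksonTreeModelSketch {
    // ObjectMapper is thread-safe once configured, so no external locking is needed.
    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static void main(String[] args) throws IOException {
        String json = "{\"version\":1,\"subscription\":{\"topic-a\":2,\"topic-b\":1}}";

        JsonNode root = MAPPER.readTree(json);
        // path() never returns null; missing fields yield a "missing node",
        // so there are no unsafe casts on the caller side.
        int version = root.path("version").asInt(-1);
        System.out.println("version = " + version);

        Iterator<Map.Entry<String, JsonNode>> fields = root.path("subscription").fields();
        while (fields.hasNext()) {
            Map.Entry<String, JsonNode> entry = fields.next();
            System.out.println(entry.getKey() + " -> " + entry.getValue().asInt());
        }
    }
}
{code}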

A comparison between the two approaches as I see it:

Similarities:

   1. The code for users of the JSON library is similar
   2. No third-party dependencies
   3. Good performance

In favour of using Jackson:

   1. Same library for client and broker
   2. Widely used

In favour of using spray-json and jawn:

   1. Simple type class based API is included and it has a number of nice
   features:
  1. Support for parsing into case classes (we don't use this yet, but
  we could use it to make the code safer and more readable in some
cases)[5].
  2. Very little reflection used (only for retrieving case classes
  field names).
  3. Write support (could replace our `Json.encode` method).
   2. Less code to maintain (ie we don't need a wrapper to make it nice to
   use from Scala)
   3. No memory overhead from wrapping the Jackson classes (probably not a
   big deal)

I am happy to go either way as both approaches have been implemented and I
am torn between the options.

What do you think?

Best,
Ismael

[1] https://issues.apache.org/jira/browse/KAFKA-1595
[2]
https://github.com/ijuma/kafka/commit/80974afefc00eb6313a7357e7942d5d86ffce84d
[3]
https://issues.apache.org/jira/browse/KAFKA-1595?focusedCommentId=14512881&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14512881
[4]
https://github.com/ijuma/kafka/commit/4ca0feb37e8be2d388b60efacc19bc6788b6
[5] The Scala module for Jackson (which is not being used in the commit
above) also supports this, but it uses a reflection-based approach instead
of type classes.


[jira] [Commented] (KAFKA-2205) Generalize TopicConfigManager to handle multiple entity configs

2015-07-13 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625550#comment-14625550
 ] 

Jun Rao commented on KAFKA-2205:


Reviewed and just have a couple of more minor comments.

 Generalize TopicConfigManager to handle multiple entity configs
 ---

 Key: KAFKA-2205
 URL: https://issues.apache.org/jira/browse/KAFKA-2205
 Project: Kafka
  Issue Type: Sub-task
Reporter: Aditya Auradkar
Assignee: Aditya Auradkar
  Labels: quotas
 Attachments: KAFKA-2205.patch, KAFKA-2205_2015-07-01_18:38:18.patch, 
 KAFKA-2205_2015-07-07_19:12:15.patch


 Acceptance Criteria:
 - TopicConfigManager should be generalized to handle Topic and Client configs 
 (and any type of config in the future). As described in KIP-21
 - Add a ConfigCommand tool to change topic and client configuration



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-2182) zkClient dies if there is any exception while reconnecting

2015-07-13 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-2182.

   Resolution: Implemented
Fix Version/s: 0.8.3

As [~parth.brahmbhatt] pointed out, this is already fixed in  KAFKA-2169. 
Resolving this jira.

 zkClient dies if there is any exception while reconnecting
 --

 Key: KAFKA-2182
 URL: https://issues.apache.org/jira/browse/KAFKA-2182
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1
Reporter: Igor Maravić
Assignee: Parth Brahmbhatt
Priority: Critical
 Fix For: 0.8.3


 We, Spotify, have just been hit by a BUG that's related to ZkClient. It made 
 a whole Kafka cluster go down.
 Long story short, we've restarted TOR switch and all of our brokers from the 
 cluster lost all the connectivity with the zookeeper cluster, which was 
 living in another rack. Because of that, all the connections to Zookeeper got 
 expired.
 Everything would have been fine if we hadn't, for some reason, lost the 
 hostname-to-IP-address mapping. Since we did lose that mapping, we got an 
 UnknownHostException when the broker tried to reconnect. This exception got 
 swallowed, and we never got reconnected to Zookeeper, which in turn made 
 our cluster of brokers useless.
 If we had a handleSessionEstablishmentError function, that exception 
 could be caught and we could just kill the KafkaServer process and start 
 it cleanly, since this kind of exception is fatal for the Kafka cluster.
 {code}
 2015-05-05T12:49:01.709+00:00 127.0.0.1 apache-kafka[main-EventThread] INFO  
 zookeeper.ZooKeeper  - Initiating client connection, 
 connectString=zookeeper1.spotify.net:2181,zookeeper2.spotify.net:2181,zookeeper3.spotify.net:2181/gabobroker-local
  sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient@7303d690
 2015-05-05T12:49:01.711+00:00 127.0.0.1 apache-kafka[main-EventThread] ERROR 
 zookeeper.ClientCnxn  - Error while calling watcher
 2015-05-05T12:49:01.711+00:00 127.0.0.1 java.lang.RuntimeException: Exception 
 while restarting zk client
 2015-05-05T12:49:01.711+00:00 127.0.0.1 at 
 org.I0Itec.zkclient.ZkClient.processStateChanged(ZkClient.java:462)
 2015-05-05T12:49:01.711+00:00 127.0.0.1 at 
 org.I0Itec.zkclient.ZkClient.process(ZkClient.java:368)
 2015-05-05T12:49:01.711+00:00 127.0.0.1 at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
 2015-05-05T12:49:01.711+00:00 127.0.0.1 at 
 org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
 2015-05-05T12:49:01.711+00:00 127.0.0.1 Caused by: 
 org.I0Itec.zkclient.exception.ZkException: Unable to connect to 
 zookeeper1.spotify.net:2181,zookeeper2.spotify.net:2181,zookeeper3.spotify.net:2181/gabobroker-local
 2015-05-05T12:49:01.711+00:00 127.0.0.1 at 
 org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:66)
 2015-05-05T12:49:01.711+00:00 127.0.0.1 at 
 org.I0Itec.zkclient.ZkClient.reconnect(ZkClient.java:939)
 2015-05-05T12:49:01.711+00:00 127.0.0.1 at 
 org.I0Itec.zkclient.ZkClient.processStateChanged(ZkClient.java:458)
 2015-05-05T12:49:01.711+00:00 127.0.0.1 ... 3 more
 2015-05-05T12:49:01.712+00:00 127.0.0.1 Caused by: 
 java.net.UnknownHostException: zookeeper1.spotify.net: Name or service not 
 known
 2015-05-05T12:49:01.712+00:00 127.0.0.1 at 
 java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
 2015-05-05T12:49:01.712+00:00 127.0.0.1 at 
 java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
 2015-05-05T12:49:01.712+00:00 127.0.0.1 at 
 java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
 2015-05-05T12:49:01.712+00:00 127.0.0.1 at 
 java.net.InetAddress.getAllByName0(InetAddress.java:1246)
 2015-05-05T12:49:01.712+00:00 127.0.0.1 at 
 java.net.InetAddress.getAllByName(InetAddress.java:1162)
 2015-05-05T12:49:01.712+00:00 127.0.0.1 at 
 java.net.InetAddress.getAllByName(InetAddress.java:1098)
 2015-05-05T12:49:01.712+00:00 127.0.0.1 at 
 org.apache.zookeeper.client.StaticHostProvider.init(StaticHostProvider.java:61)
 2015-05-05T12:49:01.712+00:00 127.0.0.1 at 
 org.apache.zookeeper.ZooKeeper.init(ZooKeeper.java:445)
 2015-05-05T12:49:01.712+00:00 127.0.0.1 at 
 org.apache.zookeeper.ZooKeeper.init(ZooKeeper.java:380)
 2015-05-05T12:49:01.713+00:00 127.0.0.1 at 
 org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:64)
 2015-05-05T12:49:01.713+00:00 127.0.0.1 ... 5 more
 2015-05-05T12:49:01.713+00:00 127.0.0.1 
 apache-kafka[ZkClient-EventThread-18-zookeeper1.spotify.net:2181,zookeeper2.spotify.net:2181,zookeeper3.spotify.net:2181/gabobroker-local]
  ERROR zkclient.ZkEventThread  - Error handling event ZkEvent[Children of 
 /config/changes changed sent to 
 kafka.server.TopicConfigManager$ConfigChangeListener$@17638f6]
 2015-05-05T12:49:01.713+00:00 
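
A minimal Java sketch of the kind of listener the report above asks for, assuming a
zkclient version that exposes handleSessionEstablishmentError (the exit-on-fatal-error
policy shown here is the behaviour proposed in the ticket, not what Kafka does today):
{code}
import org.I0Itec.zkclient.IZkStateListener;
import org.I0Itec.zkclient.ZkClient;
import org.apache.zookeeper.Watcher.Event.KeeperState;

public class FatalZkSessionListener implements IZkStateListener {

    @Override
    public void handleStateChanged(KeeperState state) {
        // State transitions (SyncConnected, Disconnected, Expired) could be logged here.
    }

    @Override
    public void handleNewSession() {
        // A new session was established; re-register ephemeral nodes, etc.
    }

    // Invoked when re-establishing the session fails (e.g. with an
    // UnknownHostException) instead of the error being swallowed inside
    // the zkclient event thread.
    @Override
    public void handleSessionEstablishmentError(Throwable error) {
        System.err.println("Fatal: could not re-establish ZooKeeper session: " + error);
        // Treat this as fatal for the broker: exit so supervision (or an operator)
        // can restart the process cleanly.
        Runtime.getRuntime().exit(1);
    }

    public static void register(ZkClient zkClient) {
        zkClient.subscribeStateChanges(new FatalZkSessionListener());
    }
}
{code}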

Re: Review Request 36341: Patch for KAFKA-2311

2015-07-13 Thread Jason Gustafson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36341/#review91556
---

Ship it!


Ship It!

- Jason Gustafson


On July 9, 2015, 1:04 a.m., Tim Brooks wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36341/
 ---
 
 (Updated July 9, 2015, 1:04 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-2311
 https://issues.apache.org/jira/browse/KAFKA-2311
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Remove unnecessary close check
 
 
 Diffs
 -
 
   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
 1f0e51557c4569f0980b72652846b250d00e05d6 
 
 Diff: https://reviews.apache.org/r/36341/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Tim Brooks
 




[jira] [Commented] (KAFKA-1835) Kafka new producer needs options to make blocking behavior explicit

2015-07-13 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625568#comment-14625568
 ] 

Guozhang Wang commented on KAFKA-1835:
--

[~smiklosovic] You will only be blocked for the full timeout period if the 
topic your producer is trying to send to is not available, or the destination 
broker cluster is not available (which is a much bigger problem, though); in 
those cases the producer client blocks on refreshing the metadata for that 
topic. If the topic already exists, the producer should not block for that 
long. What is the scenario you have encountered?

 Kafka new producer needs options to make blocking behavior explicit
 ---

 Key: KAFKA-1835
 URL: https://issues.apache.org/jira/browse/KAFKA-1835
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 0.8.2.0, 0.8.3, 0.9.0
Reporter: Paul Pearcy
 Fix For: 0.8.3

 Attachments: KAFKA-1835-New-producer--blocking_v0.patch, 
 KAFKA-1835.patch

   Original Estimate: 504h
  Remaining Estimate: 504h

 The new (0.8.2 standalone) producer will block the first time it attempts to 
 retrieve metadata for a topic. This is not the desired behavior in some use 
 cases where async non-blocking guarantees are required and message loss is 
 acceptable in known cases. Also, most developers will assume an API that 
 returns a future is safe to call in a critical request path. 
 Discussing on the mailing list, the most viable option is to have the 
 following settings:
  pre.initialize.topics=x,y,z
  pre.initialize.timeout=x
  
 This moves potential blocking to the init of the producer and outside of some 
 random request. The potential will still exist for blocking in a corner case 
 where connectivity with Kafka is lost and a topic not included in pre-init 
 has a message sent for the first time. 
 There is the question of what to do when initialization fails. There are a 
 couple of options that I'd like available:
 - Fail creation of the client 
 - Fail all sends until the meta is available 
 Open to input on how the above option should be expressed. 
 It is also worth noting more nuanced solutions exist that could work without 
 the extra settings, they just end up having extra complications and at the 
 end of the day not adding much value. For instance, the producer could accept 
 and queue messages(note: more complicated than I am making it sound due to 
 storing all accepted messages in pre-partitioned compact binary form), but 
 you're still going to be forced to choose to either start blocking or 
 dropping messages at some point. 
 I have some test cases I am going to port over to the Kafka producer 
 integration ones and start from there. My current impl is in scala, but 
 porting to Java shouldn't be a big deal (was using a promise to track init 
 status, but will likely need to make that an atomic bool). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: Kafka-trunk #539

2015-07-13 Thread Apache Jenkins Server
See https://builds.apache.org/job/Kafka-trunk/539/changes

Changes:

[wangguoz] KAFKA-2312: use atomic long for thread id reference; reviewed by 
Ewen Cheslack-Postava, Jason Gustafson, Ismael Juma and Guozhang Wang

--
[...truncated 1435 lines...]
kafka.log.BrokerCompressionTest  testBrokerSideCompression[4] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[5] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[6] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[7] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[8] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[9] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[10] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[11] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[12] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[13] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[14] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[15] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[16] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[17] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[18] PASSED

kafka.log.BrokerCompressionTest  testBrokerSideCompression[19] PASSED

kafka.log.LogSegmentTest  testTruncate PASSED

kafka.log.LogSegmentTest  testReadOnEmptySegment PASSED

kafka.log.LogSegmentTest  testReadBeforeFirstOffset PASSED

kafka.log.LogSegmentTest  testMaxOffset PASSED

kafka.log.LogSegmentTest  testReadAfterLast PASSED

kafka.log.LogSegmentTest  testReadFromGap PASSED

kafka.log.LogSegmentTest  testTruncateFull PASSED

kafka.log.LogSegmentTest  testNextOffsetCalculation PASSED

kafka.log.LogSegmentTest  testChangeFileSuffixes PASSED

kafka.log.LogSegmentTest  testRecoveryFixesCorruptIndex PASSED

kafka.log.LogSegmentTest  testRecoveryWithCorruptMessage PASSED

kafka.log.LogSegmentTest  testCreateWithInitFileSizeAppendMessage PASSED

kafka.log.LogSegmentTest  testCreateWithInitFileSizeClearShutdown PASSED

kafka.log.LogConfigTest  testFromPropsEmpty PASSED

kafka.log.LogConfigTest  testFromPropsInvalid PASSED

kafka.log.LogConfigTest  testFromPropsToProps PASSED

kafka.log.LogCleanerIntegrationTest  cleanerTest[0] PASSED

kafka.log.LogCleanerIntegrationTest  cleanerTest[1] PASSED

kafka.log.LogCleanerIntegrationTest  cleanerTest[2] PASSED

kafka.log.LogCleanerIntegrationTest  cleanerTest[3] PASSED

kafka.log.LogManagerTest  testCleanupSegmentsToMaintainSize PASSED

kafka.log.LogManagerTest  testTimeBasedFlush PASSED

kafka.log.LogManagerTest  testLeastLoadedAssignment PASSED

kafka.log.LogManagerTest  testGetNonExistentLog PASSED

kafka.log.LogManagerTest  testCleanupExpiredSegments PASSED

kafka.log.LogManagerTest  testCreateLog PASSED

kafka.log.LogManagerTest  testCheckpointRecoveryPoints PASSED

kafka.log.LogManagerTest  testRecoveryDirectoryMappingWithTrailingSlash PASSED

kafka.log.LogManagerTest  testRecoveryDirectoryMappingWithRelativeDirectory 
PASSED

kafka.log.LogManagerTest  testTwoLogManagersUsingSameDirFails PASSED

kafka.coordinator.PartitionAssignorTest  
testRangeAssignorTwoConsumersOneTopicOnePartition PASSED

kafka.coordinator.PartitionAssignorTest  
testRangeAssignorTwoConsumersOneTopicTwoPartitions PASSED

kafka.coordinator.PartitionAssignorTest  
testRangeAssignorMultipleConsumersMixedTopics PASSED

kafka.coordinator.PartitionAssignorTest  
testRangeAssignorTwoConsumersTwoTopicsSixPartitions PASSED

kafka.coordinator.PartitionAssignorTest  
testRoundRobinAssignorOneConsumerNoTopic PASSED

kafka.coordinator.PartitionAssignorTest  
testRoundRobinAssignorOneConsumerNonexistentTopic PASSED

kafka.coordinator.PartitionAssignorTest  
testRoundRobinAssignorOneConsumerOneTopic PASSED

kafka.coordinator.PartitionAssignorTest  
testRoundRobinAssignorOnlyAssignsPartitionsFromSubscribedTopics PASSED

kafka.coordinator.PartitionAssignorTest  
testRoundRobinAssignorOneConsumerMultipleTopics PASSED

kafka.coordinator.PartitionAssignorTest  
testRoundRobinAssignorTwoConsumersOneTopicOnePartition PASSED

kafka.coordinator.PartitionAssignorTest  
testRoundRobinAssignorTwoConsumersOneTopicTwoPartitions PASSED

kafka.coordinator.PartitionAssignorTest  
testRoundRobinAssignorMultipleConsumersMixedTopics PASSED

kafka.coordinator.PartitionAssignorTest  
testRoundRobinAssignorTwoConsumersTwoTopicsSixPartitions PASSED

kafka.coordinator.PartitionAssignorTest  testRangeAssignorOneConsumerNoTopic 
PASSED

kafka.coordinator.PartitionAssignorTest  
testRangeAssignorOneConsumerNonexistentTopic PASSED

kafka.coordinator.PartitionAssignorTest  testRangeAssignorOneConsumerOneTopic 
PASSED

kafka.coordinator.PartitionAssignorTest  
testRangeAssignorOnlyAssignsPartitionsFromSubscribedTopics PASSED

kafka.coordinator.PartitionAssignorTest  
testRangeAssignorOneConsumerMultipleTopics 

[jira] [Assigned] (KAFKA-2145) An option to add topic owners.

2015-07-13 Thread Parth Brahmbhatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Brahmbhatt reassigned KAFKA-2145:
---

Assignee: Parth Brahmbhatt

 An option to add topic owners. 
 ---

 Key: KAFKA-2145
 URL: https://issues.apache.org/jira/browse/KAFKA-2145
 Project: Kafka
  Issue Type: Improvement
Reporter: Parth Brahmbhatt
Assignee: Parth Brahmbhatt

 We need to expose a way for users to identify the users/groups that share 
 ownership of a topic. We discussed adding this as part of 
 https://issues.apache.org/jira/browse/KAFKA-2035 and agreed that it would be 
 simpler to add the owner as a log config. 
 The owner field can be used for auditing and also by the authorization layer to 
 grant access without having to explicitly configure ACLs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2241) AbstractFetcherThread.shutdown() should not block on ReadableByteChannel.read(buffer)

2015-07-13 Thread Dong Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-2241:

Attachment: KAFKA-2241_2015-07-13_14:51:42.patch

 AbstractFetcherThread.shutdown() should not block on 
 ReadableByteChannel.read(buffer)
 -

 Key: KAFKA-2241
 URL: https://issues.apache.org/jira/browse/KAFKA-2241
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin
  Labels: quotas
 Attachments: KAFKA-2241.patch, KAFKA-2241_2015-06-03_15:30:35.patch, 
 KAFKA-2241_2015-07-09_15:35:49.patch, KAFKA-2241_2015-07-13_13:30:07.patch, 
 KAFKA-2241_2015-07-13_14:51:42.patch, client.java, server.java


 This is likely a bug in Java. It affects Kafka and here is the patch to 
 fix it.
 Here is a description of the bug. According to the description of SocketChannel 
 in the Java 7 documentation, if another thread interrupts the current thread 
 while the read operation is in progress, it should close the channel and throw 
 ClosedByInterruptException. However, we find that interrupting the thread 
 will not unblock the channel immediately. Instead, it waits for a response or 
 a socket timeout before throwing an exception.
 This causes a problem in the following scenario. Suppose console_consumer_1 
 is reading from a topic and, due to a quota delay or whatever other reason, it 
 blocks on channel.read(buffer). At this moment, another console_consumer_2 
 joins and triggers a rebalance at console_consumer_1. But consumer_1 will 
 block waiting on the channel.read before it can release partition ownership, 
 causing consumer_2 to fail after a number of failed attempts to obtain 
 partition ownership.
 In other words, AbstractFetcherThread.shutdown() is not guaranteed to 
 shut down due to this bug.
 The problem is confirmed with Java 1.7 and Java 1.6. To check it yourself, 
 you can use the attached server.java and client.java -- start the server 
 before the client and see if the client unblocks after interruption.
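
A minimal, self-contained sketch along the lines of the attached server.java/client.java
(not the attachments themselves) that can be used to observe the behaviour described
above -- the local "server" accepts a connection but never writes, so the reader thread
blocks in read():
{code}
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class BlockingReadInterruptDemo {
    public static void main(String[] args) throws Exception {
        // Local server that accepts a connection but never sends any bytes.
        final ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("localhost", 0));
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();
        new Thread(new Runnable() {
            public void run() {
                try { server.accept(); } catch (Exception ignored) { }
            }
        }).start();

        final SocketChannel client = SocketChannel.open(new InetSocketAddress("localhost", port));
        Thread reader = new Thread(new Runnable() {
            public void run() {
                try {
                    client.read(ByteBuffer.allocate(64));   // blocks: no data will ever arrive
                } catch (Exception e) {
                    System.out.println("read unblocked with " + e.getClass().getSimpleName());
                }
            }
        });
        reader.start();

        Thread.sleep(1000);
        reader.interrupt();     // per the javadoc this should close the channel promptly
        reader.join(10000);
        System.out.println(reader.isAlive()
                ? "reader still blocked 10s after interrupt (the behaviour described above)"
                : "reader unblocked promptly (ClosedByInterruptException as documented)");
    }
}
{code}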



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2241) AbstractFetcherThread.shutdown() should not block on ReadableByteChannel.read(buffer)

2015-07-13 Thread Dong Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625428#comment-14625428
 ] 

Dong Lin commented on KAFKA-2241:
-

Updated reviewboard https://reviews.apache.org/r/34965/diff/
 against branch origin/trunk

 AbstractFetcherThread.shutdown() should not block on 
 ReadableByteChannel.read(buffer)
 -

 Key: KAFKA-2241
 URL: https://issues.apache.org/jira/browse/KAFKA-2241
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin
  Labels: quotas
 Attachments: KAFKA-2241.patch, KAFKA-2241_2015-06-03_15:30:35.patch, 
 KAFKA-2241_2015-07-09_15:35:49.patch, KAFKA-2241_2015-07-13_13:30:07.patch, 
 KAFKA-2241_2015-07-13_14:51:42.patch, client.java, server.java


 This is likely a bug in Java. It affects Kafka and here is the patch to 
 fix it.
 Here is a description of the bug. According to the description of SocketChannel 
 in the Java 7 documentation, if another thread interrupts the current thread 
 while the read operation is in progress, it should close the channel and throw 
 ClosedByInterruptException. However, we find that interrupting the thread 
 will not unblock the channel immediately. Instead, it waits for a response or 
 a socket timeout before throwing an exception.
 This causes a problem in the following scenario. Suppose console_consumer_1 
 is reading from a topic and, due to a quota delay or whatever other reason, it 
 blocks on channel.read(buffer). At this moment, another console_consumer_2 
 joins and triggers a rebalance at console_consumer_1. But consumer_1 will 
 block waiting on the channel.read before it can release partition ownership, 
 causing consumer_2 to fail after a number of failed attempts to obtain 
 partition ownership.
 In other words, AbstractFetcherThread.shutdown() is not guaranteed to 
 shut down due to this bug.
 The problem is confirmed with Java 1.7 and Java 1.6. To check it yourself, 
 you can use the attached server.java and client.java -- start the server 
 before the client and see if the client unblocks after interruption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2145) An option to add topic owners.

2015-07-13 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625380#comment-14625380
 ] 

Ashish K Singh commented on KAFKA-2145:
---

[~parth.brahmbhatt] sure, go ahead.

 An option to add topic owners. 
 ---

 Key: KAFKA-2145
 URL: https://issues.apache.org/jira/browse/KAFKA-2145
 Project: Kafka
  Issue Type: Improvement
Reporter: Parth Brahmbhatt
Assignee: Ashish K Singh

 We need to expose a way for users to identify the users/groups that share 
 ownership of a topic. We discussed adding this as part of 
 https://issues.apache.org/jira/browse/KAFKA-2035 and agreed that it would be 
 simpler to add the owner as a log config. 
 The owner field can be used for auditing and also by the authorization layer to 
 grant access without having to explicitly configure ACLs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2145) An option to add topic owners.

2015-07-13 Thread Ashish K Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish K Singh updated KAFKA-2145:
--
Assignee: (was: Ashish K Singh)

 An option to add topic owners. 
 ---

 Key: KAFKA-2145
 URL: https://issues.apache.org/jira/browse/KAFKA-2145
 Project: Kafka
  Issue Type: Improvement
Reporter: Parth Brahmbhatt

 We need to expose a way for users to identify the users/groups that share 
 ownership of a topic. We discussed adding this as part of 
 https://issues.apache.org/jira/browse/KAFKA-2035 and agreed that it would be 
 simpler to add the owner as a log config. 
 The owner field can be used for auditing and also by the authorization layer to 
 grant access without having to explicitly configure ACLs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34965: Patch for KAFKA-2241

2015-07-13 Thread Dong Lin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34965/
---

(Updated July 13, 2015, 9:52 p.m.)


Review request for kafka.


Bugs: KAFKA-2241
https://issues.apache.org/jira/browse/KAFKA-2241


Repository: kafka


Description
---

KAFKA-2241; AbstractFetcherThread.shutdown() should not block


Diffs (updated)
-

  core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
c16f7edd322709060e54c77eb505c44cbd77a4ec 
  core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
83fc47417dd7fe4edf030217fa7fd69d99b170b0 

Diff: https://reviews.apache.org/r/34965/diff/


Testing
---


Thanks,

Dong Lin



Re: Review Request 33378: Patch for KAFKA-2136

2015-07-13 Thread Aditya Auradkar


 On July 10, 2015, 5:49 p.m., Joel Koshy wrote:
  LGTM - just a few minor comments.

Also, I filed this ticket to add metrics to the old producer and consumers:
https://issues.apache.org/jira/browse/KAFKA-2332


- Aditya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33378/#review91319
---


On July 13, 2015, 8:36 p.m., Aditya Auradkar wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33378/
 ---
 
 (Updated July 13, 2015, 8:36 p.m.)
 
 
 Review request for kafka, Joel Koshy and Jun Rao.
 
 
 Bugs: KAFKA-2136
 https://issues.apache.org/jira/browse/KAFKA-2136
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Changes are
 - Addressing Joel's comments
 - protocol changes to the fetch request and response to return the 
 throttle_time_ms to clients
 - New producer/consumer metrics to expose the avg and max delay time for a 
 client
 - Test cases.
 - Addressed Joel's comments
   
 For now the patch will publish a zero delay and return a response
 
 
 Diffs
 -
 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
  56281ee15cc33dfc96ff64d5b1e596047c7132a4 
   
 clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
 07e65d4a54ba4eef5b787eba3e71cbe9f6a920bd 
   clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
 3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
   clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java 
 8686d83aa52e435c6adafbe9ff4bd1602281072a 
   clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java 
 eb8951fba48c335095cc43fc3672de1c733e07ff 
   clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java 
 fabeae3083a8ea55cdacbb9568f3847ccd85bab4 
   clients/src/main/java/org/apache/kafka/common/requests/ProduceResponse.java 
 37ec0b79beafcf5735c386b066eb319fb697eff5 
   
 clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
  419541011d652becf0cda7a5e62ce813cddb1732 
   
 clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
  8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
   
 clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java
  e3cc1967e407b64cc734548c19e30de700b64ba8 
   core/src/main/scala/kafka/api/FetchRequest.scala 
 5b38f8554898e54800abd65a7415dd0ac41fd958 
   core/src/main/scala/kafka/api/FetchResponse.scala 
 0b6b33ab6f7a732ff1322b6f48abd4c43e0d7215 
   core/src/main/scala/kafka/api/ProducerRequest.scala 
 c866180d3680da03e48d374415f10220f6ca68c4 
   core/src/main/scala/kafka/api/ProducerResponse.scala 
 5d1fac4cb8943f5bfaa487f8e9d9d2856efbd330 
   core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
 c16f7edd322709060e54c77eb505c44cbd77a4ec 
   core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
 83fc47417dd7fe4edf030217fa7fd69d99b170b0 
   core/src/main/scala/kafka/server/DelayedFetch.scala 
 de6cf5bdaa0e70394162febc63b50b55ca0a92db 
   core/src/main/scala/kafka/server/DelayedProduce.scala 
 05078b24ef28f2f4e099afa943e43f1d00359fda 
   core/src/main/scala/kafka/server/KafkaApis.scala 
 d63bc18d795a6f0e6994538ca55a9a46f7fb8ffd 
   core/src/main/scala/kafka/server/OffsetManager.scala 
 5cca85cf727975f6d3acb2223fd186753ad761dc 
   core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 
 b31b432a226ba79546dd22ef1d2acbb439c2e9a3 
   core/src/main/scala/kafka/server/ReplicaManager.scala 
 59c9bc3ac3a8afc07a6f8c88c5871304db588d17 
   core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
 5717165f2344823fabe8f7cfafae4bb8af2d949a 
   core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala 
 00d59337a99ac135e8689bd1ecd928f7b1423d79 
 
 Diff: https://reviews.apache.org/r/33378/diff/
 
 
 Testing
 ---
 
 New tests added
 
 
 Thanks,
 
 Aditya Auradkar
 




[jira] [Commented] (KAFKA-2275) Add a ListTopics() API to the new consumer

2015-07-13 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625460#comment-14625460
 ] 

Jason Gustafson commented on KAFKA-2275:


[~singhashish], [~onurkaraman], one of the nice things that KAFKA-2123 
introduces is a methodology for executing periodic tasks. It is currently used 
for heartbeats and autocommits, but I think it could be used to periodically 
send topic metadata requests to refresh regex subscriptions as well. You may 
want to have a look at the active review board to see if this ticket can be 
done in a way to keep that option available (or to see if something else is 
needed). The only trickiness I see is that NetworkClient currently hijacks all 
metadata responses.
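
To illustrate the general pattern (a toy illustration of driving periodic work from
the poll loop, not the DelayedTask/DelayedTaskQueue API in the patch itself):
{code}
import java.util.PriorityQueue;

public class ToyDelayedTaskQueue {

    interface Task {
        void run(long now);
    }

    private static class Scheduled implements Comparable<Scheduled> {
        final long at;
        final Task task;
        Scheduled(long at, Task task) { this.at = at; this.task = task; }
        public int compareTo(Scheduled other) { return Long.compare(at, other.at); }
    }

    private final PriorityQueue<Scheduled> queue = new PriorityQueue<Scheduled>();

    public void schedule(Task task, long at) {
        queue.add(new Scheduled(at, task));
    }

    // Invoked from the consumer's poll loop: run every task whose time has come.
    public void poll(long now) {
        while (!queue.isEmpty() && queue.peek().at <= now)
            queue.poll().task.run(now);
    }

    public static void main(String[] args) {
        final ToyDelayedTaskQueue tasks = new ToyDelayedTaskQueue();
        // A task that "refreshes metadata" and reschedules itself every 30 time units,
        // the same way heartbeats and autocommits are rescheduled.
        Task refreshMetadata = new Task() {
            public void run(long now) {
                System.out.println("refresh topic metadata, re-check regex subscriptions at " + now);
                tasks.schedule(this, now + 30);
            }
        };
        tasks.schedule(refreshMetadata, 0);
        for (long now = 0; now <= 90; now += 10)
            tasks.poll(now);   // in the real consumer this would happen inside poll()
    }
}
{code}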

 Add a ListTopics() API to the new consumer
 --

 Key: KAFKA-2275
 URL: https://issues.apache.org/jira/browse/KAFKA-2275
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer
Reporter: Guozhang Wang
Assignee: Ashish K Singh
Priority: Critical
 Fix For: 0.8.3


 With a regex subscription like
 {code}
 consumer.subscribe("topic*")
 {code}
 the partition assignment is automatically done on the Kafka side, while there 
 are some use cases where consumers want regex subscriptions but not 
 Kafka-side partition assignment, preferring their own specific partition 
 assignment instead. With ListTopics() they can periodically check for topic list 
 changes and specifically subscribe to the partitions of the new topics.
 For implementation, it involves sending a TopicMetadataRequest to a random 
 broker and parsing the response.
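
A rough sketch of how a consumer might use such an API for client-side assignment
(listTopics() and its return type are assumptions based on the description above,
and assign() stands in for whatever manual-assignment call the final API exposes):
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.TopicPartition;

public class ManualRegexAssignmentSketch {

    // Called periodically: rebuild the assignment from whatever topics currently match.
    static void refreshAssignment(KafkaConsumer<byte[], byte[]> consumer, Pattern pattern) {
        List<TopicPartition> assignment = new ArrayList<TopicPartition>();

        // Hypothetical API from this ticket: list all topics and their partitions.
        Map<String, List<PartitionInfo>> topics = consumer.listTopics();
        for (Map.Entry<String, List<PartitionInfo>> entry : topics.entrySet()) {
            if (!pattern.matcher(entry.getKey()).matches())
                continue;
            for (PartitionInfo p : entry.getValue())
                assignment.add(new TopicPartition(p.topic(), p.partition()));
        }

        // Client-side partition assignment instead of Kafka-side group management.
        consumer.assign(assignment);
    }
}
{code}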



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1782) Junit3 Misusage

2015-07-13 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1782:
-
Reviewer: Guozhang Wang

 Junit3 Misusage
 ---

 Key: KAFKA-1782
 URL: https://issues.apache.org/jira/browse/KAFKA-1782
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Alexander Pakulov
  Labels: newbie
 Fix For: 0.8.3

 Attachments: KAFKA-1782.patch, KAFKA-1782_2015-06-18_11:52:49.patch


 This was found while I was working on KAFKA-1580: in many of the cases where 
 we explicitly extend from JUnit3Suite (e.g. ProducerFailureHandlingTest), we 
 are actually misusing a bunch of features that only exist in Junit4, such as 
 (expected=classOf). For example, the following code
 {code}
 import org.scalatest.junit.JUnit3Suite
 import org.junit.Test
 import java.io.IOException
 class MiscTest extends JUnit3Suite {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 will actually pass even though IOException was not thrown since this 
 annotation is not supported in Junit3. Whereas
 {code}
 import org.junit._
 import java.io.IOException
 class MiscTest extends JUnit3Suite {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 or
 {code}
 import org.scalatest.junit.JUnitSuite
 import org.junit._
 import java.io.IOException
 class MiscTest extends JUnit3Suite {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 or
 {code}
 import org.junit._
 import java.io.IOException
 class MiscTest {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 will fail.
 I would propose to not rely on Junit annotations other than @Test itself but 
 use scala unit test annotations instead, for example:
 {code}
 import org.junit._
 import java.io.IOException
 class MiscTest {
   @Test
   def testSendOffset() {
 intercept[IOException] {
   //nothing
 }
   }
 }
 {code}
 will fail with a clearer stacktrace.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 36333: Patch for KAFKA-2123

2015-07-13 Thread Ewen Cheslack-Postava

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36333/#review91564
---



clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
(line 961)
https://reviews.apache.org/r/36333/#comment144989

Hmm, this seems like very different behavior from before. Won't this 
trigger an offset fetch request *every* time this method is called? Seems like 
that could be very bad behavior if I wanted to do something like list the 
committed offsets for the partitions this consumer owns (i.e. by iterating over 
the very confusingly named Set<TopicPartition> subscriptions(), which returns 
assigned partitions).

Wouldn't the previous logic where it checks if we have the committed offset 
first make sense?



clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
 (line 197)
https://reviews.apache.org/r/36333/#comment144993

Won't this always sleep even if we succeeded? Unlike similar code earlier 
in this class, this one doesn't check if future.succeeded().



clients/src/test/java/org/apache/kafka/clients/consumer/internals/CoordinatorTest.java
 (line 516)
https://reviews.apache.org/r/36333/#comment144998

It looks like there was a bunch of reorganization plus the addition of 
createCoordinator() calls, but they all seem to use a MockRebalanceCallback? Even 
the ones that explicitly create their own callback and pass it into 
createCoordinator()? Maybe I'm just missing the reason for these changes?


- Ewen Cheslack-Postava


On July 12, 2015, 12:34 a.m., Jason Gustafson wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36333/
 ---
 
 (Updated July 12, 2015, 12:34 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-2123
 https://issues.apache.org/jira/browse/KAFKA-2123
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-2123; resolve problems from rebase
 
 
 KAFKA-2123; address review comments
 
 
 Diffs
 -
 
   clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 
 fd98740bff175cc9d5bc02e365d88e011ef65d22 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerCommitCallback.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRebalanceCallback.java
  74dfdba0ecbca04947adba5eabb1cb5190a0cd5f 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRecords.java 
 eb75d2e797e3aa3992e4cf74b12f51c8f1545e02 
   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
 7aa076084c894bb8f47b9df2c086475b06f47060 
   clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java 
 46e26a665a22625d50888efa7b53472279f36e79 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
  c1c8172cd45f6715262f9a6f497a7b1797a834a3 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/DelayedTask.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/DelayedTaskQueue.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
  695eaf63db9a5fa20dc2ca68957901462a96cd96 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Heartbeat.java
  51eae1944d5c17cf838be57adf560bafe36fbfbd 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/NoAvailableBrokersException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/RequestFuture.java
  13fc9af7392b4ade958daf3b0c9a165ddda351a6 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/RequestFutureAdapter.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/RequestFutureListener.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/SendFailedException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/StaleMetadataException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/SubscriptionState.java
  683745304c671952ff566f23b5dd4cf3ab75377a 
   
 clients/src/main/java/org/apache/kafka/common/errors/ConsumerCoordinatorNotAvailableException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/common/errors/DisconnectException.java 
 PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/common/errors/IllegalGenerationException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/common/errors/NotCoordinatorForConsumerException.java
  PRE-CREATION 
   
 

Re: Review Request 36333: Patch for KAFKA-2123

2015-07-13 Thread Jason Gustafson


 On July 14, 2015, 12:04 a.m., Guozhang Wang wrote:
  clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java,
   lines 117-131
  https://reviews.apache.org/r/36333/diff/1-2/?file=1002924#file1002924line117
 
  With this change, we are now always sending an OffsetFetchRequest even 
  when subscriptions.refreshCommitsNeeded returns false?

You are right. The weird thing about this API is that it accepts the partitions 
to refresh, but SubscriptionState only has a single flag indicating that a 
refresh is needed. This means that refreshing committed offsets for a subset of 
the partitions (which is what KafkaConsumer does) could cause us to fail to 
refresh the other partitions. I think maybe we should just remove the 
partitions parameter and always refresh all assigned partitions when a refresh 
is needed. Either that or we need to invalidate committed offsets on a 
per-partition basis.
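
For illustration, the per-partition option might look roughly like this (a sketch
only, not the actual SubscriptionState code):
{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import org.apache.kafka.common.TopicPartition;

// Track "committed offset needs refresh" per partition instead of with a single
// consumer-wide flag, so refreshing a subset of partitions cannot hide the fact
// that other partitions are still stale.
public class PerPartitionCommittedOffsets {
    private final Map<TopicPartition, Long> committed = new HashMap<TopicPartition, Long>();
    private final Set<TopicPartition> needsRefresh = new HashSet<TopicPartition>();

    public void invalidate(TopicPartition tp) {
        needsRefresh.add(tp);
    }

    public boolean refreshNeeded(TopicPartition tp) {
        return needsRefresh.contains(tp) || !committed.containsKey(tp);
    }

    public void committedOffsetFetched(TopicPartition tp, long offset) {
        committed.put(tp, offset);
        needsRefresh.remove(tp);
    }

    public Long committed(TopicPartition tp) {
        return committed.get(tp);
    }
}
{code}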


 On July 14, 2015, 12:04 a.m., Guozhang Wang wrote:
  clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java,
   lines 558-559
  https://reviews.apache.org/r/36333/diff/2/?file=1009093#file1009093line558
 
  When the coordinator is dead / not known yet, the consumerCoordinator 
  field could be null, but since we do not stop scheduling the heartbeat 
  tasks, will this end up triggering client.send(null, ...)?

Actually I changed the code so that heartbeats are only rescheduled when the 
coordinator is known (and we are not awaiting a group join). Take a look at 
HeartbeatTask and see if it addresses this concern.


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36333/#review91565
---


On July 12, 2015, 12:34 a.m., Jason Gustafson wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36333/
 ---
 
 (Updated July 12, 2015, 12:34 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-2123
 https://issues.apache.org/jira/browse/KAFKA-2123
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-2123; resolve problems from rebase
 
 
 KAFKA-2123; address review comments
 
 
 Diffs
 -
 
   clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 
 fd98740bff175cc9d5bc02e365d88e011ef65d22 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerCommitCallback.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRebalanceCallback.java
  74dfdba0ecbca04947adba5eabb1cb5190a0cd5f 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRecords.java 
 eb75d2e797e3aa3992e4cf74b12f51c8f1545e02 
   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
 7aa076084c894bb8f47b9df2c086475b06f47060 
   clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java 
 46e26a665a22625d50888efa7b53472279f36e79 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
  c1c8172cd45f6715262f9a6f497a7b1797a834a3 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/DelayedTask.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/DelayedTaskQueue.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
  695eaf63db9a5fa20dc2ca68957901462a96cd96 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Heartbeat.java
  51eae1944d5c17cf838be57adf560bafe36fbfbd 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/NoAvailableBrokersException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/RequestFuture.java
  13fc9af7392b4ade958daf3b0c9a165ddda351a6 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/RequestFutureAdapter.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/RequestFutureListener.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/SendFailedException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/StaleMetadataException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/SubscriptionState.java
  683745304c671952ff566f23b5dd4cf3ab75377a 
   
 clients/src/main/java/org/apache/kafka/common/errors/ConsumerCoordinatorNotAvailableException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/common/errors/DisconnectException.java 
 PRE-CREATION 
   
 

Re: Review Request 36333: Patch for KAFKA-2123

2015-07-13 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36333/#review91565
---



clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
 (lines 117 - 131)
https://reviews.apache.org/r/36333/#comment144990

With this change, we are now always sending an OffsetFetchRequest even when 
subscriptions.refreshCommitsNeeded returns false?



clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
 (lines 506 - 507)
https://reviews.apache.org/r/36333/#comment144992

When the coordinator is dead / not known yet, the consumerCoordinator field 
could be null, but since we do not stop scheduling the heartbeat tasks, will 
this end up triggering client.send(null, ...)?


- Guozhang Wang


On July 12, 2015, 12:34 a.m., Jason Gustafson wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/36333/
 ---
 
 (Updated July 12, 2015, 12:34 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-2123
 https://issues.apache.org/jira/browse/KAFKA-2123
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-2123; resolve problems from rebase
 
 
 KAFKA-2123; address review comments
 
 
 Diffs
 -
 
   clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 
 fd98740bff175cc9d5bc02e365d88e011ef65d22 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerCommitCallback.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRebalanceCallback.java
  74dfdba0ecbca04947adba5eabb1cb5190a0cd5f 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRecords.java 
 eb75d2e797e3aa3992e4cf74b12f51c8f1545e02 
   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
 7aa076084c894bb8f47b9df2c086475b06f47060 
   clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java 
 46e26a665a22625d50888efa7b53472279f36e79 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
  c1c8172cd45f6715262f9a6f497a7b1797a834a3 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/DelayedTask.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/DelayedTaskQueue.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
  695eaf63db9a5fa20dc2ca68957901462a96cd96 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Heartbeat.java
  51eae1944d5c17cf838be57adf560bafe36fbfbd 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/NoAvailableBrokersException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/RequestFuture.java
  13fc9af7392b4ade958daf3b0c9a165ddda351a6 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/RequestFutureAdapter.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/RequestFutureListener.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/SendFailedException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/StaleMetadataException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/SubscriptionState.java
  683745304c671952ff566f23b5dd4cf3ab75377a 
   
 clients/src/main/java/org/apache/kafka/common/errors/ConsumerCoordinatorNotAvailableException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/common/errors/DisconnectException.java 
 PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/common/errors/IllegalGenerationException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/common/errors/NotCoordinatorForConsumerException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/common/errors/OffsetLoadInProgressException.java
  PRE-CREATION 
   
 clients/src/main/java/org/apache/kafka/common/errors/UnknownConsumerIdException.java
  PRE-CREATION 
   clients/src/main/java/org/apache/kafka/common/protocol/Errors.java 
 4c0ecc3badd99727b5bd9d430364e61c184e0923 
   
 clients/src/test/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClientTest.java
  PRE-CREATION 
   
 clients/src/test/java/org/apache/kafka/clients/consumer/internals/CoordinatorTest.java
  d085fe5c9e2a0567893508a1c71f014fae6d7510 
   
 clients/src/test/java/org/apache/kafka/clients/consumer/internals/DelayedTaskQueueTest.java
  PRE-CREATION 
   
 clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
  

[jira] [Updated] (KAFKA-972) MetadataRequest returns stale list of brokers

2015-07-13 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-972:
--
   Resolution: Fixed
Fix Version/s: 0.8.3
   Status: Resolved  (was: Patch Available)

Thanks for the latest patch. +1 and committed to trunk.

 MetadataRequest returns stale list of brokers
 -

 Key: KAFKA-972
 URL: https://issues.apache.org/jira/browse/KAFKA-972
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.0
Reporter: Vinicius Carvalho
Assignee: Ashish K Singh
 Fix For: 0.8.3

 Attachments: BrokerMetadataTest.scala, KAFKA-972.patch, 
 KAFKA-972_2015-06-30_18:42:13.patch, KAFKA-972_2015-07-01_01:36:56.patch, 
 KAFKA-972_2015-07-01_01:42:42.patch, KAFKA-972_2015-07-01_08:06:03.patch, 
 KAFKA-972_2015-07-06_23:07:34.patch, KAFKA-972_2015-07-07_10:42:41.patch, 
 KAFKA-972_2015-07-07_23:24:13.patch


 When we issue a MetadataRequest to the cluster, the list of brokers is 
 stale: even when a broker is down, it is still returned to the client. 
 The following are examples of two invocations, one with both brokers online 
 and the second with a broker down:
 {
   "brokers": [
     { "nodeId": 0, "host": "10.139.245.106", "port": 9092, "byteLength": 24 },
     { "nodeId": 1, "host": "localhost", "port": 9093, "byteLength": 19 }
   ],
   "topicMetadata": [
     {
       "topicErrorCode": 0,
       "topicName": "foozbar",
       "partitions": [
         { "replicas": [0], "isr": [0], "partitionErrorCode": 0, "partitionId": 0, "leader": 0, "byteLength": 26 },
         { "replicas": [1], "isr": [1], "partitionErrorCode": 0, "partitionId": 1, "leader": 1, "byteLength": 26 },
         { "replicas": [0], "isr": [0], "partitionErrorCode": 0, "partitionId": 2, "leader": 0, "byteLength": 26 },
         { "replicas": [1], "isr": [1], "partitionErrorCode": 0, "partitionId": 3, "leader": 1, "byteLength": 26 },
         { "replicas": [0], "isr": [0], "partitionErrorCode": 0, "partitionId": 4, "leader": 0, "byteLength": 26 }
       ],
       "byteLength": 145
     }
   ],
   "responseSize": 200,
   "correlationId": -1000
 }
 {
   "brokers": [
     { "nodeId": 0, "host": "10.139.245.106", "port": 9092, "byteLength": 24 },
     { "nodeId": 1, "host": "localhost", "port": 9093, "byteLength": 19 }
   ],
   "topicMetadata": [
     {
       "topicErrorCode": 0,
       "topicName": "foozbar",
       "partitions": [
         { "replicas": [0], "isr": [], "partitionErrorCode": 5, "partitionId": 0, "leader": -1, "byteLength": 22 },
         { "replicas": [1], "isr": [1], "partitionErrorCode": 0, "partitionId": 1, "leader": 1, "byteLength": 26 },
         { "replicas": [0], "isr": [], "partitionErrorCode": 5, "partitionId": 2, "leader": -1, "byteLength": 22 },

[jira] [Commented] (KAFKA-1835) Kafka new producer needs options to make blocking behavior explicit

2015-07-13 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625634#comment-14625634
 ] 

Jiangjie Qin commented on KAFKA-1835:
-

Hey [~guozhang], I am worried about the case where the user wants strictly 
non-blocking sends. In that case they will set max.block.ms to 0, so they will 
always get an exception from the first send() because there is no metadata yet 
and the user has chosen not to wait at all.
Currently a workaround is to call partitionsFor() on the topic first to force a 
metadata refresh before starting to send. But under KIP-19 partitionsFor() will 
also honour max.block.ms, so the user will no longer be able to force a metadata 
refresh before sending data.
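
For illustration, a minimal sketch of that workaround, assuming the 0.8.3-era 
Java producer API and the max.block.ms setting as proposed in KIP-19 (the topic 
name, serializers and callback handling here are only placeholders):
{code}
import java.util.Properties;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class NonBlockingSendSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Strictly non-blocking sends as discussed above; max.block.ms is the
        // KIP-19 proposal and is not in released clients at the time of writing.
        props.put("max.block.ms", "0");

        Producer<String, String> producer = new KafkaProducer<>(props);

        // Current workaround: force a metadata fetch for the topic up front so the
        // first send() does not fail simply because no metadata is cached yet.
        // Under KIP-19 this call would itself be bounded by max.block.ms, which is
        // the concern raised in the comment above.
        producer.partitionsFor("my-topic");

        producer.send(new ProducerRecord<String, String>("my-topic", "key", "value"),
                new Callback() {
                    @Override
                    public void onCompletion(RecordMetadata metadata, Exception exception) {
                        if (exception != null) {
                            exception.printStackTrace();
                        }
                    }
                });
        producer.close();
    }
}
{code}
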

 Kafka new producer needs options to make blocking behavior explicit
 ---

 Key: KAFKA-1835
 URL: https://issues.apache.org/jira/browse/KAFKA-1835
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 0.8.2.0, 0.8.3, 0.9.0
Reporter: Paul Pearcy
 Fix For: 0.8.3

 Attachments: KAFKA-1835-New-producer--blocking_v0.patch, 
 KAFKA-1835.patch

   Original Estimate: 504h
  Remaining Estimate: 504h

 The new (0.8.2 standalone) producer will block the first time it attempts to 
 retrieve metadata for a topic. This is not the desired behavior in some use 
 cases where async non-blocking guarantees are required and message loss is 
 acceptable in known cases. Also, most developers will assume an API that 
 returns a future is safe to call in a critical request path. 
 Discussing on the mailing list, the most viable option is to have the 
 following settings:
  pre.initialize.topics=x,y,z
  pre.initialize.timeout=x
  
 This moves potential blocking to the init of the producer and outside of some 
 random request. The potential will still exist for blocking in a corner case 
 where connectivity with Kafka is lost and a topic not included in pre-init 
 has a message sent for the first time. 
 There is the question of what to do when initialization fails. There are a 
 couple of options that I'd like available:
 - Fail creation of the client 
 - Fail all sends until the metadata is available 
 Open to input on how the above option should be expressed. 
 It is also worth noting more nuanced solutions exist that could work without 
 the extra settings, they just end up having extra complications and at the 
 end of the day not adding much value. For instance, the producer could accept 
 and queue messages (note: more complicated than I am making it sound due to 
 storing all accepted messages in pre-partitioned compact binary form), but 
 you're still going to be forced to choose to either start blocking or 
 dropping messages at some point. 
 I have some test cases I am going to port over to the Kafka producer 
 integration ones and start from there. My current impl is in scala, but 
 porting to Java shouldn't be a big deal (was using a promise to track init 
 status, but will likely need to make that an atomic bool). 
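
For concreteness, a hedged sketch of how the proposed settings might be supplied, 
using the names above (pre.initialize.topics and pre.initialize.timeout are only a 
proposal in this issue and do not exist in any released producer):
{code}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;

public class PreInitSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Hypothetical settings from this proposal: fetch metadata for these topics
        // during construction, waiting up to the given timeout before failing.
        props.put("pre.initialize.topics", "x,y,z");
        props.put("pre.initialize.timeout", "5000");

        Producer<String, String> producer = new KafkaProducer<>(props);
        producer.close();
    }
}
{code}
With a released producer these two keys would most likely just be logged as unused 
configuration; they are shown only to make the proposal concrete.
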



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Jenkins build is back to normal : Kafka-trunk #540

2015-07-13 Thread Apache Jenkins Server
See https://builds.apache.org/job/Kafka-trunk/540/changes



[jira] [Commented] (KAFKA-1835) Kafka new producer needs options to make blocking behavior explicit

2015-07-13 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625674#comment-14625674
 ] 

Ewen Cheslack-Postava commented on KAFKA-1835:
--

[~becket_qin] I agree that for a user this doesn't look great when they first 
start using the API and switch it to be fully non-blocking. Although in a 
perverse way it may be pretty good behavior for those users -- it forces them 
to actually handle that exception properly when it occurs because they need to 
handle it to get any data sent out. This means they should be robust, to some 
degree, to both the metadata fetch and a buffer-full condition. And I'm not 
convinced this behavior is unreasonable. Right now we continue to use the cached 
metadata for partitioning even after it has hit the max age. I'd argue that as 
soon as metadata.max.age.ms is exceeded and we can't get an update, we could 
reasonably start throwing the same error on send, because we can't be certain the 
partitioning is still valid: the number of partitions could have changed.

In fact, perhaps that's another bug? Given a connectivity issue with the cluster, 
the producer could partition incorrectly for arbitrarily long. It's also limited 
by the buffer size, so in most cases it probably wouldn't be an issue, but it 
seems like bad behavior nonetheless.

 Kafka new producer needs options to make blocking behavior explicit
 ---

 Key: KAFKA-1835
 URL: https://issues.apache.org/jira/browse/KAFKA-1835
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 0.8.2.0, 0.8.3, 0.9.0
Reporter: Paul Pearcy
 Fix For: 0.8.3

 Attachments: KAFKA-1835-New-producer--blocking_v0.patch, 
 KAFKA-1835.patch

   Original Estimate: 504h
  Remaining Estimate: 504h

 The new (0.8.2 standalone) producer will block the first time it attempts to 
 retrieve metadata for a topic. This is not the desired behavior in some use 
 cases where async non-blocking guarantees are required and message loss is 
 acceptable in known cases. Also, most developers will assume an API that 
 returns a future is safe to call in a critical request path. 
 Discussing on the mailing list, the most viable option is to have the 
 following settings:
  pre.initialize.topics=x,y,z
  pre.initialize.timeout=x
  
 This moves potential blocking to the init of the producer and outside of some 
 random request. The potential will still exist for blocking in a corner case 
 where connectivity with Kafka is lost and a topic not included in pre-init 
 has a message sent for the first time. 
 There is the question of what to do when initialization fails. There are a 
 couple of options that I'd like available:
 - Fail creation of the client 
 - Fail all sends until the metadata is available 
 Open to input on how the above option should be expressed. 
 It is also worth noting more nuanced solutions exist that could work without 
 the extra settings, they just end up having extra complications and at the 
 end of the day not adding much value. For instance, the producer could accept 
 and queue messages (note: more complicated than I am making it sound due to 
 storing all accepted messages in pre-partitioned compact binary form), but 
 you're still going to be forced to choose to either start blocking or 
 dropping messages at some point. 
 I have some test cases I am going to port over to the Kafka producer 
 integration ones and start from there. My current impl is in scala, but 
 porting to Java shouldn't be a big deal (was using a promise to track init 
 status, but will likely need to make that an atomic bool). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2123) Make new consumer offset commit API use callback + future

2015-07-13 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625677#comment-14625677
 ] 

Jason Gustafson commented on KAFKA-2123:


Updated reviewboard https://reviews.apache.org/r/36333/diff/
 against branch upstream/trunk

 Make new consumer offset commit API use callback + future
 -

 Key: KAFKA-2123
 URL: https://issues.apache.org/jira/browse/KAFKA-2123
 Project: Kafka
  Issue Type: Sub-task
  Components: clients, consumer
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
Priority: Critical
 Fix For: 0.8.3

 Attachments: KAFKA-2123.patch, KAFKA-2123.patch, 
 KAFKA-2123_2015-04-30_11:23:05.patch, KAFKA-2123_2015-05-01_19:33:19.patch, 
 KAFKA-2123_2015-05-04_09:39:50.patch, KAFKA-2123_2015-05-04_22:51:48.patch, 
 KAFKA-2123_2015-05-29_11:11:05.patch, KAFKA-2123_2015-07-11_17:33:59.patch, 
 KAFKA-2123_2015-07-13_18:45:08.patch


 The current version of the offset commit API in the new consumer is
 void commit(offsets, commit type)
 where the commit type is either sync or async. This means you need to use 
 sync if you ever want confirmation that the commit succeeded. Some 
 applications will want to use asynchronous offset commit, but be able to tell 
 when the commit completes.
 This is basically the same problem that had to be fixed going from the old 
 consumer to the new consumer, and I'd suggest the same fix using a callback + 
 future combination. The new API would be
 Future<Void> commit(Map<TopicPartition, Long> offsets, ConsumerCommitCallback 
 callback);
 where ConsumerCommitCallback contains a single method:
 public void onCompletion(Exception exception);
 We can provide shorthand variants of commit() for eliding the different 
 arguments.
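
 To make the shape of the proposal concrete, a hedged usage sketch assuming exactly 
 the signature and callback interface written above (neither exists in the released 
 consumer, and the consumer variable is likewise assumed):
{code}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Future;

import org.apache.kafka.common.TopicPartition;

// Both consumer.commit(offsets, callback) and ConsumerCommitCallback are the
// API proposed in this issue, not released code.
Map<TopicPartition, Long> offsets = new HashMap<TopicPartition, Long>();
offsets.put(new TopicPartition("my-topic", 0), 42L);

Future<Void> result = consumer.commit(offsets, new ConsumerCommitCallback() {
    @Override
    public void onCompletion(Exception exception) {
        if (exception != null) {
            // Asynchronous commit failed; log, retry or surface as appropriate.
            exception.printStackTrace();
        }
    }
});

// Callers that want synchronous semantics can simply block on the future.
result.get();
{code}
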



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] Using GitHub Pull Requests for contributions and code review

2015-07-13 Thread Jun Rao
Ismael,

I followed the instructions in KAFKA-2320 and created a new Jenkins job (
https://builds.apache.org/job/kafka-trunk-git-pr/).  Could you check if it
works?

As for wiki, I have a couple of minor comments.

a. Could we add the following to the wiki?
To avoid conflicts, assign a jira to yourself if you plan to work on it. If
you are creating a jira and don't plan to work on it, leave the assignee as
Unassigned.

b. Previously, we marked a jira as "Patch Available" if there was a patch.
Could we reuse that instead of "In Progress" to be consistent? Also, if a
patch needs more work after review, the reviewer will mark the jira back to
"In Progress".

Thanks,

Jun




On Wed, Jul 8, 2015 at 1:39 AM, Ismael Juma ism...@juma.me.uk wrote:

 An update on this.

 On Thu, Apr 30, 2015 at 2:12 PM, Ismael Juma ism...@juma.me.uk wrote:

 
 1. CI builds triggered by GitHub PRs (this is supported by Apache
 Infra, we need to request it for Kafka and provide whatever
 configuration
 is needed)
 
  Filed https://issues.apache.org/jira/browse/KAFKA-2320, someone with a
 Jenkins account needs to follow the instructions there.

 
 1. Adapting Spark's merge_spark_pr script and integrating it into the
 Kafka Git repository
 
  https://issues.apache.org/jira/browse/KAFKA-2187 includes a patch that
 has received an initial review by Neha.

 
 1. Updating the Kafka contribution wiki and adding a CONTRIBUTING.md
 to the Git repository (this is shown when someone is creating a pull
 request)
 
  Initial versions (feedback and/or improvements are welcome):

-

 https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes
-

 https://cwiki.apache.org/confluence/display/KAFKA/Patch+submission+and+review#Patchsubmissionandreview-MergingGitHubPullRequests
- https://issues.apache.org/jira/browse/KAFKA-2321 (patch available)


 1. Go through existing GitHub pull requests and close the ones that
 are no longer relevant (there are quite a few as people have been
 opening
 them over the years, but nothing was done about most of them)
 
  Not done yet. I think this should wait until we have merged a few PRs, as I
 would like to invite people to open new PRs (if still relevant) while
 pointing them to the documentation on how to go about it.

 
 1. Other things I may be missing
 
  We also need to update the Contributing page on the website. I think
 this should also wait until we are happy that the new approach works well
 for us.

 Any help moving this forward is appreciated. Aside from reviews, feedback
 and merging the changes; testing the new process is particularly useful (
 https://issues.apache.org/jira/browse/KAFKA-2276 is an example).

 Best,
 Ismael



[jira] [Commented] (KAFKA-1782) Junit3 Misusage

2015-07-13 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625654#comment-14625654
 ] 

Guozhang Wang commented on KAFKA-1782:
--

[~apakulov] Yes this is still relevant, sorry for being late on the reviews, 
will take a look at it soon.

 Junit3 Misusage
 ---

 Key: KAFKA-1782
 URL: https://issues.apache.org/jira/browse/KAFKA-1782
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Alexander Pakulov
  Labels: newbie
 Fix For: 0.8.3

 Attachments: KAFKA-1782.patch, KAFKA-1782_2015-06-18_11:52:49.patch


 This was found while I was working on KAFKA-1580: in many of the cases where 
 we explicitly extend JUnit3Suite (e.g. ProducerFailureHandlingTest), we 
 are actually misusing a bunch of features that only exist in JUnit 4, such as 
 (expected=classOf). For example, the following code
 {code}
 import org.scalatest.junit.JUnit3Suite
 import org.junit.Test
 import java.io.IOException
 class MiscTest extends JUnit3Suite {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 will actually pass even though IOException was not thrown, since this 
 annotation is not supported in JUnit 3. Whereas
 {code}
 import org.junit._
 import java.io.IOException
 class MiscTest extends JUnit3Suite {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 or
 {code}
 import org.scalatest.junit.JUnitSuite
 import org.junit._
 import java.io.IOException
 class MiscTest extends JUnit3Suite {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 or
 {code}
 import org.junit._
 import java.io.IOException
 class MiscTest {
   @Test (expected = classOf[IOException])
   def testSendOffset() {
   }
 }
 {code}
 will fail.
 I would propose not relying on JUnit annotations other than @Test itself, but 
 using ScalaTest's facilities (e.g. intercept) instead, for example:
 {code}
 import org.junit._
 import org.scalatest.Assertions.intercept
 import java.io.IOException

 class MiscTest {
   @Test
   def testSendOffset() {
     intercept[IOException] {
       // nothing: intercept fails the test if no IOException is thrown
     }
   }
 }
 {code}
 will fail with a clearer stacktrace.
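
 As a point of comparison, a minimal plain-JUnit 4 sketch (Java) of the behaviour 
 being relied on: under a JUnit 4 runner the annotation argument is honoured and 
 this test fails because the expected exception is never thrown, whereas the 
 misusage described above comes from the same annotation being silently ignored 
 when the class runs as a JUnit 3 suite.
{code}
import java.io.IOException;

import org.junit.Test;

public class ExpectedExceptionSketch {
    // Honoured only under a JUnit 4 runner: this test FAILS because no
    // IOException is thrown. Run as a JUnit 3 suite, the argument would be
    // ignored and the test would pass vacuously.
    @Test(expected = IOException.class)
    public void testSendOffset() {
        // nothing thrown
    }
}
{code}
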



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  1   2   >