[jira] [Commented] (KAFKA-1737) Document required ZkSerializer for ZkClient used with AdminUtils

2014-11-12 Thread Vivek Madani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207798#comment-14207798
 ] 

Vivek Madani commented on KAFKA-1737:
-

Hi - Did you mean to enforce ZkStringSerializer on the ZkClient instance passed 
to AdminUtils.createTopic? Or did you mean changing ZkClient from 
org.I0Itec.zkclient.ZkClient? 

If I understand this correctly, since AdminUtils is user-facing, a user can 
create a ZkClient instance outside and pass it on to AdminUtils. Do you suggest 
providing an overload in AdminUtils that takes the parameters required to 
construct ZkClient internally and sets ZkStringSerializer on it? In this case, a 
doc update may still be required in case someone intends to use the overload 
which takes ZkClient. Or we could just set ZkStringSerializer on the instance of 
ZkClient passed to AdminUtils.

There are many places where new ZkClient is called within the Kafka code-base, 
and your suggestion to have a createZkClient will help, but we may need a 
different mechanism for AdminUtils. I am saying this based on my limited 
understanding of the Kafka code-base - correct me if I am missing anything.

 Document required ZkSerializer for ZkClient used with AdminUtils
 

 Key: KAFKA-1737
 URL: https://issues.apache.org/jira/browse/KAFKA-1737
 Project: Kafka
  Issue Type: Improvement
  Components: tools
Affects Versions: 0.8.1.1
Reporter: Stevo Slavic
Priority: Minor

 {{ZkClient}} instances passed to {{AdminUtils}} calls must have 
 {{kafka.utils.ZKStringSerializer}} set as {{ZkSerializer}}. Otherwise 
 commands executed via {{AdminUtils}} may not be seen/recognizable to broker, 
 producer or consumer. E.g. producer (with auto topic creation turned off) 
 will not be able to send messages to a topic created via {{AdminUtils}}, it 
 will throw {{UnknownTopicOrPartitionException}}.
 Please consider at least documenting this requirement in {{AdminUtils}} 
 scaladoc.
 For more info see [related discussion on Kafka user mailing 
 list|http://mail-archives.apache.org/mod_mbox/kafka-users/201410.mbox/%3CCAAUywg-oihNiXuQRYeS%3D8Z3ymsmEHo6ghLs%3Dru4nbm%2BdHVz6TA%40mail.gmail.com%3E].
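For reference, {{kafka.utils.ZKStringSerializer}} simply writes znode payloads as UTF-8 strings; a ZkClient constructed without a serializer falls back to Java object serialization, which brokers cannot read. A minimal Java sketch of what the serializer does (the interface below is a hypothetical stand-in for zkclient's {{ZkSerializer}}, not the real one):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for org.I0Itec.zkclient.serialize.ZkSerializer.
interface StringZkSerializer {
    byte[] serialize(Object data);
    Object deserialize(byte[] bytes);
}

// Mirrors what kafka.utils.ZKStringSerializer does: znode payloads are plain
// UTF-8 strings, so brokers, producers and consumers can all parse them.
public class Utf8Serializer implements StringZkSerializer {
    public byte[] serialize(Object data) {
        return data == null ? null : ((String) data).getBytes(StandardCharsets.UTF_8);
    }
    public Object deserialize(byte[] bytes) {
        return bytes == null ? null : new String(bytes, StandardCharsets.UTF_8);
    }
}
```

An AdminUtils caller would set such a serializer on its ZkClient before calling createTopic; without it, the topic znodes are unreadable to the broker.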



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1752) add --replace-broker option

2014-11-12 Thread Dmitry Pekar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207900#comment-14207900
 ] 

Dmitry Pekar commented on KAFKA-1752:
-

[~gwenshap] That, probably, could be implemented. But wouldn't it create 
unpredictable and unmanageable (from user's point of view) replica 
redistribution? Also should we consider using a strategy with optimal (minimal 
number) moving of replicas between brokers?


 add --replace-broker option
 ---

 Key: KAFKA-1752
 URL: https://issues.apache.org/jira/browse/KAFKA-1752
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3








[jira] [Comment Edited] (KAFKA-1752) add --replace-broker option

2014-11-12 Thread Dmitry Pekar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207900#comment-14207900
 ] 

Dmitry Pekar edited comment on KAFKA-1752 at 11/12/14 10:31 AM:


[~gwenshap] That could probably be implemented. 
1. But wouldn't it create unpredictable and unmanageable (from the user's point 
of view) replica redistribution? 
2. If 1. is false, should we consider using a strategy with optimal (minimal 
number) moving of replicas between brokers?

If 1. is false then we should discuss the strategy of fair redistribution. Need 
to think about it.

Also this seems to extend the scope of the initial ticket, because this part 
(--add-broker and fair redistribution of replicas) is the most complicated.


was (Author: dmitry pekar):
[~gwenshap] That, probably, could be implemented. But wouldn't it create 
unpredictable and unmanageable (from user's point of view) replica 
redistribution? Also should we consider using a strategy with optimal (minimal 
number) moving of replicas between brokers?


 add --replace-broker option
 ---

 Key: KAFKA-1752
 URL: https://issues.apache.org/jira/browse/KAFKA-1752
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3








[jira] [Commented] (KAFKA-1667) topic-level configuration not validated

2014-11-12 Thread Dmytro Kostiuchenko (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207957#comment-14207957
 ] 

Dmytro Kostiuchenko commented on KAFKA-1667:


Updated reviewboard https://reviews.apache.org/r/27634/diff/
 against branch origin/trunk

  topic-level configuration not validated
 

 Key: KAFKA-1667
 URL: https://issues.apache.org/jira/browse/KAFKA-1667
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ryan Berdeen
  Labels: newbie
 Attachments: KAFKA-1667.patch, KAFKA-1667_2014-11-05_19:43:53.patch, 
 KAFKA-1667_2014-11-06_17:10:14.patch, KAFKA-1667_2014-11-07_14:28:14.patch, 
 KAFKA-1667_2014-11-12_12:49:11.patch


 I was able to set the configuration for a topic to these invalid values:
 {code}
 Topic:topic-config-test  PartitionCount:1ReplicationFactor:2 
 Configs:min.cleanable.dirty.ratio=-30.2,segment.bytes=-1,retention.ms=-12,cleanup.policy=lol
 {code}
 It seems that the values are saved as long as they are the correct type, but 
 are not validated like the corresponding broker-level properties.
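To illustrate, the fix amounts to per-key range checks before values are persisted. The sketch below is an illustrative Java approximation, not the actual LogConfig code; the specific ranges (e.g. treating any non-positive retention.ms as invalid) are assumptions for the example:

```java
import java.util.Map;
import java.util.Set;

public class TopicConfigValidator {
    private static final Set<String> CLEANUP_POLICIES = Set.of("delete", "compact");

    // Throws IllegalArgumentException for out-of-range values instead of
    // silently persisting them, mirroring the broker-level property checks.
    public static void validate(Map<String, String> configs) {
        for (Map.Entry<String, String> e : configs.entrySet()) {
            String k = e.getKey(), v = e.getValue();
            switch (k) {
                case "min.cleanable.dirty.ratio": {
                    double d = Double.parseDouble(v);
                    if (d < 0.0 || d > 1.0) fail(k, v, "must be in [0, 1]");
                    break;
                }
                case "segment.bytes":
                case "retention.ms": {
                    long n = Long.parseLong(v);
                    if (n <= 0) fail(k, v, "must be positive");
                    break;
                }
                case "cleanup.policy":
                    if (!CLEANUP_POLICIES.contains(v)) fail(k, v, "unknown policy");
                    break;
            }
        }
    }

    private static void fail(String k, String v, String why) {
        throw new IllegalArgumentException(
            "Invalid value " + v + " for " + k + ": " + why);
    }
}
```

Every value from the example above (-30.2, -1, -12, lol) would be rejected by such a check.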





[jira] [Updated] (KAFKA-1667) topic-level configuration not validated

2014-11-12 Thread Dmytro Kostiuchenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kostiuchenko updated KAFKA-1667:
---
Attachment: KAFKA-1667_2014-11-12_12:49:11.patch

  topic-level configuration not validated
 

 Key: KAFKA-1667
 URL: https://issues.apache.org/jira/browse/KAFKA-1667
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ryan Berdeen
  Labels: newbie
 Attachments: KAFKA-1667.patch, KAFKA-1667_2014-11-05_19:43:53.patch, 
 KAFKA-1667_2014-11-06_17:10:14.patch, KAFKA-1667_2014-11-07_14:28:14.patch, 
 KAFKA-1667_2014-11-12_12:49:11.patch


 I was able to set the configuration for a topic to these invalid values:
 {code}
 Topic:topic-config-test  PartitionCount:1ReplicationFactor:2 
 Configs:min.cleanable.dirty.ratio=-30.2,segment.bytes=-1,retention.ms=-12,cleanup.policy=lol
 {code}
 It seems that the values are saved as long as they are the correct type, but 
 are not validated like the corresponding broker-level properties.





Re: Review Request 27634: Patch for KAFKA-1667

2014-11-12 Thread Dmytro Kostiuchenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27634/
---

(Updated Nov. 12, 2014, 11:49 a.m.)


Review request for kafka.


Bugs: KAFKA-1667
https://issues.apache.org/jira/browse/KAFKA-1667


Repository: kafka


Description (updated)
---

KAFKA-1667 Fixed bugs in LogConfig. Added test and documentation


KAFKA-1667 Updated tests to reflect new boolean property parsing logic


KAFKA-1667 renamed methods to match naming convention


KAFKA-1667 Added unit test to cover invalid configuration case


KAFKA-1667 Strict UncleanLeaderElection property parsing


Diffs (updated)
-

  clients/src/main/java/org/apache/kafka/common/config/ConfigDef.java 
c4cea2cc072f4db4ce014b63d226431d3766bef1 
  core/src/main/scala/kafka/admin/TopicCommand.scala 
0b2735e7fc42ef9894bef1997b1f06a8ebee5439 
  core/src/main/scala/kafka/log/LogConfig.scala 
e48922a97727dd0b98f3ae630ebb0af3bef2373d 
  core/src/main/scala/kafka/utils/Utils.scala 
23aefb4715b177feae1d2f83e8b910653ea10c5f 
  core/src/test/scala/kafka/log/LogConfigTest.scala PRE-CREATION 
  core/src/test/scala/unit/kafka/integration/UncleanLeaderElectionTest.scala 
f44568cb25edf25db857415119018fd4c9922f61 

Diff: https://reviews.apache.org/r/27634/diff/


Testing
---


Thanks,

Dmytro Kostiuchenko



[jira] [Commented] (KAFKA-1667) topic-level configuration not validated

2014-11-12 Thread Dmytro Kostiuchenko (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207968#comment-14207968
 ] 

Dmytro Kostiuchenko commented on KAFKA-1667:


Can't assign the issue to myself. I get an exception when running 
kafka-patch-review.py:

{code}jira.exceptions.JIRAError: HTTP 400: Field 'assignee' cannot be set. It 
is not on the appropriate screen, or unknown{code}

I also don't have an "Assign to me" button in JIRA.

  topic-level configuration not validated
 

 Key: KAFKA-1667
 URL: https://issues.apache.org/jira/browse/KAFKA-1667
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ryan Berdeen
  Labels: newbie
 Attachments: KAFKA-1667.patch, KAFKA-1667_2014-11-05_19:43:53.patch, 
 KAFKA-1667_2014-11-06_17:10:14.patch, KAFKA-1667_2014-11-07_14:28:14.patch, 
 KAFKA-1667_2014-11-12_12:49:11.patch


 I was able to set the configuration for a topic to these invalid values:
 {code}
 Topic:topic-config-test  PartitionCount:1ReplicationFactor:2 
 Configs:min.cleanable.dirty.ratio=-30.2,segment.bytes=-1,retention.ms=-12,cleanup.policy=lol
 {code}
 It seems that the values are saved as long as they are the correct type, but 
 are not validated like the corresponding broker-level properties.





[jira] [Updated] (KAFKA-1667) topic-level configuration not validated

2014-11-12 Thread Dmytro Kostiuchenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kostiuchenko updated KAFKA-1667:
---
Status: Patch Available  (was: Open)

  topic-level configuration not validated
 

 Key: KAFKA-1667
 URL: https://issues.apache.org/jira/browse/KAFKA-1667
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ryan Berdeen
  Labels: newbie
 Attachments: KAFKA-1667_2014-11-05_19:43:53.patch, 
 KAFKA-1667_2014-11-06_17:10:14.patch, KAFKA-1667_2014-11-07_14:28:14.patch, 
 KAFKA-1667_2014-11-12_12:49:11.patch


 I was able to set the configuration for a topic to these invalid values:
 {code}
 Topic:topic-config-test  PartitionCount:1ReplicationFactor:2 
 Configs:min.cleanable.dirty.ratio=-30.2,segment.bytes=-1,retention.ms=-12,cleanup.policy=lol
 {code}
 It seems that the values are saved as long as they are the correct type, but 
 are not validated like the corresponding broker-level properties.





[jira] [Updated] (KAFKA-1667) topic-level configuration not validated

2014-11-12 Thread Dmytro Kostiuchenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kostiuchenko updated KAFKA-1667:
---
Attachment: (was: KAFKA-1667.patch)

  topic-level configuration not validated
 

 Key: KAFKA-1667
 URL: https://issues.apache.org/jira/browse/KAFKA-1667
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Ryan Berdeen
  Labels: newbie
 Attachments: KAFKA-1667_2014-11-05_19:43:53.patch, 
 KAFKA-1667_2014-11-06_17:10:14.patch, KAFKA-1667_2014-11-07_14:28:14.patch, 
 KAFKA-1667_2014-11-12_12:49:11.patch


 I was able to set the configuration for a topic to these invalid values:
 {code}
 Topic:topic-config-test  PartitionCount:1ReplicationFactor:2 
 Configs:min.cleanable.dirty.ratio=-30.2,segment.bytes=-1,retention.ms=-12,cleanup.policy=lol
 {code}
 It seems that the values are saved as long as they are the correct type, but 
 are not validated like the corresponding broker-level properties.





Re: Review Request 27684: Patch for KAFKA-1743

2014-11-12 Thread Manikumar Reddy O


 On Nov. 10, 2014, 7:50 p.m., Jun Rao wrote:
  core/src/main/scala/kafka/consumer/ConsumerConnector.scala, lines 76-80
  https://reviews.apache.org/r/27684/diff/2/?file=755292#file755292line76
 
  We will also need to change the interface in ConsumerConnector from 
  
def commitOffsets(retryOnFailure: Boolean = true)

  back to 
  
def commitOffsets

  In ZookeeperConsumerconnector, we can make the following method private
  
  def commitOffsets(retryOnFailure: Boolean = true)
  
  Another question, will the scala compiler be confused with 2 methods, one 
  w/o parentheses and one with 1 parameter having a default? Could you try 
  compiling the code on all scala versions?

Currently the below classes use the new method commitOffsets(true): 

kafka/javaapi/consumer/ZookeeperConsumerConnector.scala
kafka/tools/TestEndToEndLatency.scala

If we are changing the interface, then we need to change the above classes 
also. 

If we are not fixing this on trunk, then the same problem will come up in 
0.8.3. How do we handle this? 

The 2 methods, one w/o parentheses and one with 1 parameter, compile on 
all scala versions.
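The backward-compatibility pattern under discussion — keeping the old zero-argument signature on the public interface while the retry-flag variant becomes private — can be sketched in Java terms (the class name and field below are hypothetical, and the Scala default-parameter subtlety is set aside):

```java
// Hypothetical illustration of the compatibility pattern discussed above:
// the public contract keeps the old zero-arg signature, and the richer
// variant stays an implementation detail of the connector.
interface ConsumerConnector {
    void commitOffsets();                  // old, compatible signature
}

public class ZookeeperishConnector implements ConsumerConnector {
    public int commits = 0;                // visible for the example only

    public void commitOffsets() {
        commitOffsets(true);               // delegate with the old default
    }

    private void commitOffsets(boolean retryOnFailure) {
        // ... real commit logic; retryOnFailure would control retries on error
        commits++;
    }
}
```

External callers such as TestEndToEndLatency would then call the zero-arg form, so the flagged variant never leaks into the public interface.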


- Manikumar Reddy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27684/#review60652
---


On Nov. 8, 2014, 6:20 a.m., Manikumar Reddy O wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27684/
 ---
 
 (Updated Nov. 8, 2014, 6:20 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1743
 https://issues.apache.org/jira/browse/KAFKA-1743
 
 
 Repository: kafka
 
 
 Description
 ---
 
 def commitOffsets method added to make ConsumerConnector backward  compatible
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/consumer/ConsumerConnector.scala 
 07677c1c26768ef9c9032626180d0015f12cb0e0 
 
 Diff: https://reviews.apache.org/r/27684/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Manikumar Reddy O
 




Re: Review Request 27890: Patch for KAFKA-1764

2014-11-12 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27890/#review61001
---


Could you also remove the line in consumer that sends back the shutdown command?

- Guozhang Wang


On Nov. 11, 2014, 10:59 p.m., Jiangjie Qin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27890/
 ---
 
 (Updated Nov. 11, 2014, 10:59 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1764
 https://issues.apache.org/jira/browse/KAFKA-1764
 
 
 Repository: kafka
 
 
 Description
 ---
 
 fix for KAFKA-1764
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
 fbc680fde21b02f11285a4f4b442987356abd17b 
 
 Diff: https://reviews.apache.org/r/27890/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jiangjie Qin
 




Re: Kafka Command Line Shell

2014-11-12 Thread Guozhang Wang
Thanks Joe. I will read the wiki page.

On Tue, Nov 11, 2014 at 11:47 PM, Joe Stein joe.st...@stealth.ly wrote:

 I started writing this up on the wiki

 https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements

 Instead of starting a new thread I figured I'd just continue this one I started.
 I also added another (important) component for centralized management of
 configuration at a global level, much like we have at the topic level.  These
 global configurations would be overridden (perhaps not all) from
 server.properties on start (so in case one broker needs a different
 port, sure).

 One concern I have is that using RQ/RP wire protocol to the
 controller instead of the current way (via ZK admin path) may expose
 concurrency on the admin requests, which may not be supported yet.

 Guozhang, take a look at the diagram for how I am thinking of this: it would be
 a new handle request that will execute the tools pretty much how they are
 today. My thinking is maybe to do one at a time (so TopicCommand first I
 think) and have what the TopicCommand is doing happen on the server, sending
 the RQ/RP to the client but executing on the server. If there is something
 not supported we will of course have to deal with that and implement it for
 sure.  Once we get one working end to end I think adding the rest will be
 (more or less) concise iterations to get it done. I added your concern to
 the wiki under the gotchas section.

 /***
  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stealth.ly
  Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
 /

 On Mon, Oct 20, 2014 at 2:15 AM, Guozhang Wang wangg...@gmail.com wrote:

  One concern I have is that using RQ/RP wire protocol to the controller
  instead of the current way (via ZK admin path) may expose concurrency on
  the admin requests, which may not be supported yet.
 
  Some initial discussion about this is on KAFKA-1305.
 
  Guozhang
 
  On Sun, Oct 19, 2014 at 1:55 PM, Joe Stein joe.st...@stealth.ly wrote:
 
   Maybe we should add some AdminMessage RQ/RP wire protocol structure(s)
  and
   let the controller handle it? We could then build the CLI and Shell in
  the
   project both as useful tools and samples for others.
  
   Making a http interface should be simple after KAFKA-1494 is done which
  all
   client libraries could offer.
  
   I will update the design tonight/tomorrow and should be able to have
   someone starting to work on it this week.
  
   /***
   Joe Stein
   Founder, Principal Consultant
   Big Data Open Source Security LLC
   http://www.stealth.ly
   Twitter: @allthingshadoop
   /
   On Oct 19, 2014 1:21 PM, Harsha ka...@harsha.io wrote:
  
+1 for Web Api
   
On Sat, Oct 18, 2014, at 11:48 PM, Glen Mazza wrote:
 Apache Karaf has been doing this for quite a few years, albeit in
  Java
 not Scala.  Still, their coding approach to creating a CLI probably
 captures many lessons learned over that time.

 Glen

 On 10/17/2014 08:03 PM, Joe Stein wrote:
  Hi, I have been thinking about the ease of use for operations
 with
Kafka.
  We have lots of tools doing a lot of different things and they
 are
   all
kind
  of in different places.
 
  So, what I was thinking is to have a single interface for our
  tooling
  https://issues.apache.org/jira/browse/KAFKA-1694
 
  This would manifest itself in two ways 1) a command line
 interface
  2)
a repl
 
  We would have one entry point centrally for all Kafka commands.
  kafka CMD ARGS
  kafka createTopic --brokerList etc,
  kafka reassignPartition --brokerList etc,
 
  or execute and run the shell
 
  kafka --brokerList localhost
   kafka> use topicName;
   kafka> set acl='label';
 
  I was thinking that all calls would be initialized through
--brokerList and
  the broker can tell the KafkaCommandTool what server to connect
 to
   for
  MetaData.
 
  Thoughts? Tomatoes?
 
  /***
Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop 
  http://www.twitter.com/allthingshadoop
  /
 

   
  
 
 
 
  --
  -- Guozhang
 




-- 
-- Guozhang


[jira] [Commented] (KAFKA-1737) Document required ZkSerializer for ZkClient used with AdminUtils

2014-11-12 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208275#comment-14208275
 ] 

Guozhang Wang commented on KAFKA-1737:
--

Hi Vivek,

Here are my thoughts: since currently we only use a single data format 
(ZkStringSerializer) in Kafka's ZK, we could just enforce it at ZkClient 
construction time; but as you mentioned, people can pass any ZkClient instance 
to the AdminUtils API functions, and since they can construct it via "new 
ZkClient" it is a bit hard to enforce programmatically. Instead I was proposing 
to add a new createZkClient function and let people use it instead of calling 
"new ZkClient" to create new instances. Of course we still need to change the 
docs telling people to do so, not to use "new".
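The createZkClient idea can be sketched as a factory that is the single construction point, so the serializer can never be omitted. The types below are simplified stand-ins for the real zkclient classes, purely illustrative:

```java
import java.nio.charset.StandardCharsets;

// Simplified stand-ins for the real org.I0Itec.zkclient types.
interface ZkSerializer {
    byte[] serialize(Object data);
}

class ZkClient {
    final String connectString;
    final ZkSerializer serializer;

    ZkClient(String connectString, ZkSerializer serializer) {
        this.connectString = connectString;
        this.serializer = serializer;
    }
}

public class ZkUtilsSketch {
    // The proposed createZkClient: the one place a ZkClient is built, so
    // callers cannot forget the string serializer.
    public static ZkClient createZkClient(String connectString) {
        return new ZkClient(connectString,
            data -> ((String) data).getBytes(StandardCharsets.UTF_8));
    }
}
```

Callers would use ZkUtilsSketch.createZkClient(...) instead of "new ZkClient", and the docs would only need to point at the factory.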

 Document required ZkSerializer for ZkClient used with AdminUtils
 

 Key: KAFKA-1737
 URL: https://issues.apache.org/jira/browse/KAFKA-1737
 Project: Kafka
  Issue Type: Improvement
  Components: tools
Affects Versions: 0.8.1.1
Reporter: Stevo Slavic
Priority: Minor

 {{ZkClient}} instances passed to {{AdminUtils}} calls must have 
 {{kafka.utils.ZKStringSerializer}} set as {{ZkSerializer}}. Otherwise 
 commands executed via {{AdminUtils}} may not be seen/recognizable to broker, 
 producer or consumer. E.g. producer (with auto topic creation turned off) 
 will not be able to send messages to a topic created via {{AdminUtils}}, it 
 will throw {{UnknownTopicOrPartitionException}}.
 Please consider at least documenting this requirement in {{AdminUtils}} 
 scaladoc.
 For more info see [related discussion on Kafka user mailing 
 list|http://mail-archives.apache.org/mod_mbox/kafka-users/201410.mbox/%3CCAAUywg-oihNiXuQRYeS%3D8Z3ymsmEHo6ghLs%3Dru4nbm%2BdHVz6TA%40mail.gmail.com%3E].





[jira] [Commented] (KAFKA-1173) Using Vagrant to get up and running with Apache Kafka

2014-11-12 Thread Joe Stein (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208281#comment-14208281
 ] 

Joe Stein commented on KAFKA-1173:
--

[~ewencp] I am +1 on the virtual box parts of the patch and the updates you 
last made (I really like how you added the vagrant part to the main README, 
nice touch). I am having an issue with the EC2 pieces but am pretty convinced 
it is how my account is set up with VPC, so I am going to set up a new account 
and try it again. I may not have a chance to do that until the weekend, FYI.

 Using Vagrant to get up and running with Apache Kafka
 -

 Key: KAFKA-1173
 URL: https://issues.apache.org/jira/browse/KAFKA-1173
 Project: Kafka
  Issue Type: Improvement
Reporter: Joe Stein
Assignee: Ewen Cheslack-Postava
 Attachments: KAFKA-1173.patch, KAFKA-1173_2013-12-07_12:07:55.patch, 
 KAFKA-1173_2014-11-11_13:50:55.patch


 Vagrant has been getting a lot of pickup in the tech communities.  I have 
 found it very useful for development and testing and working with a few 
 clients now using it to help virtualize their environments in repeatable ways.
 Using Vagrant to get up and running.
 For 0.8.0 I have a patch on github https://github.com/stealthly/kafka
 1) Install Vagrant [http://www.vagrantup.com/](http://www.vagrantup.com/)
 2) Install Virtual Box 
 [https://www.virtualbox.org/](https://www.virtualbox.org/)
 In the main kafka folder
 1) ./sbt update
 2) ./sbt package
 3) ./sbt assembly-package-dependency
 4) vagrant up
 once this is done 
 * Zookeeper will be running 192.168.50.5
 * Broker 1 on 192.168.50.10
 * Broker 2 on 192.168.50.20
 * Broker 3 on 192.168.50.30
 When you are all up and running you will be back at a command prompt.  
 If you want you can log in to the machines using "vagrant ssh machineName" but 
 you don't need to.
 You can access the brokers and zookeeper by their IP
 e.g.
 bin/kafka-console-producer.sh --broker-list 
 192.168.50.10:9092,192.168.50.20:9092,192.168.50.30:9092 --topic sandbox
 bin/kafka-console-consumer.sh --zookeeper 192.168.50.5:2181 --topic sandbox 
 --from-beginning





[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-11-12 Thread Vladimir Tretyakov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208284#comment-14208284
 ] 

Vladimir Tretyakov commented on KAFKA-1481:
---

Maybe somebody can answer my last questions? I have to finish this patch and 
move forward! Thx.

 Stop using dashes AND underscores as separators in MBean names
 --

 Key: KAFKA-1481
 URL: https://issues.apache.org/jira/browse/KAFKA-1481
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
Priority: Critical
  Labels: patch
 Fix For: 0.8.3

 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, 
 KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, 
 KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, 
 KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, 
 KAFKA-1481_2014-10-31_14-35-43.patch, 
 KAFKA-1481_2014-11-03_16-39-41_doc.patch, 
 KAFKA-1481_2014-11-03_17-02-23.patch, 
 KAFKA-1481_2014-11-10_20-39-41_doc.patch, 
 KAFKA-1481_2014-11-10_21-02-23.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch, alternateLayout1.png, 
 alternateLayout2.png, diff-for-alternate-layout1.patch, 
 diff-for-alternate-layout2.patch, originalLayout.png


 MBeans should not use dashes or underscores as separators because these 
 characters are allowed in hostnames, topics, group and consumer IDs, etc., 
 and these are embedded in MBeans names making it impossible to parse out 
 individual bits from MBeans.
 Perhaps a pipe character should be used to avoid the conflict. 
 This looks like a major blocker because it means nobody can write Kafka 0.8.x 
 monitoring tools unless they are doing it for themselves AND do not use 
 dashes AND do not use underscores.
 See: http://search-hadoop.com/m/4TaT4lonIW
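The ambiguity is easy to demonstrate: once a hostname such as my-host-1 is embedded in a dash-delimited MBean name alongside a topic such as topic-a, the field boundaries cannot be recovered. One way around it, sketched with the JDK's javax.management.ObjectName, is one key property per field so no separator is needed at all (the domain and property names below are illustrative, not Kafka's actual ones):

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class MBeanNameDemo {
    // One key property per field: no separator character is embedded in a
    // composite value, so dashes or underscores inside host names, topics,
    // group or consumer IDs stay unambiguous and parseable.
    public static ObjectName name(String host, String topic) {
        try {
            return new ObjectName(
                "kafka.consumer:type=FetcherStats,host=" + host + ",topic=" + topic);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

A monitoring tool can then read each field back with getKeyProperty instead of guessing where to split a dash-delimited string; a pipe separator inside a single value would work too, since '|' is illegal in hostnames.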





[jira] [Commented] (KAFKA-1684) Implement TLS/SSL authentication

2014-11-12 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208321#comment-14208321
 ] 

Jun Rao commented on KAFKA-1684:


Gwen,

Thanks for the comment. Having 3 different ports is probably fine. My point is 
that since adding a port requires inter-broker request format changes (and 
rolling upgrades with this kind of change are a bit tricky), it would be good 
if we do the request change just once. Perhaps we can work out the needed 
request format changes for both SSL and SASL first.

Regarding finding a good model to mimic, it seems that HDFS supports both 
Kerberos and SSL. Is that a better model to look into?

 Implement TLS/SSL authentication
 

 Key: KAFKA-1684
 URL: https://issues.apache.org/jira/browse/KAFKA-1684
 Project: Kafka
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.9.0
Reporter: Jay Kreps
Assignee: Ivan Lyutov
 Attachments: KAFKA-1684.patch


 Add an SSL port to the configuration and advertise this as part of the 
 metadata request.
 If the SSL port is configured the socket server will need to add a second 
 Acceptor thread to listen on it. Connections accepted on this port will need 
 to go through the SSL handshake prior to being registered with a Processor 
 for request processing.
 SSL requests and responses may need to be wrapped or unwrapped using the 
 SSLEngine that was initialized by the acceptor. This wrapping and unwrapping 
 is very similar to what will need to be done for SASL-based authentication 
 schemes. We should have a uniform interface that covers both of these and we 
 will need to store the instance in the session with the request. The socket 
 server will have to use this object when reading and writing requests. We 
 will need to take care with the FetchRequests as the current 
 FileChannel.transferTo mechanism will be incompatible with wrap/unwrap so we 
 can only use this optimization for unencrypted sockets that don't require 
 userspace translation (wrapping).
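 The per-connection handshake state described above is what the JDK's javax.net.ssl.SSLEngine provides. A rough sketch of how an acceptor might initialize one in server mode before registering the connection with a Processor (this uses the JVM's default SSLContext for brevity; a real broker would build its own from configured keystores, and the method name is hypothetical):

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

public class SslAcceptorSketch {
    // Creates the per-connection engine: server mode, ready for the SSL
    // handshake that must complete before any requests are processed.
    public static SSLEngine newServerEngine(String peerHost, int peerPort) {
        try {
            SSLContext context = SSLContext.getDefault();
            SSLEngine engine = context.createSSLEngine(peerHost, peerPort);
            engine.setUseClientMode(false); // the broker accepts, it does not connect
            return engine;
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException("no default SSLContext", e);
        }
    }
}
```

 The engine's wrap/unwrap calls are then the uniform points where a shared encryption interface (covering SASL as well) could sit, and where the FetchRequest transferTo incompatibility mentioned above arises.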





Re: Review Request 25995: Patch for KAFKA-1650

2014-11-12 Thread Jiangjie Qin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25995/
---

(Updated Nov. 12, 2014, 5:51 p.m.)


Review request for kafka.


Summary (updated)
-

Patch for KAFKA-1650


Bugs: KAFKA-1650 and KAKFA-1650
https://issues.apache.org/jira/browse/KAFKA-1650
https://issues.apache.org/jira/browse/KAKFA-1650


Repository: kafka


Description
---

Addressed Guozhang's comments.


Addressed Guozhang's comments


commit before switch to trunk


commit before rebase


Rebased on trunk, Addressed Guozhang's comments.


Addressed Guozhang's comments on MaxInFlightRequests


Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
mirrormaker-redesign


Diffs (updated)
-

  core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
fbc680fde21b02f11285a4f4b442987356abd17b 
  core/src/main/scala/kafka/tools/MirrorMaker.scala 
f399105087588946987bbc84e3759935d9498b6a 

Diff: https://reviews.apache.org/r/25995/diff/


Testing
---


Thanks,

Jiangjie Qin



Re: Review Request 25995: Patch for KAKFA-1650

2014-11-12 Thread Jiangjie Qin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25995/
---

(Updated Nov. 12, 2014, 5:51 p.m.)


Review request for kafka.


Summary (updated)
-

Patch for KAKFA-1650


Bugs: KAFKA-1650 and KAKFA-1650
https://issues.apache.org/jira/browse/KAFKA-1650
https://issues.apache.org/jira/browse/KAKFA-1650


Repository: kafka


Description (updated)
---

Addressed Guozhang's comments.


Addressed Guozhang's comments


commit before switch to trunk


commit before rebase


Rebased on trunk, Addressed Guozhang's comments.


Addressed Guozhang's comments on MaxInFlightRequests


Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
mirrormaker-redesign


Diffs (updated)
-

  core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
fbc680fde21b02f11285a4f4b442987356abd17b 
  core/src/main/scala/kafka/tools/MirrorMaker.scala 
f399105087588946987bbc84e3759935d9498b6a 

Diff: https://reviews.apache.org/r/25995/diff/


Testing
---


Thanks,

Jiangjie Qin



[jira] [Updated] (KAFKA-1650) Mirror Maker could lose data on unclean shutdown.

2014-11-12 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-1650:

Attachment: KAFKA-1650_2014-11-12_09:51:30.patch

 Mirror Maker could lose data on unclean shutdown.
 -

 Key: KAFKA-1650
 URL: https://issues.apache.org/jira/browse/KAFKA-1650
 Project: Kafka
  Issue Type: Improvement
Reporter: Jiangjie Qin
Assignee: Jiangjie Qin
 Attachments: KAFKA-1650.patch, KAFKA-1650_2014-10-06_10:17:46.patch, 
 KAFKA-1650_2014-11-12_09:51:30.patch


 Currently if mirror maker got shutdown uncleanly, the data in the data 
 channel and buffer could potentially be lost. With the new producer's 
 callback, this issue could be solved.
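 The callback-based fix can be illustrated with simple bookkeeping: track offsets handed to the producer, remove each one when its send callback fires, and only ever commit up to just below the lowest still-unacknowledged offset. A self-contained Java sketch of that idea (illustrative only; the actual change lives in MirrorMaker.scala, and the class name here is hypothetical):

```java
import java.util.concurrent.ConcurrentSkipListSet;

public class UnackedOffsetTracker {
    private final ConcurrentSkipListSet<Long> inFlight = new ConcurrentSkipListSet<>();
    private volatile long maxSeen = -1L;

    public void onSend(long offset) {   // before handing the message to the producer
        inFlight.add(offset);
        if (offset > maxSeen) maxSeen = offset;
    }

    public void onAck(long offset) {    // from the producer's send callback
        inFlight.remove(offset);
    }

    // Highest offset safe to commit: everything at or below it has been
    // durably produced, so an unclean shutdown cannot lose those messages.
    public long committableOffset() {
        Long lowest = inFlight.isEmpty() ? null : inFlight.first();
        return lowest == null ? maxSeen : lowest - 1;
    }
}
```

 With this, the consumer commits committableOffset() instead of its fetch position, so messages still buffered in the data channel are replayed after a crash rather than lost. (The races between the checks above are ignored for brevity.)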





[jira] [Commented] (KAFKA-1650) Mirror Maker could lose data on unclean shutdown.

2014-11-12 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208327#comment-14208327
 ] 

Jiangjie Qin commented on KAFKA-1650:
-

Updated reviewboard https://reviews.apache.org/r/25995/diff/
 against branch origin/trunk

 Mirror Maker could lose data on unclean shutdown.
 -

 Key: KAFKA-1650
 URL: https://issues.apache.org/jira/browse/KAFKA-1650
 Project: Kafka
  Issue Type: Improvement
Reporter: Jiangjie Qin
Assignee: Jiangjie Qin
 Attachments: KAFKA-1650.patch, KAFKA-1650_2014-10-06_10:17:46.patch, 
 KAFKA-1650_2014-11-12_09:51:30.patch


 Currently, if mirror maker is shut down uncleanly, the data in the data 
 channel and buffer could potentially be lost. With the new producer's 
 callback, this issue could be solved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-11-12 Thread Vladimir Tretyakov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208284#comment-14208284
 ] 

Vladimir Tretyakov edited comment on KAFKA-1481 at 11/12/14 5:54 PM:
-

Could somebody answer my last questions? I have to finish this patch and move 
forward. Thanks.

I will extract the Kafka version as Gwen Shapira suggested in
http://search-hadoop.com/m/4TaT4xtk36/Programmatic+Kafka+version+detection%252Fextractionsubj=Programmatic+Kafka+version+detection+extraction+
 

{quote}
So it looks like we can use Gradle to add properties to manifest file and
then use getResourceAsStream to read the file and parse it.

The Gradle part would be something like:
jar.manifest {
    attributes('Implementation-Title': project.name,
               'Implementation-Version': project.version,
               'Built-By': System.getProperty('user.name'),
               'Built-JDK': System.getProperty('java.version'),
               'Built-Host': getHostname(),
               'Source-Compatibility': project.sourceCompatibility,
               'Target-Compatibility': project.targetCompatibility
    )
}

The code part would be:
this.getClass().getClassLoader().getResourceAsStream("/META-INF/MANIFEST.MF")

Does that look like the right approach?
{quote}

What do you think?
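For illustration, here is a self-contained Java sketch of the read side. The manifest bytes are built in-memory (the version string is a made-up example), but the parsing via `java.util.jar.Manifest` is the same as it would be for the stream returned by `getResourceAsStream`:

```java
import java.io.ByteArrayInputStream;
import java.util.jar.Manifest;

public class ManifestVersion {
    // Parse MANIFEST.MF bytes and return the Implementation-Version
    // main attribute, as the Gradle snippet above would write it.
    public static String versionFrom(byte[] manifestBytes) throws Exception {
        Manifest mf = new Manifest(new ByteArrayInputStream(manifestBytes));
        return mf.getMainAttributes().getValue("Implementation-Version");
    }

    public static void main(String[] args) throws Exception {
        // In-memory stand-in for the generated manifest; version is illustrative.
        String mf = "Manifest-Version: 1.0\r\n"
                  + "Implementation-Title: kafka\r\n"
                  + "Implementation-Version: 0.8.3-SNAPSHOT\r\n\r\n";
        System.out.println(versionFrom(mf.getBytes("UTF-8")));
    }
}
```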

What about 65?
{quote}
{quote}


was (Author: vladimir.tretyakov):
Maybe somebody can answer my last questions? Have to finish with this patch and 
moving forward! Thx.

 Stop using dashes AND underscores as separators in MBean names
 --

 Key: KAFKA-1481
 URL: https://issues.apache.org/jira/browse/KAFKA-1481
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
Priority: Critical
  Labels: patch
 Fix For: 0.8.3

 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, 
 KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, 
 KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, 
 KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, 
 KAFKA-1481_2014-10-31_14-35-43.patch, 
 KAFKA-1481_2014-11-03_16-39-41_doc.patch, 
 KAFKA-1481_2014-11-03_17-02-23.patch, 
 KAFKA-1481_2014-11-10_20-39-41_doc.patch, 
 KAFKA-1481_2014-11-10_21-02-23.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch, alternateLayout1.png, 
 alternateLayout2.png, diff-for-alternate-layout1.patch, 
 diff-for-alternate-layout2.patch, originalLayout.png


 MBeans should not use dashes or underscores as separators because these 
 characters are allowed in hostnames, topics, group and consumer IDs, etc., 
 and these are embedded in MBeans names making it impossible to parse out 
 individual bits from MBeans.
 Perhaps a pipe character should be used to avoid the conflict. 
 This looks like a major blocker because it means nobody can write Kafka 0.8.x 
 monitoring tools unless they are doing it for themselves AND do not use 
 dashes AND do not use underscores.
 See: http://search-hadoop.com/m/4TaT4lonIW



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


- Re: Kafka consumer transactional support

2014-11-12 Thread Falabella, Anthony
Hi Jun,

Thanks for taking a look at my issue and also for updating the future release 
plan Wiki page.

My use case is to use Kafka as if it were a JMS provider (messaging use case).
I'm currently using Kafka  0.8.1.1 with Java and specifically the Spring 
Integration Kafka Inbound Channel Adapter.  Internally that adapter uses the 
HighLevelConsumer which shields the caller from the internals of offsets. Let's 
take the case where a consumer-group reads a number of messages and then is 
abruptly terminated before properly processing those messages.  In that 
scenario upon restart ideally we'd begin reading at the offset we were at prior 
to abruptly terminating.  If we have auto.commit.enable=true upon restart 
those messages will be considered already read and will be skipped.  Setting 
auto.commit.enable=false would help in this case but now we'd have to 
manually call on the offset manager, requiring the use of the SimpleConsumer.

In my use-case, it's acceptable for some manual intervention to say reprocess 
messaging X thru Y, but to do so would require us to know the exact offset we 
had started at prior to the chunk that was read in when the JVM abnormally 
terminated.  Perhaps I could look at the underlying ExportZkOffsets and 
ImportZkOffsets Java classes mentioned in this link, but at the very least I'd 
need to log the timestamp just prior to my read to be used in that query per:
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Idon'twantmyconsumer'soffsetstobecommittedautomatically.CanImanuallymanagemyconsumer'soffsets?

It sounds like the ConsumerAPI rewrite mentioned in these links might be 
helpful in my situation (potentially targeted for Apr 2015):
https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite#ClientRewrite-ConsumerAPI
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Consumer+Rewrite+Design

In the meantime if you have any suggestions for things I might be able to use 
to work-around my concern I'd be appreciative.  Again I'm on 0.8.1.1 but would 
be willing to look at 0.8.2 if it offered anything to help with my use-case.

Thanks
Tony


[jira] [Created] (KAFKA-1767) /admin/reassign_partitions deleted before reassignment completes

2014-11-12 Thread Ryan Berdeen (JIRA)
Ryan Berdeen created KAFKA-1767:
---

 Summary: /admin/reassign_partitions deleted before reassignment 
completes
 Key: KAFKA-1767
 URL: https://issues.apache.org/jira/browse/KAFKA-1767
 Project: Kafka
  Issue Type: Bug
  Components: controller
Affects Versions: 0.8.1.1
Reporter: Ryan Berdeen
Assignee: Neha Narkhede


https://github.com/apache/kafka/blob/0.8.1.1/core/src/main/scala/kafka/controller/KafkaController.scala#L477-L517
 describes the process of reassigning partitions. Specifically, by the time 
{{/admin/reassign_partitions}} is updated, the newly assigned replicas (RAR) 
should be in sync, and the assigned replicas (AR) in ZooKeeper should be 
updated:

{code}
4. Wait until all replicas in RAR are in sync with the leader.
...
10. Update AR in ZK with RAR.
11. Update the /admin/reassign_partitions path in ZK to remove this partition.
{code}

This worked in 0.8.1, but in 0.8.1.1 we observe {{/admin/reassign_partitions}} 
being removed before step 4 has completed.

For example, if we have AR [1,2] and then put [3,4] in 
{{/admin/reassign_partitions}}, the cluster will end up with AR [1,2,3,4] and 
ISR [1,2] when the key is removed. Eventually, the AR will be updated to [3,4].

This means that the {{kafka-reassign-partitions.sh}} tool will accept a new 
batch of reassignments before the current reassignments have finished, and our 
own tool that feeds in reassignments in small batches (see KAFKA-1677) can't 
rely on this key to detect active reassignments.

Although we haven't observed this, it seems likely that if a controller 
resignation happens, the new controller won't know that a reassignment is in 
progress, and the AR will never be updated to the RAR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-11-12 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208497#comment-14208497
 ] 

Jun Rao commented on KAFKA-1481:



65. The following are some of the choices that we have.
(1) kafka.server:type=BrokerTopicMetrics,name=AggregateBytesOutPerSec
(2) kafka.server:type=AggregateBrokerTopicMetrics,name=BytesOutPerSec
(3) kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec
(4) kafka.server:type=BrokerTopicMetrics,name=AllTopicsBytesOutPerSec
(5) kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,allTopics=true
(6) kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topics=aggregate
The following is my take. The issue with (1), (2) and (3) is that it's not 
obvious which dimension is being aggregated upon. I also don't quite like (2) 
since it breaks the convention that type is the class name. If we do go with 
this route, I'd prefer that we explicitly create an AggregateBrokerTopicMetrics 
class instead of sneaking in the prefix in KafkaMetricsGroup. (4), (5) and (6) 
will all make it clear which dimension is being aggregated upon. (4) is a bit 
weird now that we support tags since the main purpose of tags is that we don't 
have to squeeze everything into a single name. So, either (5) or (6) looks 
reasonable to me. Also, I am not sure how jconsole displays mbeans, but the 
key/value pairs in the mbean name are supposed to be unordered.
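On that last point, the JMX API itself treats the key property list as unordered; a small Java check, using an option (5)-style name for illustration:

```java
import javax.management.ObjectName;

public class MBeanNameCheck {
    public static void main(String[] args) throws Exception {
        // Option (5)-style name: the aggregated dimension is an explicit tag.
        ObjectName a = new ObjectName(
            "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,allTopics=true");
        // The same properties in a different order.
        ObjectName b = new ObjectName(
            "kafka.server:allTopics=true,name=BytesOutPerSec,type=BrokerTopicMetrics");
        // Equality is defined on the canonical form (keys sorted), so the two
        // names are equal, and tags can be read without string parsing.
        System.out.println(a.equals(b));                   // true
        System.out.println(a.getKeyProperty("allTopics")); // true
    }
}
```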

[~jjkoshy], what's your take?

As for the mbean for the Kafka version, could we do that in a separate jira? 
The approach seems reasonable.

 Stop using dashes AND underscores as separators in MBean names
 --

 Key: KAFKA-1481
 URL: https://issues.apache.org/jira/browse/KAFKA-1481
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
Priority: Critical
  Labels: patch
 Fix For: 0.8.3

 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, 
 KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, 
 KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, 
 KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, 
 KAFKA-1481_2014-10-31_14-35-43.patch, 
 KAFKA-1481_2014-11-03_16-39-41_doc.patch, 
 KAFKA-1481_2014-11-03_17-02-23.patch, 
 KAFKA-1481_2014-11-10_20-39-41_doc.patch, 
 KAFKA-1481_2014-11-10_21-02-23.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch, alternateLayout1.png, 
 alternateLayout2.png, diff-for-alternate-layout1.patch, 
 diff-for-alternate-layout2.patch, originalLayout.png


 MBeans should not use dashes or underscores as separators because these 
 characters are allowed in hostnames, topics, group and consumer IDs, etc., 
 and these are embedded in MBeans names making it impossible to parse out 
 individual bits from MBeans.
 Perhaps a pipe character should be used to avoid the conflict. 
 This looks like a major blocker because it means nobody can write Kafka 0.8.x 
 monitoring tools unless they are doing it for themselves AND do not use 
 dashes AND do not use underscores.
 See: http://search-hadoop.com/m/4TaT4lonIW



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27735: Patch for KAFKA-1173

2014-11-12 Thread Ewen Cheslack-Postava

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27735/
---

(Updated Nov. 12, 2014, 7:32 p.m.)


Review request for kafka.


Bugs: KAFKA-1173
https://issues.apache.org/jira/browse/KAFKA-1173


Repository: kafka


Description (updated)
---

Add basic EC2 support, cleaner Vagrantfile, README cleanup, etc.


Better naming, hostmanager for routable VM names, vagrant-cachier to reduce 
startup cost, cleanup provisioning scripts, initial support for multiple 
zookeepers, general cleanup.


Don't sync a few directories that aren't actually required on the server.


Add generic worker node support.


Default # of workers should be 0


Add support for Zookeeper clusters.

This requires us to split up allocating VMs and provisioning because Vagrant
will run the provisioner for the first node before all nodes are allocated. This
leaves the first node running Zookeeper with unroutable peer hostnames which it,
for some reason, caches as unroutable. The cluster never properly finishes
forming since the nodes are unable to open connections to nodes booted later
than they were. The simple solution is to make sure all nodes are booted before
starting configuration so we have all the addresses and hostnames available and
routable.

Fix AWS provider commands in Vagrant README.


Addressing Joe's comments.


Add support for EC2 VPC settings.


Update Vagrant README to use --no-parallel when using EC2.

There's an issue that causes Vagrant to hang when running in
parallel. The last message is from vagrant-hostmanager, but it's not
clear if it is the actual cause.


Diffs (updated)
-

  .gitignore 99b32a6770e3da59bc0167d77d45ca339ac3dbbd 
  README.md 9aca90664b2a80a37125775ddbdea06ba6c53644 
  Vagrantfile PRE-CREATION 
  vagrant/README.md PRE-CREATION 
  vagrant/base.sh PRE-CREATION 
  vagrant/broker.sh PRE-CREATION 
  vagrant/zk.sh PRE-CREATION 

Diff: https://reviews.apache.org/r/27735/diff/


Testing
---


Thanks,

Ewen Cheslack-Postava



[jira] [Commented] (KAFKA-1173) Using Vagrant to get up and running with Apache Kafka

2014-11-12 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208512#comment-14208512
 ] 

Ewen Cheslack-Postava commented on KAFKA-1173:
--

Updated reviewboard https://reviews.apache.org/r/27735/diff/
 against branch origin/trunk

 Using Vagrant to get up and running with Apache Kafka
 -

 Key: KAFKA-1173
 URL: https://issues.apache.org/jira/browse/KAFKA-1173
 Project: Kafka
  Issue Type: Improvement
Reporter: Joe Stein
Assignee: Ewen Cheslack-Postava
 Attachments: KAFKA-1173.patch, KAFKA-1173_2013-12-07_12:07:55.patch, 
 KAFKA-1173_2014-11-11_13:50:55.patch, KAFKA-1173_2014-11-12_11:32:09.patch


 Vagrant has been getting a lot of pickup in the tech communities.  I have 
 found it very useful for development and testing and working with a few 
 clients now using it to help virtualize their environments in repeatable ways.
 Using Vagrant to get up and running.
 For 0.8.0 I have a patch on github https://github.com/stealthly/kafka
 1) Install Vagrant [http://www.vagrantup.com/](http://www.vagrantup.com/)
 2) Install Virtual Box 
 [https://www.virtualbox.org/](https://www.virtualbox.org/)
 In the main kafka folder
 1) ./sbt update
 2) ./sbt package
 3) ./sbt assembly-package-dependency
 4) vagrant up
 once this is done 
 * Zookeeper will be running 192.168.50.5
 * Broker 1 on 192.168.50.10
 * Broker 2 on 192.168.50.20
 * Broker 3 on 192.168.50.30
 When you are all up and running you will be back at a command prompt.  
 If you want you can login to the machines using vagrant ssh machineName but 
 you don't need to.
 You can access the brokers and zookeeper by their IP
 e.g.
 bin/kafka-console-producer.sh --broker-list 
 192.168.50.10:9092,192.168.50.20:9092,192.168.50.30:9092 --topic sandbox
 bin/kafka-console-consumer.sh --zookeeper 192.168.50.5:2181 --topic sandbox 
 --from-beginning



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1173) Using Vagrant to get up and running with Apache Kafka

2014-11-12 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-1173:
-
Attachment: KAFKA-1173_2014-11-12_11:32:09.patch

 Using Vagrant to get up and running with Apache Kafka
 -

 Key: KAFKA-1173
 URL: https://issues.apache.org/jira/browse/KAFKA-1173
 Project: Kafka
  Issue Type: Improvement
Reporter: Joe Stein
Assignee: Ewen Cheslack-Postava
 Attachments: KAFKA-1173.patch, KAFKA-1173_2013-12-07_12:07:55.patch, 
 KAFKA-1173_2014-11-11_13:50:55.patch, KAFKA-1173_2014-11-12_11:32:09.patch


 Vagrant has been getting a lot of pickup in the tech communities.  I have 
 found it very useful for development and testing and working with a few 
 clients now using it to help virtualize their environments in repeatable ways.
 Using Vagrant to get up and running.
 For 0.8.0 I have a patch on github https://github.com/stealthly/kafka
 1) Install Vagrant [http://www.vagrantup.com/](http://www.vagrantup.com/)
 2) Install Virtual Box 
 [https://www.virtualbox.org/](https://www.virtualbox.org/)
 In the main kafka folder
 1) ./sbt update
 2) ./sbt package
 3) ./sbt assembly-package-dependency
 4) vagrant up
 once this is done 
 * Zookeeper will be running 192.168.50.5
 * Broker 1 on 192.168.50.10
 * Broker 2 on 192.168.50.20
 * Broker 3 on 192.168.50.30
 When you are all up and running you will be back at a command prompt.  
 If you want you can login to the machines using vagrant ssh machineName but 
 you don't need to.
 You can access the brokers and zookeeper by their IP
 e.g.
 bin/kafka-console-producer.sh --broker-list 
 192.168.50.10:9092,192.168.50.20:9092,192.168.50.30:9092 --topic sandbox
 bin/kafka-console-consumer.sh --zookeeper 192.168.50.5:2181 --topic sandbox 
 --from-beginning



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1173) Using Vagrant to get up and running with Apache Kafka

2014-11-12 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208515#comment-14208515
 ] 

Ewen Cheslack-Postava commented on KAFKA-1173:
--

[~joestein] To be honest, I'm not too surprised something is coming up with the 
EC2 support. In theory it should be simple, but VPCs introduce a bunch of 
variables, and testing is tricky since some defaults depend on the age of your 
account, which affects whether you have EC2-Classic support.

I ran through a test with a VPC and found some issues. I updated the patch, 
including some additional info in the README since setting up under a VPC 
requires slight differences. My testing so far has been in EC2-Classic since 
that's the default for my account. I also put this VPC in a different region to 
make sure that worked. Finally, I've noticed that the default parallel 
provisioning seems to work fine until the very end, when it sometimes seems to 
hang. I couldn't easily track down the cause, so I updated the readme to use 
--no-parallel when using EC2. Not ideal, but it works reliably until we can 
find a better fix.

Hopefully those fixes will clear up the issue you're seeing.

 Using Vagrant to get up and running with Apache Kafka
 -

 Key: KAFKA-1173
 URL: https://issues.apache.org/jira/browse/KAFKA-1173
 Project: Kafka
  Issue Type: Improvement
Reporter: Joe Stein
Assignee: Ewen Cheslack-Postava
 Attachments: KAFKA-1173.patch, KAFKA-1173_2013-12-07_12:07:55.patch, 
 KAFKA-1173_2014-11-11_13:50:55.patch, KAFKA-1173_2014-11-12_11:32:09.patch


 Vagrant has been getting a lot of pickup in the tech communities.  I have 
 found it very useful for development and testing and working with a few 
 clients now using it to help virtualize their environments in repeatable ways.
 Using Vagrant to get up and running.
 For 0.8.0 I have a patch on github https://github.com/stealthly/kafka
 1) Install Vagrant [http://www.vagrantup.com/](http://www.vagrantup.com/)
 2) Install Virtual Box 
 [https://www.virtualbox.org/](https://www.virtualbox.org/)
 In the main kafka folder
 1) ./sbt update
 2) ./sbt package
 3) ./sbt assembly-package-dependency
 4) vagrant up
 once this is done 
 * Zookeeper will be running 192.168.50.5
 * Broker 1 on 192.168.50.10
 * Broker 2 on 192.168.50.20
 * Broker 3 on 192.168.50.30
 When you are all up and running you will be back at a command prompt.  
 If you want you can login to the machines using vagrant ssh machineName but 
 you don't need to.
 You can access the brokers and zookeeper by their IP
 e.g.
 bin/kafka-console-producer.sh --broker-list 
 192.168.50.10:9092,192.168.50.20:9092,192.168.50.30:9092 --topic sandbox
 bin/kafka-console-consumer.sh --zookeeper 192.168.50.5:2181 --topic sandbox 
 --from-beginning



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



Re: Kafka consumer transactional support

2014-11-12 Thread Jun Rao
Yes, the new consumer api will solve your problem better. Before that's
ready, another option is to use the commitOffset() api in the high level
consumer. It doesn't take any offset though. So, to prevent message loss
during consumer failure, you will need to make sure all iterated messages
are fully processed before calling commitOffset().
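The process-then-commit ordering can be sketched with a toy Java model; the batch and commitOffsets() here are stand-ins, not the real consumer API:

```java
import java.util.ArrayList;
import java.util.List;

public class AtLeastOnceLoop {
    static List<String> processed = new ArrayList<>();
    static int committedUpTo = 0;

    // Stand-in for the high-level consumer's commitOffsets(): marks
    // everything consumed so far as done.
    static void commitOffsets(int consumedSoFar) { committedUpTo = consumedSoFar; }

    public static void main(String[] args) {
        String[] batch = {"m1", "m2", "m3"};
        int consumed = 0;
        for (String msg : batch) {
            processed.add(msg); // fully process the message first...
            consumed++;
        }
        // ...and only commit afterwards: a crash before this line means
        // the messages are replayed on restart, never silently skipped.
        commitOffsets(consumed);
        System.out.println(committedUpTo);
    }
}
```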

Thanks,

Jun

On Wed, Nov 12, 2014 at 11:35 AM, Falabella, Anthony 
anthony.falabe...@citi.com wrote:

 Hi Jun,

 Thanks for taking a look at my issue and also for updating the future
 release plan Wiki page.

 My use case is to use Kafka as if it were a JMS provider (messaging use
 case).
 I'm currently using Kafka  0.8.1.1 with Java and specifically the Spring
 Integration Kafka Inbound Channel Adapter.  Internally that adapter uses
 the HighLevelConsumer which shields the caller from the internals of
 offsets. Let's take the case where a consumer-group reads a number of
 messages and then is abruptly terminated before properly processing those
 messages.  In that scenario upon restart ideally we'd begin reading at the
 offset we were at prior to abruptly terminating.  If we have
 auto.commit.enable=true upon restart those messages will be considered
 already read and will be skipped.  Setting auto.commit.enable=false would
 help in this case but now we'd have to manually call on the offset manager,
 requiring the use of the SimpleConsumer.

 In my use-case, it's acceptable for some manual intervention to say
 reprocess messaging X thru Y, but to do so would require us to know the
 exact offset we had started at prior to the chunk that was read in when the
 JVM abnormally terminated.  Perhaps I could look at the underlying
 ExportZkOffsets and ImportZkOffsets Java classes mentioned in this link,
 but at the very least I'd need to log the timestamp just prior to my read
 to be used in that query per:

 https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Idon'twantmyconsumer'soffsetstobecommittedautomatically.CanImanuallymanagemyconsumer'soffsets
 ?

 It sounds like the ConsumerAPI rewrite mentioned in these links might be
 helpful in my situation (potentially targeted for Apr 2015):

 https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite#ClientRewrite-ConsumerAPI

 https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Consumer+Rewrite+Design

 In the meantime if you have any suggestions for things I might be able to
 use to work-around my concern I'd be appreciative.  Again I'm on 0.8.1.1
 but would be willing to look at 0.8.2 if it offered anything to help with
 my use-case.

 Thanks
 Tony



Re: Review Request 27890: Patch for KAFKA-1764

2014-11-12 Thread Jiangjie Qin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27890/
---

(Updated Nov. 12, 2014, 10:05 p.m.)


Review request for kafka.


Bugs: KAFKA-1764
https://issues.apache.org/jira/browse/KAFKA-1764


Repository: kafka


Description (updated)
---

Changed Consumer iterator to stop putting the shutdown message back into 
channel.


Diffs (updated)
-

  core/src/main/scala/kafka/consumer/ConsumerIterator.scala 
ac491b4da2583ef7227c67f5b8bc0fd731d705c3 
  core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
fbc680fde21b02f11285a4f4b442987356abd17b 

Diff: https://reviews.apache.org/r/27890/diff/


Testing
---


Thanks,

Jiangjie Qin



[jira] [Updated] (KAFKA-1764) ZookeeperConsumerConnector could put multiple shutdownCommand to the same data chunk queue.

2014-11-12 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-1764:

Attachment: KAFKA-1764_2014-11-12_14:05:35.patch

 ZookeeperConsumerConnector could put multiple shutdownCommand to the same 
 data chunk queue.
 ---

 Key: KAFKA-1764
 URL: https://issues.apache.org/jira/browse/KAFKA-1764
 Project: Kafka
  Issue Type: Bug
Reporter: Jiangjie Qin
Assignee: Jiangjie Qin
 Attachments: KAFKA-1764.patch, KAFKA-1764_2014-11-12_14:05:35.patch


 In ZookeeperConsumerConnector shutdown(), we could potentially put multiple 
 shutdownCommand into the same data chunk queue, provided the topics are 
 sharing the same data chunk queue in topicThreadIdAndQueues.
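A toy Java model of the bug and of an identity-based dedup fix; the map and queue contents are stand-ins for topicThreadIdAndQueues, not Kafka's actual types:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ShutdownCommandDemo {
    public static void main(String[] args) {
        // Two topic-thread entries sharing one data chunk queue.
        BlockingQueue<String> shared = new LinkedBlockingQueue<>();
        Map<String, BlockingQueue<String>> topicQueues = new HashMap<>();
        topicQueues.put("topicA-0", shared);
        topicQueues.put("topicB-0", shared);

        // Naive shutdown: one shutdownCommand per map entry, so the shared
        // queue receives duplicates -- the situation described above.
        for (BlockingQueue<String> q : topicQueues.values()) q.offer("shutdownCommand");
        System.out.println("naive: " + shared.size());

        // Fix sketch: deduplicate the queues by identity before enqueueing.
        shared.clear();
        Set<BlockingQueue<String>> distinct =
            Collections.newSetFromMap(new IdentityHashMap<>());
        distinct.addAll(topicQueues.values());
        for (BlockingQueue<String> q : distinct) q.offer("shutdownCommand");
        System.out.println("dedup: " + shared.size());
    }
}
```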



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1764) ZookeeperConsumerConnector could put multiple shutdownCommand to the same data chunk queue.

2014-11-12 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208788#comment-14208788
 ] 

Jiangjie Qin commented on KAFKA-1764:
-

Updated reviewboard https://reviews.apache.org/r/27890/diff/
 against branch origin/trunk

 ZookeeperConsumerConnector could put multiple shutdownCommand to the same 
 data chunk queue.
 ---

 Key: KAFKA-1764
 URL: https://issues.apache.org/jira/browse/KAFKA-1764
 Project: Kafka
  Issue Type: Bug
Reporter: Jiangjie Qin
Assignee: Jiangjie Qin
 Attachments: KAFKA-1764.patch, KAFKA-1764_2014-11-12_14:05:35.patch


 In ZookeeperConsumerConnector shutdown(), we could potentially put multiple 
 shutdownCommand into the same data chunk queue, provided the topics are 
 sharing the same data chunk queue in topicThreadIdAndQueues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27684: Patch for KAFKA-1743

2014-11-12 Thread Jun Rao


 On Nov. 10, 2014, 7:50 p.m., Jun Rao wrote:
  core/src/main/scala/kafka/consumer/ConsumerConnector.scala, lines 76-80
  https://reviews.apache.org/r/27684/diff/2/?file=755292#file755292line76
 
  We will also need to change the interface in ConsumerConnector from 
  
def commitOffsets(retryOnFailure: Boolean = true)

  back to 
  
def commitOffsets

  In ZookeeperConsumerconnector, we can make the following method private
  
  def commitOffsets(retryOnFailure: Boolean = true)
  
  Another question, will the scala compiler be confused with 2 methods, one 
  w/o parentheses and one with 1 parameter having a default? Could you try 
  compiling the code on all scala versions?
 
 Manikumar Reddy O wrote:
 Currently the below classes use the new method commitOffsets(true). 
 
 kafka/javaapi/consumer/ZookeeperConsumerConnector.scala
 kafka/tools/TestEndToEndLatency.scala
 
 If we are changing the interface,  then we need to change the above 
 classes 
 also. 
 
 If we are not fixing this on trunk, then same problem will come in 0.8.3. 
 How to handle this? 
 
 The 2 methods, one w/o parentheses and one with 1 parameter, compile fine 
 on all scala versions.

Thanks for the explanation. There is actually a bit of inconsistency introduced 
in this patch. 

In kafka.javaapi.consumer.ZookeeperConsumerConnector, commitOffsets() is 
implemented as the following.
  def commitOffsets() {
underlying.commitOffsets()
  }
This actually calls underlying.commitOffsets(isAutoCommit: Boolean = true) with 
a default value of true. However, ConsumerConnector.commitOffsets is implemented 
as follows, which sets isAutoCommit to false:
  def commitOffsets { commitOffsets(false) }
  
So, we should use true in the above.
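A minimal Java model of the two call paths makes the inconsistency visible; all names here are stand-ins for the Scala methods quoted above, not Kafka's actual API:

```java
public class CommitPaths {
    static boolean lastIsAutoCommit;

    // Stand-in for ZookeeperConsumerConnector.commitOffsets(isAutoCommit: Boolean = true).
    static void commitOffsets(boolean isAutoCommit) { lastIsAutoCommit = isAutoCommit; }

    // Stand-in for the javaapi wrapper, which calls the underlying
    // method with its default argument (true).
    static void javaapiCommitOffsets() { commitOffsets(true); }

    // Stand-in for ConsumerConnector's parameterless commitOffsets,
    // implemented in the patch as commitOffsets(false).
    static void traitCommitOffsets() { commitOffsets(false); }

    public static void main(String[] args) {
        javaapiCommitOffsets();
        System.out.println("javaapi path: " + lastIsAutoCommit);
        traitCommitOffsets();
        System.out.println("trait path: " + lastIsAutoCommit);
    }
}
```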

Another thing that I was thinking is that it's going to be a bit confusing if 
we have the following scala apis.
  def commitOffsets(retryOnFailure: Boolean = true)
  def commitOffsets
  
So, if you do commitOffset it calls the second one and if you do 
commitOffset(), you actually call the first one. However, the expectation is 
probably the same method will be called in both cases. Would it be better if we 
get rid of the default like the following? Then, it's clear which method will 
be called.
  def commitOffsets(retryOnFailure: Boolean)
  def commitOffsets


- Jun


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27684/#review60652
---


On Nov. 8, 2014, 6:20 a.m., Manikumar Reddy O wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27684/
 ---
 
 (Updated Nov. 8, 2014, 6:20 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1743
 https://issues.apache.org/jira/browse/KAFKA-1743
 
 
 Repository: kafka
 
 
 Description
 ---
 
 def commitOffsets method added to make ConsumerConnector backward  compatible
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/consumer/ConsumerConnector.scala 
 07677c1c26768ef9c9032626180d0015f12cb0e0 
 
 Diff: https://reviews.apache.org/r/27684/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Manikumar Reddy O
 




[jira] [Commented] (KAFKA-1282) Disconnect idle socket connection in Selector

2014-11-12 Thread nicu marasoiu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208810#comment-14208810
 ] 

nicu marasoiu commented on KAFKA-1282:
--

Yes, I do want to; I will add a few tests this week.

 Disconnect idle socket connection in Selector
 -

 Key: KAFKA-1282
 URL: https://issues.apache.org/jira/browse/KAFKA-1282
 Project: Kafka
  Issue Type: Bug
  Components: producer 
Affects Versions: 0.8.2
Reporter: Jun Rao
Assignee: nicu marasoiu
  Labels: newbie++
 Fix For: 0.9.0

 Attachments: 1282_access-order.patch, 1282_brush.patch, 
 1282_brushed_up.patch, 
 KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch


 To reduce # socket connections, it would be useful for the new producer to 
 close socket connections that are idle. We can introduce a new producer 
 config for the idle time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27890: Patch for KAFKA-1764

2014-11-12 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27890/#review61099
---

Ship it!


Ship It!

- Guozhang Wang


On Nov. 12, 2014, 10:05 p.m., Jiangjie Qin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27890/
 ---
 
 (Updated Nov. 12, 2014, 10:05 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1764
 https://issues.apache.org/jira/browse/KAFKA-1764
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Changed Consumer iterator to stop putting the shutdown message back into 
 channel.
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/consumer/ConsumerIterator.scala 
 ac491b4da2583ef7227c67f5b8bc0fd731d705c3 
   core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
 fbc680fde21b02f11285a4f4b442987356abd17b 
 
 Diff: https://reviews.apache.org/r/27890/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jiangjie Qin
 




Re: Review Request 27834: Fix KAFKA-1762: add comment on risks using a larger value of max.inflight.requests than 1, in KAFKA-1650 we will add another comment about its risk of data loss

2014-11-12 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27834/#review61123
---

Ship it!



clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java
https://reviews.apache.org/r/27834/#comment102598

I would actually prefer not mentioning the default value comment - (say, 
if we change the default ever).


- Joel Koshy


On Nov. 10, 2014, 10:53 p.m., Guozhang Wang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27834/
 ---
 
 (Updated Nov. 10, 2014, 10:53 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1762
 https://issues.apache.org/jira/browse/KAFKA-1762
 
 
 Repository: kafka
 
 
 Description
 ---
 
 dummy
 
 
 Diffs
 -
 
   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
 9095caf0db1e41a4acb4216fb197626fbd85b806 
 
 Diff: https://reviews.apache.org/r/27834/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Guozhang Wang
 




[jira] [Commented] (KAFKA-1762) Enforce MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION to 1 in MirrorMaker

2014-11-12 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208940#comment-14208940
 ] 

Joel Koshy commented on KAFKA-1762:
---

Committed the doc change to trunk

 Enforce MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION to 1 in MirrorMaker
 -

 Key: KAFKA-1762
 URL: https://issues.apache.org/jira/browse/KAFKA-1762
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1762.patch


 The new Producer client introduces a config for the max # of inFlight 
 messages. When it is set > 1 on MirrorMaker, however, there is a risk of 
 data loss even with KAFKA-1650 because the offsets recorded in the MM's 
 offset map are no longer continuous.
 Another issue is that when this value is set > 1, there is a risk of message 
 re-ordering in the producer.
 Changes:
 1. Set max # of inFlight messages = 1 in MM
 2. Leave comments explaining the risks of changing it
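Change (1) above amounts to pinning a single producer-side setting. As a sketch, the corresponding line in MirrorMaker's producer properties file would look like this (file contents illustrative, not taken from the patch):

```properties
# MirrorMaker's target-cluster producer: allow only one in-flight
# request per connection, avoiding reordering and gaps in the offset map.
max.in.flight.requests.per.connection=1
```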



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Jenkins build is back to normal : Kafka-trunk #328

2014-11-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/Kafka-trunk/328/changes



Re: Kafka consumer transactional support

2014-11-12 Thread Falabella, Anthony
I didn't realize there was a commitOffsets() method on the high level consumer 
(the code is abstracted by the Spring Integration classes).
Yes, this actually suits my needs and I was able to get it to work for my use 
case.
Thank you very much - that was extremely helpful.

In case it's of any use to someone else, here's the solution I came up with.

Spring Configuration file
<int-kafka:inbound-channel-adapter id="kafkaInboundChannelAdapter"
    kafka-consumer-context-ref="kafkaConsumerContext"
    auto-startup="true"
    channel="exchangeKafkaFusionInboundSpringExecutorChannel">
    <int:poller fixed-delay="10" time-unit="MILLISECONDS" max-messages-per-poll="5">
        <!-- Kafka 0.8.1.1 does not have the concept of transactions.  Therefore
             we must handle this on our own. -->
        <!-- To do so on the consumer we've set autocommit.enable=false so
             commits won't automatically be performed as we read msgs. -->
        <!-- The advice-chain below will call
             consumerConfig.getConsumerConnector().commitOffsets() only if no
             exceptions occurred during the processing of this chunk of msgs. -->
        <int:advice-chain>
            <bean id="kafkaConsumerAfterAdvice"
                  class="com.citigroup.tmg.exchgateway.common.util.springintegration.KafkaConsumerAfterAdvice">
                <property name="consumerContext" ref="kafkaConsumerContext"/>
                <property name="consumerGroupId" value="consumerGroupG"/>
            </bean>
        </int:advice-chain>
    </int:poller>
</int-kafka:inbound-channel-adapter>


<int-kafka:producer-context id="kafkaProducerContext">
    <int-kafka:producer-configurations>
        <int-kafka:producer-configuration broker-list="175.65.76.12:9092"
            key-class-type="java.lang.String"
            value-class-type="java.lang.String"
            topic="test"
            compression-codec="default"/>
    </int-kafka:producer-configurations>
</int-kafka:producer-context>


<int-kafka:zookeeper-connect id="zookeeperConnect"
    zk-connect="175.65.76.12:2181"
    zk-connection-timeout="6000"
    zk-session-timeout="6000"
    zk-sync-time="2000"/>

<!-- See http://kafka.apache.org/documentation.html#consumerconfigs -->
<!-- or high-level consumer on http://kafka.apache.org/07/configuration.html
     or https://kafka.apache.org/08/configuration.html -->
<!-- http://grokbase.com/t/kafka/users/12b9bmsy7k/question-about-resetting-offsets-and-the-high-level-consumer -->
<bean id="kafkaConsumerProperties"
      class="org.springframework.beans.factory.config.PropertiesFactoryBean">
    <property name="properties">
        <props>
            <prop key="autocommit.enable">false</prop>
            <prop key="auto.offset.reset">largest</prop>
        </props>
    </property>
</bean>

<int-kafka:consumer-context id="kafkaConsumerContext" consumer-timeout="4000"
    zookeeper-connect="zookeeperConnect"
    consumer-properties="kafkaConsumerProperties">
    <int-kafka:consumer-configurations>
        <int-kafka:consumer-configuration group-id="consumerGroupG" max-messages="2">
            <int-kafka:topic id="test-multi" streams="3"/>
        </int-kafka:consumer-configuration>
    </int-kafka:consumer-configurations>
</int-kafka:consumer-context>




Java Advice Class

public class KafkaConsumerAfterAdvice implements AfterReturningAdvice, InitializingBean {
    private KafkaConsumerContext consumerContext;
    private String consumerGroupId;

    public void setConsumerContext(KafkaConsumerContext consumerContext) {
        this.consumerContext = consumerContext;
    }

    public void setConsumerGroupId(String consumerGroupId) {
        this.consumerGroupId = consumerGroupId;
    }

    /**
     * Spring calls this after the bean has been initialized within the
     * ApplicationContext.
     */
    @Override
    public void afterPropertiesSet() throws Exception {
        Assert.notNull(consumerContext, "[consumerContext] cannot be null");
        Assert.notNull(consumerGroupId, "[consumerGroupId] cannot be null");
    }

    @Override
    public void afterReturning(Object returnValue, Method method, Object[] args, Object target) throws Throwable {
        // If there were messages then returnValue=true, otherwise returnValue=false.
        // Only if true do we need to take the hit to commit the offsets.
        if (returnValue.equals(true)) {
            Iterator<ConsumerConfiguration<String, Object>> consumerConfigIterator =
                    consumerContext.getConsumerConfigurations().iterator();
            while (consumerConfigIterator.hasNext()) {
                ConsumerConfiguration<String, Object> consumerConfig = consumerConfigIterator.next();
                if (consumerGroupId.equals(consumerConfig.getConsumerMetadata().getGroupId())) {
                    consumerConfig.getConsumerConnector().commitOffsets();
                }
            }
        }
    }
}


[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-11-12 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14209055#comment-14209055
 ] 

Joel Koshy commented on KAFKA-1481:
---

(1) and (4) seem equivalent (i.e., AllTopics vs Aggregate) - or are you saying 
that (4) will be AllTopics or AllBrokers as appropriate?

I'm +0 on (5) for the reason I stated above, i.e., it is odd to see "true" when 
browsing mbeans.

I'm +0 on (6) as well, as "topics=aggregate" is a bit odd. The field name 
suggests it is a list of topics, but it is more like a boolean. Between this and 
(5) I prefer (5).

(3) seems reasonable to me, although it is not as clear as having an explicit 
aggregate term in the type. However, I think (1), (2) and (3) do make it clear 
enough what is being aggregated: i.e., bytes-out-per-sec aggregated on topic. I 
actually think Broker should not be there since this is a broker-side mbean 
already. i.e., if we had kafka.server:type=TopicMetrics,name=BytesOutPerSec, 
wouldn't it be clear that the dimension of aggregation is across topics? 
i.e., I think we can just make the dimension clear from the type name.

Likewise, it should be clear (for consumers) that 
FetchRequestAndResponseMetrics is really a broker-level aggregation.


 Stop using dashes AND underscores as separators in MBean names
 --

 Key: KAFKA-1481
 URL: https://issues.apache.org/jira/browse/KAFKA-1481
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
Priority: Critical
  Labels: patch
 Fix For: 0.8.3

 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, 
 KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, 
 KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, 
 KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, 
 KAFKA-1481_2014-10-31_14-35-43.patch, 
 KAFKA-1481_2014-11-03_16-39-41_doc.patch, 
 KAFKA-1481_2014-11-03_17-02-23.patch, 
 KAFKA-1481_2014-11-10_20-39-41_doc.patch, 
 KAFKA-1481_2014-11-10_21-02-23.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch, alternateLayout1.png, 
 alternateLayout2.png, diff-for-alternate-layout1.patch, 
 diff-for-alternate-layout2.patch, originalLayout.png


 MBeans should not use dashes or underscores as separators because these 
 characters are allowed in hostnames, topics, group and consumer IDs, etc., 
 and these are embedded in MBeans names making it impossible to parse out 
 individual bits from MBeans.
 Perhaps a pipe character should be used to avoid the conflict. 
 This looks like a major blocker because it means nobody can write Kafka 0.8.x 
 monitoring tools unless they are doing it for themselves AND do not use 
 dashes AND do not use underscores.
 See: http://search-hadoop.com/m/4TaT4lonIW



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1729) add doc for Kafka-based offset management in 0.8.2

2014-11-12 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14209071#comment-14209071
 ] 

Jun Rao commented on KAFKA-1729:


Thanks for the patch. A few comments.

1. We need to make sure that before people start using the Kafka-based offset 
management in production, they set offsets.topic.num.partitions and 
offsets.topic.replication.factor properly for the offset topic since the 
defaults are not suitable for production usage. Could you add that in 
implementation.html? 

2. It seems that issuing manual OffsetCommitRequest is only needed when using 
SimpleConsumer. We can probably make that clear in the wiki.
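A sketch of the broker-side overrides point (1) recommends, as a server.properties fragment (values illustrative only; choose them for your cluster size and durability needs):

```properties
# server.properties: size the internal offsets topic before first use --
# these settings only take effect when the offsets topic is created.
offsets.topic.num.partitions=50
offsets.topic.replication.factor=3
```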

 add doc for Kafka-based offset management in 0.8.2
 --

 Key: KAFKA-1729
 URL: https://issues.apache.org/jira/browse/KAFKA-1729
 Project: Kafka
  Issue Type: Sub-task
Reporter: Jun Rao
Assignee: Joel Koshy
 Attachments: KAFKA-1782-doc-v1.patch, KAFKA-1782-doc-v2.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1752) add --replace-broker option

2014-11-12 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14209072#comment-14209072
 ] 

Neha Narkhede commented on KAFKA-1752:
--

bq. So it looks like we actually want --add-broker (and transfer some load to 
it) and --decommission-broker (and transfer its load somewhere)?

Right. 

[~Dmitry Pekar] We would like to avoid adding more and more nuanced options to 
the partition reassignment tool that is already too complex. I would suggest 
taking a step back and arriving at a few simple options that would cover all 
use cases. I think that all we need is a way for users to add and decommission 
brokers and the user's expectation would be that the tool comes up with a 
correct assignment that leads to roughly even distribution of partitions as per 
our replica placement strategy.

 add --replace-broker option
 ---

 Key: KAFKA-1752
 URL: https://issues.apache.org/jira/browse/KAFKA-1752
 Project: Kafka
  Issue Type: Sub-task
  Components: tools
Reporter: Dmitry Pekar
Assignee: Dmitry Pekar
 Fix For: 0.8.3






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1762) Update max-inflight-request doc string

2014-11-12 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-1762:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

 Update max-inflight-request doc string
 --

 Key: KAFKA-1762
 URL: https://issues.apache.org/jira/browse/KAFKA-1762
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1762.patch


 The new Producer client introduces a config for the max # of inFlight 
 messages. When it is set > 1 on MirrorMaker, however, there is a risk of 
 data loss even with KAFKA-1650 because the offsets recorded in the MM's 
 offset map are no longer continuous.
 Another issue is that when this value is set > 1, there is a risk of message 
 re-ordering in the producer.
 Changes:
 1. Set max # of inFlight messages = 1 in MM
 2. Leave comments explaining the risks of changing it



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1762) Update max-inflight-request doc string

2014-11-12 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-1762:
--
Summary: Update max-inflight-request doc string  (was: Enforce 
MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION to 1 in MirrorMaker)

 Update max-inflight-request doc string
 --

 Key: KAFKA-1762
 URL: https://issues.apache.org/jira/browse/KAFKA-1762
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Attachments: KAFKA-1762.patch


 The new Producer client introduces a config for the max # of inFlight 
 messages. When it is set > 1 on MirrorMaker, however, there is a risk of 
 data loss even with KAFKA-1650 because the offsets recorded in the MM's 
 offset map are no longer continuous.
 Another issue is that when this value is set > 1, there is a risk of message 
 re-ordering in the producer.
 Changes:
 1. Set max # of inFlight messages = 1 in MM
 2. Leave comments explaining the risks of changing it



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability

2014-11-12 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14209151#comment-14209151
 ] 

Jun Rao commented on KAFKA-1555:


Thanks for the doc patch. Committed to svn after fixing a few typos. Let me 
know if you see any further issue.

 provide strong consistency with reasonable availability
 ---

 Key: KAFKA-1555
 URL: https://issues.apache.org/jira/browse/KAFKA-1555
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Affects Versions: 0.8.1.1
Reporter: Jiang Wu
Assignee: Gwen Shapira
 Fix For: 0.8.2

 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, 
 KAFKA-1555-DOCS.2.patch, KAFKA-1555-DOCS.3.patch, KAFKA-1555-DOCS.4.patch, 
 KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, 
 KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, 
 KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch


 In a mission critical application, we expect a kafka cluster with 3 brokers 
 can satisfy two requirements:
 1. When 1 broker is down, no message loss or service blocking happens.
 2. In worse cases, such as when two brokers are down, service can be blocked, but 
 no message loss happens.
 We found that the current kafka version (0.8.1.1) cannot achieve these requirements 
 due to three of its behaviors:
 1. when choosing a new leader from 2 followers in ISR, the one with fewer 
 messages may be chosen as the leader.
 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
 has fewer messages than the leader.
 3. ISR can contain only 1 broker, therefore acknowledged messages may be 
 stored in only 1 broker.
 The following is an analytical proof. 
 We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
 that at the beginning, all 3 replicas, leader A, followers B and C, are in 
 sync, i.e., they have the same messages and are all in ISR.
 According to the value of request.required.acks (acks for short), there are 
 the following cases.
 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this 
 time, although C hasn't received m, C is still in ISR. If A is killed, C can 
 be elected as the new leader, and consumers will miss m.
 3. acks=-1. B and C restart and are removed from ISR. Producer sends a 
 message m to A, and receives an acknowledgement. Disk failure happens in A 
 before B and C replicate m. Message m is lost.
 In summary, any existing configuration cannot satisfy the requirements.
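For reference, the remedy this ticket introduced pairs acks=-1 with a minimum in-sync-replica bound. A sketch of the relevant settings (values illustrative):

```properties
# producer config: wait for the full ISR to acknowledge
request.required.acks=-1
# broker/topic config: fail acks=-1 produce requests if fewer than this
# many replicas are in sync, so an acknowledged message is on >= 2 brokers
min.insync.replicas=2
```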



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-1555) provide strong consistency with reasonable availability

2014-11-12 Thread Joe Stein (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Stein resolved KAFKA-1555.
--
Resolution: Fixed

 provide strong consistency with reasonable availability
 ---

 Key: KAFKA-1555
 URL: https://issues.apache.org/jira/browse/KAFKA-1555
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Affects Versions: 0.8.1.1
Reporter: Jiang Wu
Assignee: Gwen Shapira
 Fix For: 0.8.2

 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, 
 KAFKA-1555-DOCS.2.patch, KAFKA-1555-DOCS.3.patch, KAFKA-1555-DOCS.4.patch, 
 KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, 
 KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, 
 KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch


 In a mission critical application, we expect a kafka cluster with 3 brokers 
 can satisfy two requirements:
 1. When 1 broker is down, no message loss or service blocking happens.
 2. In worse cases, such as when two brokers are down, service can be blocked, but 
 no message loss happens.
 We found that the current kafka version (0.8.1.1) cannot achieve these requirements 
 due to three of its behaviors:
 1. when choosing a new leader from 2 followers in ISR, the one with fewer 
 messages may be chosen as the leader.
 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
 has less messages than the leader.
 3. ISR can contain only 1 broker, therefore acknowledged messages may be 
 stored in only 1 broker.
 The following is an analytical proof. 
 We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
 that at the beginning, all 3 replicas, leader A, followers B and C, are in 
 sync, i.e., they have the same messages and are all in ISR.
 According to the value of request.required.acks (acks for short), there are 
 the following cases.
 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this 
 time, although C hasn't received m, C is still in ISR. If A is killed, C can 
 be elected as the new leader, and consumers will miss m.
 3. acks=-1. B and C restart and are removed from ISR. Producer sends a 
 message m to A, and receives an acknowledgement. Disk failure happens in A 
 before B and C replicate m. Message m is lost.
 In summary, any existing configuration cannot satisfy the requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Request for subscription

2014-11-12 Thread Sampath Tulava
Hi,

Can I subscribe for this mailing list

Thanks,
Sampath


[jira] [Created] (KAFKA-1768) Expose version via JMX

2014-11-12 Thread Otis Gospodnetic (JIRA)
Otis Gospodnetic created KAFKA-1768:
---

 Summary: Expose version via JMX
 Key: KAFKA-1768
 URL: https://issues.apache.org/jira/browse/KAFKA-1768
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
 Fix For: 0.8.2


See Gwen's code snippet in 
http://search-hadoop.com/m/4TaT4xtk36/Programmatic+Kafka+version+detection%252Fextractionsubj=Programmatic+Kafka+version+detection+extraction+



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)