[jira] [Commented] (KAFKA-1737) Document required ZkSerializer for ZkClient used with AdminUtils
[ https://issues.apache.org/jira/browse/KAFKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207798#comment-14207798 ] Vivek Madani commented on KAFKA-1737: - Hi - Did you mean to enforce ZkStringSerializer on the ZkClient instance passed to AdminUtils.createTopic? Or did you mean changing ZkClient from org.I0Itec.zkclient.ZkClient? If I understand this correctly, since AdminUtils is user-facing, a user can create a ZkClient instance externally and pass it on to AdminUtils. Do you suggest providing an overload in AdminUtils that takes the parameters required to construct ZkClient internally and sets ZkStringSerializer on it? In this case, a doc update may still be required in case someone intends to use the overload which takes ZkClient. Or we could just set ZkStringSerializer on the instance of ZkClient passed to AdminUtils. There are many places where new ZkClient is called within the Kafka code base, and your suggestion to have a createZkClient will help, but we may need a different mechanism for AdminUtils. I am saying this based on my limited understanding of the Kafka code base - correct me if I am missing anything. Document required ZkSerializer for ZkClient used with AdminUtils Key: KAFKA-1737 URL: https://issues.apache.org/jira/browse/KAFKA-1737 Project: Kafka Issue Type: Improvement Components: tools Affects Versions: 0.8.1.1 Reporter: Stevo Slavic Priority: Minor {{ZkClient}} instances passed to {{AdminUtils}} calls must have {{kafka.utils.ZKStringSerializer}} set as {{ZkSerializer}}. Otherwise commands executed via {{AdminUtils}} may not be seen/recognizable to the broker, producer or consumer. E.g. a producer (with auto topic creation turned off) will not be able to send messages to a topic created via {{AdminUtils}}; it will throw {{UnknownTopicOrPartitionException}}. Please consider at least documenting this requirement in the {{AdminUtils}} scaladoc. 
For more info see [related discussion on Kafka user mailing list|http://mail-archives.apache.org/mod_mbox/kafka-users/201410.mbox/%3CCAAUywg-oihNiXuQRYeS%3D8Z3ymsmEHo6ghLs%3Dru4nbm%2BdHVz6TA%40mail.gmail.com%3E]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
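To make the documented requirement concrete, here is a minimal sketch in Scala against the 0.8.x APIs named in this issue. The ZooKeeper address, timeouts, topic name, partition/replication counts are illustrative assumptions, and the code assumes the Kafka core jar on the classpath and a running ZooKeeper:

```scala
import java.util.Properties

import org.I0Itec.zkclient.ZkClient
import kafka.admin.AdminUtils
import kafka.utils.ZKStringSerializer

object CreateTopicExample {
  def main(args: Array[String]): Unit = {
    // The serializer argument is the crucial part: without ZKStringSerializer,
    // AdminUtils writes data under /brokers/topics that brokers, producers and
    // consumers cannot read back, leading to UnknownTopicOrPartitionException.
    val zkClient = new ZkClient("localhost:2181", 30000, 30000, ZKStringSerializer)
    try {
      AdminUtils.createTopic(zkClient, "my-topic", 3, 2, new Properties())
    } finally {
      zkClient.close()
    }
  }
}
```

A ZkClient built with the plain `new ZkClient(connectString)` constructor defaults to Java serialization, which is exactly the failure mode the issue describes.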
[jira] [Commented] (KAFKA-1752) add --replace-broker option
[ https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207900#comment-14207900 ] Dmitry Pekar commented on KAFKA-1752: - [~gwenshap] That, probably, could be implemented. But wouldn't it create unpredictable and unmanageable (from user's point of view) replica redistribution? Also should we consider using a strategy with optimal (minimal number) moving of replicas between brokers? add --replace-broker option --- Key: KAFKA-1752 URL: https://issues.apache.org/jira/browse/KAFKA-1752 Project: Kafka Issue Type: Sub-task Components: tools Reporter: Dmitry Pekar Assignee: Dmitry Pekar Fix For: 0.8.3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-1752) add --replace-broker option
[ https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207900#comment-14207900 ] Dmitry Pekar edited comment on KAFKA-1752 at 11/12/14 10:31 AM: [~gwenshap] That could probably be implemented. 1. But wouldn't it create unpredictable and unmanageable (from the user's point of view) replica redistribution? 2. If 1. is false, should we consider using a strategy with optimal (minimal number of) moves of replicas between brokers? If 1. is false, then we should discuss the strategy of fair redistribution. Need to think about it. Also, this seems to extend the scope of the initial ticket, because this part (--add-broker and fair redistribution of replicas) is the most complicated. was (Author: dmitry pekar): [~gwenshap] That, probably, could be implemented. But wouldn't it create unpredictable and unmanageable (from user's point of view) replica redistribution? Also should we consider using a strategy with optimal (minimal number) moving of replicas between brokers? add --replace-broker option --- Key: KAFKA-1752 URL: https://issues.apache.org/jira/browse/KAFKA-1752 Project: Kafka Issue Type: Sub-task Components: tools Reporter: Dmitry Pekar Assignee: Dmitry Pekar Fix For: 0.8.3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1667) topic-level configuration not validated
[ https://issues.apache.org/jira/browse/KAFKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207957#comment-14207957 ] Dmytro Kostiuchenko commented on KAFKA-1667: Updated reviewboard https://reviews.apache.org/r/27634/diff/ against branch origin/trunk topic-level configuration not validated Key: KAFKA-1667 URL: https://issues.apache.org/jira/browse/KAFKA-1667 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ryan Berdeen Labels: newbie Attachments: KAFKA-1667.patch, KAFKA-1667_2014-11-05_19:43:53.patch, KAFKA-1667_2014-11-06_17:10:14.patch, KAFKA-1667_2014-11-07_14:28:14.patch, KAFKA-1667_2014-11-12_12:49:11.patch I was able to set the configuration for a topic to these invalid values: {code} Topic:topic-config-test PartitionCount:1ReplicationFactor:2 Configs:min.cleanable.dirty.ratio=-30.2,segment.bytes=-1,retention.ms=-12,cleanup.policy=lol {code} It seems that the values are saved as long as they are the correct type, but are not validated like the corresponding broker-level properties. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
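The report above can be reproduced with the stock topic tool. This is a hedged sketch: the ZooKeeper address and topic name are illustrative, and it assumes a running 0.8.1.1 broker and ZooKeeper:

```shell
# Invalid values are accepted: topic-level overrides are parsed for type only,
# not validated against the ranges the broker-level properties enforce.
bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic topic-config-test \
  --partitions 1 --replication-factor 2 \
  --config min.cleanable.dirty.ratio=-30.2 --config segment.bytes=-1 \
  --config retention.ms=-12 --config cleanup.policy=lol

# Describe shows the bogus settings were stored:
bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic topic-config-test
```

The attached patch adds per-property validation to LogConfig so commands like the first one are rejected instead of silently persisted.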
[jira] [Updated] (KAFKA-1667) topic-level configuration not validated
[ https://issues.apache.org/jira/browse/KAFKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmytro Kostiuchenko updated KAFKA-1667: --- Attachment: KAFKA-1667_2014-11-12_12:49:11.patch topic-level configuration not validated Key: KAFKA-1667 URL: https://issues.apache.org/jira/browse/KAFKA-1667 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ryan Berdeen Labels: newbie Attachments: KAFKA-1667.patch, KAFKA-1667_2014-11-05_19:43:53.patch, KAFKA-1667_2014-11-06_17:10:14.patch, KAFKA-1667_2014-11-07_14:28:14.patch, KAFKA-1667_2014-11-12_12:49:11.patch I was able to set the configuration for a topic to these invalid values: {code} Topic:topic-config-test PartitionCount:1ReplicationFactor:2 Configs:min.cleanable.dirty.ratio=-30.2,segment.bytes=-1,retention.ms=-12,cleanup.policy=lol {code} It seems that the values are saved as long as they are the correct type, but are not validated like the corresponding broker-level properties. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 27634: Patch for KAFKA-1667
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27634/ --- (Updated Nov. 12, 2014, 11:49 a.m.) Review request for kafka. Bugs: KAFKA-1667 https://issues.apache.org/jira/browse/KAFKA-1667 Repository: kafka Description (updated) --- KAFKA-1667 Fixed bugs in LogConfig. Added test and documentation KAFKA-1667 Updated tests to reflect new boolean property parsing logic KAFKA-1667 renamed methods to match naming convention KAFKA-1667 Added unit test to cover invalid configuration case KAFKA-1667 Strict UncleanLeaderElection property parsing Diffs (updated) - clients/src/main/java/org/apache/kafka/common/config/ConfigDef.java c4cea2cc072f4db4ce014b63d226431d3766bef1 core/src/main/scala/kafka/admin/TopicCommand.scala 0b2735e7fc42ef9894bef1997b1f06a8ebee5439 core/src/main/scala/kafka/log/LogConfig.scala e48922a97727dd0b98f3ae630ebb0af3bef2373d core/src/main/scala/kafka/utils/Utils.scala 23aefb4715b177feae1d2f83e8b910653ea10c5f core/src/test/scala/kafka/log/LogConfigTest.scala PRE-CREATION core/src/test/scala/unit/kafka/integration/UncleanLeaderElectionTest.scala f44568cb25edf25db857415119018fd4c9922f61 Diff: https://reviews.apache.org/r/27634/diff/ Testing --- Thanks, Dmytro Kostiuchenko
[jira] [Commented] (KAFKA-1667) topic-level configuration not validated
[ https://issues.apache.org/jira/browse/KAFKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207968#comment-14207968 ] Dmytro Kostiuchenko commented on KAFKA-1667: Can't assign issue to myself. Get an exception when running kafka-patch-review.py. {code}jira.exceptions.JIRAError: HTTP 400: Field 'assignee' cannot be set. It is not on the appropriate screen, or unknown{code} Also don't have Assign to me button in JIRA. topic-level configuration not validated Key: KAFKA-1667 URL: https://issues.apache.org/jira/browse/KAFKA-1667 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ryan Berdeen Labels: newbie Attachments: KAFKA-1667.patch, KAFKA-1667_2014-11-05_19:43:53.patch, KAFKA-1667_2014-11-06_17:10:14.patch, KAFKA-1667_2014-11-07_14:28:14.patch, KAFKA-1667_2014-11-12_12:49:11.patch I was able to set the configuration for a topic to these invalid values: {code} Topic:topic-config-test PartitionCount:1ReplicationFactor:2 Configs:min.cleanable.dirty.ratio=-30.2,segment.bytes=-1,retention.ms=-12,cleanup.policy=lol {code} It seems that the values are saved as long as they are the correct type, but are not validated like the corresponding broker-level properties. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1667) topic-level configuration not validated
[ https://issues.apache.org/jira/browse/KAFKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmytro Kostiuchenko updated KAFKA-1667: --- Status: Patch Available (was: Open) topic-level configuration not validated Key: KAFKA-1667 URL: https://issues.apache.org/jira/browse/KAFKA-1667 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ryan Berdeen Labels: newbie Attachments: KAFKA-1667_2014-11-05_19:43:53.patch, KAFKA-1667_2014-11-06_17:10:14.patch, KAFKA-1667_2014-11-07_14:28:14.patch, KAFKA-1667_2014-11-12_12:49:11.patch I was able to set the configuration for a topic to these invalid values: {code} Topic:topic-config-test PartitionCount:1ReplicationFactor:2 Configs:min.cleanable.dirty.ratio=-30.2,segment.bytes=-1,retention.ms=-12,cleanup.policy=lol {code} It seems that the values are saved as long as they are the correct type, but are not validated like the corresponding broker-level properties. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1667) topic-level configuration not validated
[ https://issues.apache.org/jira/browse/KAFKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmytro Kostiuchenko updated KAFKA-1667: --- Attachment: (was: KAFKA-1667.patch) topic-level configuration not validated Key: KAFKA-1667 URL: https://issues.apache.org/jira/browse/KAFKA-1667 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ryan Berdeen Labels: newbie Attachments: KAFKA-1667_2014-11-05_19:43:53.patch, KAFKA-1667_2014-11-06_17:10:14.patch, KAFKA-1667_2014-11-07_14:28:14.patch, KAFKA-1667_2014-11-12_12:49:11.patch I was able to set the configuration for a topic to these invalid values: {code} Topic:topic-config-test PartitionCount:1ReplicationFactor:2 Configs:min.cleanable.dirty.ratio=-30.2,segment.bytes=-1,retention.ms=-12,cleanup.policy=lol {code} It seems that the values are saved as long as they are the correct type, but are not validated like the corresponding broker-level properties. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 27684: Patch for KAFKA-1743
On Nov. 10, 2014, 7:50 p.m., Jun Rao wrote: core/src/main/scala/kafka/consumer/ConsumerConnector.scala, lines 76-80 https://reviews.apache.org/r/27684/diff/2/?file=755292#file755292line76 We will also need to change the interface in ConsumerConnector from def commitOffsets(retryOnFailure: Boolean = true) back to def commitOffsets In ZookeeperConsumerConnector, we can make the following method private: def commitOffsets(retryOnFailure: Boolean = true) Another question: will the Scala compiler be confused with 2 methods, one w/o parentheses and one with 1 parameter having a default? Could you try compiling the code on all Scala versions? Currently the classes below use the new method commitOffsets(true). kafka/javaapi/consumer/ZookeeperConsumerConnector.scala kafka/tools/TestEndToEndLatency.scala If we are changing the interface, then we need to change the above classes also. If we are not fixing this on trunk, then the same problem will come up in 0.8.3. How to handle this? The 2 methods, one w/o parentheses and one with 1 parameter having a default, compile on all Scala versions. - Manikumar Reddy --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27684/#review60652 --- On Nov. 8, 2014, 6:20 a.m., Manikumar Reddy O wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27684/ --- (Updated Nov. 8, 2014, 6:20 a.m.) Review request for kafka. Bugs: KAFKA-1743 https://issues.apache.org/jira/browse/KAFKA-1743 Repository: kafka Description --- def commitOffsets method added to make ConsumerConnector backward compatible Diffs - core/src/main/scala/kafka/consumer/ConsumerConnector.scala 07677c1c26768ef9c9032626180d0015f12cb0e0 Diff: https://reviews.apache.org/r/27684/diff/ Testing --- Thanks, Manikumar Reddy O
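Manikumar's compile check can be reproduced with a stripped-down, self-contained sketch; the names here are illustrative stand-ins, not the real ConsumerConnector:

```scala
// A parameterless method can coexist with an overload that takes one defaulted
// parameter: Scala only forbids declaring defaults on *multiple* overloads of
// the same method, and only one of these two declares a default.
trait Connector {
  def commitOffsets: Unit                                  // old, parameterless signature
  def commitOffsets(retryOnFailure: Boolean = true): Unit  // new signature with a default
}

class ZkConnector extends Connector {
  def commitOffsets: Unit = commitOffsets(retryOnFailure = true)
  def commitOffsets(retryOnFailure: Boolean = true): Unit =
    println(s"committing offsets, retryOnFailure=$retryOnFailure")
}
```

A bare call `c.commitOffsets` resolves to the parameterless method, while any call supplying parentheses and/or the argument resolves to the Boolean overload, which matches the "compiles on all Scala versions" observation above.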
Re: Review Request 27890: Patch for KAFKA-1764
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27890/#review61001 --- Could you also remove the line in consumer that sends back the shutdown command? - Guozhang Wang On Nov. 11, 2014, 10:59 p.m., Jiangjie Qin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27890/ --- (Updated Nov. 11, 2014, 10:59 p.m.) Review request for kafka. Bugs: KAFKA-1764 https://issues.apache.org/jira/browse/KAFKA-1764 Repository: kafka Description --- fix for KAFKA-1764 Diffs - core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala fbc680fde21b02f11285a4f4b442987356abd17b Diff: https://reviews.apache.org/r/27890/diff/ Testing --- Thanks, Jiangjie Qin
Re: Kafka Command Line Shell
Thanks Joe. I will read the wiki page. On Tue, Nov 11, 2014 at 11:47 PM, Joe Stein joe.st...@stealth.ly wrote: I started writing this up on the wiki https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements Instead of starting a new thread I figured I'd just continue this one I started. I also added another (important) component for centralized management of configuration at a global level, much like we have at the topic level. These global configurations would be overridden (perhaps not all) by the server.properties on start (so, like, in case one broker needs a different port, sure). One concern I have is that using RQ/RP wire protocol to the controller instead of the current way (via ZK admin path) may expose concurrency on the admin requests, which may not be supported yet. Guozhang, take a look at the diagram for how I am thinking of this; it would be a new handle request that will execute the tools pretty much how they are today. My thinking is maybe to do one at a time (so TopicCommand first, I think) and have what the TopicCommand is doing happen on the server and send the RQ/RP to the client but execute on the server. If there is something not supported we will of course have to deal with that and implement it, for sure. Once we get one working end to end I think adding the rest will be (more or less) concise iterations to get it done. I added your concern to the wiki under the gotchas section. /*** Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop / On Mon, Oct 20, 2014 at 2:15 AM, Guozhang Wang wangg...@gmail.com wrote: One concern I have is that using RQ/RP wire protocol to the controller instead of the current way (via ZK admin path) may expose concurrency on the admin requests, which may not be supported yet. Some initial discussion about this is on KAFKA-1305. 
Guozhang On Sun, Oct 19, 2014 at 1:55 PM, Joe Stein joe.st...@stealth.ly wrote: Maybe we should add some AdminMessage RQ/RP wire protocol structure(s) and let the controller handle it? We could then build the CLI and Shell in the project both as useful tools and samples for others. Making a http interface should be simple after KAFKA-1494 is done which all client libraries could offer. I will update the design tonight/tomorrow and should be able to have someone starting to work on it this week. /*** Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop / On Oct 19, 2014 1:21 PM, Harsha ka...@harsha.io wrote: +1 for Web Api On Sat, Oct 18, 2014, at 11:48 PM, Glen Mazza wrote: Apache Karaf has been doing this for quite a few years, albeit in Java not Scala. Still, their coding approach to creating a CLI probably captures many lessons learned over that time. Glen On 10/17/2014 08:03 PM, Joe Stein wrote: Hi, I have been thinking about the ease of use for operations with Kafka. We have lots of tools doing a lot of different things and they are all kind of in different places. So, what I was thinking is to have a single interface for our tooling https://issues.apache.org/jira/browse/KAFKA-1694 This would manifest itself in two ways 1) a command line interface 2) a repl We would have one entry point centrally for all Kafka commands. kafka CMD ARGS kafka createTopic --brokerList etc, kafka reassignPartition --brokerList etc, or execute and run the shell kafka --brokerList localhost kafkause topicName; kafkaset acl='label'; I was thinking that all calls would be initialized through --brokerList and the broker can tell the KafkaCommandTool what server to connect to for MetaData. Thoughts? Tomatoes? /*** Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop / -- -- Guozhang -- -- Guozhang
[jira] [Commented] (KAFKA-1737) Document required ZkSerializer for ZkClient used with AdminUtils
[ https://issues.apache.org/jira/browse/KAFKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208275#comment-14208275 ] Guozhang Wang commented on KAFKA-1737: -- Hi Vivek, Here are my thoughts: since currently we only use a single data format (ZkStringSerializer) in Kafka's ZK, we could just enforce it in ZkClient construction time; but as you mentioned, people can pass any ZkClient instances to the AdminUtils API functions using new ZkClient it is a bit hard to enforce it programmatically, and instead I was proposing to add a new createZkClient function and let people to use it instead of calling new ZkClient to create new instances. Of course we still need to change the docs telling people to do so, not using new. Document required ZkSerializer for ZkClient used with AdminUtils Key: KAFKA-1737 URL: https://issues.apache.org/jira/browse/KAFKA-1737 Project: Kafka Issue Type: Improvement Components: tools Affects Versions: 0.8.1.1 Reporter: Stevo Slavic Priority: Minor {{ZkClient}} instances passed to {{AdminUtils}} calls must have {{kafka.utils.ZKStringSerializer}} set as {{ZkSerializer}}. Otherwise commands executed via {{AdminUtils}} may not be seen/recognizable to broker, producer or consumer. E.g. producer (with auto topic creation turned off) will not be able to send messages to a topic created via {{AdminUtils}}, it will throw {{UnknownTopicOrPartitionException}}. Please consider at least documenting this requirement in {{AdminUtils}} scaladoc. For more info see [related discussion on Kafka user mailing list|http://mail-archives.apache.org/mod_mbox/kafka-users/201410.mbox/%3CCAAUywg-oihNiXuQRYeS%3D8Z3ymsmEHo6ghLs%3Dru4nbm%2BdHVz6TA%40mail.gmail.com%3E]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1173) Using Vagrant to get up and running with Apache Kafka
[ https://issues.apache.org/jira/browse/KAFKA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208281#comment-14208281 ] Joe Stein commented on KAFKA-1173: -- [~ewencp] I am +1 on the virtual box parts of the patch and the updates you last made (I really like how you added the vagrant part to the main README, nice touch). I am having issues with the EC2 pieces but am pretty convinced it is how my account is set up with VPC, so I am going to set up a new account and try it again. I may not have a chance to do that until the weekend, FYI. Using Vagrant to get up and running with Apache Kafka - Key: KAFKA-1173 URL: https://issues.apache.org/jira/browse/KAFKA-1173 Project: Kafka Issue Type: Improvement Reporter: Joe Stein Assignee: Ewen Cheslack-Postava Attachments: KAFKA-1173.patch, KAFKA-1173_2013-12-07_12:07:55.patch, KAFKA-1173_2014-11-11_13:50:55.patch Vagrant has been getting a lot of pickup in the tech communities. I have found it very useful for development and testing and working with a few clients now using it to help virtualize their environments in repeatable ways. Using Vagrant to get up and running. For 0.8.0 I have a patch on github https://github.com/stealthly/kafka 1) Install Vagrant [http://www.vagrantup.com/](http://www.vagrantup.com/) 2) Install Virtual Box [https://www.virtualbox.org/](https://www.virtualbox.org/) In the main kafka folder 1) ./sbt update 2) ./sbt package 3) ./sbt assembly-package-dependency 4) vagrant up once this is done * Zookeeper will be running 192.168.50.5 * Broker 1 on 192.168.50.10 * Broker 2 on 192.168.50.20 * Broker 3 on 192.168.50.30 When you are all up and running you will be back at a command prompt. If you want you can log in to the machines using vagrant ssh machineName but you don't need to. You can access the brokers and zookeeper by their IP e.g. 
bin/kafka-console-producer.sh --broker-list 192.168.50.10:9092,192.168.50.20:9092,192.168.50.30:9092 --topic sandbox bin/kafka-console-consumer.sh --zookeeper 192.168.50.5:2181 --topic sandbox --from-beginning -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names
[ https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208284#comment-14208284 ] Vladimir Tretyakov commented on KAFKA-1481: --- Maybe somebody can answer my last questions? Have to finish with this patch and moving forward! Thx. Stop using dashes AND underscores as separators in MBean names -- Key: KAFKA-1481 URL: https://issues.apache.org/jira/browse/KAFKA-1481 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1 Reporter: Otis Gospodnetic Priority: Critical Labels: patch Fix For: 0.8.3 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, KAFKA-1481_2014-10-31_14-35-43.patch, KAFKA-1481_2014-11-03_16-39-41_doc.patch, KAFKA-1481_2014-11-03_17-02-23.patch, KAFKA-1481_2014-11-10_20-39-41_doc.patch, KAFKA-1481_2014-11-10_21-02-23.patch, KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch, alternateLayout1.png, alternateLayout2.png, diff-for-alternate-layout1.patch, diff-for-alternate-layout2.patch, originalLayout.png MBeans should not use dashes or underscores as separators because these characters are allowed in hostnames, topics, group and consumer IDs, etc., and these are embedded in MBeans names making it impossible to parse out individual bits from MBeans. Perhaps a pipe character should be used to avoid the conflict. This looks like a major blocker because it means nobody can write Kafka 0.8.x monitoring tools unless they are doing it for themselves AND do not use dashes AND do not use underscores. See: http://search-hadoop.com/m/4TaT4lonIW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1684) Implement TLS/SSL authentication
[ https://issues.apache.org/jira/browse/KAFKA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208321#comment-14208321 ] Jun Rao commented on KAFKA-1684: Gwen, Thanks for the comment. Having 3 different ports is probably fine. My point is that since adding a port requires inter-broker request format changes (and rolling upgrades with this kind of changes is a bit tricky), it would be good if we do the request change just once. Perhaps we can work out the needed request format changes for both SSL and SASL first. Regarding finding a good model to mimic, it seems that HDFS supports both Kerberos and SSL. Is that a better model to look into? Implement TLS/SSL authentication Key: KAFKA-1684 URL: https://issues.apache.org/jira/browse/KAFKA-1684 Project: Kafka Issue Type: Sub-task Components: security Affects Versions: 0.9.0 Reporter: Jay Kreps Assignee: Ivan Lyutov Attachments: KAFKA-1684.patch Add an SSL port to the configuration and advertise this as part of the metadata request. If the SSL port is configured the socket server will need to add a second Acceptor thread to listen on it. Connections accepted on this port will need to go through the SSL handshake prior to being registered with a Processor for request processing. SSL requests and responses may need to be wrapped or unwrapped using the SSLEngine that was initialized by the acceptor. This wrapping and unwrapping is very similar to what will need to be done for SASL-based authentication schemes. We should have a uniform interface that covers both of these and we will need to store the instance in the session with the request. The socket server will have to use this object when reading and writing requests. We will need to take care with the FetchRequests as the current FileChannel.transferTo mechanism will be incompatible with wrap/unwrap so we can only use this optimization for unencrypted sockets that don't require userspace translation (wrapping). 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 25995: Patch for KAFKA-1650
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25995/ --- (Updated Nov. 12, 2014, 5:51 p.m.) Review request for kafka. Summary (updated) - Patch for KAFKA-1650 Bugs: KAFKA-1650 and KAKFA-1650 https://issues.apache.org/jira/browse/KAFKA-1650 https://issues.apache.org/jira/browse/KAKFA-1650 Repository: kafka Description --- Addressed Guozhang's comments. Addressed Guozhang's comments commit before switch to trunk commit before rebase Rebased on trunk, Addressed Guozhang's comments. Addressed Guozhang's comments on MaxInFlightRequests Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into mirrormaker-redesign Diffs (updated) - core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala fbc680fde21b02f11285a4f4b442987356abd17b core/src/main/scala/kafka/tools/MirrorMaker.scala f399105087588946987bbc84e3759935d9498b6a Diff: https://reviews.apache.org/r/25995/diff/ Testing --- Thanks, Jiangjie Qin
Re: Review Request 25995: Patch for KAKFA-1650
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25995/ --- (Updated Nov. 12, 2014, 5:51 p.m.) Review request for kafka. Summary (updated) - Patch for KAKFA-1650 Bugs: KAFKA-1650 and KAKFA-1650 https://issues.apache.org/jira/browse/KAFKA-1650 https://issues.apache.org/jira/browse/KAKFA-1650 Repository: kafka Description (updated) --- Addressed Guozhang's comments. Addressed Guozhang's comments commit before switch to trunk commit before rebase Rebased on trunk, Addressed Guozhang's comments. Addressed Guozhang's comments on MaxInFlightRequests Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into mirrormaker-redesign Diffs (updated) - core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala fbc680fde21b02f11285a4f4b442987356abd17b core/src/main/scala/kafka/tools/MirrorMaker.scala f399105087588946987bbc84e3759935d9498b6a Diff: https://reviews.apache.org/r/25995/diff/ Testing --- Thanks, Jiangjie Qin
[jira] [Updated] (KAFKA-1650) Mirror Maker could lose data on unclean shutdown.
[ https://issues.apache.org/jira/browse/KAFKA-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiangjie Qin updated KAFKA-1650: Attachment: KAFKA-1650_2014-11-12_09:51:30.patch Mirror Maker could lose data on unclean shutdown. - Key: KAFKA-1650 URL: https://issues.apache.org/jira/browse/KAFKA-1650 Project: Kafka Issue Type: Improvement Reporter: Jiangjie Qin Assignee: Jiangjie Qin Attachments: KAFKA-1650.patch, KAFKA-1650_2014-10-06_10:17:46.patch, KAFKA-1650_2014-11-12_09:51:30.patch Currently if mirror maker got shutdown uncleanly, the data in the data channel and buffer could potentially be lost. With the new producer's callback, this issue could be solved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1650) Mirror Maker could lose data on unclean shutdown.
[ https://issues.apache.org/jira/browse/KAFKA-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208327#comment-14208327 ] Jiangjie Qin commented on KAFKA-1650: - Updated reviewboard https://reviews.apache.org/r/25995/diff/ against branch origin/trunk Mirror Maker could lose data on unclean shutdown. - Key: KAFKA-1650 URL: https://issues.apache.org/jira/browse/KAFKA-1650 Project: Kafka Issue Type: Improvement Reporter: Jiangjie Qin Assignee: Jiangjie Qin Attachments: KAFKA-1650.patch, KAFKA-1650_2014-10-06_10:17:46.patch, KAFKA-1650_2014-11-12_09:51:30.patch Currently if mirror maker got shutdown uncleanly, the data in the data channel and buffer could potentially be lost. With the new producer's callback, this issue could be solved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
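The callback-based approach the ticket alludes to can be sketched as follows. This is an illustration of the idea with the new producer API, not the actual MirrorMaker patch: the class name and `mirror` method are hypothetical, the producer `Properties` are assumed to be fully configured, and the offset bookkeeping is simplified to a single partition:

```scala
import java.util.Properties

import org.apache.kafka.clients.producer.{Callback, KafkaProducer, ProducerRecord, RecordMetadata}

class MirrorSketch(producerProps: Properties) {
  private val producer = new KafkaProducer[Array[Byte], Array[Byte]](producerProps)

  // Highest source offset known to be safely produced; advanced only in the callback.
  @volatile private var safeOffset = -1L

  def mirror(topic: String, sourceOffset: Long, key: Array[Byte], value: Array[Byte]): Unit = {
    producer.send(new ProducerRecord(topic, key, value), new Callback {
      override def onCompletion(md: RecordMetadata, e: Exception): Unit = {
        // Only acknowledge the source offset once the target cluster has the message,
        // so an unclean shutdown replays data instead of losing it.
        if (e == null) safeOffset = math.max(safeOffset, sourceOffset)
      }
    })
  }

  def offsetToCommit: Long = safeOffset
}
```

A real implementation has to commit the lowest unacknowledged offset rather than the maximum acknowledged one (a later send can succeed while an earlier one is still in flight); this sketch only illustrates tying commit progress to producer acknowledgements.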
[jira] [Comment Edited] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names
[ https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208284#comment-14208284 ] Vladimir Tretyakov edited comment on KAFKA-1481 at 11/12/14 5:54 PM: - Maybe somebody can answer my last questions? Have to finish with this patch and moving forward! Thx. Will extract Kafka version like Gwen Shapira has suggested in http://search-hadoop.com/m/4TaT4xtk36/Programmatic+Kafka+version+detection%252Fextractionsubj=Programmatic+Kafka+version+detection+extraction+ {quote} So it looks like we can use Gradle to add properties to manifest file and then use getResourceAsStream to read the file and parse it. The Gradle part would be something like:
{code}
jar.manifest {
  attributes('Implementation-Title': project.name,
    'Implementation-Version': project.version,
    'Built-By': System.getProperty('user.name'),
    'Built-JDK': System.getProperty('java.version'),
    'Built-Host': getHostname(),
    'Source-Compatibility': project.sourceCompatibility,
    'Target-Compatibility': project.targetCompatibility
  )
}
{code}
The code part would be:
{code}
this.getClass().getClassLoader().getResourceAsStream("/META-INF/MANIFEST.MF")
{code}
Does that look like the right approach? {quote} What do you think? What about 65? was (Author: vladimir.tretyakov): Maybe somebody can answer my last questions? Have to finish with this patch and moving forward! Thx. 
Stop using dashes AND underscores as separators in MBean names -- Key: KAFKA-1481 URL: https://issues.apache.org/jira/browse/KAFKA-1481 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1 Reporter: Otis Gospodnetic Priority: Critical Labels: patch Fix For: 0.8.3 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, KAFKA-1481_2014-10-31_14-35-43.patch, KAFKA-1481_2014-11-03_16-39-41_doc.patch, KAFKA-1481_2014-11-03_17-02-23.patch, KAFKA-1481_2014-11-10_20-39-41_doc.patch, KAFKA-1481_2014-11-10_21-02-23.patch, KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch, alternateLayout1.png, alternateLayout2.png, diff-for-alternate-layout1.patch, diff-for-alternate-layout2.patch, originalLayout.png MBeans should not use dashes or underscores as separators because these characters are allowed in hostnames, topics, group and consumer IDs, etc., and these are embedded in MBeans names making it impossible to parse out individual bits from MBeans. Perhaps a pipe character should be used to avoid the conflict. This looks like a major blocker because it means nobody can write Kafka 0.8.x monitoring tools unless they are doing it for themselves AND do not use dashes AND do not use underscores. See: http://search-hadoop.com/m/4TaT4lonIW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
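The manifest-reading approach quoted in the comment above can be sketched in Scala. Hedged: the object name and "unknown" fallback are assumptions, and it reads whichever /META-INF/MANIFEST.MF appears first on the classpath:

```scala
import java.util.jar.Manifest

object KafkaVersion {
  // Reads Implementation-Version from a jar manifest, as the quoted comment
  // suggests; returns "unknown" when no manifest is visible or the attribute
  // is absent (e.g. when running from unpackaged classes).
  def implementationVersion: String = {
    val in = getClass.getResourceAsStream("/META-INF/MANIFEST.MF")
    if (in == null) "unknown"
    else
      try Option(new Manifest(in).getMainAttributes.getValue("Implementation-Version"))
            .getOrElse("unknown")
      finally in.close()
  }
}
```

One caveat with this approach: `getResourceAsStream` returns the first MANIFEST.MF on the classpath, which may belong to a different jar; a robust implementation would locate the manifest of the jar that actually contains the class.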
- Re: Kafka consumer transactional support
Hi Jun, Thanks for taking a look at my issue and also for updating the future release plan Wiki page. My use case is to use Kafka as if it were a JMS provider (messaging use case). I'm currently using Kafka 0.8.1.1 with Java and specifically the Spring Integration Kafka Inbound Channel Adapter. Internally that adapter uses the HighLevelConsumer which shields the caller from the internals of offsets. Let's take the case where a consumer-group reads a number of messages and then is abruptly terminated before properly processing those messages. In that scenario upon restart ideally we'd begin reading at the offset we were at prior to abruptly terminating. If we have auto.commit.enable=true upon restart those messages will be considered already read and will be skipped. Setting auto.commit.enable=false would help in this case but now we'd have to manually call on the offset manager, requiring the use of the SimpleConsumer. In my use-case, it's acceptable for some manual intervention to say, reprocess messages X through Y, but to do so would require us to know the exact offset we had started at prior to the chunk that was read in when the JVM abnormally terminated. Perhaps I could look at the underlying ExportZkOffsets and ImportZkOffsets Java classes mentioned in this link, but at the very least I'd need to log the timestamp just prior to my read to be used in that query per: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Idon'twantmyconsumer'soffsetstobecommittedautomatically.CanImanuallymanagemyconsumer'soffsets? It sounds like the ConsumerAPI rewrite mentioned in these links might be helpful in my situation (potentially targeted for Apr 2015): https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite#ClientRewrite-ConsumerAPI https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Consumer+Rewrite+Design In the meantime if you have any suggestions for things I might be able to use to work around my concern I'd be appreciative. 
Again I'm on 0.8.1.1 but would be willing to look at 0.8.2 if it offered anything to help with my use-case. Thanks Tony
[jira] [Created] (KAFKA-1767) /admin/reassign_partitions deleted before reassignment completes
Ryan Berdeen created KAFKA-1767: --- Summary: /admin/reassign_partitions deleted before reassignment completes Key: KAFKA-1767 URL: https://issues.apache.org/jira/browse/KAFKA-1767 Project: Kafka Issue Type: Bug Components: controller Affects Versions: 0.8.1.1 Reporter: Ryan Berdeen Assignee: Neha Narkhede https://github.com/apache/kafka/blob/0.8.1.1/core/src/main/scala/kafka/controller/KafkaController.scala#L477-L517 describes the process of reassigning partitions. Specifically, by the time {{/admin/reassign_partitions}} is updated, the newly assigned replicas (RAR) should be in sync, and the assigned replicas (AR) in ZooKeeper should be updated: {code} 4. Wait until all replicas in RAR are in sync with the leader. ... 10. Update AR in ZK with RAR. 11. Update the /admin/reassign_partitions path in ZK to remove this partition. {code} This worked in 0.8.1, but in 0.8.1.1 we observe {{/admin/reassign_partitions}} being removed before step 4 has completed. For example, if we have AR [1,2] and then put [3,4] in {{/admin/reassign_partitions}}, the cluster will end up with AR [1,2,3,4] and ISR [1,2] when the key is removed. Eventually, the AR will be updated to [3,4]. This means that the {{kafka-reassign-partitions.sh}} tool will accept a new batch of reassignments before the current reassignments have finished, and our own tool that feeds in reassignments in small batches (see KAFKA-1677) can't rely on this key to detect active reassignments. Although we haven't observed this, it seems likely that if a controller resignation happens, the new controller won't know that a reassignment is in progress, and the AR will never be updated to the RAR. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names
[ https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208497#comment-14208497 ] Jun Rao commented on KAFKA-1481: 65. The following are some of the choices that we have. (1) kafka.server:type=BrokerTopicMetrics,name=AggregateBytesOutPerSec (2) kafka.server:type=AggregateBrokerTopicMetrics,name=BytesOutPerSec (3) kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec (4) kafka.server:type=BrokerTopicMetrics,name=AllTopicsBytesOutPerSec (5) kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,allTopics=true (6) kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topics=aggregate The following is my take. The issue with (1), (2) and (3) is that it's not obvious which dimension is being aggregated upon. I also don't quite like (2) since it breaks the convention that type is the class name. If we do go with this route, I'd prefer that we explicitly create an AggregateBrokerTopicMetrics class instead of sneaking in the prefix in KafkaMetricsGroup. (4), (5) and (6) will all make it clear which dimension is being aggregated upon. (4) is a bit weird now that we support tags since the main purpose of tags is that we don't have to squeeze everything into a single name. So, either (5) or (6) looks reasonable to me. Also, I am not sure how jconsole displays mbeans, but the key/value pairs in the mbean name are supposed to be unordered. [~jjkoshy], what's your take? As for the mbean for the Kafka version, could we do that in a separate jira? The approach seems reasonable. 
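Jun's two points here, that tag key/value pairs keep each dimension separately addressable and that the pairs are unordered, can be checked directly with `javax.management.ObjectName`. The names below are the candidate spellings under discussion (option (5) with an `allTopics` tag), not shipped Kafka metrics:

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class MBeanNameDemo {

    // Option (5) from the discussion: an explicit allTopics tag marks the aggregate.
    static ObjectName aggregateBytesOut() throws MalformedObjectNameException {
        return new ObjectName(
            "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,allTopics=true");
    }

    // Each dimension comes back whole, with no string splitting on '-' or '_'.
    static String keyOf(String key) throws MalformedObjectNameException {
        return aggregateBytesOut().getKeyProperty(key);
    }

    // Key/value pairs in an ObjectName are unordered: two spellings of the
    // same name compare equal, because equality uses the canonical name.
    static boolean orderInsensitive() throws MalformedObjectNameException {
        ObjectName other = new ObjectName(
            "kafka.server:name=BytesOutPerSec,allTopics=true,type=BrokerTopicMetrics");
        return aggregateBytesOut().equals(other);
    }

    public static void main(String[] args) throws MalformedObjectNameException {
        System.out.println(keyOf("type"));      // BrokerTopicMetrics
        System.out.println(keyOf("allTopics")); // true
        System.out.println(orderInsensitive()); // true
    }
}
```

This is also why monitoring tools only break when dimensions are packed into a single dash- or underscore-separated `name` value; once each dimension is its own key, `getKeyProperty` recovers it regardless of what characters the topic or host name contains.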
Re: Review Request 27735: Patch for KAFKA-1173
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27735/ --- (Updated Nov. 12, 2014, 7:32 p.m.) Review request for kafka. Bugs: KAFKA-1173 https://issues.apache.org/jira/browse/KAFKA-1173 Repository: kafka Description (updated) --- Add basic EC2 support, cleaner Vagrantfile, README cleanup, etc. Better naming, hostmanager for routable VM names, vagrant-cachier to reduce startup cost, cleanup provisioning scripts, initial support for multiple zookeepers, general cleanup. Don't sync a few directories that aren't actually required on the server. Add generic worker node support. Default # of workers should be 0 Add support for Zookeeper clusters. This requires us to split up allocating VMs and provisioning because Vagrant will run the provisioner for the first node before all nodes are allocated. This leaves the first node running Zookeeper with unroutable peer hostnames which it, for some reason, caches as unroutable. The cluster never properly finishes forming since the nodes are unable to open connections to nodes booted later than they were. The simple solution is to make sure all nodes are booted before starting configuration so we have all the addresses and hostnames available and routable. Fix AWS provider commands in Vagrant README. Addressing Joe's comments. Add support for EC2 VPC settings. Update Vagrant README to use --no-parallel when using EC2. There's an issue that causes Vagrant to hang when running in parallel. The last message is from vagrant-hostmanager, but it's not clear if it is the actual cause. Diffs (updated) - .gitignore 99b32a6770e3da59bc0167d77d45ca339ac3dbbd README.md 9aca90664b2a80a37125775ddbdea06ba6c53644 Vagrantfile PRE-CREATION vagrant/README.md PRE-CREATION vagrant/base.sh PRE-CREATION vagrant/broker.sh PRE-CREATION vagrant/zk.sh PRE-CREATION Diff: https://reviews.apache.org/r/27735/diff/ Testing --- Thanks, Ewen Cheslack-Postava
[jira] [Commented] (KAFKA-1173) Using Vagrant to get up and running with Apache Kafka
[ https://issues.apache.org/jira/browse/KAFKA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208512#comment-14208512 ] Ewen Cheslack-Postava commented on KAFKA-1173: -- Updated reviewboard https://reviews.apache.org/r/27735/diff/ against branch origin/trunk Using Vagrant to get up and running with Apache Kafka - Key: KAFKA-1173 URL: https://issues.apache.org/jira/browse/KAFKA-1173 Project: Kafka Issue Type: Improvement Reporter: Joe Stein Assignee: Ewen Cheslack-Postava Attachments: KAFKA-1173.patch, KAFKA-1173_2013-12-07_12:07:55.patch, KAFKA-1173_2014-11-11_13:50:55.patch, KAFKA-1173_2014-11-12_11:32:09.patch Vagrant has been getting a lot of pickup in the tech communities. I have found it very useful for development and testing and working with a few clients now using it to help virtualize their environments in repeatable ways. Using Vagrant to get up and running. For 0.8.0 I have a patch on github https://github.com/stealthly/kafka 1) Install Vagrant [http://www.vagrantup.com/](http://www.vagrantup.com/) 2) Install Virtual Box [https://www.virtualbox.org/](https://www.virtualbox.org/) In the main kafka folder 1) ./sbt update 2) ./sbt package 3) ./sbt assembly-package-dependency 4) vagrant up once this is done * Zookeeper will be running 192.168.50.5 * Broker 1 on 192.168.50.10 * Broker 2 on 192.168.50.20 * Broker 3 on 192.168.50.30 When you are all up and running you will be back at a command prompt. If you want you can login to the machines using vagrant ssh machineName but you don't need to. You can access the brokers and zookeeper by their IP e.g. bin/kafka-console-producer.sh --broker-list 192.168.50.10:9092,192.168.50.20:9092,192.168.50.30:9092 --topic sandbox bin/kafka-console-consumer.sh --zookeeper 192.168.50.5:2181 --topic sandbox --from-beginning -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1173) Using Vagrant to get up and running with Apache Kafka
[ https://issues.apache.org/jira/browse/KAFKA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1173: - Attachment: KAFKA-1173_2014-11-12_11:32:09.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1173) Using Vagrant to get up and running with Apache Kafka
[ https://issues.apache.org/jira/browse/KAFKA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208515#comment-14208515 ] Ewen Cheslack-Postava commented on KAFKA-1173: -- [~joestein] To be honest, I'm not too surprised something is coming up with the EC2 support. In theory it should be simple, but VPCs introduce a bunch of variables, and testing is tricky since some defaults depend on the age of your account since that affects whether you have EC2 classic support. I ran through a test with a VPC and found some issues. I updated the patch, including some additional info in the README since setting up under a VPC requires slight differences. My testing so far has been in EC2-Classic since that's the default for my account. I also put this VPC in a different region to make sure that worked. Finally, I've noticed that the default parallel provisioning seems to work fine until the very end, when it sometimes seems to hang. I couldn't easily track down the cause, so I updated the readme to use --no-parallel when using EC2. Not ideal, but it works reliably until we can find a better fix. Hopefully those fixes will clear up the issue you're seeing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Kafka consumer transactional support
Yes, the new consumer api will solve your problem better. Before that's ready, another option is to use the commitOffsets() api in the high level consumer. It doesn't take any offset though. So, to prevent message loss during consumer failure, you will need to make sure all iterated messages are fully processed before calling commitOffsets(). Thanks, Jun On Wed, Nov 12, 2014 at 11:35 AM, Falabella, Anthony anthony.falabe...@citi.com wrote: Hi Jun, Thanks for taking a look at my issue and also for updating the future release plan Wiki page. My use case is to use Kafka as if it were a JMS provider (messaging use case). I'm currently using Kafka 0.8.1.1 with Java and specifically the Spring Integration Kafka Inbound Channel Adapter. Internally that adapter uses the HighLevelConsumer which shields the caller from the internals of offsets. Let's take the case where a consumer-group reads a number of messages and then is abruptly terminated before properly processing those messages. In that scenario upon restart ideally we'd begin reading at the offset we were at prior to abruptly terminating. If we have auto.commit.enable=true upon restart those messages will be considered already read and will be skipped. Setting auto.commit.enable=false would help in this case but now we'd have to manually call on the offset manager, requiring the use of the SimpleConsumer. In my use-case, it's acceptable for some manual intervention to say, reprocess messages X through Y, but to do so would require us to know the exact offset we had started at prior to the chunk that was read in when the JVM abnormally terminated. Perhaps I could look at the underlying ExportZkOffsets and ImportZkOffsets Java classes mentioned in this link, but at the very least I'd need to log the timestamp just prior to my read to be used in that query per: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Idon'twantmyconsumer'soffsetstobecommittedautomatically.CanImanuallymanagemyconsumer'soffsets ? 
It sounds like the ConsumerAPI rewrite mentioned in these links might be helpful in my situation (potentially targeted for Apr 2015): https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite#ClientRewrite-ConsumerAPI https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Consumer+Rewrite+Design In the meantime if you have any suggestions for things I might be able to use to work-around my concern I'd be appreciative. Again I'm on 0.8.1.1 but would be willing to look at 0.8.2 if it offered anything to help with my use-case. Thanks Tony
Re: Review Request 27890: Patch for KAFKA-1764
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27890/ --- (Updated Nov. 12, 2014, 10:05 p.m.) Review request for kafka. Bugs: KAFKA-1764 https://issues.apache.org/jira/browse/KAFKA-1764 Repository: kafka Description (updated) --- Changed Consumer iterator to stop putting the shutdown message back into channel. Diffs (updated) - core/src/main/scala/kafka/consumer/ConsumerIterator.scala ac491b4da2583ef7227c67f5b8bc0fd731d705c3 core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala fbc680fde21b02f11285a4f4b442987356abd17b Diff: https://reviews.apache.org/r/27890/diff/ Testing --- Thanks, Jiangjie Qin
[jira] [Updated] (KAFKA-1764) ZookeeperConsumerConnector could put multiple shutdownCommand to the same data chunk queue.
[ https://issues.apache.org/jira/browse/KAFKA-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiangjie Qin updated KAFKA-1764: Attachment: KAFKA-1764_2014-11-12_14:05:35.patch ZookeeperConsumerConnector could put multiple shutdownCommand to the same data chunk queue. --- Key: KAFKA-1764 URL: https://issues.apache.org/jira/browse/KAFKA-1764 Project: Kafka Issue Type: Bug Reporter: Jiangjie Qin Assignee: Jiangjie Qin Attachments: KAFKA-1764.patch, KAFKA-1764_2014-11-12_14:05:35.patch In ZookeeperConsumerConnector shutdown(), we could potentially put multiple shutdownCommand into the same data chunk queue, provided the topics are sharing the same data chunk queue in topicThreadIdAndQueues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1764) ZookeeperConsumerConnector could put multiple shutdownCommand to the same data chunk queue.
[ https://issues.apache.org/jira/browse/KAFKA-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208788#comment-14208788 ] Jiangjie Qin commented on KAFKA-1764: - Updated reviewboard https://reviews.apache.org/r/27890/diff/ against branch origin/trunk ZookeeperConsumerConnector could put multiple shutdownCommand to the same data chunk queue. --- Key: KAFKA-1764 URL: https://issues.apache.org/jira/browse/KAFKA-1764 Project: Kafka Issue Type: Bug Reporter: Jiangjie Qin Assignee: Jiangjie Qin Attachments: KAFKA-1764.patch, KAFKA-1764_2014-11-12_14:05:35.patch In ZookeeperConsumerConnector shutdown(), we could potentially put multiple shutdownCommand into the same data chunk queue, provided the topics are sharing the same data chunk queue in topicThreadIdAndQueues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 27684: Patch for KAFKA-1743
On Nov. 10, 2014, 7:50 p.m., Jun Rao wrote: core/src/main/scala/kafka/consumer/ConsumerConnector.scala, lines 76-80 https://reviews.apache.org/r/27684/diff/2/?file=755292#file755292line76 We will also need to change the interface in ConsumerConnector from def commitOffsets(retryOnFailure: Boolean = true) back to def commitOffsets In ZookeeperConsumerConnector, we can make the following method private def commitOffsets(retryOnFailure: Boolean = true) Another question, will the Scala compiler be confused with 2 methods, one without parentheses and one with 1 parameter having a default? Could you try compiling the code on all Scala versions? Manikumar Reddy O wrote: Currently the below classes use the new method commitOffsets(true). kafka/javaapi/consumer/ZookeeperConsumerConnector.scala kafka/tools/TestEndToEndLatency.scala If we are changing the interface, then we need to change the above classes also. If we are not fixing this on trunk, then the same problem will come in 0.8.3. How to handle this? 2 methods, one without parentheses and one with 1 parameter, compile on all Scala versions. Thanks for the explanation. There is actually a bit of inconsistency introduced in this patch. In kafka.javaapi.consumer.ZookeeperConsumerConnector, commitOffsets() is implemented as the following. def commitOffsets() { underlying.commitOffsets() } This actually calls underlying.commitOffsets(isAutoCommit: Boolean = true) with a default value of true. However, ConsumerConnector.commitOffsets is implemented as the following which sets isAutoCommit to false. def commitOffsets { commitOffsets(false) } So, we should use true in the above. Another thing that I was thinking is that it's going to be a bit confusing if we have the following Scala APIs. def commitOffsets(retryOnFailure: Boolean = true) def commitOffsets So, if you do commitOffsets it calls the second one and if you do commitOffsets(), you actually call the first one. 
However, the expectation is probably the same method will be called in both cases. Would it be better if we get rid of the default like the following? Then, it's clear which method will be called. def commitOffsets(retryOnFailure: Boolean) def commitOffsets - Jun --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27684/#review60652 --- On Nov. 8, 2014, 6:20 a.m., Manikumar Reddy O wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27684/ --- (Updated Nov. 8, 2014, 6:20 a.m.) Review request for kafka. Bugs: KAFKA-1743 https://issues.apache.org/jira/browse/KAFKA-1743 Repository: kafka Description --- def commitOffsets method added to make ConsumerConnector backward compatible Diffs - core/src/main/scala/kafka/consumer/ConsumerConnector.scala 07677c1c26768ef9c9032626180d0015f12cb0e0 Diff: https://reviews.apache.org/r/27684/diff/ Testing --- Thanks, Manikumar Reddy O
[jira] [Commented] (KAFKA-1282) Disconnect idle socket connection in Selector
[ https://issues.apache.org/jira/browse/KAFKA-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208810#comment-14208810 ] nicu marasoiu commented on KAFKA-1282: -- Yes, I will add a few tests this week. Disconnect idle socket connection in Selector - Key: KAFKA-1282 URL: https://issues.apache.org/jira/browse/KAFKA-1282 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2 Reporter: Jun Rao Assignee: nicu marasoiu Labels: newbie++ Fix For: 0.9.0 Attachments: 1282_access-order.patch, 1282_brush.patch, 1282_brushed_up.patch, KAFKA-1282_Disconnect_idle_socket_connection_in_Selector.patch To reduce # socket connections, it would be useful for the new producer to close socket connections that are idle. We can introduce a new producer config for the idle time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
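The attachment names on this issue (e.g. 1282_access-order.patch) hint at an access-ordered map for tracking connection idleness. A minimal sketch of that idea, assuming nothing about the actual patch: with `LinkedHashMap`'s `accessOrder` constructor flag, iteration runs least-recently-used first, so the longest-idle connection is always a constant-time peek at the head of the map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IdleConnectionDemo {

    // With accessOrder=true, every get/put moves the touched entry to the
    // tail, so the head of the iteration order is the least-recently-used
    // (longest-idle) connection -- the first candidate to disconnect.
    static String longestIdle() {
        Map<String, Long> lastUsed = new LinkedHashMap<>(16, 0.75f, true);
        lastUsed.put("conn-1", 0L);
        lastUsed.put("conn-2", 0L);
        lastUsed.put("conn-3", 0L);
        lastUsed.get("conn-1"); // touching conn-1 moves it to the tail
        return lastUsed.keySet().iterator().next();
    }

    public static void main(String[] args) {
        System.out.println(longestIdle()); // conn-2
    }
}
```

A selector loop could store the last-activity timestamp as the value and, on each poll, close head entries whose timestamp is older than the proposed idle-time config before breaking out of the iteration.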
Re: Review Request 27890: Patch for KAFKA-1764
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27890/#review61099 --- Ship it! Ship It! - Guozhang Wang On Nov. 12, 2014, 10:05 p.m., Jiangjie Qin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27890/ --- (Updated Nov. 12, 2014, 10:05 p.m.) Review request for kafka. Bugs: KAFKA-1764 https://issues.apache.org/jira/browse/KAFKA-1764 Repository: kafka Description --- Changed Consumer iterator to stop putting the shutdown message back into channel. Diffs - core/src/main/scala/kafka/consumer/ConsumerIterator.scala ac491b4da2583ef7227c67f5b8bc0fd731d705c3 core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala fbc680fde21b02f11285a4f4b442987356abd17b Diff: https://reviews.apache.org/r/27890/diff/ Testing --- Thanks, Jiangjie Qin
Re: Review Request 27834: Fix KAFKA-1762: add comment on risks using a larger value of max.inflight.requests than 1, in KAFKA-1650 we will add another comment about its risk of data loss
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27834/#review61123 --- Ship it! clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java https://reviews.apache.org/r/27834/#comment102598 I would actually prefer not mentioning the default value comment - (say, if we change the default ever). - Joel Koshy On Nov. 10, 2014, 10:53 p.m., Guozhang Wang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27834/ --- (Updated Nov. 10, 2014, 10:53 p.m.) Review request for kafka. Bugs: KAFKA-1762 https://issues.apache.org/jira/browse/KAFKA-1762 Repository: kafka Description --- dummy Diffs - clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 9095caf0db1e41a4acb4216fb197626fbd85b806 Diff: https://reviews.apache.org/r/27834/diff/ Testing --- Thanks, Guozhang Wang
[jira] [Commented] (KAFKA-1762) Enforce MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION to 1 in MirrorMaker
[ https://issues.apache.org/jira/browse/KAFKA-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208940#comment-14208940 ] Joel Koshy commented on KAFKA-1762: --- Committed the doc change to trunk Enforce MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION to 1 in MirrorMaker - Key: KAFKA-1762 URL: https://issues.apache.org/jira/browse/KAFKA-1762 Project: Kafka Issue Type: Bug Reporter: Guozhang Wang Assignee: Guozhang Wang Attachments: KAFKA-1762.patch The new Producer client introduces a config for the max # of inFlight messages. When it is set > 1 on MirrorMaker, however, there is a risk of data loss even with KAFKA-1650 because the offsets recorded in the MM's offset map are no longer continuous. Another issue is that when this value is set > 1, there is a risk of message re-ordering in the producer. Changes: 1. Set max # of inFlight messages = 1 in MM 2. Leave comments explaining what the risks are of changing it -- This message was sent by Atlassian JIRA (v6.3.4#6332)
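A toy simulation of the re-ordering risk described in this issue (a hypothetical model, not producer code): with more than one request in flight, a failed batch's retry lands behind batches that were already on the wire, while with a single in-flight request the retry completes before anything else is sent.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class InflightReorderDemo {

    // Model: batches in 'failFirstAttempt' fail once and are retried. A retry
    // always goes to the front of the pending queue (the producer retries
    // before sending new data), but batches that were already in flight when
    // the failure happened cannot be recalled, so they land in the log first.
    static List<String> brokerLog(List<String> toSend, int maxInFlight, Set<String> failFirstAttempt) {
        List<String> log = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>(toSend);
        Set<String> failed = new HashSet<>(failFirstAttempt);
        while (!pending.isEmpty()) {
            // Take up to maxInFlight batches off the queue at once.
            List<String> inFlight = new ArrayList<>();
            while (inFlight.size() < maxInFlight && !pending.isEmpty()) {
                inFlight.add(pending.poll());
            }
            for (String b : inFlight) {
                if (failed.remove(b)) {
                    pending.addFirst(b); // retried on the next round
                } else {
                    log.add(b);          // appended to the broker log
                }
            }
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> batches = Arrays.asList("m1", "m2", "m3");
        Set<String> flaky = new HashSet<>(Arrays.asList("m1"));
        System.out.println(brokerLog(batches, 5, new HashSet<>(flaky))); // [m2, m3, m1]
        System.out.println(brokerLog(batches, 1, new HashSet<>(flaky))); // [m1, m2, m3]
    }
}
```

With maxInFlight > 1, m2 and m3 are already in flight when m1 fails, so m1's retry is appended after them; with maxInFlight = 1 nothing else is sent until m1 succeeds, which is exactly why MirrorMaker pins the setting to 1.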
Jenkins build is back to normal : Kafka-trunk #328
See https://builds.apache.org/job/Kafka-trunk/328/changes
Re: Kafka consumer transactional support
I didn't realize there was a commitOffsets() method on the high level consumer (the code is abstracted by the Spring Integration classes). Yes, this actually suits my needs and I was able to get it to work for my use case. Thank you very much - that was extremely helpful. In case it's of any use to someone else, here's the solution I came up with.

Spring Configuration file

<int-kafka:inbound-channel-adapter id="kafkaInboundChannelAdapter"
        kafka-consumer-context-ref="kafkaConsumerContext"
        auto-startup="true"
        channel="exchangeKafkaFusionInboundSpringExecutorChannel">
    <int:poller fixed-delay="10" time-unit="MILLISECONDS" max-messages-per-poll="5">
        <!-- Kafka 0.8.1.1 does not have the concept of transactions. Therefore we must handle this on our own. -->
        <!-- To do so, on the consumer we've set autocommit.enable=false so commits won't automatically be performed as we read msgs. -->
        <!-- The advice-chain below will call consumerConfig.getConsumerConnector().commitOffsets() only if no
             exceptions occurred during the processing of this chunk of msgs. -->
        <int:advice-chain>
            <bean id="kafkaConsumerAfterAdvice"
                    class="com.citigroup.tmg.exchgateway.common.util.springintegration.KafkaConsumerAfterAdvice">
                <property name="consumerContext" ref="kafkaConsumerContext"/>
                <property name="consumerGroupId" value="consumerGroupG"/>
            </bean>
        </int:advice-chain>
    </int:poller>
</int-kafka:inbound-channel-adapter>

<int-kafka:producer-context id="kafkaProducerContext">
    <int-kafka:producer-configurations>
        <int-kafka:producer-configuration broker-list="175.65.76.12:9092"
                key-class-type="java.lang.String"
                value-class-type="java.lang.String"
                topic="test"
                compression-codec="default"/>
    </int-kafka:producer-configurations>
</int-kafka:producer-context>

<int-kafka:zookeeper-connect id="zookeeperConnect" zk-connect="175.65.76.12:2181"
        zk-connection-timeout="6000" zk-session-timeout="6000" zk-sync-time="2000"/>

<!-- See http://kafka.apache.org/documentation.html#consumerconfigs -->
<!-- or high-level consumer on http://kafka.apache.org/07/configuration.html or https://kafka.apache.org/08/configuration.html -->
<!-- http://grokbase.com/t/kafka/users/12b9bmsy7k/question-about-resetting-offsets-and-the-high-level-consumer -->
<bean id="kafkaConsumerProperties" class="org.springframework.beans.factory.config.PropertiesFactoryBean">
    <property name="properties">
        <props>
            <prop key="autocommit.enable">false</prop>
            <prop key="auto.offset.reset">largest</prop>
        </props>
    </property>
</bean>

<int-kafka:consumer-context id="kafkaConsumerContext" consumer-timeout="4000"
        zookeeper-connect="zookeeperConnect" consumer-properties="kafkaConsumerProperties">
    <int-kafka:consumer-configurations>
        <int-kafka:consumer-configuration group-id="consumerGroupG" max-messages="2">
            <int-kafka:topic id="test-multi" streams="3"/>
        </int-kafka:consumer-configuration>
    </int-kafka:consumer-configurations>
</int-kafka:consumer-context>

Java Advice Class

public class KafkaConsumerAfterAdvice implements AfterReturningAdvice, InitializingBean {

    private KafkaConsumerContext consumerContext;
    private String consumerGroupId;

    public void setConsumerContext(KafkaConsumerContext consumerContext) {
        this.consumerContext = consumerContext;
    }

    public void setConsumerGroupId(String consumerGroupId) {
        this.consumerGroupId = consumerGroupId;
    }

    /**
     * Spring calls this after the bean has been initialized within the ApplicationContext.
     */
    @Override
    public void afterPropertiesSet() throws Exception {
        Assert.notNull(consumerContext, "[consumerContext] cannot be null");
        Assert.notNull(consumerGroupId, "[consumerGroupId] cannot be null");
    }

    @Override
    public void afterReturning(Object returnValue, Method method, Object[] args, Object target) throws Throwable {
        // If there were messages then returnValue=true, otherwise returnValue=false.
        // Only if true do we need to take the hit to commit the offsets.
        if (returnValue.equals(true)) {
            Iterator<ConsumerConfiguration<String, Object>> consumerConfigIterator =
                    consumerContext.getConsumerConfigurations().iterator();
            while (consumerConfigIterator.hasNext()) {
                ConsumerConfiguration<String, Object> consumerConfig = consumerConfigIterator.next();
                if (consumerGroupId.equals(consumerConfig.getConsumerMetadata().getGroupId())) {
                    consumerConfig.getConsumerConnector().commitOffsets();
                }
            }
        }
    }
}
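Stripped of the Spring Integration plumbing, the advice above implements a simple at-least-once pattern: with autocommit.enable=false, process a chunk of messages and call commitOffsets() only if nothing threw. A minimal self-contained sketch of that gating logic (the OffsetCommitter interface below is a stand-in for Kafka's ConsumerConnector, not a real Kafka class):

```java
import java.util.List;

public class CommitOnSuccess {
    // Stand-in for kafka.javaapi.consumer.ConsumerConnector.commitOffsets()
    interface OffsetCommitter {
        void commitOffsets();
    }

    // Process a batch; commit offsets only if every message was handled
    // without an exception (autocommit.enable=false is assumed).
    static boolean processBatch(List<String> batch, OffsetCommitter committer) {
        try {
            for (String msg : batch) {
                handle(msg);
            }
        } catch (RuntimeException e) {
            return false; // no commit: the chunk will be redelivered on restart
        }
        committer.commitOffsets();
        return true;
    }

    static void handle(String msg) {
        if (msg == null) throw new IllegalArgumentException("bad message");
    }
}
```

On failure nothing is committed, so the uncommitted chunk is redelivered when the consumer restarts: duplicates are possible, losses are not.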
[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names
[ https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14209055#comment-14209055 ] Joel Koshy commented on KAFKA-1481: --- (1) and (4) seem equivalent (i.e., AllTopics vs Aggregate) - or are you saying that (4) will be AllTopics or AllBrokers as appropriate? I'm +0 on (5) for the reason I stated above. i.e., it is odd to see true when browsing mbeans I'm +0 on (6) as well as topics=aggregate is a bit odd. The field name suggests it is a list of topics but it is more like a boolean. Between this and (5) I prefer (5). (3) seems reasonable to me although it is not as clear as having an explicit aggregate term in the type. However, I think (1), (2) and (3) do make it clear enough what is being aggregated: i.e., bytes-out-per-sec aggregated on topic. I actually think Broker should not be there since this is a broker-side mbean already. i.e., if we had kafka.server:type=TopicMetrics,name=BytesOutPerSec (wouldn't it be clear that the dimension of aggregation is across topics?) i.e., I think we can just make the dimension clear from the typename. Likewise, it should be clear (for consumers) that FetchRequestAndResponseMetrics is really a broker-level aggregation. 
Stop using dashes AND underscores as separators in MBean names -- Key: KAFKA-1481 URL: https://issues.apache.org/jira/browse/KAFKA-1481 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1 Reporter: Otis Gospodnetic Priority: Critical Labels: patch Fix For: 0.8.3 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-30_21-35-43.patch, KAFKA-1481_2014-10-31_14-35-43.patch, KAFKA-1481_2014-11-03_16-39-41_doc.patch, KAFKA-1481_2014-11-03_17-02-23.patch, KAFKA-1481_2014-11-10_20-39-41_doc.patch, KAFKA-1481_2014-11-10_21-02-23.patch, KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch, alternateLayout1.png, alternateLayout2.png, diff-for-alternate-layout1.patch, diff-for-alternate-layout2.patch, originalLayout.png MBeans should not use dashes or underscores as separators because these characters are allowed in hostnames, topics, group and consumer IDs, etc., and these are embedded in MBeans names making it impossible to parse out individual bits from MBeans. Perhaps a pipe character should be used to avoid the conflict. This looks like a major blocker because it means nobody can write Kafka 0.8.x monitoring tools unless they are doing it for themselves AND do not use dashes AND do not use underscores. See: http://search-hadoop.com/m/4TaT4lonIW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
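To make the parsing problem concrete (a sketch, not code from any of the attached patches; the sample MBean name follows the key=value layout discussed in the comments, not necessarily the final committed naming): a flat name like kafka-server-my-topic-BytesOutPerSec cannot be split reliably because the topic itself may contain dashes, whereas JMX key=value properties keep each field addressable through the standard javax.management.ObjectName API:

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class MBeanNameDemo {
    // With key=value properties, each field is unambiguous even when the
    // topic name itself contains dashes or underscores.
    public static String topicOf(String mbeanName) {
        try {
            return new ObjectName(mbeanName).getKeyProperty("topic");
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String name = "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic=my-topic";
        System.out.println(topicOf(name)); // prints "my-topic"
    }
}
```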
[jira] [Commented] (KAFKA-1729) add doc for Kafka-based offset management in 0.8.2
[ https://issues.apache.org/jira/browse/KAFKA-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14209071#comment-14209071 ] Jun Rao commented on KAFKA-1729: Thanks for the patch. A few comments. 1. We need to make sure that before people start using the Kafka-based offset management in production, they set offsets.topic.num.partitions and offsets.topic.replication.factor properly for the offset topic since the defaults are not suitable for production usage. Could you add that in implementation.html? 2. It seems that issuing manual OffsetCommitRequest is only needed when using SimpleConsumer. We can probably make that clear in the wiki. add doc for Kafka-based offset management in 0.8.2 -- Key: KAFKA-1729 URL: https://issues.apache.org/jira/browse/KAFKA-1729 Project: Kafka Issue Type: Sub-task Reporter: Jun Rao Assignee: Joel Koshy Attachments: KAFKA-1782-doc-v1.patch, KAFKA-1782-doc-v2.patch
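For point 1, the two broker settings in question are real 0.8.2 config keys; the values below are illustrative only, not recommendations from the patch:

```properties
# server.properties - size the offsets topic before first production use;
# it is created on first use and is not trivially resized afterwards
offsets.topic.num.partitions=50
offsets.topic.replication.factor=3
```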
[jira] [Commented] (KAFKA-1752) add --replace-broker option
[ https://issues.apache.org/jira/browse/KAFKA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14209072#comment-14209072 ] Neha Narkhede commented on KAFKA-1752: -- bq. So it looks like we actually want --add-broker (and transfer some load to it) and --decommission-broker (and transfer its load somewhere)? Right. [~Dmitry Pekar] We would like to avoid adding more and more nuanced options to the partition reassignment tool that is already too complex. I would suggest taking a step back and arriving at a few simple options that would cover all use cases. I think that all we need is a way for users to add and decommission brokers and the user's expectation would be that the tool comes up with a correct assignment that leads to roughly even distribution of partitions as per our replica placement strategy. add --replace-broker option --- Key: KAFKA-1752 URL: https://issues.apache.org/jira/browse/KAFKA-1752 Project: Kafka Issue Type: Sub-task Components: tools Reporter: Dmitry Pekar Assignee: Dmitry Pekar Fix For: 0.8.3
[jira] [Updated] (KAFKA-1762) Update max-inflight-request doc string
[ https://issues.apache.org/jira/browse/KAFKA-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Koshy updated KAFKA-1762: -- Resolution: Fixed Status: Resolved (was: Patch Available) Update max-inflight-request doc string -- Key: KAFKA-1762 URL: https://issues.apache.org/jira/browse/KAFKA-1762 Project: Kafka Issue Type: Bug Reporter: Guozhang Wang Assignee: Guozhang Wang Attachments: KAFKA-1762.patch The new Producer client introduces a config for the max # of inFlight messages. When it is set > 1 on MirrorMaker, however, there is a risk of data loss even with KAFKA-1650 because the offsets recorded in the MM's offset map are no longer continuous. Another issue is that when this value is set > 1, there is a risk of message re-ordering in the producer. Changes: 1. Set max # of inFlight messages = 1 in MM 2. Leave comments explaining what the risks are of changing it.
[jira] [Updated] (KAFKA-1762) Update max-inflight-request doc string
[ https://issues.apache.org/jira/browse/KAFKA-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Koshy updated KAFKA-1762: -- Summary: Update max-inflight-request doc string (was: Enforce MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION to 1 in MirrorMaker) Update max-inflight-request doc string -- Key: KAFKA-1762 URL: https://issues.apache.org/jira/browse/KAFKA-1762 Project: Kafka Issue Type: Bug Reporter: Guozhang Wang Assignee: Guozhang Wang Attachments: KAFKA-1762.patch The new Producer client introduces a config for the max # of inFlight messages. When it is set > 1 on MirrorMaker, however, there is a risk of data loss even with KAFKA-1650 because the offsets recorded in the MM's offset map are no longer continuous. Another issue is that when this value is set > 1, there is a risk of message re-ordering in the producer. Changes: 1. Set max # of inFlight messages = 1 in MM 2. Leave comments explaining what the risks are of changing it.
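The config being documented here is, in the new producer, max.in.flight.requests.per.connection; a sketch of the safe MirrorMaker setting described in the changes:

```properties
# new producer config: with a value > 1, a failed-and-retried request can be
# reordered behind a later successful one, and MirrorMaker's offset map can
# contain gaps on failure; a value of 1 trades throughput for safety
max.in.flight.requests.per.connection=1
```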
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14209151#comment-14209151 ] Jun Rao commented on KAFKA-1555: Thanks for the doc patch. Committed to svn after fixing a few typos. Let me know if you see any further issue. provide strong consistency with reasonable availability --- Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, KAFKA-1555-DOCS.2.patch, KAFKA-1555-DOCS.3.patch, KAFKA-1555-DOCS.4.patch, KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch In a mission critical application, we expect that a kafka cluster with 3 brokers can satisfy two requirements: 1. When 1 broker is down, no message loss or service blocking happens. 2. In worse cases, such as when two brokers are down, service can be blocked, but no message loss happens. We found that the current kafka version (0.8.1.1) cannot achieve the requirements due to three of its behaviors: 1. when choosing a new leader from 2 followers in ISR, the one with fewer messages may be chosen as the leader. 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it has fewer messages than the leader. 3. ISR can contain only 1 broker, therefore acknowledged messages may be stored in only 1 broker. The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning, all 3 replicas, leader A, followers B and C, are in sync, i.e., they have the same messages and are all in ISR. According to the value of request.required.acks (acks for short), there are the following cases. 1. acks=0, 1, 3.
Obviously these settings do not satisfy the requirement. 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in ISR. If A is killed, C can be elected as the new leader, and consumers will miss m. 3. acks=-1. B and C restart and are removed from ISR. Producer sends a message m to A, and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost. In summary, any existing configuration cannot satisfy the requirements.
[jira] [Resolved] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Stein resolved KAFKA-1555. -- Resolution: Fixed provide strong consistency with reasonable availability --- Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, KAFKA-1555-DOCS.2.patch, KAFKA-1555-DOCS.3.patch, KAFKA-1555-DOCS.4.patch, KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch In a mission critical application, we expect that a kafka cluster with 3 brokers can satisfy two requirements: 1. When 1 broker is down, no message loss or service blocking happens. 2. In worse cases, such as when two brokers are down, service can be blocked, but no message loss happens. We found that the current kafka version (0.8.1.1) cannot achieve the requirements due to three of its behaviors: 1. when choosing a new leader from 2 followers in ISR, the one with fewer messages may be chosen as the leader. 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it has fewer messages than the leader. 3. ISR can contain only 1 broker, therefore acknowledged messages may be stored in only 1 broker. The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning, all 3 replicas, leader A, followers B and C, are in sync, i.e., they have the same messages and are all in ISR. According to the value of request.required.acks (acks for short), there are the following cases. 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement. 2. acks=2. Producer sends a message m. It's acknowledged by A and B.
At this time, although C hasn't received m, C is still in ISR. If A is killed, C can be elected as the new leader, and consumers will miss m. 3. acks=-1. B and C restart and are removed from ISR. Producer sends a message m to A, and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost. In summary, any existing configuration cannot satisfy the requirements.
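The documentation patches attached to this issue describe the 0.8.2 remedy for case 3: a topic-level minimum ISR size enforced for acks=-1 producers. A sketch of the combination, assuming the 0.8.2 config names:

```properties
# broker/topic config (0.8.2+): reject acks=-1 produce requests while fewer
# than this many replicas are in sync, blocking instead of losing data
min.insync.replicas=2

# producer config: require acknowledgement from all in-sync replicas
request.required.acks=-1
```

With 3 replicas this meets both requirements: with one broker down, writes continue; with two down, writes block, but nothing acknowledged is lost.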
Request for subscription
Hi, Can I subscribe to this mailing list? Thanks, Sampath
[jira] [Created] (KAFKA-1768) Expose version via JMX
Otis Gospodnetic created KAFKA-1768: --- Summary: Expose version via JMX Key: KAFKA-1768 URL: https://issues.apache.org/jira/browse/KAFKA-1768 Project: Kafka Issue Type: Improvement Affects Versions: 0.8.1.1 Reporter: Otis Gospodnetic Fix For: 0.8.2 See Gwen's code snippet in http://search-hadoop.com/m/4TaT4xtk36/Programmatic+Kafka+version+detection%252Fextractionsubj=Programmatic+Kafka+version+detection+extraction+