Re: Review Request 26994: Patch for KAFKA-1719
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26994/ --- (Updated Oct. 24, 2014, 7:56 a.m.) Review request for kafka. Bugs: KAFKA-1719 https://issues.apache.org/jira/browse/KAFKA-1719 Repository: kafka Description (updated) --- Addressed Guozhang's comments. Addressed Neha and Guozhang's comments. Incorporated Joel and Neha's comments. Also fixed a potential race where cleanShutdown could execute multiple times if several threads exit abnormally at the same time. Diffs (updated) - core/src/main/scala/kafka/tools/MirrorMaker.scala b8698ee1469c8fbc92ccc176d916eb3e28b87867 Diff: https://reviews.apache.org/r/26994/diff/ Testing --- Thanks, Jiangjie Qin
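The race described above (cleanShutdown running more than once when several threads exit abnormally at once) is the kind of thing typically fenced with an atomic compare-and-set flag. A minimal sketch in Java — the class and method names are hypothetical and this stands in for, not reproduces, the actual Scala patch:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard: whichever thread wins the compare-and-set runs the
// cleanup exactly once; every later caller returns immediately.
public class ShutdownGuard {
    private final AtomicBoolean shutdownStarted = new AtomicBoolean(false);

    public boolean cleanShutdown() {
        if (!shutdownStarted.compareAndSet(false, true))
            return false;           // another thread already began shutdown
        // ... close consumer connectors, flush and close producers, etc.
        return true;
    }

    public static void main(String[] args) {
        ShutdownGuard guard = new ShutdownGuard();
        System.out.println(guard.cleanShutdown());  // true: first caller cleans up
        System.out.println(guard.cleanShutdown());  // false: already shut down
    }
}
```

The CAS makes the guard safe even when many threads hit the exit path simultaneously, which a plain boolean check-then-set would not be.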
[jira] [Updated] (KAFKA-1719) Make mirror maker exit when one consumer/producer thread exits.
[ https://issues.apache.org/jira/browse/KAFKA-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiangjie Qin updated KAFKA-1719: Attachment: KAFKA-1719_2014-10-24_00:56:06.patch Make mirror maker exit when one consumer/producer thread exits. --- Key: KAFKA-1719 URL: https://issues.apache.org/jira/browse/KAFKA-1719 Project: Kafka Issue Type: Improvement Reporter: Jiangjie Qin Assignee: Jiangjie Qin Attachments: KAFKA-1719.patch, KAFKA-1719_2014-10-22_15:04:32.patch, KAFKA-1719_2014-10-23_16:20:22.patch, KAFKA-1719_2014-10-24_00:56:06.patch When one of the consumer/producer thread exits, the entire mirror maker will be blocked. In this case, it is better to make it exit. It seems a single ZookeeperConsumerConnector is sufficient for mirror maker, probably we don't need a list for the connectors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1719) Make mirror maker exit when one consumer/producer thread exits.
[ https://issues.apache.org/jira/browse/KAFKA-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182573#comment-14182573 ] Jiangjie Qin commented on KAFKA-1719: - Updated reviewboard https://reviews.apache.org/r/26994/diff/ against branch origin/trunk Make mirror maker exit when one consumer/producer thread exits. --- Key: KAFKA-1719 URL: https://issues.apache.org/jira/browse/KAFKA-1719 Project: Kafka Issue Type: Improvement Reporter: Jiangjie Qin Assignee: Jiangjie Qin Attachments: KAFKA-1719.patch, KAFKA-1719_2014-10-22_15:04:32.patch, KAFKA-1719_2014-10-23_16:20:22.patch, KAFKA-1719_2014-10-24_00:56:06.patch When one of the consumer/producer thread exits, the entire mirror maker will be blocked. In this case, it is better to make it exit. It seems a single ZookeeperConsumerConnector is sufficient for mirror maker, probably we don't need a list for the connectors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names
[ https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Tretyakov updated KAFKA-1481: -- Attachment: KAFKA-1481_2014-10-24_14-14-35.patch.patch Stop using dashes AND underscores as separators in MBean names -- Key: KAFKA-1481 URL: https://issues.apache.org/jira/browse/KAFKA-1481 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1 Reporter: Otis Gospodnetic Priority: Critical Labels: patch Fix For: 0.8.3 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-24_14-14-35.patch.patch, KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch MBeans should not use dashes or underscores as separators because these characters are allowed in hostnames, topics, group and consumer IDs, etc., and these are embedded in MBeans names making it impossible to parse out individual bits from MBeans. Perhaps a pipe character should be used to avoid the conflict. This looks like a major blocker because it means nobody can write Kafka 0.8.x monitoring tools unless they are doing it for themselves AND do not use dashes AND do not use underscores. See: http://search-hadoop.com/m/4TaT4lonIW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
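The parsing problem this issue describes can be seen directly: because dashes are legal inside hostnames, topics, and client IDs, a dash-separated MBean name cannot be split back into its components, while a separator outside those alphabets can. A small illustration with made-up names:

```java
// Illustration with made-up names: a dash-separated MBean name cannot be
// decomposed, because dashes also occur inside hostnames and topics.
public class MBeanNameDemo {
    public static void main(String[] args) {
        String ambiguous = "my-host-1-my-topic-FetchRequest";
        // Splitting on '-' yields 6 fragments; the original 3 components
        // (host, topic, request type) cannot be recovered.
        System.out.println(ambiguous.split("-").length); // 6

        // A separator that is illegal in the embedded values parses cleanly.
        String unambiguous = "my-host-1|my-topic|FetchRequest";
        String[] parts = unambiguous.split("\\|");
        System.out.println(parts.length);   // 3
        System.out.println(parts[1]);       // my-topic
    }
}
```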
[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names
[ https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182670#comment-14182670 ] Vladimir Tretyakov commented on KAFKA-1481: --- Hi Jun, thanks for your feedback again; attached a new patch. 20.1 done. 20.2 clientId is Taggable here; it will be converted to something like clientId=XXX automatically. 21. tags should be a Map whose iterators visit elements in the order they were inserted (the final string must be stable); from the LinkedHashMap docs: "The iterator and all traversal methods of this class visit elements in the order they were inserted." I am new to Scala, so if you know a better candidate with predictable iteration order, please let me know. 22.1 done. 22.2 clientId is not always just clientId=XXX; look at: {code} class ConsumerFetcherThread(consumerFetcherThreadId: ConsumerFetcherThreadId, val config: ConsumerConfig, sourceBroker: Broker, partitionMap: Map[TopicAndPartition, PartitionTopicInfo], val consumerFetcherManager: ConsumerFetcherManager) extends AbstractFetcherThread(name = consumerFetcherThreadId, clientId = new TaggableInfo(new TaggableInfo(clientId -> config.clientId), consumerFetcherThreadId), sourceBroker = sourceBroker, socketTimeout = config.socketTimeoutMs, socketBufferSize = config.socketReceiveBufferBytes, fetchSize = config.fetchMessageMaxBytes, fetcherBrokerId = Request.OrdinaryConsumerId, maxWait = config.fetchWaitMaxMs, minBytes = config.fetchMinBytes, isInterruptible = true) { {code} The clientId = new TaggableInfo(new TaggableInfo(clientId -> config.clientId), consumerFetcherThreadId) part was in the code before my changes (it was manipulation with strings rather than Taggable, of course). Yes, in some places we could use a string instead of Taggable, but it is probably better to have Taggable everywhere that relates to metrics. At least it was not easy for me to decide where I had to use Taggable and where I could leave a string; I'd prefer Taggable everywhere in this case, sorry :). 23.
done everything as described at the link; checked how it works/applies on 'origin/0.8.2' from scratch. 24. See 22.2. 25. Agreed; added a second constructor with 'String' as clientId. I can't remove the constructor with Taggable as clientId because, as you can see in 22.2, sometimes clientId is not just clientId but something more complex. 26. Added 'MetricsLeakTest'; the basic idea is to start/stop many producers/consumers and observe the count of metrics in Metrics.defaultRegistry(): the count should not grow. Stop using dashes AND underscores as separators in MBean names -- Key: KAFKA-1481 URL: https://issues.apache.org/jira/browse/KAFKA-1481 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1 Reporter: Otis Gospodnetic Priority: Critical Labels: patch Fix For: 0.8.3 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, KAFKA-1481_2014-10-21_09-14-35.patch, KAFKA-1481_2014-10-24_14-14-35.patch.patch, KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch MBeans should not use dashes or underscores as separators because these characters are allowed in hostnames, topics, group and consumer IDs, etc., and these are embedded in MBeans names making it impossible to parse out individual bits from MBeans. Perhaps a pipe character should be used to avoid the conflict. This looks like a major blocker because it means nobody can write Kafka 0.8.x monitoring tools unless they are doing it for themselves AND do not use dashes AND do not use underscores. See: http://search-hadoop.com/m/4TaT4lonIW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
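The LinkedHashMap point in item 21 — insertion order makes the rendered tag string deterministic — can be sketched in Java (the helper and tag names below are hypothetical, not taken from the patch):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: rendering metric tags from a LinkedHashMap yields the
// same string on every run, because the map's iterator visits entries in
// insertion order (unlike HashMap, whose order is unspecified).
public class StableTagString {
    static String tagString(Map<String, String> tags) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : tags.entrySet()) {
            if (sb.length() > 0) sb.append(',');
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("clientId", "console-consumer");
        tags.put("brokerHost", "localhost");
        tags.put("brokerPort", "9092");
        // Tags always render in the order they were inserted:
        System.out.println(tagString(tags));
        // clientId=console-consumer,brokerHost=localhost,brokerPort=9092
    }
}
```

A stable string matters here because the tag string becomes part of the MBean name, and monitoring tools match MBean names literally.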
Re: [VOTE] 0.8.2-beta Release Candidate 1
Jun, I updated the https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/ with the contents of kafka-clients-0.8.2-beta-javadoc.jar There weren't any artifact changes so I don't think we need a new release candidate ... we can extend the vote to Monday if we don't get a pass/fail by 2pm PT today. /*** Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop / On Fri, Oct 24, 2014 at 12:04 AM, Jun Rao jun...@gmail.com wrote: Joe, Verified quickstart on both the src and binary release. They all look good. The javadoc doesn't seem to include those in clients. Could you add them? Thanks, Jun On Tue, Oct 21, 2014 at 1:58 PM, Joe Stein joe.st...@stealth.ly wrote: This is the first candidate for release of Apache Kafka 0.8.2-beta Release Notes for the 0.8.2-beta release https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/RELEASE_NOTES.html *** Please download, test and vote by Friday, October 24th, 2pm PT Kafka's KEYS file containing PGP keys we use to sign the release: https://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5, sha1 and sha2 (SHA256) checksum. * Release artifacts to be voted upon (source and binary): https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/ * Maven artifacts to be voted upon prior to release: https://repository.apache.org/content/groups/staging/ * scala-doc https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/scala-doc/ * java-doc https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/ * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2-beta tag https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2b2c3da2c52bc62a89d60f85125d3723c8410fa0 /*** Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop /
[jira] [Created] (KAFKA-1729) add doc for Kafka-based offset management in 0.8.2
Jun Rao created KAFKA-1729: -- Summary: add doc for Kafka-based offset management in 0.8.2 Key: KAFKA-1729 URL: https://issues.apache.org/jira/browse/KAFKA-1729 Project: Kafka Issue Type: Sub-task Reporter: Jun Rao -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-1730) add the doc for the new java producer in 0.8.2
Jun Rao created KAFKA-1730: -- Summary: add the doc for the new java producer in 0.8.2 Key: KAFKA-1730 URL: https://issues.apache.org/jira/browse/KAFKA-1730 Project: Kafka Issue Type: Sub-task Affects Versions: 0.8.2 Reporter: Jun Rao -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1730) add the doc for the new java producer in 0.8.2
[ https://issues.apache.org/jira/browse/KAFKA-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182974#comment-14182974 ] Jun Rao commented on KAFKA-1730: We can probably just add a link to the java doc of the new producer. add the doc for the new java producer in 0.8.2 -- Key: KAFKA-1730 URL: https://issues.apache.org/jira/browse/KAFKA-1730 Project: Kafka Issue Type: Sub-task Affects Versions: 0.8.2 Reporter: Jun Rao -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-1731) add config/jmx changes in 0.8.2 doc
Jun Rao created KAFKA-1731: -- Summary: add config/jmx changes in 0.8.2 doc Key: KAFKA-1731 URL: https://issues.apache.org/jira/browse/KAFKA-1731 Project: Kafka Issue Type: Sub-task Reporter: Jun Rao -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1728) update 082 docs
[ https://issues.apache.org/jira/browse/KAFKA-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182986#comment-14182986 ] Jun Rao commented on KAFKA-1728: Need to include the doc change in KAFKA-1555. update 082 docs --- Key: KAFKA-1728 URL: https://issues.apache.org/jira/browse/KAFKA-1728 Project: Kafka Issue Type: Task Affects Versions: 0.8.2 Reporter: Jun Rao We need to update the docs for 082 release. https://svn.apache.org/repos/asf/kafka/site/082 http://kafka.apache.org/082/documentation.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] 0.8.2-beta Release Candidate 1
Joe, Thanks. +1 on RC 1 from me. Jun
082 docs
Hi, Everyone, I created KAFKA-1728 to track the doc changes in the 082 release. If anyone wants to help, you can sign up for one of the sub-tasks. Joel Koshy, do you want to update the doc for offset management? Thanks, Jun
Build failed in Jenkins: Kafka-trunk #317
See https://builds.apache.org/job/Kafka-trunk/317/changes

Changes: [neha.narkhede] KAFKA-1725 Configuration file bugs in system tests add noise to output and break a few tests; reviewed by Neha Narkhede

-- [...truncated 1080 lines...]

kafka.admin.AdminTest > testPartitionReassignmentNonOverlappingReplicas FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
        at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
        at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
        at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

The identical java.net.BindException and stack trace also fail kafka.admin.AdminTest > testReassigningNonExistingPartition, testResumePartitionReassignmentThatWasCompleted, testPreferredReplicaJsonData, testBasicPreferredReplicaElection, testShutdownBroker, and testTopicConfigChange (log truncated mid-trace).
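"Address already in use" failures like these usually mean the test harness binds a fixed port that a leftover process or a parallel build still holds. A common mitigation — a general sketch, not what ZooKeeperTestHarness did at the time — is to bind port 0 and let the OS assign a free ephemeral port:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// General sketch: binding to port 0 asks the OS for a free ephemeral port,
// so parallel or leftover test processes cannot collide on a hard-coded port.
public class EphemeralPortDemo {
    static int pickFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();  // the port the OS chose
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int port = pickFreePort();
        System.out.println("OS-assigned port: " + port);
        // Handing this port to the embedded ZooKeeper would avoid BindException.
    }
}
```

There is still a small window between closing the probe socket and the server binding the port, but in practice this is far more robust than a hard-coded port shared across builds.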
[jira] [Updated] (KAFKA-1725) Configuration file bugs in system tests add noise to output and break a few tests
[ https://issues.apache.org/jira/browse/KAFKA-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neha Narkhede updated KAFKA-1725: - Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for fixing the system tests! Pushed to trunk and 0.8.2 Configuration file bugs in system tests add noise to output and break a few tests - Key: KAFKA-1725 URL: https://issues.apache.org/jira/browse/KAFKA-1725 Project: Kafka Issue Type: Bug Components: tools Reporter: Ewen Cheslack-Postava Assignee: Ewen Cheslack-Postava Priority: Minor Attachments: KAFKA-1725.patch There are some broken and misnamed system test configuration files (testcase_*_properties.json) that are causing a bunch of exceptions when running system tests and make it a lot harder to parse the output. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (KAFKA-1725) Configuration file bugs in system tests add noise to output and break a few tests
[ https://issues.apache.org/jira/browse/KAFKA-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neha Narkhede closed KAFKA-1725. Configuration file bugs in system tests add noise to output and break a few tests - Key: KAFKA-1725 URL: https://issues.apache.org/jira/browse/KAFKA-1725 Project: Kafka Issue Type: Bug Components: tools Reporter: Ewen Cheslack-Postava Assignee: Ewen Cheslack-Postava Priority: Minor Attachments: KAFKA-1725.patch There are some broken and misnamed system test configuration files (testcase_*_properties.json) that are causing a bunch of exceptions when running system tests and make it a lot harder to parse the output. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] 0.8.2-beta Release Candidate 1
+1 (non-official community vote). Kicked the tires of the binary release. Works out of the box as expected, new producer included.
[jira] [Updated] (KAFKA-1719) Make mirror maker exit when one consumer/producer thread exits.
[ https://issues.apache.org/jira/browse/KAFKA-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neha Narkhede updated KAFKA-1719: - Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the patches. Pushed to trunk and 0.8.2 Make mirror maker exit when one consumer/producer thread exits. --- Key: KAFKA-1719 URL: https://issues.apache.org/jira/browse/KAFKA-1719 Project: Kafka Issue Type: Improvement Reporter: Jiangjie Qin Assignee: Jiangjie Qin Attachments: KAFKA-1719.patch, KAFKA-1719_2014-10-22_15:04:32.patch, KAFKA-1719_2014-10-23_16:20:22.patch, KAFKA-1719_2014-10-24_00:56:06.patch When one of the consumer/producer thread exits, the entire mirror maker will be blocked. In this case, it is better to make it exit. It seems a single ZookeeperConsumerConnector is sufficient for mirror maker, probably we don't need a list for the connectors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1688) Add authorization interface and naive implementation
[ https://issues.apache.org/jira/browse/KAFKA-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183061#comment-14183061 ] Michael Herstine commented on KAFKA-1688: - Apologies-- yes, I should have said Principal. Add authorization interface and naive implementation Key: KAFKA-1688 URL: https://issues.apache.org/jira/browse/KAFKA-1688 Project: Kafka Issue Type: Sub-task Components: security Reporter: Jay Kreps Assignee: Sriharsha Chintalapani Add a PermissionManager interface as described here: https://cwiki.apache.org/confluence/display/KAFKA/Security (possibly there is a better name?) Implement calls to the PermissionsManager in KafkaApis for the main requests (FetchRequest, ProduceRequest, etc). We will need to add a new error code and exception to the protocol to indicate permission denied. Add a server configuration to give the class you want to instantiate that implements that interface. That class can define its own configuration properties from the main config file. Provide a simple implementation of this interface which just takes a user and ip whitelist and permits those in either of the whitelists to do anything, and denies all others. Rather than writing an integration test for this class we can probably just use this class for the TLS and SASL authentication testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] 0.8.2-beta Release Candidate 1
+1 (binding) Verified the quickstart, docs, unit tests on the source and binary release. Thanks, Neha
Build failed in Jenkins: Kafka-trunk #318
See https://builds.apache.org/job/Kafka-trunk/318/changes Changes: [neha.narkhede] KAFKA-1719 Make mirror maker exit when one consumer/producer thread exits; reviewed by Neha Narkhede, Joel Koshy and Guozhang Wang -- [...truncated 1284 lines...]

The following kafka.admin.AdminTest cases FAILED, each with the same java.net.BindException (Address already in use) raised while setting up the embedded ZooKeeper:

testPartitionReassignmentNonOverlappingReplicas, testReassigningNonExistingPartition, testResumePartitionReassignmentThatWasCompleted, testPreferredReplicaJsonData, testBasicPreferredReplicaElection, testShutdownBroker, testTopicConfigChange

Representative stack trace:

java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
    at kafka.zk.EmbeddedZookeeper.init(EmbeddedZookeeper.scala:33)
    at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
    at kafka.admin.AdminTest.setUp(AdminTest.scala:33)
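The repeated BindException failures come from the test harness trying to bind a port that another test run (or a parallel build on the same Jenkins box) still holds. One common remedy — sketched here as a hypothetical helper, not code from the Kafka test harness — is to bind port 0 and let the OS hand out a free ephemeral port:

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper (not part of the Kafka test harness): binding port 0
// asks the OS for a free ephemeral port, so concurrent test JVMs on a shared
// build machine do not collide on a hard-coded port.
class FreePort {
    static int choose() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            // The port is released on close; the caller can bind it right after.
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("embedded server could bind to port " + choose());
    }
}
```

The trade-off is a small race between closing the probe socket and the test server binding the port, which in practice is far rarer than collisions on a fixed port.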
Re: [VOTE] 0.8.2-beta Release Candidate 1
+1 New Sender is Added ~ 2014-10-25 1:18 GMT+08:00 Neha Narkhede neha.narkh...@gmail.com: +1 (binding) Verified the quickstart, docs, unit tests on the source and binary release. Thanks, Neha On Fri, Oct 24, 2014 at 9:26 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 (non-official community vote). Kicked the tires of the binary release. Works out of the box as expected, new producer included. On Fri, Oct 24, 2014 at 5:22 AM, Joe Stein joe.st...@stealth.ly wrote: Jun, I updated the https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/ with the contents of kafka-clients-0.8.2-beta-javadoc.jar There weren't any artifact changes so I don't think we need a new release candidate ... we can extend the vote to Monday if we don't get a pass/fail by 2pm PT today. /*** Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop / On Fri, Oct 24, 2014 at 12:04 AM, Jun Rao jun...@gmail.com wrote: Joe, Verified quickstart on both the src and binary release. They all look good. The javadoc doesn't seem to include those in clients. Could you add them? Thanks, Jun On Tue, Oct 21, 2014 at 1:58 PM, Joe Stein joe.st...@stealth.ly wrote: This is the first candidate for release of Apache Kafka 0.8.2-beta Release Notes for the 0.8.2-beta release https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/RELEASE_NOTES.html *** Please download, test and vote by Friday, October 24th, 2pm PT Kafka's KEYS file containing PGP keys we use to sign the release: https://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5, sha1 and sha2 (SHA256) checksum. 
* Release artifacts to be voted upon (source and binary): https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/ * Maven artifacts to be voted upon prior to release: https://repository.apache.org/content/groups/staging/ * scala-doc https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/scala-doc/ * java-doc https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/ * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2-beta tag https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2b2c3da2c52bc62a89d60f85125d3723c8410fa0 /*** Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop / -- long is the way and hard that out of Hell leads up to light
Re: 082 docs
Yes I can update the doc for offset management, and review Gwen's doc on min.isr/durability guarantees On Fri, Oct 24, 2014 at 09:10:19AM -0700, Jun Rao wrote: Hi, Everyone, I created KAFKA-1728 to track the doc changes in the 082 release. If anyone wants to help, you can sign up on one of the sub tasks. Joel Koshy, Do you want to update the doc for offset management? Thanks, Jun
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183086#comment-14183086 ] Joel Koshy commented on KAFKA-1555: --- Jun, you can just diff configuration.html, design.html between the files in the two directories. Or you can use a tool like meld. I have a few comments that I will post later after some suggested edits. provide strong consistency with reasonable availability --- Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch In a mission critical application, we expect a Kafka cluster with 3 brokers to satisfy two requirements: 1. When 1 broker is down, no message loss or service blocking happens. 2. In worse cases, such as when two brokers are down, service can be blocked, but no message loss happens. We found that the current kafka version (0.8.1.1) cannot achieve the requirements due to three behaviors: 1. when choosing a new leader from 2 followers in ISR, the one with fewer messages may be chosen as the leader. 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it has fewer messages than the leader. 3. ISR can contain only 1 broker, therefore acknowledged messages may be stored in only 1 broker. The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning, all 3 replicas, leader A, followers B and C, are in sync, i.e., they have the same messages and are all in ISR. According to the value of request.required.acks (acks for short), there are the following cases. 1. acks=0, 1, 3. 
Obviously these settings do not satisfy the requirement. 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in ISR. If A is killed, C can be elected as the new leader, and consumers will miss m. 3. acks=-1. B and C restart and are removed from ISR. Producer sends a message m to A, and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost. In summary, any existing configuration cannot satisfy the requirements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
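The case analysis above can be checked with a toy model of the commit rule — illustrative Java only, not Kafka code, and the class and method names are invented: with acks=-1 a write is acknowledged by the full current ISR, and the min.insync.replicas bound added for this ticket rejects the write once the ISR shrinks below the minimum, closing the acks=-1 loss case:

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the leader's decision for an acks=-1 produce request.
class IsrModel {
    private final Set<String> isr = new HashSet<>();
    private final int minInsyncReplicas;

    IsrModel(int minInsyncReplicas, String... replicas) {
        this.minInsyncReplicas = minInsyncReplicas;
        for (String r : replicas) isr.add(r);
    }

    // acks=-1: every current in-sync replica must have the message, and
    // min.insync.replicas puts a floor on how small "every" may get.
    boolean acksAllWouldSucceed() {
        return isr.size() >= minInsyncReplicas;
    }

    void replicaFalls(String replica) { isr.remove(replica); }
}
```

With replicas A, B, C and a minimum of 2: after B and C drop out of the ISR, the produce request is rejected instead of being acknowledged by A alone — exactly the acks=-1 loss scenario in the description.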
[jira] [Commented] (KAFKA-1683) Implement a session concept in the socket server
[ https://issues.apache.org/jira/browse/KAFKA-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183089#comment-14183089 ] Gwen Shapira commented on KAFKA-1683: - Reviewed Zookeeper SASL code to get a better handle on where we are heading. Since our high-level design is inspired by ZK, I encourage everyone involved to take a look too. One thing it clarified for me - SASL does not (necessarily) maintain its own information on the socket. We need to attach something extra that will be able to retrieve the authenticated identity and provide some of the security-protocol-specific implementation. In ZK, this something is a ServerCnxn instance that gets attached to a SelectionKey (nice trick that I was unaware of) and optionally references a ZooKeeperSaslServer. In Ivan's patch for KAFKA-1684, we instead extend SocketChannel with a protocol-specific wrapper and use this to maintain authentication state. As far as I can see, both solutions are valid, and both allow us to attach authentication information to a socket/channel and maintain it there - with the benefit that it can easily match the socket lifecycle. I need to have one of these exist for the no-authentication case for this patch. I'm going to go with a Cnxn instance attached to keys and not the SocketChannel extension, simply because it's less code to merge into KAFKA-1684 later :) For the same reason, the Cnxn will be a clear mock - i.e. it will not contain any of the functionality we actually need from a security-protocol-specific object except authId getters and setters. If we decide to go with Cnxn objects, the functionality we actually need (handshakes, wrapping) will get implemented in follow-up JIRAs, and if we decide to keep the SocketChannels, this will go away anyway. 
Implement a session concept in the socket server -- Key: KAFKA-1683 URL: https://issues.apache.org/jira/browse/KAFKA-1683 Project: Kafka Issue Type: Sub-task Components: security Affects Versions: 0.9.0 Reporter: Jay Kreps Assignee: Gwen Shapira To implement authentication we need a way to keep track of some things between requests. The initial use for this would be remembering the authenticated user/principle info, but likely more uses would come up (for example we will also need to remember whether and which encryption or integrity measures are in place on the socket so we can wrap and unwrap writes and reads). I was thinking we could just add a Session object that might have a user field. The session object would need to get added to RequestChannel.Request so it is passed down to the API layer with each request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
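The Session object Jay sketches is essentially a small value object attached to each request on its way to the API layer. A minimal illustration — hypothetical Java names only; the real change would live in the Scala RequestChannel in core:

```java
// Hypothetical sketch of the proposed Session carried with each request.
class Session {
    private final String principal; // authenticated user, e.g. "ANONYMOUS" before auth

    Session(String principal) { this.principal = principal; }

    String principal() { return principal; }
}

// Stand-in for RequestChannel.Request: the session rides along with the
// request payload so the API layer can authorize against the principal.
class Request {
    final Session session;
    final byte[] payload;

    Request(Session session, byte[] payload) {
        this.session = session;
        this.payload = payload;
    }
}
```

Later needs mentioned in the issue (remembering negotiated encryption or integrity settings for wrap/unwrap) would become additional fields on the same object, which is why a dedicated Session type beats a bare user string.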
Re: [VOTE] 0.8.2-beta Release Candidate 1
What's the plan for moving from beta to stable? Time based? Or are there any patches that need to go in? On Tue, Oct 21, 2014 at 1:58 PM, Joe Stein joe.st...@stealth.ly wrote: [...quoted release-candidate announcement truncated; see the original [VOTE] message above...]
Re: 082 docs
I'll upload a clean patch today to make the review easier. On Fri, Oct 24, 2014 at 10:32 AM, Joel Koshy jjkosh...@gmail.com wrote: Yes I can update the doc for offset management, and review Gwen's doc on min.isr/durability guarantees On Fri, Oct 24, 2014 at 09:10:19AM -0700, Jun Rao wrote: Hi, Everyone, I created KAFKA-1728 to track the doc changes in the 082 release. If anyone wants to help, you can sign up on one of the sub tasks. Joel Koshy, Do you want to update the doc for offset management? Thanks, Jun
Re: [VOTE] 0.8.2-beta Release Candidate 1
Gwen, Production-grade testing is the plan for moving from beta to stable. It will be great if companies pick up the beta, deploy it, use it, and report bugs. LinkedIn will be helping do this within a 4-5 week timeframe, if possible. Thanks, Neha On Fri, Oct 24, 2014 at 10:36 AM, Gwen Shapira gshap...@cloudera.com wrote: What's the plan for moving from beta to stable? Time based? Or are there any patches that need to go in? On Tue, Oct 21, 2014 at 1:58 PM, Joe Stein joe.st...@stealth.ly wrote: [...quoted release-candidate announcement truncated; see the original [VOTE] message above...]
Re: [VOTE] 0.8.2-beta Release Candidate 1
So the criteria is: At least one company doing a serious testing cycle on the beta and possibly running it in production for a few weeks? On Fri, Oct 24, 2014 at 10:43 AM, Neha Narkhede neha.narkh...@gmail.com wrote: Gwen, Production grade testing is the plan for moving from beta to stable. It will be great if companies that pick up beta, deploy it, use it and report bugs. LinkedIn will be helping do this within a 4-5 weeks timeframe, if possible. Thanks, Neha On Fri, Oct 24, 2014 at 10:36 AM, Gwen Shapira gshap...@cloudera.com wrote: [...earlier quoted thread truncated...]
[jira] [Updated] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gwen Shapira updated KAFKA-1555: Attachment: KAFKA-1555-DOCS.1.patch Attaching a clean version of the docs - just the diffs from 082.
[jira] [Comment Edited] (KAFKA-1710) [New Java Producer Potential Deadlock] Producer Deadlock when all messages is being sent to single partition
[ https://issues.apache.org/jira/browse/KAFKA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14179014#comment-14179014 ] Bhavesh Mistry edited comment on KAFKA-1710 at 10/24/14 6:21 PM: - [~jkreps], I am sorry I did not get back to you sooner. Enqueuing all messages into a single partition reduces enqueue throughput by ~54% compared to round-robin across partitions (tested with 32 partitions on a single topic and a 3-broker cluster). The throughput measures only the cost of putting data into the buffer, not the cost of sending data to the brokers. Here is the test I have done: To a *single* partition: Throughput per Thread=2666.5 byte(s)/millisecond All done...! To *all* partitions: Throughput per Thread=5818.181818181818 byte(s)/millisecond All done...! Here is the test program for this: {code}
package org.kafka.test;

import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class TestNetworkDownProducer {

    static int numberTh = 75;
    static CountDownLatch latch = new CountDownLatch(numberTh);

    public static void main(String[] args) throws IOException, InterruptedException {
        Properties prop = new Properties();
        InputStream propFile = Thread.currentThread().getContextClassLoader()
                .getResourceAsStream("kafkaproducer.properties");
        String topic = "logmon.test";
        prop.load(propFile);
        System.out.println("Property: " + prop.toString());

        StringBuilder builder = new StringBuilder(1024);
        int msgLenth = 256;
        int numberOfLoop = 5000;
        for (int i = 0; i < msgLenth; i++)
            builder.append("a");

        int numberOfProducer = 1;
        Producer[] producer = new Producer[numberOfProducer];
        for (int i = 0; i < producer.length; i++) {
            producer[i] = new KafkaProducer(prop);
        }

        ExecutorService service = new ThreadPoolExecutor(numberTh, numberTh,
                0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(numberTh * 2));
        MyProducer[] producerThResult = new MyProducer[numberTh];
        for (int i = 0; i < numberTh; i++) {
            producerThResult[i] = new MyProducer(producer, numberOfLoop, builder.toString(), topic);
            service.execute(producerThResult[i]);
        }
        latch.await();
        for (int i = 0; i < producer.length; i++) {
            producer[i].close();
        }
        service.shutdownNow();
        System.out.println("All Producers done...!");

        // now interpret the result: find the slowest thread
        long lowestTime = 0;
        for (int i = 0; i < producerThResult.length; i++) {
            if (i == 0) {  // seed with the first thread's time
                lowestTime = producerThResult[i].totalTimeinNano;
            } else if (producerThResult[i].totalTimeinNano < lowestTime) {
                lowestTime = producerThResult[i].totalTimeinNano;
            }
        }
        long bytesSend = msgLenth * numberOfLoop;
        long durationInMs = TimeUnit.MILLISECONDS.convert(lowestTime, TimeUnit.NANOSECONDS);
        double throughput = (bytesSend * 1.0) / (durationInMs);
        System.out.println("Throughput per Thread=" + throughput + " byte(s)/millisecond");
        System.out.println("All done...!");
    }

    static class MyProducer implements Callable<Long>, Runnable {
        Producer[] producer;
        long maxloops;
        String msg;
        String topic;
        long totalTimeinNano = 0;

        MyProducer(Producer[] list, long maxloops, String msg, String topic) {
            this.producer = list;
            this.maxloops = maxloops;
            this.msg = msg;
            this.topic = topic;
        }

        public void run() {
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183304#comment-14183304 ] Joel Koshy commented on KAFKA-1555: --- Looks good - I just have a few minor edits to what you wrote. configuration.html _Minor edits: it would be good if we could also mention NotEnoughReplicasAfterAppend and document it_ min.insync.replicas: The minimum number of replicas that are required to declare a message as committed. If the number of in-sync replicas drops below this threshold, then writing messages with request.required.acks set to -1 will return a NotEnoughReplicas or NotEnoughReplicasAfterAppend error code. This is used to provide enhanced durability guarantees - i.e., all in-sync replicas need to acknowledge the message AND there needs to be at least this many replicas in the set of in-sync replicas. ops.html: log.cleanup.interval.mins=30 -> log.retention.check.interval.ms=30 design.html A couple of comments: * all (or -1) brokers - maybe make it clear up front that this is all current in-sync replicas, and later clarify that consistency can be preferred over availability via the min.isr property * bq. a message that was acked will not be lost as long as at least one in sync replica remains ** The above should probably be clarified a bit. i.e., availability of a replica affects whether a message will be lost or not only during the time it is yet to be replicated to all assigned replicas. 
* It would be useful to describe how min.isr helps facilitate trading off consistency vs availability * There are a couple of typos in various places
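On the producer side, the contract Joel describes means an acks=-1 send can fail with a retriable not-enough-replicas error while the ISR is short. A minimal sketch of bounded retry around such a send — the exception class here is a stand-in defined locally for illustration (in the real 0.8.2 client the failure surfaces as its own error code/exception type), and the send itself is abstracted as a supplier:

```java
import java.util.function.Supplier;

// Stand-in for the client-side error; defined locally for this sketch.
class NotEnoughReplicasException extends RuntimeException {}

class DurableSend {
    // Bounded retry around a send that may fail while the ISR is below
    // min.insync.replicas; returns false if the ISR never recovers in time.
    static boolean sendWithRetry(Supplier<Boolean> send, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return send.get();
            } catch (NotEnoughReplicasException e) {
                // ISR too small right now; back off (omitted) and retry.
            }
        }
        return false;
    }
}
```

The point of the min.isr design is visible here: the producer gets an explicit, retriable failure instead of a silent acknowledgement by a lone replica.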
RE: [VOTE] 0.8.2-beta Release Candidate 1
We are testing 0.8.2 branch now in dev-int. Balaji -Original Message- From: Neha Narkhede [mailto:neha.narkh...@gmail.com] Sent: Friday, October 24, 2014 12:06 PM To: dev@kafka.apache.org Subject: Re: [VOTE] 0.8.2-beta Release Candidate 1 So the criteria is: At least one company doing a serious testing cycle on the beta and possibly running it in production for few weeks? That's right. On Fri, Oct 24, 2014 at 10:54 AM, Harsha ka...@harsha.io wrote: +1 (non-binding) On Fri, Oct 24, 2014, at 10:49 AM, Gwen Shapira wrote: [...earlier quoted thread truncated; see the original [VOTE] message above...]
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183382#comment-14183382 ] Kyle Banker commented on KAFKA-1555: The new durability docs look great so far. Here are a couple more proposed edits to the patch: EDIT #1: min.insync.replicas: When a producer sets request.required.acks to -1, min.insync.replicas specifies the minimum number of replicas that must acknowledge a write for the write to be considered successful. If this minimum cannot be met, then the producer will raise an exception (either NotEnoughReplicas or NotEnoughReplicasAfterAppend). When used together, min.insync.replicas and request.required.acks allow you to enforce greater durability guarantees. A typical scenario would be to create a topic with a replication factor of 3, set min.insync.replicas to 2, and produce with request.required.acks of -1. This will ensure that the producer raises an exception if a majority of replicas do not receive a write. EDIT #2 (for request.required.acks): -1. The producer gets an acknowledgement after all in-sync replicas have received the data. This option provides the greatest level of durability. However, it does not completely eliminate the risk of message loss because the number of in-sync replicas may, in rare cases, shrink to 1. If you want to ensure that some minimum number of replicas (typically a majority) receive a write, then you must set the topic-level min.insync.replicas setting. Please read the Replication section of the design documentation for a more in-depth discussion. 
[jira] [Comment Edited] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183382#comment-14183382 ] Kyle Banker edited comment on KAFKA-1555 at 10/24/14 8:03 PM:
--
The new durability docs look great so far. Here are a couple more proposed edits to the patch:

EDIT #1 (min.insync.replicas): When a producer sets request.required.acks to -1, min.insync.replicas specifies the minimum number of replicas that must acknowledge a write for the write to be considered successful. If this minimum cannot be met, then the producer will raise an exception (either NotEnoughReplicas or NotEnoughReplicasAfterAppend). When used together, min.insync.replicas and request.required.acks allow you to enforce greater durability guarantees. A typical scenario would be to create a topic with a replication factor of 3, set min.insync.replicas to 2, and produce with request.required.acks of -1. This ensures that the producer raises an exception if a majority of replicas do not receive a write.

EDIT #2 (for request.required.acks): -1: The producer gets an acknowledgement after all in-sync replicas have received the data. This option provides the greatest level of durability. However, it does not completely eliminate the risk of message loss, because the number of in-sync replicas may, in rare cases, shrink to 1. If you want to ensure that some minimum number of replicas (typically a majority) receive a write, then you must set the topic-level min.insync.replicas setting. Please read the Replication section of the design documentation for a more in-depth discussion.

provide strong consistency with reasonable availability
---
Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch

In a mission critical application, we expect a kafka cluster with 3 brokers to satisfy two requirements: 1. When 1 broker is down, no message loss or service blocking happens. 2. In worse cases, such as two brokers being down, service can be blocked, but no message loss happens.
We found that the current kafka version (0.8.1.1) cannot achieve these requirements due to three behaviors: 1. when choosing a new leader from 2 followers in ISR, the one with fewer messages may be chosen as the leader. 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it has fewer messages than the leader. 3. the ISR can contain only 1 broker, therefore acknowledged messages may be stored in only 1 broker.
The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning, all 3 replicas, leader A, followers B and C, are in sync, i.e., they have the same messages and are all in ISR. According to the value of request.required.acks (acks for short), there are the following cases. 1. acks=0, 1, 3. Obviously
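The typical scenario from EDIT #1 (replication factor 3, min.insync.replicas=2, produce with request.required.acks of -1) can be sketched as producer configuration. This is an illustrative sketch only, not part of the patch: the broker list, topic name, and topic-creation command are placeholder assumptions.

```java
import java.util.Properties;

// Producer side of the durable setup described above. The topic itself
// would be created (placeholder names) with something like:
//   bin/kafka-topics.sh --create --zookeeper zk:2181 --topic events \
//       --partitions 8 --replication-factor 3 --config min.insync.replicas=2
public class DurableProducerProps {
    public static Properties build() {
        Properties props = new Properties();
        // Placeholder broker list.
        props.put("metadata.broker.list", "broker1:9092,broker2:9092,broker3:9092");
        // -1: wait for all in-sync replicas. Combined with the topic-level
        // min.insync.replicas=2, the broker fails the request with
        // NotEnoughReplicas if the ISR shrinks below 2.
        props.put("request.required.acks", "-1");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build().getProperty("request.required.acks")); // prints "-1"
    }
}
```

Note that min.insync.replicas is a topic-level (broker-side) setting; the producer only chooses how many acknowledgements to wait for.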
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183393#comment-14183393 ] Kyle Banker commented on KAFKA-1555:
--
I like the section on Availability and Durability Guarantees, but I believe that, in addition, it would be useful to suggest 3 or 4 typical durability configurations and the trade-offs provided by each one. As of now, users still have to infer from the docs the ideal settings for all of the following: topic replication factor, min.insync.replicas, request.required.acks, and whether or not to disable unclean leader election. I'd be happy to write up a draft of these scenarios as I understand them if folks think this would be a good idea.

provide strong consistency with reasonable availability
---
Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch

In a mission critical application, we expect a kafka cluster with 3 brokers to satisfy two requirements: 1. When 1 broker is down, no message loss or service blocking happens. 2. In worse cases, such as two brokers being down, service can be blocked, but no message loss happens.
We found that the current kafka version (0.8.1.1) cannot achieve these requirements due to three behaviors: 1. when choosing a new leader from 2 followers in ISR, the one with fewer messages may be chosen as the leader. 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it has fewer messages than the leader. 3. the ISR can contain only 1 broker, therefore acknowledged messages may be stored in only 1 broker.
The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning, all 3 replicas, leader A, followers B and C, are in sync, i.e., they have the same messages and are all in ISR. According to the value of request.required.acks (acks for short), there are the following cases. 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement. 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in ISR. If A is killed, C can be elected as the new leader, and consumers will miss m. 3. acks=-1. B and C restart and are removed from ISR. Producer sends a message m to A, and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost. In summary, no existing configuration can satisfy the requirements.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
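Case 2 of the proof (acks=2 with a lagging follower left in ISR) can be replayed as a toy model. This is not Kafka code, just replica logs as sets, to make the loss scenario concrete:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of case 2 above: a write acked by leader A and follower B
// satisfies acks=2 while lagging follower C stays in ISR; electing C
// after A dies loses the acknowledged message.
public class AcksTwoLoss {
    public static boolean simulate() {
        Map<String, Set<String>> log = new HashMap<>();
        for (String r : new String[]{"A", "B", "C"}) log.put(r, new HashSet<>());
        Set<String> isr = new HashSet<>(Arrays.asList("A", "B", "C"));

        String m = "m";
        log.get("A").add(m);   // leader append
        log.get("B").add(m);   // one follower ack: 2 acks total, send() succeeds
        // C has not replicated m, but nothing has removed it from the ISR.

        isr.remove("A");       // leader A is killed
        String newLeader = "C"; // a legal choice: C is still in the ISR
        return isr.contains(newLeader) && !log.get(newLeader).contains(m);
    }

    public static void main(String[] args) {
        System.out.println(simulate() ? "acked message m is lost" : "m survives");
    }
}
```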
[jira] [Updated] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gwen Shapira updated KAFKA-1555:
Attachment: KAFKA-1555-DOCS.2.patch
--
Uploading new docs, containing edits by [~kbanker] and [~jjkoshy].

provide strong consistency with reasonable availability
---
Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, KAFKA-1555-DOCS.2.patch, KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183411#comment-14183411 ] Gwen Shapira commented on KAFKA-1555:
--
[~jjkoshy]:
* I did not make the suggested changes to ops.html since it's not related to this JIRA and this line was not modified in this patch. (Actually, the line you refer to doesn't seem to exist in my copy of the docs...)
* "The above should probably be clarified a bit. i.e., availability of a replica affects whether a message will be lost or not only during the time it is yet to be replicated to all assigned replicas." I've re-phrased the sentence per Kyle's suggestion. Let me know if it works now.
[jira] [Commented] (KAFKA-1710) [New Java Producer Potential Deadlock] Producer Deadlock when all messages is being sent to single partition
[ https://issues.apache.org/jira/browse/KAFKA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183514#comment-14183514 ] Jay Kreps commented on KAFKA-1710:
--
Ah, gotcha, so that was per ms, not per us. Question: was this using compression? If so, which compression type (gzip, snappy, etc.)?

[New Java Producer Potential Deadlock] Producer Deadlock when all messages is being sent to single partition
---
Key: KAFKA-1710 URL: https://issues.apache.org/jira/browse/KAFKA-1710 Project: Kafka Issue Type: Bug Components: producer Environment: Development Reporter: Bhavesh Mistry Assignee: Ewen Cheslack-Postava Priority: Critical Labels: performance Attachments: Screen Shot 2014-10-13 at 10.19.04 AM.png, Screen Shot 2014-10-15 at 9.09.06 PM.png, Screen Shot 2014-10-15 at 9.14.15 PM.png, TestNetworkDownProducer.java, th1.dump, th10.dump, th11.dump, th12.dump, th13.dump, th14.dump, th15.dump, th2.dump, th3.dump, th4.dump, th5.dump, th6.dump, th7.dump, th8.dump, th9.dump

Hi Kafka Dev Team, When I run the test sending messages to a single partition for 3 minutes or so, I encounter a deadlock (please see the attached screenshot) and thread contention in YourKit profiling.
Use Case: 1) Aggregating messages into the same partition for metric counting. 2) Replicating the old Producer behavior of sticking to a partition for 3 minutes.
Here is the output: Frozen threads found (potential deadlock). It seems that the following threads have not changed their stack for more than 10 seconds. These threads are possibly (but not necessarily!) in a deadlock or hung.

pool-1-thread-128 --- Frozen for at least 2m
org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition, byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, Callback) KafkaProducer.java:237
org.kafka.test.TestNetworkDownProducer$MyProducer.run() TestNetworkDownProducer.java:84
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1145
java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:615
java.lang.Thread.run() Thread.java:744

pool-1-thread-159 --- Frozen for at least 2m 1 sec
org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition, byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, Callback) KafkaProducer.java:237
org.kafka.test.TestNetworkDownProducer$MyProducer.run() TestNetworkDownProducer.java:84
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1145
java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:615
java.lang.Thread.run() Thread.java:744

pool-1-thread-55 --- Frozen for at least 2m
org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition, byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, Callback) KafkaProducer.java:237
org.kafka.test.TestNetworkDownProducer$MyProducer.run() TestNetworkDownProducer.java:84
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1145
java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:615
java.lang.Thread.run() Thread.java:744

Thanks, Bhavesh
[jira] [Commented] (KAFKA-1710) [New Java Producer Potential Deadlock] Producer Deadlock when all messages is being sent to single partition
[ https://issues.apache.org/jira/browse/KAFKA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183545#comment-14183545 ] Jay Kreps commented on KAFKA-1710:
--
The reason I ask is that the per-partition lock is held for the duration of the write to the buffer. When compression is enabled, that write takes longer because the compression occurs as part of it. So with two partitions you are effectively getting two cpu cores for compression, and with only one partition you get just one.
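Jay's point, that compression runs while the per-partition lock is held, can be sketched with plain java.util.zip. This is an illustration of the locking pattern, not Kafka's actual RecordAccumulator code:

```java
import java.util.zip.Deflater;

// Sketch: each "partition" has one append lock, and the CPU-heavy
// compression happens inside the lock. With a single partition, all
// producer threads serialize behind this one lock; with N partitions,
// up to N compressions can proceed in parallel.
public class PartitionLockSketch {
    private final Object lock = new Object();
    private final Deflater deflater = new Deflater();
    private final byte[] out = new byte[1 << 16];

    public int append(byte[] payload) {
        synchronized (lock) {               // per-partition lock
            deflater.reset();
            deflater.setInput(payload);
            deflater.finish();
            return deflater.deflate(out);   // compression inside the lock
        }
    }

    public static void main(String[] args) {
        PartitionLockSketch partition = new PartitionLockSketch();
        int n = partition.append("hello hello hello hello".getBytes());
        System.out.println("compressed bytes: " + n);
    }
}
```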
[jira] [Commented] (KAFKA-1710) [New Java Producer Potential Deadlock] Producer Deadlock when all messages is being sent to single partition
[ https://issues.apache.org/jira/browse/KAFKA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183648#comment-14183648 ] Bhavesh Mistry commented on KAFKA-1710:
--
[~jkreps], Yes, I did this test with 75 threads on my Mac Pro with 8 cores, with Snappy compression ON. Do you have any idea how we can improve this enqueue for a single partition? Maybe have x # of active buffers, one per CPU? Here is info about the box:
{code}
machdep.cpu.max_basic: 13
machdep.cpu.max_ext: 2147483656
machdep.cpu.vendor: GenuineIntel
machdep.cpu.brand_string: Intel(R) Core(TM) i7-3840QM CPU @ 2.80GHz
machdep.cpu.family: 6
machdep.cpu.model: 58
machdep.cpu.extmodel: 3
machdep.cpu.extfamily: 0
machdep.cpu.stepping: 9
machdep.cpu.feature_bits: 3219913727 2142954495
machdep.cpu.leaf7_feature_bits: 641
machdep.cpu.extfeature_bits: 672139520 1
machdep.cpu.signature: 198313
machdep.cpu.brand: 0
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC POPCNT AES PCID XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C
machdep.cpu.leaf7_features: SMEP ENFSTRG RDWRFSGS
machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF RDTSCP TSCI
machdep.cpu.logical_per_package: 16
machdep.cpu.cores_per_package: 8
{code}
[jira] [Commented] (KAFKA-1501) transient unit tests failures due to port already in use
[ https://issues.apache.org/jira/browse/KAFKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183659#comment-14183659 ] Guozhang Wang commented on KAFKA-1501:
--
Thanks Chris, I also reached the same conclusion in my tests. I need some more time to tackle the problem.

transient unit tests failures due to port already in use
---
Key: KAFKA-1501 URL: https://issues.apache.org/jira/browse/KAFKA-1501 Project: Kafka Issue Type: Improvement Components: core Reporter: Jun Rao Assignee: Guozhang Wang Labels: newbie Attachments: KAFKA-1501.patch

Saw the following transient failures.
kafka.api.ProducerFailureHandlingTest testTooLargeRecordWithAckOne FAILED
kafka.common.KafkaException: Socket server failed to bind to localhost:59909: Address already in use.
at kafka.network.Acceptor.openServerSocket(SocketServer.scala:195)
at kafka.network.Acceptor.init(SocketServer.scala:141)
at kafka.network.SocketServer.startup(SocketServer.scala:68)
at kafka.server.KafkaServer.startup(KafkaServer.scala:95)
at kafka.utils.TestUtils$.createServer(TestUtils.scala:123)
at kafka.api.ProducerFailureHandlingTest.setUp(ProducerFailureHandlingTest.scala:68)
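A common remedy for this class of test flakiness (whether or not the eventual KAFKA-1501 fix takes this exact route is not shown in this thread) is to bind to port 0 so the OS assigns a free ephemeral port, instead of picking a port number up front that another test may still hold:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Ask the OS for a currently-free ephemeral port by binding to port 0.
// Note: there is still a small race window between closing this probe
// socket and the server rebinding the port; binding the real server
// socket to port 0 directly avoids even that.
public class EphemeralPort {
    public static int freePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("free port: " + freePort());
    }
}
```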
High Level Consumer Iterator IllegalStateException Issue
Hi Kafka Community, I am using kafka trunk source code and I get the following exception. What could cause the iterator to be in the FAILED state? Please let me know how I can fix this issue.

java.lang.IllegalStateException: Iterator is in failed state
at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:54)

Here are the consumer Properties:

Properties props = new Properties();
props.put("zookeeper.connect", zkConnect);
props.put("group.id", groupId);
props.put("consumer.timeout.ms", "-1");
props.put("zookeeper.session.timeout.ms", "1");
props.put("zookeeper.sync.time.ms", "6000");
props.put("auto.commit.interval.ms", "2000");
props.put("rebalance.max.retries", "8");
props.put("auto.offset.reset", "largest");
props.put("fetch.message.max.bytes", "2097152");
props.put("socket.receive.buffer.bytes", "2097152");
props.put("auto.commit.enable", "true");

Thanks, Bhavesh
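The "Iterator is in failed state" message comes from a template-style iterator state machine: once the element-fetching step throws, the iterator is marked FAILED and every later call fails. The following is a simplified sketch of that pattern (not Kafka's actual IteratorTemplate source), showing how a single exception during fetching poisons all subsequent hasNext() calls:

```java
import java.util.NoSuchElementException;

// Simplified state-machine iterator: makeNext() computes the next
// element; if it throws (e.g. a fetch/timeout error in the consumer),
// the iterator enters FAILED and stays there.
abstract class IteratorTemplateSketch<T> {
    private enum State { READY, NOT_READY, DONE, FAILED }
    private State state = State.NOT_READY;
    private T nextItem;

    protected abstract T makeNext();                  // may throw
    protected T allDone() { state = State.DONE; return null; }

    public boolean hasNext() {
        if (state == State.FAILED)
            throw new IllegalStateException("Iterator is in failed state");
        if (state == State.DONE) return false;
        if (state == State.READY) return true;
        try {
            nextItem = makeNext();
        } catch (RuntimeException e) {
            state = State.FAILED;                     // poisoned from now on
            throw e;
        }
        if (state == State.DONE) return false;
        state = State.READY;
        return true;
    }

    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        state = State.NOT_READY;
        return nextItem;
    }
}

public class FailedIteratorDemo {
    public static void main(String[] args) {
        IteratorTemplateSketch<String> it = new IteratorTemplateSketch<String>() {
            protected String makeNext() { throw new RuntimeException("fetch failed"); }
        };
        try { it.hasNext(); } catch (RuntimeException ignored) { }   // first failure
        try { it.hasNext(); }                                        // now permanently failed
        catch (IllegalStateException e) { System.out.println(e.getMessage()); }
        // prints "Iterator is in failed state"
    }
}
```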
[jira] [Commented] (KAFKA-1710) [New Java Producer Potential Deadlock] Producer Deadlock when all messages is being sent to single partition
[ https://issues.apache.org/jira/browse/KAFKA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183775#comment-14183775 ] Jay Kreps commented on KAFKA-1710: -- You could try some profiling and see if you see any implementation bottlenecks. I don't think we can fundamentally reengineer this piece or move the compression outside the lock. The reason being that you have multiple threads that want to write to a shared byte array. We need to synchronized access to ensure safety (otherwise they would overwrite each others data). Furthermore since this is batch compression we are compressing into the destination array using a compressor used for the prior messages. This batch compression is very important to get a good compression ratio as it allows redundancy between messages to be exploited. [New Java Producer Potential Deadlock] Producer Deadlock when all messages is being sent to single partition Key: KAFKA-1710 URL: https://issues.apache.org/jira/browse/KAFKA-1710 Project: Kafka Issue Type: Bug Components: producer Environment: Development Reporter: Bhavesh Mistry Assignee: Ewen Cheslack-Postava Priority: Critical Labels: performance Attachments: Screen Shot 2014-10-13 at 10.19.04 AM.png, Screen Shot 2014-10-15 at 9.09.06 PM.png, Screen Shot 2014-10-15 at 9.14.15 PM.png, TestNetworkDownProducer.java, th1.dump, th10.dump, th11.dump, th12.dump, th13.dump, th14.dump, th15.dump, th2.dump, th3.dump, th4.dump, th5.dump, th6.dump, th7.dump, th8.dump, th9.dump Hi Kafka Dev Team, When I run the test to send message to single partition for 3 minutes or so on, I have encounter deadlock (please see the screen attached) and thread contention from YourKit profiling. Use Case: 1) Aggregating messages into same partition for metric counting. 2) Replicate Old Producer behavior for sticking to partition for 3 minutes. 
Here is the output:

Frozen threads found (potential deadlock)
It seems that the following threads have not changed their stack for more than 10 seconds. These threads are possibly (but not necessarily!) in a deadlock or hung.

pool-1-thread-128 --- Frozen for at least 2m
  org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition, byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
  org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, Callback) KafkaProducer.java:237
  org.kafka.test.TestNetworkDownProducer$MyProducer.run() TestNetworkDownProducer.java:84
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1145
  java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:615
  java.lang.Thread.run() Thread.java:744

pool-1-thread-159 --- Frozen for at least 2m 1 sec
  org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition, byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
  org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, Callback) KafkaProducer.java:237
  org.kafka.test.TestNetworkDownProducer$MyProducer.run() TestNetworkDownProducer.java:84
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1145
  java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:615
  java.lang.Thread.run() Thread.java:744

pool-1-thread-55 --- Frozen for at least 2m
  org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition, byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
  org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, Callback) KafkaProducer.java:237
  org.kafka.test.TestNetworkDownProducer$MyProducer.run() TestNetworkDownProducer.java:84
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) ThreadPoolExecutor.java:1145
  java.util.concurrent.ThreadPoolExecutor$Worker.run() ThreadPoolExecutor.java:615
  java.lang.Thread.run() Thread.java:744

Thanks,
Bhavesh
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
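Jay's point about why compression has to stay inside the lock can be illustrated with a small sketch (hypothetical code, not the actual `RecordAccumulator`): many threads append into one batch buffer through a single stateful compressor, so appends must be serialized under one lock, or both the byte array and the compressor's carried-over state would be corrupted.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch of a shared compressed batch. The lock protects both
// the destination byte array and the compressor, whose internal state depends
// on all previously appended messages (that is what makes batch compression
// effective: redundancy across messages can be exploited).
public class SharedBatch {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final GZIPOutputStream compressor;

    public SharedBatch() throws IOException {
        compressor = new GZIPOutputStream(buffer);
    }

    // Must be synchronized: interleaved writes from two threads would
    // corrupt the compressed stream, not merely reorder records.
    public synchronized void append(byte[] record) throws IOException {
        compressor.write(record);
    }

    // Finish the compressed stream and return the batch contents.
    public synchronized byte[] complete() throws IOException {
        compressor.finish();
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        SharedBatch batch = new SharedBatch();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < 4; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                try {
                    batch.append(("msg-" + id + "\n").getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(batch.complete().length + " compressed bytes");
    }
}
```

Moving the compression outside the lock would require each thread to compress independently, losing the cross-message redundancy; that is the trade-off the comment describes.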
Re: High Level Consumer Iterator IllegalStateException Issue
Which version of Kafka are you using on the consumer?

On Fri, Oct 24, 2014 at 4:14 PM, Bhavesh Mistry mistry.p.bhav...@gmail.com wrote:

Hi Kafka Community,

I am using the Kafka trunk source code and I get the following exception. What could cause the iterator to be in the FAILED state? Please let me know how I can fix this issue.

java.lang.IllegalStateException: Iterator is in failed state
    at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:54)

Here are the consumer properties:

Properties props = new Properties();
props.put("zookeeper.connect", zkConnect);
props.put("group.id", groupId);
props.put("consumer.timeout.ms", "-1");
props.put("zookeeper.session.timeout.ms", "1");
props.put("zookeeper.sync.time.ms", "6000");
props.put("auto.commit.interval.ms", "2000");
props.put("rebalance.max.retries", "8");
props.put("auto.offset.reset", "largest");
props.put("fetch.message.max.bytes", "2097152");
props.put("socket.receive.buffer.bytes", "2097152");
props.put("auto.commit.enable", "true");

Thanks,
Bhavesh
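For context on how the iterator can end up here: an iterator of this style marks itself FAILED before computing the next element, so if the fetch throws (for example a consumer timeout), every subsequent hasNext() call raises exactly this IllegalStateException. The following is a schematic Java re-implementation of such a state machine (a sketch inferred from the error message, not the actual kafka.utils.IteratorTemplate Scala source):

```java
import java.util.NoSuchElementException;

// Schematic sketch of an IteratorTemplate-style state machine. Any exception
// thrown while fetching leaves the iterator permanently FAILED, and every
// later hasNext() throws "Iterator is in failed state".
abstract class TemplateIterator<T> {
    private enum State { NOT_READY, READY, DONE, FAILED }
    private State state = State.NOT_READY;
    private T nextItem;

    // Fetch the next item, or call allDone(); may throw (e.g. a timeout).
    protected abstract T makeNext();

    protected final T allDone() { state = State.DONE; return null; }

    public final boolean hasNext() {
        switch (state) {
            case FAILED: throw new IllegalStateException("Iterator is in failed state");
            case DONE:   return false;
            case READY:  return true;
            default:     return maybeComputeNext();
        }
    }

    public final T next() {
        if (!hasNext()) throw new NoSuchElementException();
        state = State.NOT_READY;
        return nextItem;
    }

    private boolean maybeComputeNext() {
        state = State.FAILED;          // stays FAILED if makeNext() throws
        nextItem = makeNext();
        if (state == State.DONE) return false;
        state = State.READY;
        return true;
    }
}

public class FailedStateDemo {
    public static void main(String[] args) {
        TemplateIterator<String> it = new TemplateIterator<String>() {
            @Override protected String makeNext() {
                throw new RuntimeException("simulated fetch failure");
            }
        };
        try { it.hasNext(); } catch (RuntimeException e) {
            System.out.println("first call: " + e.getMessage());
        }
        try { it.hasNext(); } catch (IllegalStateException e) {
            System.out.println("second call: " + e.getMessage());
        }
    }
}
```

Under this model, the interesting question is what threw on the *first* hasNext() call; the IllegalStateException seen later is only the symptom of that earlier failure.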
[jira] [Commented] (KAFKA-1501) transient unit tests failures due to port already in use
[ https://issues.apache.org/jira/browse/KAFKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183787#comment-14183787 ] Jay Kreps commented on KAFKA-1501:
--
Oh, man, my heart is broken, I was sure that was going to work. It looks like a few projects take a few extra steps to check availability, including creating a datagram socket on the given port. For example: http://stackoverflow.com/questions/434718/sockets-discover-port-availability-using-java
The comments mention something in Apache Mina that is essentially the same. I have no idea why that is needed, but maybe it is worth trying in choosePorts? I like the idea of improving choosePorts rather than changing SocketServer...

transient unit tests failures due to port already in use
Key: KAFKA-1501
URL: https://issues.apache.org/jira/browse/KAFKA-1501
Project: Kafka
Issue Type: Improvement
Components: core
Reporter: Jun Rao
Assignee: Guozhang Wang
Labels: newbie
Attachments: KAFKA-1501.patch

Saw the following transient failures.

kafka.api.ProducerFailureHandlingTest testTooLargeRecordWithAckOne FAILED
    kafka.common.KafkaException: Socket server failed to bind to localhost:59909: Address already in use.
        at kafka.network.Acceptor.openServerSocket(SocketServer.scala:195)
        at kafka.network.Acceptor.<init>(SocketServer.scala:141)
        at kafka.network.SocketServer.startup(SocketServer.scala:68)
        at kafka.server.KafkaServer.startup(KafkaServer.scala:95)
        at kafka.utils.TestUtils$.createServer(TestUtils.scala:123)
        at kafka.api.ProducerFailureHandlingTest.setUp(ProducerFailureHandlingTest.scala:68)
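The extra availability check Jay references (binding both a TCP and a UDP socket to the candidate port, as in the linked Stack Overflow answer and Apache Mina) might be sketched like this; an illustrative standalone version, not Kafka's actual choosePorts:

```java
import java.io.IOException;
import java.net.DatagramSocket;
import java.net.ServerSocket;

public class PortCheck {
    // A port is considered free only if both a TCP ServerSocket and a UDP
    // DatagramSocket can bind to it, mirroring the Stack Overflow / Mina check.
    public static boolean available(int port) {
        try (ServerSocket ss = new ServerSocket(port);
             DatagramSocket ds = new DatagramSocket(port)) {
            ss.setReuseAddress(true);
            ds.setReuseAddress(true);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Ask the OS for an ephemeral port, release it, then re-check it.
        int port;
        try (ServerSocket probe = new ServerSocket(0)) {
            port = probe.getLocalPort();
        }
        System.out.println("port " + port + " available: " + available(port));
    }
}
```

Note that this check is inherently racy: a port that passes can still be grabbed by another process before the server binds it, so it narrows the window for these transient failures rather than eliminating it.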