Re: Review Request 26994: Patch for KAFKA-1719

2014-10-24 Thread Jiangjie Qin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26994/
---

(Updated Oct. 24, 2014, 7:56 a.m.)


Review request for kafka.


Bugs: KAFKA-1719
https://issues.apache.org/jira/browse/KAFKA-1719


Repository: kafka


Description (updated)
---

Addressed Guozhang's comments.


Addressed Neha and Guozhang's comments.


Incorporated Joel and Neha's comments.


Incorporated Joel and Neha's comments. Also fixed a potential race where 
cleanShutdown could execute multiple times if several threads exit abnormally 
at the same time.
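
A minimal sketch of the guard just described, for reviewers skimming this summary 
(illustrative Java, not the actual Scala patch; all names are hypothetical):

{code}
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownOnceExample {
    // Hypothetical guard: cleanShutdown() runs at most once even when
    // several consumer/producer threads exit abnormally at the same time.
    private static final AtomicBoolean isShuttingDown = new AtomicBoolean(false);

    public static void cleanShutdown() {
        // compareAndSet lets exactly one caller through; the rest return.
        if (!isShuttingDown.compareAndSet(false, true)) {
            return;
        }
        // ... shut down the consumer connector and close the producers ...
        System.exit(-1); // make the whole mirror maker exit (KAFKA-1719)
    }

    // Installed on every consumer/producer thread so an abnormal exit
    // triggers the (idempotent) shutdown above.
    public static void exitOnFailure(Thread t) {
        t.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
            public void uncaughtException(Thread thread, Throwable e) {
                cleanShutdown();
            }
        });
    }
}
{code}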


Diffs (updated)
-

  core/src/main/scala/kafka/tools/MirrorMaker.scala 
b8698ee1469c8fbc92ccc176d916eb3e28b87867 

Diff: https://reviews.apache.org/r/26994/diff/


Testing
---


Thanks,

Jiangjie Qin



[jira] [Updated] (KAFKA-1719) Make mirror maker exit when one consumer/producer thread exits.

2014-10-24 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-1719:

Attachment: KAFKA-1719_2014-10-24_00:56:06.patch

 Make mirror maker exit when one consumer/producer thread exits.
 ---

 Key: KAFKA-1719
 URL: https://issues.apache.org/jira/browse/KAFKA-1719
 Project: Kafka
  Issue Type: Improvement
Reporter: Jiangjie Qin
Assignee: Jiangjie Qin
 Attachments: KAFKA-1719.patch, KAFKA-1719_2014-10-22_15:04:32.patch, 
 KAFKA-1719_2014-10-23_16:20:22.patch, KAFKA-1719_2014-10-24_00:56:06.patch


 When one of the consumer/producer threads exits, the entire mirror maker will 
 be blocked. In this case, it is better to make it exit. It seems a single 
 ZookeeperConsumerConnector is sufficient for the mirror maker, so we probably 
 don't need a list of connectors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1719) Make mirror maker exit when one consumer/producer thread exits.

2014-10-24 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182573#comment-14182573
 ] 

Jiangjie Qin commented on KAFKA-1719:
-

Updated reviewboard https://reviews.apache.org/r/26994/diff/
 against branch origin/trunk

 Make mirror maker exit when one consumer/producer thread exits.
 ---

 Key: KAFKA-1719
 URL: https://issues.apache.org/jira/browse/KAFKA-1719
 Project: Kafka
  Issue Type: Improvement
Reporter: Jiangjie Qin
Assignee: Jiangjie Qin
 Attachments: KAFKA-1719.patch, KAFKA-1719_2014-10-22_15:04:32.patch, 
 KAFKA-1719_2014-10-23_16:20:22.patch, KAFKA-1719_2014-10-24_00:56:06.patch


 When one of the consumer/producer threads exits, the entire mirror maker will 
 be blocked. In this case, it is better to make it exit. It seems a single 
 ZookeeperConsumerConnector is sufficient for the mirror maker, so we probably 
 don't need a list of connectors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-10-24 Thread Vladimir Tretyakov (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Tretyakov updated KAFKA-1481:
--
Attachment: KAFKA-1481_2014-10-24_14-14-35.patch.patch

 Stop using dashes AND underscores as separators in MBean names
 --

 Key: KAFKA-1481
 URL: https://issues.apache.org/jira/browse/KAFKA-1481
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
Priority: Critical
  Labels: patch
 Fix For: 0.8.3

 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, 
 KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, 
 KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, 
 KAFKA-1481_2014-10-21_09-14-35.patch, 
 KAFKA-1481_2014-10-24_14-14-35.patch.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch


 MBeans should not use dashes or underscores as separators because these 
 characters are allowed in hostnames, topics, group and consumer IDs, etc., 
 and these are embedded in MBean names, making it impossible to parse out 
 individual bits from the names.
 Perhaps a pipe character should be used to avoid the conflict. 
 This looks like a major blocker because it means nobody can write Kafka 0.8.x 
 monitoring tools unless they are doing it for themselves AND do not use 
 dashes AND do not use underscores.
 See: http://search-hadoop.com/m/4TaT4lonIW
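
As a hedged illustration of the parsing problem (hypothetical names, plain Java, 
not Kafka code): with dash-separated names, a dash inside a hostname or client id 
is indistinguishable from the separator, whereas JMX ObjectName key=value 
properties keep each tag recoverable:

{code}
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class MBeanNameExample {
    public static void main(String[] args) throws MalformedObjectNameException {
        // Ambiguous: is the client id "my-host-consumer" in group "x",
        // or "my-host" in group "consumer-x"? Unparseable without a convention.
        String flat = "kafka.consumer:type=my-host-consumer-x-FetchRequestRateAndTimeMs";

        // Unambiguous: each tag is an explicit key=value ObjectName property.
        ObjectName structured = new ObjectName(
                "kafka.consumer:type=FetchRequestRateAndTimeMs,clientId=my-host,groupId=consumer-x");

        System.out.println(flat);
        System.out.println(structured.getKeyProperty("clientId")); // -> my-host
    }
}
{code}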



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1481) Stop using dashes AND underscores as separators in MBean names

2014-10-24 Thread Vladimir Tretyakov (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182670#comment-14182670
 ] 

Vladimir Tretyakov commented on KAFKA-1481:
---

Hi Jun, thanks for your feedback again; I've attached a new patch.

20.1 done.
20.2 clientId is Taggable here; it will be converted to something like 
clientId=XXX automatically.
21 tags should be a Map whose iterators visit elements in the order they were 
inserted (the final string must be stable). From the LinkedHashMap docs: "The 
iterator and all traversal methods of this class visit elements in the order 
they were inserted." I am new to Scala, so if you know a better candidate with 
predictable iteration order please let me know (see the sketch after this list).
22.1 done
22.2 clientId is not always just clientId=XXX; look at:
{code}
class ConsumerFetcherThread(consumerFetcherThreadId: ConsumerFetcherThreadId,
                            val config: ConsumerConfig,
                            sourceBroker: Broker,
                            partitionMap: Map[TopicAndPartition, PartitionTopicInfo],
                            val consumerFetcherManager: ConsumerFetcherManager)
        extends AbstractFetcherThread(name = consumerFetcherThreadId,
                                      clientId = new TaggableInfo(new TaggableInfo("clientId" -> config.clientId), consumerFetcherThreadId),
                                      sourceBroker = sourceBroker,
                                      socketTimeout = config.socketTimeoutMs,
                                      socketBufferSize = config.socketReceiveBufferBytes,
                                      fetchSize = config.fetchMessageMaxBytes,
                                      fetcherBrokerId = Request.OrdinaryConsumerId,
                                      maxWait = config.fetchWaitMaxMs,
                                      minBytes = config.fetchMinBytes,
                                      isInterruptible = true) {
{code}


This clientId = new TaggableInfo(new TaggableInfo("clientId" -> 
config.clientId), consumerFetcherThreadId) part was in the code before my changes 
(as manipulation of strings, not Taggable, of course). Yes, in some places we 
could use a string instead of Taggable, but maybe it is better to have Taggable 
everywhere that is related to metrics. At least it was not easy for me to decide 
where I had to use Taggable and where I could leave a string. I'd prefer Taggable 
everywhere in this case, sorry :).

23. done everything as described at the link, and checked how it works/applies on 
'origin/0.8.2' from scratch.
24. see 22.2
25. agree, added a second constructor with 'String' as clientId; I can't remove 
the constructor with Taggable as clientId because, as you can see in 22.2, 
sometimes clientId is not just a clientId but something more complex.
26. added 'MetricsLeakTest'; the basic idea is to start/stop many 
producers/consumers and observe the count of metrics in 
Metrics.defaultRegistry(), which should not grow.
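
A minimal illustration of the insertion-order property point 21 relies on (plain 
Java for brevity; the patch itself works with Scala collections):

{code}
import java.util.LinkedHashMap;
import java.util.Map;

public class StableTagString {
    public static void main(String[] args) {
        // LinkedHashMap iterates in insertion order, so the rendered
        // tag string is stable across runs (required for MBean names).
        Map<String, String> tags = new LinkedHashMap<String, String>();
        tags.put("clientId", "my-client");
        tags.put("groupId", "my-group");

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : tags.entrySet()) {
            if (sb.length() > 0) sb.append(',');
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        System.out.println(sb); // always: clientId=my-client,groupId=my-group
    }
}
{code}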

 Stop using dashes AND underscores as separators in MBean names
 --

 Key: KAFKA-1481
 URL: https://issues.apache.org/jira/browse/KAFKA-1481
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1.1
Reporter: Otis Gospodnetic
Priority: Critical
  Labels: patch
 Fix For: 0.8.3

 Attachments: KAFKA-1481_2014-06-06_13-06-35.patch, 
 KAFKA-1481_2014-10-13_18-23-35.patch, KAFKA-1481_2014-10-14_21-53-35.patch, 
 KAFKA-1481_2014-10-15_10-23-35.patch, KAFKA-1481_2014-10-20_23-14-35.patch, 
 KAFKA-1481_2014-10-21_09-14-35.patch, 
 KAFKA-1481_2014-10-24_14-14-35.patch.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-14_21-53-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-15_10-23-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_20-14-35.patch, 
 KAFKA-1481_IDEA_IDE_2014-10-20_23-14-35.patch


 MBeans should not use dashes or underscores as separators because these 
 characters are allowed in hostnames, topics, group and consumer IDs, etc., 
 and these are embedded in MBean names, making it impossible to parse out 
 individual bits from the names.
 Perhaps a pipe character should be used to avoid the conflict. 
 This looks like a major blocker because it means nobody can write Kafka 0.8.x 
 monitoring tools unless they are doing it for themselves AND do not use 
 dashes AND do not use underscores.
 See: http://search-hadoop.com/m/4TaT4lonIW



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] 0.8.2-beta Release Candidate 1

2014-10-24 Thread Joe Stein
Jun, I updated the
https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/
with the contents of kafka-clients-0.8.2-beta-javadoc.jar

There weren't any artifact changes so I don't think we need a new release
candidate ... we can extend the vote to Monday if we don't get a pass/fail
by 2pm PT today.

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
/

On Fri, Oct 24, 2014 at 12:04 AM, Jun Rao jun...@gmail.com wrote:

 Joe,

 Verified quickstart on both the src and binary release. They all look good.
 The javadoc doesn't seem to include those in clients. Could you add them?

 Thanks,

 Jun

 On Tue, Oct 21, 2014 at 1:58 PM, Joe Stein joe.st...@stealth.ly wrote:

  This is the first candidate for release of Apache Kafka 0.8.2-beta
 
  Release Notes for the 0.8.2-beta release
 
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/RELEASE_NOTES.html
 
  *** Please download, test and vote by Friday, October 24th, 2pm PT
 
  Kafka's KEYS file containing PGP keys we use to sign the release:
  https://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5, sha1
  and sha2 (SHA256) checksum.
 
  * Release artifacts to be voted upon (source and binary):
  https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/
 
  * Maven artifacts to be voted upon prior to release:
  https://repository.apache.org/content/groups/staging/
 
  * scala-doc
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/scala-doc/
 
  * java-doc
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/
 
  * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2-beta tag
 
 
 https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2b2c3da2c52bc62a89d60f85125d3723c8410fa0
 
  /***
   Joe Stein
   Founder, Principal Consultant
   Big Data Open Source Security LLC
   http://www.stealth.ly
   Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
  /
 



[jira] [Created] (KAFKA-1729) add doc for Kafka-based offset management in 0.8.2

2014-10-24 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-1729:
--

 Summary: add doc for Kafka-based offset management in 0.8.2
 Key: KAFKA-1729
 URL: https://issues.apache.org/jira/browse/KAFKA-1729
 Project: Kafka
  Issue Type: Sub-task
Reporter: Jun Rao






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-1730) add the doc for the new java producer in 0.8.2

2014-10-24 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-1730:
--

 Summary: add the doc for the new java producer in 0.8.2
 Key: KAFKA-1730
 URL: https://issues.apache.org/jira/browse/KAFKA-1730
 Project: Kafka
  Issue Type: Sub-task
Affects Versions: 0.8.2
Reporter: Jun Rao






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1730) add the doc for the new java producer in 0.8.2

2014-10-24 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182974#comment-14182974
 ] 

Jun Rao commented on KAFKA-1730:


We can probably just add a link to the java doc of the new producer.

 add the doc for the new java producer in 0.8.2
 --

 Key: KAFKA-1730
 URL: https://issues.apache.org/jira/browse/KAFKA-1730
 Project: Kafka
  Issue Type: Sub-task
Affects Versions: 0.8.2
Reporter: Jun Rao





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-1731) add config/jmx changes in 0.8.2 doc

2014-10-24 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-1731:
--

 Summary: add config/jmx changes in 0.8.2 doc
 Key: KAFKA-1731
 URL: https://issues.apache.org/jira/browse/KAFKA-1731
 Project: Kafka
  Issue Type: Sub-task
Reporter: Jun Rao






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1728) update 082 docs

2014-10-24 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182986#comment-14182986
 ] 

Jun Rao commented on KAFKA-1728:


Need to include the doc change in KAFKA-1555.

 update 082 docs
 ---

 Key: KAFKA-1728
 URL: https://issues.apache.org/jira/browse/KAFKA-1728
 Project: Kafka
  Issue Type: Task
Affects Versions: 0.8.2
Reporter: Jun Rao

 We need to update the docs for the 082 release.
 https://svn.apache.org/repos/asf/kafka/site/082
 http://kafka.apache.org/082/documentation.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] 0.8.2-beta Release Candidate 1

2014-10-24 Thread Jun Rao
Joe,

Thanks. +1 on RC 1 from me.

Jun

On Fri, Oct 24, 2014 at 5:22 AM, Joe Stein joe.st...@stealth.ly wrote:

 Jun, I updated the
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/
 with the contents of kafka-clients-0.8.2-beta-javadoc.jar

 There weren't any artifact changes so I don't think we need a new release
 candidate ... we can extend the vote to Monday if we don't get a pass/fail
 by 2pm PT today.

 /***
  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stealth.ly
  Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
 /

 On Fri, Oct 24, 2014 at 12:04 AM, Jun Rao jun...@gmail.com wrote:

  Joe,
 
  Verified quickstart on both the src and binary release. They all look
 good.
  The javadoc doesn't seem to include those in clients. Could you add them?
 
  Thanks,
 
  Jun
 
  On Tue, Oct 21, 2014 at 1:58 PM, Joe Stein joe.st...@stealth.ly wrote:
 
   This is the first candidate for release of Apache Kafka 0.8.2-beta
  
   Release Notes for the 0.8.2-beta release
  
  
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/RELEASE_NOTES.html
  
   *** Please download, test and vote by Friday, October 24th, 2pm PT
  
   Kafka's KEYS file containing PGP keys we use to sign the release:
   https://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5,
 sha1
   and sha2 (SHA256) checksum.
  
   * Release artifacts to be voted upon (source and binary):
   https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/
  
   * Maven artifacts to be voted upon prior to release:
   https://repository.apache.org/content/groups/staging/
  
   * scala-doc
  
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/scala-doc/
  
   * java-doc
  
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/
  
   * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2-beta tag
  
  
 
 https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2b2c3da2c52bc62a89d60f85125d3723c8410fa0
  
   /***
Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
   /
  
 



082 docs

2014-10-24 Thread Jun Rao
Hi, Everyone,

I created KAFKA-1728 to track the doc changes in the 082 release. If anyone
wants to help, you can sign up on one of the sub tasks.

Joel Koshy,

Do you want to update the doc for offset management?

Thanks,

Jun


Build failed in Jenkins: Kafka-trunk #317

2014-10-24 Thread Apache Jenkins Server
See https://builds.apache.org/job/Kafka-trunk/317/changes

Changes:

[neha.narkhede] KAFKA-1725 Configuration file bugs in system tests add noise to 
output and break a few tests; reviewed by Neha Narkhede

--
[...truncated 1080 lines...]

kafka.admin.AdminTest > testPartitionReassignmentNonOverlappingReplicas FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
        at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
        at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
        at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testReassigningNonExistingPartition FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
        at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
        at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
        at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testResumePartitionReassignmentThatWasCompleted FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
        at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
        at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
        at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testPreferredReplicaJsonData FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
        at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
        at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
        at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testBasicPreferredReplicaElection FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
        at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
        at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
        at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testShutdownBroker FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
        at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
        at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
        at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testTopicConfigChange FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at

[jira] [Updated] (KAFKA-1725) Configuration file bugs in system tests add noise to output and break a few tests

2014-10-24 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1725:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for fixing the system tests! Pushed to trunk and 0.8.2

 Configuration file bugs in system tests add noise to output and break a few 
 tests
 -

 Key: KAFKA-1725
 URL: https://issues.apache.org/jira/browse/KAFKA-1725
 Project: Kafka
  Issue Type: Bug
  Components: tools
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
Priority: Minor
 Attachments: KAFKA-1725.patch


 There are some broken and misnamed system test configuration files 
 (testcase_*_properties.json) that are causing a bunch of exceptions when 
 running system tests and make it a lot harder to parse the output.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (KAFKA-1725) Configuration file bugs in system tests add noise to output and break a few tests

2014-10-24 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede closed KAFKA-1725.


 Configuration file bugs in system tests add noise to output and break a few 
 tests
 -

 Key: KAFKA-1725
 URL: https://issues.apache.org/jira/browse/KAFKA-1725
 Project: Kafka
  Issue Type: Bug
  Components: tools
Reporter: Ewen Cheslack-Postava
Assignee: Ewen Cheslack-Postava
Priority: Minor
 Attachments: KAFKA-1725.patch


 There are some broken and misnamed system test configuration files 
 (testcase_*_properties.json) that are causing a bunch of exceptions when 
 running system tests and make it a lot harder to parse the output.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] 0.8.2-beta Release Candidate 1

2014-10-24 Thread Gwen Shapira
+1 (non-official community vote).

Kicked the tires of the binary release. Works out of the box as
expected, new producer included.

On Fri, Oct 24, 2014 at 5:22 AM, Joe Stein joe.st...@stealth.ly wrote:
 Jun, I updated the
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/
 with the contents of kafka-clients-0.8.2-beta-javadoc.jar

 There weren't any artifact changes so I don't think we need a new release
 candidate ... we can extend the vote to Monday if we don't get a pass/fail
 by 2pm PT today.

 /***
  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stealth.ly
  Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
 /

 On Fri, Oct 24, 2014 at 12:04 AM, Jun Rao jun...@gmail.com wrote:

 Joe,

 Verified quickstart on both the src and binary release. They all look good.
 The javadoc doesn't seem to include those in clients. Could you add them?

 Thanks,

 Jun

 On Tue, Oct 21, 2014 at 1:58 PM, Joe Stein joe.st...@stealth.ly wrote:

  This is the first candidate for release of Apache Kafka 0.8.2-beta
 
  Release Notes for the 0.8.2-beta release
 
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/RELEASE_NOTES.html
 
  *** Please download, test and vote by Friday, October 24th, 2pm PT
 
  Kafka's KEYS file containing PGP keys we use to sign the release:
  https://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5, sha1
  and sha2 (SHA256) checksum.
 
  * Release artifacts to be voted upon (source and binary):
  https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/
 
  * Maven artifacts to be voted upon prior to release:
  https://repository.apache.org/content/groups/staging/
 
  * scala-doc
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/scala-doc/
 
  * java-doc
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/
 
  * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2-beta tag
 
 
 https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2b2c3da2c52bc62a89d60f85125d3723c8410fa0
 
  /***
   Joe Stein
   Founder, Principal Consultant
   Big Data Open Source Security LLC
   http://www.stealth.ly
   Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
  /
 



[jira] [Updated] (KAFKA-1719) Make mirror maker exit when one consumer/producer thread exits.

2014-10-24 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1719:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the patches. Pushed to trunk and 0.8.2

 Make mirror maker exit when one consumer/producer thread exits.
 ---

 Key: KAFKA-1719
 URL: https://issues.apache.org/jira/browse/KAFKA-1719
 Project: Kafka
  Issue Type: Improvement
Reporter: Jiangjie Qin
Assignee: Jiangjie Qin
 Attachments: KAFKA-1719.patch, KAFKA-1719_2014-10-22_15:04:32.patch, 
 KAFKA-1719_2014-10-23_16:20:22.patch, KAFKA-1719_2014-10-24_00:56:06.patch


 When one of the consumer/producer threads exits, the entire mirror maker will 
 be blocked. In this case, it is better to make it exit. It seems a single 
 ZookeeperConsumerConnector is sufficient for the mirror maker, so we probably 
 don't need a list of connectors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1688) Add authorization interface and naive implementation

2014-10-24 Thread Michael Herstine (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183061#comment-14183061
 ] 

Michael Herstine commented on KAFKA-1688:
-

Apologies-- yes, I should have said Principal.

 Add authorization interface and naive implementation
 

 Key: KAFKA-1688
 URL: https://issues.apache.org/jira/browse/KAFKA-1688
 Project: Kafka
  Issue Type: Sub-task
  Components: security
Reporter: Jay Kreps
Assignee: Sriharsha Chintalapani

 Add a PermissionManager interface as described here:
 https://cwiki.apache.org/confluence/display/KAFKA/Security
 (possibly there is a better name?)
 Implement calls to the PermissionsManager in KafkaApis for the main requests 
 (FetchRequest, ProduceRequest, etc). We will need to add a new error code and 
 exception to the protocol to indicate permission denied.
 Add a server configuration to give the class you want to instantiate that 
 implements that interface. That class can define its own configuration 
 properties from the main config file.
 Provide a simple implementation of this interface which just takes a user and 
 ip whitelist and permits those in either of the whitelists to do anything, 
 and denies all others.
 Rather than writing an integration test for this class we can probably just 
 use this class for the TLS and SASL authentication testing.
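
For readers of this thread, a hedged sketch of the shape the description suggests 
(all names are placeholders taken from the text above, not a committed API):

{code}
import java.util.Set;

// Hypothetical interface per the description: a pluggable authorizer that
// KafkaApis would consult before serving Fetch/Produce requests.
interface PermissionManager {
    // Instantiated from a server config naming the implementing class;
    // returns false to signal the new permission-denied error code.
    boolean isPermitted(String principal, String clientIp, String operation, String topic);
}

// The "naive implementation" from the description: permit anyone on either
// the user whitelist or the IP whitelist, deny everyone else.
class WhitelistPermissionManager implements PermissionManager {
    private final Set<String> userWhitelist;
    private final Set<String> ipWhitelist;

    WhitelistPermissionManager(Set<String> users, Set<String> ips) {
        this.userWhitelist = users;
        this.ipWhitelist = ips;
    }

    @Override
    public boolean isPermitted(String principal, String clientIp, String operation, String topic) {
        return userWhitelist.contains(principal) || ipWhitelist.contains(clientIp);
    }
}
{code}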



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] 0.8.2-beta Release Candidate 1

2014-10-24 Thread Neha Narkhede
+1 (binding)

Verified the quickstart, docs, unit tests on the source and binary release.

Thanks,
Neha

On Fri, Oct 24, 2014 at 9:26 AM, Gwen Shapira gshap...@cloudera.com wrote:

 +1 (non-official community vote).

 Kicked the tires of the binary release. Works out of the box as
 expected, new producer included.

 On Fri, Oct 24, 2014 at 5:22 AM, Joe Stein joe.st...@stealth.ly wrote:
  Jun, I updated the
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/
  with the contents of kafka-clients-0.8.2-beta-javadoc.jar
 
  There weren't any artifact changes so I don't think we need a new release
  candidate ... we can extend the vote to Monday if we don't get a
 pass/fail
  by 2pm PT today.
 
  /***
   Joe Stein
   Founder, Principal Consultant
   Big Data Open Source Security LLC
   http://www.stealth.ly
   Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
  /
 
  On Fri, Oct 24, 2014 at 12:04 AM, Jun Rao jun...@gmail.com wrote:
 
  Joe,
 
  Verified quickstart on both the src and binary release. They all look
 good.
  The javadoc doesn't seem to include those in clients. Could you add
 them?
 
  Thanks,
 
  Jun
 
  On Tue, Oct 21, 2014 at 1:58 PM, Joe Stein joe.st...@stealth.ly
 wrote:
 
   This is the first candidate for release of Apache Kafka 0.8.2-beta
  
   Release Notes for the 0.8.2-beta release
  
  
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/RELEASE_NOTES.html
  
   *** Please download, test and vote by Friday, October 24th, 2pm PT
  
   Kafka's KEYS file containing PGP keys we use to sign the release:
   https://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5,
 sha1
   and sha2 (SHA256) checksum.
  
   * Release artifacts to be voted upon (source and binary):
   https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/
  
   * Maven artifacts to be voted upon prior to release:
   https://repository.apache.org/content/groups/staging/
  
   * scala-doc
  
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/scala-doc/
  
   * java-doc
  
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/
  
   * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2-beta
 tag
  
  
 
 https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2b2c3da2c52bc62a89d60f85125d3723c8410fa0
  
   /***
Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
   /
  
 



Build failed in Jenkins: Kafka-trunk #318

2014-10-24 Thread Apache Jenkins Server
See https://builds.apache.org/job/Kafka-trunk/318/changes

Changes:

[neha.narkhede] KAFKA-1719 Make mirror maker exit when one consumer/producer 
thread exits; reviewed by Neha Narkhede, Joel Koshy and Guozhang Wang

--
[...truncated 1284 lines...]

kafka.admin.AdminTest > testPartitionReassignmentNonOverlappingReplicas FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
        at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
        at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
        at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testReassigningNonExistingPartition FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
        at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
        at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
        at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testResumePartitionReassignmentThatWasCompleted FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
        at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
        at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
        at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testPreferredReplicaJsonData FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
        at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
        at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
        at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testBasicPreferredReplicaElection FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
        at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
        at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
        at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testShutdownBroker FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:95)
        at kafka.zk.EmbeddedZookeeper.<init>(EmbeddedZookeeper.scala:33)
        at kafka.zk.ZooKeeperTestHarness$class.setUp(ZooKeeperTestHarness.scala:33)
        at kafka.admin.AdminTest.setUp(AdminTest.scala:33)

kafka.admin.AdminTest > testTopicConfigChange FAILED
    java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:124)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
        at

Re: [VOTE] 0.8.2-beta Release Candidate 1

2014-10-24 Thread Darion Yaphet
+1
New Sender is Added ~

2014-10-25 1:18 GMT+08:00 Neha Narkhede neha.narkh...@gmail.com:

 +1 (binding)

 Verified the quickstart, docs, unit tests on the source and binary release.

 Thanks,
 Neha

 On Fri, Oct 24, 2014 at 9:26 AM, Gwen Shapira gshap...@cloudera.com
 wrote:

  +1 (non-official community vote).
 
  Kicked the tires of the binary release. Works out of the box as
  expected, new producer included.
 
  On Fri, Oct 24, 2014 at 5:22 AM, Joe Stein joe.st...@stealth.ly wrote:
   Jun, I updated the
  
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/
   with the contents of kafka-clients-0.8.2-beta-javadoc.jar
  
   There weren't any artifact changes so I don't think we need a new
 release
   candidate ... we can extend the vote to Monday if we don't get a
  pass/fail
   by 2pm PT today.
  
   /***
Joe Stein
Founder, Principal Consultant
Big Data Open Source Security LLC
http://www.stealth.ly
Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
   /
  
   On Fri, Oct 24, 2014 at 12:04 AM, Jun Rao jun...@gmail.com wrote:
  
   Joe,
  
   Verified quickstart on both the src and binary release. They all look
  good.
   The javadoc doesn't seem to include those in clients. Could you add
  them?
  
   Thanks,
  
   Jun
  
   On Tue, Oct 21, 2014 at 1:58 PM, Joe Stein joe.st...@stealth.ly
  wrote:
  
This is the first candidate for release of Apache Kafka 0.8.2-beta
   
Release Notes for the 0.8.2-beta release
   
   
  
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/RELEASE_NOTES.html
   
*** Please download, test and vote by Friday, October 24th, 2pm PT
   
Kafka's KEYS file containing PGP keys we use to sign the release:
https://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5,
  sha1
and sha2 (SHA256) checksum.
   
* Release artifacts to be voted upon (source and binary):
https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/
   
* Maven artifacts to be voted upon prior to release:
https://repository.apache.org/content/groups/staging/
   
* scala-doc
   
  
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/scala-doc/
   
* java-doc
   
  
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/
   
* The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2-beta
  tag
   
   
  
 
 https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2b2c3da2c52bc62a89d60f85125d3723c8410fa0
   
/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
/
   
  
 




-- 

long is the way and hard, that out of Hell leads up to light


Re: 082 docs

2014-10-24 Thread Joel Koshy
Yes, I can update the doc for offset management and review Gwen's doc
on min.isr/durability guarantees.

On Fri, Oct 24, 2014 at 09:10:19AM -0700, Jun Rao wrote:
 Hi, Everyone,
 
 I created KAFKA-1728 to track the doc changes in the 082 release. If anyone
 wants to help, you can sign up on one of the sub tasks.
 
 Joel Koshy,
 
 Do you want to update the doc for offset management?
 
 Thanks,
 
 Jun



[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability

2014-10-24 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183086#comment-14183086
 ] 

Joel Koshy commented on KAFKA-1555:
---

Jun, you can just diff configuration.html and design.html between the files in 
the two directories, or you can use a tool like meld.
I have a few comments that I will post later, after some suggested edits.

 provide strong consistency with reasonable availability
 ---

 Key: KAFKA-1555
 URL: https://issues.apache.org/jira/browse/KAFKA-1555
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Affects Versions: 0.8.1.1
Reporter: Jiang Wu
Assignee: Gwen Shapira
 Fix For: 0.8.2

 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555.0.patch, 
 KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, 
 KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, 
 KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch


 In a mission critical application, we expect a kafka cluster with 3 brokers 
 can satisfy two requirements:
 1. When 1 broker is down, no message loss or service blocking happens.
 2. In worse cases such as two brokers are down, service can be blocked, but 
 no message loss happens.
 We found that the current Kafka version (0.8.1.1) cannot achieve the requirements 
 due to its three behaviors:
 1. when choosing a new leader from 2 followers in ISR, the one with less 
 messages may be chosen as the leader.
 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
 has less messages than the leader.
 3. ISR can contain only 1 broker; therefore acknowledged messages may be 
 stored in only 1 broker.
 The following is an analytical proof. 
 We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
 that at the beginning, all 3 replicas, leader A, followers B and C, are in 
 sync, i.e., they have the same messages and are all in ISR.
 According to the value of request.required.acks (acks for short), there are 
 the following cases.
 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this 
 time, although C hasn't received m, C is still in ISR. If A is killed, C can 
 be elected as the new leader, and consumers will miss m.
 3. acks=-1. B and C restart and are removed from ISR. Producer sends a 
 message m to A, and receives an acknowledgement. Disk failure happens in A 
 before B and C replicate m. Message m is lost.
 In summary, any existing configuration cannot satisfy the requirements.
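
To make the analysis above concrete (a sketch tied to the min.insync.replicas 
work in this ticket; the property names are as proposed for 0.8.2, everything 
else is hypothetical): requirement 2 calls for acks=-1 on the producer combined 
with a topic/broker-level min.insync.replicas of 2, so an acknowledged message 
must exist on at least two replicas before the producer sees success:

{code}
import java.util.Properties;

public class DurableProducerConfig {
    // Producer side (old Scala producer config): wait for all in-sync
    // replicas to acknowledge before considering a send successful.
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put("request.required.acks", "-1");
        return props;
    }
    // Broker/topic side (shown as a comment since it is not a producer
    // property): min.insync.replicas=2 makes acks=-1 writes fail with
    // NotEnoughReplicas when fewer than 2 replicas are in sync, preferring
    // consistency over availability.
}
{code}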



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1683) Implement a session concept in the socket server

2014-10-24 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183089#comment-14183089
 ] 

Gwen Shapira commented on KAFKA-1683:
-

Reviewed the Zookeeper SASL code to get a better handle on where we are heading. 
Since our high-level design is inspired by ZK, I encourage everyone involved to 
take a look too.

One thing it clarified for me - SASL does not (necessarily) maintain its own 
information on the socket. We need to attach "something extra" that will be 
able to retrieve the authenticated identity and provide some of the 
security-protocol-specific implementation.

In ZK, this "something" is a ServerCnxn instance that gets attached to the 
SelectionKey (a nice trick that I was unaware of) and optionally references a 
ZooKeeperSaslServer.

In Ivan's patch for KAFKA-1684, we instead extend SocketChannel with a 
protocol-specific wrapper and use this to maintain authentication state. 

As far as I can see, both solutions are valid, and both allow us to attach 
authentication information to a socket/channel and maintain it there - with the 
benefit that it can easily match the socket lifecycle. 

I need to have one of these exist for the no-authentication case for this 
patch. I'm going to go with a Cnxn instance attached to keys and not the 
SocketChannel extension, simply because it's less code to merge into KAFKA-1684 
later :) 
For the same reason, the Cnxn will be a clear mock - i.e., it will not contain 
any of the functionality we actually need from a security-protocol-specific 
object except authId getters and setters. If we decide to go with Cnxn objects, 
the functionality we actually need (handshakes, wrapping) will get implemented 
in follow-up JIRAs, and if we decide to keep the SocketChannels, this will go 
away anyway.
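
A minimal sketch of the approach described above, assuming only the standard 
java.nio API (SelectionKey.attach/attachment are real; the Session class is a 
stand-in for the mock Cnxn):

{code}
import java.nio.channels.SelectionKey;

// Stand-in for the "clear mock" Cnxn: just holds the authenticated identity.
class Session {
    private String authId = "ANONYMOUS"; // no-authentication default

    String getAuthId() { return authId; }
    void setAuthId(String authId) { this.authId = authId; }
}

class SessionAttachment {
    // On accept: hang a Session off the key for the socket's lifetime.
    static void onAccept(SelectionKey key) {
        key.attach(new Session());
    }

    // On each request: recover the Session and pass its identity down to
    // the API layer along with the request.
    static String principalFor(SelectionKey key) {
        Session session = (Session) key.attachment();
        return session.getAuthId();
    }
}
{code}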



 Implement a session concept in the socket server
 --

 Key: KAFKA-1683
 URL: https://issues.apache.org/jira/browse/KAFKA-1683
 Project: Kafka
  Issue Type: Sub-task
  Components: security
Affects Versions: 0.9.0
Reporter: Jay Kreps
Assignee: Gwen Shapira

 To implement authentication we need a way to keep track of some things 
 between requests. The initial use for this would be remembering the 
 authenticated user/principle info, but likely more uses would come up (for 
 example we will also need to remember whether and which encryption or 
 integrity measures are in place on the socket so we can wrap and unwrap 
 writes and reads).
 I was thinking we could just add a Session object that might have a user 
 field. The session object would need to get added to RequestChannel.Request 
 so it is passed down to the API layer with each request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] 0.8.2-beta Release Candidate 1

2014-10-24 Thread Gwen Shapira
What's the plan for moving from beta to stable?
Time based? Or are there any patches that need to go in?

On Tue, Oct 21, 2014 at 1:58 PM, Joe Stein joe.st...@stealth.ly wrote:
 This is the first candidate for release of Apache Kafka 0.8.2-beta

 Release Notes for the 0.8.2-beta release
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/RELEASE_NOTES.html

 *** Please download, test and vote by Friday, October 24th, 2pm PT

 Kafka's KEYS file containing PGP keys we use to sign the release:
 https://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5, sha1
 and sha2 (SHA256) checksum.

 * Release artifacts to be voted upon (source and binary):
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/

 * Maven artifacts to be voted upon prior to release:
 https://repository.apache.org/content/groups/staging/

 * scala-doc
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/scala-doc/

 * java-doc
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/

 * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2-beta tag
 https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2b2c3da2c52bc62a89d60f85125d3723c8410fa0

 /***
  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stealth.ly
  Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
 /


Re: 082 docs

2014-10-24 Thread Gwen Shapira
I'll upload a clean patch today to make the review easier.

On Fri, Oct 24, 2014 at 10:32 AM, Joel Koshy jjkosh...@gmail.com wrote:
 Yes I can update the doc for offset management, and review Gwen's doc
 on min.isr/durability guarantees

 On Fri, Oct 24, 2014 at 09:10:19AM -0700, Jun Rao wrote:
 Hi, Everyone,

 I created KAFKA-1728 to track the doc changes in the 082 release. If anyone
 wants to help, you can sign up on one of the sub tasks.

 Joel Koshy,

 Do you want to update the doc for offset management?

 Thanks,

 Jun



Re: [VOTE] 0.8.2-beta Release Candidate 1

2014-10-24 Thread Neha Narkhede
Gwen,

Production grade testing is the plan for moving from beta to stable. It
will be great if companies that pick up the beta deploy it, use it, and report
bugs. LinkedIn will be helping do this within a 4-5 week timeframe, if
possible.

Thanks,
Neha

On Fri, Oct 24, 2014 at 10:36 AM, Gwen Shapira gshap...@cloudera.com
wrote:

 What's the plan for moving from beta to stable?
 Time based? Or are there any patches that need to go in?

 On Tue, Oct 21, 2014 at 1:58 PM, Joe Stein joe.st...@stealth.ly wrote:
  This is the first candidate for release of Apache Kafka 0.8.2-beta
 
  Release Notes for the 0.8.2-beta release
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/RELEASE_NOTES.html
 
  *** Please download, test and vote by Friday, October 24th, 2pm PT
 
  Kafka's KEYS file containing PGP keys we use to sign the release:
  https://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5, sha1
  and sha2 (SHA256) checksum.
 
  * Release artifacts to be voted upon (source and binary):
  https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/
 
  * Maven artifacts to be voted upon prior to release:
  https://repository.apache.org/content/groups/staging/
 
  * scala-doc
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/scala-doc/
 
  * java-doc
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/
 
  * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2-beta tag
 
 https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2b2c3da2c52bc62a89d60f85125d3723c8410fa0
 
  /***
   Joe Stein
   Founder, Principal Consultant
   Big Data Open Source Security LLC
   http://www.stealth.ly
   Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
  /



Re: [VOTE] 0.8.2-beta Release Candidate 1

2014-10-24 Thread Gwen Shapira
So the criterion is: at least one company doing a serious testing cycle
on the beta and possibly running it in production for a few weeks?

On Fri, Oct 24, 2014 at 10:43 AM, Neha Narkhede neha.narkh...@gmail.com wrote:
 Gwen,

 Production grade testing is the plan for moving from beta to stable. It
 will be great if companies that pick up the beta deploy it, use it, and report
 bugs. LinkedIn will be helping do this within a 4-5 week timeframe, if
 possible.

 Thanks,
 Neha

 On Fri, Oct 24, 2014 at 10:36 AM, Gwen Shapira gshap...@cloudera.com
 wrote:

 What's the plan for moving from beta to stable?
 Time based? Or are there any patches that need to go in?

 On Tue, Oct 21, 2014 at 1:58 PM, Joe Stein joe.st...@stealth.ly wrote:
  This is the first candidate for release of Apache Kafka 0.8.2-beta
 
  Release Notes for the 0.8.2-beta release
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/RELEASE_NOTES.html
 
  *** Please download, test and vote by Friday, October 24th, 2pm PT
 
  Kafka's KEYS file containing PGP keys we use to sign the release:
  https://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5, sha1
  and sha2 (SHA256) checksum.
 
  * Release artifacts to be voted upon (source and binary):
  https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/
 
  * Maven artifacts to be voted upon prior to release:
  https://repository.apache.org/content/groups/staging/
 
  * scala-doc
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/scala-doc/
 
  * java-doc
 
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/
 
  * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2-beta tag
 
 https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2b2c3da2c52bc62a89d60f85125d3723c8410fa0
 
  /***
   Joe Stein
   Founder, Principal Consultant
   Big Data Open Source Security LLC
   http://www.stealth.ly
   Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
  /



[jira] [Updated] (KAFKA-1555) provide strong consistency with reasonable availability

2014-10-24 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated KAFKA-1555:

Attachment: KAFKA-1555-DOCS.1.patch

Attaching a clean version of the docs - just the diffs from 082.

 provide strong consistency with reasonable availability
 ---

 Key: KAFKA-1555
 URL: https://issues.apache.org/jira/browse/KAFKA-1555
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Affects Versions: 0.8.1.1
Reporter: Jiang Wu
Assignee: Gwen Shapira
 Fix For: 0.8.2

 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, 
 KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, 
 KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, 
 KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch


 In a mission critical application, we expect a kafka cluster with 3 brokers 
 can satisfy two requirements:
 1. When 1 broker is down, no message loss or service blocking happens.
 2. In worse cases such as two brokers are down, service can be blocked, but 
 no message loss happens.
 We found that the current Kafka version (0.8.1.1) cannot achieve the requirements 
 due to its three behaviors:
 1. when choosing a new leader from 2 followers in ISR, the one with less 
 messages may be chosen as the leader.
 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
 has less messages than the leader.
 3. ISR can contain only 1 broker; therefore acknowledged messages may be 
 stored in only 1 broker.
 The following is an analytical proof. 
 We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
 that at the beginning, all 3 replicas, leader A, followers B and C, are in 
 sync, i.e., they have the same messages and are all in ISR.
 According to the value of request.required.acks (acks for short), there are 
 the following cases.
 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this 
 time, although C hasn't received m, C is still in ISR. If A is killed, C can 
 be elected as the new leader, and consumers will miss m.
 3. acks=-1. B and C restart and are removed from ISR. Producer sends a 
 message m to A, and receives an acknowledgement. Disk failure happens in A 
 before B and C replicate m. Message m is lost.
 In summary, any existing configuration cannot satisfy the requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1710) [New Java Producer Potential Deadlock] Producer Deadlock when all messages is being sent to single partition

2014-10-24 Thread Bhavesh Mistry (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14179014#comment-14179014
 ] 

Bhavesh Mistry edited comment on KAFKA-1710 at 10/24/14 6:21 PM:
-

[~jkreps],

I am sorry I did not get back to you sooner. The cost of enqueuing a message 
into a single partition is ~54% compared to round-robin (tested with 32 
partitions on a single topic and a 3-broker cluster). The throughput measures 
only the cost of putting data into the buffer, not the cost of sending data to 
the brokers.

Here is the test I have done:

To a *single* partition:
Throughput per Thread=2666.5  byte(s)/millisecond
All done...!

To *all* partitions:
Throughput per Thread=5818.181818181818  byte(s)/millisecond
All done...!

Here is the test program:
{code}
package org.kafka.test;

import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class TestNetworkDownProducer {

    static int numberTh = 75;
    static CountDownLatch latch = new CountDownLatch(numberTh);

    public static void main(String[] args) throws IOException, InterruptedException {

        //Thread.sleep(6);

        Properties prop = new Properties();
        InputStream propFile = Thread.currentThread().getContextClassLoader()
                .getResourceAsStream("kafkaproducer.properties");

        String topic = "logmon.test";
        prop.load(propFile);
        System.out.println("Property: " + prop.toString());
        StringBuilder builder = new StringBuilder(1024);
        int msgLenth = 256;
        int numberOfLoop = 5000;
        for (int i = 0; i < msgLenth; i++)
            builder.append("a");

        int numberOfProducer = 1;
        Producer[] producer = new Producer[numberOfProducer];

        for (int i = 0; i < producer.length; i++) {
            producer[i] = new KafkaProducer(prop);
        }
        ExecutorService service = new ThreadPoolExecutor(numberTh, numberTh,
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(numberTh * 2));
        MyProducer[] producerThResult = new MyProducer[numberTh];
        for (int i = 0; i < numberTh; i++) {
            producerThResult[i] = new MyProducer(producer, numberOfLoop, builder.toString(), topic);
            service.execute(producerThResult[i]);
        }
        latch.await();
        for (int i = 0; i < producer.length; i++) {
            producer[i].close();
        }
        service.shutdownNow();
        System.out.println("All Producers done...!");
        // now interpret the result... of this...
        long lowestTime = 0;
        for (int i = 0; i < producerThResult.length; i++) {
            if (i == 1) {
                lowestTime = producerThResult[i].totalTimeinNano;
            } else if (producerThResult[i].totalTimeinNano < lowestTime) {
                lowestTime = producerThResult[i].totalTimeinNano;
            }
        }
        long bytesSend = msgLenth * numberOfLoop;
        long durationInMs = TimeUnit.MILLISECONDS.convert(lowestTime, TimeUnit.NANOSECONDS);

        double throughput = (bytesSend * 1.0) / (durationInMs);
        System.out.println("Throughput per Thread=" + throughput + "  byte(s)/millisecond");

        System.out.println("All done...!");

    }

    static class MyProducer implements Callable<Long>, Runnable {

        Producer[] producer;
        long maxloops;
        String msg;
        String topic;
        long totalTimeinNano = 0;

        MyProducer(Producer[] list, long maxloops, String msg, String topic) {
            this.producer = list;
            this.maxloops = maxloops;
            this.msg = msg;
            this.topic = topic;
        }

        public void run() {
 

[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability

2014-10-24 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183304#comment-14183304
 ] 

Joel Koshy commented on KAFKA-1555:
---

Looks good - I just have a few minor edits to what you wrote.

configuration.html

_Minor edits: also, it would be good if we could also mention 
NotEnoughReplicasAfterAppend and document it_

  min.insync.replicas: The minimum number of replicas that are required to
  declare a message as committed. If the number of in-sync replicas drops
  below this threshold, then writing messages with request.required.acks set
  to -1 will return a NotEnoughReplicas or NotEnoughReplicasAfterAppend
  error code. This is used to provide enhanced durability guarantees - i.e.,
  all in-sync replicas need to acknowledge the message AND there needs to be
  at least this many replicas in the set of in-sync replicas.


ops.html:

log.cleanup.interval.mins=30 -> log.retention.check.interval.ms=30


design.html

Couple of comments:

* all (or -1) brokers - maybe make it clear up front that this is all current 
in-sync replicas, and later clarify that consistency can be preferred over 
availability via the min.isr property
* bq. a message that was acked will not be lost as long as at least one in sync 
replica remains
** The above should probably be clarified a bit. i.e., availability of a 
replica affects whether a message will be lost or not only during the time it 
is yet to be replicated to all assigned replicas.
* It would be useful to describe how min.isr helps facilitate trading off 
consistency vs availability
* There are a couple of typos in various places

 provide strong consistency with reasonable availability
 ---

 Key: KAFKA-1555
 URL: https://issues.apache.org/jira/browse/KAFKA-1555
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Affects Versions: 0.8.1.1
Reporter: Jiang Wu
Assignee: Gwen Shapira
 Fix For: 0.8.2

 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, 
 KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, 
 KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, 
 KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch


 In a mission critical application, we expect a kafka cluster with 3 brokers 
 can satisfy two requirements:
 1. When 1 broker is down, no message loss or service blocking happens.
 2. In worse cases, such as when two brokers are down, service can be blocked, but 
 no message loss happens.
 We found that the current kafka version (0.8.1.1) cannot achieve these requirements 
 due to three of its behaviors:
 1. when choosing a new leader from 2 followers in ISR, the one with fewer 
 messages may be chosen as the leader.
 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
 has fewer messages than the leader.
 3. ISR can contain only 1 broker, therefore acknowledged messages may be 
 stored in only 1 broker.
 The following is an analytical proof. 
 We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
 that at the beginning, all 3 replicas, leader A, followers B and C, are in 
 sync, i.e., they have the same messages and are all in ISR.
 According to the value of request.required.acks (acks for short), there are 
 the following cases.
 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this 
 time, although C hasn't received m, C is still in ISR. If A is killed, C can 
 be elected as the new leader, and consumers will miss m.
 3. acks=-1. B and C restart and are removed from ISR. Producer sends a 
 message m to A, and receives an acknowledgement. Disk failure happens in A 
 before B and C replicate m. Message m is lost.
 In summary, any existing configuration cannot satisfy the requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


RE: [VOTE] 0.8.2-beta Release Candidate 1

2014-10-24 Thread Seshadri, Balaji
We are testing 0.8.2 branch now in dev-int.

Balaji

-Original Message-
From: Neha Narkhede [mailto:neha.narkh...@gmail.com] 
Sent: Friday, October 24, 2014 12:06 PM
To: dev@kafka.apache.org
Subject: Re: [VOTE] 0.8.2-beta Release Candidate 1

So the criteria is: At least one company doing a serious testing cycle on the 
beta and possibly running it in production for few weeks?

That's right.

On Fri, Oct 24, 2014 at 10:54 AM, Harsha ka...@harsha.io wrote:

 +1 (non-binding)

 On Fri, Oct 24, 2014, at 10:49 AM, Gwen Shapira wrote:
  So the criteria is: At least one company doing a serious testing 
  cycle on the beta and possibly running it in production for few weeks?
 
  On Fri, Oct 24, 2014 at 10:43 AM, Neha Narkhede 
  neha.narkh...@gmail.com
 
  wrote:
   Gwen,
  
   Production grade testing is the plan for moving from beta to 
   stable. It will be great if companies that pick up beta, deploy 
   it, use it and
 report
   bugs. LinkedIn will be helping do this within a 4-5 weeks 
   timeframe, if possible.
  
   Thanks,
   Neha
  
   On Fri, Oct 24, 2014 at 10:36 AM, Gwen Shapira 
   gshap...@cloudera.com
   wrote:
  
   What's the plan for moving from beta to stable?
   Time based? Or are there any patches that need to go in?
  
   On Tue, Oct 21, 2014 at 1:58 PM, Joe Stein joe.st...@stealth.ly
 wrote:
This is the first candidate for release of Apache Kafka 
0.8.2-beta
   
Release Notes for the 0.8.2-beta release
   
  
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/RELEASE_NOTES.html
   
*** Please download, test and vote by Friday, October 24th, 2pm 
PT
   
Kafka's KEYS file containing PGP keys we use to sign the release:
https://svn.apache.org/repos/asf/kafka/KEYS in addition to the
 md5, sha1
and sha2 (SHA256) checksum.
   
* Release artifacts to be voted upon (source and binary):
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/
   
* Maven artifacts to be voted upon prior to release:
https://repository.apache.org/content/groups/staging/
   
* scala-doc
   
  
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/scala-doc/
   
* java-doc
   
  
 https://people.apache.org/~joestein/kafka-0.8.2-beta-candidate1/java-doc/
   
* The tag to be voted upon (off the 0.8.2 branch) is the 
0.8.2-beta
 tag
   
  
 https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2b2c3da2c52bc62a89d60f85125d3723c8410fa0
   
/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC  http://www.stealth.ly
 Twitter: @allthingshadoop 
http://www.twitter.com/allthingshadoop
/
  



[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability

2014-10-24 Thread Kyle Banker (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183382#comment-14183382
 ] 

Kyle Banker commented on KAFKA-1555:


The new durability docs look great so far. Here are a couple more proposed 
edits to the patch:

EDIT #1:
min.insync.replicas: When a producer sets request.required.acks to -1, 
min.insync.replicas specifies the minimum number of replicas that must 
acknowledge a write for the write to be considered successful. If this minimum 
cannot be met, then the producer will raise an exception (either 
NotEnoughReplicas or NotEnoughReplicasAfterAppend).

When used together, min.insync.replicas and request.required.acks allow you to 
enforce greater durability guarantees. A typical scenario would be to create a 
topic with a replication factor of 3, set min.insync.replicas to 2, and produce 
with request.required.acks of -1. This will ensure that the producer raises an 
exception if a majority of replicas do not receive a write.

EDIT #2 (for request.required.acks):
  -1. The producer gets an acknowledgement after all in-sync replicas have 
received the data. This option provides the greatest level of durability. 
However, it does not completely eliminate the risk of message loss because the 
number of in sync replicas may, in rare cases shrink, to 1. If you want to 
ensure that some minimum number of replicas (typically a majority) receive a 
write, the you must set the topic-level min.insync.replicas setting. Please 
read the Replication section of the design documentation for a more in-depth 
discussion.
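
For what it's worth, here is a minimal sketch of that typical scenario using the 
old (scala) producer's Java API; the topic and broker names are placeholders and 
the snippet is illustrative, not part of the patch:

{code}
// Create the topic with replication factor 3 and a majority write quorum, e.g.:
//   bin/kafka-topics.sh --create --zookeeper localhost:2181 \
//     --replication-factor 3 --partitions 1 --topic durable-topic \
//     --config min.insync.replicas=2
Properties props = new Properties();
props.put("metadata.broker.list", "broker1:9092,broker2:9092,broker3:9092");
props.put("serializer.class", "kafka.serializer.StringEncoder");
props.put("request.required.acks", "-1"); // wait for all in-sync replicas
props.put("producer.type", "sync");       // surface send failures as exceptions

kafka.javaapi.producer.Producer<String, String> producer =
    new kafka.javaapi.producer.Producer<String, String>(
        new kafka.producer.ProducerConfig(props));
try {
    producer.send(new kafka.producer.KeyedMessage<String, String>(
        "durable-topic", "key", "value"));
} catch (kafka.common.FailedToSendMessageException e) {
    // if the ISR shrinks below min.insync.replicas, the broker rejects the
    // write (NotEnoughReplicas*) and the producer raises this after retries
}
producer.close();
{code}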

 provide strong consistency with reasonable availability
 ---

 Key: KAFKA-1555
 URL: https://issues.apache.org/jira/browse/KAFKA-1555
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Affects Versions: 0.8.1.1
Reporter: Jiang Wu
Assignee: Gwen Shapira
 Fix For: 0.8.2

 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, 
 KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, 
 KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, 
 KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch


 In a mission critical application, we expect a kafka cluster with 3 brokers 
 can satisfy two requirements:
 1. When 1 broker is down, no message loss or service blocking happens.
 2. In worse cases, such as when two brokers are down, service can be blocked, but 
 no message loss happens.
 We found that the current kafka version (0.8.1.1) cannot achieve these requirements 
 due to three of its behaviors:
 1. when choosing a new leader from 2 followers in ISR, the one with fewer 
 messages may be chosen as the leader.
 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
 has fewer messages than the leader.
 3. ISR can contain only 1 broker, therefore acknowledged messages may be 
 stored in only 1 broker.
 The following is an analytical proof. 
 We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
 that at the beginning, all 3 replicas, leader A, followers B and C, are in 
 sync, i.e., they have the same messages and are all in ISR.
 According to the value of request.required.acks (acks for short), there are 
 the following cases.
 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this 
 time, although C hasn't received m, C is still in ISR. If A is killed, C can 
 be elected as the new leader, and consumers will miss m.
 3. acks=-1. B and C restart and are removed from ISR. Producer sends a 
 message m to A, and receives an acknowledgement. Disk failure happens in A 
 before B and C replicate m. Message m is lost.
 In summary, any existing configuration cannot satisfy the requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1555) provide strong consistency with reasonable availability

2014-10-24 Thread Kyle Banker (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183382#comment-14183382
 ] 

Kyle Banker edited comment on KAFKA-1555 at 10/24/14 8:03 PM:
--

The new durability docs look great so far. Here are a couple more proposed 
edits to the patch:

EDIT #1:
min.insync.replicas: When a producer sets request.required.acks to -1, 
min.insync.replicas specifies the minimum number of replicas that must 
acknowledge a write for the write to be considered successful. If this minimum 
cannot be met, then the producer will raise an exception (either 
NotEnoughReplicas or NotEnoughReplicasAfterAppend).

When used together, min.insync.replicas and request.required.acks allow you to 
enforce greater durability guarantees. A typical scenario would be to create a 
topic with a replication factor of 3, set min.insync.replicas to 2, and produce 
with request.required.acks of -1. This will ensure that the producer raises an 
exception if a majority of replicas do not receive a write.

EDIT #2 (for request.required.acks):
  -1. The producer gets an acknowledgement after all in-sync replicas have 
received the data. This option provides the greatest level of durability. 
However, it does not completely eliminate the risk of message loss because the 
number of in sync replicas may, in rare cases, shrink to 1. If you want to 
ensure that some minimum number of replicas (typically a majority) receive a 
write, then you must set the topic-level min.insync.replicas setting. Please 
read the Replication section of the design documentation for a more in-depth 
discussion.


was (Author: kbanker):
The new durability docs look great so far. Here are a couple more proposed 
edits to the patch:

EDIT #1:
min.insync.replicas: When a producer sets request.required.acks to -1, 
min.insync.replicas specifies the minimum number of replicas that must 
acknowledge a write for the write to be considered successful. If this minimum 
cannot be met, then the producer will raise an exception (either 
NotEnoughReplicas or NotEnoughReplicasAfterAppend).

When used together, min.insync.replicas and request.required.acks allow you to 
enforce greater durability guarantees. A typical scenario would be to create a 
topic with a replication factor of 3, set min.insync.replicas to 2, and produce 
with request.required.acks of -1. This will ensure that the producer raises an 
exception if a majority of replicas do not receive a write.

EDIT #2 (for request.required.acks):
  -1. The producer gets an acknowledgement after all in-sync replicas have 
received the data. This option provides the greatest level of durability. 
However, it does not completely eliminate the risk of message loss because the 
number of in sync replicas may, in rare cases, shrink to 1. If you want to 
ensure that some minimum number of replicas (typically a majority) receive a 
write, the you must set the topic-level min.insync.replicas setting. Please 
read the Replication section of the design documentation for a more in-depth 
discussion.

 provide strong consistency with reasonable availability
 ---

 Key: KAFKA-1555
 URL: https://issues.apache.org/jira/browse/KAFKA-1555
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Affects Versions: 0.8.1.1
Reporter: Jiang Wu
Assignee: Gwen Shapira
 Fix For: 0.8.2

 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, 
 KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, 
 KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, 
 KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch


 In a mission critical application, we expect a kafka cluster with 3 brokers 
 can satisfy two requirements:
 1. When 1 broker is down, no message loss or service blocking happens.
 2. In worse cases, such as when two brokers are down, service can be blocked, but 
 no message loss happens.
 We found that the current kafka version (0.8.1.1) cannot achieve these requirements 
 due to three of its behaviors:
 1. when choosing a new leader from 2 followers in ISR, the one with fewer 
 messages may be chosen as the leader.
 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
 has fewer messages than the leader.
 3. ISR can contain only 1 broker, therefore acknowledged messages may be 
 stored in only 1 broker.
 The following is an analytical proof. 
 We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
 that at the beginning, all 3 replicas, leader A, followers B and C, are in 
 sync, i.e., they have the same messages and are all in ISR.
 According to the value of request.required.acks (acks for short), there are 
 the following cases.
 1. acks=0, 1, 3. Obviously 

[jira] [Comment Edited] (KAFKA-1555) provide strong consistency with reasonable availability

2014-10-24 Thread Kyle Banker (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183382#comment-14183382
 ] 

Kyle Banker edited comment on KAFKA-1555 at 10/24/14 8:03 PM:
--

The new durability docs look great so far. Here are a couple more proposed 
edits to the patch:

EDIT #1:
min.insync.replicas: When a producer sets request.required.acks to -1, 
min.insync.replicas specifies the minimum number of replicas that must 
acknowledge a write for the write to be considered successful. If this minimum 
cannot be met, then the producer will raise an exception (either 
NotEnoughReplicas or NotEnoughReplicasAfterAppend).

When used together, min.insync.replicas and request.required.acks allow you to 
enforce greater durability guarantees. A typical scenario would be to create a 
topic with a replication factor of 3, set min.insync.replicas to 2, and produce 
with request.required.acks of -1. This will ensure that the producer raises an 
exception if a majority of replicas do not receive a write.

EDIT #2 (for request.required.acks):
  -1. The producer gets an acknowledgement after all in-sync replicas have 
received the data. This option provides the greatest level of durability. 
However, it does not completely eliminate the risk of message loss because the 
number of in sync replicas may, in rare cases, shrink to 1. If you want to 
ensure that some minimum number of replicas (typically a majority) receive a 
write, the you must set the topic-level min.insync.replicas setting. Please 
read the Replication section of the design documentation for a more in-depth 
discussion.


was (Author: kbanker):
The new durability docs look great so far. Here are a couple more proposed 
edits to the patch:

EDIT #1:
min.insync.replicas: When a producer sets request.required.acks to -1, 
min.insync.replicas specifies the minimum number of replicas that must 
acknowledge a write for the write to be considered successful. If this minimum 
cannot be met, then the producer will raise an exception (either 
NotEnoughReplicas or NotEnoughReplicasAfterAppend).

When used together, min.insync.replicas and request.required.acks allow you to 
enforce greater durability guarantees. A typical scenario would be to create a 
topic with a replication factor of 3, set min.insync.replicas to 2, and produce 
with request.required.acks of -1. This will ensure that the producer raises an 
exception if a majority of replicas do not receive a write.

EDIT #2 (for request.required.acks):
  -1. The producer gets an acknowledgement after all in-sync replicas have 
received the data. This option provides the greatest level of durability. 
However, it does not completely eliminate the risk of message loss because the 
number of in sync replicas may, in rare cases shrink, to 1. If you want to 
ensure that some minimum number of replicas (typically a majority) receive a 
write, the you must set the topic-level min.insync.replicas setting. Please 
read the Replication section of the design documentation for a more in-depth 
discussion.

 provide strong consistency with reasonable availability
 ---

 Key: KAFKA-1555
 URL: https://issues.apache.org/jira/browse/KAFKA-1555
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Affects Versions: 0.8.1.1
Reporter: Jiang Wu
Assignee: Gwen Shapira
 Fix For: 0.8.2

 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, 
 KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, 
 KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, 
 KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch


 In a mission critical application, we expect a kafka cluster with 3 brokers 
 can satisfy two requirements:
 1. When 1 broker is down, no message loss or service blocking happens.
 2. In worse cases, such as when two brokers are down, service can be blocked, but 
 no message loss happens.
 We found that the current kafka version (0.8.1.1) cannot achieve these requirements 
 due to three of its behaviors:
 1. when choosing a new leader from 2 followers in ISR, the one with fewer 
 messages may be chosen as the leader.
 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
 has fewer messages than the leader.
 3. ISR can contain only 1 broker, therefore acknowledged messages may be 
 stored in only 1 broker.
 The following is an analytical proof. 
 We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
 that at the beginning, all 3 replicas, leader A, followers B and C, are in 
 sync, i.e., they have the same messages and are all in ISR.
 According to the value of request.required.acks (acks for short), there are 
 the following cases.
 1. acks=0, 1, 3. Obviously 

[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability

2014-10-24 Thread Kyle Banker (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183393#comment-14183393
 ] 

Kyle Banker commented on KAFKA-1555:


I like the section on Availability and Durability Guarantees, but I believe 
that, in addition, it would be useful to suggest 3 or 4 typical durability 
configurations and the trade-offs provided by each one. As of now, users still 
have to infer from the docs the ideal settings for all of the following: topic 
replication factor, min.insync.replicas, request.required.acks, and whether or 
not to disable unclean leader election.

I'd be happy to write up a draft of these scenarios as I understand them if 
folks think this would be a good idea.
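
For reference, one possible shape of those scenarios (an illustrative sketch only; 
the values below are examples, not recommendations):

{code}
# 1) Availability first: leader-only ack; an out-of-sync replica may be elected.
#    replication-factor=3, request.required.acks=1,
#    unclean.leader.election.enable=true (the 0.8.x default)
#
# 2) Balanced: an acked write must reach a majority of the replicas.
#    replication-factor=3, min.insync.replicas=2, request.required.acks=-1
#
# 3) Durability first: as (2), plus never elect an out-of-sync leader.
#    replication-factor=3, min.insync.replicas=2, request.required.acks=-1,
#    unclean.leader.election.enable=false
{code}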

 provide strong consistency with reasonable availability
 ---

 Key: KAFKA-1555
 URL: https://issues.apache.org/jira/browse/KAFKA-1555
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Affects Versions: 0.8.1.1
Reporter: Jiang Wu
Assignee: Gwen Shapira
 Fix For: 0.8.2

 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, 
 KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, 
 KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, 
 KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch


 In a mission critical application, we expect a kafka cluster with 3 brokers 
 can satisfy two requirements:
 1. When 1 broker is down, no message loss or service blocking happens.
 2. In worse cases, such as when two brokers are down, service can be blocked, but 
 no message loss happens.
 We found that the current kafka version (0.8.1.1) cannot achieve these requirements 
 due to three of its behaviors:
 1. when choosing a new leader from 2 followers in ISR, the one with fewer 
 messages may be chosen as the leader.
 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
 has fewer messages than the leader.
 3. ISR can contain only 1 broker, therefore acknowledged messages may be 
 stored in only 1 broker.
 The following is an analytical proof. 
 We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
 that at the beginning, all 3 replicas, leader A, followers B and C, are in 
 sync, i.e., they have the same messages and are all in ISR.
 According to the value of request.required.acks (acks for short), there are 
 the following cases.
 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this 
 time, although C hasn't received m, C is still in ISR. If A is killed, C can 
 be elected as the new leader, and consumers will miss m.
 3. acks=-1. B and C restart and are removed from ISR. Producer sends a 
 message m to A, and receives an acknowledgement. Disk failure happens in A 
 before B and C replicate m. Message m is lost.
 In summary, any existing configuration cannot satisfy the requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1555) provide strong consistency with reasonable availability

2014-10-24 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated KAFKA-1555:

Attachment: KAFKA-1555-DOCS.2.patch

Uploading new docs, containing edits by [~kbanker] and [~jjkoshy].



 provide strong consistency with reasonable availability
 ---

 Key: KAFKA-1555
 URL: https://issues.apache.org/jira/browse/KAFKA-1555
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Affects Versions: 0.8.1.1
Reporter: Jiang Wu
Assignee: Gwen Shapira
 Fix For: 0.8.2

 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, 
 KAFKA-1555-DOCS.2.patch, KAFKA-1555.0.patch, KAFKA-1555.1.patch, 
 KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, 
 KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, 
 KAFKA-1555.8.patch, KAFKA-1555.9.patch


 In a mission critical application, we expect a kafka cluster with 3 brokers 
 can satisfy two requirements:
 1. When 1 broker is down, no message loss or service blocking happens.
 2. In worse cases, such as when two brokers are down, service can be blocked, but 
 no message loss happens.
 We found that the current kafka version (0.8.1.1) cannot achieve these requirements 
 due to three of its behaviors:
 1. when choosing a new leader from 2 followers in ISR, the one with fewer 
 messages may be chosen as the leader.
 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
 has fewer messages than the leader.
 3. ISR can contain only 1 broker, therefore acknowledged messages may be 
 stored in only 1 broker.
 The following is an analytical proof. 
 We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
 that at the beginning, all 3 replicas, leader A, followers B and C, are in 
 sync, i.e., they have the same messages and are all in ISR.
 According to the value of request.required.acks (acks for short), there are 
 the following cases.
 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this 
 time, although C hasn't received m, C is still in ISR. If A is killed, C can 
 be elected as the new leader, and consumers will miss m.
 3. acks=-1. B and C restart and are removed from ISR. Producer sends a 
 message m to A, and receives an acknowledgement. Disk failure happens in A 
 before B and C replicate m. Message m is lost.
 In summary, any existing configuration cannot satisfy the requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability

2014-10-24 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183411#comment-14183411
 ] 

Gwen Shapira commented on KAFKA-1555:
-

[~jjkoshy]:
* I did not make the suggested changes to ops.html since it's not related to 
this JIRA and this line was not modified in this patch. (Actually, the line you 
refer to doesn't seem to exist in my copy of the docs...)

* The above should probably be clarified a bit. i.e., availability of a 
replica affects whether a message will be lost or not only during the time it 
is yet to be replicated to all assigned replicas. 

I've re-phrased the sentence per Kyle's suggestion. Let me know if it works now.

 provide strong consistency with reasonable availability
 ---

 Key: KAFKA-1555
 URL: https://issues.apache.org/jira/browse/KAFKA-1555
 Project: Kafka
  Issue Type: Improvement
  Components: controller
Affects Versions: 0.8.1.1
Reporter: Jiang Wu
Assignee: Gwen Shapira
 Fix For: 0.8.2

 Attachments: KAFKA-1555-DOCS.0.patch, KAFKA-1555-DOCS.1.patch, 
 KAFKA-1555-DOCS.2.patch, KAFKA-1555.0.patch, KAFKA-1555.1.patch, 
 KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, 
 KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, 
 KAFKA-1555.8.patch, KAFKA-1555.9.patch


 In a mission critical application, we expect a kafka cluster with 3 brokers 
 can satisfy two requirements:
 1. When 1 broker is down, no message loss or service blocking happens.
 2. In worse cases, such as when two brokers are down, service can be blocked, but 
 no message loss happens.
 We found that the current kafka version (0.8.1.1) cannot achieve these requirements 
 due to three of its behaviors:
 1. when choosing a new leader from 2 followers in ISR, the one with fewer 
 messages may be chosen as the leader.
 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it 
 has fewer messages than the leader.
 3. ISR can contain only 1 broker, therefore acknowledged messages may be 
 stored in only 1 broker.
 The following is an analytical proof. 
 We consider a cluster with 3 brokers and a topic with 3 replicas, and assume 
 that at the beginning, all 3 replicas, leader A, followers B and C, are in 
 sync, i.e., they have the same messages and are all in ISR.
 According to the value of request.required.acks (acks for short), there are 
 the following cases.
 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this 
 time, although C hasn't received m, C is still in ISR. If A is killed, C can 
 be elected as the new leader, and consumers will miss m.
 3. acks=-1. B and C restart and are removed from ISR. Producer sends a 
 message m to A, and receives an acknowledgement. Disk failure happens in A 
 before B and C replicate m. Message m is lost.
 In summary, any existing configuration cannot satisfy the requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1710) [New Java Producer Potential Deadlock] Producer Deadlock when all messages is being sent to single partition

2014-10-24 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183514#comment-14183514
 ] 

Jay Kreps commented on KAFKA-1710:
--

Ah, gotcha, so that was per ms, not per us. Question: was this using compression? 
If so, which compression type (gzip, snappy, etc)?

 [New Java Producer Potential Deadlock] Producer Deadlock when all messages is 
 being sent to single partition
 

 Key: KAFKA-1710
 URL: https://issues.apache.org/jira/browse/KAFKA-1710
 Project: Kafka
  Issue Type: Bug
  Components: producer 
 Environment: Development
Reporter: Bhavesh Mistry
Assignee: Ewen Cheslack-Postava
Priority: Critical
  Labels: performance
 Attachments: Screen Shot 2014-10-13 at 10.19.04 AM.png, Screen Shot 
 2014-10-15 at 9.09.06 PM.png, Screen Shot 2014-10-15 at 9.14.15 PM.png, 
 TestNetworkDownProducer.java, th1.dump, th10.dump, th11.dump, th12.dump, 
 th13.dump, th14.dump, th15.dump, th2.dump, th3.dump, th4.dump, th5.dump, 
 th6.dump, th7.dump, th8.dump, th9.dump


 Hi Kafka Dev Team,
 When I run the test to send messages to a single partition for 3 minutes or so, 
 I have encountered a deadlock (please see the attached screenshot) and thread 
 contention in YourKit profiling.  
 Use Case:
 1)  Aggregating messages into same partition for metric counting. 
 2)  Replicate Old Producer behavior for sticking to partition for 3 minutes.
 Here is output:
 Frozen threads found (potential deadlock)
  
 It seems that the following threads have not changed their stack for more 
 than 10 seconds.
 These threads are possibly (but not necessarily!) in a deadlock or hung.
  
 pool-1-thread-128 --- Frozen for at least 2m
 org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition,
  byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
 org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, 
 Callback) KafkaProducer.java:237
 org.kafka.test.TestNetworkDownProducer$MyProducer.run() 
 TestNetworkDownProducer.java:84
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
 ThreadPoolExecutor.java:1145
 java.util.concurrent.ThreadPoolExecutor$Worker.run() 
 ThreadPoolExecutor.java:615
 java.lang.Thread.run() Thread.java:744
 pool-1-thread-159 --- Frozen for at least 2m 1 sec
 org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition,
  byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
 org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, 
 Callback) KafkaProducer.java:237
 org.kafka.test.TestNetworkDownProducer$MyProducer.run() 
 TestNetworkDownProducer.java:84
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
 ThreadPoolExecutor.java:1145
 java.util.concurrent.ThreadPoolExecutor$Worker.run() 
 ThreadPoolExecutor.java:615
 java.lang.Thread.run() Thread.java:744
 pool-1-thread-55 --- Frozen for at least 2m
 org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition,
  byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
 org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, 
 Callback) KafkaProducer.java:237
 org.kafka.test.TestNetworkDownProducer$MyProducer.run() 
 TestNetworkDownProducer.java:84
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
 ThreadPoolExecutor.java:1145
 java.util.concurrent.ThreadPoolExecutor$Worker.run() 
 ThreadPoolExecutor.java:615
 java.lang.Thread.run() Thread.java:744
 Thanks,
 Bhavesh 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1710) [New Java Producer Potential Deadlock] Producer Deadlock when all messages is being sent to single partition

2014-10-24 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183545#comment-14183545
 ] 

Jay Kreps commented on KAFKA-1710:
--

The reason I ask is because the per-partition lock is held for the duration of 
the write to the buffer. In the case where compression is enabled that will be 
longer because the compression occurs as part of the write. So in the case 
where you have two partitions you are effectively getting two cpu cores for 
compression and if you have only one you get just one.
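
A toy illustration of that constraint (my own sketch, not the actual 
RecordAccumulator code): the compressor belongs to the per-partition batch, so 
the compression work runs inside the partition's critical section, and N active 
partitions give you at most N cores' worth of compression.

{code}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

class PartitionBatch {
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();
    private final GZIPOutputStream gz;

    PartitionBatch() throws IOException {
        // one streaming compressor per batch, reused across messages so that
        // redundancy *between* messages is exploited (batch compression)
        gz = new GZIPOutputStream(buf);
    }

    synchronized void append(byte[] record) throws IOException {
        // every producer thread writing to this partition serializes here;
        // the compression itself runs while the lock is held
        gz.write(record);
    }
}
{code}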

 [New Java Producer Potential Deadlock] Producer Deadlock when all messages is 
 being sent to single partition
 

 Key: KAFKA-1710
 URL: https://issues.apache.org/jira/browse/KAFKA-1710
 Project: Kafka
  Issue Type: Bug
  Components: producer 
 Environment: Development
Reporter: Bhavesh Mistry
Assignee: Ewen Cheslack-Postava
Priority: Critical
  Labels: performance
 Attachments: Screen Shot 2014-10-13 at 10.19.04 AM.png, Screen Shot 
 2014-10-15 at 9.09.06 PM.png, Screen Shot 2014-10-15 at 9.14.15 PM.png, 
 TestNetworkDownProducer.java, th1.dump, th10.dump, th11.dump, th12.dump, 
 th13.dump, th14.dump, th15.dump, th2.dump, th3.dump, th4.dump, th5.dump, 
 th6.dump, th7.dump, th8.dump, th9.dump


 Hi Kafka Dev Team,
 When I run the test to send messages to a single partition for 3 minutes or so, 
 I have encountered a deadlock (please see the attached screenshot) and thread 
 contention in YourKit profiling.  
 Use Case:
 1)  Aggregating messages into same partition for metric counting. 
 2)  Replicate Old Producer behavior for sticking to partition for 3 minutes.
 Here is output:
 Frozen threads found (potential deadlock)
  
 It seems that the following threads have not changed their stack for more 
 than 10 seconds.
 These threads are possibly (but not necessarily!) in a deadlock or hung.
  
 pool-1-thread-128 --- Frozen for at least 2m
 org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition,
  byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
 org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, 
 Callback) KafkaProducer.java:237
 org.kafka.test.TestNetworkDownProducer$MyProducer.run() 
 TestNetworkDownProducer.java:84
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
 ThreadPoolExecutor.java:1145
 java.util.concurrent.ThreadPoolExecutor$Worker.run() 
 ThreadPoolExecutor.java:615
 java.lang.Thread.run() Thread.java:744
 pool-1-thread-159 --- Frozen for at least 2m 1 sec
 org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition,
  byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
 org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, 
 Callback) KafkaProducer.java:237
 org.kafka.test.TestNetworkDownProducer$MyProducer.run() 
 TestNetworkDownProducer.java:84
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
 ThreadPoolExecutor.java:1145
 java.util.concurrent.ThreadPoolExecutor$Worker.run() 
 ThreadPoolExecutor.java:615
 java.lang.Thread.run() Thread.java:744
 pool-1-thread-55 --- Frozen for at least 2m
 org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition,
  byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
 org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, 
 Callback) KafkaProducer.java:237
 org.kafka.test.TestNetworkDownProducer$MyProducer.run() 
 TestNetworkDownProducer.java:84
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
 ThreadPoolExecutor.java:1145
 java.util.concurrent.ThreadPoolExecutor$Worker.run() 
 ThreadPoolExecutor.java:615
 java.lang.Thread.run() Thread.java:744
 Thanks,
 Bhavesh 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1710) [New Java Producer Potential Deadlock] Producer Deadlock when all messages is being sent to single partition

2014-10-24 Thread Bhavesh Mistry (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183648#comment-14183648
 ] 

Bhavesh Mistry commented on KAFKA-1710:
---

[~jkreps],

Yes, I did this test with 75 threads on my Mac Pro with 8 cores, with Snappy 
compression ON.  Do you have any idea how we can improve this enqueue for a 
single partition?  Maybe have x # of CPU active buffers?

Here is info about the box:

{code}
machdep.cpu.max_basic: 13
machdep.cpu.max_ext: 2147483656
machdep.cpu.vendor: GenuineIntel
machdep.cpu.brand_string: Intel(R) Core(TM) i7-3840QM CPU @ 2.80GHz
machdep.cpu.family: 6
machdep.cpu.model: 58
machdep.cpu.extmodel: 3
machdep.cpu.extfamily: 0
machdep.cpu.stepping: 9
machdep.cpu.feature_bits: 3219913727 2142954495
machdep.cpu.leaf7_feature_bits: 641
machdep.cpu.extfeature_bits: 672139520 1
machdep.cpu.signature: 198313
machdep.cpu.brand: 0
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA 
CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ 
DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC 
POPCNT AES PCID XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C
machdep.cpu.leaf7_features: SMEP ENFSTRG RDWRFSGS
machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF RDTSCP TSCI
machdep.cpu.logical_per_package: 16
machdep.cpu.cores_per_package: 8
{code}

 [New Java Producer Potential Deadlock] Producer Deadlock when all messages is 
 being sent to single partition
 

 Key: KAFKA-1710
 URL: https://issues.apache.org/jira/browse/KAFKA-1710
 Project: Kafka
  Issue Type: Bug
  Components: producer 
 Environment: Development
Reporter: Bhavesh Mistry
Assignee: Ewen Cheslack-Postava
Priority: Critical
  Labels: performance
 Attachments: Screen Shot 2014-10-13 at 10.19.04 AM.png, Screen Shot 
 2014-10-15 at 9.09.06 PM.png, Screen Shot 2014-10-15 at 9.14.15 PM.png, 
 TestNetworkDownProducer.java, th1.dump, th10.dump, th11.dump, th12.dump, 
 th13.dump, th14.dump, th15.dump, th2.dump, th3.dump, th4.dump, th5.dump, 
 th6.dump, th7.dump, th8.dump, th9.dump


 Hi Kafka Dev Team,
 When I run the test to send messages to a single partition for 3 minutes or so, 
 I have encountered a deadlock (please see the attached screenshot) and thread 
 contention in YourKit profiling.  
 Use Case:
 1)  Aggregating messages into same partition for metric counting. 
 2)  Replicate Old Producer behavior for sticking to partition for 3 minutes.
 Here is output:
 Frozen threads found (potential deadlock)
  
 It seems that the following threads have not changed their stack for more 
 than 10 seconds.
 These threads are possibly (but not necessarily!) in a deadlock or hung.
  
 pool-1-thread-128 --- Frozen for at least 2m
 org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition,
  byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
 org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, 
 Callback) KafkaProducer.java:237
 org.kafka.test.TestNetworkDownProducer$MyProducer.run() 
 TestNetworkDownProducer.java:84
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
 ThreadPoolExecutor.java:1145
 java.util.concurrent.ThreadPoolExecutor$Worker.run() 
 ThreadPoolExecutor.java:615
 java.lang.Thread.run() Thread.java:744
 pool-1-thread-159 --- Frozen for at least 2m 1 sec
 org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition,
  byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
 org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, 
 Callback) KafkaProducer.java:237
 org.kafka.test.TestNetworkDownProducer$MyProducer.run() 
 TestNetworkDownProducer.java:84
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
 ThreadPoolExecutor.java:1145
 java.util.concurrent.ThreadPoolExecutor$Worker.run() 
 ThreadPoolExecutor.java:615
 java.lang.Thread.run() Thread.java:744
 pool-1-thread-55 --- Frozen for at least 2m
 org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition,
  byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
 org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, 
 Callback) KafkaProducer.java:237
 org.kafka.test.TestNetworkDownProducer$MyProducer.run() 
 TestNetworkDownProducer.java:84
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
 ThreadPoolExecutor.java:1145
 java.util.concurrent.ThreadPoolExecutor$Worker.run() 
 ThreadPoolExecutor.java:615
 java.lang.Thread.run() Thread.java:744
 Thanks,
 Bhavesh 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1501) transient unit tests failures due to port already in use

2014-10-24 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183659#comment-14183659
 ] 

Guozhang Wang commented on KAFKA-1501:
--

Thanks Chris, I came to the same conclusion in my tests. I need some more time 
to tackle the problem.

 transient unit tests failures due to port already in use
 

 Key: KAFKA-1501
 URL: https://issues.apache.org/jira/browse/KAFKA-1501
 Project: Kafka
  Issue Type: Improvement
  Components: core
Reporter: Jun Rao
Assignee: Guozhang Wang
  Labels: newbie
 Attachments: KAFKA-1501.patch


 Saw the following transient failures.
 kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckOne FAILED
 kafka.common.KafkaException: Socket server failed to bind to 
 localhost:59909: Address already in use.
 at kafka.network.Acceptor.openServerSocket(SocketServer.scala:195)
 at kafka.network.Acceptor.<init>(SocketServer.scala:141)
 at kafka.network.SocketServer.startup(SocketServer.scala:68)
 at kafka.server.KafkaServer.startup(KafkaServer.scala:95)
 at kafka.utils.TestUtils$.createServer(TestUtils.scala:123)
 at 
 kafka.api.ProducerFailureHandlingTest.setUp(ProducerFailureHandlingTest.scala:68)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


High Level Consumer Iterator IllegalStateException Issue

2014-10-24 Thread Bhavesh Mistry
HI Kafka Community ,

I am using the kafka trunk source code and I get the following exception.  What
could cause the iterator to be in a FAILED state?  Please let me know how I
can fix this issue.





java.lang.IllegalStateException: Iterator is in failed state
    at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:54)

Here are the Properties:

Properties props = new Properties();
props.put("zookeeper.connect", zkConnect);
props.put("group.id", groupId);
props.put("consumer.timeout.ms", "-1");
props.put("zookeeper.session.timeout.ms", "1");
props.put("zookeeper.sync.time.ms", "6000");
props.put("auto.commit.interval.ms", "2000");
props.put("rebalance.max.retries", "8");
props.put("auto.offset.reset", "largest");
props.put("fetch.message.max.bytes", "2097152");
props.put("socket.receive.buffer.bytes", "2097152");
props.put("auto.commit.enable", "true");


Thanks,

Bhavesh


[jira] [Commented] (KAFKA-1710) [New Java Producer Potential Deadlock] Producer Deadlock when all messages is being sent to single partition

2014-10-24 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183775#comment-14183775
 ] 

Jay Kreps commented on KAFKA-1710:
--

You could try some profiling and see if you see any implementation bottlenecks.

I don't think we can fundamentally reengineer this piece or move the 
compression outside the lock. The reason is that you have multiple threads 
that want to write to a shared byte array. We need to synchronize access to 
ensure safety (otherwise they would overwrite each other's data). Furthermore, 
since this is batch compression, we are compressing into the destination array 
using the compressor already used for the prior messages. This batch compression 
is very important for a good compression ratio, as it allows redundancy between 
messages to be exploited. 

 [New Java Producer Potential Deadlock] Producer Deadlock when all messages is 
 being sent to single partition
 

 Key: KAFKA-1710
 URL: https://issues.apache.org/jira/browse/KAFKA-1710
 Project: Kafka
  Issue Type: Bug
  Components: producer 
 Environment: Development
Reporter: Bhavesh Mistry
Assignee: Ewen Cheslack-Postava
Priority: Critical
  Labels: performance
 Attachments: Screen Shot 2014-10-13 at 10.19.04 AM.png, Screen Shot 
 2014-10-15 at 9.09.06 PM.png, Screen Shot 2014-10-15 at 9.14.15 PM.png, 
 TestNetworkDownProducer.java, th1.dump, th10.dump, th11.dump, th12.dump, 
 th13.dump, th14.dump, th15.dump, th2.dump, th3.dump, th4.dump, th5.dump, 
 th6.dump, th7.dump, th8.dump, th9.dump


 Hi Kafka Dev Team,
 When I run the test to send messages to a single partition for 3 minutes or so, 
 I have encountered a deadlock (please see the attached screenshot) and thread 
 contention in YourKit profiling.  
 Use Case:
 1)  Aggregating messages into same partition for metric counting. 
 2)  Replicate Old Producer behavior for sticking to partition for 3 minutes.
 Here is output:
 Frozen threads found (potential deadlock)
  
 It seems that the following threads have not changed their stack for more 
 than 10 seconds.
 These threads are possibly (but not necessarily!) in a deadlock or hung.
  
 pool-1-thread-128 --- Frozen for at least 2m
 org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition,
  byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
 org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, 
 Callback) KafkaProducer.java:237
 org.kafka.test.TestNetworkDownProducer$MyProducer.run() 
 TestNetworkDownProducer.java:84
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
 ThreadPoolExecutor.java:1145
 java.util.concurrent.ThreadPoolExecutor$Worker.run() 
 ThreadPoolExecutor.java:615
 java.lang.Thread.run() Thread.java:744
 pool-1-thread-159 --- Frozen for at least 2m 1 sec
 org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition,
  byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
 org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, 
 Callback) KafkaProducer.java:237
 org.kafka.test.TestNetworkDownProducer$MyProducer.run() 
 TestNetworkDownProducer.java:84
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
 ThreadPoolExecutor.java:1145
 java.util.concurrent.ThreadPoolExecutor$Worker.run() 
 ThreadPoolExecutor.java:615
 java.lang.Thread.run() Thread.java:744
 pool-1-thread-55 --- Frozen for at least 2m
 org.apache.kafka.clients.producer.internals.RecordAccumulator.append(TopicPartition,
  byte[], byte[], CompressionType, Callback) RecordAccumulator.java:139
 org.apache.kafka.clients.producer.KafkaProducer.send(ProducerRecord, 
 Callback) KafkaProducer.java:237
 org.kafka.test.TestNetworkDownProducer$MyProducer.run() 
 TestNetworkDownProducer.java:84
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
 ThreadPoolExecutor.java:1145
 java.util.concurrent.ThreadPoolExecutor$Worker.run() 
 ThreadPoolExecutor.java:615
 java.lang.Thread.run() Thread.java:744
 Thanks,
 Bhavesh 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: High Level Consumer Iterator IllegalStateException Issue

2014-10-24 Thread Neha Narkhede
Which version of Kafka are you using on the consumer?

On Fri, Oct 24, 2014 at 4:14 PM, Bhavesh Mistry mistry.p.bhav...@gmail.com
wrote:

 HI Kafka Community ,

 I am using the kafka trunk source code and I get the following exception.  What
 could cause the iterator to be in a FAILED state?  Please let me know how I
 can fix this issue.





 java.lang.IllegalStateException: Iterator is in failed state
     at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:54)

 Here are the Properties:

 Properties props = new Properties();
 props.put("zookeeper.connect", zkConnect);
 props.put("group.id", groupId);
 props.put("consumer.timeout.ms", "-1");
 props.put("zookeeper.session.timeout.ms", "1");
 props.put("zookeeper.sync.time.ms", "6000");
 props.put("auto.commit.interval.ms", "2000");
 props.put("rebalance.max.retries", "8");
 props.put("auto.offset.reset", "largest");
 props.put("fetch.message.max.bytes", "2097152");
 props.put("socket.receive.buffer.bytes", "2097152");
 props.put("auto.commit.enable", "true");


 Thanks,

 Bhavesh



[jira] [Commented] (KAFKA-1501) transient unit tests failures due to port already in use

2014-10-24 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183787#comment-14183787
 ] 

Jay Kreps commented on KAFKA-1501:
--

Oh, man, my heart is broken, I was sure that was going to work.

It looks like a few projects do a few extra steps to check availability 
including creating a datagram socket using the given port. For example:
http://stackoverflow.com/questions/434718/sockets-discover-port-availability-using-java

The comments mention something in Apache Mina which is essentially the same.

I have no idea why that is needed, but maybe worth trying that in choosePorts? 
I like the idea of improving choosePorts rather than changing SocketServer...
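
Something like the following is what those projects seem to do (a sketch adapted 
from the StackOverflow answer above and Apache Mina's AvailablePortFinder; the 
class and method names here are made up):

{code}
import java.io.IOException;
import java.net.DatagramSocket;
import java.net.ServerSocket;

public class PortCheck {
    /** Returns true only if both a TCP and a UDP socket can bind to the port. */
    public static boolean isAvailable(int port) {
        ServerSocket ss = null;
        DatagramSocket ds = null;
        try {
            ss = new ServerSocket(port);
            ss.setReuseAddress(true);
            ds = new DatagramSocket(port); // the extra datagram-socket probe
            ds.setReuseAddress(true);
            return true;
        } catch (IOException e) {
            return false;
        } finally {
            if (ds != null) ds.close();
            if (ss != null) {
                try { ss.close(); } catch (IOException e) { /* ignored */ }
            }
        }
    }
}
{code}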

 transient unit tests failures due to port already in use
 

 Key: KAFKA-1501
 URL: https://issues.apache.org/jira/browse/KAFKA-1501
 Project: Kafka
  Issue Type: Improvement
  Components: core
Reporter: Jun Rao
Assignee: Guozhang Wang
  Labels: newbie
 Attachments: KAFKA-1501.patch


 Saw the following transient failures.
 kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckOne FAILED
 kafka.common.KafkaException: Socket server failed to bind to 
 localhost:59909: Address already in use.
 at kafka.network.Acceptor.openServerSocket(SocketServer.scala:195)
 at kafka.network.Acceptor.<init>(SocketServer.scala:141)
 at kafka.network.SocketServer.startup(SocketServer.scala:68)
 at kafka.server.KafkaServer.startup(KafkaServer.scala:95)
 at kafka.utils.TestUtils$.createServer(TestUtils.scala:123)
 at 
 kafka.api.ProducerFailureHandlingTest.setUp(ProducerFailureHandlingTest.scala:68)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)