[jira] [Commented] (KAFKA-1693) Issue sending more messages to single Kafka server (Load testing for Kafka transport)
[ https://issues.apache.org/jira/browse/KAFKA-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166538#comment-14166538 ] rajendram kathees commented on KAFKA-1693:
--
Thanks for the quick response. I changed localhost to the 127.0.0.1 IP address as suggested in the link above, but I am still getting the same exception. Could you please share a kafka server.properties file for my scenario, or suggest a solution?

kafka.common.KafkaException: fetching topic metadata for topics [Set(test1)] from broker [ArrayBuffer(id:0,host:127.0.0.1,port:9092)] failed
    at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:67)
    at kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
    at kafka.producer.async.DefaultEventHandler$$anonfun$handle$2.apply$mcV$sp(DefaultEventHandler.scala:78)
    at kafka.utils.Utils$.swallow(Utils.scala:167)
    at kafka.utils.Logging$class.swallowError(Logging.scala:106)
    at kafka.utils.Utils$.swallowError(Utils.scala:46)
    at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:78)
    at kafka.producer.Producer.send(Producer.scala:76)
    at kafka.javaapi.producer.Producer.send(Producer.scala:33)
    at org.wso2.carbon.connector.KafkaProduce.send(KafkaProduce.java:76)
    at org.wso2.carbon.connector.KafkaProduce.connect(KafkaProduce.java:27)
    at org.wso2.carbon.connector.core.AbstractConnector.mediate(AbstractConnector.java:32)
    at org.apache.synapse.mediators.ext.ClassMediator.mediate(ClassMediator.java:78)
    at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:77)
    at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:47)
    at org.apache.synapse.mediators.template.TemplateMediator.mediate(TemplateMediator.java:77)
    at org.apache.synapse.mediators.template.InvokeMediator.mediate(InvokeMediator.java:129)
    at org.apache.synapse.mediators.template.InvokeMediator.mediate(InvokeMediator.java:78)
    at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:77)
    at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:47)
    at org.apache.synapse.mediators.base.SequenceMediator.mediate(SequenceMediator.java:131)
    at org.apache.synapse.core.axis2.ProxyServiceMessageReceiver.receive(ProxyServiceMessageReceiver.java:166)
    at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:180)
    at org.apache.synapse.transport.passthru.ServerWorker.processNonEntityEnclosingRESTHandler(ServerWorker.java:344)
    at org.apache.synapse.transport.passthru.ServerWorker.processEntityEnclosingRequest(ServerWorker.java:385)
    at org.apache.synapse.transport.passthru.ServerWorker.run(ServerWorker.java:183)
    at org.apache.axis2.transport.base.threads.NativeWorkerPool$1.run(NativeWorkerPool.java:172)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.BindException: Cannot assign requested address
    at sun.nio.ch.Net.connect0(Native Method)
    at sun.nio.ch.Net.connect(Net.java:465)
    at sun.nio.ch.Net.connect(Net.java:457)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:670)
    at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57)
    at kafka.producer.SyncProducer.connect(SyncProducer.scala:141)
    at kafka.producer.SyncProducer.getOrMakeConnection(SyncProducer.scala:156)
    at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:68)
    at kafka.producer.SyncProducer.send(SyncProducer.scala:112)
    at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:53)

Issue sending more messages to single Kafka server (Load testing for Kafka transport)
Key: KAFKA-1693
URL: https://issues.apache.org/jira/browse/KAFKA-1693
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8.1.1
Environment: Ubuntu 14, Java 6
Reporter: rajendram kathees
Original Estimate: 24h
Remaining Estimate: 24h

I tried to send 5 messages to a single Kafka server: I sent the messages to the ESB using JMeter, and the ESB sent them on to the Kafka server. After 28,000 messages I get the following exception. Do I need to change any parameter value on the Kafka server? Please suggest a solution.

[2014-10-06 11:41:05,182] ERROR - Utils$ fetching topic metadata for topics [Set(test1)] from broker
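A plausible mechanism for the `java.net.BindException: Cannot assign requested address` above is ephemeral-port exhaustion: if each send opens (and closes) a fresh TCP connection, every closed client socket lingers in TIME_WAIT, and after tens of thousands of rapid sends the local port range runs dry. The sketch below illustrates the two connection patterns with plain Python sockets; the function names and the plain-socket stand-in are illustrative, not the Kafka producer API. The fix in the failing setup would be to cache and reuse the producer instance rather than creating one per message.

```python
import socket

def send_all_reusing_connection(host, port, payloads):
    """Preferred pattern: one long-lived connection for the whole batch,
    analogous to caching a single producer instance."""
    with socket.create_connection((host, port)) as sock:
        for p in payloads:
            sock.sendall(p)

def send_all_fresh_connections(host, port, payloads):
    """Anti-pattern: a connect/close cycle per message. Each close leaves
    a TIME_WAIT entry; at high rates this exhausts ephemeral ports and
    connect() fails with EADDRNOTAVAIL ("Cannot assign requested address")."""
    for p in payloads:
        with socket.create_connection((host, port)) as sock:
            sock.sendall(p)
```

Widening `net.ipv4.ip_local_port_range` or enabling `tcp_tw_reuse` only delays the symptom; reusing the connection removes it.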
[jira] [Comment Edited] (KAFKA-1693) Issue sending more messages to single Kafka server (Load testing for Kafka transport)
[ https://issues.apache.org/jira/browse/KAFKA-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166538#comment-14166538 ] rajendram kathees edited comment on KAFKA-1693 at 10/10/14 7:58 AM:
--
Thanks for your quick response. I changed localhost to the 127.0.0.1 IP address as suggested in the link above, but I am still getting the same exception (kafka.common.KafkaException: fetching topic metadata for topics [Set(test1)] from broker [ArrayBuffer(id:0,host:127.0.0.1,port:9092)] failed, caused by java.net.BindException: Cannot assign requested address; the full stack trace is identical to the one in the previous comment). Could you please share a kafka server.properties file for my scenario, or suggest a solution?
[jira] [Comment Edited] (KAFKA-1693) Issue sending more messages to single Kafka server (Load testing for Kafka transport)
[ https://issues.apache.org/jira/browse/KAFKA-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166538#comment-14166538 ] rajendram kathees edited comment on KAFKA-1693 at 10/10/14 8:10 AM:
--
Thanks for your quick response. I changed localhost to the 127.0.0.1 IP address as suggested in the link above, but I am still getting the same exception (kafka.common.KafkaException: fetching topic metadata for topics [Set(test1)] from broker [ArrayBuffer(id:0,host:127.0.0.1,port:9092)] failed, caused by java.net.BindException: Cannot assign requested address; the full stack trace is identical to the one in the previous comment). Could you please share a kafka server.properties file, or suggest a solution?
[jira] [Issue Comment Deleted] (KAFKA-1328) Add new consumer APIs
[ https://issues.apache.org/jira/browse/KAFKA-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stevo Slavic updated KAFKA-1328:
Comment: was deleted (was: Affects 0.9.0 ?!)

Add new consumer APIs
Key: KAFKA-1328
URL: https://issues.apache.org/jira/browse/KAFKA-1328
Project: Kafka
Issue Type: Sub-task
Components: consumer
Affects Versions: 0.8.0, 0.8.1
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Fix For: 0.8.2
Attachments: KAFKA-1328.patch, KAFKA-1328_2014-04-10_17:13:24.patch, KAFKA-1328_2014-04-10_18:30:48.patch, KAFKA-1328_2014-04-11_10:54:19.patch, KAFKA-1328_2014-04-11_11:16:44.patch, KAFKA-1328_2014-04-12_18:30:22.patch, KAFKA-1328_2014-04-12_19:12:12.patch, KAFKA-1328_2014-05-05_11:35:07.patch, KAFKA-1328_2014-05-05_11:35:41.patch, KAFKA-1328_2014-05-09_17:18:55.patch, KAFKA-1328_2014-05-16_11:46:02.patch, KAFKA-1328_2014-05-20_15:55:01.patch, KAFKA-1328_2014-05-20_16:34:37.patch

New consumer API discussion is here: http://mail-archives.apache.org/mod_mbox/kafka-users/201402.mbox/%3CCAOG_4QYBHwyi0xN=hl1fpnrtkvfjzx14ujfntft3nn_mw3+...@mail.gmail.com%3E

This JIRA includes reviewing and checking in the new consumer APIs.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-784) creating topic without partitions, deleting then creating with partition causes errors in 'kafka-list-topic'
[ https://issues.apache.org/jira/browse/KAFKA-784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166709#comment-14166709 ] Ahmet AKYOL commented on KAFKA-784:
---
Happy to hear that. I was using this version:

./kafka-run-class.sh kafka.admin.DeleteTopicCommand --topic mytopic --zookeeper localhost:2181

and afterwards I couldn't re-create the same topic until I removed the related Kafka data logs manually.

creating topic without partitions, deleting then creating with partition causes errors in 'kafka-list-topic'
Key: KAFKA-784
URL: https://issues.apache.org/jira/browse/KAFKA-784
Project: Kafka
Issue Type: Sub-task
Components: core
Affects Versions: 0.8.0
Environment: 0.8.0 head as of 3/4/2013
Reporter: Chris Curtin
Assignee: Swapnil Ghike
Priority: Minor

Create a new topic using the command line:
./kafka-create-topic.sh --topic trash-1 --replica 3 --zookeeper localhost
Realize you forgot to add the partition option, so remove the topic:
./kafka-delete-topic.sh --topic trash-1 --zookeeper localhost
Recreate it with partitions:
./kafka-create-topic.sh --topic trash-1 --replica 3 --zookeeper localhost --partition 5
Try to get a listing:
./kafka-list-topic.sh --topic trash-1 --zookeeper localhost
Errors:
[2013-03-04 14:15:23,876] ERROR Error while fetching metadata for partition [trash-1,0] (kafka.admin.AdminUtils$)
kafka.common.LeaderNotAvailableException: Leader not available for topic trash-1 partition 0
    at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
    at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
    at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
    at scala.collection.immutable.List.foreach(List.scala:45)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
    at scala.collection.immutable.List.map(List.scala:45)
    at kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
    at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
    at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
    at kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
    at kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
    at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
    at scala.collection.immutable.List.foreach(List.scala:45)
    at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
    at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
Caused by: kafka.common.LeaderNotAvailableException: No leader exists for partition 0
    at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
    ... 16 more
topic: trash-1 PartitionMetadata(0,None,List(),List(),5)
You can't recover until you restart all the brokers in the cluster; then the list command works correctly.
[jira] [Assigned] (KAFKA-1691) new java consumer needs ssl support as a client
[ https://issues.apache.org/jira/browse/KAFKA-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Lyutov reassigned KAFKA-1691:
--
Assignee: Ivan Lyutov

new java consumer needs ssl support as a client
Key: KAFKA-1691
URL: https://issues.apache.org/jira/browse/KAFKA-1691
Project: Kafka
Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Ivan Lyutov
Fix For: 0.8.3
[jira] [Assigned] (KAFKA-1690) new java producer needs ssl support as a client
[ https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Lyutov reassigned KAFKA-1690:
--
Assignee: Ivan Lyutov

new java producer needs ssl support as a client
Key: KAFKA-1690
URL: https://issues.apache.org/jira/browse/KAFKA-1690
Project: Kafka
Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Ivan Lyutov
Fix For: 0.8.3
[jira] [Commented] (KAFKA-1477) add authentication layer and initial JKS x509 implementation for brokers, producers and consumer for network communication
[ https://issues.apache.org/jira/browse/KAFKA-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166880#comment-14166880 ] Joe Stein commented on KAFKA-1477:
--
Jun, getting this to work in the new (java) client is going to take some more work integrating a secure nio version. Ivan is going to start on that later next week, after this last configuration item is done. I created separate tickets, KAFKA-1690 and KAFKA-1691, for the new (java) client work. I am +1 on committing this patch to trunk after the server configuration item is done and working the way folks expect. This gives the broker support so we can start to get the existing clients (including third-party clients, e.g. https://github.com/Shopify/sarama/pull/156 ) to hook in now for security features as we move ahead with them overall. I see the existing (scala and third-party) clients as stable clients that we are incrementally changing to confirm compatibility with each other, the broker, and the new (java) clients. We can drop support once they are no longer being used and folks have upgraded (or are at least on an upgrade path), but I don't think we are there yet, so putting this in now makes sense.

add authentication layer and initial JKS x509 implementation for brokers, producers and consumer for network communication
Key: KAFKA-1477
URL: https://issues.apache.org/jira/browse/KAFKA-1477
Project: Kafka
Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Ivan Lyutov
Fix For: 0.8.3
Attachments: KAFKA-1477-binary.patch, KAFKA-1477.patch, KAFKA-1477_2014-06-02_16:59:40.patch, KAFKA-1477_2014-06-02_17:24:26.patch, KAFKA-1477_2014-06-03_13:46:17.patch, KAFKA-1477_trunk.patch
[jira] [Updated] (KAFKA-784) creating topic without partitions, deleting then creating with partition causes errors in 'kafka-list-topic'
[ https://issues.apache.org/jira/browse/KAFKA-784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-784:
-
Fix Version/s: 0.8.2
[jira] [Resolved] (KAFKA-784) creating topic without partitions, deleting then creating with partition causes errors in 'kafka-list-topic'
[ https://issues.apache.org/jira/browse/KAFKA-784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani resolved KAFKA-784.
--
Resolution: Fixed
[jira] [Commented] (KAFKA-784) creating topic without partitions, deleting then creating with partition causes errors in 'kafka-list-topic'
[ https://issues.apache.org/jira/browse/KAFKA-784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166988#comment-14166988 ] Sriharsha Chintalapani commented on KAFKA-784:
--
[~liqusha] With the kafka 0.8.2 branch or trunk, kafka-topics won't allow creating a topic without --partitions and --replication-factor. Once the topic is created, deleting it and re-creating it works fine. I also tested this by increasing the topic's partitions and then deleting it; that works fine as well. Closing this as resolved.
Review Request 26560: Patch for KAFKA-1305
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26560/
---

Review request for kafka.

Bugs: KAFKA-1305
    https://issues.apache.org/jira/browse/KAFKA-1305

Repository: kafka

Description
---
KAFKA-1305. Controller can hang on controlled shutdown with auto leader balance enabled.

Diffs
-
core/src/main/scala/kafka/server/KafkaConfig.scala 90af698b01ec82b6168e02b6af41887ef164ad51

Diff: https://reviews.apache.org/r/26560/diff/

Testing
---

Thanks,
Sriharsha Chintalapani
[jira] [Commented] (KAFKA-1305) Controller can hang on controlled shutdown with auto leader balance enabled
[ https://issues.apache.org/jira/browse/KAFKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167017#comment-14167017 ] Sriharsha Chintalapani commented on KAFKA-1305:
---
Created reviewboard https://reviews.apache.org/r/26560/diff/ against branch origin/trunk

Controller can hang on controlled shutdown with auto leader balance enabled
Key: KAFKA-1305
URL: https://issues.apache.org/jira/browse/KAFKA-1305
Project: Kafka
Issue Type: Bug
Reporter: Joel Koshy
Assignee: Sriharsha Chintalapani
Priority: Blocker
Fix For: 0.8.2, 0.9.0
Attachments: KAFKA-1305.patch

This is relatively easy to reproduce, especially when doing a rolling bounce. What happened here is as follows:
1. The previous controller was bounced and broker 265 became the new controller.
2. I went on to do a controlled shutdown of broker 265 (the new controller).
3. In the meantime, the automatically scheduled preferred replica leader election process started doing its thing and began sending LeaderAndIsrRequests/UpdateMetadataRequests to itself (and other brokers). (t@113 below.)
4. While that was happening, the controlled shutdown process on 265 succeeded and proceeded to deregister itself from ZooKeeper and shut down the socket server.
5. (ReplicaStateMachine actually removes deregistered brokers from the controller channel manager's list of brokers to send requests to. However, that removal cannot take place (t@18 below) because the preferred replica leader election task owns the controller lock.)
6. So the request thread to broker 265 gets into infinite retries.
7. The entire broker shutdown process is blocked on controller shutdown for the same reason (it needs to acquire the controller lock).
Relevant portions from the thread-dump: Controller-265-to-broker-265-send-thread - Thread t@113 java.lang.Thread.State: TIMED_WAITING at java.lang.Thread.sleep(Native Method) at kafka.controller.RequestSendThread$$anonfun$liftedTree1$1$1.apply$mcV$sp(ControllerChannelManager.scala:143) at kafka.utils.Utils$.swallow(Utils.scala:167) at kafka.utils.Logging$class.swallowWarn(Logging.scala:92) at kafka.utils.Utils$.swallowWarn(Utils.scala:46) at kafka.utils.Logging$class.swallow(Logging.scala:94) at kafka.utils.Utils$.swallow(Utils.scala:46) at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:143) at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131) - locked java.lang.Object@6dbf14a7 at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51) Locked ownable synchronizers: - None ... Thread-4 - Thread t@17 java.lang.Thread.State: WAITING on java.util.concurrent.locks.ReentrantLock$NonfairSync@4836840 owned by: kafka-scheduler-0 at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262) at kafka.utils.Utils$.inLock(Utils.scala:536) at kafka.controller.KafkaController.shutdown(KafkaController.scala:642) at kafka.server.KafkaServer$$anonfun$shutdown$9.apply$mcV$sp(KafkaServer.scala:242) at kafka.utils.Utils$.swallow(Utils.scala:167) at kafka.utils.Logging$class.swallowWarn(Logging.scala:92) at kafka.utils.Utils$.swallowWarn(Utils.scala:46) at 
kafka.utils.Logging$class.swallow(Logging.scala:94) at kafka.utils.Utils$.swallow(Utils.scala:46) at kafka.server.KafkaServer.shutdown(KafkaServer.scala:242) at kafka.server.KafkaServerStartable.shutdown(KafkaServerStartable.scala:46) at kafka.Kafka$$anon$1.run(Kafka.scala:42) ... kafka-scheduler-0 - Thread t@117 java.lang.Thread.State: WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1dc407fc at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:306) at
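The wedge described in steps 3-7 above can be sketched in miniature: one thread takes the controller lock and then blocks on a full bounded request queue, while the shutdown thread waits behind the same lock. This is an illustrative toy, not Kafka code; the thread roles mirror the dump above, but the queue capacity of 1 and the 200 ms lock probe are invented for the demo.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class ControllerLockSketch {
    public static void main(String[] args) throws Exception {
        ReentrantLock controllerLock = new ReentrantLock();
        // Stand-in for the controller's bounded per-broker request queue.
        LinkedBlockingQueue<String> requestQueue = new LinkedBlockingQueue<>(1);
        requestQueue.put("LeaderAndIsrRequest"); // queue is now full

        CountDownLatch electionHoldsLock = new CountDownLatch(1);
        // Role of "kafka-scheduler-0": take the controller lock, then block on put().
        Thread election = new Thread(() -> {
            controllerLock.lock();
            try {
                electionHoldsLock.countDown();
                requestQueue.put("UpdateMetadataRequest"); // blocks: nothing drains the queue
            } catch (InterruptedException e) {
                // interrupted by main below so the demo can finish
            } finally {
                controllerLock.unlock();
            }
        });
        election.start();
        electionHoldsLock.await();

        // Role of "Thread-4" (broker shutdown): stuck behind the same lock.
        boolean acquired = controllerLock.tryLock(200, TimeUnit.MILLISECONDS);
        System.out.println("shutdown acquired controller lock: " + acquired);

        election.interrupt(); // break the simulated hang
        election.join();
    }
}
```

In the real broker there is no timeout: shutdown acquires the lock via `inLock()`, so both threads wait indefinitely, which is the hang in the thread dump.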
[jira] [Updated] (KAFKA-1305) Controller can hang on controlled shutdown with auto leader balance enabled
[ https://issues.apache.org/jira/browse/KAFKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1305: -- Attachment: KAFKA-1305.patch
[jira] [Updated] (KAFKA-1305) Controller can hang on controlled shutdown with auto leader balance enabled
[ https://issues.apache.org/jira/browse/KAFKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1305: -- Status: Patch Available (was: Open)
Re: including KAFKA-1555 in 0.8.2?
+1. On Thu, Oct 9, 2014 at 6:41 PM, Jun Rao jun...@gmail.com wrote: Hi, Everyone, I just committed KAFKA-1555 (min.isr support) to trunk. I felt that it's probably useful to include it in the 0.8.2 release. Any objections? Thanks, Jun
[jira] [Commented] (KAFKA-1305) Controller can hang on controlled shutdown with auto leader balance enabled
[ https://issues.apache.org/jira/browse/KAFKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167050#comment-14167050 ] Neha Narkhede commented on KAFKA-1305: -- [~junrao], [~sriharsha] What's the value in changing it from something to 10K vs unbounded?
Re: Review Request 26560: Patch for KAFKA-1305
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26560/#review56156 --- core/src/main/scala/kafka/server/KafkaConfig.scala https://reviews.apache.org/r/26560/#comment96475 Weren't we going to make this unbounded? - Neha Narkhede On Oct. 10, 2014, 3:51 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26560/ --- (Updated Oct. 10, 2014, 3:51 p.m.) Review request for kafka. Bugs: KAFKA-1305 https://issues.apache.org/jira/browse/KAFKA-1305 Repository: kafka Description --- KAFKA-1305. Controller can hang on controlled shutdown with auto leader balance enabled. Diffs - core/src/main/scala/kafka/server/KafkaConfig.scala 90af698b01ec82b6168e02b6af41887ef164ad51 Diff: https://reviews.apache.org/r/26560/diff/ Testing --- Thanks, Sriharsha Chintalapani
[jira] [Comment Edited] (KAFKA-1697) remove code related to ack>1 on the broker
[ https://issues.apache.org/jira/browse/KAFKA-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14166144#comment-14166144 ] Neha Narkhede edited comment on KAFKA-1697 at 10/10/14 4:12 PM: We probably can wait until KAFKA-1583 is done since the code change will likely be easier then. was (Author: junrao): We probably can wait under kafka-1583 is done since code change will likely be easier then. remove code related to ack>1 on the broker -- Key: KAFKA-1697 URL: https://issues.apache.org/jira/browse/KAFKA-1697 Project: Kafka Issue Type: Bug Reporter: Jun Rao Assignee: Gwen Shapira Fix For: 0.8.3 We removed the ack>1 support from the producer client in KAFKA-1555. We can completely remove the code in the broker that supports ack>1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167057#comment-14167057 ] Jay Kreps commented on KAFKA-1555: -- w00t! provide strong consistency with reasonable availability --- Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch In a mission-critical application, we expect that a Kafka cluster with 3 brokers can satisfy two requirements: 1. When 1 broker is down, no message loss or service blocking happens. 2. In worse cases, such as when two brokers are down, service can be blocked, but no message loss happens. We found that the current Kafka version (0.8.1.1) cannot achieve the requirements due to three behaviors: 1. when choosing a new leader from 2 followers in the ISR, the one with fewer messages may be chosen as the leader. 2. even when replica.lag.max.messages=0, a follower can stay in the ISR when it has fewer messages than the leader. 3. the ISR can contain only 1 broker, therefore acknowledged messages may be stored in only 1 broker. The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning all 3 replicas, leader A and followers B and C, are in sync, i.e., they have the same messages and are all in the ISR. According to the value of request.required.acks (acks for short), there are the following cases. 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement. 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in the ISR. If A is killed, C can be elected as the new leader, and consumers will miss m. 3. acks=-1. B and C restart and are removed from the ISR. Producer sends a message m to A and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost. In summary, no existing configuration can satisfy the requirements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
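The acks=2 data-loss case in the description above can be checked with a toy model: replica logs are plain lists, and "leader election" is simply picking follower C. The names A, B, and C follow the report; nothing here is Kafka code.

```java
import java.util.ArrayList;
import java.util.List;

public class AcksTwoSketch {
    public static void main(String[] args) {
        // Three in-sync replicas of one partition: leader A, followers B and C.
        List<String> a = new ArrayList<>(), b = new ArrayList<>(), c = new ArrayList<>();
        // acks=2: the producer is acknowledged once A and B have message m,
        // even though C (still considered in the ISR) has not replicated it yet.
        a.add("m");
        b.add("m");
        // A is killed; C, being in the ISR, is a legal choice for the new leader.
        List<String> newLeader = c;
        boolean acknowledgedMessageLost = !newLeader.contains("m");
        System.out.println("acknowledged message lost after failover: " + acknowledgedMessageLost);
    }
}
```

The fix in KAFKA-1555 addresses this by combining acks=-1 with a minimum ISR size, so an acknowledgement implies the message is on at least that many replicas.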
Re: Review Request 26474: KAFKA-1654 Provide a way to override server configuration from command line
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26474/ --- (Updated Oct. 10, 2014, 4:23 p.m.) Review request for kafka and Neha Narkhede. Bugs: KAFKA-1654 https://issues.apache.org/jira/browse/KAFKA-1654 Repository: kafka Description --- I'm assuming that we might want to add additional arguments in the future as well, so I've added a general facility to parse arguments to the Kafka main class and added an argument --set that defines/overrides any property in the config file. I've decided to use --set rather than exposing each property that is available in the KafkaConfig class as its own argument, so that we don't have to keep those two classes always in sync. This is the first bigger patch that I've written in Scala, so I'm particularly interested to hear feedback on the coding style. Diffs - core/src/main/scala/kafka/Kafka.scala 2e94fee core/src/test/scala/unit/kafka/KafkaTest.scala PRE-CREATION Diff: https://reviews.apache.org/r/26474/diff/ Testing --- I've added unit tests and verified the functionality on a real cluster. Thanks, Jarek Cecho
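The override mechanism described in the review above amounts to merging `--set key=value` pairs on top of the properties loaded from the config file. A minimal sketch, assuming the `--set` flag name from the patch description (the option name that finally lands on trunk may differ), in Java rather than the patch's Scala:

```java
import java.util.Properties;

public class OverrideSketch {
    // Apply "--set key=value" overrides on top of properties loaded from
    // server.properties; later command-line values win over file values.
    static Properties withOverrides(Properties fromFile, String[] args) {
        Properties merged = new Properties();
        merged.putAll(fromFile);
        for (int i = 0; i < args.length - 1; i++) {
            if (args[i].equals("--set")) {
                String[] kv = args[++i].split("=", 2); // split only on the first '='
                merged.setProperty(kv[0], kv[1]);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Properties file = new Properties();
        file.setProperty("num.io.threads", "8");
        Properties merged = withOverrides(file,
                new String[]{"--set", "num.io.threads=16", "--set", "log.dirs=/tmp/kafka"});
        System.out.println(merged.getProperty("num.io.threads"));
        System.out.println(merged.getProperty("log.dirs"));
    }
}
```

The design choice the author mentions holds here too: because the overrides are generic key=value pairs, no new command-line flag is needed when KafkaConfig grows a property.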
Re: including KAFKA-1555 in 0.8.2?
+1 On Oct 10, 2014 12:08 PM, Neha Narkhede neha.narkh...@gmail.com wrote: +1. On Thu, Oct 9, 2014 at 6:41 PM, Jun Rao jun...@gmail.com wrote: Hi, Everyone, I just committed KAFKA-1555 (min.isr support) to trunk. I felt that it's probably useful to include it in the 0.8.2 release. Any objections? Thanks, Jun
Re: including KAFKA-1555 in 0.8.2?
+1 :) On Fri, Oct 10, 2014 at 9:34 AM, Joe Stein joe.st...@stealth.ly wrote: +1 On Oct 10, 2014 12:08 PM, Neha Narkhede neha.narkh...@gmail.com wrote: +1. On Thu, Oct 9, 2014 at 6:41 PM, Jun Rao jun...@gmail.com wrote: Hi, Everyone, I just committed KAFKA-1555 (min.isr support) to trunk. I felt that it's probably useful to include it in the 0.8.2 release. Any objections? Thanks, Jun
Re: Security JIRAS
I'd vote for accepting every major change with the relevant system tests. We didn't do this for major features in the past, which led to weak coverage and a great deal of work for someone else to add tests later. I'm guilty of this myself :-( On Thu, Oct 9, 2014 at 6:45 PM, Gwen Shapira gshap...@cloudera.com wrote: Added some details on delegation tokens. I hope it at least clarifies some of the scope. I'm working on a more detailed design doc. On Thu, Oct 9, 2014 at 1:44 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey Gwen, You're absolutely right about these. I added the ticket for ZK authentication and Hadoop delegation tokens. For the Hadoop case I actually don't understand Hadoop security very well. Maybe you could fill in some of the details on what needs to happen for that to work? For testing, we should probably discuss the best way to test security. I think this is a fairly critical thing; if we are going to say we have security, we really need to have good tests in place to ensure we do. This will require some thought. I think we should be able to test TLS fairly easily using a junit integration test that just starts the server and connects using TLS. For Kerberos, though, it isn't clear to me how to do good integration testing since we need a KDC to test against, and it isn't clear how that happens in the test environment except possibly manually (which is not ideal). How do other projects handle this? -Jay On Tue, Oct 7, 2014 at 5:25 PM, Gwen Shapira gshap...@cloudera.com wrote: I think we need to add: * Authentication of Kafka brokers with a secured ZooKeeper * Kafka should be able to generate delegation tokens for MapReduce / Spark / Yarn jobs. 
* Extend systest framework to allow testing secured kafka Gwen On Tue, Oct 7, 2014 at 5:15 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey guys, As promised, I added a tree of JIRAs for the stuff in the security wiki ( https://cwiki.apache.org/confluence/display/KAFKA/Security): https://issues.apache.org/jira/browse/KAFKA-1682 I tried to break it into reasonably standalone pieces. I think many of the tickets could actually be done in parallel. Since there were many people interested in this area this may help parallelize the work a bit. I added some strawman details on implementation to each ticket. We can discuss and refine further on the individual tickets. Please take a look and let me know if this breakdown seems reasonable. Cheers, -Jay
Re: including KAFKA-1555 in 0.8.2?
+1 On 10/10/14 9:39 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 :) On Fri, Oct 10, 2014 at 9:34 AM, Joe Stein joe.st...@stealth.ly wrote: +1 On Oct 10, 2014 12:08 PM, Neha Narkhede neha.narkh...@gmail.com wrote: +1. On Thu, Oct 9, 2014 at 6:41 PM, Jun Rao jun...@gmail.com wrote: Hi, Everyone, I just committed KAFKA-1555 (min.isr support) to trunk. I felt that it's probably useful to include it in the 0.8.2 release. Any objections? Thanks, Jun
Re: Security JIRAS
I would be a strong +1 on that. I've seen a lot of regressions on other projects where new functionality broke things when running in secure mode. Jarcec On Oct 10, 2014, at 9:43 AM, Neha Narkhede neha.narkh...@gmail.com wrote: I'd vote for accepting every major change with the relevant system tests. We didn't do this for major features in the past, which led to weak coverage and a great deal of work for someone else to add tests later. I'm guilty of this myself :-( On Thu, Oct 9, 2014 at 6:45 PM, Gwen Shapira gshap...@cloudera.com wrote: Added some details on delegation tokens. I hope it at least clarifies some of the scope. I'm working on a more detailed design doc. On Thu, Oct 9, 2014 at 1:44 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey Gwen, You're absolutely right about these. I added the ticket for ZK authentication and Hadoop delegation tokens. For the Hadoop case I actually don't understand Hadoop security very well. Maybe you could fill in some of the details on what needs to happen for that to work? For testing, we should probably discuss the best way to test security. I think this is a fairly critical thing; if we are going to say we have security, we really need to have good tests in place to ensure we do. This will require some thought. I think we should be able to test TLS fairly easily using a junit integration test that just starts the server and connects using TLS. For Kerberos, though, it isn't clear to me how to do good integration testing since we need a KDC to test against, and it isn't clear how that happens in the test environment except possibly manually (which is not ideal). How do other projects handle this? -Jay On Tue, Oct 7, 2014 at 5:25 PM, Gwen Shapira gshap...@cloudera.com wrote: I think we need to add: * Authentication of Kafka brokers with a secured ZooKeeper * Kafka should be able to generate delegation tokens for MapReduce / Spark / Yarn jobs. 
* Extend systest framework to allow testing secured kafka Gwen On Tue, Oct 7, 2014 at 5:15 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey guys, As promised, I added a tree of JIRAs for the stuff in the security wiki ( https://cwiki.apache.org/confluence/display/KAFKA/Security): https://issues.apache.org/jira/browse/KAFKA-1682 I tried to break it into reasonably standalone pieces. I think many of the tickets could actually be done in parallel. Since there were many people interested in this area this may help parallelize the work a bit. I added some strawman details on implementation to each ticket. We can discuss and refine further on the individual tickets. Please take a look and let me know if this breakdown seems reasonable. Cheers, -Jay
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167113#comment-14167113 ] Sriram Subramanian commented on KAFKA-1555: --- Awesome. I suggest we document the guarantees provided by the different knobs. That would be very useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-559) Garbage collect old consumer metadata entries
[ https://issues.apache.org/jira/browse/KAFKA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neha Narkhede updated KAFKA-559: Labels: newbie project (was: project) Garbage collect old consumer metadata entries - Key: KAFKA-559 URL: https://issues.apache.org/jira/browse/KAFKA-559 Project: Kafka Issue Type: New Feature Reporter: Jay Kreps Assignee: Tejas Patil Labels: newbie, project Attachments: KAFKA-559.v1.patch, KAFKA-559.v2.patch Many use cases involve transient consumers. These consumers create entries under their consumer group in ZooKeeper and maintain offsets there as well. There is currently no way to delete these entries. It would be good to have a tool that did something like bin/delete-obsolete-consumer-groups.sh [--topic t1] --since [date] --zookeeper [zk_connect] This would scan through consumer group entries and delete any that had no offset update since the given date. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
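The core of the proposed tool is just a cutoff filter over each group's last offset-update time. A hypothetical sketch of that selection step, with the ZooKeeper scan replaced by an in-memory map (group name to last update time) since the real tool's reading of ZK paths is not specified in the ticket:

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StaleGroupSketch {
    // Return the consumer groups with no offset update since the cutoff,
    // i.e. the candidates that --since [date] would delete.
    static List<String> staleGroups(Map<String, Instant> lastUpdate, Instant since) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Instant> e : lastUpdate.entrySet())
            if (e.getValue().isBefore(since)) stale.add(e.getKey());
        Collections.sort(stale); // deterministic output for display
        return stale;
    }

    public static void main(String[] args) {
        Map<String, Instant> groups = new HashMap<>();
        groups.put("etl-job",  Instant.parse("2012-01-01T00:00:00Z")); // long dead
        groups.put("live-app", Instant.parse("2014-10-01T00:00:00Z")); // recently active
        List<String> stale = staleGroups(groups, Instant.parse("2014-01-01T00:00:00Z"));
        System.out.println(stale);
    }
}
```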
[jira] [Commented] (KAFKA-1305) Controller can hang on controlled shutdown with auto leader balance enabled
[ https://issues.apache.org/jira/browse/KAFKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167145#comment-14167145 ] Sriharsha Chintalapani commented on KAFKA-1305: --- [~nehanarkhede] [~junrao] my understanding is that we created more room for the KafkaController not to get into any of the above-mentioned issues by setting it to 10k, but yes, making it unbounded is a better option, as there is a chance of exhausting the 10k bounded queue and running into issues. We can get rid of controller.message.queue.size as a config option and make the LinkedBlockingQueue unbounded. Controller can hang on controlled shutdown with auto leader balance enabled --- Key: KAFKA-1305 URL: https://issues.apache.org/jira/browse/KAFKA-1305 Project: Kafka Issue Type: Bug Reporter: Joel Koshy Assignee: Sriharsha Chintalapani Priority: Blocker Fix For: 0.8.2, 0.9.0 Attachments: KAFKA-1305.patch This is relatively easy to reproduce especially when doing a rolling bounce. What happened here is as follows: 1. The previous controller was bounced and broker 265 became the new controller. 2. I went on to do a controlled shutdown of broker 265 (the new controller). 3. In the mean time the automatically scheduled preferred replica leader election process started doing its thing and starts sending LeaderAndIsrRequests/UpdateMetadataRequests to itself (and other brokers). (t@113 below). 4. While that's happening, the controlled shutdown process on 265 succeeds and proceeds to deregister itself from ZooKeeper and shuts down the socket server. 5. (ReplicaStateMachine actually removes deregistered brokers from the controller channel manager's list of brokers to send requests to. However, that removal cannot take place (t@18 below) because preferred replica leader election task owns the controller lock.) 6. So the request thread to broker 265 gets into infinite retries. 7. 
The entire broker shutdown process is blocked on controller shutdown for the same reason (it needs to acquire the controller lock). Relevant portions from the thread-dump: Controller-265-to-broker-265-send-thread - Thread t@113 java.lang.Thread.State: TIMED_WAITING at java.lang.Thread.sleep(Native Method) at kafka.controller.RequestSendThread$$anonfun$liftedTree1$1$1.apply$mcV$sp(ControllerChannelManager.scala:143) at kafka.utils.Utils$.swallow(Utils.scala:167) at kafka.utils.Logging$class.swallowWarn(Logging.scala:92) at kafka.utils.Utils$.swallowWarn(Utils.scala:46) at kafka.utils.Logging$class.swallow(Logging.scala:94) at kafka.utils.Utils$.swallow(Utils.scala:46) at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:143) at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131) - locked java.lang.Object@6dbf14a7 at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51) Locked ownable synchronizers: - None ... Thread-4 - Thread t@17 java.lang.Thread.State: WAITING on java.util.concurrent.locks.ReentrantLock$NonfairSync@4836840 owned by: kafka-scheduler-0 at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262) at kafka.utils.Utils$.inLock(Utils.scala:536) at kafka.controller.KafkaController.shutdown(KafkaController.scala:642) at kafka.server.KafkaServer$$anonfun$shutdown$9.apply$mcV$sp(KafkaServer.scala:242) at kafka.utils.Utils$.swallow(Utils.scala:167) at 
kafka.utils.Logging$class.swallowWarn(Logging.scala:92) at kafka.utils.Utils$.swallowWarn(Utils.scala:46) at kafka.utils.Logging$class.swallow(Logging.scala:94) at kafka.utils.Utils$.swallow(Utils.scala:46) at kafka.server.KafkaServer.shutdown(KafkaServer.scala:242) at kafka.server.KafkaServerStartable.shutdown(KafkaServerStartable.scala:46) at kafka.Kafka$$anon$1.run(Kafka.scala:42) ... kafka-scheduler-0 - Thread t@117 java.lang.Thread.State: WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1dc407fc at
Re: Review Request 26503: Patch for KAFKA-1493
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26503/#review56164 --- Thanks for the patch. A couple of high level questions. 1. Is the format used in KafkaLZ4BlockInputStream standard? I am wondering if there are libraries in other languages that support this format too. 2. Could you summarize the key difference between the format in KafkaLZ4BlockInputStream and LZ4BlockInputStream? 3. Could we add lz4 in ProducerCompressionTest? clients/src/main/java/org/apache/kafka/common/message/KafkaLZ4BlockInputStream.java https://reviews.apache.org/r/26503/#comment96483 Should originalLen be compressedLen? clients/src/main/java/org/apache/kafka/common/message/KafkaLZ4BlockInputStream.java https://reviews.apache.org/r/26503/#comment96484 Should we throw an UnsupportedOperationException? clients/src/main/java/org/apache/kafka/common/message/KafkaLZ4BlockOutputStream.java https://reviews.apache.org/r/26503/#comment96488 It seems this byte has both the compression method and the compression level. Could we document this? clients/src/main/java/org/apache/kafka/common/message/KafkaLZ4BlockOutputStream.java https://reviews.apache.org/r/26503/#comment96487 When is finished set to true? - Jun Rao On Oct. 9, 2014, 3:39 p.m., Ivan Lyutov wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26503/ --- (Updated Oct. 9, 2014, 3:39 p.m.) Review request for kafka. Bugs: KAFKA-1493 https://issues.apache.org/jira/browse/KAFKA-1493 Repository: kafka Description --- KAFKA-1493 - implemented input/output lz4 streams for kafka message compression, added compression format description, minor typo fix. 
Diffs - clients/src/main/java/org/apache/kafka/common/message/KafkaLZ4BlockInputStream.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/message/KafkaLZ4BlockOutputStream.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/record/CompressionType.java 5227b2d7ab803389d1794f48c8232350c05b14fd clients/src/main/java/org/apache/kafka/common/record/Compressor.java 0323f5f7032dceb49d820c17a41b78c56591ffc4 config/producer.properties 39d65d7c6c21f4fccd7af89be6ca12a088d5dd98 core/src/main/scala/kafka/message/CompressionCodec.scala de0a0fade5387db63299c6b112b3c9a5e41d82ec core/src/main/scala/kafka/message/CompressionFactory.scala 8420e13d0d8680648df78f22ada4a0d4e3ab8758 core/src/test/scala/unit/kafka/message/MessageCompressionTest.scala 6f0addcea64f1e78a4de50ec8135f4d02cebd305 Diff: https://reviews.apache.org/r/26503/diff/ Testing --- Thanks, Ivan Lyutov
[jira] [Updated] (KAFKA-559) Garbage collect old consumer metadata entries
[ https://issues.apache.org/jira/browse/KAFKA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neha Narkhede updated KAFKA-559: Reviewer: Neha Narkhede Garbage collect old consumer metadata entries - Key: KAFKA-559 URL: https://issues.apache.org/jira/browse/KAFKA-559 Project: Kafka Issue Type: New Feature Reporter: Jay Kreps Assignee: Tejas Patil Labels: newbie, project Attachments: KAFKA-559.v1.patch, KAFKA-559.v2.patch Many use cases involve transient consumers. These consumers create entries under their consumer group in zk and maintain offsets there as well. There is currently no way to delete these entries. It would be good to have a tool that did something like bin/delete-obsolete-consumer-groups.sh [--topic t1] --since [date] --zookeeper [zk_connect] This would scan through consumer group entries and delete any that had no offset update since the given date. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1493) Use a well-documented LZ4 compression format and remove redundant LZ4HC option
[ https://issues.apache.org/jira/browse/KAFKA-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167153#comment-14167153 ] Jun Rao commented on KAFKA-1493: James, Could you help review the format in Ivan's patch? Is the format used in KafkaLZ4BlockInputStream standard? I am wondering if there are libraries in other languages that support this format too. Thanks, Use a well-documented LZ4 compression format and remove redundant LZ4HC option -- Key: KAFKA-1493 URL: https://issues.apache.org/jira/browse/KAFKA-1493 Project: Kafka Issue Type: Improvement Affects Versions: 0.8.2 Reporter: James Oliver Assignee: Ivan Lyutov Priority: Blocker Fix For: 0.8.2 Attachments: KAFKA-1493.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 26563: Patch for KAFKA-1692
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26563/ --- Review request for kafka. Bugs: KAFKA-1692 https://issues.apache.org/jira/browse/KAFKA-1692 Repository: kafka Description --- KAFKA-1692 Include client ID in new producer IO thread name. Diffs - clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java f58b8508d3f813a51015abed772c704390887d7e Diff: https://reviews.apache.org/r/26563/diff/ Testing --- Thanks, Ewen Cheslack-Postava
[jira] [Updated] (KAFKA-1692) [Java New Producer] IO Thread Name Must include Client ID
[ https://issues.apache.org/jira/browse/KAFKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1692: - Assignee: Ewen Cheslack-Postava (was: Jun Rao) Status: Patch Available (was: Open) [Java New Producer] IO Thread Name Must include Client ID --- Key: KAFKA-1692 URL: https://issues.apache.org/jira/browse/KAFKA-1692 Project: Kafka Issue Type: Improvement Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Ewen Cheslack-Postava Priority: Trivial Labels: newbie Attachments: KAFKA-1692.patch Please add the client id so people looking at JConsole or a profiling tool can see the thread by client id, since a single JVM can have multiple producer instances. org.apache.kafka.clients.producer.KafkaProducer {code} String ioThreadName = "kafka-producer-network-thread"; if (clientId != null) { ioThreadName = ioThreadName + " | " + clientId; } this.ioThread = new KafkaThread(ioThreadName, this.sender, true); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1692) [Java New Producer] IO Thread Name Must include Client ID
[ https://issues.apache.org/jira/browse/KAFKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1692: - Attachment: KAFKA-1692.patch [Java New Producer] IO Thread Name Must include Client ID --- Key: KAFKA-1692 URL: https://issues.apache.org/jira/browse/KAFKA-1692 Project: Kafka Issue Type: Improvement Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Jun Rao Priority: Trivial Labels: newbie Attachments: KAFKA-1692.patch Please add the client id so people looking at JConsole or a profiling tool can see the thread by client id, since a single JVM can have multiple producer instances. org.apache.kafka.clients.producer.KafkaProducer {code} String ioThreadName = "kafka-producer-network-thread"; if (clientId != null) { ioThreadName = ioThreadName + " | " + clientId; } this.ioThread = new KafkaThread(ioThreadName, this.sender, true); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1692) [Java New Producer] IO Thread Name Must include Client ID
[ https://issues.apache.org/jira/browse/KAFKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167169#comment-14167169 ] Ewen Cheslack-Postava commented on KAFKA-1692: -- Created reviewboard https://reviews.apache.org/r/26563/diff/ against branch origin/trunk [Java New Producer] IO Thread Name Must include Client ID --- Key: KAFKA-1692 URL: https://issues.apache.org/jira/browse/KAFKA-1692 Project: Kafka Issue Type: Improvement Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Jun Rao Priority: Trivial Labels: newbie Attachments: KAFKA-1692.patch Please add the client id so people looking at JConsole or a profiling tool can see the thread by client id, since a single JVM can have multiple producer instances. org.apache.kafka.clients.producer.KafkaProducer {code} String ioThreadName = "kafka-producer-network-thread"; if (clientId != null) { ioThreadName = ioThreadName + " | " + clientId; } this.ioThread = new KafkaThread(ioThreadName, this.sender, true); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
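The effect of the requested change is easy to see outside Kafka; an illustrative sketch using a plain Thread (KafkaThread is Kafka-internal, so it is not used here):

```java
// Illustrative only: how a per-client thread name makes multiple producer
// instances distinguishable in jstack/JConsole output.
public class ThreadNaming {
    static String ioThreadName(String clientId) {
        String name = "kafka-producer-network-thread";
        return clientId != null ? name + " | " + clientId : name;
    }
    public static void main(String[] args) {
        Thread t = new Thread(() -> {}, ioThreadName("producer-1"));
        t.setDaemon(true); // matches the daemon flag passed to KafkaThread
        System.out.println(t.getName()); // kafka-producer-network-thread | producer-1
    }
}
```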
Review Request 26564: Patch for KAFKA-1471
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26564/ --- Review request for kafka. Bugs: KAFKA-1471 https://issues.apache.org/jira/browse/KAFKA-1471 Repository: kafka Description --- KAFKA-1471 Add producer unit tests for LZ4 and LZ4HC compression codecs Diffs - clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 79d57f9bf31606ffa5400f2f12356eba84703cc2 core/src/main/scala/kafka/tools/ConsoleProducer.scala 8e9ba0b284671989f87d9c421bc98f5c4384c260 core/src/test/scala/integration/kafka/api/ProducerCompressionTest.scala 17e2c6e9dfd789acb4b6db37c780c862667e4e11 Diff: https://reviews.apache.org/r/26564/diff/ Testing --- Thanks, Ewen Cheslack-Postava
[jira] [Updated] (KAFKA-1471) Add Producer Unit Tests for LZ4 and LZ4HC compression
[ https://issues.apache.org/jira/browse/KAFKA-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1471: - Attachment: KAFKA-1471.patch Add Producer Unit Tests for LZ4 and LZ4HC compression - Key: KAFKA-1471 URL: https://issues.apache.org/jira/browse/KAFKA-1471 Project: Kafka Issue Type: Sub-task Reporter: James Oliver Fix For: 0.8.2 Attachments: KAFKA-1471.patch, KAFKA-1471.patch, KAFKA-1471.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1305) Controller can hang on controlled shutdown with auto leader balance enabled
[ https://issues.apache.org/jira/browse/KAFKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167179#comment-14167179 ] Jun Rao commented on KAFKA-1305: Yes, in theory, we can make the queue unbounded. However, in practice, the queue shouldn't build up. I was a bit concerned that if we make the queue unbounded and another issue causes the queue to build up, we may hit an OOME. Then we may not be able to take a thread dump to diagnose the issue. Controller can hang on controlled shutdown with auto leader balance enabled --- Key: KAFKA-1305 URL: https://issues.apache.org/jira/browse/KAFKA-1305 Project: Kafka Issue Type: Bug Reporter: Joel Koshy Assignee: Sriharsha Chintalapani Priority: Blocker Fix For: 0.8.2, 0.9.0 Attachments: KAFKA-1305.patch
[jira] [Commented] (KAFKA-1471) Add Producer Unit Tests for LZ4 and LZ4HC compression
[ https://issues.apache.org/jira/browse/KAFKA-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167178#comment-14167178 ] Ewen Cheslack-Postava commented on KAFKA-1471: -- Created reviewboard https://reviews.apache.org/r/26564/diff/ against branch origin/trunk Add Producer Unit Tests for LZ4 and LZ4HC compression - Key: KAFKA-1471 URL: https://issues.apache.org/jira/browse/KAFKA-1471 Project: Kafka Issue Type: Sub-task Reporter: James Oliver Fix For: 0.8.2 Attachments: KAFKA-1471.patch, KAFKA-1471.patch, KAFKA-1471.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1471) Add Producer Unit Tests for LZ4 and LZ4HC compression
[ https://issues.apache.org/jira/browse/KAFKA-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167183#comment-14167183 ] Ewen Cheslack-Postava commented on KAFKA-1471: -- This updated version applies cleanly (just had a bit of fuzz), removes a few unnecessarily changed lines, and fixes some typos. Add Producer Unit Tests for LZ4 and LZ4HC compression - Key: KAFKA-1471 URL: https://issues.apache.org/jira/browse/KAFKA-1471 Project: Kafka Issue Type: Sub-task Reporter: James Oliver Assignee: Ewen Cheslack-Postava Fix For: 0.8.2 Attachments: KAFKA-1471.patch, KAFKA-1471.patch, KAFKA-1471.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1583) Kafka API Refactoring
[ https://issues.apache.org/jira/browse/KAFKA-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167187#comment-14167187 ] Guozhang Wang commented on KAFKA-1583: -- Sure, will do that asap. Kafka API Refactoring - Key: KAFKA-1583 URL: https://issues.apache.org/jira/browse/KAFKA-1583 Project: Kafka Issue Type: Bug Reporter: Guozhang Wang Assignee: Guozhang Wang Fix For: 0.9.0 Attachments: KAFKA-1583.patch, KAFKA-1583_2014-08-20_13:54:38.patch, KAFKA-1583_2014-08-21_11:30:34.patch, KAFKA-1583_2014-08-27_09:44:50.patch, KAFKA-1583_2014-09-01_18:07:42.patch, KAFKA-1583_2014-09-02_13:37:47.patch, KAFKA-1583_2014-09-05_14:08:36.patch, KAFKA-1583_2014-09-05_14:55:38.patch This is the next step of KAFKA-1430. Details can be found at this page: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+API+Refactoring -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (KAFKA-1670) Corrupt log files for segment.bytes values close to Int.MaxInt
[ https://issues.apache.org/jira/browse/KAFKA-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang reopened KAFKA-1670: -- Corrupt log files for segment.bytes values close to Int.MaxInt -- Key: KAFKA-1670 URL: https://issues.apache.org/jira/browse/KAFKA-1670 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ryan Berdeen Assignee: Sriharsha Chintalapani Priority: Blocker Fix For: 0.8.2 Attachments: KAFKA-1670.patch, KAFKA-1670_2014-10-04_20:17:46.patch, KAFKA-1670_2014-10-06_09:48:25.patch, KAFKA-1670_2014-10-07_13:39:13.patch, KAFKA-1670_2014-10-07_13:49:10.patch, KAFKA-1670_2014-10-07_18:39:31.patch The maximum value for the topic-level config {{segment.bytes}} is {{Int.MaxInt}} (2147483647). *Using this value causes brokers to corrupt their log files, leaving them unreadable.* We set {{segment.bytes}} to {{2122317824}} which is well below the maximum. One by one, the ISR of all partitions shrunk to 1. Brokers would crash when restarted, attempting to read from a negative offset in a log file. After discovering that many segment files had grown to 4GB or more, we were forced to shut down our *entire production Kafka cluster* for several hours while we split all segment files into 1GB chunks. Looking into the {{kafka.log}} code, the {{segment.bytes}} parameter is used inconsistently. It is treated as a *soft* maximum for the size of the segment file (https://github.com/apache/kafka/blob/0.8.1.1/core/src/main/scala/kafka/log/LogConfig.scala#L26) with logs rolled only after (https://github.com/apache/kafka/blob/0.8.1.1/core/src/main/scala/kafka/log/Log.scala#L246) they exceed this value. However, much of the code that deals with log files uses *ints* to store the size of the file and the position in the file. Overflow of these ints leads the broker to append to the segments indefinitely, and to fail to read these segments for consuming or recovery. 
This is trivial to reproduce: {code} $ bin/kafka-topics.sh --topic segment-bytes-test --create --replication-factor 2 --partitions 1 --zookeeper zkhost:2181 $ bin/kafka-topics.sh --topic segment-bytes-test --alter --config segment.bytes=2147483647 --zookeeper zkhost:2181 $ yes Int.MaxValue is a ridiculous bound on file size in 2014 | bin/kafka-console-producer.sh --broker-list localhost:6667 zkhost:2181 --topic segment-bytes-test {code} After running for a few minutes, the log file is corrupt: {code} $ ls -lh data/segment-bytes-test-0/ total 9.7G -rw-r--r-- 1 root root 10M Oct 3 19:39 .index -rw-r--r-- 1 root root 9.7G Oct 3 19:39 .log {code} We recovered the data from the log files using a simple Python script: https://gist.github.com/also/9f823d9eb9dc0a410796 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
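The "negative offset" crashes follow directly from signed 32-bit arithmetic: once the int tracking a segment's size or position passes Int.MaxValue, it wraps negative. A minimal demonstration of the wraparound (not Kafka's code, just the arithmetic):

```java
// Why segments grew past 2 GB unreadably: file sizes/positions tracked as int
// silently wrap past Integer.MAX_VALUE and turn negative, so the broker later
// computes negative read offsets.
public class IntOverflow {
    static int advance(int position, int messageSize) {
        return position + messageSize; // silently wraps, no exception
    }
    public static void main(String[] args) {
        int pos = Integer.MAX_VALUE - 10;   // near the 2147483647 segment cap
        int next = advance(pos, 100);
        System.out.println(next);           // -2147483559: a "negative offset"
    }
}
```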
[jira] [Commented] (KAFKA-1670) Corrupt log files for segment.bytes values close to Int.MaxInt
[ https://issues.apache.org/jira/browse/KAFKA-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167219#comment-14167219 ] Guozhang Wang commented on KAFKA-1670: -- Hi [~sriharsha] [~junrao] this patch causes some regression errors in system tests, including the replication / mirror maker test suites (you can reproduce it easily with 5001/2/3). The log entries I saw from the producer: {code} [2014-10-10 05:32:46,714] ERROR Error when sending message to topic test_1 with key: 1 bytes, value: 500 bytes with error: The request included message batch larger than the configured segment size on the server. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) [2014-10-10 05:32:46,714] ERROR Error when sending message to topic test_1 with key: 1 bytes, value: 500 bytes with error: The request included message batch larger than the configured segment size on the server. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) [2014-10-10 05:32:46,714] ERROR Error when sending message to topic test_1 with key: 1 bytes, value: 500 bytes with error: The request included message batch larger than the configured segment size on the server. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) [2014-10-10 05:32:46,714] ERROR Error when sending message to topic test_1 with key: 1 bytes, value: 500 bytes with error: The request included message batch larger than the configured segment size on the server. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) [2014-10-10 05:32:46,714] ERROR Error when sending message to topic test_1 with key: 1 bytes, value: 500 bytes with error: The request included message batch larger than the configured segment size on the server. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) {code} Corrupt log files for segment.bytes values close to Int.MaxInt -- Key: KAFKA-1670 URL: https://issues.apache.org/jira/browse/KAFKA-1670 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ryan Berdeen Assignee: Sriharsha Chintalapani Priority: Blocker Fix For: 0.8.2 Attachments: KAFKA-1670.patch, KAFKA-1670_2014-10-04_20:17:46.patch, KAFKA-1670_2014-10-06_09:48:25.patch, KAFKA-1670_2014-10-07_13:39:13.patch, KAFKA-1670_2014-10-07_13:49:10.patch, KAFKA-1670_2014-10-07_18:39:31.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1670) Corrupt log files for segment.bytes values close to Int.MaxInt
[ https://issues.apache.org/jira/browse/KAFKA-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167223#comment-14167223 ] Sriharsha Chintalapani commented on KAFKA-1670: --- [~guozhang] Sorry, will fix those. Corrupt log files for segment.bytes values close to Int.MaxInt -- Key: KAFKA-1670 URL: https://issues.apache.org/jira/browse/KAFKA-1670 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ryan Berdeen Assignee: Sriharsha Chintalapani Priority: Blocker Fix For: 0.8.2 Attachments: KAFKA-1670.patch, KAFKA-1670_2014-10-04_20:17:46.patch, KAFKA-1670_2014-10-06_09:48:25.patch, KAFKA-1670_2014-10-07_13:39:13.patch, KAFKA-1670_2014-10-07_13:49:10.patch, KAFKA-1670_2014-10-07_18:39:31.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 26566: Patch for KAFKA-1680
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26566/ --- Review request for kafka. Bugs: KAFKA-1680 https://issues.apache.org/jira/browse/KAFKA-1680 Repository: kafka Description --- KAFKA-1680 Standardize command line argument parsing and usage messages. At its heart, this was just a test of args.length that was invalid for this command, but 6b0ae4bba0d introduced the same potential issue across all the command line tools. This standardizes all the command line tools on a cleaner parsing pattern by pushing most of the work into CommandLineUtils and printing usage info for any type of parsing exception. Ideally, the long-term solution would be to use a newer version of joptsimple that allows us to express constraints on arguments to get almost all command line option issues resolved at parse time. Diffs - core/src/main/scala/kafka/admin/PreferredReplicaLeaderElectionCommand.scala c7918483c02040a7cc18d6e9edbd20a3025a3a55 core/src/main/scala/kafka/admin/ReassignPartitionsCommand.scala 691d69a49a240f38883d2025afaec26fd61281b5 core/src/main/scala/kafka/admin/TopicCommand.scala 7672c5aab4fba8c23b1bb5cd4785c332d300a3fa core/src/main/scala/kafka/tools/ConsoleConsumer.scala 323fc8566d974acc4e5c7d7c2a065794f3b5df4a core/src/main/scala/kafka/tools/ConsoleProducer.scala 8e9ba0b284671989f87d9c421bc98f5c4384c260 core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala d1e7c434e77859d746b8dc68dd5d5a3740425e79 core/src/main/scala/kafka/tools/ConsumerPerformance.scala 093c800ea7f8a9c972bb66e99ac4e4d431cf11cc core/src/main/scala/kafka/tools/DumpLogSegments.scala 8e9d47b8d4adc5754ed8861aa04ddd3c6b629e3d core/src/main/scala/kafka/tools/ExportZkOffsets.scala 4d051bc2db12f0e15aa6a3289abeb9dd25d373d2 core/src/main/scala/kafka/tools/GetOffsetShell.scala 3d9293e4abbe3f4a4a2bc5833385747c604d5a95 core/src/main/scala/kafka/tools/ImportZkOffsets.scala abe09721b13f71320510fd1a01c1917470450c6e core/src/main/scala/kafka/tools/JmxTool.scala 
1d1a120c45ff70fbd60df5b147ca230eb1ef50de core/src/main/scala/kafka/tools/MirrorMaker.scala b8698ee1469c8fbc92ccc176d916eb3e28b87867 core/src/main/scala/kafka/tools/ProducerPerformance.scala f61c7c701fd85caabc2d2950a7b02aa85e5cdfe3 core/src/main/scala/kafka/tools/ReplayLogProducer.scala 3393a3dd574ac45a27bf7eda646b737146c55038 core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala ba6ddd7a909df79a0f7d45e8b4a2af94ea0fceb6 core/src/main/scala/kafka/tools/SimpleConsumerPerformance.scala 7602b8d705970a5dab49ed36d117346a960701ac core/src/main/scala/kafka/tools/SimpleConsumerShell.scala b4f903b6c7c3bb725cac7c05eb1f885906413c4d core/src/main/scala/kafka/tools/StateChangeLogMerger.scala d298e7e81acc7427c6cf4796b445966267ca54eb core/src/main/scala/kafka/tools/TestLogCleaning.scala 1d4ea93f2ba8d4d4d47a307cd47f54a15d3d30dd core/src/main/scala/kafka/tools/VerifyConsumerRebalance.scala aef8361b73a0934641fc4f5cee942b5b50f3e7d7 core/src/main/scala/kafka/utils/CommandLineUtils.scala 086a62483fad0c9cfc7004ff94c890cfb9929fa6 core/src/main/scala/kafka/utils/ToolsUtils.scala fef93929ea03e181f87fe294c06d9bc9fc823e9e core/src/test/scala/other/kafka/TestLinearWriteSpeed.scala 7211c2529c1db76100432737da7a1d1d221dfba0 Diff: https://reviews.apache.org/r/26566/diff/ Testing --- Thanks, Ewen Cheslack-Postava
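The actual patch is Scala code built on joptsimple, but the pattern it standardizes on (all options optional where the tool requires none, and a usage message for any parsing failure instead of a bare crash) can be illustrated language-neutrally. The sketch below uses Python's argparse; the tool name and option names are invented for illustration, not taken from the patch:

```python
import argparse

def parse_args(argv):
    """Sketch of the centralized parsing pattern: parse options and, on any
    parse error, fall back to a usage message instead of a stack trace."""
    parser = argparse.ArgumentParser(prog="jmx-tool-sketch")
    # Every option is optional, so running with an empty argv is valid --
    # the original bug was an args.length check that rejected no arguments.
    parser.add_argument("--object-name", help="JMX MBean to query")
    parser.add_argument("--reporting-interval", type=int, default=2000)
    try:
        return parser.parse_args(argv)
    except SystemExit:
        # argparse has already printed a usage message for the parse error,
        # mirroring "print usage info for any type of parsing exception".
        return None

opts = parse_args([])               # no arguments: must succeed
print(opts.reporting_interval)
print(parse_args(["--reporting-interval", "abc"]))  # parse error: usage printed, None
```

The same shape applies to each tool in the diff list: defaults live with the option declarations, and the shared helper (CommandLineUtils in the patch) owns the error path.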
[jira] [Updated] (KAFKA-1680) JmxTool exits if no arguments are given
[ https://issues.apache.org/jira/browse/KAFKA-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ewen Cheslack-Postava updated KAFKA-1680:

    Attachment: KAFKA-1680.patch

JmxTool exits if no arguments are given

    Key: KAFKA-1680
    URL: https://issues.apache.org/jira/browse/KAFKA-1680
    Project: Kafka
    Issue Type: Bug
    Affects Versions: 0.8.2
    Reporter: Ryan Berdeen
    Priority: Minor
    Attachments: KAFKA-1680.patch

JmxTool has no required arguments, but it exits if no arguments are provided. You can work around this by passing a non-option argument, which will be ignored, e.g. {{./bin/kafka-run-class.sh kafka.tools.JmxTool xxx}}. It looks like this was broken in KAFKA-1291 / 6b0ae4bba0d0f8e4c8da19de65a8f03f162bec39.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1680) JmxTool exits if no arguments are given
[ https://issues.apache.org/jira/browse/KAFKA-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ewen Cheslack-Postava updated KAFKA-1680:

    Assignee: Ewen Cheslack-Postava
    Status: Patch Available (was: Open)

JmxTool exits if no arguments are given

    Key: KAFKA-1680
    URL: https://issues.apache.org/jira/browse/KAFKA-1680
    Project: Kafka
    Issue Type: Bug
    Affects Versions: 0.8.2
    Reporter: Ryan Berdeen
    Assignee: Ewen Cheslack-Postava
    Priority: Minor
    Attachments: KAFKA-1680.patch

JmxTool has no required arguments, but it exits if no arguments are provided. You can work around this by passing a non-option argument, which will be ignored, e.g. {{./bin/kafka-run-class.sh kafka.tools.JmxTool xxx}}. It looks like this was broken in KAFKA-1291 / 6b0ae4bba0d0f8e4c8da19de65a8f03f162bec39.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1680) JmxTool exits if no arguments are given
[ https://issues.apache.org/jira/browse/KAFKA-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167228#comment-14167228 ]

Ewen Cheslack-Postava commented on KAFKA-1680:

Created reviewboard https://reviews.apache.org/r/26566/diff/ against branch origin/trunk

JmxTool exits if no arguments are given

    Key: KAFKA-1680
    URL: https://issues.apache.org/jira/browse/KAFKA-1680
    Project: Kafka
    Issue Type: Bug
    Affects Versions: 0.8.2
    Reporter: Ryan Berdeen
    Priority: Minor
    Attachments: KAFKA-1680.patch

JmxTool has no required arguments, but it exits if no arguments are provided. You can work around this by passing a non-option argument, which will be ignored, e.g. {{./bin/kafka-run-class.sh kafka.tools.JmxTool xxx}}. It looks like this was broken in KAFKA-1291 / 6b0ae4bba0d0f8e4c8da19de65a8f03f162bec39.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167261#comment-14167261 ]

Gwen Shapira commented on KAFKA-1555:

Will be happy to do that. Is the wiki the right place?

provide strong consistency with reasonable availability

    Key: KAFKA-1555
    URL: https://issues.apache.org/jira/browse/KAFKA-1555
    Project: Kafka
    Issue Type: Improvement
    Components: controller
    Affects Versions: 0.8.1.1
    Reporter: Jiang Wu
    Assignee: Gwen Shapira
    Fix For: 0.8.2
    Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch

In a mission-critical application, we expect a Kafka cluster with 3 brokers to satisfy two requirements:
1. When 1 broker is down, no message loss or service blocking happens.
2. In worse cases, such as two brokers being down, service can be blocked, but no message loss happens.

We found that the current Kafka version (0.8.1.1) cannot achieve these requirements due to three of its behaviors:
1. When choosing a new leader from 2 followers in the ISR, the one with fewer messages may be chosen as the leader.
2. Even when replica.lag.max.messages=0, a follower can stay in the ISR when it has fewer messages than the leader.
3. The ISR can contain only 1 broker, therefore acknowledged messages may be stored on only 1 broker.

The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning all 3 replicas (leader A, followers B and C) are in sync, i.e., they have the same messages and are all in the ISR. According to the value of request.required.acks (acks for short), there are the following cases:

1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in the ISR. If A is killed, C can be elected as the new leader, and consumers will miss m.
3. acks=-1. B and C restart and are removed from the ISR. Producer sends a message m to A, and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost.

In summary, no existing configuration can satisfy the requirements.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
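Case 2 of the proof (acks=2 losing an acknowledged message) can be traced concretely. The sketch below is a toy model of the scenario, not Kafka code; replica names follow the proof above:

```python
# Toy model of case 2 (acks=2): three replicas, leader A, followers B and C.
replicas = {"A": [], "B": [], "C": []}
isr = {"A", "B", "C"}

# Producer sends m; leader A and follower B persist it. C lags behind but,
# per behavior 2 in the issue description, remains in the ISR.
for r in ("A", "B"):
    replicas[r].append("m")
acks_received = 2          # A + B satisfy acks=2, so the producer treats m as committed

# Leader A is killed. Under the 0.8.1.1 rules described above, any ISR
# member may be elected, including the lagging C (behavior 1).
isr.discard("A")
new_leader = "C"
assert new_leader in isr   # a legal election
print("m" in replicas[new_leader])  # -> False: the acknowledged message m is gone
```

The model makes the failure mechanism explicit: the acknowledgement counts replicas that have m, while leader election only consults ISR membership, and nothing ties the two together.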
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167464#comment-14167464 ]

Sriram Subramanian commented on KAFKA-1555:

My vote would be to update our documentation - http://kafka.apache.org/documentation.html. It currently refers to 0.8.1. We should make 0.8.2 the current one after the release. The Design section can have a Guarantees portion that talks about what guarantees Kafka gives w.r.t. consistency vs. availability, and when. What do the rest think?

provide strong consistency with reasonable availability

    Key: KAFKA-1555
    URL: https://issues.apache.org/jira/browse/KAFKA-1555
    Project: Kafka
    Issue Type: Improvement
    Components: controller
    Affects Versions: 0.8.1.1
    Reporter: Jiang Wu
    Assignee: Gwen Shapira
    Fix For: 0.8.2
    Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch

In a mission-critical application, we expect a Kafka cluster with 3 brokers to satisfy two requirements:
1. When 1 broker is down, no message loss or service blocking happens.
2. In worse cases, such as two brokers being down, service can be blocked, but no message loss happens.

We found that the current Kafka version (0.8.1.1) cannot achieve these requirements due to three of its behaviors:
1. When choosing a new leader from 2 followers in the ISR, the one with fewer messages may be chosen as the leader.
2. Even when replica.lag.max.messages=0, a follower can stay in the ISR when it has fewer messages than the leader.
3. The ISR can contain only 1 broker, therefore acknowledged messages may be stored on only 1 broker.

The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning all 3 replicas (leader A, followers B and C) are in sync, i.e., they have the same messages and are all in the ISR. According to the value of request.required.acks (acks for short), there are the following cases:

1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in the ISR. If A is killed, C can be elected as the new leader, and consumers will miss m.
3. acks=-1. B and C restart and are removed from the ISR. Producer sends a message m to A, and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost.

In summary, no existing configuration can satisfy the requirements.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-1700) examples directory - README and shell scripts are out of date
Geoffrey Anderson created KAFKA-1700:

    Summary: examples directory - README and shell scripts are out of date
    Key: KAFKA-1700
    URL: https://issues.apache.org/jira/browse/KAFKA-1700
    Project: Kafka
    Issue Type: Bug
    Affects Versions: 0.8.1
    Reporter: Geoffrey Anderson
    Priority: Minor
    Fix For: 0.8.2

sbt build files were removed during resolution of KAFKA-1254, so the README under the examples directory should no longer make reference to sbt. Also, the paths added to the CLASSPATH variable in the example shell script are no longer correct.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 26575: Fix for KAFKA-1700
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26575/
---

Review request for kafka.

Bugs: KAFKA-1700
    https://issues.apache.org/jira/browse/KAFKA-1700

Repository: kafka

Description
---

Tweaked README, updated shell scripts so they call kafka-run-class.sh instead of manually adding to the CLASSPATH.

Diffs
---

  examples/README 61de2868de29e7c04811bfe12ccabc50b45d148e
  examples/bin/java-producer-consumer-demo.sh 29e01c2dcf82365c28fef0836e1771282cb49bc1
  examples/bin/java-simple-consumer-demo.sh 4716a098c7d404477e0e7254e65e0f509e9df92e

Diff: https://reviews.apache.org/r/26575/diff/

Testing
---

Thanks,

Geoffrey Anderson
[jira] [Updated] (KAFKA-1700) examples directory - README and shell scripts are out of date
[ https://issues.apache.org/jira/browse/KAFKA-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Anderson updated KAFKA-1700:

    Attachment: KAFKA-1700.patch

examples directory - README and shell scripts are out of date

    Key: KAFKA-1700
    URL: https://issues.apache.org/jira/browse/KAFKA-1700
    Project: Kafka
    Issue Type: Bug
    Affects Versions: 0.8.1
    Reporter: Geoffrey Anderson
    Priority: Minor
    Labels: newbie
    Fix For: 0.8.2
    Attachments: KAFKA-1700.patch

sbt build files were removed during resolution of KAFKA-1254, so the README under the examples directory should no longer make reference to sbt. Also, the paths added to the CLASSPATH variable in the example shell script are no longer correct.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1700) examples directory - README and shell scripts are out of date
[ https://issues.apache.org/jira/browse/KAFKA-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167501#comment-14167501 ]

Geoffrey Anderson commented on KAFKA-1700:

Created reviewboard https://reviews.apache.org/r/26575/diff/ against branch origin/trunk

examples directory - README and shell scripts are out of date

    Key: KAFKA-1700
    URL: https://issues.apache.org/jira/browse/KAFKA-1700
    Project: Kafka
    Issue Type: Bug
    Affects Versions: 0.8.1
    Reporter: Geoffrey Anderson
    Priority: Minor
    Labels: newbie
    Fix For: 0.8.2
    Attachments: KAFKA-1700.patch

sbt build files were removed during resolution of KAFKA-1254, so the README under the examples directory should no longer make reference to sbt. Also, the paths added to the CLASSPATH variable in the example shell script are no longer correct.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-1701) Improve controller and broker message handling.
Jiangjie Qin created KAFKA-1701:

    Summary: Improve controller and broker message handling.
    Key: KAFKA-1701
    URL: https://issues.apache.org/jira/browse/KAFKA-1701
    Project: Kafka
    Issue Type: Improvement
    Reporter: Jiangjie Qin

This ticket is a memo for future controller refactoring. It is related to KAFKA-1547. Ideally, the broker should only follow instructions from the controller rather than handling them smartly. For KAFKA-1547, the controller should filter out the partitions whose leader is not up yet before sending a LeaderAndIsrRequest to the broker. The idea is that the controller should handle all the edge cases instead of letting the broker do it.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
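The proposed controller-side filtering is straightforward to picture. The toy sketch below (invented data, not controller code) shows the single step the ticket asks the controller to take before building a LeaderAndIsrRequest:

```python
# Toy model: the controller drops partitions whose leader broker is not yet
# alive, so the receiving broker never has to handle that edge case itself.
live_brokers = {1, 2}                                  # broker 3 not up yet
partition_leaders = {"t-0": 1, "t-1": 3, "t-2": 2}     # partition -> leader id

# Filter before sending: only partitions with a live leader go into the request.
leader_and_isr = {p: l for p, l in partition_leaders.items() if l in live_brokers}
print(leader_and_isr)  # t-1 is withheld until broker 3 comes up
```

This keeps the broker's handling dumb and uniform, which is exactly the division of responsibility the ticket argues for.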
Re: including KAFKA-1555 in 0.8.2?
+1

On Fri, Oct 10, 2014 at 04:53:45PM +, Sriram Subramanian wrote:
> +1
>
> On 10/10/14 9:39 AM, Gwen Shapira gshap...@cloudera.com wrote:
> > +1 :)
> >
> > On Fri, Oct 10, 2014 at 9:34 AM, Joe Stein joe.st...@stealth.ly wrote:
> > > +1
> > >
> > > On Oct 10, 2014 12:08 PM, Neha Narkhede neha.narkh...@gmail.com wrote:
> > > > +1.
> > > >
> > > > On Thu, Oct 9, 2014 at 6:41 PM, Jun Rao jun...@gmail.com wrote:
> > > > > Hi, Everyone,
> > > > >
> > > > > I just committed KAFKA-1555 (min.isr support) to trunk. I felt that it's
> > > > > probably useful to include it in the 0.8.2 release. Any objections?
> > > > >
> > > > > Thanks,
> > > > > Jun
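The min.isr support being voted into 0.8.2 here targets the thin-ISR failure mode from KAFKA-1555: with acks=-1, a write is rejected rather than acknowledged when the ISR has shrunk below a minimum size. A toy sketch of the rule (not broker code; the config name mirrors the min.insync.replicas setting, which is an assumption of this sketch):

```python
# Toy model of the min.isr rule: with acks=-1, refuse writes when the ISR is
# too small, trading availability for a guarantee against silent data loss.
min_insync_replicas = 2

def try_append(isr, min_isr):
    """Return the broker's response to an acks=-1 produce request."""
    if len(isr) < min_isr:
        return "NotEnoughReplicas"  # producer gets an error, not a lossy ack
    return "ack"

print(try_append({"A"}, min_insync_replicas))        # ISR shrank to the leader alone
print(try_append({"A", "B"}, min_insync_replicas))   # enough replicas in sync
```

Under this rule, case 3 of the KAFKA-1555 analysis (a message acknowledged while stored on a single broker) can no longer occur, at the cost of blocking writes while the ISR is degraded.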
[jira] [Assigned] (KAFKA-1634) Update protocol wiki to reflect the new offset management feature
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy reassigned KAFKA-1634:

    Assignee: Joel Koshy (was: Jun Rao)

Update protocol wiki to reflect the new offset management feature

    Key: KAFKA-1634
    URL: https://issues.apache.org/jira/browse/KAFKA-1634
    Project: Kafka
    Issue Type: Bug
    Reporter: Neha Narkhede
    Assignee: Joel Koshy
    Priority: Blocker
    Fix For: 0.8.2

From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed, and isn't it now always handled server-side? Would one of the devs mind updating the API spec wiki?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1634) Update protocol wiki to reflect the new offset management feature
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167696#comment-14167696 ]

Joel Koshy commented on KAFKA-1634:

Actually, one more potential source of confusion is that we use OffsetAndMetadata for both offset commit requests and offset fetch responses. I.e., an OffsetFetchResponse will contain: offset, metadata, and this timestamp field. The timestamp field should really be ignored. It is annoying to document such things - i.e., to tell users to just ignore the field. Ideally, I think we should do the following:

* Remove the timestamp from the OffsetAndMetadata class
* Move it to the top level of the OffsetCommitRequest and rename it to retentionMs
* The broker will compute the absolute time (based off time of receipt) at which the offset should expire
* That absolute time will continue to be stored in the offsets topic, and the cleanup thread can remove offsets that are past their TTL
* OffsetFetchResponse will just return OffsetAndMetadata (no timestamp)

We (LinkedIn, and possibly others) have already deployed this to some of our consumers, but if we can bump up the protocol version when doing the above and translate requests that come in with the older version, I think it should be okay.

Update protocol wiki to reflect the new offset management feature

    Key: KAFKA-1634
    URL: https://issues.apache.org/jira/browse/KAFKA-1634
    Project: Kafka
    Issue Type: Bug
    Reporter: Neha Narkhede
    Assignee: Joel Koshy
    Priority: Blocker
    Fix For: 0.8.2

From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed, and isn't it now always handled server-side? Would one of the devs mind updating the API spec wiki?
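The retention scheme proposed in the bullet list above (broker converts a per-request retentionMs into an absolute expiry at time of receipt, stores it alongside the offset, and a cleanup pass drops entries past their TTL) can be sketched end to end. Class and method names below are illustrative only, not the actual Kafka API:

```python
import time

class OffsetStore:
    """Toy broker-side offset store for the proposed retentionMs scheme."""

    def __init__(self):
        # (group, topic, partition) -> (offset, metadata, expires_at)
        self._offsets = {}

    def commit(self, key, offset, metadata, retention_ms, now=None):
        now = time.time() if now is None else now
        # The broker computes the absolute expiry from its own receipt time,
        # instead of trusting a client-side timestamp.
        expires_at = now + retention_ms / 1000.0
        self._offsets[key] = (offset, metadata, expires_at)

    def fetch(self, key):
        # Per the proposal, the fetch response carries only (offset, metadata);
        # the expiry time never leaves the broker.
        entry = self._offsets.get(key)
        return None if entry is None else entry[:2]

    def cleanup(self, now=None):
        # The cleanup thread removes offsets that are past their TTL.
        now = time.time() if now is None else now
        for k in [k for k, (_, _, exp) in self._offsets.items() if exp <= now]:
            del self._offsets[k]

store = OffsetStore()
store.commit(("g1", "t", 0), 42, "", retention_ms=1000, now=0)
print(store.fetch(("g1", "t", 0)))   # -> (42, '')
store.cleanup(now=2)                 # 2s later: past the 1s TTL
print(store.fetch(("g1", "t", 0)))   # -> None
```

The version-bump point in the comment then reduces to a translation layer: old-format requests carrying a client timestamp would be mapped onto an equivalent retention before reaching a store like this.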
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1634) Update protocol wiki to reflect the new offset management feature
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167700#comment-14167700 ]

Joel Koshy commented on KAFKA-1634:

(BTW, I'm looking for a +1 or -1 on the above comment :) )

Update protocol wiki to reflect the new offset management feature

    Key: KAFKA-1634
    URL: https://issues.apache.org/jira/browse/KAFKA-1634
    Project: Kafka
    Issue Type: Bug
    Reporter: Neha Narkhede
    Assignee: Joel Koshy
    Priority: Blocker
    Fix For: 0.8.2

From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed, and isn't it now always handled server-side? Would one of the devs mind updating the API spec wiki?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1634) Improve semantics of timestamp in OffsetCommitRequests and update documentation
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-1634:

    Summary: Improve semantics of timestamp in OffsetCommitRequests and update documentation (was: Improve semantics of timestamp in OffsetCommitRequests)

Improve semantics of timestamp in OffsetCommitRequests and update documentation

    Key: KAFKA-1634
    URL: https://issues.apache.org/jira/browse/KAFKA-1634
    Project: Kafka
    Issue Type: Bug
    Reporter: Neha Narkhede
    Assignee: Joel Koshy
    Priority: Blocker
    Fix For: 0.8.2

From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed, and isn't it now always handled server-side? Would one of the devs mind updating the API spec wiki?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1634) Improve semantics of timestamp in OffsetCommitRequests
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-1634:

    Summary: Improve semantics of timestamp in OffsetCommitRequests (was: Update protocol wiki to reflect the new offset management feature)

Improve semantics of timestamp in OffsetCommitRequests

    Key: KAFKA-1634
    URL: https://issues.apache.org/jira/browse/KAFKA-1634
    Project: Kafka
    Issue Type: Bug
    Reporter: Neha Narkhede
    Assignee: Joel Koshy
    Priority: Blocker
    Fix For: 0.8.2

From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed, and isn't it now always handled server-side? Would one of the devs mind updating the API spec wiki?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[DISCUSSION] Message Metadata
Hello all,

I put some thoughts on enhancing our current message metadata format to solve a bunch of existing issues:

https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Enriched+Message+Metadata

This wiki page is for kicking off some discussions about the feasibility of adding more info into the message header, and if possible how we would add them.

-- Guozhang