[jira] [Commented] (KAFKA-1693) Issue sending more messages to single Kafka server (Load testing for Kafka transport)
[ https://issues.apache.org/jira/browse/KAFKA-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166538#comment-14166538 ] rajendram kathees commented on KAFKA-1693:
--
Thanks for the quick response. I changed localhost to the 127.0.0.1 IP address as suggested in the link above, but I am still getting the same exception. Could you please share a kafka server.properties file for my scenario, or suggest a solution?

kafka.common.KafkaException: fetching topic metadata for topics [Set(test1)] from broker [ArrayBuffer(id:0,host:127.0.0.1,port:9092)] failed
    at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:67)
    at kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
    at kafka.producer.async.DefaultEventHandler$$anonfun$handle$2.apply$mcV$sp(DefaultEventHandler.scala:78)
    at kafka.utils.Utils$.swallow(Utils.scala:167)
    at kafka.utils.Logging$class.swallowError(Logging.scala:106)
    at kafka.utils.Utils$.swallowError(Utils.scala:46)
    at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:78)
    at kafka.producer.Producer.send(Producer.scala:76)
    at kafka.javaapi.producer.Producer.send(Producer.scala:33)
    at org.wso2.carbon.connector.KafkaProduce.send(KafkaProduce.java:76)
    at org.wso2.carbon.connector.KafkaProduce.connect(KafkaProduce.java:27)
    at org.wso2.carbon.connector.core.AbstractConnector.mediate(AbstractConnector.java:32)
    at org.apache.synapse.mediators.ext.ClassMediator.mediate(ClassMediator.java:78)
    at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:77)
    at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:47)
    at org.apache.synapse.mediators.template.TemplateMediator.mediate(TemplateMediator.java:77)
    at org.apache.synapse.mediators.template.InvokeMediator.mediate(InvokeMediator.java:129)
    at org.apache.synapse.mediators.template.InvokeMediator.mediate(InvokeMediator.java:78)
    at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:77)
    at org.apache.synapse.mediators.AbstractListMediator.mediate(AbstractListMediator.java:47)
    at org.apache.synapse.mediators.base.SequenceMediator.mediate(SequenceMediator.java:131)
    at org.apache.synapse.core.axis2.ProxyServiceMessageReceiver.receive(ProxyServiceMessageReceiver.java:166)
    at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:180)
    at org.apache.synapse.transport.passthru.ServerWorker.processNonEntityEnclosingRESTHandler(ServerWorker.java:344)
    at org.apache.synapse.transport.passthru.ServerWorker.processEntityEnclosingRequest(ServerWorker.java:385)
    at org.apache.synapse.transport.passthru.ServerWorker.run(ServerWorker.java:183)
    at org.apache.axis2.transport.base.threads.NativeWorkerPool$1.run(NativeWorkerPool.java:172)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.BindException: Cannot assign requested address
    at sun.nio.ch.Net.connect0(Native Method)
    at sun.nio.ch.Net.connect(Net.java:465)
    at sun.nio.ch.Net.connect(Net.java:457)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:670)
    at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57)
    at kafka.producer.SyncProducer.connect(SyncProducer.scala:141)
    at kafka.producer.SyncProducer.getOrMakeConnection(SyncProducer.scala:156)
    at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:68)
    at kafka.producer.SyncProducer.send(SyncProducer.scala:112)
    at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:53)

Issue sending more messages to single Kafka server (Load testing for Kafka transport)
Key: KAFKA-1693
URL: https://issues.apache.org/jira/browse/KAFKA-1693
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8.1.1
Environment: Ubuntu 14, Java 6
Reporter: rajendram kathees
Original Estimate: 24h
Remaining Estimate: 24h

I tried to send 5 messages to a single Kafka server: I sent the messages to the ESB using JMeter, and the ESB sent them on to the Kafka server. After 28,000 messages I get the following exception. Do I need to change any parameter value on the Kafka server? Please suggest a solution.

[2014-10-06 11:41:05,182] ERROR - Utils$ fetching topic metadata for topics [Set(test1)] from broker
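A plausible mechanism for the `java.net.BindException: Cannot assign requested address` above is ephemeral-port exhaustion: if each send opens (and closes) a fresh TCP connection, every closed client socket lingers in TIME_WAIT, and after tens of thousands of rapid sends the local port range runs dry. The sketch below illustrates the two connection patterns with plain Python sockets; the function names and the plain-socket stand-in are illustrative, not the Kafka producer API. The fix in the failing setup would be to cache and reuse the producer instance rather than creating one per message.

```python
import socket

def send_all_reusing_connection(host, port, payloads):
    """Preferred pattern: one long-lived connection for the whole batch,
    analogous to caching a single producer instance."""
    with socket.create_connection((host, port)) as sock:
        for p in payloads:
            sock.sendall(p)

def send_all_fresh_connections(host, port, payloads):
    """Anti-pattern: a connect/close cycle per message. Each close leaves
    a TIME_WAIT entry; at high rates this exhausts ephemeral ports and
    connect() fails with EADDRNOTAVAIL ("Cannot assign requested address")."""
    for p in payloads:
        with socket.create_connection((host, port)) as sock:
            sock.sendall(p)
```

Widening `net.ipv4.ip_local_port_range` or enabling `tcp_tw_reuse` only delays the symptom; reusing the connection removes it.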
[jira] [Comment Edited] (KAFKA-1693) Issue sending more messages to single Kafka server (Load testing for Kafka transport)
[ https://issues.apache.org/jira/browse/KAFKA-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166538#comment-14166538 ] rajendram kathees edited comment on KAFKA-1693 at 10/10/14 7:58 AM:
--
Thanks for your quick response. I changed localhost to the 127.0.0.1 IP address as suggested in the link above, but I am still getting the same exception (kafka.common.KafkaException: fetching topic metadata for topics [Set(test1)] from broker [ArrayBuffer(id:0,host:127.0.0.1,port:9092)] failed, caused by java.net.BindException: Cannot assign requested address; the full stack trace is identical to the one in the previous comment). Could you please share a kafka server.properties file for my scenario, or suggest a solution?
[jira] [Comment Edited] (KAFKA-1693) Issue sending more messages to single Kafka server (Load testing for Kafka transport)
[ https://issues.apache.org/jira/browse/KAFKA-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166538#comment-14166538 ] rajendram kathees edited comment on KAFKA-1693 at 10/10/14 8:10 AM:
--
Thanks for your quick response. I changed localhost to the 127.0.0.1 IP address as suggested in the link above, but I am still getting the same exception (kafka.common.KafkaException: fetching topic metadata for topics [Set(test1)] from broker [ArrayBuffer(id:0,host:127.0.0.1,port:9092)] failed, caused by java.net.BindException: Cannot assign requested address; the full stack trace is identical to the one in the previous comment). Could you please share a kafka server.properties file, or suggest a solution?
[jira] [Issue Comment Deleted] (KAFKA-1328) Add new consumer APIs
[ https://issues.apache.org/jira/browse/KAFKA-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stevo Slavic updated KAFKA-1328:
Comment: was deleted (was: Affects 0.9.0 ?!)

Add new consumer APIs
Key: KAFKA-1328
URL: https://issues.apache.org/jira/browse/KAFKA-1328
Project: Kafka
Issue Type: Sub-task
Components: consumer
Affects Versions: 0.8.0, 0.8.1
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Fix For: 0.8.2
Attachments: KAFKA-1328.patch, KAFKA-1328_2014-04-10_17:13:24.patch, KAFKA-1328_2014-04-10_18:30:48.patch, KAFKA-1328_2014-04-11_10:54:19.patch, KAFKA-1328_2014-04-11_11:16:44.patch, KAFKA-1328_2014-04-12_18:30:22.patch, KAFKA-1328_2014-04-12_19:12:12.patch, KAFKA-1328_2014-05-05_11:35:07.patch, KAFKA-1328_2014-05-05_11:35:41.patch, KAFKA-1328_2014-05-09_17:18:55.patch, KAFKA-1328_2014-05-16_11:46:02.patch, KAFKA-1328_2014-05-20_15:55:01.patch, KAFKA-1328_2014-05-20_16:34:37.patch

New consumer API discussion is here: http://mail-archives.apache.org/mod_mbox/kafka-users/201402.mbox/%3CCAOG_4QYBHwyi0xN=hl1fpnrtkvfjzx14ujfntft3nn_mw3+...@mail.gmail.com%3E

This JIRA includes reviewing and checking in the new consumer APIs.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-784) creating topic without partitions, deleting then creating with partition causes errors in 'kafka-list-topic'
[ https://issues.apache.org/jira/browse/KAFKA-784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166709#comment-14166709 ] Ahmet AKYOL commented on KAFKA-784:
---
Happy to hear that. I was using this version:

./kafka-run-class.sh kafka.admin.DeleteTopicCommand --topic mytopic --zookeeper localhost:2181

and afterwards I couldn't re-create the same topic until I removed the related Kafka data logs manually.

creating topic without partitions, deleting then creating with partition causes errors in 'kafka-list-topic'
Key: KAFKA-784
URL: https://issues.apache.org/jira/browse/KAFKA-784
Project: Kafka
Issue Type: Sub-task
Components: core
Affects Versions: 0.8.0
Environment: 0.8.0 head as of 3/4/2013
Reporter: Chris Curtin
Assignee: Swapnil Ghike
Priority: Minor

Create a new topic using the command line:
./kafka-create-topic.sh --topic trash-1 --replica 3 --zookeeper localhost
Realize you forgot to add the partition option, so remove the topic:
./kafka-delete-topic.sh --topic trash-1 --zookeeper localhost
Recreate it with partitions:
./kafka-create-topic.sh --topic trash-1 --replica 3 --zookeeper localhost --partition 5
Try to get a listing:
./kafka-list-topic.sh --topic trash-1 --zookeeper localhost
Errors:
[2013-03-04 14:15:23,876] ERROR Error while fetching metadata for partition [trash-1,0] (kafka.admin.AdminUtils$)
kafka.common.LeaderNotAvailableException: Leader not available for topic trash-1 partition 0
    at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
    at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
    at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
    at scala.collection.immutable.List.foreach(List.scala:45)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
    at scala.collection.immutable.List.map(List.scala:45)
    at kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
    at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
    at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
    at kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
    at kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
    at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
    at scala.collection.immutable.List.foreach(List.scala:45)
    at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
    at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
Caused by: kafka.common.LeaderNotAvailableException: No leader exists for partition 0
    at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
    ... 16 more
topic: trash-1 PartitionMetadata(0,None,List(),List(),5)
You can't recover until you restart all the brokers in the cluster; then the list command works correctly.
[jira] [Assigned] (KAFKA-1691) new java consumer needs ssl support as a client
[ https://issues.apache.org/jira/browse/KAFKA-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Lyutov reassigned KAFKA-1691:
--
Assignee: Ivan Lyutov

new java consumer needs ssl support as a client
Key: KAFKA-1691
URL: https://issues.apache.org/jira/browse/KAFKA-1691
Project: Kafka
Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Ivan Lyutov
Fix For: 0.8.3
[jira] [Assigned] (KAFKA-1690) new java producer needs ssl support as a client
[ https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Lyutov reassigned KAFKA-1690:
--
Assignee: Ivan Lyutov

new java producer needs ssl support as a client
Key: KAFKA-1690
URL: https://issues.apache.org/jira/browse/KAFKA-1690
Project: Kafka
Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Ivan Lyutov
Fix For: 0.8.3
[jira] [Commented] (KAFKA-1477) add authentication layer and initial JKS x509 implementation for brokers, producers and consumer for network communication
[ https://issues.apache.org/jira/browse/KAFKA-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166880#comment-14166880 ] Joe Stein commented on KAFKA-1477:
--
Jun, getting this to work in the new (java) client is going to take some more work integrating a secure nio version. Ivan is going to start on that later next week, after this last configuration item is done. I created separate tickets, KAFKA-1690 and KAFKA-1691, for the new (java) client work. I am +1 on committing this patch to trunk after the server configuration item is done and working the way folks expect. This gives the broker support so we can start to get the existing clients (including third-party clients, e.g. https://github.com/Shopify/sarama/pull/156 ) to hook in now for security features as we move ahead with them overall. I see the existing (scala and third-party) clients as stable clients that we are incrementally changing to confirm compatibility with each other, the broker, and the new (java) clients. We can drop support once they are no longer being used and folks have upgraded (or are at least on an upgrade path), but I don't think we are there yet, so putting this in now makes sense.

add authentication layer and initial JKS x509 implementation for brokers, producers and consumer for network communication
Key: KAFKA-1477
URL: https://issues.apache.org/jira/browse/KAFKA-1477
Project: Kafka
Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Ivan Lyutov
Fix For: 0.8.3
Attachments: KAFKA-1477-binary.patch, KAFKA-1477.patch, KAFKA-1477_2014-06-02_16:59:40.patch, KAFKA-1477_2014-06-02_17:24:26.patch, KAFKA-1477_2014-06-03_13:46:17.patch, KAFKA-1477_trunk.patch
[jira] [Updated] (KAFKA-784) creating topic without partitions, deleting then creating with partition causes errors in 'kafka-list-topic'
[ https://issues.apache.org/jira/browse/KAFKA-784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-784:
-
Fix Version/s: 0.8.2
[jira] [Resolved] (KAFKA-784) creating topic without partitions, deleting then creating with partition causes errors in 'kafka-list-topic'
[ https://issues.apache.org/jira/browse/KAFKA-784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani resolved KAFKA-784.
--
Resolution: Fixed
[jira] [Commented] (KAFKA-784) creating topic without partitions, deleting then creating with partition causes errors in 'kafka-list-topic'
[ https://issues.apache.org/jira/browse/KAFKA-784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166988#comment-14166988 ] Sriharsha Chintalapani commented on KAFKA-784:
--
[~liqusha] With the kafka 0.8.2 branch or trunk, kafka-topics won't allow creating a topic without --partitions and --replication-factor. Once the topic is created, deleting it and re-creating it works fine. I also tested this by increasing the topic's partitions and then deleting it; that works fine as well. Closing this as resolved.
Review Request 26560: Patch for KAFKA-1305
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26560/
---

Review request for kafka.

Bugs: KAFKA-1305
    https://issues.apache.org/jira/browse/KAFKA-1305

Repository: kafka

Description
---
KAFKA-1305. Controller can hang on controlled shutdown with auto leader balance enabled.

Diffs
-
core/src/main/scala/kafka/server/KafkaConfig.scala 90af698b01ec82b6168e02b6af41887ef164ad51

Diff: https://reviews.apache.org/r/26560/diff/

Testing
---

Thanks,
Sriharsha Chintalapani
[jira] [Commented] (KAFKA-1305) Controller can hang on controlled shutdown with auto leader balance enabled
[ https://issues.apache.org/jira/browse/KAFKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167017#comment-14167017 ] Sriharsha Chintalapani commented on KAFKA-1305:
---
Created reviewboard https://reviews.apache.org/r/26560/diff/ against branch origin/trunk

Controller can hang on controlled shutdown with auto leader balance enabled
Key: KAFKA-1305
URL: https://issues.apache.org/jira/browse/KAFKA-1305
Project: Kafka
Issue Type: Bug
Reporter: Joel Koshy
Assignee: Sriharsha Chintalapani
Priority: Blocker
Fix For: 0.8.2, 0.9.0
Attachments: KAFKA-1305.patch

This is relatively easy to reproduce, especially when doing a rolling bounce. What happened here is as follows:
1. The previous controller was bounced and broker 265 became the new controller.
2. I went on to do a controlled shutdown of broker 265 (the new controller).
3. In the meantime, the automatically scheduled preferred replica leader election process started doing its thing and began sending LeaderAndIsrRequests/UpdateMetadataRequests to itself (and other brokers). (t@113 below.)
4. While that was happening, the controlled shutdown process on 265 succeeded and proceeded to deregister itself from ZooKeeper and shut down the socket server.
5. (ReplicaStateMachine actually removes deregistered brokers from the controller channel manager's list of brokers to send requests to. However, that removal cannot take place (t@18 below) because the preferred replica leader election task owns the controller lock.)
6. So the request thread to broker 265 gets into infinite retries.
7. The entire broker shutdown process is blocked on controller shutdown for the same reason (it needs to acquire the controller lock).
Relevant portions from the thread-dump: Controller-265-to-broker-265-send-thread - Thread t@113 java.lang.Thread.State: TIMED_WAITING at java.lang.Thread.sleep(Native Method) at kafka.controller.RequestSendThread$$anonfun$liftedTree1$1$1.apply$mcV$sp(ControllerChannelManager.scala:143) at kafka.utils.Utils$.swallow(Utils.scala:167) at kafka.utils.Logging$class.swallowWarn(Logging.scala:92) at kafka.utils.Utils$.swallowWarn(Utils.scala:46) at kafka.utils.Logging$class.swallow(Logging.scala:94) at kafka.utils.Utils$.swallow(Utils.scala:46) at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:143) at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131) - locked java.lang.Object@6dbf14a7 at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51) Locked ownable synchronizers: - None ... Thread-4 - Thread t@17 java.lang.Thread.State: WAITING on java.util.concurrent.locks.ReentrantLock$NonfairSync@4836840 owned by: kafka-scheduler-0 at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262) at kafka.utils.Utils$.inLock(Utils.scala:536) at kafka.controller.KafkaController.shutdown(KafkaController.scala:642) at kafka.server.KafkaServer$$anonfun$shutdown$9.apply$mcV$sp(KafkaServer.scala:242) at kafka.utils.Utils$.swallow(Utils.scala:167) at kafka.utils.Logging$class.swallowWarn(Logging.scala:92) at kafka.utils.Utils$.swallowWarn(Utils.scala:46) at 
kafka.utils.Logging$class.swallow(Logging.scala:94) at kafka.utils.Utils$.swallow(Utils.scala:46) at kafka.server.KafkaServer.shutdown(KafkaServer.scala:242) at kafka.server.KafkaServerStartable.shutdown(KafkaServerStartable.scala:46) at kafka.Kafka$$anon$1.run(Kafka.scala:42) ... kafka-scheduler-0 - Thread t@117 java.lang.Thread.State: WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1dc407fc at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:306) at
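The wedge described in steps 3-7 above can be sketched in miniature: one thread takes the controller lock and then blocks on a full bounded request queue, while the shutdown thread waits behind the same lock. This is an illustrative toy, not Kafka code; the thread roles mirror the dump above, but the queue capacity of 1 and the 200 ms lock probe are invented for the demo.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class ControllerLockSketch {
    public static void main(String[] args) throws Exception {
        ReentrantLock controllerLock = new ReentrantLock();
        // Stand-in for the controller's bounded per-broker request queue.
        LinkedBlockingQueue<String> requestQueue = new LinkedBlockingQueue<>(1);
        requestQueue.put("LeaderAndIsrRequest"); // queue is now full

        CountDownLatch electionHoldsLock = new CountDownLatch(1);
        // Role of "kafka-scheduler-0": take the controller lock, then block on put().
        Thread election = new Thread(() -> {
            controllerLock.lock();
            try {
                electionHoldsLock.countDown();
                requestQueue.put("UpdateMetadataRequest"); // blocks: nothing drains the queue
            } catch (InterruptedException e) {
                // interrupted by main below so the demo can finish
            } finally {
                controllerLock.unlock();
            }
        });
        election.start();
        electionHoldsLock.await();

        // Role of "Thread-4" (broker shutdown): stuck behind the same lock.
        boolean acquired = controllerLock.tryLock(200, TimeUnit.MILLISECONDS);
        System.out.println("shutdown acquired controller lock: " + acquired);

        election.interrupt(); // break the simulated hang
        election.join();
    }
}
```

In the real broker there is no timeout: shutdown acquires the lock via `inLock()`, so both threads wait indefinitely, which is the hang in the thread dump.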
[jira] [Updated] (KAFKA-1305) Controller can hang on controlled shutdown with auto leader balance enabled
[ https://issues.apache.org/jira/browse/KAFKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1305: -- Attachment: KAFKA-1305.patch
[jira] [Updated] (KAFKA-1305) Controller can hang on controlled shutdown with auto leader balance enabled
[ https://issues.apache.org/jira/browse/KAFKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1305: -- Status: Patch Available (was: Open)
Re: including KAFKA-1555 in 0.8.2?
+1. On Thu, Oct 9, 2014 at 6:41 PM, Jun Rao jun...@gmail.com wrote: Hi, Everyone, I just committed KAFKA-1555 (min.isr support) to trunk. I felt that it's probably useful to include it in the 0.8.2 release. Any objections? Thanks, Jun
[jira] [Commented] (KAFKA-1305) Controller can hang on controlled shutdown with auto leader balance enabled
[ https://issues.apache.org/jira/browse/KAFKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167050#comment-14167050 ] Neha Narkhede commented on KAFKA-1305: -- [~junrao], [~sriharsha] What's the value in changing it from something to 10K vs unbounded?
Re: Review Request 26560: Patch for KAFKA-1305
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26560/#review56156 --- core/src/main/scala/kafka/server/KafkaConfig.scala https://reviews.apache.org/r/26560/#comment96475 Weren't we going to make this unbounded? - Neha Narkhede On Oct. 10, 2014, 3:51 p.m., Sriharsha Chintalapani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26560/ --- (Updated Oct. 10, 2014, 3:51 p.m.) Review request for kafka. Bugs: KAFKA-1305 https://issues.apache.org/jira/browse/KAFKA-1305 Repository: kafka Description --- KAFKA-1305. Controller can hang on controlled shutdown with auto leader balance enabled. Diffs - core/src/main/scala/kafka/server/KafkaConfig.scala 90af698b01ec82b6168e02b6af41887ef164ad51 Diff: https://reviews.apache.org/r/26560/diff/ Testing --- Thanks, Sriharsha Chintalapani
[jira] [Comment Edited] (KAFKA-1697) remove code related to ack>1 on the broker
[ https://issues.apache.org/jira/browse/KAFKA-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14166144#comment-14166144 ] Neha Narkhede edited comment on KAFKA-1697 at 10/10/14 4:12 PM: We probably can wait until KAFKA-1583 is done since the code change will likely be easier then. was (Author: junrao): We probably can wait under kafka-1583 is done since code change will likely be easier then. remove code related to ack>1 on the broker -- Key: KAFKA-1697 URL: https://issues.apache.org/jira/browse/KAFKA-1697 Project: Kafka Issue Type: Bug Reporter: Jun Rao Assignee: Gwen Shapira Fix For: 0.8.3 We removed the ack>1 support from the producer client in KAFKA-1555. We can completely remove the code in the broker that supports ack>1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167057#comment-14167057 ] Jay Kreps commented on KAFKA-1555: -- w00t! provide strong consistency with reasonable availability --- Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch In a mission-critical application, we expect that a Kafka cluster with 3 brokers can satisfy two requirements: 1. When 1 broker is down, no message loss or service blocking happens. 2. In worse cases, such as when two brokers are down, service can be blocked, but no message loss happens. We found that the current Kafka version (0.8.1.1) cannot achieve the requirements due to three behaviors: 1. when choosing a new leader from 2 followers in the ISR, the one with fewer messages may be chosen as the leader. 2. even when replica.lag.max.messages=0, a follower can stay in the ISR when it has fewer messages than the leader. 3. the ISR can contain only 1 broker, therefore acknowledged messages may be stored in only 1 broker. The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning all 3 replicas, leader A and followers B and C, are in sync, i.e., they have the same messages and are all in the ISR. According to the value of request.required.acks (acks for short), there are the following cases. 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement. 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in the ISR. If A is killed, C can be elected as the new leader, and consumers will miss m. 3. acks=-1. B and C restart and are removed from the ISR. Producer sends a message m to A and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost. In summary, no existing configuration can satisfy the requirements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
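The acks=2 data-loss case in the description above can be checked with a toy model: replica logs are plain lists, and "leader election" is simply picking follower C. The names A, B, and C follow the report; nothing here is Kafka code.

```java
import java.util.ArrayList;
import java.util.List;

public class AcksTwoSketch {
    public static void main(String[] args) {
        // Three in-sync replicas of one partition: leader A, followers B and C.
        List<String> a = new ArrayList<>(), b = new ArrayList<>(), c = new ArrayList<>();
        // acks=2: the producer is acknowledged once A and B have message m,
        // even though C (still considered in the ISR) has not replicated it yet.
        a.add("m");
        b.add("m");
        // A is killed; C, being in the ISR, is a legal choice for the new leader.
        List<String> newLeader = c;
        boolean acknowledgedMessageLost = !newLeader.contains("m");
        System.out.println("acknowledged message lost after failover: " + acknowledgedMessageLost);
    }
}
```

The fix in KAFKA-1555 addresses this by combining acks=-1 with a minimum ISR size, so an acknowledgement implies the message is on at least that many replicas.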
Re: Review Request 26474: KAFKA-1654 Provide a way to override server configuration from command line
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26474/ --- (Updated Oct. 10, 2014, 4:23 p.m.) Review request for kafka and Neha Narkhede. Bugs: KAFKA-1654 https://issues.apache.org/jira/browse/KAFKA-1654 Repository: kafka Description --- I'm assuming that we might want to add additional arguments in the future as well, so I've added a general facility to parse arguments to the Kafka main class and added an argument --set that defines/overrides any property in the config file. I've decided to use --set rather than exposing each property that is available in the KafkaConfig class as its own argument, so that we don't have to keep those two classes always in sync. This is the first bigger patch that I've written in Scala, so I'm particularly interested to hear feedback on the coding style. Diffs - core/src/main/scala/kafka/Kafka.scala 2e94fee core/src/test/scala/unit/kafka/KafkaTest.scala PRE-CREATION Diff: https://reviews.apache.org/r/26474/diff/ Testing --- I've added unit tests and verified the functionality on a real cluster. Thanks, Jarek Cecho
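The override mechanism described in the review above amounts to merging `--set key=value` pairs on top of the properties loaded from the config file. A minimal sketch, assuming the `--set` flag name from the patch description (the option name that finally lands on trunk may differ), in Java rather than the patch's Scala:

```java
import java.util.Properties;

public class OverrideSketch {
    // Apply "--set key=value" overrides on top of properties loaded from
    // server.properties; later command-line values win over file values.
    static Properties withOverrides(Properties fromFile, String[] args) {
        Properties merged = new Properties();
        merged.putAll(fromFile);
        for (int i = 0; i < args.length - 1; i++) {
            if (args[i].equals("--set")) {
                String[] kv = args[++i].split("=", 2); // split only on the first '='
                merged.setProperty(kv[0], kv[1]);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Properties file = new Properties();
        file.setProperty("num.io.threads", "8");
        Properties merged = withOverrides(file,
                new String[]{"--set", "num.io.threads=16", "--set", "log.dirs=/tmp/kafka"});
        System.out.println(merged.getProperty("num.io.threads"));
        System.out.println(merged.getProperty("log.dirs"));
    }
}
```

The design choice the author mentions holds here too: because the overrides are generic key=value pairs, no new command-line flag is needed when KafkaConfig grows a property.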
Re: including KAFKA-1555 in 0.8.2?
+1 On Oct 10, 2014 12:08 PM, Neha Narkhede neha.narkh...@gmail.com wrote: +1. On Thu, Oct 9, 2014 at 6:41 PM, Jun Rao jun...@gmail.com wrote: Hi, Everyone, I just committed KAFKA-1555 (min.isr support) to trunk. I felt that it's probably useful to include it in the 0.8.2 release. Any objections? Thanks, Jun
Re: including KAFKA-1555 in 0.8.2?
+1 :) On Fri, Oct 10, 2014 at 9:34 AM, Joe Stein joe.st...@stealth.ly wrote: +1 On Oct 10, 2014 12:08 PM, Neha Narkhede neha.narkh...@gmail.com wrote: +1. On Thu, Oct 9, 2014 at 6:41 PM, Jun Rao jun...@gmail.com wrote: Hi, Everyone, I just committed KAFKA-1555 (min.isr support) to trunk. I felt that it's probably useful to include it in the 0.8.2 release. Any objections? Thanks, Jun
Re: Security JIRAS
I'd vote for accepting every major change with the relevant system tests. We didn't do this for major features in the past, which led to weak coverage and a great deal of work for someone else to add tests later. I'm guilty of this myself :-( On Thu, Oct 9, 2014 at 6:45 PM, Gwen Shapira gshap...@cloudera.com wrote: Added some details on delegation tokens. I hope it at least clarifies some of the scope. I'm working on a more detailed design doc. On Thu, Oct 9, 2014 at 1:44 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey Gwen, You're absolutely right about these. I added the ticket for ZK authentication and Hadoop delegation tokens. For the Hadoop case I actually don't understand Hadoop security very well. Maybe you could fill in some of the details on what needs to happen for that to work? For testing, we should probably discuss the best way to test security. I think this is a fairly critical thing; if we are going to say we have security, we really need to have good tests in place to ensure we do. This will require some thought. I think we should be able to test TLS fairly easily using a junit integration test that just starts the server and connects using TLS. For Kerberos, though, it isn't clear to me how to do good integration testing since we need a KDC to test against, and it isn't clear how that happens in the test environment except possibly manually (which is not ideal). How do other projects handle this? -Jay On Tue, Oct 7, 2014 at 5:25 PM, Gwen Shapira gshap...@cloudera.com wrote: I think we need to add: * Authentication of Kafka brokers with a secured ZooKeeper * Kafka should be able to generate delegation tokens for MapReduce / Spark / Yarn jobs. 
* Extend systest framework to allow testing secured kafka Gwen On Tue, Oct 7, 2014 at 5:15 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey guys, As promised, I added a tree of JIRAs for the stuff in the security wiki ( https://cwiki.apache.org/confluence/display/KAFKA/Security): https://issues.apache.org/jira/browse/KAFKA-1682 I tried to break it into reasonably standalone pieces. I think many of the tickets could actually be done in parallel. Since there were many people interested in this area this may help parallelize the work a bit. I added some strawman details on implementation to each ticket. We can discuss and refine further on the individual tickets. Please take a look and let me know if this breakdown seems reasonable. Cheers, -Jay
Re: including KAFKA-1555 in 0.8.2?
+1 On 10/10/14 9:39 AM, Gwen Shapira gshap...@cloudera.com wrote: +1 :) On Fri, Oct 10, 2014 at 9:34 AM, Joe Stein joe.st...@stealth.ly wrote: +1 On Oct 10, 2014 12:08 PM, Neha Narkhede neha.narkh...@gmail.com wrote: +1. On Thu, Oct 9, 2014 at 6:41 PM, Jun Rao jun...@gmail.com wrote: Hi, Everyone, I just committed KAFKA-1555 (min.isr support) to trunk. I felt that it's probably useful to include it in the 0.8.2 release. Any objections? Thanks, Jun
Re: Security JIRAS
I would be a strong +1 on that. I've seen a lot of regressions on other projects where new functionality broke things when running in secure mode. Jarcec On Oct 10, 2014, at 9:43 AM, Neha Narkhede neha.narkh...@gmail.com wrote: I'd vote for accepting every major change with the relevant system tests. We didn't do this for major features in the past, which led to weak coverage and a great deal of work for someone else to add tests later. I'm guilty of this myself :-( On Thu, Oct 9, 2014 at 6:45 PM, Gwen Shapira gshap...@cloudera.com wrote: Added some details on delegation tokens. I hope it at least clarifies some of the scope. I'm working on a more detailed design doc. On Thu, Oct 9, 2014 at 1:44 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey Gwen, You're absolutely right about these. I added the ticket for ZK authentication and Hadoop delegation tokens. For the Hadoop case I actually don't understand Hadoop security very well. Maybe you could fill in some of the details on what needs to happen for that to work? For testing, we should probably discuss the best way to test security. I think this is a fairly critical thing; if we are going to say we have security, we really need to have good tests in place to ensure we do. This will require some thought. I think we should be able to test TLS fairly easily using a junit integration test that just starts the server and connects using TLS. For Kerberos, though, it isn't clear to me how to do good integration testing since we need a KDC to test against, and it isn't clear how that happens in the test environment except possibly manually (which is not ideal). How do other projects handle this? -Jay On Tue, Oct 7, 2014 at 5:25 PM, Gwen Shapira gshap...@cloudera.com wrote: I think we need to add: * Authentication of Kafka brokers with a secured ZooKeeper * Kafka should be able to generate delegation tokens for MapReduce / Spark / Yarn jobs. 
* Extend systest framework to allow testing secured kafka Gwen On Tue, Oct 7, 2014 at 5:15 PM, Jay Kreps jay.kr...@gmail.com wrote: Hey guys, As promised, I added a tree of JIRAs for the stuff in the security wiki ( https://cwiki.apache.org/confluence/display/KAFKA/Security): https://issues.apache.org/jira/browse/KAFKA-1682 I tried to break it into reasonably standalone pieces. I think many of the tickets could actually be done in parallel. Since there were many people interested in this area this may help parallelize the work a bit. I added some strawman details on implementation to each ticket. We can discuss and refine further on the individual tickets. Please take a look and let me know if this breakdown seems reasonable. Cheers, -Jay
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167113#comment-14167113 ] Sriram Subramanian commented on KAFKA-1555: --- Awesome. I suggest we document the guarantees provided by the different knobs. That would be very useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-559) Garbage collect old consumer metadata entries
[ https://issues.apache.org/jira/browse/KAFKA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neha Narkhede updated KAFKA-559: Labels: newbie project (was: project) Garbage collect old consumer metadata entries - Key: KAFKA-559 URL: https://issues.apache.org/jira/browse/KAFKA-559 Project: Kafka Issue Type: New Feature Reporter: Jay Kreps Assignee: Tejas Patil Labels: newbie, project Attachments: KAFKA-559.v1.patch, KAFKA-559.v2.patch Many use cases involve transient consumers. These consumers create entries under their consumer group in ZooKeeper and maintain offsets there as well. There is currently no way to delete these entries. It would be good to have a tool that did something like bin/delete-obsolete-consumer-groups.sh [--topic t1] --since [date] --zookeeper [zk_connect] This would scan through consumer group entries and delete any that had no offset update since the given date. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
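The core of the proposed tool is just a cutoff filter over each group's last offset-update time. A hypothetical sketch of that selection step, with the ZooKeeper scan replaced by an in-memory map (group name to last update time) since the real tool's reading of ZK paths is not specified in the ticket:

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StaleGroupSketch {
    // Return the consumer groups with no offset update since the cutoff,
    // i.e. the candidates that --since [date] would delete.
    static List<String> staleGroups(Map<String, Instant> lastUpdate, Instant since) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Instant> e : lastUpdate.entrySet())
            if (e.getValue().isBefore(since)) stale.add(e.getKey());
        Collections.sort(stale); // deterministic output for display
        return stale;
    }

    public static void main(String[] args) {
        Map<String, Instant> groups = new HashMap<>();
        groups.put("etl-job",  Instant.parse("2012-01-01T00:00:00Z")); // long dead
        groups.put("live-app", Instant.parse("2014-10-01T00:00:00Z")); // recently active
        List<String> stale = staleGroups(groups, Instant.parse("2014-01-01T00:00:00Z"));
        System.out.println(stale);
    }
}
```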
[jira] [Commented] (KAFKA-1305) Controller can hang on controlled shutdown with auto leader balance enabled
[ https://issues.apache.org/jira/browse/KAFKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167145#comment-14167145 ] Sriharsha Chintalapani commented on KAFKA-1305: --- [~nehanarkhede] [~junrao] my understanding is that we created more room for the KafkaController not to get into any of the above-mentioned issues by setting it to 10k, but yes, making it unbounded is a better option, as there is a chance of exhausting the 10k bounded queue and running into issues. We can get rid of controller.message.queue.size as a config option and make the LinkedBlockingQueue unbounded. Controller can hang on controlled shutdown with auto leader balance enabled --- Key: KAFKA-1305 URL: https://issues.apache.org/jira/browse/KAFKA-1305 Project: Kafka Issue Type: Bug Reporter: Joel Koshy Assignee: Sriharsha Chintalapani Priority: Blocker Fix For: 0.8.2, 0.9.0 Attachments: KAFKA-1305.patch This is relatively easy to reproduce especially when doing a rolling bounce. What happened here is as follows: 1. The previous controller was bounced and broker 265 became the new controller. 2. I went on to do a controlled shutdown of broker 265 (the new controller). 3. In the mean time the automatically scheduled preferred replica leader election process started doing its thing and starts sending LeaderAndIsrRequests/UpdateMetadataRequests to itself (and other brokers). (t@113 below). 4. While that's happening, the controlled shutdown process on 265 succeeds and proceeds to deregister itself from ZooKeeper and shuts down the socket server. 5. (ReplicaStateMachine actually removes deregistered brokers from the controller channel manager's list of brokers to send requests to. However, that removal cannot take place (t@18 below) because preferred replica leader election task owns the controller lock.) 6. So the request thread to broker 265 gets into infinite retries. 7. 
The entire broker shutdown process is blocked on controller shutdown for the same reason (it needs to acquire the controller lock). Relevant portions from the thread-dump: Controller-265-to-broker-265-send-thread - Thread t@113 java.lang.Thread.State: TIMED_WAITING at java.lang.Thread.sleep(Native Method) at kafka.controller.RequestSendThread$$anonfun$liftedTree1$1$1.apply$mcV$sp(ControllerChannelManager.scala:143) at kafka.utils.Utils$.swallow(Utils.scala:167) at kafka.utils.Logging$class.swallowWarn(Logging.scala:92) at kafka.utils.Utils$.swallowWarn(Utils.scala:46) at kafka.utils.Logging$class.swallow(Logging.scala:94) at kafka.utils.Utils$.swallow(Utils.scala:46) at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:143) at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131) - locked java.lang.Object@6dbf14a7 at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51) Locked ownable synchronizers: - None ... Thread-4 - Thread t@17 java.lang.Thread.State: WAITING on java.util.concurrent.locks.ReentrantLock$NonfairSync@4836840 owned by: kafka-scheduler-0 at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178) at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186) at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262) at kafka.utils.Utils$.inLock(Utils.scala:536) at kafka.controller.KafkaController.shutdown(KafkaController.scala:642) at kafka.server.KafkaServer$$anonfun$shutdown$9.apply$mcV$sp(KafkaServer.scala:242) at kafka.utils.Utils$.swallow(Utils.scala:167) at 
kafka.utils.Logging$class.swallowWarn(Logging.scala:92) at kafka.utils.Utils$.swallowWarn(Utils.scala:46) at kafka.utils.Logging$class.swallow(Logging.scala:94) at kafka.utils.Utils$.swallow(Utils.scala:46) at kafka.server.KafkaServer.shutdown(KafkaServer.scala:242) at kafka.server.KafkaServerStartable.shutdown(KafkaServerStartable.scala:46) at kafka.Kafka$$anon$1.run(Kafka.scala:42) ... kafka-scheduler-0 - Thread t@117 java.lang.Thread.State: WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1dc407fc at
Re: Review Request 26503: Patch for KAFKA-1493
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26503/#review56164 --- Thanks for the patch. A couple of high level questions. 1. Is the format used in KafkaLZ4BlockInputStream standard? I am wondering if there are libraries in other languages that support this format too. 2. Could you summarize the key difference between the format in KafkaLZ4BlockInputStream and LZ4BlockInputStream? 3. Could we add lz4 in ProducerCompressionTest? clients/src/main/java/org/apache/kafka/common/message/KafkaLZ4BlockInputStream.java https://reviews.apache.org/r/26503/#comment96483 Should originalLen be compressedLen? clients/src/main/java/org/apache/kafka/common/message/KafkaLZ4BlockInputStream.java https://reviews.apache.org/r/26503/#comment96484 Should we throw an UnsupportedOperationException? clients/src/main/java/org/apache/kafka/common/message/KafkaLZ4BlockOutputStream.java https://reviews.apache.org/r/26503/#comment96488 It seems this byte has both the compression method and the compression level. Could we document this? clients/src/main/java/org/apache/kafka/common/message/KafkaLZ4BlockOutputStream.java https://reviews.apache.org/r/26503/#comment96487 When is finished set to true? - Jun Rao On Oct. 9, 2014, 3:39 p.m., Ivan Lyutov wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26503/ --- (Updated Oct. 9, 2014, 3:39 p.m.) Review request for kafka. Bugs: KAFKA-1493 https://issues.apache.org/jira/browse/KAFKA-1493 Repository: kafka Description --- KAFKA-1493 - implemented input/output lz4 streams for kafka message compression, added compression format description, minor typo fix. 
Diffs - clients/src/main/java/org/apache/kafka/common/message/KafkaLZ4BlockInputStream.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/message/KafkaLZ4BlockOutputStream.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/record/CompressionType.java 5227b2d7ab803389d1794f48c8232350c05b14fd clients/src/main/java/org/apache/kafka/common/record/Compressor.java 0323f5f7032dceb49d820c17a41b78c56591ffc4 config/producer.properties 39d65d7c6c21f4fccd7af89be6ca12a088d5dd98 core/src/main/scala/kafka/message/CompressionCodec.scala de0a0fade5387db63299c6b112b3c9a5e41d82ec core/src/main/scala/kafka/message/CompressionFactory.scala 8420e13d0d8680648df78f22ada4a0d4e3ab8758 core/src/test/scala/unit/kafka/message/MessageCompressionTest.scala 6f0addcea64f1e78a4de50ec8135f4d02cebd305 Diff: https://reviews.apache.org/r/26503/diff/ Testing --- Thanks, Ivan Lyutov
[jira] [Updated] (KAFKA-559) Garbage collect old consumer metadata entries
[ https://issues.apache.org/jira/browse/KAFKA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neha Narkhede updated KAFKA-559: Reviewer: Neha Narkhede Garbage collect old consumer metadata entries - Key: KAFKA-559 URL: https://issues.apache.org/jira/browse/KAFKA-559 Project: Kafka Issue Type: New Feature Reporter: Jay Kreps Assignee: Tejas Patil Labels: newbie, project Attachments: KAFKA-559.v1.patch, KAFKA-559.v2.patch Many use cases involve transient consumers. These consumers create entries under their consumer group in zk and maintain offsets there as well. There is currently no way to delete these entries. It would be good to have a tool that did something like bin/delete-obsolete-consumer-groups.sh [--topic t1] --since [date] --zookeeper [zk_connect] This would scan through consumer group entries and delete any that had no offset update since the given date. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1493) Use a well-documented LZ4 compression format and remove redundant LZ4HC option
[ https://issues.apache.org/jira/browse/KAFKA-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167153#comment-14167153 ] Jun Rao commented on KAFKA-1493: James, Could you help review the format in Ivan's patch? Is the format used in KafkaLZ4BlockInputStream standard? I am wondering if there are libraries in other languages that support this format too. Thanks, Use a well-documented LZ4 compression format and remove redundant LZ4HC option -- Key: KAFKA-1493 URL: https://issues.apache.org/jira/browse/KAFKA-1493 Project: Kafka Issue Type: Improvement Affects Versions: 0.8.2 Reporter: James Oliver Assignee: Ivan Lyutov Priority: Blocker Fix For: 0.8.2 Attachments: KAFKA-1493.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 26563: Patch for KAFKA-1692
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26563/ --- Review request for kafka. Bugs: KAFKA-1692 https://issues.apache.org/jira/browse/KAFKA-1692 Repository: kafka Description --- KAFKA-1692 Include client ID in new producer IO thread name. Diffs - clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java f58b8508d3f813a51015abed772c704390887d7e Diff: https://reviews.apache.org/r/26563/diff/ Testing --- Thanks, Ewen Cheslack-Postava
[jira] [Updated] (KAFKA-1692) [Java New Producer] IO Thread Name Must include Client ID
[ https://issues.apache.org/jira/browse/KAFKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1692: - Assignee: Ewen Cheslack-Postava (was: Jun Rao) Status: Patch Available (was: Open) [Java New Producer] IO Thread Name Must include Client ID --- Key: KAFKA-1692 URL: https://issues.apache.org/jira/browse/KAFKA-1692 Project: Kafka Issue Type: Improvement Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Ewen Cheslack-Postava Priority: Trivial Labels: newbie Attachments: KAFKA-1692.patch Please add the client id so people looking at JConsole or a profiling tool can see the thread by client id, since a single JVM can have multiple producer instances. org.apache.kafka.clients.producer.KafkaProducer {code} String ioThreadName = "kafka-producer-network-thread"; if (clientId != null) { ioThreadName = ioThreadName + " | " + clientId; } this.ioThread = new KafkaThread(ioThreadName, this.sender, true); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1692) [Java New Producer] IO Thread Name Must include Client ID
[ https://issues.apache.org/jira/browse/KAFKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1692: - Attachment: KAFKA-1692.patch [Java New Producer] IO Thread Name Must include Client ID --- Key: KAFKA-1692 URL: https://issues.apache.org/jira/browse/KAFKA-1692 Project: Kafka Issue Type: Improvement Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Jun Rao Priority: Trivial Labels: newbie Attachments: KAFKA-1692.patch Please add the client id so people looking at JConsole or a profiling tool can see the thread by client id, since a single JVM can have multiple producer instances. org.apache.kafka.clients.producer.KafkaProducer {code} String ioThreadName = "kafka-producer-network-thread"; if (clientId != null) { ioThreadName = ioThreadName + " | " + clientId; } this.ioThread = new KafkaThread(ioThreadName, this.sender, true); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1692) [Java New Producer] IO Thread Name Must include Client ID
[ https://issues.apache.org/jira/browse/KAFKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167169#comment-14167169 ] Ewen Cheslack-Postava commented on KAFKA-1692: -- Created reviewboard https://reviews.apache.org/r/26563/diff/ against branch origin/trunk [Java New Producer] IO Thread Name Must include Client ID --- Key: KAFKA-1692 URL: https://issues.apache.org/jira/browse/KAFKA-1692 Project: Kafka Issue Type: Improvement Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Jun Rao Priority: Trivial Labels: newbie Attachments: KAFKA-1692.patch Please add the client id so people looking at JConsole or a profiling tool can see the thread by client id, since a single JVM can have multiple producer instances. org.apache.kafka.clients.producer.KafkaProducer {code} String ioThreadName = "kafka-producer-network-thread"; if (clientId != null) { ioThreadName = ioThreadName + " | " + clientId; } this.ioThread = new KafkaThread(ioThreadName, this.sender, true); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
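The effect of the requested change is easy to see outside Kafka; an illustrative sketch using a plain Thread (KafkaThread is Kafka-internal, so it is not used here):

```java
// Illustrative only: how a per-client thread name makes multiple producer
// instances distinguishable in jstack/JConsole output.
public class ThreadNaming {
    static String ioThreadName(String clientId) {
        String name = "kafka-producer-network-thread";
        return clientId != null ? name + " | " + clientId : name;
    }
    public static void main(String[] args) {
        Thread t = new Thread(() -> {}, ioThreadName("producer-1"));
        t.setDaemon(true); // matches the daemon flag passed to KafkaThread
        System.out.println(t.getName()); // kafka-producer-network-thread | producer-1
    }
}
```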
Review Request 26564: Patch for KAFKA-1471
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26564/ --- Review request for kafka. Bugs: KAFKA-1471 https://issues.apache.org/jira/browse/KAFKA-1471 Repository: kafka Description --- KAFKA-1471 Add producer unit tests for LZ4 and LZ4HC compression codecs Diffs - clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 79d57f9bf31606ffa5400f2f12356eba84703cc2 core/src/main/scala/kafka/tools/ConsoleProducer.scala 8e9ba0b284671989f87d9c421bc98f5c4384c260 core/src/test/scala/integration/kafka/api/ProducerCompressionTest.scala 17e2c6e9dfd789acb4b6db37c780c862667e4e11 Diff: https://reviews.apache.org/r/26564/diff/ Testing --- Thanks, Ewen Cheslack-Postava
[jira] [Updated] (KAFKA-1471) Add Producer Unit Tests for LZ4 and LZ4HC compression
[ https://issues.apache.org/jira/browse/KAFKA-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1471: - Attachment: KAFKA-1471.patch Add Producer Unit Tests for LZ4 and LZ4HC compression - Key: KAFKA-1471 URL: https://issues.apache.org/jira/browse/KAFKA-1471 Project: Kafka Issue Type: Sub-task Reporter: James Oliver Fix For: 0.8.2 Attachments: KAFKA-1471.patch, KAFKA-1471.patch, KAFKA-1471.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1305) Controller can hang on controlled shutdown with auto leader balance enabled
[ https://issues.apache.org/jira/browse/KAFKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167179#comment-14167179 ] Jun Rao commented on KAFKA-1305: Yes, in theory, we can make the queue unbounded. However, in practice, the queue shouldn't build up. I was a bit concerned that if we make the queue unbounded and another issue causes the queue to build up, we may hit an OOME. Then we may not be able to take a thread dump to diagnose the issue. Controller can hang on controlled shutdown with auto leader balance enabled --- Key: KAFKA-1305 URL: https://issues.apache.org/jira/browse/KAFKA-1305 Project: Kafka Issue Type: Bug Reporter: Joel Koshy Assignee: Sriharsha Chintalapani Priority: Blocker Fix For: 0.8.2, 0.9.0 Attachments: KAFKA-1305.patch
[jira] [Commented] (KAFKA-1471) Add Producer Unit Tests for LZ4 and LZ4HC compression
[ https://issues.apache.org/jira/browse/KAFKA-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167178#comment-14167178 ] Ewen Cheslack-Postava commented on KAFKA-1471: -- Created reviewboard https://reviews.apache.org/r/26564/diff/ against branch origin/trunk Add Producer Unit Tests for LZ4 and LZ4HC compression - Key: KAFKA-1471 URL: https://issues.apache.org/jira/browse/KAFKA-1471 Project: Kafka Issue Type: Sub-task Reporter: James Oliver Fix For: 0.8.2 Attachments: KAFKA-1471.patch, KAFKA-1471.patch, KAFKA-1471.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1471) Add Producer Unit Tests for LZ4 and LZ4HC compression
[ https://issues.apache.org/jira/browse/KAFKA-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167183#comment-14167183 ] Ewen Cheslack-Postava commented on KAFKA-1471: -- This updated version applies cleanly (just had a bit of fuzz), removes a few unnecessarily changed lines, and fixes some typos. Add Producer Unit Tests for LZ4 and LZ4HC compression - Key: KAFKA-1471 URL: https://issues.apache.org/jira/browse/KAFKA-1471 Project: Kafka Issue Type: Sub-task Reporter: James Oliver Assignee: Ewen Cheslack-Postava Fix For: 0.8.2 Attachments: KAFKA-1471.patch, KAFKA-1471.patch, KAFKA-1471.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1583) Kafka API Refactoring
[ https://issues.apache.org/jira/browse/KAFKA-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167187#comment-14167187 ] Guozhang Wang commented on KAFKA-1583: -- Sure, will do that asap. Kafka API Refactoring - Key: KAFKA-1583 URL: https://issues.apache.org/jira/browse/KAFKA-1583 Project: Kafka Issue Type: Bug Reporter: Guozhang Wang Assignee: Guozhang Wang Fix For: 0.9.0 Attachments: KAFKA-1583.patch, KAFKA-1583_2014-08-20_13:54:38.patch, KAFKA-1583_2014-08-21_11:30:34.patch, KAFKA-1583_2014-08-27_09:44:50.patch, KAFKA-1583_2014-09-01_18:07:42.patch, KAFKA-1583_2014-09-02_13:37:47.patch, KAFKA-1583_2014-09-05_14:08:36.patch, KAFKA-1583_2014-09-05_14:55:38.patch This is the next step of KAFKA-1430. Details can be found at this page: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+API+Refactoring -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (KAFKA-1670) Corrupt log files for segment.bytes values close to Int.MaxInt
[ https://issues.apache.org/jira/browse/KAFKA-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang reopened KAFKA-1670: -- Corrupt log files for segment.bytes values close to Int.MaxInt -- Key: KAFKA-1670 URL: https://issues.apache.org/jira/browse/KAFKA-1670 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ryan Berdeen Assignee: Sriharsha Chintalapani Priority: Blocker Fix For: 0.8.2 Attachments: KAFKA-1670.patch, KAFKA-1670_2014-10-04_20:17:46.patch, KAFKA-1670_2014-10-06_09:48:25.patch, KAFKA-1670_2014-10-07_13:39:13.patch, KAFKA-1670_2014-10-07_13:49:10.patch, KAFKA-1670_2014-10-07_18:39:31.patch The maximum value for the topic-level config {{segment.bytes}} is {{Int.MaxInt}} (2147483647). *Using this value causes brokers to corrupt their log files, leaving them unreadable.* We set {{segment.bytes}} to {{2122317824}} which is well below the maximum. One by one, the ISR of all partitions shrunk to 1. Brokers would crash when restarted, attempting to read from a negative offset in a log file. After discovering that many segment files had grown to 4GB or more, we were forced to shut down our *entire production Kafka cluster* for several hours while we split all segment files into 1GB chunks. Looking into the {{kafka.log}} code, the {{segment.bytes}} parameter is used inconsistently. It is treated as a *soft* maximum for the size of the segment file (https://github.com/apache/kafka/blob/0.8.1.1/core/src/main/scala/kafka/log/LogConfig.scala#L26) with logs rolled only after (https://github.com/apache/kafka/blob/0.8.1.1/core/src/main/scala/kafka/log/Log.scala#L246) they exceed this value. However, much of the code that deals with log files uses *ints* to store the size of the file and the position in the file. Overflow of these ints leads the broker to append to the segments indefinitely, and to fail to read these segments for consuming or recovery. 
This is trivial to reproduce: {code} $ bin/kafka-topics.sh --topic segment-bytes-test --create --replication-factor 2 --partitions 1 --zookeeper zkhost:2181 $ bin/kafka-topics.sh --topic segment-bytes-test --alter --config segment.bytes=2147483647 --zookeeper zkhost:2181 $ yes Int.MaxValue is a ridiculous bound on file size in 2014 | bin/kafka-console-producer.sh --broker-list localhost:6667 zkhost:2181 --topic segment-bytes-test {code} After running for a few minutes, the log file is corrupt: {code} $ ls -lh data/segment-bytes-test-0/ total 9.7G -rw-r--r-- 1 root root 10M Oct 3 19:39 .index -rw-r--r-- 1 root root 9.7G Oct 3 19:39 .log {code} We recovered the data from the log files using a simple Python script: https://gist.github.com/also/9f823d9eb9dc0a410796 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
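The "negative offset" crashes follow directly from signed 32-bit arithmetic: once the int tracking a segment's size or position passes Int.MaxValue, it wraps negative. A minimal demonstration of the wraparound (not Kafka's code, just the arithmetic):

```java
// Why segments grew past 2 GB unreadably: file sizes/positions tracked as int
// silently wrap past Integer.MAX_VALUE and turn negative, so the broker later
// computes negative read offsets.
public class IntOverflow {
    static int advance(int position, int messageSize) {
        return position + messageSize; // silently wraps, no exception
    }
    public static void main(String[] args) {
        int pos = Integer.MAX_VALUE - 10;   // near the 2147483647 segment cap
        int next = advance(pos, 100);
        System.out.println(next);           // -2147483559: a "negative offset"
    }
}
```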
[jira] [Commented] (KAFKA-1670) Corrupt log files for segment.bytes values close to Int.MaxInt
[ https://issues.apache.org/jira/browse/KAFKA-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167219#comment-14167219 ] Guozhang Wang commented on KAFKA-1670: -- Hi [~sriharsha] [~junrao] this patch causes some regression errors in system tests, including the replication / mirror maker test suites (you can reproduce it easily with 5001/2/3). The log entries I saw from the producer: {code} [2014-10-10 05:32:46,714] ERROR Error when sending message to topic test_1 with key: 1 bytes, value: 500 bytes with error: The request included message batch larger than the configured segment size on the server. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) [2014-10-10 05:32:46,714] ERROR Error when sending message to topic test_1 with key: 1 bytes, value: 500 bytes with error: The request included message batch larger than the configured segment size on the server. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) [2014-10-10 05:32:46,714] ERROR Error when sending message to topic test_1 with key: 1 bytes, value: 500 bytes with error: The request included message batch larger than the configured segment size on the server. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) [2014-10-10 05:32:46,714] ERROR Error when sending message to topic test_1 with key: 1 bytes, value: 500 bytes with error: The request included message batch larger than the configured segment size on the server. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) [2014-10-10 05:32:46,714] ERROR Error when sending message to topic test_1 with key: 1 bytes, value: 500 bytes with error: The request included message batch larger than the configured segment size on the server. (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) {code} Corrupt log files for segment.bytes values close to Int.MaxInt -- Key: KAFKA-1670 URL: https://issues.apache.org/jira/browse/KAFKA-1670 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ryan Berdeen Assignee: Sriharsha Chintalapani Priority: Blocker Fix For: 0.8.2 Attachments: KAFKA-1670.patch, KAFKA-1670_2014-10-04_20:17:46.patch, KAFKA-1670_2014-10-06_09:48:25.patch, KAFKA-1670_2014-10-07_13:39:13.patch, KAFKA-1670_2014-10-07_13:49:10.patch, KAFKA-1670_2014-10-07_18:39:31.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1670) Corrupt log files for segment.bytes values close to Int.MaxInt
[ https://issues.apache.org/jira/browse/KAFKA-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167223#comment-14167223 ] Sriharsha Chintalapani commented on KAFKA-1670: --- [~guozhang] Sorry, will fix those. Corrupt log files for segment.bytes values close to Int.MaxInt -- Key: KAFKA-1670 URL: https://issues.apache.org/jira/browse/KAFKA-1670 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1.1 Reporter: Ryan Berdeen Assignee: Sriharsha Chintalapani Priority: Blocker Fix For: 0.8.2 Attachments: KAFKA-1670.patch, KAFKA-1670_2014-10-04_20:17:46.patch, KAFKA-1670_2014-10-06_09:48:25.patch, KAFKA-1670_2014-10-07_13:39:13.patch, KAFKA-1670_2014-10-07_13:49:10.patch, KAFKA-1670_2014-10-07_18:39:31.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 26566: Patch for KAFKA-1680
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26566/ --- Review request for kafka. Bugs: KAFKA-1680 https://issues.apache.org/jira/browse/KAFKA-1680 Repository: kafka Description --- KAFKA-1680 Standardize command line argument parsing and usage messages. At its heart, this was just a test of args.length that was invalid for this command, but 6b0ae4bba0d introduced the same potential issue across all the command line tools. This standardizes all the command line tools on a cleaner parsing pattern by pushing most of the work into CommandLineUtils and printing usage info for any type of parsing exception. Ideally, the long-term solution would be to use a newer version of joptsimple that allows us to express constraints on arguments to get almost all command line option issues resolved at parse time. Diffs - core/src/main/scala/kafka/admin/PreferredReplicaLeaderElectionCommand.scala c7918483c02040a7cc18d6e9edbd20a3025a3a55 core/src/main/scala/kafka/admin/ReassignPartitionsCommand.scala 691d69a49a240f38883d2025afaec26fd61281b5 core/src/main/scala/kafka/admin/TopicCommand.scala 7672c5aab4fba8c23b1bb5cd4785c332d300a3fa core/src/main/scala/kafka/tools/ConsoleConsumer.scala 323fc8566d974acc4e5c7d7c2a065794f3b5df4a core/src/main/scala/kafka/tools/ConsoleProducer.scala 8e9ba0b284671989f87d9c421bc98f5c4384c260 core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala d1e7c434e77859d746b8dc68dd5d5a3740425e79 core/src/main/scala/kafka/tools/ConsumerPerformance.scala 093c800ea7f8a9c972bb66e99ac4e4d431cf11cc core/src/main/scala/kafka/tools/DumpLogSegments.scala 8e9d47b8d4adc5754ed8861aa04ddd3c6b629e3d core/src/main/scala/kafka/tools/ExportZkOffsets.scala 4d051bc2db12f0e15aa6a3289abeb9dd25d373d2 core/src/main/scala/kafka/tools/GetOffsetShell.scala 3d9293e4abbe3f4a4a2bc5833385747c604d5a95 core/src/main/scala/kafka/tools/ImportZkOffsets.scala abe09721b13f71320510fd1a01c1917470450c6e core/src/main/scala/kafka/tools/JmxTool.scala 
1d1a120c45ff70fbd60df5b147ca230eb1ef50de core/src/main/scala/kafka/tools/MirrorMaker.scala b8698ee1469c8fbc92ccc176d916eb3e28b87867 core/src/main/scala/kafka/tools/ProducerPerformance.scala f61c7c701fd85caabc2d2950a7b02aa85e5cdfe3 core/src/main/scala/kafka/tools/ReplayLogProducer.scala 3393a3dd574ac45a27bf7eda646b737146c55038 core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala ba6ddd7a909df79a0f7d45e8b4a2af94ea0fceb6 core/src/main/scala/kafka/tools/SimpleConsumerPerformance.scala 7602b8d705970a5dab49ed36d117346a960701ac core/src/main/scala/kafka/tools/SimpleConsumerShell.scala b4f903b6c7c3bb725cac7c05eb1f885906413c4d core/src/main/scala/kafka/tools/StateChangeLogMerger.scala d298e7e81acc7427c6cf4796b445966267ca54eb core/src/main/scala/kafka/tools/TestLogCleaning.scala 1d4ea93f2ba8d4d4d47a307cd47f54a15d3d30dd core/src/main/scala/kafka/tools/VerifyConsumerRebalance.scala aef8361b73a0934641fc4f5cee942b5b50f3e7d7 core/src/main/scala/kafka/utils/CommandLineUtils.scala 086a62483fad0c9cfc7004ff94c890cfb9929fa6 core/src/main/scala/kafka/utils/ToolsUtils.scala fef93929ea03e181f87fe294c06d9bc9fc823e9e core/src/test/scala/other/kafka/TestLinearWriteSpeed.scala 7211c2529c1db76100432737da7a1d1d221dfba0 Diff: https://reviews.apache.org/r/26566/diff/ Testing --- Thanks, Ewen Cheslack-Postava
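The actual patch is Scala code built on joptsimple, but the pattern it standardizes on (all options optional where the tool requires none, and a usage message for any parsing failure instead of a bare crash) can be illustrated language-neutrally. The sketch below uses Python's argparse; the tool name and option names are invented for illustration, not taken from the patch:

```python
import argparse

def parse_args(argv):
    """Sketch of the centralized parsing pattern: parse options and, on any
    parse error, fall back to a usage message instead of a stack trace."""
    parser = argparse.ArgumentParser(prog="jmx-tool-sketch")
    # Every option is optional, so running with an empty argv is valid --
    # the original bug was an args.length check that rejected no arguments.
    parser.add_argument("--object-name", help="JMX MBean to query")
    parser.add_argument("--reporting-interval", type=int, default=2000)
    try:
        return parser.parse_args(argv)
    except SystemExit:
        # argparse has already printed a usage message for the parse error,
        # mirroring "print usage info for any type of parsing exception".
        return None

opts = parse_args([])               # no arguments: must succeed
print(opts.reporting_interval)
print(parse_args(["--reporting-interval", "abc"]))  # parse error: usage printed, None
```

The same shape applies to each tool in the diff list: defaults live with the option declarations, and the shared helper (CommandLineUtils in the patch) owns the error path.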
[jira] [Updated] (KAFKA-1680) JmxTool exits if no arguments are given
[ https://issues.apache.org/jira/browse/KAFKA-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ewen Cheslack-Postava updated KAFKA-1680:

    Attachment: KAFKA-1680.patch

JmxTool exits if no arguments are given

    Key: KAFKA-1680
    URL: https://issues.apache.org/jira/browse/KAFKA-1680
    Project: Kafka
    Issue Type: Bug
    Affects Versions: 0.8.2
    Reporter: Ryan Berdeen
    Priority: Minor
    Attachments: KAFKA-1680.patch

JmxTool has no required arguments, but it exits if no arguments are provided. You can work around this by passing a non-option argument, which will be ignored, e.g. {{./bin/kafka-run-class.sh kafka.tools.JmxTool xxx}}. It looks like this was broken in KAFKA-1291 / 6b0ae4bba0d0f8e4c8da19de65a8f03f162bec39.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1680) JmxTool exits if no arguments are given
[ https://issues.apache.org/jira/browse/KAFKA-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ewen Cheslack-Postava updated KAFKA-1680:

    Assignee: Ewen Cheslack-Postava
    Status: Patch Available (was: Open)

JmxTool exits if no arguments are given

    Key: KAFKA-1680
    URL: https://issues.apache.org/jira/browse/KAFKA-1680
    Project: Kafka
    Issue Type: Bug
    Affects Versions: 0.8.2
    Reporter: Ryan Berdeen
    Assignee: Ewen Cheslack-Postava
    Priority: Minor
    Attachments: KAFKA-1680.patch

JmxTool has no required arguments, but it exits if no arguments are provided. You can work around this by passing a non-option argument, which will be ignored, e.g. {{./bin/kafka-run-class.sh kafka.tools.JmxTool xxx}}. It looks like this was broken in KAFKA-1291 / 6b0ae4bba0d0f8e4c8da19de65a8f03f162bec39.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1680) JmxTool exits if no arguments are given
[ https://issues.apache.org/jira/browse/KAFKA-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167228#comment-14167228 ]

Ewen Cheslack-Postava commented on KAFKA-1680:

Created reviewboard https://reviews.apache.org/r/26566/diff/ against branch origin/trunk

JmxTool exits if no arguments are given

    Key: KAFKA-1680
    URL: https://issues.apache.org/jira/browse/KAFKA-1680
    Project: Kafka
    Issue Type: Bug
    Affects Versions: 0.8.2
    Reporter: Ryan Berdeen
    Priority: Minor
    Attachments: KAFKA-1680.patch

JmxTool has no required arguments, but it exits if no arguments are provided. You can work around this by passing a non-option argument, which will be ignored, e.g. {{./bin/kafka-run-class.sh kafka.tools.JmxTool xxx}}. It looks like this was broken in KAFKA-1291 / 6b0ae4bba0d0f8e4c8da19de65a8f03f162bec39.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167261#comment-14167261 ]

Gwen Shapira commented on KAFKA-1555:

Will be happy to do that. Is the wiki the right place?

provide strong consistency with reasonable availability

    Key: KAFKA-1555
    URL: https://issues.apache.org/jira/browse/KAFKA-1555
    Project: Kafka
    Issue Type: Improvement
    Components: controller
    Affects Versions: 0.8.1.1
    Reporter: Jiang Wu
    Assignee: Gwen Shapira
    Fix For: 0.8.2
    Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch

In a mission-critical application, we expect a Kafka cluster with 3 brokers to satisfy two requirements:
1. When 1 broker is down, no message loss or service blocking happens.
2. In worse cases, such as two brokers being down, service can be blocked, but no message loss happens.

We found that the current Kafka version (0.8.1.1) cannot achieve these requirements due to three of its behaviors:
1. When choosing a new leader from 2 followers in the ISR, the one with fewer messages may be chosen as the leader.
2. Even when replica.lag.max.messages=0, a follower can stay in the ISR when it has fewer messages than the leader.
3. The ISR can contain only 1 broker, therefore acknowledged messages may be stored on only 1 broker.

The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning all 3 replicas (leader A, followers B and C) are in sync, i.e., they have the same messages and are all in the ISR. According to the value of request.required.acks (acks for short), there are the following cases:

1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in the ISR. If A is killed, C can be elected as the new leader, and consumers will miss m.
3. acks=-1. B and C restart and are removed from the ISR. Producer sends a message m to A, and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost.

In summary, no existing configuration can satisfy the requirements.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
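Case 2 of the proof (acks=2 losing an acknowledged message) can be traced concretely. The sketch below is a toy model of the scenario, not Kafka code; replica names follow the proof above:

```python
# Toy model of case 2 (acks=2): three replicas, leader A, followers B and C.
replicas = {"A": [], "B": [], "C": []}
isr = {"A", "B", "C"}

# Producer sends m; leader A and follower B persist it. C lags behind but,
# per behavior 2 in the issue description, remains in the ISR.
for r in ("A", "B"):
    replicas[r].append("m")
acks_received = 2          # A + B satisfy acks=2, so the producer treats m as committed

# Leader A is killed. Under the 0.8.1.1 rules described above, any ISR
# member may be elected, including the lagging C (behavior 1).
isr.discard("A")
new_leader = "C"
assert new_leader in isr   # a legal election
print("m" in replicas[new_leader])  # -> False: the acknowledged message m is gone
```

The model makes the failure mechanism explicit: the acknowledgement counts replicas that have m, while leader election only consults ISR membership, and nothing ties the two together.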
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167464#comment-14167464 ]

Sriram Subramanian commented on KAFKA-1555:

My vote would be to update our documentation - http://kafka.apache.org/documentation.html. It currently refers to 0.8.1. We should make 0.8.2 the current one after the release. The Design section can have a Guarantees portion that talks about what guarantees Kafka gives w.r.t. consistency vs. availability, and when. What do the rest think?

provide strong consistency with reasonable availability

    Key: KAFKA-1555
    URL: https://issues.apache.org/jira/browse/KAFKA-1555
    Project: Kafka
    Issue Type: Improvement
    Components: controller
    Affects Versions: 0.8.1.1
    Reporter: Jiang Wu
    Assignee: Gwen Shapira
    Fix For: 0.8.2
    Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch

In a mission-critical application, we expect a Kafka cluster with 3 brokers to satisfy two requirements:
1. When 1 broker is down, no message loss or service blocking happens.
2. In worse cases, such as two brokers being down, service can be blocked, but no message loss happens.

We found that the current Kafka version (0.8.1.1) cannot achieve these requirements due to three of its behaviors:
1. When choosing a new leader from 2 followers in the ISR, the one with fewer messages may be chosen as the leader.
2. Even when replica.lag.max.messages=0, a follower can stay in the ISR when it has fewer messages than the leader.
3. The ISR can contain only 1 broker, therefore acknowledged messages may be stored on only 1 broker.

The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning all 3 replicas (leader A, followers B and C) are in sync, i.e., they have the same messages and are all in the ISR. According to the value of request.required.acks (acks for short), there are the following cases:

1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in the ISR. If A is killed, C can be elected as the new leader, and consumers will miss m.
3. acks=-1. B and C restart and are removed from the ISR. Producer sends a message m to A, and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost.

In summary, no existing configuration can satisfy the requirements.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-1700) examples directory - README and shell scripts are out of date
Geoffrey Anderson created KAFKA-1700:

    Summary: examples directory - README and shell scripts are out of date
    Key: KAFKA-1700
    URL: https://issues.apache.org/jira/browse/KAFKA-1700
    Project: Kafka
    Issue Type: Bug
    Affects Versions: 0.8.1
    Reporter: Geoffrey Anderson
    Priority: Minor
    Fix For: 0.8.2

sbt build files were removed during resolution of KAFKA-1254, so the README under the examples directory should no longer make reference to sbt. Also, the paths added to the CLASSPATH variable in the example shell script are no longer correct.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 26575: Fix for KAFKA-1700
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26575/
---

Review request for kafka.

Bugs: KAFKA-1700
    https://issues.apache.org/jira/browse/KAFKA-1700

Repository: kafka

Description
---

Tweaked README, updated shell scripts so they call kafka-run-class.sh instead of manually adding to the CLASSPATH.

Diffs
---

  examples/README 61de2868de29e7c04811bfe12ccabc50b45d148e
  examples/bin/java-producer-consumer-demo.sh 29e01c2dcf82365c28fef0836e1771282cb49bc1
  examples/bin/java-simple-consumer-demo.sh 4716a098c7d404477e0e7254e65e0f509e9df92e

Diff: https://reviews.apache.org/r/26575/diff/

Testing
---

Thanks,

Geoffrey Anderson
[jira] [Updated] (KAFKA-1700) examples directory - README and shell scripts are out of date
[ https://issues.apache.org/jira/browse/KAFKA-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Geoffrey Anderson updated KAFKA-1700:

    Attachment: KAFKA-1700.patch

examples directory - README and shell scripts are out of date

    Key: KAFKA-1700
    URL: https://issues.apache.org/jira/browse/KAFKA-1700
    Project: Kafka
    Issue Type: Bug
    Affects Versions: 0.8.1
    Reporter: Geoffrey Anderson
    Priority: Minor
    Labels: newbie
    Fix For: 0.8.2
    Attachments: KAFKA-1700.patch

sbt build files were removed during resolution of KAFKA-1254, so the README under the examples directory should no longer make reference to sbt. Also, the paths added to the CLASSPATH variable in the example shell script are no longer correct.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1700) examples directory - README and shell scripts are out of date
[ https://issues.apache.org/jira/browse/KAFKA-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167501#comment-14167501 ]

Geoffrey Anderson commented on KAFKA-1700:

Created reviewboard https://reviews.apache.org/r/26575/diff/ against branch origin/trunk

examples directory - README and shell scripts are out of date

    Key: KAFKA-1700
    URL: https://issues.apache.org/jira/browse/KAFKA-1700
    Project: Kafka
    Issue Type: Bug
    Affects Versions: 0.8.1
    Reporter: Geoffrey Anderson
    Priority: Minor
    Labels: newbie
    Fix For: 0.8.2
    Attachments: KAFKA-1700.patch

sbt build files were removed during resolution of KAFKA-1254, so the README under the examples directory should no longer make reference to sbt. Also, the paths added to the CLASSPATH variable in the example shell script are no longer correct.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-1701) Improve controller and broker message handling.
Jiangjie Qin created KAFKA-1701:

    Summary: Improve controller and broker message handling.
    Key: KAFKA-1701
    URL: https://issues.apache.org/jira/browse/KAFKA-1701
    Project: Kafka
    Issue Type: Improvement
    Reporter: Jiangjie Qin

This ticket is a memo for future controller refactoring. It is related to KAFKA-1547. Ideally, the broker should only follow instructions from the controller rather than handling them smartly. For KAFKA-1547, the controller should filter out the partitions whose leader is not up yet before sending a LeaderAndIsrRequest to the broker. The idea is that the controller should handle all the edge cases instead of letting the broker do it.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
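The proposed controller-side filtering is straightforward to picture. The toy sketch below (invented data, not controller code) shows the single step the ticket asks the controller to take before building a LeaderAndIsrRequest:

```python
# Toy model: the controller drops partitions whose leader broker is not yet
# alive, so the receiving broker never has to handle that edge case itself.
live_brokers = {1, 2}                                  # broker 3 not up yet
partition_leaders = {"t-0": 1, "t-1": 3, "t-2": 2}     # partition -> leader id

# Filter before sending: only partitions with a live leader go into the request.
leader_and_isr = {p: l for p, l in partition_leaders.items() if l in live_brokers}
print(leader_and_isr)  # t-1 is withheld until broker 3 comes up
```

This keeps the broker's handling dumb and uniform, which is exactly the division of responsibility the ticket argues for.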
Re: including KAFKA-1555 in 0.8.2?
+1

On Fri, Oct 10, 2014 at 04:53:45PM +, Sriram Subramanian wrote:
> +1
>
> On 10/10/14 9:39 AM, Gwen Shapira gshap...@cloudera.com wrote:
> > +1 :)
> >
> > On Fri, Oct 10, 2014 at 9:34 AM, Joe Stein joe.st...@stealth.ly wrote:
> > > +1
> > >
> > > On Oct 10, 2014 12:08 PM, Neha Narkhede neha.narkh...@gmail.com wrote:
> > > > +1.
> > > >
> > > > On Thu, Oct 9, 2014 at 6:41 PM, Jun Rao jun...@gmail.com wrote:
> > > > > Hi, Everyone,
> > > > >
> > > > > I just committed KAFKA-1555 (min.isr support) to trunk. I felt that it's
> > > > > probably useful to include it in the 0.8.2 release. Any objections?
> > > > >
> > > > > Thanks,
> > > > > Jun
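The min.isr support being voted into 0.8.2 here targets the thin-ISR failure mode from KAFKA-1555: with acks=-1, a write is rejected rather than acknowledged when the ISR has shrunk below a minimum size. A toy sketch of the rule (not broker code; the config name mirrors the min.insync.replicas setting, which is an assumption of this sketch):

```python
# Toy model of the min.isr rule: with acks=-1, refuse writes when the ISR is
# too small, trading availability for a guarantee against silent data loss.
min_insync_replicas = 2

def try_append(isr, min_isr):
    """Return the broker's response to an acks=-1 produce request."""
    if len(isr) < min_isr:
        return "NotEnoughReplicas"  # producer gets an error, not a lossy ack
    return "ack"

print(try_append({"A"}, min_insync_replicas))        # ISR shrank to the leader alone
print(try_append({"A", "B"}, min_insync_replicas))   # enough replicas in sync
```

Under this rule, case 3 of the KAFKA-1555 analysis (a message acknowledged while stored on a single broker) can no longer occur, at the cost of blocking writes while the ISR is degraded.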
[jira] [Assigned] (KAFKA-1634) Update protocol wiki to reflect the new offset management feature
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy reassigned KAFKA-1634:

    Assignee: Joel Koshy (was: Jun Rao)

Update protocol wiki to reflect the new offset management feature

    Key: KAFKA-1634
    URL: https://issues.apache.org/jira/browse/KAFKA-1634
    Project: Kafka
    Issue Type: Bug
    Reporter: Neha Narkhede
    Assignee: Joel Koshy
    Priority: Blocker
    Fix For: 0.8.2

From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed, and isn't it now always handled server-side? Would one of the devs mind updating the API spec wiki?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1634) Update protocol wiki to reflect the new offset management feature
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167696#comment-14167696 ]

Joel Koshy commented on KAFKA-1634:

Actually, one more potential source of confusion is that we use OffsetAndMetadata for both offset commit requests and offset fetch responses. I.e., an OffsetFetchResponse will contain: offset, metadata, and this timestamp field. The timestamp field should really be ignored. It is annoying to document such things - i.e., to tell users to just ignore the field. Ideally, I think we should do the following:

* Remove the timestamp from the OffsetAndMetadata class
* Move it to the top level of the OffsetCommitRequest and rename it to retentionMs
* The broker will compute the absolute time (based off time of receipt) at which the offset should expire
* That absolute time will continue to be stored in the offsets topic, and the cleanup thread can remove offsets that are past their TTL
* OffsetFetchResponse will just return OffsetAndMetadata (no timestamp)

We (LinkedIn, and possibly others) have already deployed this to some of our consumers, but if we can bump up the protocol version when doing the above and translate requests that come in with the older version, I think it should be okay.

Update protocol wiki to reflect the new offset management feature

    Key: KAFKA-1634
    URL: https://issues.apache.org/jira/browse/KAFKA-1634
    Project: Kafka
    Issue Type: Bug
    Reporter: Neha Narkhede
    Assignee: Joel Koshy
    Priority: Blocker
    Fix For: 0.8.2

From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed, and isn't it now always handled server-side? Would one of the devs mind updating the API spec wiki?
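The retention scheme proposed in the bullet list above (broker converts a per-request retentionMs into an absolute expiry at time of receipt, stores it alongside the offset, and a cleanup pass drops entries past their TTL) can be sketched end to end. Class and method names below are illustrative only, not the actual Kafka API:

```python
import time

class OffsetStore:
    """Toy broker-side offset store for the proposed retentionMs scheme."""

    def __init__(self):
        # (group, topic, partition) -> (offset, metadata, expires_at)
        self._offsets = {}

    def commit(self, key, offset, metadata, retention_ms, now=None):
        now = time.time() if now is None else now
        # The broker computes the absolute expiry from its own receipt time,
        # instead of trusting a client-side timestamp.
        expires_at = now + retention_ms / 1000.0
        self._offsets[key] = (offset, metadata, expires_at)

    def fetch(self, key):
        # Per the proposal, the fetch response carries only (offset, metadata);
        # the expiry time never leaves the broker.
        entry = self._offsets.get(key)
        return None if entry is None else entry[:2]

    def cleanup(self, now=None):
        # The cleanup thread removes offsets that are past their TTL.
        now = time.time() if now is None else now
        for k in [k for k, (_, _, exp) in self._offsets.items() if exp <= now]:
            del self._offsets[k]

store = OffsetStore()
store.commit(("g1", "t", 0), 42, "", retention_ms=1000, now=0)
print(store.fetch(("g1", "t", 0)))   # -> (42, '')
store.cleanup(now=2)                 # 2s later: past the 1s TTL
print(store.fetch(("g1", "t", 0)))   # -> None
```

The version-bump point in the comment then reduces to a translation layer: old-format requests carrying a client timestamp would be mapped onto an equivalent retention before reaching a store like this.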
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1634) Update protocol wiki to reflect the new offset management feature
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167700#comment-14167700 ]

Joel Koshy commented on KAFKA-1634:

(BTW, I'm looking for a +1 or -1 on the above comment :) )

Update protocol wiki to reflect the new offset management feature

    Key: KAFKA-1634
    URL: https://issues.apache.org/jira/browse/KAFKA-1634
    Project: Kafka
    Issue Type: Bug
    Reporter: Neha Narkhede
    Assignee: Joel Koshy
    Priority: Blocker
    Fix For: 0.8.2

From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed, and isn't it now always handled server-side? Would one of the devs mind updating the API spec wiki?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1634) Improve semantics of timestamp in OffsetCommitRequests and update documentation
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-1634:

    Summary: Improve semantics of timestamp in OffsetCommitRequests and update documentation (was: Improve semantics of timestamp in OffsetCommitRequests)

Improve semantics of timestamp in OffsetCommitRequests and update documentation

    Key: KAFKA-1634
    URL: https://issues.apache.org/jira/browse/KAFKA-1634
    Project: Kafka
    Issue Type: Bug
    Reporter: Neha Narkhede
    Assignee: Joel Koshy
    Priority: Blocker
    Fix For: 0.8.2

From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed, and isn't it now always handled server-side? Would one of the devs mind updating the API spec wiki?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1634) Improve semantics of timestamp in OffsetCommitRequests
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Koshy updated KAFKA-1634:

    Summary: Improve semantics of timestamp in OffsetCommitRequests (was: Update protocol wiki to reflect the new offset management feature)

Improve semantics of timestamp in OffsetCommitRequests

    Key: KAFKA-1634
    URL: https://issues.apache.org/jira/browse/KAFKA-1634
    Project: Kafka
    Issue Type: Bug
    Reporter: Neha Narkhede
    Assignee: Joel Koshy
    Priority: Blocker
    Fix For: 0.8.2

From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed, and isn't it now always handled server-side? Would one of the devs mind updating the API spec wiki?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[DISCUSSION] Message Metadata
Hello all,

I put some thoughts on enhancing our current message metadata format to solve a bunch of existing issues:

https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Enriched+Message+Metadata

This wiki page is for kicking off some discussions about the feasibility of adding more info into the message header, and if possible how we would add them.

-- Guozhang