[jira] [Commented] (KAFKA-1772) Add an Admin message type for request response
[ https://issues.apache.org/jira/browse/KAFKA-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14229799#comment-14229799 ] Joe Stein commented on KAFKA-1772: -- 1) Agreed. 2) I think we can do away with the format field. If we use JSON or byte structures it will be one or the other, not both, and structured, so there is no need for a field saying which. 3) Yes, let's document the message format(s) in the main section https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements 4) We definitely need to support an immediate response and polling for status. It might be best to just do this for all of the commands, never block at the broker, and keep a single pattern on the wire. The layer above the calls will definitely want/need a --sync type option, which should live completely in the CLI (IMHO) and loop checking status (every few ms, configurable/overridable); in some cases (like topic-related commands) it should be the default, because that is the experience now. We could also have some commands be blocking and some not.
Add an Admin message type for request response -- Key: KAFKA-1772 URL: https://issues.apache.org/jira/browse/KAFKA-1772 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Andrii Biletskyi Fix For: 0.8.3 Attachments: KAFKA-1772.patch
- utility int8
- command int8
- format int8
- args variable length bytes
utility: 0 - Broker, 1 - Topic, 2 - Replication, 3 - Controller, 4 - Consumer, 5 - Producer
command: 0 - Create, 1 - Alter, 3 - Delete, 4 - List, 5 - Audit
format: 0 - JSON
args e.g. (which would equate to the data structure values == 2,1,0):
meta-store: { zookeeper: localhost:12913/kafka }
args: { partitions: [ {topic: topic1, partition: 0}, {topic: topic1, partition: 1}, {topic: topic1, partition: 2}, {topic: topic2, partition: 0}, {topic: topic2, partition: 1} ] }
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
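To make the --sync idea in point 4 concrete, here is a rough CLI-side sketch of that polling loop. All type and method names (AdminClient, CommandStatus, describeCommand) are hypothetical placeholders, since the admin wire format is still being designed in this thread; only the poll/sleep/timeout shape is the point.

    import java.util.concurrent.TimeUnit;

    // Hypothetical CLI-side polling loop for a --sync option. The AdminClient and
    // CommandStatus types are illustrative placeholders, not real Kafka APIs.
    public final class SyncCommandRunner {

        interface CommandStatus { boolean isComplete(); }

        interface AdminClient { CommandStatus describeCommand(long commandId); }

        static void runSync(AdminClient admin, long commandId, long pollIntervalMs, long timeoutMs)
                throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (System.currentTimeMillis() < deadline) {
                CommandStatus status = admin.describeCommand(commandId); // ask the controller for status
                if (status.isComplete()) {
                    return;                                              // command finished, --sync returns
                }
                TimeUnit.MILLISECONDS.sleep(pollIntervalMs);             // configurable poll interval
            }
            throw new IllegalStateException("Timed out waiting for admin command " + commandId);
        }
    }

A --sync flag implemented this way keeps the broker non-blocking and leaves the waiting policy (poll interval, timeout) entirely in the CLI, as the comment suggests.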
[jira] [Updated] (KAFKA-1737) Document required ZkSerializer for ZkClient used with AdminUtils
[ https://issues.apache.org/jira/browse/KAFKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vivek Madani updated KAFKA-1737: Reviewer: Guozhang Wang Status: Patch Available (was: Open) Hi Guozhang, please find the patch attached. Since the patch is straightforward, I doubt it warrants unit tests. Let me know if you think otherwise. Document required ZkSerializer for ZkClient used with AdminUtils Key: KAFKA-1737 URL: https://issues.apache.org/jira/browse/KAFKA-1737 Project: Kafka Issue Type: Improvement Components: tools Affects Versions: 0.8.1.1 Reporter: Stevo Slavic Assignee: Vivek Madani Priority: Minor {{ZkClient}} instances passed to {{AdminUtils}} calls must have {{kafka.utils.ZKStringSerializer}} set as the {{ZkSerializer}}. Otherwise commands executed via {{AdminUtils}} may not be visible to or recognized by the broker, producer or consumer. E.g. a producer (with auto topic creation turned off) will not be able to send messages to a topic created via {{AdminUtils}}; it will throw {{UnknownTopicOrPartitionException}}. Please consider at least documenting this requirement in the {{AdminUtils}} scaladoc. For more info see [related discussion on Kafka user mailing list|http://mail-archives.apache.org/mod_mbox/kafka-users/201410.mbox/%3CCAAUywg-oihNiXuQRYeS%3D8Z3ymsmEHo6ghLs%3Dru4nbm%2BdHVz6TA%40mail.gmail.com%3E]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
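For reference, the usage the ticket asks to document looks roughly like this against the 0.8.1.x APIs (the ZooKeeper connect string, topic name and counts below are made up; from Java the Scala object is reached as ZKStringSerializer$.MODULE$):

    import java.util.Properties;

    import kafka.admin.AdminUtils;
    import kafka.utils.ZKStringSerializer$;
    import org.I0Itec.zkclient.ZkClient;

    public final class CreateTopicExample {
        public static void main(String[] args) {
            // The serializer argument is the part this ticket asks to document: without
            // kafka.utils.ZKStringSerializer, the znodes AdminUtils writes are not
            // readable by brokers, producers or consumers.
            ZkClient zkClient = new ZkClient("localhost:2181", 10000, 10000,
                    ZKStringSerializer$.MODULE$);
            try {
                AdminUtils.createTopic(zkClient, "my-topic", 3, 1, new Properties());
            } finally {
                zkClient.close();
            }
        }
    }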
[jira] [Updated] (KAFKA-1737) Document required ZkSerializer for ZkClient used with AdminUtils
[ https://issues.apache.org/jira/browse/KAFKA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vivek Madani updated KAFKA-1737: Attachment: KAFKA-1737.patch Document required ZkSerializer for ZkClient used with AdminUtils Key: KAFKA-1737 URL: https://issues.apache.org/jira/browse/KAFKA-1737 Project: Kafka Issue Type: Improvement Components: tools Affects Versions: 0.8.1.1 Reporter: Stevo Slavic Assignee: Vivek Madani Priority: Minor Attachments: KAFKA-1737.patch {{ZkClient}} instances passed to {{AdminUtils}} calls must have {{kafka.utils.ZKStringSerializer}} set as {{ZkSerializer}}. Otherwise commands executed via {{AdminUtils}} may not be seen/recognizable to broker, producer or consumer. E.g. producer (with auto topic creation turned off) will not be able to send messages to a topic created via {{AdminUtils}}, it will throw {{UnknownTopicOrPartitionException}}. Please consider at least documenting this requirement in {{AdminUtils}} scaladoc. For more info see [related discussion on Kafka user mailing list|http://mail-archives.apache.org/mod_mbox/kafka-users/201410.mbox/%3CCAAUywg-oihNiXuQRYeS%3D8Z3ymsmEHo6ghLs%3Dru4nbm%2BdHVz6TA%40mail.gmail.com%3E]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1789) Issue with Async producer
[ https://issues.apache.org/jira/browse/KAFKA-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14229936#comment-14229936 ] Jiangjie Qin commented on KAFKA-1789: - Hmm, that looks a little bit strange... Can you double check to confirm that you are using the async producer instead of the sync producer? Issue with Async producer - Key: KAFKA-1789 URL: https://issues.apache.org/jira/browse/KAFKA-1789 Project: Kafka Issue Type: Bug Affects Versions: 0.8.0, 0.8.1 Reporter: devendra tagare Priority: Critical Hi, We are using an async producer to send data to a kafka cluster. The event rate at peak is around 250 events/second of size 25KB each. In the producer code base we have added specific debug statements to capture the time taken to create a producer, create a keyed message with a byte payload, and send the message. We have added the below properties to the producerConfig:
queue.enqueue.timeout.ms=20
send.buffer.bytes=1024000
topic.metadata.refresh.interval.ms=3
Based on the documentation, producer.send() queues the message on the async producer's queue. So, ideally, if the queue is full then the enqueue operation should result in a kafka.common.QueueFullException in 20 ms. The logs indicate that the enqueue operation is taking more than 20 ms (around 250 ms) without throwing any exceptions. Is there any other property that could conflict with queue.enqueue.timeout.ms and cause this behavior? Or is it possible that the queue is not full yet the producer.send() call is still taking around 200 ms under peak load? Also, could you suggest any other alternatives so that we can either enforce a timeout or throw an exception in case the async producer is taking more than a specified amount of time. Regards, Dev -- This message was sent by Atlassian JIRA (v6.3.4#6332)
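For reference, the async settings from the report map onto the 0.8 Scala producer's Java API roughly as below (broker list, topic, key and payload size are placeholders). With producer.type=async, send() only enqueues, and queue.enqueue.timeout.ms bounds how long that enqueue may block on a full queue before throwing kafka.common.QueueFullException.

    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public final class AsyncProducerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("metadata.broker.list", "broker1:9092");             // placeholder broker list
            props.put("serializer.class", "kafka.serializer.DefaultEncoder");
            props.put("key.serializer.class", "kafka.serializer.StringEncoder");
            props.put("producer.type", "async");                           // send() enqueues to an in-memory queue
            props.put("queue.enqueue.timeout.ms", "20");                   // block at most 20 ms, then QueueFullException

            Producer<String, byte[]> producer = new Producer<>(new ProducerConfig(props));
            long start = System.nanoTime();
            producer.send(new KeyedMessage<>("my-topic", "key", new byte[25 * 1024]));
            System.out.printf("enqueue took %.1f ms%n", (System.nanoTime() - start) / 1e6);
            producer.close();
        }
    }

If send() itself takes ~250 ms under this configuration, timing the call directly like this (rather than the whole produce path) helps confirm where the delay is.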
[jira] [Commented] (KAFKA-1802) Add a new type of request for the discovery of the controller
[ https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230001#comment-14230001 ] Andrii Biletskyi commented on KAFKA-1802: - Proposed RQ/RP format:
===
ControllerDiscoveryRequest =    // Comment: empty response body
ControllerDiscoveryResponse = [Broker] Controller
Controller = Broker
Broker = NodeId Host Port
NodeId = int32
Host = string
Port = int32
// Comment: controller will be included in [Broker] list, for simplicity
===
I'll update A Guide To The Kafka Protocol if there will be no major remarks. Add a new type of request for the discovery of the controller - Key: KAFKA-1802 URL: https://issues.apache.org/jira/browse/KAFKA-1802 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Fix For: 0.8.3 The goal here is, like metadata discovery for the producer, to let the CLI find which broker it should send the rest of its admin requests to. Any broker can respond to this specific AdminMeta RQ/RP, but only the controller broker should respond to admin messages; any other broker should respond to an admin message with a response indicating which broker the controller is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-1802) Add a new type of request for the discovery of the controller
[ https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230001#comment-14230001 ] Andrii Biletskyi edited comment on KAFKA-1802 at 12/1/14 4:37 PM: -- Proposed RQ/RP format: === ControllerDiscoveryRequest = // Comment: empty request body ControllerDiscoveryResponse = [Broker] Controller Controller = Broker Broker = NodeId Host Port NodeId = int32 Host = string Port = int32 // Comment: controller will be included in [Broker] list, for simplicity === I'll update A Guide To The Kafka Protocol if there will be no major remarks. was (Author: abiletskyi): Proposed RQ/RP format: === ControllerDiscoveryRequest = // Comment: empty response body ControllerDiscoveryResponse = [Broker] Controller Controller = Broker Broker = NodeId Host Port NodeId = int32 Host = string Port = int32 // Comment: controller will be included in [Broker] list, for simplicity === I'll update A Guide To The Kafka Protocol if there will be no major remarks. Add a new type of request for the discovery of the controller - Key: KAFKA-1802 URL: https://issues.apache.org/jira/browse/KAFKA-1802 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Fix For: 0.8.3 The goal here is like meta data discovery is for producer so CLI can find which broker it should send the rest of its admin requests too. Any broker can respond to this specific AdminMeta RQ/RP but only the controller broker should be responding to Admin message otherwise that broker should respond to any admin message with the response for what the controller is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1802) Add a new type of request for the discovery of the controller
[ https://issues.apache.org/jira/browse/KAFKA-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230037#comment-14230037 ] Joe Stein commented on KAFKA-1802: -- Let's not update A Guide To The Kafka Protocol until everything is baked in and committed (that is its final resting place), but please do update https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Command+Line+and+Related+Improvements so discussion of all message changes/additions for this can happen there. Add a new type of request for the discovery of the controller - Key: KAFKA-1802 URL: https://issues.apache.org/jira/browse/KAFKA-1802 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Fix For: 0.8.3 The goal here is, like metadata discovery for the producer, to let the CLI find which broker it should send the rest of its admin requests to. Any broker can respond to this specific AdminMeta RQ/RP, but only the controller broker should respond to admin messages; any other broker should respond to an admin message with a response indicating which broker the controller is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1800) KafkaException was not recorded at the per-topic metrics
[ https://issues.apache.org/jira/browse/KAFKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230111#comment-14230111 ] Jay Kreps commented on KAFKA-1800: -- Is it important to count this exception at the topic level? Maybe just count it at the global level? KafkaException was not recorded at the per-topic metrics Key: KAFKA-1800 URL: https://issues.apache.org/jira/browse/KAFKA-1800 Project: Kafka Issue Type: Bug Reporter: Guozhang Wang Assignee: Guozhang Wang Fix For: 0.9.0 Attachments: KAFKA-1800.patch When a KafkaException is thrown from the producer.send() call, it is not recorded on the per-topic record-error-rate, only the global error-rate. Since users usually monitor the per-topic metrics, losing all dropped-message counts at this level that are caused by producer-thrown exceptions such as BufferExhaustedException could be very dangerous. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1800) KafkaException was not recorded at the per-topic metrics
[ https://issues.apache.org/jira/browse/KAFKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230151#comment-14230151 ] Guozhang Wang commented on KAFKA-1800: -- It is actually quite important to monitor topic-level error/retry rates for producer-thrown exceptions, since many users do not monitor global-level metrics but only the topics they are interested in. KafkaException was not recorded at the per-topic metrics Key: KAFKA-1800 URL: https://issues.apache.org/jira/browse/KAFKA-1800 Project: Kafka Issue Type: Bug Reporter: Guozhang Wang Assignee: Guozhang Wang Fix For: 0.9.0 Attachments: KAFKA-1800.patch When a KafkaException is thrown from the producer.send() call, it is not recorded on the per-topic record-error-rate, only the global error-rate. Since users usually monitor the per-topic metrics, losing all dropped-message counts at this level that are caused by producer-thrown exceptions such as BufferExhaustedException could be very dangerous. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
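What is being asked for amounts to recording each dropped message against both a global counter and its topic's counter. Schematically (plain Java bookkeeping, not the Kafka Metrics/Sensor API used by the actual patch):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.LongAdder;

    // Schematic only: on a producer-side exception, bump both the global error
    // count and the per-topic error count, so per-topic dashboards see the drop.
    public final class ErrorRecorder {
        private final LongAdder globalErrors = new LongAdder();
        private final Map<String, LongAdder> perTopicErrors = new ConcurrentHashMap<>();

        public void recordError(String topic) {
            globalErrors.increment();
            perTopicErrors.computeIfAbsent(topic, t -> new LongAdder()).increment();
        }

        public long globalErrorCount() {
            return globalErrors.sum();
        }

        public long topicErrorCount(String topic) {
            LongAdder adder = perTopicErrors.get(topic);
            return adder == null ? 0L : adder.sum();
        }
    }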
[jira] [Commented] (KAFKA-1783) Missing slash in documentation for the Zookeeper paths in ZookeeperConsumerConnector
[ https://issues.apache.org/jira/browse/KAFKA-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230160#comment-14230160 ] Guozhang Wang commented on KAFKA-1783: -- Thanks for the patch, committed to trunk. Missing slash in documentation for the Zookeeper paths in ZookeeperConsumerConnector Key: KAFKA-1783 URL: https://issues.apache.org/jira/browse/KAFKA-1783 Project: Kafka Issue Type: Bug Components: consumer Reporter: Jean-Francois Im Assignee: Neha Narkhede Priority: Trivial Attachments: kafka-missing-doc-slash.patch The documentation for the ZookeeperConsumerConnector refers to the consumer id registry location as /consumers/[group_id]/ids[consumer_id]; it should be /consumers/[group_id]/ids/[consumer_id], as evidenced by registerConsumerInZK() and TopicCount.scala line 61. A patch is provided that adds the missing forward slash. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KAFKA-1783) Missing slash in documentation for the Zookeeper paths in ZookeeperConsumerConnector
[ https://issues.apache.org/jira/browse/KAFKA-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang resolved KAFKA-1783. -- Resolution: Fixed Assignee: Jean-Francois Im (was: Neha Narkhede) Missing slash in documentation for the Zookeeper paths in ZookeeperConsumerConnector Key: KAFKA-1783 URL: https://issues.apache.org/jira/browse/KAFKA-1783 Project: Kafka Issue Type: Bug Components: consumer Reporter: Jean-Francois Im Assignee: Jean-Francois Im Priority: Trivial Attachments: kafka-missing-doc-slash.patch The documentation for the ZookeeperConsumerConnector refers to the consumer id registry location as /consumers/[group_id]/ids[consumer_id]; it should be /consumers/[group_id]/ids/[consumer_id], as evidenced by registerConsumerInZK() and TopicCount.scala line 61. A patch is provided that adds the missing forward slash. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments
[ https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230236#comment-14230236 ] Neha Narkhede commented on KAFKA-1792: -- [~Dmitry Pekar] Left some comments on the rb. At a high level, it will help to know the source of the algorithm. Is this published or well tested someplace else? If this is completely new, we'd have to do the due diligence of generating enough test cases, beyond unit tests, to verify that the algorithm achieves its objectives (fairness and minimum data movement). Would you be up for sharing such test results? The metrics for such tests would be the # of replicas per broker and the total replica movement. change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments - Key: KAFKA-1792 URL: https://issues.apache.org/jira/browse/KAFKA-1792 Project: Kafka Issue Type: Sub-task Components: tools Reporter: Dmitry Pekar Assignee: Dmitry Pekar Fix For: 0.8.3 Attachments: KAFKA-1792.patch The current implementation produces a fair replica distribution across the specified list of brokers. Unfortunately, it doesn't take the current replica assignment into account. So if we have, for instance, 3 brokers id=[0..2] and are going to add a fourth broker id=3, --generate will create an assignment config which redistributes replicas fairly across brokers [0..3] in the same way as if those partitions were created from scratch. It will not take the current replica assignment into consideration and accordingly will not try to minimize the number of replica moves between brokers. As proposed by [~charmalloc] this should be improved. The new output of the improved --generate algorithm should satisfy the following requirements:
- fairness of replica distribution - every broker will have R or R+1 replicas assigned;
- minimum of reassignments - the number of replica moves between brokers will be minimal;
Example. Consider the following replica distribution across brokers [0..3] (we just added brokers 2 and 3):
- broker - 0, 1, 2, 3
- replicas - 7, 6, 0, 0
The new algorithm will produce the following assignment:
- broker - 0, 1, 2, 3
- replicas - 4, 3, 3, 3
- moves - -3, -3, +3, +3
It will be fair and the number of moves will be 6, which is minimal for the specified initial distribution. The scope of this issue is:
- design an algorithm matching the above requirements;
- implement this algorithm and unit tests;
- test it manually using different initial assignments;
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
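The stated objectives (R or R+1 replicas per broker, minimal moves) can be checked with a small greedy sketch like the one below; this is only an illustration of the objective on the example from the ticket, not the algorithm in the attached patch.

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Illustrative greedy calculation: given replica counts per broker, compute a
    // fair target (R or R+1 per broker) and the minimal number of moves to reach it.
    public final class FairRebalanceSketch {

        static Map<Integer, Integer> targetCounts(Map<Integer, Integer> current) {
            int totalReplicas = current.values().stream().mapToInt(Integer::intValue).sum();
            int brokers = current.size();
            int base = totalReplicas / brokers;        // every broker gets at least R
            int extras = totalReplicas % brokers;      // 'extras' brokers get R + 1
            Map<Integer, Integer> target = new LinkedHashMap<>();
            for (Integer broker : current.keySet()) {
                target.put(broker, base + (extras-- > 0 ? 1 : 0));
            }
            return target;
        }

        static int minimalMoves(Map<Integer, Integer> current, Map<Integer, Integer> target) {
            // Each move takes one replica from an over-target broker to an under-target
            // broker, so the minimal move count is the total surplus.
            int moves = 0;
            for (Map.Entry<Integer, Integer> e : current.entrySet()) {
                moves += Math.max(0, e.getValue() - target.get(e.getKey()));
            }
            return moves;
        }

        public static void main(String[] args) {
            Map<Integer, Integer> current = new LinkedHashMap<>();
            current.put(0, 7); current.put(1, 6); current.put(2, 0); current.put(3, 0);
            Map<Integer, Integer> target = targetCounts(current);
            // Prints {0=4, 1=3, 2=3, 3=3}, moves = 6 -- matching the example in the ticket.
            System.out.println(target + ", moves = " + minimalMoves(current, target));
        }
    }

Counting replicas per broker and total replica movement in exactly this way would also give the test metrics mentioned in the review comment above.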
[jira] [Commented] (KAFKA-1753) add --decommission-broker option
[ https://issues.apache.org/jira/browse/KAFKA-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230239#comment-14230239 ] Joe Stein commented on KAFKA-1753: -- So, I think with the changes in KAFKA-1792 this essentially becomes the same type of algorithm, except that the partitions on the broker being decommissioned are what we are evenly spreading across the rest of the cluster nodes. add --decommission-broker option Key: KAFKA-1753 URL: https://issues.apache.org/jira/browse/KAFKA-1753 Project: Kafka Issue Type: Sub-task Components: tools Reporter: Dmitry Pekar Assignee: Dmitry Pekar Fix For: 0.8.3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28481: Patch for KAFKA-1792
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28481/#review63409 --- core/src/main/scala/kafka/admin/AdminUtils.scala https://reviews.apache.org/r/28481/#comment105663 Will help to clarify that this groups the reassignment by partition. core/src/main/scala/kafka/admin/AdminUtils.scala https://reviews.apache.org/r/28481/#comment105662 It will help to document the algorithm with examples in detail here. Makes it much easier to understand the code core/src/main/scala/kafka/admin/ReassignPartitionsCommand.scala https://reviews.apache.org/r/28481/#comment105664 Shouldn't we get rid of the 2 ways of generating reassignments? - Neha Narkhede On Nov. 26, 2014, 9:09 p.m., Dmitry Pekar wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28481/ --- (Updated Nov. 26, 2014, 9:09 p.m.) Review request for kafka. Bugs: KAFKA-1792 https://issues.apache.org/jira/browse/KAFKA-1792 Repository: kafka Description --- KAFKA-1792: change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments Diffs - core/src/main/scala/kafka/admin/AdminUtils.scala 28b12c7b89a56c113b665fbde1b95f873f8624a3 core/src/main/scala/kafka/admin/ReassignPartitionsCommand.scala 979992b68af3723cd229845faff81c641123bb88 core/src/test/scala/unit/kafka/admin/AdminTest.scala e28979827110dfbbb92fe5b152e7f1cc973de400 topics.json ff011ed381e781b9a177036001d44dca3eac586f Diff: https://reviews.apache.org/r/28481/diff/ Testing --- Thanks, Dmitry Pekar
Re: [SECURITY DISCUSSION] Refactoring Brokers to support multiple ports
+1 to doing this. Can you make it a sub-ticket of the security ticket when you create the JIRA for this (unless you did that already and I missed it)? I made one comment on the Confluence page regarding the JSON returned; otherwise this matches what we already agreed on and discussed, and I like how it breaks the work down into smaller chunks so that the different implementations can utilize it.
/*** Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop /
On Tue, Nov 25, 2014 at 2:13 PM, Gwen Shapira gshap...@cloudera.com wrote:
Hi Everyone, One of the prerequisites we have for supporting multiple security protocols (SSL, Kerberos) is to support them on separate ports. This is done in KAFKA-1684 (the SSL patch), but that patch addresses several different issues - multiple ports, enriching the channels, the SSL implementation - which makes it more challenging to review and to test. I'd like to split this into 3 separate patches: multi-port brokers, enriching SocketChannel, and the actual security implementations. Since even just adding support for multiple listeners per broker is somewhat involved and touches multiple components, I wrote a short design document that covers the necessary changes and the upgrade process: https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers Comments are more than welcome :) If this is acceptable, I hope to have a patch ready in a few days. Gwen Shapira
Re: Review Request 27391: Fix KAFKA-1634
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27391/ --- (Updated Dec. 1, 2014, 7:44 p.m.) Review request for kafka. Bugs: KAFKA-1634 https://issues.apache.org/jira/browse/KAFKA-1634 Repository: kafka Description (updated) --- Add another api in offset manager to return the struct, and the cache layer will only read its expiration timestamp while the offset formatter will read the struct as a whole Diffs (updated) - clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 7517b879866fc5dad5f8d8ad30636da8bbe7784a clients/src/main/java/org/apache/kafka/common/requests/OffsetCommitRequest.java 3ee5cbad55ce836fd04bb954dcf6ef2f9bc3288f clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java df37fc6d8f0db0b8192a948426af603be3444da4 core/src/main/scala/kafka/api/OffsetCommitRequest.scala 050615c72efe7dbaa4634f53943bd73273d20ffb core/src/main/scala/kafka/api/OffsetFetchRequest.scala c7604b9cdeb8f46507f04ed7d35fcb3d5f150713 core/src/main/scala/kafka/common/OffsetMetadataAndError.scala 4cabffeacea09a49913505db19a96a55d58c0909 core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala da29a8cb461099eb675161db2f11a9937424a5c6 core/src/main/scala/kafka/server/KafkaApis.scala 2a1c0326b6e6966d8b8254bd6a1cb83ad98a3b80 core/src/main/scala/kafka/server/KafkaServer.scala 1bf7d10cef23a77e71eb16bf6d0e68bc4ebe core/src/main/scala/kafka/server/OffsetManager.scala 3c79428962604800983415f6f705e04f52acb8fb core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala cd16ced5465d098be7a60498326b2a98c248f343 core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 8c5364fa97da1be09973c176d1baeb339455d319 Diff: https://reviews.apache.org/r/27391/diff/ Testing --- Thanks, Guozhang Wang
[jira] [Updated] (KAFKA-1634) Improve semantics of timestamp in OffsetCommitRequests and update documentation
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang updated KAFKA-1634: - Attachment: KAFKA-1634_2014-12-01_11:44:35.patch Improve semantics of timestamp in OffsetCommitRequests and update documentation --- Key: KAFKA-1634 URL: https://issues.apache.org/jira/browse/KAFKA-1634 Project: Kafka Issue Type: Bug Reporter: Neha Narkhede Assignee: Guozhang Wang Priority: Blocker Fix For: 0.8.3 Attachments: KAFKA-1634.patch, KAFKA-1634_2014-11-06_15:35:46.patch, KAFKA-1634_2014-11-07_16:54:33.patch, KAFKA-1634_2014-11-17_17:42:44.patch, KAFKA-1634_2014-11-21_14:00:34.patch, KAFKA-1634_2014-12-01_11:44:35.patch From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed and now always handled server-side now? Would one of the devs mind updating the API spec wiki? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1634) Improve semantics of timestamp in OffsetCommitRequests and update documentation
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230317#comment-14230317 ] Guozhang Wang commented on KAFKA-1634: -- Updated reviewboard https://reviews.apache.org/r/27391/diff/ against branch origin/trunk Improve semantics of timestamp in OffsetCommitRequests and update documentation --- Key: KAFKA-1634 URL: https://issues.apache.org/jira/browse/KAFKA-1634 Project: Kafka Issue Type: Bug Reporter: Neha Narkhede Assignee: Guozhang Wang Priority: Blocker Fix For: 0.8.3 Attachments: KAFKA-1634.patch, KAFKA-1634_2014-11-06_15:35:46.patch, KAFKA-1634_2014-11-07_16:54:33.patch, KAFKA-1634_2014-11-17_17:42:44.patch, KAFKA-1634_2014-11-21_14:00:34.patch, KAFKA-1634_2014-12-01_11:44:35.patch From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed and now always handled server-side now? Would one of the devs mind updating the API spec wiki? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost
[ https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1642: - Attachment: KAFKA-1642.patch [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost --- Key: KAFKA-1642 URL: https://issues.apache.org/jira/browse/KAFKA-1642 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Ewen Cheslack-Postava Priority: Blocker Fix For: 0.8.2 Attachments: 0001-Initial-CPU-Hish-Usage-by-Kafka-FIX-and-Also-fix-CLO.patch, KAFKA-1642.patch, KAFKA-1642.patch, KAFKA-1642_2014-10-20_17:33:57.patch, KAFKA-1642_2014-10-23_16:19:41.patch I see my CPU spike to 100% when network connection is lost for while. It seems network IO thread are very busy logging following error message. Is this expected behavior ? 2014-09-17 14:06:16.830 [kafka-producer-network-thread] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: java.lang.IllegalStateException: No entry found for node -2 at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:110) at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:99) at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:394) at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:380) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:174) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:175) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115) at java.lang.Thread.run(Thread.java:744) Thanks, Bhavesh -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost
[ https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230709#comment-14230709 ] Ewen Cheslack-Postava commented on KAFKA-1642: -- Created reviewboard https://reviews.apache.org/r/28582/diff/ against branch origin/trunk [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost --- Key: KAFKA-1642 URL: https://issues.apache.org/jira/browse/KAFKA-1642 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Ewen Cheslack-Postava Priority: Blocker Fix For: 0.8.2 Attachments: 0001-Initial-CPU-Hish-Usage-by-Kafka-FIX-and-Also-fix-CLO.patch, KAFKA-1642.patch, KAFKA-1642.patch, KAFKA-1642_2014-10-20_17:33:57.patch, KAFKA-1642_2014-10-23_16:19:41.patch I see my CPU spike to 100% when network connection is lost for while. It seems network IO thread are very busy logging following error message. Is this expected behavior ? 2014-09-17 14:06:16.830 [kafka-producer-network-thread] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: java.lang.IllegalStateException: No entry found for node -2 at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:110) at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:99) at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:394) at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:380) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:174) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:175) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115) at java.lang.Thread.run(Thread.java:744) Thanks, Bhavesh -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost
[ https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava updated KAFKA-1642: - Status: Patch Available (was: Reopened) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost --- Key: KAFKA-1642 URL: https://issues.apache.org/jira/browse/KAFKA-1642 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Ewen Cheslack-Postava Priority: Blocker Fix For: 0.8.2 Attachments: 0001-Initial-CPU-Hish-Usage-by-Kafka-FIX-and-Also-fix-CLO.patch, KAFKA-1642.patch, KAFKA-1642.patch, KAFKA-1642_2014-10-20_17:33:57.patch, KAFKA-1642_2014-10-23_16:19:41.patch I see my CPU spike to 100% when network connection is lost for while. It seems network IO thread are very busy logging following error message. Is this expected behavior ? 2014-09-17 14:06:16.830 [kafka-producer-network-thread] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: java.lang.IllegalStateException: No entry found for node -2 at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:110) at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:99) at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:394) at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:380) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:174) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:175) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115) at java.lang.Thread.run(Thread.java:744) Thanks, Bhavesh -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 28582: Patch for KAFKA-1642
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28582/ --- Review request for kafka. Bugs: KAFKA-1642 https://issues.apache.org/jira/browse/KAFKA-1642 Repository: kafka Description --- KAFKA-1642 Mark nodes as connecting before making the connect request so exceptions are handled properly. KAFKA-1642 Update the last time no nodes were available when updating metadata has to initiate a connection. KAFKA-1642 Don't busy wait trying to send metadata to a node that is connected but has too many in flight requests. Diffs - clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 525b95e98010cd2053eacd8c321d079bcac2f910 Diff: https://reviews.apache.org/r/28582/diff/ Testing --- Thanks, Ewen Cheslack-Postava
[jira] [Commented] (KAFKA-1642) [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost
[ https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230715#comment-14230715 ] Ewen Cheslack-Postava commented on KAFKA-1642: -- Attached a new patch that fixes all the timeout issues I'm aware of. Here's how it addresses each of the situations I listed earlier:
1. lastNoNodeAvailableMs is updated, which forces the metadata timeout for each poll to use a backoff period.
2. Added another backoff value based on metadataFetchInProgress. Since the request actually made it out, this can be arbitrarily large -- we just need to see some sort of response or failure for the request.
3. Requires some response to arrive to clear out space, so we can wait arbitrarily long. Updating lastNoNodeAvailableMs works even though it may wake up sooner than necessary. But making all cases that didn't send the data use a single approach keeps the code simpler.
4a. This can happen if, e.g., the network interface has been taken down entirely. After fixing the ordering of marking the node as connecting and issuing the request, this cleans up after that error cleanly. lastNoNodeAvailableMs is updated since there *weren't* any nodes available. This triggers a backoff period during which the connection won't be retried.
4b. This can be handled in the same way - we set lastNoNodeAvailableMs whether or not we immediately saw an error from the connection request. This causes it to sleep while waiting for the connection request. This may wake up before we get connected. However, if the node is still in the connecting state, it'll be ignored during the next round and we'll either start trying to connect to another node or we'll end up in state 1 with no nodes available. Either way, we still only wake up periodically based on this timeout.
5. Looking more carefully at how leastLoadedNode works, this case isn't actually possible.
One additional note -- apparently you can't use Long.MAX_VALUE as a timeout; it throws an exception. That's why Integer.MAX_VALUE is there instead. We could also detect the large value and convert it to a negative value instead, which the underlying API treats as having no timeout. [~Bmis13] can you test this out for the failure modes you found?
[Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network connection is lost --- Key: KAFKA-1642 URL: https://issues.apache.org/jira/browse/KAFKA-1642 Project: Kafka Issue Type: Bug Components: producer Affects Versions: 0.8.2 Reporter: Bhavesh Mistry Assignee: Ewen Cheslack-Postava Priority: Blocker Fix For: 0.8.2 Attachments: 0001-Initial-CPU-Hish-Usage-by-Kafka-FIX-and-Also-fix-CLO.patch, KAFKA-1642.patch, KAFKA-1642.patch, KAFKA-1642_2014-10-20_17:33:57.patch, KAFKA-1642_2014-10-23_16:19:41.patch I see my CPU spike to 100% when the network connection is lost for a while. It seems the network I/O thread is very busy logging the following error message. Is this expected behavior?
2014-09-17 14:06:16.830 [kafka-producer-network-thread] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: java.lang.IllegalStateException: No entry found for node -2 at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:110) at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:99) at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:394) at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:380) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:174) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:175) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115) at java.lang.Thread.run(Thread.java:744) Thanks, Bhavesh -- This message was sent by Atlassian JIRA (v6.3.4#6332)
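The bookkeeping described in points 1-4 above can be pictured in isolation with a simplified sketch (this is only an illustration of the backoff idea, not the actual NetworkClient code from the patch):

    // Simplified illustration: every path that fails to get a metadata request out
    // records a failure timestamp, and the poll timeout is derived from it, so the
    // I/O thread sleeps instead of spinning at 100% CPU.
    public final class MetadataBackoffSketch {

        private final long retryBackoffMs;
        private long lastNoNodeAvailableMs = 0L;
        private boolean metadataFetchInProgress = false;

        public MetadataBackoffSketch(long retryBackoffMs) {
            this.retryBackoffMs = retryBackoffMs;
        }

        // Called whenever a metadata request could not be sent (cases 1-4 above).
        public void recordNoNodeAvailable(long nowMs) {
            lastNoNodeAvailableMs = nowMs;
        }

        public void metadataRequestSent()          { metadataFetchInProgress = true; }
        public void metadataResponseOrDisconnect() { metadataFetchInProgress = false; }

        // Upper bound on how long poll() may block before rechecking metadata.
        public long pollTimeoutMs(long nowMs) {
            if (metadataFetchInProgress) {
                // A request is in flight; just wait for a response or a disconnect.
                return Integer.MAX_VALUE;
            }
            long sinceLastFailure = nowMs - lastNoNodeAvailableMs;
            return Math.max(0L, retryBackoffMs - sinceLastFailure);
        }
    }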
[jira] [Assigned] (KAFKA-742) Existing directories under the Kafka data directory without any data cause process to not start
[ https://issues.apache.org/jira/browse/KAFKA-742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Kumar Singh reassigned KAFKA-742: Assignee: Ashish Kumar Singh Existing directories under the Kafka data directory without any data cause process to not start --- Key: KAFKA-742 URL: https://issues.apache.org/jira/browse/KAFKA-742 Project: Kafka Issue Type: Bug Components: config Affects Versions: 0.8.0 Reporter: Chris Curtin Assignee: Ashish Kumar Singh I incorrectly setup the configuration file to have the metrics go to /var/kafka/metrics while the logs were in /var/kafka. On startup I received the following error then the daemon exited: 30 [main] INFO kafka.log.LogManager - [Log Manager on Broker 0] Loading log 'metrics' 32 [main] FATAL kafka.server.KafkaServerStartable - Fatal error during KafkaServerStable startup. Prepare to shutdown java.lang.StringIndexOutOfBoundsException: String index out of range: -1 at java.lang.String.substring(String.java:1937) at kafka.log.LogManager.kafka$log$LogManager$$parseTopicPartitionName(LogManager.scala:335) at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$3.apply(LogManager.scala:112) at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$3.apply(LogManager.scala:109) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34) at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:34) at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:109) at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:101) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34) at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:32) at kafka.log.LogManager.loadLogs(LogManager.scala:101) at kafka.log.LogManager.init(LogManager.scala:62) at kafka.server.KafkaServer.startup(KafkaServer.scala:59) at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34) at kafka.Kafka$.main(Kafka.scala:46) at kafka.Kafka.main(Kafka.scala) 34 [main] INFO kafka.server.KafkaServer - [Kafka Server 0], shutting down This was on a brand new cluster so no data or metrics logs existed yet. Moving the metrics to their own directory (not a child of the logs) allowed the daemon to start. Took a few minutes to figure out what was wrong. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
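The failure comes from parsing a directory name that has no '-' separator: the StringIndexOutOfBoundsException with index -1 in the trace above is substring() being handed the result of a failed index lookup. A defensive version of that parsing might look like the following (standalone sketch, not the actual LogManager code or the eventual fix):

    // Illustrative only: parse a "<topic>-<partition>" directory name, failing with a
    // clear message (or allowing a "skip this directory" policy) instead of an index error.
    public final class TopicPartitionDirName {

        static String[] parse(String dirName) {
            int dash = dirName.lastIndexOf('-');
            if (dash < 1 || dash == dirName.length() - 1) {
                throw new IllegalArgumentException(
                        "Directory '" + dirName + "' is not of the form <topic>-<partition>");
            }
            return new String[] { dirName.substring(0, dash), dirName.substring(dash + 1) };
        }

        public static void main(String[] args) {
            System.out.println(String.join(" / ", parse("topic1-0")));  // topic1 / 0
            try {
                parse("metrics");                                       // the directory from the report
            } catch (IllegalArgumentException e) {
                System.out.println(e.getMessage());                     // clear message instead of an index error
            }
        }
    }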
[jira] [Commented] (KAFKA-742) Existing directories under the Kafka data directory without any data cause process to not start
[ https://issues.apache.org/jira/browse/KAFKA-742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230737#comment-14230737 ] Ashish Kumar Singh commented on KAFKA-742: -- [~jkreps], [~chriscurtin], I would like to take a stab at this. Assigning it to myself. Existing directories under the Kafka data directory without any data cause process to not start --- Key: KAFKA-742 URL: https://issues.apache.org/jira/browse/KAFKA-742 Project: Kafka Issue Type: Bug Components: config Affects Versions: 0.8.0 Reporter: Chris Curtin Assignee: Ashish Kumar Singh I incorrectly setup the configuration file to have the metrics go to /var/kafka/metrics while the logs were in /var/kafka. On startup I received the following error then the daemon exited: 30 [main] INFO kafka.log.LogManager - [Log Manager on Broker 0] Loading log 'metrics' 32 [main] FATAL kafka.server.KafkaServerStartable - Fatal error during KafkaServerStable startup. Prepare to shutdown java.lang.StringIndexOutOfBoundsException: String index out of range: -1 at java.lang.String.substring(String.java:1937) at kafka.log.LogManager.kafka$log$LogManager$$parseTopicPartitionName(LogManager.scala:335) at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$3.apply(LogManager.scala:112) at kafka.log.LogManager$$anonfun$loadLogs$1$$anonfun$apply$3.apply(LogManager.scala:109) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34) at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:34) at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:109) at kafka.log.LogManager$$anonfun$loadLogs$1.apply(LogManager.scala:101) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34) at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:32) at kafka.log.LogManager.loadLogs(LogManager.scala:101) at kafka.log.LogManager.init(LogManager.scala:62) at kafka.server.KafkaServer.startup(KafkaServer.scala:59) at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34) at kafka.Kafka$.main(Kafka.scala:46) at kafka.Kafka.main(Kafka.scala) 34 [main] INFO kafka.server.KafkaServer - [Kafka Server 0], shutting down This was on a brand new cluster so no data or metrics logs existed yet. Moving the metrics to their own directory (not a child of the logs) allowed the daemon to start. Took a few minutes to figure out what was wrong. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1799) ProducerConfig.METRIC_REPORTER_CLASSES_CONFIG doesn't work
[ https://issues.apache.org/jira/browse/KAFKA-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Rao updated KAFKA-1799: --- Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the patch. +1 and committed to trunk and 0.8.2. ProducerConfig.METRIC_REPORTER_CLASSES_CONFIG doesn't work -- Key: KAFKA-1799 URL: https://issues.apache.org/jira/browse/KAFKA-1799 Project: Kafka Issue Type: Bug Affects Versions: 0.8.2 Reporter: Jun Rao Assignee: Manikumar Reddy Priority: Blocker Labels: newbie++ Fix For: 0.8.2 Attachments: KAFKA-1799.patch, KAFKA-1799_2014-11-30_15:58:32.patch, KAFKA-1799_2014-11-30_16:04:16.patch When running the following test, we got an unknown configuration exception.
@Test
public void testMetricsReporter() {
    Properties producerProps = new Properties();
    producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "host1:123");
    producerProps.put(ProducerConfig.METRIC_REPORTER_CLASSES_CONFIG, "org.apache.kafka.clients.producer.new-metrics-reporter");
    new KafkaProducer(producerProps);
}
org.apache.kafka.common.config.ConfigException: Unknown configuration 'org.apache.kafka.clients.producer.new-metrics-reporter' at org.apache.kafka.common.config.AbstractConfig.get(AbstractConfig.java:60) at org.apache.kafka.common.config.AbstractConfig.getClass(AbstractConfig.java:91) at org.apache.kafka.common.config.AbstractConfig.getConfiguredInstances(AbstractConfig.java:147) at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:105) at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:94) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-1803) UncleanLeaderElectionEnableProp in LogConfig should be of boolean
Jun Rao created KAFKA-1803: -- Summary: UncleanLeaderElectionEnableProp in LogConfig should be of boolean Key: KAFKA-1803 URL: https://issues.apache.org/jira/browse/KAFKA-1803 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Fix For: 0.8.3 Now that KAFKA-1798 is fixed, we should define UncleanLeaderElectionEnableProp as a boolean, instead of String in LogConfig and get rid of the customized validation for boolean. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
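Assuming the property moves to (or mirrors) the ConfigDef-style declaration used in the clients, the change amounts to declaring it as Type.BOOLEAN so the config framework parses and validates true/false itself. The sketch below is illustrative only; the name, default and doc string are placeholders, not the actual LogConfig definition.

    import org.apache.kafka.common.config.ConfigDef;

    // Illustrative: declare the flag as a BOOLEAN so parsing/validation of
    // "true"/"false" is handled by ConfigDef instead of custom string checks.
    public final class UncleanElectionConfigSketch {
        public static final String UNCLEAN_LEADER_ELECTION_ENABLE = "unclean.leader.election.enable";

        public static ConfigDef define() {
            return new ConfigDef()
                    .define(UNCLEAN_LEADER_ELECTION_ENABLE,
                            ConfigDef.Type.BOOLEAN,
                            true,                              // placeholder default
                            ConfigDef.Importance.MEDIUM,
                            "Whether replicas outside the ISR may be elected leader.");
        }
    }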
[jira] [Commented] (KAFKA-1789) Issue with Async producer
[ https://issues.apache.org/jira/browse/KAFKA-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230804#comment-14230804 ] devendra tagare commented on KAFKA-1789: Hi, We are using an async producer. ~Dev Issue with Async producer - Key: KAFKA-1789 URL: https://issues.apache.org/jira/browse/KAFKA-1789 Project: Kafka Issue Type: Bug Affects Versions: 0.8.0, 0.8.1 Reporter: devendra tagare Priority: Critical Hi, We are using an async producer to send data to a kafka cluster.The event rate at peak is around 250 events/second of size 25KB each. In the producer code base we have added specific debug statements to capture the time taken to create a producer,create a keyed message with a byte payload send the message. We have added the below properties to the producerConfig queue.enqueue.timeout.ms=20 send.buffer.bytes=1024000 topic.metadata.refresh.interval.ms=3 Based on the documentation, producer.send() queues the message on the async producer's queue. So, ideally if the queue is full then the enqueue operation should result in an kafka.common.QueueFullException in 20 ms. The logs indicate that the enqueue operation is taking more than 20ms (takes around 250ms) without throwing any exceptions. Is there any other property that could conflict with queue.enqueue.timeout.ms which is causing this behavior ? Or is it possible that the queue is not full yet the producer.send() call is still taking around 200ms under peak load ? Also, could you suggest any other alternatives so that we can either enforce a timeout or throw an exception in-case the async producer is taking more than a specified amount of time. Regards, Dev -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 27391: Fix KAFKA-1634
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27391/ --- (Updated Dec. 2, 2014, 2:03 a.m.) Review request for kafka. Bugs: KAFKA-1634 https://issues.apache.org/jira/browse/KAFKA-1634 Repository: kafka Description --- Add another api in offset manager to return the struct, and the cache layer will only read its expiration timestamp while the offset formatter will read the struct as a whole Diffs (updated) - clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 7517b879866fc5dad5f8d8ad30636da8bbe7784a clients/src/main/java/org/apache/kafka/common/protocol/types/Struct.java 121e880a941fcd3e6392859edba11a94236494cc clients/src/main/java/org/apache/kafka/common/requests/OffsetCommitRequest.java 3ee5cbad55ce836fd04bb954dcf6ef2f9bc3288f clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java df37fc6d8f0db0b8192a948426af603be3444da4 core/src/main/scala/kafka/api/OffsetCommitRequest.scala 050615c72efe7dbaa4634f53943bd73273d20ffb core/src/main/scala/kafka/api/OffsetFetchRequest.scala c7604b9cdeb8f46507f04ed7d35fcb3d5f150713 core/src/main/scala/kafka/common/OffsetMetadataAndError.scala 4cabffeacea09a49913505db19a96a55d58c0909 core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala da29a8cb461099eb675161db2f11a9937424a5c6 core/src/main/scala/kafka/server/KafkaApis.scala 2a1c0326b6e6966d8b8254bd6a1cb83ad98a3b80 core/src/main/scala/kafka/server/KafkaServer.scala 1bf7d10cef23a77e71eb16bf6d0e68bc4ebe core/src/main/scala/kafka/server/OffsetManager.scala 3c79428962604800983415f6f705e04f52acb8fb core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala cd16ced5465d098be7a60498326b2a98c248f343 core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 8c5364fa97da1be09973c176d1baeb339455d319 Diff: https://reviews.apache.org/r/27391/diff/ Testing --- Thanks, Guozhang Wang
[jira] [Commented] (KAFKA-1634) Improve semantics of timestamp in OffsetCommitRequests and update documentation
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230836#comment-14230836 ] Guozhang Wang commented on KAFKA-1634: -- Updated reviewboard https://reviews.apache.org/r/27391/diff/ against branch origin/trunk Improve semantics of timestamp in OffsetCommitRequests and update documentation --- Key: KAFKA-1634 URL: https://issues.apache.org/jira/browse/KAFKA-1634 Project: Kafka Issue Type: Bug Reporter: Neha Narkhede Assignee: Guozhang Wang Priority: Blocker Fix For: 0.8.3 Attachments: KAFKA-1634.patch, KAFKA-1634_2014-11-06_15:35:46.patch, KAFKA-1634_2014-11-07_16:54:33.patch, KAFKA-1634_2014-11-17_17:42:44.patch, KAFKA-1634_2014-11-21_14:00:34.patch, KAFKA-1634_2014-12-01_11:44:35.patch, KAFKA-1634_2014-12-01_18:03:12.patch From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed and now always handled server-side now? Would one of the devs mind updating the API spec wiki? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1634) Improve semantics of timestamp in OffsetCommitRequests and update documentation
[ https://issues.apache.org/jira/browse/KAFKA-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang updated KAFKA-1634: - Attachment: KAFKA-1634_2014-12-01_18:03:12.patch Improve semantics of timestamp in OffsetCommitRequests and update documentation --- Key: KAFKA-1634 URL: https://issues.apache.org/jira/browse/KAFKA-1634 Project: Kafka Issue Type: Bug Reporter: Neha Narkhede Assignee: Guozhang Wang Priority: Blocker Fix For: 0.8.3 Attachments: KAFKA-1634.patch, KAFKA-1634_2014-11-06_15:35:46.patch, KAFKA-1634_2014-11-07_16:54:33.patch, KAFKA-1634_2014-11-17_17:42:44.patch, KAFKA-1634_2014-11-21_14:00:34.patch, KAFKA-1634_2014-12-01_11:44:35.patch, KAFKA-1634_2014-12-01_18:03:12.patch From the mailing list - following up on this -- I think the online API docs for OffsetCommitRequest still incorrectly refer to client-side timestamps: https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest Wasn't that removed and now always handled server-side now? Would one of the devs mind updating the API spec wiki? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1787) Consumer should delete offsets and release partition ownership for deleted topics
[ https://issues.apache.org/jira/browse/KAFKA-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230875#comment-14230875 ] Jun Rao commented on KAFKA-1787: Even if we trigger a rebalance in the consumer on topic deletion, it's still possible for a deleted topic to be recreated since a consumer could have issued a fetch request (with the deleted topic) after the topic is deleted on the broker, but before the rebalance is triggered on the consumer. To completely solve this problem, we need to add a create topic api and modify the TopicMetadataRequest such that it doesn't trigger auto topic creation on the broker. This is probably too big a change for 0.8.2 though. Consumer should delete offsets and release partition ownership for deleted topics - Key: KAFKA-1787 URL: https://issues.apache.org/jira/browse/KAFKA-1787 Project: Kafka Issue Type: Bug Affects Versions: 0.8.2 Reporter: Joel Koshy Assignee: Onur Karaman Priority: Blocker Fix For: 0.8.2 Marking as a blocker for now, but we can evaluate. If a topic is deleted, the consumer currently does not rebalance (handleDataDeleted is a TODO in ZooKeeperConsumerConnector) As a result, it will continue owning that (deleted) partition and continue to issue fetch requests. Those fetch requests will result in an UnknownTopicOrPartition error and thus will result in TopicMetadataRequests issued by the leader finder thread which can recreate the topic if auto-create is turned on. Furthermore if we don't delete the offsets it is possible to lose messages if the topic is recreated immediately and the previously checkpointed offset (for the old data) is small. E.g., if a consumer is at offset 10 on some partition, and if the partition is deleted and recreated and 15 messages are sent to it then the first 10 messages will be lost. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1784) Implement a ConsumerOffsetClient library
[ https://issues.apache.org/jira/browse/KAFKA-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14230877#comment-14230877 ] Jun Rao commented on KAFKA-1784: Joel, does this need to be a blocker for 0.8.2? Implement a ConsumerOffsetClient library Key: KAFKA-1784 URL: https://issues.apache.org/jira/browse/KAFKA-1784 Project: Kafka Issue Type: New Feature Reporter: Joel Koshy Assignee: Mayuresh Gharat Priority: Blocker Fix For: 0.8.2 I think it would be useful to provide an offset client library. It would make the documentation a lot simpler. Right now it is non-trivial to commit/fetch offsets to/from kafka. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
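For context on why a client library would simplify the documentation, committing offsets to Kafka by hand looks roughly like the snippet below, built from the 0.8.x javaapi classes. Treat the exact constructors as assumptions (they shifted between 0.8.x releases), and note the real procedure additionally needs a ConsumerMetadataRequest round trip to locate the group's offset manager broker first.

    import java.util.Collections;
    import java.util.Map;

    import kafka.common.OffsetAndMetadata;
    import kafka.common.TopicAndPartition;
    import kafka.javaapi.OffsetCommitRequest;
    import kafka.javaapi.OffsetCommitResponse;
    import kafka.network.BlockingChannel;

    public final class ManualOffsetCommit {
        public static void main(String[] args) {
            // Assumes "broker1:9092" is (or forwards to) the offset manager for the group.
            BlockingChannel channel = new BlockingChannel("broker1", 9092,
                    BlockingChannel.UseDefaultBufferSize(),
                    BlockingChannel.UseDefaultBufferSize(),
                    5000 /* read timeout ms */);
            channel.connect();

            TopicAndPartition partition = new TopicAndPartition("my-topic", 0);
            Map<TopicAndPartition, OffsetAndMetadata> offsets = Collections.singletonMap(
                    partition, new OffsetAndMetadata(42L, "no metadata", System.currentTimeMillis()));

            // versionId 1 commits to Kafka's offsets topic rather than ZooKeeper.
            channel.send(new OffsetCommitRequest("my-group", offsets, 1 /* correlationId */,
                    "my-client", (short) 1).underlying());
            OffsetCommitResponse response = OffsetCommitResponse.readFrom(channel.receive().buffer());
            System.out.println("commit had error: " + response.hasError());
            channel.disconnect();
        }
    }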
[jira] [Updated] (KAFKA-1803) UncleanLeaderElectionEnableProp in LogConfig should be of boolean
[ https://issues.apache.org/jira/browse/KAFKA-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Parfitt updated KAFKA-1803: Attachment: KAFKA1803.patch UncleanLeaderElectionEnableProp in LogConfig should be of boolean - Key: KAFKA-1803 URL: https://issues.apache.org/jira/browse/KAFKA-1803 Project: Kafka Issue Type: Bug Components: core Reporter: Jun Rao Labels: newbie Fix For: 0.8.3 Attachments: KAFKA1803.patch Now that KAFKA-1798 is fixed, we should define UncleanLeaderElectionEnableProp as a boolean, instead of String in LogConfig and get rid of the customized validation for boolean. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1787) Consumer should delete offsets and release partition ownership for deleted topics
[ https://issues.apache.org/jira/browse/KAFKA-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14231088#comment-14231088 ] Joel Koshy commented on KAFKA-1787: --- Yeah that is one issue; [~onurkaraman] and I had discussed this offline a few days ago. Another obvious issue is that it won't help with consumers that happen to be down while the delete topic happens. Our conclusion was that ultimately some kind of topic versioning in combination with a create-topic API may be necessary. So I think we can close this as out of scope for 0.8.2 Consumer should delete offsets and release partition ownership for deleted topics - Key: KAFKA-1787 URL: https://issues.apache.org/jira/browse/KAFKA-1787 Project: Kafka Issue Type: Bug Affects Versions: 0.8.2 Reporter: Joel Koshy Assignee: Onur Karaman Priority: Blocker Fix For: 0.8.2 Marking as a blocker for now, but we can evaluate. If a topic is deleted, the consumer currently does not rebalance (handleDataDeleted is a TODO in ZooKeeperConsumerConnector) As a result, it will continue owning that (deleted) partition and continue to issue fetch requests. Those fetch requests will result in an UnknownTopicOrPartition error and thus will result in TopicMetadataRequests issued by the leader finder thread which can recreate the topic if auto-create is turned on. Furthermore if we don't delete the offsets it is possible to lose messages if the topic is recreated immediately and the previously checkpointed offset (for the old data) is small. E.g., if a consumer is at offset 10 on some partition, and if the partition is deleted and recreated and 15 messages are sent to it then the first 10 messages will be lost. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KAFKA-1787) Consumer should delete offsets and release partition ownership for deleted topics
[ https://issues.apache.org/jira/browse/KAFKA-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Koshy resolved KAFKA-1787. --- Resolution: Won't Fix Consumer should delete offsets and release partition ownership for deleted topics - Key: KAFKA-1787 URL: https://issues.apache.org/jira/browse/KAFKA-1787 Project: Kafka Issue Type: Bug Affects Versions: 0.8.2 Reporter: Joel Koshy Assignee: Onur Karaman Priority: Blocker Fix For: 0.8.2 Marking as a blocker for now, but we can evaluate. If a topic is deleted, the consumer currently does not rebalance (handleDataDeleted is a TODO in ZooKeeperConsumerConnector) As a result, it will continue owning that (deleted) partition and continue to issue fetch requests. Those fetch requests will result in an UnknownTopicOrPartition error and thus will result in TopicMetadataRequests issued by the leader finder thread which can recreate the topic if auto-create is turned on. Furthermore if we don't delete the offsets it is possible to lose messages if the topic is recreated immediately and the previously checkpointed offset (for the old data) is small. E.g., if a consumer is at offset 10 on some partition, and if the partition is deleted and recreated and 15 messages are sent to it then the first 10 messages will be lost. -- This message was sent by Atlassian JIRA (v6.3.4#6332)