[jira] [Updated] (KAFKA-2068) Replace OffsetCommit Request/Response with org.apache.kafka.common.requests equivalent
[ https://issues.apache.org/jira/browse/KAFKA-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang updated KAFKA-2068:
---------------------------------
    Status: In Progress  (was: Patch Available)

> Replace OffsetCommit Request/Response with org.apache.kafka.common.requests equivalent
> --------------------------------------------------------------------------------------
>
>          Key: KAFKA-2068
>          URL: https://issues.apache.org/jira/browse/KAFKA-2068
>      Project: Kafka
>   Issue Type: Sub-task
>     Reporter: Gwen Shapira
>     Assignee: Guozhang Wang
>      Fix For: 0.8.3
>  Attachments: KAFKA-2068.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
RE: [DISCUSS] KIP-21 Configuration Management
Hey Joe,

Can you elaborate on what you mean by a stop-the-world change? In this protocol, we can target notifications to a subset of brokers in the cluster (the controller, if we need to). Is the AdminChangeNotification a ZK notification or a request type exposed by each broker?

Thanks,
Aditya

From: Joe Stein [joe.st...@stealth.ly]
Sent: Friday, May 01, 2015 5:25 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Hi Aditya, thanks for the write-up and for focusing on this piece. Agreed, we need a way to make broker changes dynamically without rolling restarts. I think, though, that if every broker gets changes via notifications, it is going to limit which configs can be dynamic. We could never deliver a stop-the-world configuration change, because it would happen across the entire cluster, on every broker, at the same time.

Can maybe just the controller get the notification? We could then provide a layer for brokers to work with the controller to do the config change operations at its discretion (so it can stop things if it needs to). The controller gets the notification and sends an AdminChangeNotification to brokers [X .. N]; the brokers can then do their thing, even sending a response for heartbeating while the change takes the few milliseconds it needs, or crashes. We need to go through both scenarios. I am worried that if we put this change in as is, it will work for quotas and maybe a few other things, but nothing else will become dynamic and we won't get far enough toward eliminating most rolling restarts.

~ Joe Stein
- - - - - - - - - - - - - - - - -
http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Thu, Apr 30, 2015 at 8:14 PM, Joel Koshy jjkosh...@gmail.com wrote:

1. I have deep concerns about managing configuration in ZooKeeper. First, producers and consumers shouldn't depend on ZK at all; this seems to add back a dependency we are trying to get away from.

The KIP probably needs to be clarified here - I don't think Aditya was referring to client (producer/consumer) configs.
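As a rough sketch of the controller-mediated flow Joe describes (all names here are hypothetical - `AdminChangeNotification` is a proposed RPC, not an existing request type - and this is only an illustration of the fan-out, not a real implementation):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical RPC surface: the controller alone watches ZK and fans the
// change out to a chosen subset of brokers, rather than every broker
// reacting to the ZK notification at the same time.
interface BrokerAdminClient {
    // Hypothetical call; returns true once the broker has applied the change.
    boolean sendAdminChangeNotification(int brokerId, String configKey, String newValue);
}

final class ControllerConfigPropagator {
    private final BrokerAdminClient rpc;

    ControllerConfigPropagator(BrokerAdminClient rpc) {
        this.rpc = rpc;
    }

    // Invoked from the controller's ZK watch. Because the controller picks
    // the target brokers [X .. N], a disruptive change can be staged a few
    // brokers at a time instead of hitting the whole cluster at once.
    List<Integer> propagate(List<Integer> targetBrokers, String key, String value) {
        List<Integer> failed = new ArrayList<>();
        for (int id : targetBrokers) {
            if (!rpc.sendAdminChangeNotification(id, key, value)) {
                failed.add(id);  // broker crashed or rejected; controller can retry
            }
        }
        return failed;
    }
}
```

The returned list of failed brokers is one possible way for the controller to track which brokers still need the change, per Joe's point about brokers crashing mid-change.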
These are global client-id-specific configs that need to be managed centrally. (Specifically, quota overrides on a per-client basis).
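As a concrete (hypothetical) illustration, such a per-client override could live in ZooKeeper under a path mirroring the existing per-topic override convention, e.g. `/config/clients/<client-id>`, with a payload along these lines (the exact path and key names are an assumption, not part of the KIP text quoted here):

```json
{
  "version": 1,
  "config": {
    "producer_byte_rate": "1048576",
    "consumer_byte_rate": "2097152"
  }
}
```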
Jenkins build is back to normal : Kafka-trunk #482
See https://builds.apache.org/job/Kafka-trunk/482/changes
Re: [DISCUSS] KIP-21 Configuration Management
Thanks for starting this discussion, Aditya. A few questions/comments:

1. If you change the default values as mentioned in the KIP, do you also overwrite the local config file as part of updating the default value? If not, where does the admin look to find the default values, ZK or the local Kafka config file? What if a config value is different in the two places?

2. I share Gwen's concern around making sure that popular config management tools continue to work with this change. Would love to see how each of those would work with the proposal in the KIP. I don't know enough about each of the tools, but it seems that in some of them you have to define some sort of class with parameter names as config names. How will such tools find out about the config values? In Puppet, if this means that each Puppet agent has to read from ZK, then the ZK port has to be open to pretty much every machine in the DC. This is a bummer and a very confusing requirement. Not sure if this is really a problem or not (each of those tools might behave differently), but it is something worth paying attention to.

3. The wrapper tools that let users read/change configs should not depend on ZK, for the reason mentioned above. It's a pain to assume that the ZK port is open from any machine that needs to run this tool. Ideally what users want is a REST API to the brokers to change or read the config (a la Elasticsearch), but in the absence of a REST API we should consider writing the tool such that it only requires talking to the Kafka broker port. This will require a config RPC.

4. Not sure if the KIP is the right place to discuss the design of propagating the config changes to the brokers, but have you thought about just letting the controller oversee the config changes and propagate them via RPC to the brokers? That way, there is an easier way to express config changes that require all brokers to apply them before the change is called complete. Maybe this is not required, but it is hard to say until we discuss the full set of configs that need to be dynamic.

Thanks,
Neha

On Fri, May 1, 2015 at 12:53 PM, Jay Kreps jay.kr...@gmail.com wrote:

Hey Aditya,

This is great! A couple of comments:

1. Leaving the file config in place is definitely the least disturbance. But let's really think about getting rid of the files and having just one config mechanism. There is always a tendency to make everything pluggable, which so often just leads to two mediocre solutions. Can we do the exercise of considering fully getting rid of file config and seeing what goes wrong?

2. Do we need to model defaults? The current approach is that if you have a global config x it is overridden for a topic xyz by /topics/xyz/x, and I think this could be extended to /brokers/0/x. I think this is simpler. We need to specify the precedence for these overrides, e.g. if you override at both the broker and topic level, I think the topic level takes precedence.

3. I recommend we have the producer and consumer config just be an override under client.id. The override is by client id, and we can have separate properties for controlling quotas for producers and consumers.

4. Some configs can be changed just by updating the reference; others may require some action. An example of the latter: if you want to disable log compaction (assuming we wanted to make that dynamic), we need to call shutdown() on the cleaner. I think it may be necessary to register a listener callback that gets called when the config changes.

5. For handling the reference, can you explain your plan a bit? Currently we have an immutable KafkaConfig object with a bunch of vals. That object, or individual values from it, gets injected all over the code base. I was thinking something like this:
  a. We retain the KafkaConfig object as an immutable object, just as today.
  b. It is no longer legit to grab values out of that config if they are changeable.
  c. Instead of making KafkaConfig itself mutable, we make a KafkaConfiguration which has a single volatile reference to the current KafkaConfig. KafkaConfiguration is what gets passed into the various components. So to access a config you do something like config.instance.myValue. When the config changes, the config manager updates this reference.
  d. The KafkaConfiguration is the thing that allows doing the configuration.onChange(my.config, callback).

-Jay

On Tue, Apr 28, 2015 at 3:57 PM, Aditya Auradkar aaurad...@linkedin.com.invalid wrote:

Hey everyone,

Wrote up a KIP to update topic, client and broker configs dynamically via Zookeeper:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration

Please read and provide feedback.

Thanks,
Aditya

PS: I've intentionally kept this discussion separate from KIP-5 since I'm not sure if that is actively being worked on, and I wanted to start with a clean slate.

--
Thanks,
Neha
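Jay's points 5(a)-(d) could be sketched roughly as follows. This is a minimal illustration only, assuming hypothetical names: `KafkaConfig` is stubbed here as a plain immutable map holder, not the real Scala class, and `onChange`/`update` signatures are invented for the sketch.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Point (a): stand-in for the real immutable KafkaConfig - a frozen snapshot.
final class KafkaConfig {
    private final Map<String, String> values;
    KafkaConfig(Map<String, String> values) { this.values = Map.copyOf(values); }
    String get(String key) { return values.get(key); }
}

// Point (c): a mutable holder with a single volatile reference to the current
// snapshot. Components receive a KafkaConfiguration and read instance() at use
// time instead of caching individual values (point (b)).
final class KafkaConfiguration {
    private volatile KafkaConfig instance;
    private final Map<String, List<Consumer<String>>> listeners = new ConcurrentHashMap<>();

    KafkaConfiguration(KafkaConfig initial) { this.instance = initial; }

    KafkaConfig instance() { return instance; }

    // Point (d): register a callback fired when the given key changes,
    // e.g. to call shutdown() on the log cleaner.
    void onChange(String key, Consumer<String> callback) {
        listeners.computeIfAbsent(key, k -> new CopyOnWriteArrayList<>()).add(callback);
    }

    // Called by the config manager when it observes a ZK change.
    void update(KafkaConfig next) {
        KafkaConfig prev = this.instance;
        this.instance = next;  // volatile write: all readers now see the new snapshot
        for (Map.Entry<String, List<Consumer<String>>> e : listeners.entrySet()) {
            String newValue = next.get(e.getKey());
            if (!Objects.equals(prev.get(e.getKey()), newValue)) {
                e.getValue().forEach(cb -> cb.accept(newValue));
            }
        }
    }
}
```

The volatile reference gives readers a consistent snapshot without locking; a component that needs atomicity across several keys reads `instance()` once and uses that object throughout.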
[jira] [Commented] (KAFKA-2000) Delete consumer offsets from kafka once the topic is deleted
[ https://issues.apache.org/jira/browse/KAFKA-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525913#comment-14525913 ]

Sriharsha Chintalapani commented on KAFKA-2000:
-----------------------------------------------

Updated reviewboard https://reviews.apache.org/r/32650/diff/
against branch origin/trunk

> Delete consumer offsets from kafka once the topic is deleted
> ------------------------------------------------------------
>
>          Key: KAFKA-2000
>          URL: https://issues.apache.org/jira/browse/KAFKA-2000
>      Project: Kafka
>   Issue Type: Bug
>     Reporter: Sriharsha Chintalapani
>     Assignee: Sriharsha Chintalapani
>       Labels: newbie++
>  Attachments: KAFKA-2000.patch, KAFKA-2000_2015-05-03_10:39:11.patch
Re: Review Request 32650: Patch for KAFKA-2000
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32650/
-----------------------------------------------------------

(Updated May 3, 2015, 5:39 p.m.)

Review request for kafka.

Bugs: KAFKA-2000
    https://issues.apache.org/jira/browse/KAFKA-2000

Repository: kafka

Description
-----------
KAFKA-2000. Delete consumer offsets from kafka once the topic is deleted.

Diffs (updated)
-----
  core/src/main/scala/kafka/server/OffsetManager.scala 18680ce100f10035175cc0263ba7787ab0f6a17a
  core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 652208a70f66045b854549d93cbbc2b77c24b10b

Diff: https://reviews.apache.org/r/32650/diff/

Testing
-------

Thanks,
Sriharsha Chintalapani
[jira] [Updated] (KAFKA-2000) Delete consumer offsets from kafka once the topic is deleted
[ https://issues.apache.org/jira/browse/KAFKA-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriharsha Chintalapani updated KAFKA-2000:
------------------------------------------
    Status: Patch Available  (was: In Progress)

> Delete consumer offsets from kafka once the topic is deleted
> ------------------------------------------------------------
>
>          Key: KAFKA-2000
>          URL: https://issues.apache.org/jira/browse/KAFKA-2000
>      Project: Kafka
>   Issue Type: Bug
>     Reporter: Sriharsha Chintalapani
>     Assignee: Sriharsha Chintalapani
>       Labels: newbie++
>  Attachments: KAFKA-2000.patch, KAFKA-2000_2015-05-03_10:39:11.patch
[jira] [Updated] (KAFKA-2000) Delete consumer offsets from kafka once the topic is deleted
[ https://issues.apache.org/jira/browse/KAFKA-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriharsha Chintalapani updated KAFKA-2000:
------------------------------------------
    Attachment: KAFKA-2000_2015-05-03_10:39:11.patch

> Delete consumer offsets from kafka once the topic is deleted
> ------------------------------------------------------------
>
>          Key: KAFKA-2000
>          URL: https://issues.apache.org/jira/browse/KAFKA-2000
>      Project: Kafka
>   Issue Type: Bug
>     Reporter: Sriharsha Chintalapani
>     Assignee: Sriharsha Chintalapani
>       Labels: newbie++
>  Attachments: KAFKA-2000.patch, KAFKA-2000_2015-05-03_10:39:11.patch
RE: [DISCUSS] KIP-21 Configuration Management
Hey everyone,

Thanks for the comments. I'll respond to each one by one. In the meantime, can we put this on the agenda for the KIP hangout for next week?

Thanks,
Aditya

From: Neha Narkhede [n...@confluent.io]
Sent: Sunday, May 03, 2015 9:48 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Thanks for starting this discussion, Aditya. A few questions/comments:

1. If you change the default values as mentioned in the KIP, do you also overwrite the local config file as part of updating the default value? If not, where does the admin look to find the default values, ZK or the local Kafka config file? What if a config value is different in the two places?

2. I share Gwen's concern around making sure that popular config management tools continue to work with this change. Would love to see how each of those would work with the proposal in the KIP. I don't know enough about each of the tools, but it seems that in some of them you have to define some sort of class with parameter names as config names. How will such tools find out about the config values? In Puppet, if this means that each Puppet agent has to read from ZK, then the ZK port has to be open to pretty much every machine in the DC. This is a bummer and a very confusing requirement. Not sure if this is really a problem or not (each of those tools might behave differently), but it is something worth paying attention to.

3. The wrapper tools that let users read/change configs should not depend on ZK, for the reason mentioned above. It's a pain to assume that the ZK port is open from any machine that needs to run this tool. Ideally what users want is a REST API to the brokers to change or read the config (a la Elasticsearch), but in the absence of a REST API we should consider writing the tool such that it only requires talking to the Kafka broker port. This will require a config RPC.

4. Not sure if the KIP is the right place to discuss the design of propagating the config changes to the brokers, but have you thought about just letting the controller oversee the config changes and propagate them via RPC to the brokers? That way, there is an easier way to express config changes that require all brokers to apply them before the change is called complete. Maybe this is not required, but it is hard to say until we discuss the full set of configs that need to be dynamic.

Thanks,
Neha

On Fri, May 1, 2015 at 12:53 PM, Jay Kreps jay.kr...@gmail.com wrote:

Hey Aditya,

This is great! A couple of comments:

1. Leaving the file config in place is definitely the least disturbance. But let's really think about getting rid of the files and having just one config mechanism. There is always a tendency to make everything pluggable, which so often just leads to two mediocre solutions. Can we do the exercise of considering fully getting rid of file config and seeing what goes wrong?

2. Do we need to model defaults? The current approach is that if you have a global config x it is overridden for a topic xyz by /topics/xyz/x, and I think this could be extended to /brokers/0/x. I think this is simpler. We need to specify the precedence for these overrides, e.g. if you override at both the broker and topic level, I think the topic level takes precedence.

3. I recommend we have the producer and consumer config just be an override under client.id. The override is by client id, and we can have separate properties for controlling quotas for producers and consumers.

4. Some configs can be changed just by updating the reference; others may require some action. An example of the latter: if you want to disable log compaction (assuming we wanted to make that dynamic), we need to call shutdown() on the cleaner. I think it may be necessary to register a listener callback that gets called when the config changes.

5. For handling the reference, can you explain your plan a bit? Currently we have an immutable KafkaConfig object with a bunch of vals. That object, or individual values from it, gets injected all over the code base. I was thinking something like this:
  a. We retain the KafkaConfig object as an immutable object, just as today.
  b. It is no longer legit to grab values out of that config if they are changeable.
  c. Instead of making KafkaConfig itself mutable, we make a KafkaConfiguration which has a single volatile reference to the current KafkaConfig. KafkaConfiguration is what gets passed into the various components. So to access a config you do something like config.instance.myValue. When the config changes, the config manager updates this reference.
  d. The KafkaConfiguration is the thing that allows doing the configuration.onChange(my.config, callback).

-Jay

On Tue, Apr 28, 2015 at 3:57 PM, Aditya Auradkar aaurad...@linkedin.com.invalid wrote:

Hey everyone,

Wrote up a KIP to update topic, client and broker configs dynamically via Zookeeper.
Re: Review Request 32650: Patch for KAFKA-2000
On April 23, 2015, 9:51 p.m., Joel Koshy wrote:

> core/src/main/scala/kafka/server/OffsetManager.scala, line 124
> https://reviews.apache.org/r/32650/diff/1/?file=909897#file909897line124
>
> A safer fix is to proactively purge as part of UpdateMetadataRequest - i.e., removePartitionInfo in the metadata cache. Your fix is nice, but we need to make sure of the following: on a given offset manager (broker), the metadata cache must contain topic X before any consumer of topic X (whose group is managed by that broker) commits offsets for topic X.
>
> The original scenario I was concerned about should be fine:
> - Suppose broker A (offset manager for G) starts up.
> - It receives UpdateMetadataRequests from the controller for all topics in the cluster.
> - It then receives a LeaderAndIsrRequest for partitions of the offset topic, which makes it the offset manager.
> - We should be fine _as long as_ the update metadata requests occur first. So if we go with your approach we should at the very least add a unit test to guarantee this.
>
> There is another scenario. If topic X is a new topic (or has new partitions):
> - Broker A is the offset manager for consumer group G.
> - Broker B leads a new partition of X.
> - Controller C sends become-leader to B and update-metadata to A (which will populate its metadata cache).
> - B becomes the leader first.
> - A consumer starts consuming X and commits offsets to A (before A has received the update metadata request).
> - Other consumers in the group may rebalance while all this is happening (since new partitions for the topic appeared) and may fetch offsets from A.
> - But A could have deleted the offset by then.
> - This is improbable but not impossible.
>
> Onur mentioned another corner case: https://issues.apache.org/jira/browse/KAFKA-1787
>
> Both would be solved by having topic generations and incorporating generation information when determining which offsets to purge. I don't think we have a jira open for that, but I will follow up offline with Onur. Do you see any other issues?
>
> So I think the options are:
> - Go with your approach + a unit test to ensure that the controller sends the update metadata request first.
> - Go with the more conservative fix, which is to purge on metadataCache.removePartitionInfo.
>
> Also, we should add a unit test to verify that offsets are in fact removed after deletion.

Joel Koshy wrote:

> Never mind - for the second scenario we are fine. We check in the offset manager whether the topic exists before committing offsets. So your fix should be fine. Can you add a unit test?

Thanks for the review, Joel. Added a unit test to check that the offsets are deleted after topic deletion.

- Sriharsha

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32650/#review81413
-----------------------------------------------------------

On May 3, 2015, 5:39 p.m., Sriharsha Chintalapani wrote:

> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32650/
> -----------------------------------------------------------
>
> (Updated May 3, 2015, 5:39 p.m.)
>
> Review request for kafka.
>
> Bugs: KAFKA-2000
>     https://issues.apache.org/jira/browse/KAFKA-2000
>
> Repository: kafka
>
> Description
> -----------
> KAFKA-2000. Delete consumer offsets from kafka once the topic is deleted.
>
> Diffs
> -----
>   core/src/main/scala/kafka/server/OffsetManager.scala 18680ce100f10035175cc0263ba7787ab0f6a17a
>   core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 652208a70f66045b854549d93cbbc2b77c24b10b
>
> Diff: https://reviews.apache.org/r/32650/diff/
>
> Testing
> -------
>
> Thanks,
> Sriharsha Chintalapani
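The invariant the review converges on - reject commits for topics absent from the metadata cache, and purge stored offsets when a topic's partition info is removed - can be illustrated with a toy model. This is not the real `OffsetManager`/`MetadataCache` code (those are Scala classes with different shapes); it is a hedged sketch of the interaction only.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the two structures the review discusses (hypothetical shapes).
final class OffsetManagerSketch {
    // Topics this broker knows about, maintained by UpdateMetadataRequests.
    private final Set<String> metadataCache = ConcurrentHashMap.newKeySet();
    // groupId -> (topic -> last committed offset)
    private final Map<String, Map<String, Long>> offsets = new ConcurrentHashMap<>();

    // Models the metadata-cache update; the "conservative fix" purges stored
    // offsets as part of removing partition info for a deleted topic.
    void onUpdateMetadata(String topic, boolean topicExists) {
        if (topicExists) {
            metadataCache.add(topic);
        } else {
            metadataCache.remove(topic);
            offsets.values().forEach(perGroup -> perGroup.remove(topic));
        }
    }

    // The race Joel worried about is closed because commits for topics not in
    // the metadata cache are rejected rather than stored.
    boolean commitOffset(String group, String topic, long offset) {
        if (!metadataCache.contains(topic)) {
            return false;  // unknown (possibly deleted) topic: refuse the commit
        }
        offsets.computeIfAbsent(group, g -> new ConcurrentHashMap<>()).put(topic, offset);
        return true;
    }

    Long fetchOffset(String group, String topic) {
        Map<String, Long> perGroup = offsets.get(group);
        return perGroup == null ? null : perGroup.get(topic);
    }
}
```

In this model, the ordering hazard remains exactly the one the review names: commits are only safe if the UpdateMetadataRequest that adds a new topic reaches the offset manager before the first commit for it, which is why a unit test on that ordering is requested.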