[jira] [Updated] (KAFKA-2068) Replace OffsetCommit Request/Response with org.apache.kafka.common.requests equivalent

2015-05-03 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-2068:
-
Status: In Progress  (was: Patch Available)

 Replace OffsetCommit Request/Response with  org.apache.kafka.common.requests  
 equivalent
 

 Key: KAFKA-2068
 URL: https://issues.apache.org/jira/browse/KAFKA-2068
 Project: Kafka
  Issue Type: Sub-task
Reporter: Gwen Shapira
Assignee: Guozhang Wang
 Fix For: 0.8.3

 Attachments: KAFKA-2068.patch


 Replace OffsetCommit Request/Response with  org.apache.kafka.common.requests  
 equivalent





RE: [DISCUSS] KIP-21 Configuration Management

2015-05-03 Thread Aditya Auradkar
Hey Joe,

Can you elaborate on what you mean by a stop-the-world change? In this protocol, 
we can target notifications to a subset of brokers in the cluster (just the 
controller, if we need to). Is the AdminChangeNotification a ZK notification or a 
request type exposed by each broker?

Thanks,
Aditya


From: Joe Stein [joe.st...@stealth.ly]
Sent: Friday, May 01, 2015 5:25 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-21 Configuration Management

Hi Aditya, thanks for the write up and focusing on this piece.

Agreed, we need a way to make broker config changes dynamically, without
rolling restarts.

I think, though, that if every broker applies changes as it gets notifications,
it is going to limit which configs can be dynamic.

We could never deliver a stop-the-world configuration change, because it would
then happen across the entire cluster, on every broker, at the same time.

Could maybe just the controller get the notification?

We could then provide a layer for brokers to work with the controller to do the
config change operations at its discretion (so it can stop things if it needs to).

The controller gets the notification and sends an AdminChangeNotification to
brokers [X .. N]; the brokers can then do their thing, even sending a response
for heartbeating while the change takes the few milliseconds it needs, or
crashing. We need to go through both scenarios.
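
A rough sketch of that flow, with hypothetical names (AdminChangeNotification is
the request type proposed here, not an existing one): the controller alone
receives the config-change notification and pushes it to the affected brokers,
which apply the change at their own pace and report success or failure.

  case class AdminChangeNotification(configKey: String, newValue: String)

  trait BrokerEndpoint {
    def id: Int
    // May take the few milliseconds it needs, or fail if the broker crashes mid-change.
    def applyConfigChange(notification: AdminChangeNotification): Boolean
  }

  class ControllerConfigPropagator(brokers: Seq[BrokerEndpoint]) {
    // Invoked when the controller sees a config-change notification (e.g. via ZK).
    def onConfigChange(key: String, value: String): Map[Int, Boolean] = {
      val notification = AdminChangeNotification(key, value)
      brokers.map(b => b.id -> b.applyConfigChange(notification)).toMap
    }
  }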

I am worried that if we put this change in like this, it works for quotas and
maybe a few other things, but nothing else becomes dynamic and we don't get far
enough toward eliminating rolling restarts.

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Thu, Apr 30, 2015 at 8:14 PM, Joel Koshy jjkosh...@gmail.com wrote:

 1. I have deep concerns about managing configuration in ZooKeeper.
 First, producers and consumers shouldn't depend on ZK at all; this
 seems to add back a dependency we are trying to get away from.

 The KIP probably needs to be clarified here - I don't think Aditya was
 referring to client (producer/consumer) configs. These are global
 client-id-specific configs that need to be managed centrally.
 (Specifically, quota overrides on a per-client basis).
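
Purely for illustration, per-client-id quota overrides could be modeled like
this (the names and values are hypothetical, not from the KIP):

  case class ClientQuota(producerByteRate: Long, consumerByteRate: Long)

  object QuotaOverrides {
    // Centrally managed overrides keyed by client.id; everyone else gets the default.
    val overrides: Map[String, ClientQuota] = Map(
      "reporting-service" -> ClientQuota(producerByteRate = 1L << 20, consumerByteRate = 2L << 20),
      "mirror-maker"      -> ClientQuota(producerByteRate = 10L << 20, consumerByteRate = 10L << 20)
    )

    def quotaFor(clientId: String, default: ClientQuota): ClientQuota =
      overrides.getOrElse(clientId, default)
  }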




Jenkins build is back to normal : Kafka-trunk #482

2015-05-03 Thread Apache Jenkins Server
See https://builds.apache.org/job/Kafka-trunk/482/changes



Re: [DISCUSS] KIP-21 Configuration Management

2015-05-03 Thread Neha Narkhede
Thanks for starting this discussion, Aditya. A few questions/comments:

1. If you change the default values as mentioned in the KIP, do you also
overwrite the local config file as part of updating the default value? If not,
where does the admin look to find the default values: ZK or the local Kafka
config file? What if a config value differs between the two?

2. I share Gwen's concern around making sure that popular config management
tools continue to work with this change. I would love to see how each of those
would work with the proposal in the KIP. I don't know enough about each of the
tools, but it seems like in some of them you have to define some sort of class
with parameter names as config names. How will such tools find out about the
config values? In Puppet, if this means that each Puppet agent has to read them
from ZK, then the ZK port has to be open to pretty much every machine in the DC.
This is a bummer and a very confusing requirement. I'm not sure whether this is
really a problem (each of those tools might behave differently), but it is
something worth paying attention to.

3. The wrapper tools that let users read and change configs should not depend
on ZK, for the reason mentioned above. It's a pain to assume that the ZK port
is open from any machine that needs to run this tool. Ideally what users want
is a REST API on the brokers to change or read the config (a la Elasticsearch),
but in the absence of a REST API, we should think about whether we can write
the tool such that it just requires talking to the Kafka broker port. This will
require a config RPC (a rough sketch follows below these comments).

4. I'm not sure the KIP is the right place to discuss the design of propagating
the config changes to the brokers, but have you thought about just letting the
controller oversee the config changes and propagate them via RPC to the
brokers? That way, there is an easier way to express config changes that
require all brokers to apply them before the change is considered complete.
Maybe this is not required, but it is hard to say without discussing the full
set of configs that need to be dynamic.
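
A rough sketch of the config RPC mentioned in point 3. These request types do
not exist in Kafka 0.8.x; the names and fields are hypothetical, only meant to
make the idea of a broker-port-only config tool concrete:

  sealed trait ConfigEntity
  case object BrokerEntity extends ConfigEntity
  case class TopicEntity(topic: String) extends ConfigEntity
  case class ClientEntity(clientId: String) extends ConfigEntity

  case class DescribeConfigRequest(entity: ConfigEntity, keys: Seq[String])
  case class AlterConfigRequest(entity: ConfigEntity, changes: Map[String, String])
  case class ConfigResponse(entity: ConfigEntity, values: Map[String, String], errorCode: Short)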

Thanks,
Neha

On Fri, May 1, 2015 at 12:53 PM, Jay Kreps jay.kr...@gmail.com wrote:

 Hey Aditya,

 This is great! A couple of comments:

 1. Leaving the file config in place is definitely the least disruptive option.
 But let's really think about getting rid of the files and just having one
 config mechanism. There is always a tendency to make everything pluggable,
 which so often just leads to two mediocre solutions. Can we do the exercise
 of considering fully getting rid of the file config and seeing what goes
 wrong?

 2. Do we need to model defaults? The current approach is that if you have a
 global config x it is overridden for a topic xyz by /topics/xyz/x, and I
 think this could be extended to /brokers/0/x. I think this is simpler. We
 need to specify the precedence for these overrides, e.g. if you override at
 the broker and topic level I think the topic level takes precedence.

 3. I recommend we have the producer and consumer config just be an override
 under client.id. The override is by client id and we can have separate
 properties for controlling quotas for producers and consumers.

 4. Some configs can be changed just by updating the reference; others may
 require some action. For example, if you want to disable log compaction
 (assuming we wanted to make that dynamic), we need to call shutdown() on the
 cleaner. I think it may be necessary to register a listener callback that gets
 called when the config changes.

 5. For handling the reference, can you explain your plan a bit? Currently we
 have an immutable KafkaConfig object with a bunch of vals. That object, or
 individual values from it, gets injected all over the code base. I was
 thinking something like this (sketched below):
 a. We retain the KafkaConfig object as an immutable object, just as today.
 b. It is no longer legit to grab values out of that config if they are
 changeable.
 c. Instead of making KafkaConfig itself mutable, we make a KafkaConfiguration
 which has a single volatile reference to the current KafkaConfig.
 KafkaConfiguration is what gets passed into the various components. So to
 access a config you do something like config.instance.myValue. When the
 config changes, the config manager updates this reference.
 d. The KafkaConfiguration is the thing that allows doing
 configuration.onChange(my.config, callback)
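
 A minimal sketch of the scheme in points 4 and 5, with hypothetical names (not
 the actual KafkaConfig code): an immutable snapshot, a holder with a single
 volatile reference, and per-key callbacks fired when the config manager swaps
 in a new snapshot.

  case class ConfigSnapshot(values: Map[String, String]) {
    def get(key: String): Option[String] = values.get(key)
  }

  class KafkaConfiguration(initial: ConfigSnapshot) {
    @volatile private var current: ConfigSnapshot = initial
    private var listeners = Map.empty[String, List[String => Unit]]

    // Components read through the holder, e.g. config.instance.get("my.config").
    def instance: ConfigSnapshot = current

    // Register a callback for one key, e.g. configuration.onChange("my.config", cb).
    def onChange(key: String, callback: String => Unit): Unit = synchronized {
      listeners += key -> (callback :: listeners.getOrElse(key, Nil))
    }

    // Called by the config manager: swap the reference, then notify listeners
    // whose keys actually changed.
    def update(next: ConfigSnapshot): Unit = {
      val previous = current
      current = next
      for {
        (key, callbacks) <- listeners
        if previous.get(key) != next.get(key)
        newValue <- next.get(key)
        callback <- callbacks
      } callback(newValue)
    }
  }

 Swapping a single volatile reference keeps reads cheap and gives every
 component a consistent snapshot, while onChange covers configs (like the log
 cleaner example in point 4) that need an action beyond re-reading a value.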

 -Jay

 On Tue, Apr 28, 2015 at 3:57 PM, Aditya Auradkar 
 aaurad...@linkedin.com.invalid wrote:

  Hey everyone,
 
  Wrote up a KIP to update topic, client and broker configs dynamically via
  Zookeeper.
 
 
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
 
  Please read and provide feedback.
 
  Thanks,
  Aditya
 
  PS: I've intentionally kept this discussion separate from KIP-5 since I'm
  not sure if that is actively being worked on and I wanted to start with a
  clean slate.
 




-- 
Thanks,
Neha


[jira] [Commented] (KAFKA-2000) Delete consumer offsets from kafka once the topic is deleted

2015-05-03 Thread Sriharsha Chintalapani (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14525913#comment-14525913
 ] 

Sriharsha Chintalapani commented on KAFKA-2000:
---

Updated reviewboard https://reviews.apache.org/r/32650/diff/
 against branch origin/trunk

 Delete consumer offsets from kafka once the topic is deleted
 

 Key: KAFKA-2000
 URL: https://issues.apache.org/jira/browse/KAFKA-2000
 Project: Kafka
  Issue Type: Bug
Reporter: Sriharsha Chintalapani
Assignee: Sriharsha Chintalapani
  Labels: newbie++
 Attachments: KAFKA-2000.patch, KAFKA-2000_2015-05-03_10:39:11.patch








Re: Review Request 32650: Patch for KAFKA-2000

2015-05-03 Thread Sriharsha Chintalapani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32650/
---

(Updated May 3, 2015, 5:39 p.m.)


Review request for kafka.


Bugs: KAFKA-2000
https://issues.apache.org/jira/browse/KAFKA-2000


Repository: kafka


Description
---

KAFKA-2000. Delete consumer offsets from kafka once the topic is deleted.


Diffs (updated)
-

  core/src/main/scala/kafka/server/OffsetManager.scala 
18680ce100f10035175cc0263ba7787ab0f6a17a 
  core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
652208a70f66045b854549d93cbbc2b77c24b10b 

Diff: https://reviews.apache.org/r/32650/diff/


Testing
---


Thanks,

Sriharsha Chintalapani



[jira] [Updated] (KAFKA-2000) Delete consumer offsets from kafka once the topic is deleted

2015-05-03 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani updated KAFKA-2000:
--
Status: Patch Available  (was: In Progress)

 Delete consumer offsets from kafka once the topic is deleted
 

 Key: KAFKA-2000
 URL: https://issues.apache.org/jira/browse/KAFKA-2000
 Project: Kafka
  Issue Type: Bug
Reporter: Sriharsha Chintalapani
Assignee: Sriharsha Chintalapani
  Labels: newbie++
 Attachments: KAFKA-2000.patch, KAFKA-2000_2015-05-03_10:39:11.patch








[jira] [Updated] (KAFKA-2000) Delete consumer offsets from kafka once the topic is deleted

2015-05-03 Thread Sriharsha Chintalapani (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriharsha Chintalapani updated KAFKA-2000:
--
Attachment: KAFKA-2000_2015-05-03_10:39:11.patch

 Delete consumer offsets from kafka once the topic is deleted
 

 Key: KAFKA-2000
 URL: https://issues.apache.org/jira/browse/KAFKA-2000
 Project: Kafka
  Issue Type: Bug
Reporter: Sriharsha Chintalapani
Assignee: Sriharsha Chintalapani
  Labels: newbie++
 Attachments: KAFKA-2000.patch, KAFKA-2000_2015-05-03_10:39:11.patch








RE: [DISCUSS] KIP-21 Configuration Management

2015-05-03 Thread Aditya Auradkar
Hey everyone,

Thanks for the comments. I'll respond to each one-by-one. In the meantime, can 
we put this on the agenda for the KIP hangout for next week?

Thanks,
Aditya



Re: Review Request 32650: Patch for KAFKA-2000

2015-05-03 Thread Sriharsha Chintalapani


 On April 23, 2015, 9:51 p.m., Joel Koshy wrote:
  core/src/main/scala/kafka/server/OffsetManager.scala, line 124
  https://reviews.apache.org/r/32650/diff/1/?file=909897#file909897line124
 
  A safer fix is to proactively purge as part of UpdateMetadataRequest - 
  i.e., removePartitionInfo in metadata cache.
  
  Your fix is nice, but we need to make sure of the following: on a given 
  offset manager (broker), the metadata cache must contain topic X before any 
  consumer of topic X whose group is managed by that broker commits offsets 
  for topic X.
  
  The original scenario I was concerned about should be fine:
  - Suppose broker A (offset manager for G) starts up
  - It receives UpdateMetadataRequests from the controller for all topics 
  in the cluster
  - It then receives LeaderAndIsrRequest for partitions of the offset 
  topic which make it the offset manager.
  - We should be fine _as long as_ the update metadata requests occur 
  first. So if we go with your approach we should at the very least add a 
  unit test to guarantee this.
  
  There is another scenario. If topic X is a new topic (or has new 
  partitions):
  - Broker A is the offset manager for consumer group G
  - Broker B leads a new partition of X
  - Controller C sends become leader to B and update metadata to A (which 
  will populate its metadata cache)
  - B becomes the leader first
  - A consumer starts consuming X and commits offsets to A (before it has 
  received the update metadata request)
  - Other consumers in the group may rebalance while all this is 
  happening (since new partitions for the topic appeared) and may fetch 
  offsets from A
  - But A could have deleted the offset by then.
  - This is improbable but not impossible.
  
  Onur mentioned another corner case:
  https://issues.apache.org/jira/browse/KAFKA-1787 
  
  Both would be solved by having topic generations and incorporating 
  generation information when determining which offsets to purge. I don't 
  think we have a jira open for that but I will follow-up offline with Onur.
  
  Do you see any other issues?
  
  So I think the options are:
  - Go with your approach + a unit test to ensure that the controller 
  sends update metadata request first.
  - Go with the more conservative fix which is to purge on 
  metadataCache.removePartitionInfo
  
  Also, we should add a unit test to verify offsets are in fact removed 
  after deletion.
 
 Joel Koshy wrote:
 Never mind - for the second scenario we are fine. We check in the offset 
 manager whether the topic exists before committing offsets.
 
 So your fix should be fine. Can you add a unit test?

Thanks for the review, Joel. I added a unit test to check that the offsets are 
deleted after topic deletion.
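
For illustration only, a toy sketch (hypothetical names; not the actual
OffsetManager code or the unit test in this patch) of the two invariants
discussed above: commits are accepted only for topics present in the metadata
cache, and stored offsets are purged once a topic disappears from it.

  import scala.collection.concurrent.TrieMap

  class OffsetStore(knownTopics: () => Set[String]) {
    private val offsets = TrieMap.empty[(String, Int), Long]

    // Mirrors the check mentioned above: reject commits for topics the
    // metadata cache does not know about.
    def commit(topic: String, partition: Int, offset: Long): Boolean =
      if (knownTopics().contains(topic)) { offsets.put((topic, partition), offset); true }
      else false

    def fetch(topic: String, partition: Int): Option[Long] = offsets.get((topic, partition))

    // Purge stored offsets for topics that no longer exist (e.g. after topic deletion).
    def purgeDeletedTopics(): Unit = {
      val live = knownTopics()
      offsets.keys.filterNot { case (t, _) => live.contains(t) }.foreach(offsets.remove)
    }
  }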


- Sriharsha


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32650/#review81413
---

