[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934469#comment-13934469 ]

korebantic2 commented on KAFKA-657:

All, I'm looking to update a client to support this new addition to 0.8.1. I just wanted to check in: does the guide here reflect the latest state of the protocol?

https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommit/FetchAPI

It hasn't been updated since December, so I just wanted to make sure.

Add an API to commit offsets

Key: KAFKA-657
URL: https://issues.apache.org/jira/browse/KAFKA-657
Project: Kafka
Issue Type: New Feature
Reporter: Jay Kreps
Labels: project
Fix For: 0.8.1
Attachments: KAFKA-657v1.patch, KAFKA-657v2.patch, KAFKA-657v3.patch, KAFKA-657v4.patch, KAFKA-657v5.patch, KAFKA-657v6.patch, KAFKA-657v7.patch, KAFKA-657v8.patch

Currently the consumer directly writes their offsets to zookeeper. Two problems with this: (1) This is a poor use of zookeeper, and we need to replace it with a more scalable offset store, and (2) it makes it hard to carry over to clients in other languages. A first step towards accomplishing that is to add a proper Kafka API for committing offsets. The initial version of this would just write to zookeeper as we do today, but in the future we would then have the option of changing this. This api likely needs to take a sequence of consumer-group/topic/partition/offset entries and commit them all. It would be good to do a wiki design on how this would work and consensus on that first.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544261#comment-13544261 ]

Neha Narkhede commented on KAFKA-657:

+1 on v8

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544322#comment-13544322 ]

Jun Rao commented on KAFKA-657:

v8 looks good. Just some minor comments.

80. OffsetCommitTest.testCommitAndFetchOffsets(): Could you remove the commented-out code? Also, remove unused imports.
81. We are trying to standardize the config names in KAFKA-648. Should we rename offset.metadata.max.size to offset.metadata.max.bytes?
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544354#comment-13544354 ]

David Arthur commented on KAFKA-657:

Jun,

80. Those commented-out tests will be valid once the metadata is actually stored. I left them in to save someone the effort later on.
81. +1

Since this patch has been committed, maybe you or Jay can just make these changes on trunk.
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544370#comment-13544370 ]

Jay Kreps commented on KAFKA-657:

I updated the property name. Normally I am against commenting out code since it is in version control anyway, but in this case those tests are actually useful and not in version control, so it probably makes sense to leave them.
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543277#comment-13543277 ]

Jay Kreps commented on KAFKA-657:

Yeah, that is the right place for a new config. It is worth discussing the name as part of the review since this ends up being kind of part of our api to the operator.

I would just skip storing the metadata for now (i.e. just throw it away). If we make the change in zk we need a script to grandfather from the old format to the new format. Since we will need a conversion script when we move off zk anyway, it makes sense to avoid two conversions for users (one to add the metadata, and another to move off zk).

Attachments: KAFKA-657v1.patch, KAFKA-657v2.patch, KAFKA-657v3.patch, KAFKA-657v4.patch, KAFKA-657v5.patch, KAFKA-657v6.patch, KAFKA-657v7.patch
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543448#comment-13543448 ]

David Arthur commented on KAFKA-657:

I have offset.metadata.max.size for the time being, with a default of 1024.

Attachments: KAFKA-657v1.patch, KAFKA-657v2.patch, KAFKA-657v3.patch, KAFKA-657v4.patch, KAFKA-657v5.patch, KAFKA-657v6.patch, KAFKA-657v7.patch
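As an illustration of what a limit like offset.metadata.max.size could guard, here is a minimal sketch of a size check on the metadata string. The object name `MetadataLimit`, the method `isWithinLimit`, and the choice to measure the limit in UTF-8 bytes are assumptions made for this example; the thread only states the config name and its default of 1024.

```scala
import java.nio.charset.StandardCharsets

object MetadataLimit {
  // Default taken from the comment above; treating it as a byte count is an assumption.
  val DefaultMaxMetadataSize: Int = 1024

  // Returns true when the metadata is absent or its UTF-8 encoding fits within maxSize.
  def isWithinLimit(metadata: String, maxSize: Int = DefaultMaxMetadataSize): Boolean =
    metadata == null || metadata.getBytes(StandardCharsets.UTF_8).length <= maxSize
}
```

A broker-side handler could call such a check per partition entry and return an error code for entries that exceed the limit, rather than rejecting the whole request.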
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537062#comment-13537062 ]

David Arthur commented on KAFKA-657:

Jay, the one thing I'm still unclear on is the various failure scenarios. Could you double-check that bit of the patch (in KafkaApis.scala)?

Attachments: KAFKA-657v1.patch, KAFKA-657v2.patch, KAFKA-657v3.patch
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537237#comment-13537237 ]

Jun Rao commented on KAFKA-657:

I was trying to apply patch v3 in trunk (on a revision before the 0.8 merge patch), but got the following error. Does anyone know what the issue is?

git apply -p0 ~/Downloads/KAFKA-657v3.patch
fatal: git apply: bad git-diff - inconsistent new filename on line 5

Attachments: KAFKA-657v1.patch, KAFKA-657v2.patch, KAFKA-657v3.patch
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537327#comment-13537327 ]

Jun Rao commented on KAFKA-657:

Thanks for patch v4. Some comments:

40. Could you add the Apache license header to all new files?
41. SimpleConsumer is a public API, so we need to add the new requests to the javaapi version of SimpleConsumer. We likely need a java version of the new requests/responses.
42. KafkaApis.handle(): Currently, for each type of request, we catch all unexpected exceptions and send a corresponding response with an error code to the client. We need to do this for the two new request types too.
43. Do we plan to use the new API to commit offsets in the high level consumer?

Attachments: KAFKA-657v1.patch, KAFKA-657v2.patch, KAFKA-657v3.patch, KAFKA-657v4.patch
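Point 42 above can be pictured with a small sketch: every per-partition write is wrapped so that an unexpected exception becomes an error code in the response instead of propagating. This is not the code from the patch. `CommitErrorHandling`, `PartitionResult`, `commitAll`, and the two error-code constants are hypothetical names and values chosen for the example, and the `write` function stands in for whatever actually stores the offset.

```scala
import scala.util.control.NonFatal

object CommitErrorHandling {
  // Illustrative error codes; the real values are defined by the broker, not here.
  val NoError: Short = 0
  val UnknownError: Short = -1

  case class PartitionResult(topic: String, partition: Int, errorCode: Short)

  // Attempts every (topic, partition, offset) write and records one result per entry.
  // A failure on one entry does not prevent the remaining entries from being tried.
  def commitAll(entries: Seq[(String, Int, Long)])(write: (String, Int, Long) => Unit): Seq[PartitionResult] =
    entries.map { case (topic, partition, offset) =>
      try {
        write(topic, partition, offset)
        PartitionResult(topic, partition, NoError)
      } catch {
        // Any non-fatal exception (for example from the offset store) maps to the unknown code.
        case NonFatal(_) => PartitionResult(topic, partition, UnknownError)
      }
    }
}
```

Reporting an error code per entry lets a client retry only the partitions that failed.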
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537368#comment-13537368 ]

Jay Kreps commented on KAFKA-657:

It probably makes sense to do this in two phases. Let's get the patch in that adds the api, then make the changes to the consumers to make use of it as phase 2.

Attachments: KAFKA-657v1.patch, KAFKA-657v2.patch, KAFKA-657v3.patch, KAFKA-657v4.patch
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534122#comment-13534122 ]

Jay Kreps commented on KAFKA-657:

This looks great! Three minor things:

1. Can you change the logging for the common case to debug? Our logging policy is that you should be able to run in INFO and have all messages be things you need to know.
2. Can you handle any exceptions from ZK and send back an UnknownException?
3. Can you remove the checks on topic/partition validity?

(3) is maybe more controversial. Here is my rationale. First, ZK is a huge bottleneck, so adding two more zk round-trips will be a problem. Second, we actually have two use cases for allowing the user to store offsets for nonexistent topics or partitions.

The first use case is that if you are doing mirroring between two clusters in different data centers (a common case), it probably makes sense to store the offsets in whatever data center the mirroring process runs in. However, there is no requirement that the two clusters have the same partitioning.

The second use case is probably specific only to our usage: we have several systems that produce offset-like markers, and being able to commit these all together to mark a single point in consumption time across all systems is nice.

Attachments: KAFKA-657v1.patch, KAFKA-657v2.patch
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534151#comment-13534151 ]

David Arthur commented on KAFKA-657:

Re 3: Maybe this is a case for the check-and-set functionality I originally proposed. The default case could update ZK with no checks (which would cover your two use cases), and a special case could do the check as well as check the last offset stored for a conditional update. Thoughts?

Attachments: KAFKA-657v1.patch, KAFKA-657v2.patch
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534162#comment-13534162 ]

Jay Kreps commented on KAFKA-657:

I think that actually covers an orthogonal problem, right?

1. Checking topic/partition covers bugs in the client impl that set the wrong values.
2. Check and set catches a bug that might lead to you clobbering your offset due to a concurrency issue where there are two processes both trying to update the same offset.

Originally my concern with (2) was that I wasn't sure if we could implement it in a post-zk world. Now that we wrote up that proposal in a lot more detail, I think we can. We wouldn't want to make the last offset mandatory, because in the case that you are manually resetting your offset to 0 (or some low number) you might not know the previous value. But I think what you are proposing is that we could have a current_offset field in the request, and if it is set we would only update the offset if the current offset equals the given offset. We could make it optional by having the value -1 indicate "don't care, clobber whatever is there".

The question is, what is the use case for this? Our approach to the scala client has been to ensure mutual exclusion for the consumer processes, at which point this basically can't happen. I wonder how an alternative client implementation could make use of it? It would be good to work that out before including it.

Attachments: KAFKA-657v1.patch, KAFKA-657v2.patch
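The optional check-and-set described in the comment above can be sketched against an in-memory map. This is only an illustration of the semantics under discussion (a current_offset that must match, with -1 meaning "don't care"), not the API that was committed. `OffsetCas`, `OffsetStore`, `commit`, and `fetch` are hypothetical names, and the map stands in for ZooKeeper or any later offset store.

```scala
object OffsetCas {
  // Sentinel from the discussion: -1 means "don't care, overwrite whatever is there".
  val DontCare: Long = -1L

  // Hypothetical in-memory stand-in for the offset store.
  final class OffsetStore {
    private val offsets = scala.collection.mutable.Map.empty[(String, String, Int), Long]

    // Writes newOffset unconditionally when expectedOffset is DontCare; otherwise writes
    // only if the stored offset equals expectedOffset. Returns whether the write happened.
    def commit(group: String, topic: String, partition: Int,
               newOffset: Long, expectedOffset: Long = DontCare): Boolean = synchronized {
      val key = (group, topic, partition)
      if (expectedOffset == DontCare || offsets.get(key).contains(expectedOffset)) {
        offsets(key) = newOffset
        true
      } else false
    }

    def fetch(group: String, topic: String, partition: Int): Option[Long] = synchronized {
      offsets.get((group, topic, partition))
    }
  }
}
```

With this shape, a manual reset to 0 simply omits the expected offset, while a client worried about concurrent committers passes the last offset it read.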
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530174#comment-13530174 ]

David Arthur commented on KAFKA-657:

Thanks [~jkreps], that clears things up quite a bit. Another question I have is around the request envelope (clientId, correlationId, etc). I understand correlationId is used to allow multiplexing requests/responses, but what about replicaId, clientId, etc.? I mostly copied these from other Request classes, a bit of cargo-cult programming I guess :)

{code}
val versionId = buffer.getShort
val correlationId = buffer.getInt
val clientId = readShortString(buffer)
val replicaId = buffer.getInt
{code}

Are all of these necessary for the OffsetCommitRequest/Response? Specifically, is replicaId necessary?

Attachments: KAFKA-657v1.patch
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530700#comment-13530700 ]

Jay Kreps commented on KAFKA-657:

I wonder if you could take a look at the updated docs and see if they seem clear. I tried to cover those, but, well, documentation is hard:
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol

Summary:

version id is the version of this api format. In the future, if we decide we missed an important field (e.g. lastOffset), we will add it, bump the version number, and handle both cases on the server side based on the version we see.

client id is a logical name for the client that could be used across many client servers. This is useful for logging and metrics (i.e. figuring out WHY you are suddenly getting 5x the qps, or whatever) if you have lots of clients.

replica id is just in the fetch request and shouldn't be in this request. A fetch request can be issued either by a normal consumer or by a replica, and the broker has slightly different behavior in each case (e.g. whether uncommitted messages are visible...)

Attachments: KAFKA-657v1.patch
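Following the summary above, the common part of the request would carry versionId, correlationId, and clientId, with replicaId left to the fetch request. A minimal sketch of reading and writing those three fields with java.nio.ByteBuffer is below. `RequestHeader`, `readFrom`, and `writeTo` are hypothetical names for this example, and the short-string encoding (a 2-byte length followed by UTF-8 bytes) is an assumption about what `readShortString` in the earlier snippet does.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

// Hypothetical holder for the common request fields discussed above (no replicaId).
case class RequestHeader(versionId: Short, correlationId: Int, clientId: String)

object RequestHeader {
  // Assumed encoding: 2-byte length prefix followed by that many UTF-8 bytes.
  def readShortString(buffer: ByteBuffer): String = {
    val size = buffer.getShort().toInt
    val bytes = new Array[Byte](size)
    buffer.get(bytes)
    new String(bytes, StandardCharsets.UTF_8)
  }

  def writeShortString(buffer: ByteBuffer, s: String): Unit = {
    val bytes = s.getBytes(StandardCharsets.UTF_8)
    buffer.putShort(bytes.length.toShort)
    buffer.put(bytes)
  }

  // Reads the fields in the same order as the snippet in the thread, minus replicaId.
  def readFrom(buffer: ByteBuffer): RequestHeader = {
    val versionId = buffer.getShort()
    val correlationId = buffer.getInt()
    val clientId = readShortString(buffer)
    RequestHeader(versionId, correlationId, clientId)
  }

  def writeTo(buffer: ByteBuffer, header: RequestHeader): Unit = {
    buffer.putShort(header.versionId)
    buffer.putInt(header.correlationId)
    writeShortString(buffer, header.clientId)
  }
}
```

Reading versionId first is what would let a server branch on the version before parsing any fields added in a later revision.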
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529583#comment-13529583 ]

Jay Kreps commented on KAFKA-657:

This looks great. To confirm, the final format for the commit response is
group [topic [partition offset]]

I think logically there are two phases of work around fixing offset management:

1. Add the API and convert the consumer to use it
   a. CommitOffsetRequest/Response (to save your position)
   b. FetchOffsetRequest/Response (to read back a saved position)
   c. Integration into the consumer (using the new api in the scala client)
   d. Unit test coverage for these (say in kafka.integration.PrimitiveApiTest)
2. Move offsets out of zookeeper, since zookeeper doesn't scale well for writes

It would be nice to do (1) more or less together, and if we do it right (2) can be a follow-up item and need not be done by you unless you want to. We can definitely break (1) into successive patches if that is helpful to keep the individual changes small. I am happy to take what you have now if you are up for finishing the other items in (1).

I would like to get people to brainstorm a little on (2) in parallel, as it could potentially have some impact on (1). We have some time to fiddle with the API if we think of improvements before it would be released and we would have to start versioning changes, though, so we probably don't need to block on that.

So let me know if you are up for the rest of the items in (1).

Attachments: KAFKA-657v1.patch
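The nested layout named in the comment above, group [topic [partition offset]], can be pictured as one group holding a list of topics, each holding a list of partition/offset pairs. The sketch below builds that shape from flat entries. The case classes and the `nest` helper are hypothetical and only illustrate the nesting; they are not the request or response classes from the patch, and the sorting is there just to make the output order predictable.

```scala
object CommitLayout {
  case class PartitionOffset(partition: Int, offset: Long)
  case class TopicOffsets(topic: String, partitions: Seq[PartitionOffset])
  case class GroupOffsets(group: String, topics: Seq[TopicOffsets])

  // Builds the nested layout from flat (topic, partition, offset) entries for one group.
  def nest(group: String, entries: Seq[(String, Int, Long)]): GroupOffsets = {
    val topics = entries
      .groupBy(_._1)
      .toSeq
      .sortBy(_._1)
      .map { case (topic, es) =>
        TopicOffsets(topic, es.map(e => PartitionOffset(e._2, e._3)).sortBy(_.partition))
      }
    GroupOffsets(group, topics)
  }
}
```

Grouping by topic means the topic name appears once per topic rather than once per partition entry.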
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13529589#comment-13529589 ]

Jay Kreps commented on KAFKA-657:
---------------------------------

Oh yes, two other things:

1. We don't have a response in this api yet. We should at least have a way to indicate if the request failed (e.g. we got an error writing to zk).
2. Would be good to replace the println with a proper log statement.

Add an API to commit offsets
    Key: KAFKA-657
    URL: https://issues.apache.org/jira/browse/KAFKA-657
    Project: Kafka
    Issue Type: New Feature
    Reporter: Jay Kreps
    Labels: project
    Attachments: KAFKA-657v1.patch
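The two points above (report failures in a response, log instead of printing) could look roughly like the following. This is a sketch under assumptions, not the patch: `CommitResult`, `commit_offsets`, the `store` callback standing in for the zookeeper write, and the error code values are all hypothetical names invented for the example.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("offset_commit_sketch")

# Hypothetical error codes for this sketch; not Kafka's actual code table.
NO_ERROR = 0
UNKNOWN_ERROR = -1


@dataclass
class CommitResult:
    topic: str
    partition: int
    error_code: int


def commit_offsets(store, group, offsets):
    """Try to store each offset and report a per-partition error code.

    `store` is any callable (group, topic, partition, offset) that raises
    on failure; it stands in for the write to the offset store.
    `offsets` maps (topic, partition) -> offset.
    """
    results = []
    for (topic, partition), offset in offsets.items():
        try:
            store(group, topic, partition, offset)
            results.append(CommitResult(topic, partition, NO_ERROR))
        except Exception:
            # A log statement with context (and the traceback) instead of a print.
            log.exception(
                "Failed to commit offset %d for %s/%s-%d",
                offset, group, topic, partition,
            )
            results.append(CommitResult(topic, partition, UNKNOWN_ERROR))
    return results
```

Returning one result per partition lets a client retry only the entries that failed, though whether the real response should be that fine-grained is a design question for the wiki.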
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13529613#comment-13529613 ]

David Arthur commented on KAFKA-657:
------------------------------------

I'm fine working on the rest of (1). 1a is simple enough; I might need some direction on 1c. For 1b, how exactly is it different from the existing Offsets API? I have never really been clear on what the purpose of the old API is.

Add an API to commit offsets
    Key: KAFKA-657
    URL: https://issues.apache.org/jira/browse/KAFKA-657
    Project: Kafka
    Issue Type: New Feature
    Reporter: Jay Kreps
    Labels: project
    Attachments: KAFKA-657v1.patch
[jira] [Commented] (KAFKA-657) Add an API to commit offsets
[ https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13529617#comment-13529617 ]

Jay Kreps commented on KAFKA-657:
---------------------------------

The existing API describes the offset ranges contained in log segments on the server. It is poorly named: we should really rename it to something like LogMetadataRequest, and generalize it a bit to include things other than the starting offsets of segments.

The intended use case for the existing API is new consumers that are consuming for the first time from an existing stream. When they first start, they have no position in the log to read from (or to save out using your new api). To start consuming they need a valid offset to start at. Which offsets are valid depends on what is available on the server, so they need to be able to ask the server "what offset ranges do you have?" and then choose to start consuming at the beginning or end of that range (or somewhere in the middle).

Your api, on the other hand, answers the question "what is the latest offset I have 'committed' (i.e. recorded as consumed)?" This would be used when a restart or rebalancing of the consumers occurs.

Hope that makes sense? We could rename the existing API as part of this change to avoid the muddle.

Add an API to commit offsets
    Key: KAFKA-657
    URL: https://issues.apache.org/jira/browse/KAFKA-657
    Project: Kafka
    Issue Type: New Feature
    Reporter: Jay Kreps
    Labels: project
    Attachments: KAFKA-657v1.patch
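To make the distinction between the two APIs concrete, here is a minimal sketch of how a consumer might pick its starting offset. It assumes the answers from both APIs are already in hand as plain values; the function name `choose_start_offset`, its parameters, and the "beginning"/"end" option names are hypothetical and do not come from Kafka or the patch.

```python
def choose_start_offset(committed_offset, segment_start_offsets,
                        log_end_offset, start_from="beginning"):
    """Pick the offset a consumer should start reading from.

    committed_offset      -- answer from the committed-offset fetch
                             (None if this group never committed)
    segment_start_offsets -- answer from the existing offsets API:
                             first offset of each log segment on the server
    log_end_offset        -- offset just past the last message on the server
    start_from            -- where a brand-new consumer begins:
                             "beginning" or "end" (hypothetical option names)
    """
    # Restart or rebalance: resume from the saved position.
    if committed_offset is not None:
        return committed_offset
    # First-time consumer: no saved position, so pick from what the server has.
    if start_from == "beginning":
        return min(segment_start_offsets)
    if start_from == "end":
        return log_end_offset
    raise ValueError("start_from must be 'beginning' or 'end'")


if __name__ == "__main__":
    segments = [0, 1000, 2000]
    print(choose_start_offset(None, segments, 2500))           # new consumer
    print(choose_start_offset(None, segments, 2500, "end"))    # new consumer, tail
    print(choose_start_offset(1234, segments, 2500))           # restarted consumer
```

The sketch deliberately ignores the case where a committed offset has since fallen outside the range the server still holds; handling that would need both answers together.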