[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2014-03-13 Thread korebantic2 (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934469#comment-13934469
 ] 

korebantic2 commented on KAFKA-657:
---

All,

I'm looking to update a client to support this new addition to 0.8.1. I just 
wanted to check in: does the guide here reflect the latest version of the protocol? 

https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommit/FetchAPI

It hasn't been updated since December, so I just wanted to make sure.




 Add an API to commit offsets
 

 Key: KAFKA-657
 URL: https://issues.apache.org/jira/browse/KAFKA-657
 Project: Kafka
  Issue Type: New Feature
Reporter: Jay Kreps
  Labels: project
 Fix For: 0.8.1

 Attachments: KAFKA-657v1.patch, KAFKA-657v2.patch, KAFKA-657v3.patch, 
 KAFKA-657v4.patch, KAFKA-657v5.patch, KAFKA-657v6.patch, KAFKA-657v7.patch, 
 KAFKA-657v8.patch


 Currently the consumer writes its offsets directly to ZooKeeper. There are 
 two problems with this: (1) it is a poor use of ZooKeeper, and we need to 
 replace it with a more scalable offset store, and (2) it makes offset 
 management hard to carry over to clients in other languages. A first step 
 towards fixing both is to add a proper Kafka API for committing offsets. The 
 initial version would just write to ZooKeeper as we do today, but in the 
 future we would have the option of changing this.
 This API likely needs to take a sequence of 
 consumer-group/topic/partition/offset entries and commit them all.
 It would be good to write up a design on the wiki for how this would work and 
 reach consensus on that first.
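
The shape described above, a sequence of consumer-group/topic/partition/offset entries committed together, might be sketched roughly as follows. All names here (OffsetEntry, OffsetCommitSketch, commitAll, fetch) are hypothetical and are not taken from the patch; an in-memory map stands in for whatever offset store the broker would actually use.

```scala
// Hypothetical names throughout; this is not the request class from the patch.
final case class OffsetEntry(group: String, topic: String, partition: Int, offset: Long)

object OffsetCommitSketch {
  // Stand-in for the offset store: (group, topic, partition) -> offset.
  private val store = scala.collection.mutable.Map.empty[(String, String, Int), Long]

  // Commit every entry in the sequence; a later entry for the same key wins.
  def commitAll(entries: Seq[OffsetEntry]): Unit =
    entries.foreach(e => store((e.group, e.topic, e.partition)) = e.offset)

  // Read back a previously committed offset, if any.
  def fetch(group: String, topic: String, partition: Int): Option[Long] =
    store.get((group, topic, partition))
}
```

A caller would pass all entries in one call, for example commitAll(Seq(OffsetEntry("g1", "t", 0, 42L))), and read a position back with fetch("g1", "t", 0).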



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2013-01-04 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544261#comment-13544261
 ] 

Neha Narkhede commented on KAFKA-657:
-

+1 on v8


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2013-01-04 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544322#comment-13544322
 ] 

Jun Rao commented on KAFKA-657:
---

v8 looks good. Just some minor comments.

80. OffsetCommitTest.testCommitAndFetchOffsets(): Could you remove the 
commented out code? Also, remove unused imports.

81. We are trying to standardize the config names in KAFKA-648. Should we 
rename offset.metadata.max.size to offset.metadata.max.bytes?



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2013-01-04 Thread David Arthur (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544354#comment-13544354
 ] 

David Arthur commented on KAFKA-657:


Jun, 

80. Those commented-out tests will be valid once the metadata is actually 
stored. I left them in to save someone the effort later on.

81. +1

Since this patch has been committed, maybe you or Jay can just make these 
changes on trunk.



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2013-01-04 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544370#comment-13544370
 ] 

Jay Kreps commented on KAFKA-657:
-

I updated the property name. Normally I am against commenting out code since it 
is in version control anyway, but in this case those tests are actually useful 
and not in version control, so it probably makes sense to leave them.



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2013-01-03 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543277#comment-13543277
 ] 

Jay Kreps commented on KAFKA-657:
-

Yeah that is the right place for a new config. It is worth discussing the name 
as part of the review since this ends up being kind of part of our api to the 
operator.

I would just skip storing the metadata for now (i.e. just throw it away). If we 
make the change in zk we need a script to grandfather from the old format to 
the new format. Since we will need a conversion script when we move off zk 
anyway it makes sense to avoid two conversions for users (one to add the 
metadata, and another to move off zk).



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2013-01-03 Thread David Arthur (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543448#comment-13543448
 ] 

David Arthur commented on KAFKA-657:


For the time being I have offset.metadata.max.size with a default of 1024.
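
A minimal sketch of how such a limit might be enforced when a commit arrives. The object and method names are hypothetical, and measuring the limit in UTF-8 bytes (rather than characters) is an assumption, not something the comment above states.

```scala
object OffsetMetadataCheck {
  // Default taken from the comment above.
  val DefaultMaxMetadataSize = 1024

  // Returns true when the metadata fits within the limit.
  // Assumption: the size is counted in UTF-8 bytes, and absent (null) metadata always fits.
  def fits(metadata: String, maxSize: Int = DefaultMaxMetadataSize): Boolean =
    metadata == null || metadata.getBytes("UTF-8").length <= maxSize
}
```

A broker-side handler could call fits on each entry and reject the ones that exceed the limit instead of storing them.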




[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2012-12-20 Thread David Arthur (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537062#comment-13537062
 ] 

David Arthur commented on KAFKA-657:


Jay, the one thing I'm still unclear on is the handling of the various failure 
scenarios. Could you double-check that bit of the patch (in KafkaApis.scala)?



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2012-12-20 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537237#comment-13537237
 ] 

Jun Rao commented on KAFKA-657:
---

I was trying to apply patch v3 in trunk (on a revision before the 0.8 merge 
patch), but got the following error. Anyone know what the issue is?

git apply -p0 ~/Downloads/KAFKA-657v3.patch
fatal: git apply: bad git-diff - inconsistent new filename on line 5




[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2012-12-20 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537327#comment-13537327
 ] 

Jun Rao commented on KAFKA-657:
---

Thanks for patch v4. Some comments:

40. Could you add the Apache license header to all new files?

41. SimpleConsumer is a public API. So we need to add the new requests to the 
javaapi version of SimpleConsumer. We likely need a java version of the new 
requests/responses.

42. KafkaApis.handle(): Currently, for each type of request, we catch all 
unexpected exceptions and send a corresponding response with an error code to 
the client. We need to do this for the two new request types too.

43. Do we plan to use the new API to commit offsets in the high level consumer?
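
The pattern in item 42 could be sketched roughly like this. It is not the code in KafkaApis.handle(); the response type, the handler signature, and the numeric codes are illustrative assumptions.

```scala
import scala.util.control.NonFatal

object HandlerSketch {
  // Hypothetical codes for illustration; not read from Kafka's error-code table.
  val NoError: Short = 0
  val UnknownError: Short = -1

  final case class CommitResponse(errorCode: Short)

  // Runs the commit action and converts any unexpected exception into an
  // error response, so the client always receives a reply.
  def handleCommit(action: () => Unit): CommitResponse =
    try {
      action()
      CommitResponse(NoError)
    } catch {
      case NonFatal(_) => CommitResponse(UnknownError)
    }
}
```

The same wrapper shape would apply to the fetch-offset request, with its own response type.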



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2012-12-20 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537368#comment-13537368
 ] 

Jay Kreps commented on KAFKA-657:
-

It probably makes sense to do this in two phases. Let's get the patch in that 
adds the api, then make the changes to the consumers to make use of it as phase 
2.



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2012-12-17 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534122#comment-13534122
 ] 

Jay Kreps commented on KAFKA-657:
-

This looks great!

Three minor things:
1. Can you change the logging for the common case to debug? Our logging policy 
is that you should be able to run in INFO and have all messages be things you 
need to know.
2. Can you handle any exceptions from ZK and send back an UnknownException?
3. Can you remove the checks on topic/partition validity?

(3) is maybe more controversial. Here is my rationale. First, ZK is a huge 
bottleneck, so adding two more ZK round-trips will be a problem. Second, we 
actually have two use cases for allowing the user to store offsets for 
nonexistent topics or partitions. The first use case is that if you are doing 
mirroring between two clusters in different data centers (a common case), it 
probably makes sense to store the offsets in whatever data center the mirroring 
process runs in. However, there is no requirement that the two clusters have 
the same partitioning. The second use case is probably specific to our usage: 
we have several systems that produce offset-like markers, and being able to 
commit these all together to mark a single point in consumption time across 
all systems is nice. 



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2012-12-17 Thread David Arthur (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534151#comment-13534151
 ] 

David Arthur commented on KAFKA-657:


Re 3: Maybe this is a case for the check-and-set functionality I originally 
proposed. The default case could update ZK with no checks (which would cover 
your two use cases), and a special case could do the check as well as check the 
last offset stored for a conditional update. Thoughts?



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2012-12-17 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534162#comment-13534162
 ] 

Jay Kreps commented on KAFKA-657:
-

I think that actually covers an orthogonal problem, right?
1. Checking topic/partition covers bugs in the client impl that set the wrong 
values.
2. Check and set catches a bug that might lead to you clobbering your offset 
due to a concurrency issue where there are two processes both trying to update 
the same offset.

Originally my concern with (2) was that I wasn't sure if we could implement it 
in a post-zk world. Now that we wrote up that proposal in a lot more detail I 
think we can.

We wouldn't want to make the last offset mandatory because in the case that you 
are manually resetting your offset to 0 (or some low number) you might not know 
the previous value. But I think what you are proposing is that we could have a 
current_offset field in the request, and if it is set we would only update the 
offset if the current offset equals the given offset. We could make it optional 
by having the value -1 indicate don't care, clobber whatever is there.
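
A rough sketch of the conditional update described above, with an in-memory map standing in for the offset store. The names are hypothetical; the only detail taken from the comment is that -1 means "don't care". Returning false when an expected value is given but nothing is stored yet is an assumption.

```scala
object CheckAndSetSketch {
  // Sentinel from the comment above: -1 means "don't care, clobber whatever is there".
  val DontCare: Long = -1L

  // Stand-in for the offset store, keyed by an opaque group/topic/partition string.
  private val offsets = scala.collection.mutable.Map.empty[String, Long]

  // Writes newOffset only if expected is DontCare or equals the stored value.
  // Returns true when the write happened. With a concrete expected value and
  // nothing stored yet, the write is refused (an assumption of this sketch).
  def commit(key: String, newOffset: Long, expected: Long = DontCare): Boolean = synchronized {
    val matches = expected == DontCare || offsets.get(key).contains(expected)
    if (matches) offsets(key) = newOffset
    matches
  }
}
```

A manual reset to 0 would call commit(key, 0L) and leave expected at its default, which covers the case where the previous value is not known.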

The question is, what is the use case for this? Our approach to the scala 
client has been to ensure mutual exclusion for the consumer processes, at which 
point this basically can't happen. I wonder how an alternative client 
implementation could make use of it? It would be good to work that out before 
including it.



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2012-12-12 Thread David Arthur (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530174#comment-13530174
 ] 

David Arthur commented on KAFKA-657:


Thanks [~jkreps], that clears things up quite a bit. Another question I have is 
around the request envelope (clientId, correlationId, etc).

I understand correlationId is used to allow multiplexing requests/responses, 
but what about replicaId, clientId, etc.? I mostly copied these from other 
Request classes - a bit of cargo-cult programming I guess :)

{code}
val versionId = buffer.getShort
val correlationId = buffer.getInt
val clientId = readShortString(buffer)
val replicaId = buffer.getInt
{code}

Are all of these necessary for the OffsetCommitRequest/Response? Specifically, 
is replicaId necessary?



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2012-12-12 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530700#comment-13530700
 ] 

Jay Kreps commented on KAFKA-657:
-

I wonder if you could take a look at the updated docs and see if they seem 
clear. I tried to cover those, but, well, documentation is hard: 
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol

Summary:
version id is the version of this api format. In the future if we decide we 
missed an important field (e.g. lastOffset) we will add it and bump the version 
number and handle both cases on the server side based on the version we see.
client id is a logical name for the client that could be used across many 
client servers. This is useful for logging and metrics (i.e. figuring out WHY 
you are suddenly getting 5x the qps, or whatever) if you have lots of clients.
replica id is just in the fetch request and shouldn't be in this request. A 
fetch request can be issued either by a normal consumer or by a replica and the 
broker has slightly different behavior in each case (e.g. whether uncommitted 
messages are visible...)
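
Following the summary above, the envelope for the offset commit request would carry version id, correlation id and client id, but no replica id. A sketch of reading those three fields is below. The Header type and the helper bodies are assumptions for illustration; in particular, the short-string encoding (2-byte length followed by UTF-8 bytes) is assumed here and is not copied from Kafka's own readShortString.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

object RequestHeaderSketch {
  final case class Header(versionId: Short, correlationId: Int, clientId: String)

  // Assumed encoding: a 2-byte length followed by that many UTF-8 bytes.
  def writeShortString(buffer: ByteBuffer, s: String): Unit = {
    val bytes = s.getBytes(StandardCharsets.UTF_8)
    buffer.putShort(bytes.length.toShort)
    buffer.put(bytes)
  }

  def readShortString(buffer: ByteBuffer): String = {
    val bytes = new Array[Byte](buffer.getShort())
    buffer.get(bytes)
    new String(bytes, StandardCharsets.UTF_8)
  }

  // Reads versionId, correlationId and clientId in that order; no replicaId.
  def readHeader(buffer: ByteBuffer): Header =
    Header(buffer.getShort(), buffer.getInt(), readShortString(buffer))
}
```

A server that later supports a second version of the format could branch on versionId right after readHeader and parse the remaining fields accordingly.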





[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2012-12-11 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529583#comment-13529583
 ] 

Jay Kreps commented on KAFKA-657:
-

This looks great. To confirm, the final format for the commit response is 
 group [topic [partition offset]]
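
One way to picture the nesting group [topic [partition offset]] is as three small types, with a helper that groups flat entries into that shape. The type and method names are hypothetical and the sort order is only there to make the result deterministic; none of this is taken from the patch.

```scala
object NestedFormatSketch {
  // group [topic [partition offset]]: one group, its topics, and each topic's
  // (partition, offset) pairs.
  final case class PartitionOffset(partition: Int, offset: Long)
  final case class TopicOffsets(topic: String, partitions: Seq[PartitionOffset])
  final case class GroupOffsets(group: String, topics: Seq[TopicOffsets])

  // Builds the nested shape from flat (topic, partition, offset) triples for
  // a single group, sorted by topic name and then by partition.
  def nest(group: String, flat: Seq[(String, Int, Long)]): GroupOffsets = {
    val topics = flat.groupBy(_._1).toSeq.sortBy(_._1).map { case (topic, rows) =>
      TopicOffsets(topic, rows.map(r => PartitionOffset(r._2, r._3)).sortBy(_.partition))
    }
    GroupOffsets(group, topics)
  }
}
```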

I think logically there are two phases of work around fixing offset management:
1. Add the API and convert the consumer to use it
   a. CommitOffsetRequest/Response (to save your position)
   b. FetchOffsetRequest/Response (to read back a saved position)
   c. Integration into the consumer (using the new api in the scala client)
   d. Unit test coverage for these (say in kafka.integration.PrimitiveApiTest)
2. Move offsets out of zookeeper, since zookeeper doesn't scale well for writes

It would be nice to do (1) more or less together, and if we do it right (2) can 
be a follow-up item and need not be done by you unless you want to. We can 
definitely break (1) into successive patches if that is helpful to keep the 
individual changes small--I am happy to take what you have now if you are up 
for finishing the other items in (1). I would like to get people to brainstorm 
a little on (2) in parallel as it could potentially have some impact on (1). We 
have some time to adjust the API if we think of improvements before it is 
released and we have to start versioning changes, so we probably don't need to 
block on that.

So let me know if you are up for the rest of the items in (1)
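
As a rough illustration of the nested layout above (group [topic [partition offset]]) and of items 1a and 1b, here is a minimal Python sketch. The names OffsetStore, commit_offsets and fetch_offset are hypothetical and are not part of Kafka; a plain dict stands in for zookeeper, and nothing here reflects the actual wire format.

```python
from typing import Dict, Optional

# Hypothetical shape: topic -> partition -> offset,
# mirroring "group [topic [partition offset]]".
TopicOffsets = Dict[str, Dict[int, int]]


class OffsetStore:
    """In-memory stand-in for the offset store (zookeeper in the initial version)."""

    def __init__(self) -> None:
        # group -> topic -> partition -> offset
        self._offsets: Dict[str, TopicOffsets] = {}

    def commit_offsets(self, group: str, offsets: TopicOffsets) -> None:
        """Save positions for one consumer group (item 1a)."""
        group_offsets = self._offsets.setdefault(group, {})
        for topic, partitions in offsets.items():
            group_offsets.setdefault(topic, {}).update(partitions)

    def fetch_offset(self, group: str, topic: str, partition: int) -> Optional[int]:
        """Read back a saved position, or None if nothing was committed (item 1b)."""
        return self._offsets.get(group, {}).get(topic, {}).get(partition)
```

Keeping the store behind two small methods is what would let phase (2) swap zookeeper for another backend without changing callers.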



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2012-12-11 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13529589#comment-13529589
 ] 

Jay Kreps commented on KAFKA-657:
-

Oh yes, two other things:
1. We don't have a response in this api yet. We should at least have a way to 
indicate if the request failed (e.g. we got an error writing to zk).
2. Would be good to replace the println with a proper log statement.
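
Both points can be sketched together in a few lines of Python. This is only an illustration under assumptions: the error codes, the per-partition granularity of the response, and the commit_with_response name are all hypothetical, and the write callable stands in for whatever actually persists the offset (zookeeper in the initial version).

```python
import logging
from typing import Callable, Dict, Tuple

logger = logging.getLogger("offset_commit")

# Hypothetical error codes; the thread does not specify the real ones.
NO_ERROR = 0
STORE_ERROR = 1


def commit_with_response(
    write: Callable[[str, str, int, int], None],
    group: str,
    offsets: Dict[str, Dict[int, int]],
) -> Dict[Tuple[str, int], int]:
    """Attempt each write and return an error code per (topic, partition)."""
    response: Dict[Tuple[str, int], int] = {}
    for topic, partitions in offsets.items():
        for partition, offset in partitions.items():
            try:
                write(group, topic, partition, offset)
                response[(topic, partition)] = NO_ERROR
            except Exception:
                # Use the logger (with stack trace) instead of printing.
                logger.exception(
                    "Failed to commit offset %d for %s/%s/%d",
                    offset, group, topic, partition,
                )
                response[(topic, partition)] = STORE_ERROR
    return response
```

A caller would inspect the returned map and retry or surface any entry that is not NO_ERROR.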



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2012-12-11 Thread David Arthur (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13529613#comment-13529613
 ] 

David Arthur commented on KAFKA-657:


I'm fine working on the rest of (1). 1a is simple enough; 1c I might need some 
direction on. For 1b, how exactly is it different from the existing Offsets 
API? I have never really been clear what the purpose of the old API is.



[jira] [Commented] (KAFKA-657) Add an API to commit offsets

2012-12-11 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13529617#comment-13529617
 ] 

Jay Kreps commented on KAFKA-657:
-

The existing API describes the offset ranges contained in log segments on the 
server. It is poorly named; we should really rename it to something like 
LogMetadataRequest and generalize it a bit to include things other than the 
starting offsets of segments. The intended use case for the existing API is new 
consumers consuming for the first time from an existing stream. When they first 
start consuming they have no position in the log to read from (or to save out 
using your new api). To start consuming they need a valid offset to start at. 
Which offsets are valid depends on what is available on the server, so they 
need to be able to ask the server "what offset ranges do you have?" and then 
choose to start consuming at the beginning or end of that range (or somewhere 
in the middle). Your api, on the other hand, answers the question "what is the 
latest offset I have 'committed' (i.e. recorded as consumed)?" This would be 
used when a restart or rebalancing of the consumers occurs. Hope that makes 
sense? We could rename the existing API as part of this change to avoid the 
muddle.
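
To make the distinction concrete, here is a small Python sketch of how a consumer might combine the two answers when deciding where to start. The function name and parameters are hypothetical, and falling back to the available range when the committed offset is no longer on the server is an assumption of this sketch, not something stated above.

```python
from typing import Optional, Tuple


def choose_start_offset(
    committed: Optional[int],
    available_range: Tuple[int, int],
    start_from_beginning: bool = True,
) -> int:
    """Pick where a consumer should begin reading.

    committed: last committed offset for this group/topic/partition, or None
               for a consumer that has never committed (answer from the new api).
    available_range: (earliest, latest) offsets the server reports it holds
               (answer from the existing Offsets API).
    """
    earliest, latest = available_range
    if committed is not None and earliest <= committed <= latest:
        # Restart or rebalance: resume from the saved position.
        return committed
    # New consumer, or (assumption) the saved position is no longer available:
    # start at one end of the range the server actually has.
    return earliest if start_from_beginning else latest
```

For example, a brand-new consumer with no committed offset and a server range of (100, 200) would start at 100 by default, while a restarted consumer that had committed 150 would resume at 150.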
