[jira] [Commented] (KAFKA-10009) Add method for getting last record offset in kafka partition
[ https://issues.apache.org/jira/browse/KAFKA-10009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17173671#comment-17173671 ]

Werner Daehn commented on KAFKA-10009:
--------------------------------------

What would be needed is either of the following:
* endOffsets() returns the last existing offset. This is a bit dangerous, as it might return offset 100, then a log compaction happens, and only then do we start reading.
* poll() tells the caller that it has retrieved the last record. That should be doable. Then we can call poll(100ms) in a loop until it reports no-more-data via another getter. If I am interested in the current records only, I stop polling at that point; everyone else simply continues calling poll(). And best of all, there are no side effects and backward compatibility is preserved.

The one thing I don't know is whether poll() even has a chance to get that information yet, or whether the broker must be changed as well.

I have seen many similar questions and no real solution, so this is a popular request.

> Add method for getting last record offset in kafka partition
> ------------------------------------------------------------
>
>                 Key: KAFKA-10009
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10009
>             Project: Kafka
>          Issue Type: New Feature
>          Components: clients, consumer
>            Reporter: Yuriy Badalyantc
>            Priority: Minor
>
> As far as I understand, there is currently no reliable way to get the
> offset of the last record in a partition using the Java client. There
> is the {{endOffsets}} method in the consumer, and usually {{endOffsets - 1}}
> works fine. But with a transactional producer, a topic may contain
> offsets without a record, and {{endOffsets - 1}} will point to an offset
> without a record.
> This feature will help in situations where a consumer application wants to
> consume the whole topic. Checking the beginning and last record offsets will
> give lower and upper bounds for consuming. Of course, this is doable with the
> current consumer implementation, but I need to check {{position}} after each
> poll.
> Also, I believe that this feature may help with monitoring and operations.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-10009) Add method for getting last record offset in kafka partition
[ https://issues.apache.org/jira/browse/KAFKA-10009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17173661#comment-17173661 ]

Werner Daehn commented on KAFKA-10009:
--------------------------------------

It is actually worse than that. There could be log compaction or other reasons why the offset values are not a dense set of numbers.

Without such functionality, how do you reliably know that you have read all data from the topic?
* Option 1: Execute a poll(1 second), and if it returns no data, assume there is no more data. But maybe the network was busy, so 1 second is not enough. 10 seconds? 1 minute? 10 minutes? I don't want to wait for ten minutes just to decrease the probability that there is more data I have not received yet.
* Option 2: First call endOffsets(), so you know the high water mark is offset=100. Then you poll until you have received the record with offset=99, and you know you have gotten the last record. But what if there is no record with offset=99? Again, you will wait forever.
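The failure mode of Option 2, and the workaround from the issue description (checking the position after each poll), can be illustrated with a toy model. This is only a sketch: `SparsePartitionDemo`, `poll(int)`, `position()` and `readToEnd` are hypothetical names for an in-memory simulation and are not the Kafka client API. The model also assumes that the consumer's position moves past trailing offsets that hold no record, which is the behavior the issue description relies on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy in-memory model of a single partition; NOT the Kafka client API.
final class SparsePartitionDemo {
    // Offsets that hold a real record; gaps model compaction or transaction markers.
    private final TreeSet<Long> recordOffsets = new TreeSet<>();
    // Offset the next appended entry would receive (the "end offset" of this model).
    private final long endOffset;
    // Next offset the simulated consumer will read from.
    private long position = 0;

    SparsePartitionDemo(long endOffset, long... offsetsWithRecords) {
        this.endOffset = endOffset;
        for (long o : offsetsWithRecords) {
            if (o < 0 || o >= endOffset) {
                throw new IllegalArgumentException("record offset out of range: " + o);
            }
            recordOffsets.add(o);
        }
    }

    boolean hasRecordAt(long offset) { return recordOffsets.contains(offset); }
    long endOffset() { return endOffset; }
    long position() { return position; }

    // Returns up to maxRecords record offsets starting at the current position.
    // Model assumption: once no records remain, the position skips to endOffset.
    List<Long> poll(int maxRecords) {
        if (maxRecords <= 0) {
            throw new IllegalArgumentException("maxRecords must be positive");
        }
        List<Long> batch = new ArrayList<>();
        Long next = recordOffsets.ceiling(position);
        while (next != null && batch.size() < maxRecords) {
            batch.add(next);
            position = next + 1;
            next = recordOffsets.higher(next);
        }
        if (next == null) {
            position = endOffset;
        }
        return batch;
    }

    // Polls until the position reaches the end offset; returns the offset of
    // the last record seen, or -1 if the partition holds no records.
    static long readToEnd(SparsePartitionDemo p) {
        long last = -1;
        while (p.position() < p.endOffset()) {
            for (long offset : p.poll(2)) {
                last = offset;
            }
        }
        return last;
    }

    public static void main(String[] args) {
        // Offset 4 was compacted away; offset 7 is a marker without a record.
        SparsePartitionDemo p = new SparsePartitionDemo(8L, 0L, 1L, 2L, 3L, 5L, 6L);
        System.out.println("record at endOffset - 1? " + p.hasRecordAt(p.endOffset() - 1)); // false
        System.out.println("last record offset: " + readToEnd(p)); // 6
    }
}
```

In this model, waiting for a record at offset 7 would never finish, while comparing the position with the end offset terminates and yields offset 6 as the last true record.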
[jira] [Commented] (KAFKA-10009) Add method for getting last record offset in kafka partition
[ https://issues.apache.org/jira/browse/KAFKA-10009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109054#comment-17109054 ]

Yuriy Badalyantc commented on KAFKA-10009:
------------------------------------------

Exactly. I want the offset of the last true record in a partition.
[jira] [Commented] (KAFKA-10009) Add method for getting last record offset in kafka partition
[ https://issues.apache.org/jira/browse/KAFKA-10009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109045#comment-17109045 ]

Chia-Ping Tsai commented on KAFKA-10009:
----------------------------------------

So you want the end offset that is associated with a true record, and the LSO, which may be the smallest offset of an open transaction, is not what you expect.
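The description of the LSO in the comment above can be written down as a small function. This is a rough sketch under stated assumptions, not broker code: `LastStableOffsetDemo` and `lastStableOffset` are hypothetical names, and the rule used here (the smallest first offset among open transactions, falling back to the high watermark when none is open) is a reading of the comment rather than an excerpt from Kafka.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

// Hypothetical helper that models the LSO as described in the comment above.
final class LastStableOffsetDemo {
    // highWatermark: end offset seen without transactional filtering.
    // openTxnFirstOffsets: first offset of every transaction that is still open.
    static long lastStableOffset(long highWatermark, Collection<Long> openTxnFirstOffsets) {
        long lso = highWatermark;
        for (long first : openTxnFirstOffsets) {
            lso = Math.min(lso, first);
        }
        return lso;
    }

    public static void main(String[] args) {
        // No open transaction: the LSO equals the high watermark.
        System.out.println(lastStableOffset(100L, Collections.<Long>emptyList())); // 100
        // Two open transactions starting at 90 and 42: the earliest one wins.
        System.out.println(lastStableOffset(100L, Arrays.asList(90L, 42L))); // 42
    }
}
```

Either way, the result is a boundary offset, so nothing in it guarantees that the offset just before it holds a record.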
[jira] [Commented] (KAFKA-10009) Add method for getting last record offset in kafka partition
[ https://issues.apache.org/jira/browse/KAFKA-10009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109017#comment-17109017 ]

Yuriy Badalyantc commented on KAFKA-10009:
------------------------------------------

Even with the read_committed isolation level, {{endOffsets - 1}} will not give you the last record offset if the producer is transactional. It will point to an offset without a record. I specifically tested this behavior (on 2.4.0). Also, the isolation level affects only pending transactions.
[jira] [Commented] (KAFKA-10009) Add method for getting last record offset in kafka partition
[ https://issues.apache.org/jira/browse/KAFKA-10009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109009#comment-17109009 ]

Chia-Ping Tsai commented on KAFKA-10009:
----------------------------------------

If you want to seek to the end offsets when transactions are involved, you can set isolation.level to read_committed and then call Consumer.seekToEnd or Consumer.endOffsets. Both of them will use the last stable offset.