Thanks much for reviewing the KIP!

Dong

On Wed, Jan 24, 2018 at 7:10 AM, Guozhang Wang <wangg...@gmail.com> wrote:

> Yeah that makes sense, again I'm just making sure we understand all the
> scenarios and what to expect.
>
> I agree that if, more generally speaking, say users have only consumed to
> offset 8, and then call seek(16) to "jump" to a further position, then she
> needs to be aware that OORE may be thrown and she needs to handle it or
> rely on reset policy which should not surprise her.
>
> I'm +1 on the KIP.
>
> Guozhang
>
> On Wed, Jan 24, 2018 at 12:31 AM, Dong Lin <lindon...@gmail.com> wrote:
>
> > Yes, in general we cannot prevent OffsetOutOfRangeException if user
> > seeks to a wrong offset. The main goal is to prevent
> > OffsetOutOfRangeException if user has done things in the right way,
> > e.g. user should know that there is message with this offset.
> >
> > For example, if user calls seek(..) right after construction, the only
> > reason I can think of is that user stores offset externally. In this
> > case, user currently needs to use the offset which is obtained using
> > position(..) from the last run. With this KIP, user needs to get the
> > offset and the offsetEpoch using positionAndOffsetEpoch(...) and store
> > this information externally. The next time user starts consumer, he/she
> > needs to call seek(..., offset, offsetEpoch) right after construction.
> > Then KIP should be able to ensure that we don't throw
> > OffsetOutOfRangeException if there is no unclean leader election.
> >
> > Does this sound OK?
> >
> > Regards,
> > Dong
> >
> > On Tue, Jan 23, 2018 at 11:44 PM, Guozhang Wang <wangg...@gmail.com>
> > wrote:
> >
> > > "If consumer wants to consume message with offset 16, then consumer
> > > must have already fetched message with offset 15"
> > >
> > > --> this may not be always true right? What if consumer just calls
> > > seek(16) after construction and then polls without committed offset
> > > ever stored before? Admittedly it is rare but we do not
> > > programmatically disallow it.
> > >
> > > Guozhang
> > >
> > > On Tue, Jan 23, 2018 at 10:42 PM, Dong Lin <lindon...@gmail.com>
> > > wrote:
> > >
> > > > Hey Guozhang,
> > > >
> > > > Thanks much for reviewing the KIP!
> > > >
> > > > In the scenario you described, let's assume that broker A has
> > > > messages with offset up to 10, and broker B has messages with
> > > > offset up to 20. If consumer wants to consume message with offset
> > > > 9, it will not receive OffsetOutOfRangeException from broker A.
> > > >
> > > > If consumer wants to consume message with offset 16, then consumer
> > > > must have already fetched message with offset 15, which can only
> > > > come from broker B. Because consumer will fetch from broker B only
> > > > if leaderEpoch >= 2, then the current consumer leaderEpoch cannot
> > > > be 1 since this KIP prevents leaderEpoch rewind. Thus we will not
> > > > have OffsetOutOfRangeException in this case.
> > > >
> > > > Does this address your question, or maybe there is more advanced
> > > > scenario that the KIP does not handle?
> > > >
> > > > Thanks,
> > > > Dong
> > > >
> > > > On Tue, Jan 23, 2018 at 9:43 PM, Guozhang Wang <wangg...@gmail.com>
> > > > wrote:
> > > >
> > > > > Thanks Dong, I made a pass over the wiki and it lgtm.
> > > > >
> > > > > Just a quick question: can we completely eliminate the
> > > > > OffsetOutOfRangeException with this approach? Say if there are
> > > > > consecutive leader changes such that the cached metadata's
> > > > > partition epoch is 1, and the metadata fetch response returns
> > > > > with partition epoch 2 pointing to leader broker A, while the
> > > > > actual up-to-date metadata has partition epoch 3 whose leader is
> > > > > now broker B, the metadata refresh will still succeed and the
> > > > > follow-up fetch request may still see OORE?
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Tue, Jan 23, 2018 at 3:47 PM, Dong Lin <lindon...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I would like to start the voting process for KIP-232:
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-232%3A+Detect+outdated+metadata+using+leaderEpoch+and+partitionEpoch
> > > > > >
> > > > > > The KIP will help fix a concurrency issue in Kafka which
> > > > > > currently can cause message loss or message duplication in
> > > > > > consumer.
> > > > > >
> > > > > > Regards,
> > > > > > Dong
> > > > >
> > > > > --
> > > > > -- Guozhang
> > >
> > > --
> > > -- Guozhang
>
> --
> -- Guozhang
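
For anyone following the thread, the leaderEpoch argument quoted above (broker A holds offsets up to 10, broker B up to 20) can be sketched as a toy model. This is only an illustration in Python, not Kafka client code: `PartitionMetadata`, `MetadataCache`, `fetch`, and the `LOG_END` table are hypothetical names made up for this sketch, and the real client logic proposed in the KIP may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionMetadata:
    leader: str        # hypothetical broker id, "A" or "B"
    leader_epoch: int  # assumed to grow on every leader change


class MetadataCache:
    """Toy client-side cache that refuses to rewind the leader epoch."""

    def __init__(self, initial: PartitionMetadata):
        self.current = initial

    def update(self, response: PartitionMetadata) -> bool:
        # Reject a response older than what we already hold.
        if response.leader_epoch < self.current.leader_epoch:
            return False  # stale: keep current metadata, caller may retry
        self.current = response
        return True


# Log end offsets taken from the example in the thread.
LOG_END = {"A": 10, "B": 20}


def fetch(cache: MetadataCache, offset: int) -> str:
    """Return the broker serving `offset`; raise if it is out of range."""
    leader = cache.current.leader
    if offset > LOG_END[leader]:
        raise IndexError(f"offset {offset} out of range on broker {leader}")
    return leader


# The consumer already fetched offset 15 from B, so it holds epoch 2.
cache = MetadataCache(PartitionMetadata(leader="B", leader_epoch=2))
# A delayed response that still names broker A (epoch 1) is ignored.
accepted = cache.update(PartitionMetadata(leader="A", leader_epoch=1))
served_by = fetch(cache, 16)
```

In this model the stale response is rejected, so the fetch for offset 16 still goes to broker B. A consumer that calls seek(16) while holding only epoch 1 metadata is not protected by this check alone, which is the case where the stored offsetEpoch or the reset policy comes in.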