Hey Jun, Thanks much for your comments. Yeah I have not considered the case where the offset is stored externally.
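To make that concrete, here is a minimal Java sketch of how a user who stores offsets externally might use the two consumer APIs proposed below. The names metadataEpoch(), waitForMetadataUpdate() and the ExternalStore type are placeholders for illustration only, not the final API in the KIP:

import java.util.Collections;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.TopicPartition;

// Hypothetical shape of the two proposed consumer additions; names are illustrative only.
interface EpochAwareConsumer<K, V> extends Consumer<K, V> {
    // Returns the metadata_epoch of the consumer's currently cached metadata.
    long metadataEpoch();

    // Blocks until the consumer has received a MetadataResponse whose
    // metadata_epoch is >= the given epoch.
    void waitForMetadataUpdate(long minMetadataEpoch);
}

class ExternalOffsetStorageSketch {
    // Persist the offset together with the metadata_epoch it was consumed under.
    static void save(EpochAwareConsumer<byte[], byte[]> consumer,
                     TopicPartition tp, ExternalStore store) {
        store.save(tp, consumer.position(tp), consumer.metadataEpoch());
    }

    // On restart, wait for metadata at least as new as the saved epoch before
    // seeking, so the consumer cannot start fetching from a stale leader.
    static void restore(EpochAwareConsumer<byte[], byte[]> consumer,
                        TopicPartition tp, ExternalStore store) {
        long[] offsetAndEpoch = store.load(tp); // {offset, metadata_epoch}
        consumer.assign(Collections.singleton(tp));
        consumer.waitForMetadataUpdate(offsetAndEpoch[1]);
        consumer.seek(tp, offsetAndEpoch[0]);
    }

    // Stand-in for whatever external system the user keeps offsets in.
    interface ExternalStore {
        void save(TopicPartition tp, long offset, long metadataEpoch);
        long[] load(TopicPartition tp); // returns {offset, metadataEpoch}
    }
}

The only important part is the ordering: wait until the consumer's metadata_epoch has caught up to the saved epoch before calling seek(), so the consumer cannot go back to a leader older than the one the offset was read from.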
Based on Jason's question, I think we probably have to use a global metadata_epoch. And since we have a global metadata_epoch, this KIP probably no longer needs the per-partition leader_epoch. Then we can use two newly-added APIs in the consumer that allow the user to get the metadata_epoch from the consumer and wait for the consumer to receive a MetadataResponse whose metadata_epoch >= the given metadata_epoch. These two APIs should address the case where the user stores offsets externally. I have updated the KIP accordingly. Could you take another look? Thanks for all the comments. Dong On Tue, Dec 19, 2017 at 3:09 PM, Jun Rao <j...@confluent.io> wrote: > Hi, Dong, > > Thanks for the reply. > > 10. I was actually just thinking about the case when the consumer consumes old > data. If the current leader epoch is 3 and the consumer is consuming > records generated in leader epoch 1, the epoch associated with the offset > should be 1. However, as you pointed out, the fetch response currently > includes the leader epoch for fetched data. So, this is already covered. > > 11. That's an interesting thought. What about the case when the offsets are > stored externally? When we restart a consumer and seek to an externally > stored offset, we won't know the leader epoch in the consumer. Do we need > another request to retrieve the leader epoch based on an offset and make > sure the info is up to date? Another related thing is that the leader epoch > that we want to associate the offset with ideally should be the epoch when > the data is fetched. For example, when all replicas lost data due to a > power failure or when there is an unclean leader election, the leader epoch > for a given offset may change over time on the broker. In those cases, a > consumer's offset may be in range, but is not in the same leader epoch for > the time when the data is fetched. We can potentially do a smarter offset > reset in those cases if we remember the epoch when the data is fetched. > > Jun > > > > On Mon, Dec 18, 2017 at 1:58 PM, Dong Lin <lindon...@gmail.com> wrote: > > > Hey Jun, > > > > Thanks much for your comments. These are very thoughtful ideas. Please > see > > my comments below. > > > > On Thu, Dec 14, 2017 at 6:38 PM, Jun Rao <j...@confluent.io> wrote: > > > > > Hi, Dong, > > > > > > Thanks for the update. A few more comments below. > > > > > > 10. It seems that we need to return the leader epoch in the fetch > > response > > > as well. When fetching data, we could be fetching data from a leader > epoch > > > older than what's returned in the metadata response. So, we want to use > > the > > > leader epoch associated with the offset being fetched for committing > > > offsets. > > > > > > > It seems that we may have two separate issues here. The first issue is > that > > the consumer uses metadata that is older than the one it used before. The > > second issue is that the consumer uses metadata which is newer than the > > corresponding leader epoch in the leader broker. We know that the > > OffsetOutOfRangeException described in this KIP can be prevented by > > avoiding the first issue. On the other hand, it seems that the > > OffsetOutOfRangeException can still happen even if we avoid the > > second issue -- if the consumer uses an older version of metadata, the leader > > epoch in its metadata may equal the leader epoch in the broker even if > the > > leader epoch in the broker is outdated. > > > > Given this understanding, I am not sure why we need to return the leader > > epoch in the fetch response.
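For reference, a minimal sketch of the "metadata does not go back in version" check referred to below, assuming per-partition leader epochs are available in both the cached and the incoming metadata (the class and method names are illustrative only):

import java.util.Map;
import org.apache.kafka.common.TopicPartition;

// Illustrative client-side check: only apply new metadata if no partition's
// leader epoch is lower than what is already cached.
final class MetadataMonotonicityCheck {
    static boolean acceptable(Map<TopicPartition, Integer> cachedLeaderEpochs,
                              Map<TopicPartition, Integer> incomingLeaderEpochs) {
        for (Map.Entry<TopicPartition, Integer> cached : cachedLeaderEpochs.entrySet()) {
            Integer incoming = incomingLeaderEpochs.get(cached.getKey());
            if (incoming != null && incoming < cached.getValue()) {
                return false; // stale MetadataResponse: discard it and refetch metadata
            }
        }
        return true;
    }
}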
As long as the consumer's metadata is not going > > back in version, I think we are good. Did I miss something here? > > > > > > > > > > 11. Should we now extend OffsetAndMetadata used in the offset commit > api > > in > > > KafkaConsumer to include leader epoch? Similarly, should we return > leader > > > epoch in endOffsets(), beginningOffsets() and position()? We probably > > need > > > to think about how to make the api backward compatible. > > > > > > > After thinking through this carefully, I think we probably don't want to > > extend OffsetAndMetadata to include leader epoch because leader epoch is > > kind of an implementation detail which ideally should be hidden from the user. > The > > consumer can include leader epoch in the OffsetCommitRequest after taking > > the offset from commitSync(final Map<TopicPartition, OffsetAndMetadata> > > offsets). Similarly, the consumer can store the leader epoch from the > > OffsetFetchResponse and only provide the offset to the user via > > consumer.committed(topicPartition). This solution seems to work well and > > we > > don't have to make changes to the consumer's public API. Does this sound OK? > > > > > > > > > > 12. It seems that we now need to store leader epoch in the offset > topic. > > > Could you include the new schema for the value of the offset topic and > > add > > > upgrade notes? > > > > > > You are right. I have updated the KIP to specify the new schema for the > > value of the offset topic. Can you take another look? > > > > For existing messages in the offset topic, leader_epoch will be missing. > We > > will use leader_epoch = -1 to indicate the missing leader_epoch. Then the > > consumer behavior will be the same as it is now because any leader_epoch > in > > the MetadataResponse will be larger than the leader_epoch = -1 in the > > OffsetFetchResponse. Thus we don't need a specific procedure for upgrades > due > > to this change in the offset topic schema. By "upgrade notes", do you > mean > > the sentences we need to include in the upgrade.html in the PR later? > > > > > > > > > > Jun > > > > > > > > > On Tue, Dec 12, 2017 at 5:19 PM, Dong Lin <lindon...@gmail.com> wrote: > > > > > > > Hey Jun, > > > > > > > > I see. Sounds good. Yeah it is probably simpler to leave this to > > another > > > > KIP in the future. > > > > > > > > Thanks for all the comments. Since there is no further comment in the > > > > community, I will open the voting thread. > > > > > > > > Thanks, > > > > Dong > > > > > > > > On Mon, Dec 11, 2017 at 5:37 PM, Jun Rao <j...@confluent.io> wrote: > > > > > > > > > Hi, Dong, > > > > > > > > > > The case that I am thinking of is network partitioning. Suppose one > > > deploys > > > > a > > > > > stretched cluster across multiple AZs in the same region. If the > > > machines > > > > > in one AZ can't communicate with brokers in other AZs due to a > network > > > > issue, > > > > > the brokers in that AZ won't get any new metadata. > > > > > > > > > > We can potentially solve this problem by requiring some kind of > > regular > > > > > heartbeats between the controller and the broker. This may need > some > > > more > > > > > thought. So, it's probably fine to leave this to another KIP in > the > > > > > future. > > > > > > > > > > Thanks, > > > > > > > > > > Jun > > > > > > > > > > On Mon, Dec 11, 2017 at 2:55 PM, Dong Lin <lindon...@gmail.com> > > wrote: > > > > > > > > > > > Hey Jun, > > > > > > > > > > > > Thanks for the comment. I am open to improving this KIP to address > > more > > > > > > problems.
I probably need more help in understanding what is the > > > > current > > > > > > problem with the consumer using outdated metadata and whether it is > > > easier > > > > to > > > > > > address it together with this KIP. > > > > > > > > > > > > I agree that a consumer can potentially talk to an old leader for a > > long > > > > > time > > > > > > even after this KIP. But after this KIP, the consumer probably > > should > > > > not > > > > > > get OffsetOutOfRangeException and therefore will not cause the offset > > > rewind > > > > > > issue. So the only problem is that the consumer will not be able to > > fetch > > > > > data > > > > > > until it has updated metadata. It seems that this situation can > > only > > > > > happen > > > > > > if the broker is too slow in processing LeaderAndIsrRequest since > > > > > otherwise > > > > > > the consumer will be forced to update metadata due to > > > > > > NotLeaderForPartitionException. So the problem we are having > here > > is > > > > > that > > > > > > the consumer will not be able to fetch data if some broker is too > slow > > in > > > > > > processing LeaderAndIsrRequest. > > > > > > > > > > > > Because Kafka propagates LeaderAndIsrRequest asynchronously to > all > > > > > brokers > > > > > > in the cluster, there will always be a period of time when the > consumer > > > can > > > > > not > > > > > > fetch data for the partition during the leadership change. Thus > it > > > > seems > > > > > > more like a broker-side performance issue instead of a client-side > > > > > > correctness issue. My gut feel is that it is not causing as much of a > > > > problem > > > > > > as the problem to be fixed in this KIP. And if we were to address > > it, > > > > we > > > > > > probably need to make changes on the broker side, e.g. with a > > > prioritized > > > > > > queue for controller-related requests, which may be kind of > > > orthogonal > > > > to > > > > > > this KIP. I am not very sure it will be easier to address it with > > the > > > > > > change in this KIP. Do you have any recommendation? > > > > > > > > > > > > Thanks, > > > > > > Dong > > > > > > > > > > > > On Mon, Dec 11, 2017 at 1:51 PM, Jun Rao <j...@confluent.io> > wrote: > > > > > > > > > > > > > Hi, Dong, > > > > > > > > > > > > > > Thanks for the reply. > > > > > > > > > > > > > > My suggestion of forcing the metadata refresh from the > controller > > > may > > > > > not > > > > > > > work in general since the cached controller could be outdated > > too. > > > > The > > > > > > > general problem is that if a consumer's metadata is outdated, > it > > > may > > > > > get > > > > > > > stuck with the old leader for a long time. We can address the > > issue > > > > of > > > > > > > detecting outdated metadata in a separate KIP in the future if > > you > > > > > didn't > > > > > > > intend to address it in this KIP. > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > Jun > > > > > > > > > > > > > > > > > > > > > On Sat, Dec 9, 2017 at 10:12 PM, Dong Lin <lindon...@gmail.com > > > > > > wrote: > > > > > > > > > > > > > > > Hey Jun, > > > > > > > > > > > > > > > > Thanks much for your comments. Given that the client needs to > > > > > de-serialize > > > > > > > the > > > > > > > > metadata anyway, the extra overhead of checking the > > per-partition > > > > > > version > > > > > > > > for every partition should not be a big concern.
Thus it > makes > > > > sense > > > > > to > > > > > > > use > > > > > > > > leader epoch as the per-partition version instead of > creating a > > > > > global > > > > > > > > metadata version. I will update the KIP to do that. > > > > > > > > > > > > > > > > Regarding the detection of outdated metadata, I think it is > > > > possible > > > > > to > > > > > > > > ensure that the client gets the latest metadata by fetching from the > > > > controller. > > > > > > Note > > > > > > > > that this requires extra logic in the controller such that the > > > > controller > > > > > > > > updates metadata directly in memory without requiring > > > > > > > > UpdateMetadataRequest. But I am not sure about the main motivation > for > > > > this > > > > > at > > > > > > > > the moment, and it makes the controller more like a bottleneck > > in > > > > the > > > > > > > > cluster, which we probably want to avoid. > > > > > > > > > > > > > > > > I think we can probably keep the current way of ensuring > > metadata > > > > > > > > freshness. Currently the client will be forced to refresh > metadata > > if > > > > > > the broker > > > > > > > > returns an error (e.g. NotLeaderForPartition) due to outdated > > > metadata > > > > > or > > > > > > if > > > > > > > > the metadata does not contain the partition that the client > > > needs. > > > > In > > > > > > the > > > > > > > > future, as you previously suggested, we can include > > per-partition > > > > > > > > leaderEpoch in the FetchRequest/ProduceRequest such that the > broker > > > can > > > > > > > return an > > > > > > > > error if the epoch is smaller than the cached epoch in the > broker. > > > > Given > > > > > > that > > > > > > > > this adds more complexity to Kafka, I think we can probably > > think > > > > > about > > > > > > > > that later when we have a specific use-case or problem to > > solve > > > > with > > > > > > > > up-to-date metadata. Does this sound OK? > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Dong > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Dec 8, 2017 at 3:53 PM, Jun Rao <j...@confluent.io> > > > wrote: > > > > > > > > > > > > > > > > > Hi, Dong, > > > > > > > > > > > > > > > > > > Thanks for the reply. A few more points below. > > > > > > > > > > > > > > > > > > For dealing with how to prevent a consumer switching from a > > new > > > > > > leader > > > > > > > to > > > > > > > > > an old leader, your suggestion of refreshing metadata on > > > consumer > > > > > > > restart > > > > > > > > > until it sees a metadata version >= the one associated with > > the > > > > > > offset > > > > > > > > > works too, as long as we guarantee that the cached metadata > > > > > versions > > > > > > on > > > > > > > > the > > > > > > > > > brokers only go up. > > > > > > > > > > > > > > > > > > The second discussion point is on whether the metadata > > > versioning > > > > > > > should > > > > > > > > be > > > > > > > > > per partition or global. For the partition level > versioning, > > > you > > > > > were > > > > > > > > > concerned about the performance. Given that metadata > updates > > > are > > > > > > rare, > > > > > > > I > > > > > > > > am > > > > > > > > > not sure if it's a big concern though. Doing a million if > > tests > > > > is > > > > > > > > probably > > > > > > > > > going to take less than 1ms. Another thing is that the > > metadata > > > > > > version > > > > > > > > > seems to need to survive controller failover.
In your > current > > > > > > > approach, a > > > > > > > > > consumer may not be able to wait on the right version of > the > > > > > metadata > > > > > > > > after > > > > > > > > > the consumer restart since the metadata version may have > been > > > > > > recycled > > > > > > > on > > > > > > > > > the server side due to a controller failover while the > > consumer > > > > is > > > > > > > down. > > > > > > > > > The partition level leaderEpoch survives controller failure > > and > > > > > won't > > > > > > > > have > > > > > > > > > this issue. > > > > > > > > > > > > > > > > > > Lastly, neither your proposal nor mine addresses the issue > > how > > > to > > > > > > > > guarantee > > > > > > > > > a consumer to detect that is metadata is outdated. > Currently, > > > the > > > > > > > > consumer > > > > > > > > > is not guaranteed to fetch metadata from every broker > within > > > some > > > > > > > bounded > > > > > > > > > period of time. Maybe this is out of the scope of your KIP. > > But > > > > one > > > > > > > idea > > > > > > > > is > > > > > > > > > force the consumer to refresh metadata from the controller > > > > > > > periodically. > > > > > > > > > > > > > > > > > > Jun > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Dec 7, 2017 at 11:25 AM, Dong Lin < > > lindon...@gmail.com > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Hey Jun, > > > > > > > > > > > > > > > > > > > > Thanks much for the comments. Great point particularly > > > > regarding > > > > > > > (3). I > > > > > > > > > > haven't thought about this before. > > > > > > > > > > > > > > > > > > > > It seems that there are two possible ways where the > version > > > > > number > > > > > > > can > > > > > > > > be > > > > > > > > > > used. One solution is for client to check the version > > number > > > at > > > > > the > > > > > > > > time > > > > > > > > > it > > > > > > > > > > receives MetadataResponse. And if the version number in > the > > > > > > > > > > MetadataResponse is smaller than the version number in > the > > > > > client's > > > > > > > > > cache, > > > > > > > > > > the client will be forced to fetch metadata again. > Another > > > > > > solution, > > > > > > > > as > > > > > > > > > > you have suggested, is for broker to check the version > > number > > > > at > > > > > > the > > > > > > > > time > > > > > > > > > > it receives a request from client. The broker will reject > > the > > > > > > request > > > > > > > > if > > > > > > > > > > the version is smaller than the version in broker's > cache. > > > > > > > > > > > > > > > > > > > > I am not very sure that the second solution can address > the > > > > > problem > > > > > > > > here. > > > > > > > > > > In the scenario described in the JIRA ticket, broker's > > cache > > > > may > > > > > be > > > > > > > > > > outdated because it has not processed the > > LeaderAndIsrRequest > > > > > from > > > > > > > the > > > > > > > > > > controller. Thus it may still process client's request > even > > > if > > > > > the > > > > > > > > > version > > > > > > > > > > in client's request is actually outdated. Does this make > > > sense? > > > > > > > > > > > > > > > > > > > > IMO, it seems that we can address problem (3) by saving > the > > > > > > metadata > > > > > > > > > > version together with the offset. 
After consumer starts, > it > > > > will > > > > > > keep > > > > > > > > > > fetching metadata until the metadata version >= the > version > > > > saved > > > > > > > with > > > > > > > > > the > > > > > > > > > > offset of this partition. > > > > > > > > > > > > > > > > > > > > Regarding problems (1) and (2): Currently we use the > > version > > > > > number > > > > > > > in > > > > > > > > > the > > > > > > > > > > MetadataResponse to ensure that the metadata does not go > > back > > > > in > > > > > > > time. > > > > > > > > > > There are two alternative solutions to address problems > (1) > > > and > > > > > > (2). > > > > > > > > One > > > > > > > > > > solution is for client to enumerate all partitions in the > > > > > > > > > MetadataResponse, > > > > > > > > > > compare their epoch with those in the cached metadata, > and > > > > > rejects > > > > > > > the > > > > > > > > > > MetadataResponse iff any leader epoch is smaller. The > main > > > > > concern > > > > > > is > > > > > > > > > that > > > > > > > > > > MetadataResponse currently cached information of all > > > partitions > > > > > in > > > > > > > the > > > > > > > > > > entire cluster. It may slow down client's performance if > we > > > > were > > > > > to > > > > > > > do > > > > > > > > > it. > > > > > > > > > > The other solution is for client to enumerate partitions > > for > > > > only > > > > > > > > topics > > > > > > > > > > registered in the org.apache.kafka.clients.Metadata, > which > > > > will > > > > > be > > > > > > > an > > > > > > > > > > empty > > > > > > > > > > set for producer and the set of subscribed partitions for > > > > > consumer. > > > > > > > But > > > > > > > > > > this degrades to all topics if consumer subscribes to > > topics > > > in > > > > > the > > > > > > > > > cluster > > > > > > > > > > by pattern. > > > > > > > > > > > > > > > > > > > > Note that client will only be forced to update metadata > if > > > the > > > > > > > version > > > > > > > > in > > > > > > > > > > the MetadataResponse is smaller than the version in the > > > cached > > > > > > > > metadata. > > > > > > > > > In > > > > > > > > > > general it should not be a problem. It can be a problem > > only > > > if > > > > > > some > > > > > > > > > broker > > > > > > > > > > is particularly slower than other brokers in processing > > > > > > > > > > UpdateMetadataRequest. When this is the case, it means > that > > > the > > > > > > > broker > > > > > > > > is > > > > > > > > > > also particularly slower in processing > LeaderAndIsrRequest, > > > > which > > > > > > can > > > > > > > > > cause > > > > > > > > > > problem anyway because some partition will probably have > no > > > > > leader > > > > > > > > during > > > > > > > > > > this period. I am not sure problems (1) and (2) cause > more > > > > > problem > > > > > > > than > > > > > > > > > > what we already have. > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Dong > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Dec 6, 2017 at 6:42 PM, Jun Rao < > j...@confluent.io> > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Hi, Dong, > > > > > > > > > > > > > > > > > > > > > > Great finding on the issue. It's a real problem. A few > > > > comments > > > > > > > about > > > > > > > > > the > > > > > > > > > > > KIP. (1) I am not sure about updating > > > > controller_metadata_epoch > > > > > > on > > > > > > > > > every > > > > > > > > > > > UpdateMetadataRequest. 
Currently, the controller can > send > > > > > > > > > > > UpdateMetadataRequest when there is no actual metadata > > > > change. > > > > > > > Doing > > > > > > > > > this > > > > > > > > > > > may require unnecessary metadata refresh on the client. > > (2) > > > > > > > > > > > controller_metadata_epoch is global across all topics. > > This > > > > > means > > > > > > > > that > > > > > > > > > a > > > > > > > > > > > client may be forced to update its metadata even when > the > > > > > > metadata > > > > > > > > for > > > > > > > > > > the > > > > > > > > > > > topics that it cares haven't changed. (3) It doesn't > seem > > > > that > > > > > > the > > > > > > > > KIP > > > > > > > > > > > handles the corner case when a consumer is restarted. > > Say a > > > > > > > consumer > > > > > > > > > > reads > > > > > > > > > > > from the new leader, commits the offset and then is > > > > restarted. > > > > > On > > > > > > > > > > restart, > > > > > > > > > > > the consumer gets an outdated metadata and fetches from > > the > > > > old > > > > > > > > leader. > > > > > > > > > > > Then, the consumer will get into the offset out of > range > > > > issue. > > > > > > > > > > > > > > > > > > > > > > Given the above, I am thinking of the following > approach. > > > We > > > > > > > actually > > > > > > > > > > > already have metadata versioning at the partition > level. > > > Each > > > > > > > leader > > > > > > > > > has > > > > > > > > > > a > > > > > > > > > > > leader epoch which is monotonically increasing. We can > > > > > > potentially > > > > > > > > > > > propagate leader epoch back in the metadata response > and > > > the > > > > > > > clients > > > > > > > > > can > > > > > > > > > > > cache that. This solves the issue of (1) and (2). To > > solve > > > > (3), > > > > > > > when > > > > > > > > > > saving > > > > > > > > > > > an offset, we could save both an offset and the > > > corresponding > > > > > > > leader > > > > > > > > > > epoch. > > > > > > > > > > > When fetching the data, the consumer provides both the > > > offset > > > > > and > > > > > > > the > > > > > > > > > > > leader epoch. A leader will only serve the request if > its > > > > > leader > > > > > > > > epoch > > > > > > > > > is > > > > > > > > > > > equal to or greater than the leader epoch from the > > > consumer. > > > > To > > > > > > > > achieve > > > > > > > > > > > this, we need to change the fetch request protocol and > > the > > > > > offset > > > > > > > > > commit > > > > > > > > > > > api, which requires some more thoughts. > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > Jun > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Dec 6, 2017 at 10:57 AM, Dong Lin < > > > > lindon...@gmail.com > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > Bump up the thread. > > > > > > > > > > > > > > > > > > > > > > > > It will be great to have more comments on whether we > > > should > > > > > do > > > > > > it > > > > > > > > or > > > > > > > > > > > > whether there is better way to address the motivation > > of > > > > this > > > > > > > KIP. > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Dec 4, 2017 at 3:09 PM, Dong Lin < > > > > > lindon...@gmail.com> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > I don't have an interesting rejected alternative > > > solution > > > > > to > > > > > > > put > > > > > > > > in > > > > > > > > > > the > > > > > > > > > > > > > KIP. 
If there is good alternative solution from > > anyone > > > in > > > > > > this > > > > > > > > > > thread, > > > > > > > > > > > I > > > > > > > > > > > > am > > > > > > > > > > > > > happy to discuss this and update the KIP > accordingly. > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > Dong > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Dec 4, 2017 at 1:12 PM, Ted Yu < > > > > > yuzhih...@gmail.com> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > >> It is clearer now. > > > > > > > > > > > > >> > > > > > > > > > > > > >> I noticed that Rejected Alternatives section is > > empty. > > > > > > > > > > > > >> Have you considered any alternative ? > > > > > > > > > > > > >> > > > > > > > > > > > > >> Cheers > > > > > > > > > > > > >> > > > > > > > > > > > > >> On Mon, Dec 4, 2017 at 1:07 PM, Dong Lin < > > > > > > lindon...@gmail.com > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > >> > > > > > > > > > > > > >> > Ted, thanks for catching this. I have updated > the > > > > > sentence > > > > > > > to > > > > > > > > > make > > > > > > > > > > > it > > > > > > > > > > > > >> > readable. > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > Thanks, > > > > > > > > > > > > >> > Dong > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > On Sat, Dec 2, 2017 at 3:05 PM, Ted Yu < > > > > > > yuzhih...@gmail.com > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > bq. It the controller_epoch of the incoming > > > > > > > > MetadataResponse, > > > > > > > > > or > > > > > > > > > > > if > > > > > > > > > > > > >> the > > > > > > > > > > > > >> > > controller_epoch is the same but the > > > > > > > > controller_metadata_epoch > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > Can you update the above sentence so that the > > > > > intention > > > > > > is > > > > > > > > > > > clearer ? > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > Thanks > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > On Fri, Dec 1, 2017 at 6:33 PM, Dong Lin < > > > > > > > > lindon...@gmail.com > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > Hi all, > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > I have created KIP-232: Detect outdated > > metadata > > > > by > > > > > > > adding > > > > > > > > > > > > >> > > > ControllerMetadataEpoch field: > > > > > > > > > > > > >> > > > https://cwiki.apache.org/ > > > > > > confluence/display/KAFKA/KIP- > > > > > > > > > > > > >> > > > 232%3A+Detect+outdated+metadata+by+adding+ > > > > > > > > > > > > >> > ControllerMetadataEpoch+field > > > > > > > > > > > > >> > > > . > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > The KIP proposes to add fields in > > > MetadataResponse > > > > > and > > > > > > > > > > > > >> > > > UpdateMetadataRequest so that client can > > reject > > > > > > outdated > > > > > > > > > > > metadata > > > > > > > > > > > > >> and > > > > > > > > > > > > >> > > avoid > > > > > > > > > > > > >> > > > unnecessary OffsetOutOfRangeException. > > Otherwise > > > > > there > > > > > > > is > > > > > > > > > > > > currently > > > > > > > > > > > > >> > race > > > > > > > > > > > > >> > > > condition that can cause consumer to reset > > > offset > > > > > > which > > > > > > > > > > > negatively > > > > > > > > > > > > >> > affect > > > > > > > > > > > > >> > > > the consumer's availability. 
> > > > > > > > > > > > >> > > > Feedback and suggestions are welcome! > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > Regards, > > > > > > > > > > > > >> > > > Dong