I am reposting the updated KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-842%3A+Add+richer+group+offset+reset+mechanisms

"hudeqi" <16120...@bjtu.edu.cn> wrote:
> Hello, has anyone who discussed this earlier had a chance to look at it? New 
> participants are also welcome to join the discussion.
> 
> "hudeqi" <16120...@bjtu.edu.cn> wrote:
> > It has been a while. This issue has been under discussion for a long time, 
> > so please allow me to summarize it, and then everyone can help decide which 
> > direction it should take.
> > 
> > This KIP aims to solve two problems:
> > 1. When the client sets "auto.offset.reset" to latest, data in a newly 
> > added partition may be lost: after the topic's partitions are expanded, the 
> > consumer resets its offset on the new partition to the latest and skips 
> > whatever was already written there.
> > 
> > 2. Beyond the "earliest", "latest", and "none" values of the existing 
> > "auto.offset.reset", provide richer options, such as "latest_on_start" 
> > (reset to latest at application startup, and throw an exception if an 
> > out-of-range error occurs), "earliest_on_start" (reset to earliest at 
> > application startup, and throw an exception if an out-of-range error 
> > occurs), and "nearest" (follow "auto.offset.reset" when the program starts, 
> > and when an out-of-range error occurs, choose earliest or latest according 
> > to the distance of the current offset from the log start offset and the log 
> > end offset).
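
A rough sketch of the "nearest" rule quoted above, written in Python for 
brevity. It is an illustration only, not the KIP's implementation: the 
function name nearest_reset and its plain integer arguments are assumptions, 
and it treats a position equal to the log end offset as still in range.

```python
def nearest_reset(current_offset: int, log_start: int, log_end: int) -> int:
    """Return the offset to continue from under a hypothetical "nearest" policy."""
    if log_start <= current_offset <= log_end:
        # In range: nothing to reset.
        return current_offset
    # Out of range: compare the distance to each end of the log.
    dist_to_start = abs(current_offset - log_start)
    dist_to_end = abs(current_offset - log_end)
    # Closer to the log start means "earliest", otherwise "latest".
    return log_start if dist_to_start <= dist_to_end else log_end
```

Because an out-of-range offset lies either below the log start or above the 
log end, this amounts to "earliest" when the consumer has fallen behind 
retention and "latest" when its offset is beyond the end of the log.
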
> > 
> > Judging from the discussion above, there are two concerns about adding 
> > these extra offset reset mechanisms: complexity and compatibility. Even so, 
> > these parameters do bring real benefits. Based on that discussion, I have 
> > therefore sorted out two possible directions. Please help me decide which 
> > one to follow:
> > 
> > 1. The first is to follow Guozhang's suggestion: keep the three values of 
> > "auto.offset.reset" and their meanings unchanged, which avoids confusing 
> > Kafka users and also solves the compatibility problem. Then add these two 
> > parameters:
> >     a. "auto.offset.reset.on.no.initial.offset": the strategy used to 
> > initialize the offset. It defaults to the value configured for 
> > "auto.offset.reset", in which case offset initialization behaves exactly 
> > as before, ensuring compatibility. If it is set to "latest_on_start" or 
> > "earliest_on_start", the offset is reset according to those semantics when 
> > it is initialized. This solves the data loss during partition expansion: 
> > set "auto.offset.reset.on.no.initial.offset" to "latest_on_start" and 
> > "auto.offset.reset" to earliest.
> >     b. "auto.offset.reset.on.invalid.offset": the strategy used when the 
> > offset is invalid or out of range. It defaults to the value configured for 
> > "auto.offset.reset", in which case out-of-range handling is the same as 
> > before, ensuring compatibility. If it is set to "nearest", the "nearest" 
> > semantics apply only to the out-of-range case.
> > 
> > This solution ensures compatibility and ensures that the semantics of the 
> > original configuration remain unchanged. Only two incremental 
> > configurations are added to flexibly handle different situations.
> > 
> > 2. The second is to reduce the complexity of the problem directly: build 
> > into "auto.offset.reset"="latest" the logic of resetting the initial offset 
> > of a newly expanded partition to the earliest. Kafka users then do not need 
> > to be aware of this subtle but useful change, and every other case is 
> > handled as before (without adding the richer offset reset mechanisms).
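
To compare the two directions quoted above, here is a minimal sketch in 
Python. It is not the KIP's implementation. The two new config names come 
from the proposal and do not exist in Kafka today; the function names, the 
result keys, and the newly_added flag are hypothetical, and how a consumer 
would detect a newly added partition is left open here.

```python
from typing import Dict

# Config names from the proposal; they are not existing Kafka configs.
NO_INITIAL = "auto.offset.reset.on.no.initial.offset"
INVALID = "auto.offset.reset.on.invalid.offset"


def resolve_policies(config: Dict[str, str]) -> Dict[str, str]:
    """Direction 1: both new settings fall back to "auto.offset.reset"."""
    base = config.get("auto.offset.reset", "latest")  # Kafka's current default
    return {
        "no_initial_offset": config.get(NO_INITIAL, base),
        "invalid_offset": config.get(INVALID, base),
    }


def initial_offset_direction2(reset: str, newly_added: bool,
                              log_start: int, log_end: int) -> int:
    """Direction 2: "latest" starts a newly added partition from the earliest."""
    if reset == "earliest":
        return log_start
    if reset == "latest":
        # The only behavior change: new partitions are read from the beginning.
        return log_start if newly_added else log_end
    # "none" (or anything else): no automatic reset in this sketch.
    raise ValueError("no automatic reset for policy: " + reset)
```

With direction 1, a config that sets "auto.offset.reset" to earliest and 
"auto.offset.reset.on.no.initial.offset" to "latest_on_start" resolves to 
"latest_on_start" for initialization and earliest for invalid offsets, while 
an empty config resolves both to latest, matching today's behavior. 
Direction 2 needs no new configs, at the cost of changing what "latest" 
means for newly added partitions.
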
> > 
> > I hope you can help me choose a direction for this issue. Thank you.
> > 
> > Best,
> > hudeqi
