Hello, has anyone who took part in the earlier discussion seen this? New
participants are also welcome to join the discussion.

"hudeqi" <16120...@bjtu.edu.cn> wrote:
> It has been a while. This issue has been under discussion for a long time,
> so please allow me to summarize it so that everyone can help decide which
> direction it should take.
> 
> This KIP aims to solve two problems:
> 1. When the client sets "auto.offset.reset" to latest and the topic's
> partitions are expanded, the consumer resets the offset of each new
> partition to latest, so data already written to the new partition may be
> lost.
> 
> 2. In addition to the "earliest", "latest", and "none" values already
> supported by "auto.offset.reset", the KIP adds richer options:
> "latest_on_start" (reset to latest at application startup, and throw an
> exception if out of range occurs), "earliest_on_start" (reset to earliest
> at application startup, and throw an exception if out of range occurs), and
> "nearest" (use "auto.offset.reset" when the program starts, and when out of
> range occurs, choose earliest or latest according to the distance between
> the current offset and the log start offset and log end offset).
> 
> From the discussion above, it seems there are two concerns about adding
> these additional offset reset mechanisms: complexity and compatibility.
> These parameters do, however, have corresponding benefits. Based on that
> discussion, I have sorted out two possible directions. Please help me
> decide which one to follow:
> 
> 1. The first is to follow Guozhang's suggestion: keep the three values of
> "auto.offset.reset" and their meanings unchanged, which reduces confusion
> for Kafka users and also solves the compatibility problem. Add these two
> parameters:
>     a. "auto.offset.reset.on.no.initial.offset": Indicates the strategy
> used to initialize the offset. The default value is whatever is configured
> for "auto.offset.reset". With the default, the strategy for initializing
> the offset is unchanged from the previous behavior, which ensures
> compatibility. If the parameter is set to "latest_on_start" or
> "earliest_on_start", the offset is reset according to those semantics when
> it is initialized. This solves the problem of data loss during partition
> expansion: set "auto.offset.reset.on.no.initial.offset" to
> "latest_on_start" and set "auto.offset.reset" to earliest.
>     b. "auto.offset.reset.on.invalid.offset": Indicates the strategy used
> when the offset is invalid or out of range. The default value is whatever
> is configured for "auto.offset.reset". With the default, out-of-range
> handling is the same as before, which ensures compatibility. If "nearest"
> is configured, the "nearest" semantics apply only to the out-of-range case.
> 
> This solution ensures compatibility and keeps the semantics of the
> original configuration unchanged. It adds only two new configurations to
> handle the different situations flexibly.
> 
> 2. The second is to reduce the complexity of the problem directly: build
> into "auto.offset.reset"="latest" the logic of resetting the initial offset
> of a newly expanded partition to earliest. Kafka users would not need to be
> aware of this subtle but useful change, and the handling of all other
> situations stays unchanged (without introducing the richer offset reset
> mechanisms).
> 
> I hope you can help me choose a direction for solving this issue. Thank
> you.
> 
> Best,
> hudeqi
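
To make direction 1 easier to compare with direction 2, here is a minimal
Python sketch (not Kafka client code) of how the two proposed parameters
could fall back to "auto.offset.reset", and of one possible reading of
"nearest". The function names, the dict-based config, and the tie-break
toward the log start offset are assumptions for illustration only; the
thread does not specify them.

```python
from typing import Dict, Tuple

# Config keys as named in the summary above; the last two are proposals.
BASE_KEY = "auto.offset.reset"
INITIAL_KEY = "auto.offset.reset.on.no.initial.offset"
INVALID_KEY = "auto.offset.reset.on.invalid.offset"


def resolve_policies(config: Dict[str, str]) -> Tuple[str, str]:
    """Return (policy when no initial offset, policy when offset is invalid).

    Both proposed parameters default to the value of "auto.offset.reset",
    so a config that sets neither behaves as it does today.
    """
    base = config.get(BASE_KEY, "latest")  # "latest" is Kafka's default
    initial = config.get(INITIAL_KEY, base)
    invalid = config.get(INVALID_KEY, base)
    return initial, invalid


def nearest_reset(current: int, log_start: int, log_end: int) -> int:
    """One reading of "nearest" for an out-of-range offset.

    Picks the log start offset or the log end offset, whichever is closer
    to the current offset. Ties go to the log start offset (an assumption:
    re-reading data is treated as safer than skipping it).
    """
    if abs(current - log_start) <= abs(log_end - current):
        return log_start
    return log_end


if __name__ == "__main__":
    # Setting from the summary intended to avoid loss on partition expansion.
    cfg = {BASE_KEY: "earliest", INITIAL_KEY: "latest_on_start"}
    print(resolve_policies(cfg))
    # Offset 5 has fallen behind a log that now spans [100, 200).
    print(nearest_reset(5, 100, 200))
```

With only "auto.offset.reset" set, both returned policies equal that value,
which is the compatibility property described for direction 1.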
