tl;dr: 1) I tend to agree we should keep the existing behavior, but what this means is actually different and more complicated than just "if leaves group, then invokes leave callback" 2) Personally I think we should actually *always* invoke this callback, for every case
Longer version: First, to lay the groundwork for this discussion: 1. The callback in question is #onLeavePrepare -- which in turn invokes #onPartitionsRevoked (or #onPartitionsLost) on all the currently assigned partitions 2. The existing behavior is basically: -alway invoke #onLeavePrepare for consumer clients, whether dynamic or static -never invoked for Kafka Streams (ie when the internal leave.group.on.close config is set to false) 3. The new CloseOptions will allow a user to choose between REMAIN_IN_GROUP, LEAVE_GROUP, and DEFAULT: -KafkaStreams will explicitly select REMAIN_IN_GROUP as its default behavior but allow users to opt for LEAVE_GROUP instead -DEFAULT is essentially a consumer-only option, where regular (dynamic) consumers will leave the group but static consumers will remain in the group The two important takeaways here are: A. the current logic is actually not quite "if a consumer remains in the group, the callback should not be invoked" because a static member will invoke #onLeavePrepare but *not *leave the group B. maintaining the status quo would mean always invoking the callback for the DEFAULT and LEAVE_GROUP options, and never invoking it for the REMAIN_IN_GROUP option C. being able to select REMAIN_IN_GROUP for a plain consumer client is an entirely new possibility, which has no existing behavior, so this is up to us to decide Now, what to do? I'll break this up by the new options: LEAVE_GROUP: Always invoke #onLeavePrepare, this is the easy one. DEFAULT: Next up: should we continue to always invoke #onLeavePrepare for DEFAULT, regardless of whether the consumer is dynamic or static -- meaning we might invoke this even if the consumer actually does not leave the group (ie static membership)? In my opinion, yes. It may sound weird to invoke these callbacks without leaving the group, but remember that "leaving the group" actually means losing ownership of these partitions (which you do NOT want for a static member) whereas invoking #onLeavePrepare/#onPartitionsRevoked does not actually *revoke* the partitions from the consumer, and is generally just used to do things like flushing state, cleaning up, and committing offsets for those partitions. All of which you probably want to do even for a static member that's not actively leaving the group (for example since you probably store the processed offsets in memory and need to commit those before shutting down) So, I think we should indeed maintain the original logic and always invoke #onLeavePrepare for DEFAULT. Whether it's static/leaves the group or not. REMAIN_IN_GROUP: This is actually the hardest one to answer in my opinion. To reiterate, right now Kafka Streams *always* has REMAIN_IN_GROUP behavior and never invokes #onLeavePrepare, whereas plain consumer clients don't have this option at all and therefore don't have any existing behavior. The "easy" choice is to always skip #onLeavePrepare for this option, since that maintains the current behavior for Kafka Streams and gives us the "don't leave --> don't call leavePrepare" semantics that *feels* like it would make sense. But I worry that this will be detrimental to any consumer client apps that use this option. Pretty much the same reasoning behind calling it for the static membership/DEFAULT case applies here as well -- the client may want to stay in the group and retain ownership of its partitions, but if it shuts down it will likely lose info about its consumed/processed offsets, could skip flushing/persisting important data, fail to checkpoint, etc. In fact, the only reason it's "safe" to skip this callback for Kafka Streams in the existing logic is because Kafka Streams makes sure to manually commit everything prior to shutting down. But plain consumer apps don't have this guarantee. I also think it would be kind of weird to invoke this callback for static members who don't leave the group but use the DEFAULT option, but not invoke the callback for members who also don't leave the group but use the REMAIN_IN_GROUP option. So for consistency, imo it makes sense to just always invoke #onLeavePrepare for REMAIN_IN_GROUP as well. (Technically this is a change for Kafka Streams but I'm like 90% sure this will just work OOTB without needing any changes in Streams code, and the callback will just be a no-op. But if not, I'm happy to help with the Streams code fixes to make this work) Anyways -- thoughts? On Mon, Dec 16, 2024 at 7:23 AM TengYao Chi <kiting...@gmail.com> wrote: > Hi everyone, > > I'd like to discuss the leave callback implementation. > While this KIP focuses on sending leave requests, we should also consider > how to handle the leave callback. > IMHO, we should maintain the original logic: if a consumer remains in the > group, the callback should not be invoked. > > What do you think? > > Best regards, > TengYao > > TengYao Chi <kiting...@gmail.com> 於 2024年10月2日 週三 下午7:02寫道: > > > Hello everyone, > > > > It seems we have reached a consensus on how this KIP will proceed. > > I have updated the content accordingly. > > Please take a look, and if there are no further questions, I will > initiate > > a vote. > > > > Best regards, > > TengYao > > > > Chia-Ping Tsai <chia7...@gmail.com> 於 2024年10月2日 週三 上午7:06寫道: > > > >> hi Matthias > >> > >> Thanks for sharing your POV, and you do remind me that this KIP should > >> not be blocked due to this verbose discussion. > >> > >> Those protected methods are not Public APIs, so they can reverted > without > >> deprecation cycle. Those variables can be exposed publicly when users do > >> need them. > >> > >> +1 to clean user facing API > >> > >> Best, > >> Chia-Ping > >> > >> > Matthias J. Sax <mj...@apache.org> 於 2024年10月2日 凌晨2:38 寫道: > >> > Yes, it requires an internal class. But I would rather optimize for a > >> clean user facing API. Personally, I don't think it will make our code > base > >> significantly more complex, so I think it's worth the tradeoff. > >> > > >> > And yes, we would need to take user provided `CloseOption` object and > >> create `CloseOptionInternal` object, but as it's not on the hot code > path, > >> and both objects are short lived anyway, it seems > >> > > >