tl;dr:
1) I tend to agree we should keep the existing behavior, but what this
means is actually different and more complicated than just "if leaves
group, then invokes leave callback"
2) Personally I think we should actually *always* invoke this callback, for
every case

Longer version:

First, to lay the groundwork for this discussion:
1. The callback in question is #onLeavePrepare -- which in turn invokes
#onPartitionsRevoked (or #onPartitionsLost) on all the currently assigned
partitions
2. The existing behavior is basically:
   -alway invoke #onLeavePrepare for consumer clients, whether dynamic or
static
   -never invoked for Kafka Streams (ie when the internal
leave.group.on.close config is set to false)
3. The new CloseOptions will allow a user to choose between
REMAIN_IN_GROUP, LEAVE_GROUP, and DEFAULT:
   -KafkaStreams will explicitly select REMAIN_IN_GROUP as its default
behavior but allow users to opt for LEAVE_GROUP instead
   -DEFAULT is essentially a consumer-only option, where regular (dynamic)
consumers will leave the group but static consumers will remain in the group

The two important takeaways here are:
A. the current logic is actually not quite "if a consumer remains in the
group, the callback should not be invoked" because a static member will
invoke #onLeavePrepare but *not *leave the group
B. maintaining the status quo would mean always invoking the callback for
the DEFAULT and LEAVE_GROUP options, and never invoking it for the
REMAIN_IN_GROUP option
C. being able to select REMAIN_IN_GROUP for a plain consumer client is an
entirely new possibility, which has no existing behavior, so this is up to
us to decide

Now, what to do? I'll break this up by the new options:

LEAVE_GROUP:
Always invoke #onLeavePrepare, this is the easy one.

DEFAULT:
Next up: should we continue to always invoke #onLeavePrepare for DEFAULT,
regardless of whether the consumer is dynamic or static -- meaning we might
invoke this even if the consumer actually does not leave the group (ie
static membership)?

In my opinion, yes. It may sound weird to invoke these callbacks without
leaving the group, but remember that "leaving the group" actually means
losing ownership of these partitions (which you do NOT want for a static
member) whereas invoking #onLeavePrepare/#onPartitionsRevoked does not
actually *revoke* the partitions from the consumer, and is generally just
used to do things like flushing state, cleaning up, and committing offsets
for those partitions. All of which you probably want to do even for a
static member that's not actively leaving the group (for example since you
probably store the processed offsets in memory and need to commit those
before shutting down)

So, I think we should indeed maintain the original logic and always invoke
#onLeavePrepare for DEFAULT. Whether it's static/leaves the group or not.

REMAIN_IN_GROUP:
This is actually the hardest one to answer in my opinion. To reiterate,
right now Kafka Streams *always* has REMAIN_IN_GROUP behavior and never
invokes #onLeavePrepare, whereas plain consumer clients don't have this
option at all and therefore don't have any existing behavior.

The "easy" choice is to always skip #onLeavePrepare for this option, since
that maintains the current behavior for Kafka Streams and gives us the
"don't leave --> don't call leavePrepare" semantics that *feels* like it
would make sense. But I worry that this will be detrimental to any consumer
client apps that use this option. Pretty much the same reasoning behind
calling it for the  static membership/DEFAULT case applies here as well --
the client may want to stay in the group and retain ownership of its
partitions, but if it shuts down it will likely lose info about its
consumed/processed offsets, could skip flushing/persisting important data,
fail to checkpoint, etc.

In fact, the only reason it's "safe" to skip this callback for Kafka
Streams in the existing logic is because Kafka Streams makes sure to
manually commit everything prior to shutting down. But plain consumer apps
don't have this guarantee. I also think it would be kind of weird to invoke
this callback for static members who don't leave the group but use the
DEFAULT option, but not invoke the callback for members who also don't
leave the group but use the REMAIN_IN_GROUP option.

So for consistency, imo it makes sense to just always invoke
#onLeavePrepare for REMAIN_IN_GROUP as well. (Technically this is a change
for Kafka Streams but I'm like 90% sure this will just work OOTB without
needing any changes in Streams code, and the callback will just be a no-op.
But if not, I'm happy to help with the Streams code fixes to make this work)

Anyways -- thoughts?

On Mon, Dec 16, 2024 at 7:23 AM TengYao Chi <kiting...@gmail.com> wrote:

> Hi everyone,
>
> I'd like to discuss the leave callback implementation.
> While this KIP focuses on sending leave requests, we should also consider
> how to handle the leave callback.
> IMHO, we should maintain the original logic: if a consumer remains in the
> group, the callback should not be invoked.
>
> What do you think?
>
> Best regards,
> TengYao
>
> TengYao Chi <kiting...@gmail.com> 於 2024年10月2日 週三 下午7:02寫道:
>
> > Hello everyone,
> >
> > It seems we have reached a consensus on how this KIP will proceed.
> > I have updated the content accordingly.
> > Please take a look, and if there are no further questions, I will
> initiate
> > a vote.
> >
> > Best regards,
> > TengYao
> >
> > Chia-Ping Tsai <chia7...@gmail.com> 於 2024年10月2日 週三 上午7:06寫道:
> >
> >> hi Matthias
> >>
> >> Thanks for sharing your POV, and you do remind me that this KIP should
> >> not be blocked due to this verbose discussion.
> >>
> >> Those protected methods are not Public APIs, so they can reverted
> without
> >> deprecation cycle. Those variables can be exposed publicly when users do
> >> need them.
> >>
> >> +1 to clean user facing API
> >>
> >> Best,
> >> Chia-Ping
> >>
> >> > Matthias J. Sax <mj...@apache.org> 於 2024年10月2日 凌晨2:38 寫道:
> >> > Yes, it requires an internal class. But I would rather optimize for a
> >> clean user facing API. Personally, I don't think it will make our code
> base
> >> significantly more complex, so I think it's worth the tradeoff.
> >> >
> >> > And yes, we would need to take user provided `CloseOption` object and
> >> create `CloseOptionInternal` object, but as it's not on the hot code
> path,
> >> and both objects are short lived anyway, it seems
> >>
> >
>

Reply via email to