guozhangwang commented on pull request #11340: URL: https://github.com/apache/kafka/pull/11340#issuecomment-948033052
Hi @RivenSun2 just to clarify my original comments here :) * I totally agree with you that we should NOT trigger `maybeAutoCommitOffsetsSync` inside `onJoinPrepare` with `time.timer(rebalanceConfig.rebalanceTimeoutMs` in any of the proposed options we've discussed. On the other hand, we do need to commit sync the offsets before rejoining the group since the consumer may got those partitions reassigned to others in the group, and hence it's important that the offsets are committed already so that others getting the partitions can read the committed offsets and start at the right position. If we allow async commit during rebalance, then we still need to wait for that commit to complete before we can continue to send the join-group request. * What I was wondering is, what timeout value we should use when triggering `maybeAutoCommitOffsetsSync` inside `onJoinPrepare`. The key point here is that, that timeout value should no larger than the `poll()` call's own passed in timeout itself, so it should be in the form of `MIN(poll timer, the configured time)` --- and again, that's why I was thinking that we should keep passing the poll timer down into the callees, since later when we remove the deprecated call we would not need to pass in the boolean flag anyways. The `configured time` itself, could be either 1) we use a separate config as you suggested, or 2) we piggy-back on the existing configs still, e.g. on `DEFAULT_API_TIMEOUT_MS_CONFIG` (since semantically that defines how long a blocking call, like commitSync should be take at max). The semantics is basically that, the poll call should definitely return within the specified timeout parameter value, no matter if it is blocked on committing offsets, or other things li ke fetching. * Now for `commitSync` called by users, my rationale is the same as `poll`: say, if the user calls `consumer.commitSync` or `consumer.poll` with a very large timer anyways, then it is the intentional behavior that we should try to succeed that request for at least that amount of time. Here I think the reason with short backoff time it would mean a lot of requests sent to the brokers during that period of time hence wasted broker's CPUs, is actually due to the lack of https://issues.apache.org/jira/browse/KAFKA-9800. But even with that, I'd consider that users still have a lot of different ways to bombard a brokers other than `commitSync` with manually specified non-existent topic partitions, or with `allConsumed` partitions that would not be refreshed yet. So personally, I'd suggest we make the following changes as a fix: 1) pass in the timer to `onJoinPrepare`, and inside it, trigger `maybeAutoCommitOffsetsSync` with the time as `MIN(passed-in poll timer, configured timeout)`. 2) complete https://issues.apache.org/jira/browse/KAFKA-9800 (KIP-580) to reduce the bombarding effect of retriable requests. In the future as we have KRaft for metadata propagation, we can then reason about whether `UnknownTopicOrPartition` error is certain, or not, and then treat it as fatal rather than retriable. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org