[ https://issues.apache.org/jira/browse/KAFKA-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
hudeqi updated KAFKA-15134:
---------------------------
    Description: 
First, here is the exception logged by the running client:

_"org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records."_

This client exception log was provided to us by an online business team. We first confirmed that there was no issue on the brokers. The exception message suggests that "max.poll.interval.ms" and "max.poll.records" may be set unreasonably, but the processing time logged by the business is much shorter than "max.poll.interval.ms", so we are sure that is not the cause.

In the end, we found the following case: the business uses a "group id" to subscribe to some partitions of "topic1" in simple consumer mode, with "auto.offset.reset=false" set. When a high level consumer is started on "topic2" with the same "group id", the original service using the simple consumer throws CommitFailedException. I have reproduced this behavior.

Strictly speaking, this is not a bug but a problem with how the client is used. However, I think the message of `CommitFailedException` is incomplete and can be misleading, so I have enriched it to mention that such special situations may exist.
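The collision described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not Kafka's code: it assumes that, for a given group id, a commit from a consumer that never joined the group (simple consumer mode) is accepted only while the group has no active members, and that a commit from a joined consumer is accepted only while it is a current member. The names `GroupCommitModel`, `Group`, `join`, and `acceptsCommit` are hypothetical and exist only for illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class GroupCommitModel {

    /** Toy stand-in for the coordinator's view of a single group id (hypothetical). */
    static final class Group {
        private final Set<String> activeMembers = new HashSet<>();

        /** A high level consumer joins the group and becomes an active member. */
        void join(String memberId) {
            activeMembers.add(memberId);
        }

        /**
         * Decide whether an offset commit would be accepted in this model.
         * A null memberId stands for a simple consumer that never joined the group.
         */
        boolean acceptsCommit(String memberId) {
            if (memberId == null) {
                // Assumption: a non-member may commit only while the group is empty.
                return activeMembers.isEmpty();
            }
            // A joined consumer may commit only while it is a current member.
            return activeMembers.contains(memberId);
        }
    }

    public static void main(String[] args) {
        Group group = new Group();

        // Simple consumer on "topic1" commits while nobody has joined the group.
        System.out.println("before join, simple consumer: " + group.acceptsCommit(null));

        // A high level consumer on "topic2" starts with the same group id.
        group.join("member-1");

        // The simple consumer's next commit is now rejected in this model,
        // which is the point where the client would see CommitFailedException.
        System.out.println("after join, simple consumer: " + group.acceptsCommit(null));
        System.out.println("after join, group member: " + group.acceptsCommit("member-1"));
    }
}
```

In this model the processing time of the simple consumer plays no role in the rejection, which matches the observation that tuning "max.poll.interval.ms" or "max.poll.records" did not help; using a distinct "group id" for each of the two services avoids the collision.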
> Enrich the prompt reason in CommitFailedException
> -------------------------------------------------
>
>                 Key: KAFKA-15134
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15134
>             Project: Kafka
>          Issue Type: Improvement
>          Components: clients
>    Affects Versions: 3.5.0
>            Reporter: hudeqi
>            Assignee: hudeqi
>            Priority: Major