David Jacot created KAFKA-9844:
----------------------------------
Summary: Maximum number of members within a group is not always
enforced due to a race condition in join group
Key: KAFKA-9844
URL: https://issues.apache.org/jira/browse/KAFKA-9844
Project: Kafka
Issue Type: Bug
Affects Versions: 2.5.0
Reporter: David Jacot
Assignee: David Jacot
While analysing https://issues.apache.org/jira/browse/KAFKA-7965, I found out
that the constraint on the maximum number of members is not always enforced due
to a race condition.
When an unknown member joins the group, the group is automatically created if
it does not exist. The request then proceeds with an unknownJoinGroup. On that
path, the limit is not enforced because we assume that the group is empty at
this stage, since it did not exist. As the lookup and the creation are not
protected by a lock, multiple join requests can end up on that path and thus
bypass the enforcement.
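To make the interleaving concrete, here is a minimal sketch in Java. It is not the coordinator code: the class, the method names (racyJoin, checkedJoin, addUnchecked, addIfBelowLimit) and the in-memory map are all hypothetical, and the race is replayed sequentially instead of with real threads so that the outcome is deterministic. It only illustrates why deciding "the group is new, so skip the limit check" at lookup time is unsafe, and how checking the size at the moment of insertion avoids the problem.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the race; none of these names come from Kafka.
class GroupLimitRaceSketch {
    static final int MAX_SIZE = 2;
    static final String GROUP_ID = "group-max-size-test";

    static final class Group {
        private final Set<String> members = new LinkedHashSet<>();

        // Path taken when the caller believes the group was just created.
        synchronized void addUnchecked(String memberId) {
            members.add(memberId);
        }

        // Size check and insertion happen under the same lock.
        synchronized boolean addIfBelowLimit(String memberId) {
            if (members.size() >= MAX_SIZE) {
                return false;
            }
            return members.add(memberId);
        }

        synchronized int size() {
            return members.size();
        }
    }

    // Replays the problematic interleaving: every request performs its
    // lookup before any request has created the group.
    static int racyJoin(int requests) {
        Map<String, Group> groups = new HashMap<>();
        boolean[] groupWasMissing = new boolean[requests];
        for (int i = 0; i < requests; i++) {
            groupWasMissing[i] = !groups.containsKey(GROUP_ID);
        }
        for (int i = 0; i < requests; i++) {
            Group group = groups.computeIfAbsent(GROUP_ID, id -> new Group());
            String memberId = "member-" + i;
            if (groupWasMissing[i]) {
                // Stale assumption: "the group did not exist, so it is empty".
                group.addUnchecked(memberId);
            } else {
                group.addIfBelowLimit(memberId);
            }
        }
        return groups.get(GROUP_ID).size();
    }

    // Same requests, but the limit is always checked when the member is added.
    static int checkedJoin(int requests) {
        Map<String, Group> groups = new HashMap<>();
        for (int i = 0; i < requests; i++) {
            Group group = groups.computeIfAbsent(GROUP_ID, id -> new Group());
            group.addIfBelowLimit("member-" + i);
        }
        return groups.get(GROUP_ID).size();
    }

    public static void main(String[] args) {
        System.out.println("racy path members:    " + racyJoin(3));
        System.out.println("checked path members: " + checkedJoin(3));
    }
}
```

With three requests and a limit of 2, the racy replay ends with three members while the checked variant ends with two. An actual fix in the coordinator may look different; the sketch only shows that the check has to be made against the group's real state under its lock.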
Here is an example of the logs captured while troubleshooting KAFKA-7965. The
test sets up 3 consumers and uses a limit of 2. The logs show that all three
members were able to join the group without being evicted.
{noformat}
[2020-04-05 13:29:03,145] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Discovered group coordinator localhost:36449 (id:
2147483645 rack: null)
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:794)
[2020-04-05 13:29:03,145] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Discovered group coordinator localhost:36449 (id:
2147483645 rack: null)
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:794)
[2020-04-05 13:29:03,151] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Discovered group coordinator localhost:36449 (id:
2147483645 rack: null)
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:794)
[2020-04-05 13:29:03,153] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Attempt to heartbeat failed since member id
ConsumerTestConsumer-764a71ea-f9b3-462c-9986-8e6b2530d6e3 is not valid.
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:1054)
[2020-04-05 13:29:03,155] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Giving away all assigned partitions as lost since
generation has been reset,indicating that consumer is no longer part of the
group (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:670)
[2020-04-05 13:29:03,155] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Lost previously assigned partitions
group-max-size-test-5, group-max-size-test-4
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:314)
[2020-04-05 13:29:03,156] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] (Re-)joining group
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:551)
[2020-04-05 13:29:03,154] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Attempt to heartbeat failed since member id
ConsumerTestConsumer-2d2886ad-1244-4ef7-9e07-62282c3547fd is not valid.
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:1054)
[2020-04-05 13:29:03,156] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Attempt to heartbeat failed since member id
ConsumerTestConsumer-42d0fa9d-cfbb-458f-afe9-99a75fef8e08 is not valid.
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:1054)
[2020-04-05 13:29:03,157] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Giving away all assigned partitions as lost since
generation has been reset,indicating that consumer is no longer part of the
group (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:670)
[2020-04-05 13:29:03,158] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Lost previously assigned partitions
group-max-size-test-2, group-max-size-test-3
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:314)
[2020-04-05 13:29:03,158] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] (Re-)joining group
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:551)
[2020-04-05 13:29:03,157] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Giving away all assigned partitions as lost since
generation has been reset,indicating that consumer is no longer part of the
group (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:670)
[2020-04-05 13:29:03,159] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Lost previously assigned partitions
group-max-size-test-1, group-max-size-test-0
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:314)
[2020-04-05 13:29:03,159] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] (Re-)joining group
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:551)
[2020-04-05 13:29:03,160] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] (Re-)joining group
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:551)
[2020-04-05 13:29:03,161] INFO [GroupCoordinator 2]: Preparing to rebalance
group group-max-size-test in state PreparingRebalance with old generation 0
(__consumer_offsets-0) (reason: Adding new member
ConsumerTestConsumer-84fd5153-c425-464d-a724-04022a0608f7 with group instanceid
None) (kafka.coordinator.group.GroupCoordinator:66)
[2020-04-05 13:29:03,158] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] (Re-)joining group
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:551)
[2020-04-05 13:29:03,160] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] (Re-)joining group
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:551)
[2020-04-05 13:29:03,171] INFO [GroupCoordinator 2]: Stabilized group
group-max-size-test generation 1 (__consumer_offsets-0)
(kafka.coordinator.group.GroupCoordinator:66)
[2020-04-05 13:29:03,605] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Finished assignment for group at generation 1:
{ConsumerTestConsumer-84fd5153-c425-464d-a724-04022a0608f7=Assignment(partitions=[group-max-size-test-0,
group-max-size-test-1]),
ConsumerTestConsumer-e25aedeb-73fd-4fae-b56c-fa929f11a9df=Assignment(partitions=[group-max-size-test-4,
group-max-size-test-5]),
ConsumerTestConsumer-8ca065a1-2ce4-44d5-881c-c6f01cb0d110=Assignment(partitions=[group-max-size-test-2,
group-max-size-test-3])}
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:605)
[2020-04-05 13:29:03,606] INFO [GroupCoordinator 2]: Assignment received from
leader for group group-max-size-test for generation 1
(kafka.coordinator.group.GroupCoordinator:66)
[2020-04-05 13:29:03,610] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Successfully joined group with generation 1
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:502)
[2020-04-05 13:29:03,611] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Adding newly assigned partitions:
group-max-size-test-1, group-max-size-test-0
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:276)
[2020-04-05 13:29:03,612] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Found no committed offset for partition
group-max-size-test-1
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1297)
[2020-04-05 13:29:03,612] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Found no committed offset for partition
group-max-size-test-0
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1297)
[2020-04-05 13:29:03,611] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Successfully joined group with generation 1
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:502)
[2020-04-05 13:29:03,611] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Successfully joined group with generation 1
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:502)
[2020-04-05 13:29:03,614] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Adding newly assigned partitions:
group-max-size-test-2, group-max-size-test-3
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:276)
[2020-04-05 13:29:03,614] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Adding newly assigned partitions:
group-max-size-test-5, group-max-size-test-4
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:276)
[2020-04-05 13:29:03,616] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Found no committed offset for partition
group-max-size-test-2
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1297)
[2020-04-05 13:29:03,617] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Found no committed offset for partition
group-max-size-test-3
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1297)
[2020-04-05 13:29:03,617] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Resetting offset for partition
group-max-size-test-1 to offset 0.
(org.apache.kafka.clients.consumer.internals.SubscriptionState:383)
[2020-04-05 13:29:03,617] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Found no committed offset for partition
group-max-size-test-5
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1297)
[2020-04-05 13:29:03,618] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Found no committed offset for partition
group-max-size-test-4
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1297)
[2020-04-05 13:29:03,619] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Resetting offset for partition
group-max-size-test-3 to offset 0.
(org.apache.kafka.clients.consumer.internals.SubscriptionState:383)
[2020-04-05 13:29:03,619] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Resetting offset for partition
group-max-size-test-4 to offset 0.
(org.apache.kafka.clients.consumer.internals.SubscriptionState:383)
[2020-04-05 13:29:03,645] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Resetting offset for partition
group-max-size-test-2 to offset 0.
(org.apache.kafka.clients.consumer.internals.SubscriptionState:383)
[2020-04-05 13:29:03,646] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Resetting offset for partition
group-max-size-test-0 to offset 0.
(org.apache.kafka.clients.consumer.internals.SubscriptionState:383)
[2020-04-05 13:29:03,651] INFO [Consumer clientId=ConsumerTestConsumer,
groupId=group-max-size-test] Resetting offset for partition
group-max-size-test-5 to offset 0.
(org.apache.kafka.clients.consumer.internals.SubscriptionState:383){noformat}