[jira] [Commented] (KAFKA-16310) ListOffsets doesn't report the offset with maxTimestamp anymore
[ https://issues.apache.org/jira/browse/KAFKA-16310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823280#comment-17823280 ] Jason Gustafson commented on KAFKA-16310: - [~chia7712] Thanks for the analysis. Ironically, my original approach was to fix the compressed path to match the uncompressed path so that we passed along the actual offset of max timestamp. I ended up doing the opposite though. If either you or [~showuon] have time to submit a fix, I am happy to review. > ListOffsets doesn't report the offset with maxTimestamp anymore > --- > > Key: KAFKA-16310 > URL: https://issues.apache.org/jira/browse/KAFKA-16310 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.7.0 >Reporter: Emanuele Sabellico >Priority: Blocker > Fix For: 3.7.1 > > > Updated: This is confirmed to be a regression in v3.7.0. > The impact of this issue is that when a batch contains records with > timestamps out of order, the offset reported for a timestamp will be wrong > (e.g. the timestamp for t0 should map to offset 10, but offset 12 is > returned instead). The time index then stores the wrong offset, so lookups > return unexpected results. > === > The last offset is reported instead. > A test in librdkafka (0081/do_test_ListOffsets) is failing; it checks > that the offset with the max timestamp is the middle one and not the last > one. The test passes with 3.6.0 and previous versions. > This is the test: > [https://github.com/confluentinc/librdkafka/blob/a6d85bdbc1023b1a5477b8befe516242c3e182f6/tests/0081-admin.c#L4989] > > There are three messages, with timestamps: > {noformat} > t0 + 100 > t0 + 400 > t0 + 250{noformat} > and indices 0,1,2. > Then a ListOffsets request with RD_KAFKA_OFFSET_SPEC_MAX_TIMESTAMP is made. > It should return offset 1, but in 3.7.0 and trunk it returns offset 2. > Even 5 seconds after producing, it is still returning 2 as the offset with > max timestamp. 
> ProduceRequest and ListOffsets were sent to the same broker (2), the leader > didn't change. > {code:java} > %7|1709134230.019|SEND|0081_admin#producer-3| > [thrd:localhost:39951/bootstrap]: localhost:39951/2: Sent ProduceRequest (v7, > 206 bytes @ 0, CorrId 2) %7|1709134230.020|RECV|0081_admin#producer-3| > [thrd:localhost:39951/bootstrap]: localhost:39951/2: Received ProduceResponse > (v7, 95 bytes, CorrId 2, rtt 1.18ms) > %7|1709134230.020|MSGSET|0081_admin#producer-3| > [thrd:localhost:39951/bootstrap]: localhost:39951/2: > rdkafkatest_rnd22e8d8ec45b53f98_do_test_ListOffsets [0]: MessageSet with 3 > message(s) (MsgId 0, BaseSeq -1) delivered {code} > {code:java} > %7|1709134235.021|SEND|0081_admin#producer-2| > [thrd:localhost:39951/bootstrap]: localhost:39951/2: Sent ListOffsetsRequest > (v7, 103 bytes @ 0, CorrId 7) %7|1709134235.022|RECV|0081_admin#producer-2| > [thrd:localhost:39951/bootstrap]: localhost:39951/2: Received > ListOffsetsResponse (v7, 88 bytes, CorrId 7, rtt 0.54ms){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
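The expected semantics can be illustrated with a small standalone sketch (illustrative only, not Kafka's actual record-batch code): scan a batch's timestamps in offset order and track the offset holding the largest one. With the test's timestamps, the answer is the middle offset, 1, not the last offset.

```java
// Standalone sketch (not Kafka code): given record timestamps in offset
// order, return the offset of the record with the maximum timestamp.
public class MaxTimestampOffset {
    // Ties keep the earliest offset; offsets here are simply array indices.
    public static int offsetOfMaxTimestamp(long[] timestamps) {
        int bestOffset = 0;
        for (int i = 1; i < timestamps.length; i++) {
            if (timestamps[i] > timestamps[bestOffset]) {
                bestOffset = i;
            }
        }
        return bestOffset;
    }

    public static void main(String[] args) {
        long t0 = 1_000L;
        // Timestamps from the librdkafka test: t0+100, t0+400, t0+250
        long[] ts = {t0 + 100, t0 + 400, t0 + 250};
        // The middle record (offset 1) has the max timestamp, not the last.
        System.out.println(offsetOfMaxTimestamp(ts)); // prints 1
    }
}
```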
[jira] [Commented] (KAFKA-16179) NPE handle ApiVersions during controller failover
[ https://issues.apache.org/jira/browse/KAFKA-16179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809591#comment-17809591 ] Jason Gustafson commented on KAFKA-16179: - It is probably not a blocker. It looks like it is a race condition on controller shutdown. The impact is probably a failed connection on a shutting down controller. > NPE handle ApiVersions during controller failover > - > > Key: KAFKA-16179 > URL: https://issues.apache.org/jira/browse/KAFKA-16179 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > > {code:java} > java.lang.NullPointerException: Cannot invoke > "org.apache.kafka.metadata.publisher.FeaturesPublisher.features()" because > the return value of "kafka.server.ControllerServer.featuresPublisher()" is > null > at > kafka.server.ControllerServer.$anonfun$startup$12(ControllerServer.scala:242) > at > kafka.server.SimpleApiVersionManager.features(ApiVersionManager.scala:122) > at > kafka.server.SimpleApiVersionManager.apiVersionResponse(ApiVersionManager.scala:111) > at > kafka.server.ControllerApis.$anonfun$handleApiVersionsRequest$3(ControllerApis.scala:566) > at > kafka.server.ControllerApis.$anonfun$handleApiVersionsRequest$3$adapted(ControllerApis.scala:566 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16179) NPE handle ApiVersions during controller failover
Jason Gustafson created KAFKA-16179: --- Summary: NPE handle ApiVersions during controller failover Key: KAFKA-16179 URL: https://issues.apache.org/jira/browse/KAFKA-16179 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson {code:java} java.lang.NullPointerException: Cannot invoke "org.apache.kafka.metadata.publisher.FeaturesPublisher.features()" because the return value of "kafka.server.ControllerServer.featuresPublisher()" is null at kafka.server.ControllerServer.$anonfun$startup$12(ControllerServer.scala:242) at kafka.server.SimpleApiVersionManager.features(ApiVersionManager.scala:122) at kafka.server.SimpleApiVersionManager.apiVersionResponse(ApiVersionManager.scala:111) at kafka.server.ControllerApis.$anonfun$handleApiVersionsRequest$3(ControllerApis.scala:566) at kafka.server.ControllerApis.$anonfun$handleApiVersionsRequest$3$adapted(ControllerApis.scala:566 {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
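One way to harden this path, sketched below with hypothetical names (the actual fix may differ), is to avoid dereferencing the publisher before controller startup completes and fall back to an empty feature map instead of throwing an NPE:

```java
import java.util.Collections;
import java.util.Map;

// Illustrative sketch (hypothetical names, not the actual Kafka fix): treat a
// not-yet-initialized features publisher as an empty feature map instead of
// dereferencing a null field during controller startup/failover.
public class FeatureLookup {
    public static Map<String, Short> featuresOrDefault(Map<String, Short> publishedFeatures) {
        // publishedFeatures is null until the controller finishes startup
        return publishedFeatures != null ? publishedFeatures : Collections.emptyMap();
    }
}
```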
[jira] [Commented] (KAFKA-16012) Incomplete range assignment in consumer
[ https://issues.apache.org/jira/browse/KAFKA-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796889#comment-17796889 ] Jason Gustafson commented on KAFKA-16012: - This appears to be a regression introduced by KIP-951. I have a test case which reproduces the problem: [https://github.com/hachikuji/kafka/commit/489b81f81b401a28a604efcfce5059047558fe3e]. Basically we are losing some of the partition state after invoking `Metadata.updatePartitionLeadership`. And I do see this in the logs just before the assignment: {code:java} [2023-12-13 04:58:55,738] DEBUG [Consumer clientId=consumer-grouped-transactions-test-consumer-group-1, groupId=grouped-transactions-test-consumer-group] For input-topic-1, received error FENCED_LEADER_EPOCH, with leaderIdAndEpoch LeaderIdAndEpoch(leaderId=2, leaderEpoch=3) (org.apache.kafka.clients.consumer.internals.AbstractFetch) [2023-12-13 04:58:55,738] DEBUG [Consumer clientId=consumer-grouped-transactions-test-consumer-group-1, groupId=grouped-transactions-test-consumer-group] For input-topic-1, as the leader was updated, position will be validated. (org.apache.kafka.clients.consumer.internals.AbstractFetch) {code} > Incomplete range assignment in consumer > --- > > Key: KAFKA-16012 > URL: https://issues.apache.org/jira/browse/KAFKA-16012 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Blocker > Fix For: 3.7.0 > > > We were looking into test failures here: > https://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1702475525--jolshan--kafka-15784--7cad567675/2023-12-13--001./2023-12-13–001./report.html. 
> > Here is the first failure in the report: > {code:java} > > test_id: > kafkatest.tests.core.group_mode_transactions_test.GroupModeTransactionsTest.test_transactions.failure_mode=clean_bounce.bounce_target=brokers > status: FAIL > run time: 3 minutes 4.950 seconds > TimeoutError('Consumer consumed only 88223 out of 10 messages in > 90s') {code} > > We traced the failure to an apparent bug during the last rebalance before the > group became empty. The last remaining instance seems to receive an > incomplete assignment which prevents it from completing expected consumption > on some partitions. Here is the rebalance from the coordinator's perspective: > {code:java} > server.log.2023-12-13-04:[2023-12-13 04:58:56,987] INFO [GroupCoordinator 3]: > Stabilized group grouped-transactions-test-consumer-group generation 5 > (__consumer_offsets-2) with 1 members > (kafka.coordinator.group.GroupCoordinator) > server.log.2023-12-13-04:[2023-12-13 04:58:56,990] INFO [GroupCoordinator 3]: > Assignment received from leader > consumer-grouped-transactions-test-consumer-group-1-2164f472-93f3-4176-af3f-23d4ed8b37fd > for group grouped-transactions-test-consumer-group for generation 5. The > group has 1 members, 0 of which are static. > (kafka.coordinator.group.GroupCoordinator) {code} > The group is down to one member in generation 5. 
In the previous generation, > the consumer in question reported this assignment: > {code:java} > // Gen 4: we've got partitions 0-4 > [2023-12-13 04:58:52,631] DEBUG [Consumer > clientId=consumer-grouped-transactions-test-consumer-group-1, > groupId=grouped-transactions-test-consumer-group] Executing onJoinComplete > with generation 4 and memberId > consumer-grouped-transactions-test-consumer-group-1-2164f472-93f3-4176-af3f-23d4ed8b37fd > (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator) > [2023-12-13 04:58:52,631] INFO [Consumer > clientId=consumer-grouped-transactions-test-consumer-group-1, > groupId=grouped-transactions-test-consumer-group] Notifying assignor about > the new Assignment(partitions=[input-topic-0, input-topic-1, input-topic-2, > input-topic-3, input-topic-4]) > (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator) {code} > However, in generation 5, we seem to be assigned only one partition: > {code:java} > // Gen 5: Now we have only partition 1? But aren't we the last member in the > group? > [2023-12-13 04:58:56,954] DEBUG [Consumer > clientId=consumer-grouped-transactions-test-consumer-group-1, > groupId=grouped-transactions-test-consumer-group] Executing onJoinComplete > with generation 5 and memberId > consumer-grouped-transactions-test-consumer-group-1-2164f472-93f3-4176-af3f-23d4ed8b37fd > (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator) > [2023-12-13 04:58:56,955] INFO [Consumer > clientId=consumer-grouped-transactions-test-consumer-group-1, > groupId=grouped-transactions-test-consumer-group] Notifying assignor about > the new
[jira] [Created] (KAFKA-16012) Incomplete range assignment in consumer
Jason Gustafson created KAFKA-16012: --- Summary: Incomplete range assignment in consumer Key: KAFKA-16012 URL: https://issues.apache.org/jira/browse/KAFKA-16012 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson Fix For: 3.7.0 We were looking into test failures here: https://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/system-test-kafka-branch-builder--1702475525--jolshan--kafka-15784--7cad567675/2023-12-13--001./2023-12-13–001./report.html. Here is the first failure in the report: {code:java} test_id: kafkatest.tests.core.group_mode_transactions_test.GroupModeTransactionsTest.test_transactions.failure_mode=clean_bounce.bounce_target=brokers status: FAIL run time: 3 minutes 4.950 seconds TimeoutError('Consumer consumed only 88223 out of 10 messages in 90s') {code} We traced the failure to an apparent bug during the last rebalance before the group became empty. The last remaining instance seems to receive an incomplete assignment which prevents it from completing expected consumption on some partitions. Here is the rebalance from the coordinator's perspective: {code:java} server.log.2023-12-13-04:[2023-12-13 04:58:56,987] INFO [GroupCoordinator 3]: Stabilized group grouped-transactions-test-consumer-group generation 5 (__consumer_offsets-2) with 1 members (kafka.coordinator.group.GroupCoordinator) server.log.2023-12-13-04:[2023-12-13 04:58:56,990] INFO [GroupCoordinator 3]: Assignment received from leader consumer-grouped-transactions-test-consumer-group-1-2164f472-93f3-4176-af3f-23d4ed8b37fd for group grouped-transactions-test-consumer-group for generation 5. The group has 1 members, 0 of which are static. (kafka.coordinator.group.GroupCoordinator) {code} The group is down to one member in generation 5. 
In the previous generation, the consumer in question reported this assignment: {code:java} // Gen 4: we've got partitions 0-4 [2023-12-13 04:58:52,631] DEBUG [Consumer clientId=consumer-grouped-transactions-test-consumer-group-1, groupId=grouped-transactions-test-consumer-group] Executing onJoinComplete with generation 4 and memberId consumer-grouped-transactions-test-consumer-group-1-2164f472-93f3-4176-af3f-23d4ed8b37fd (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator) [2023-12-13 04:58:52,631] INFO [Consumer clientId=consumer-grouped-transactions-test-consumer-group-1, groupId=grouped-transactions-test-consumer-group] Notifying assignor about the new Assignment(partitions=[input-topic-0, input-topic-1, input-topic-2, input-topic-3, input-topic-4]) (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator) {code} However, in generation 5, we seem to be assigned only one partition: {code:java} // Gen 5: Now we have only partition 1? But aren't we the last member in the group? [2023-12-13 04:58:56,954] DEBUG [Consumer clientId=consumer-grouped-transactions-test-consumer-group-1, groupId=grouped-transactions-test-consumer-group] Executing onJoinComplete with generation 5 and memberId consumer-grouped-transactions-test-consumer-group-1-2164f472-93f3-4176-af3f-23d4ed8b37fd (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator) [2023-12-13 04:58:56,955] INFO [Consumer clientId=consumer-grouped-transactions-test-consumer-group-1, groupId=grouped-transactions-test-consumer-group] Notifying assignor about the new Assignment(partitions=[input-topic-1]) (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator) {code} The assignment type is range from the JoinGroup for generation 5. 
The decoded metadata sent by the consumer is this: {code:java} Subscription(topics=[input-topic], ownedPartitions=[], groupInstanceId=null, generationId=4, rackId=null) {code} Here is the decoded assignment from the SyncGroup: {code:java} Assignment(partitions=[input-topic-1]) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
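For context, the range assignor places contiguous partition ranges on members, so a group with a single remaining member must receive every partition of the subscribed topic. A simplified standalone sketch of that computation (illustrative, not the actual RangeAssignor implementation) shows the expected generation-5 outcome:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of range-style assignment for one topic: partitions are split into
// contiguous ranges, one range per member, earlier members getting the
// remainder. A sole member therefore owns all partitions.
public class RangeSketch {
    public static Map<String, List<Integer>> assign(List<String> members, int numPartitions) {
        Map<String, List<Integer>> result = new HashMap<>();
        List<String> sorted = new ArrayList<>(members);
        Collections.sort(sorted);
        int perMember = numPartitions / sorted.size();
        int extra = numPartitions % sorted.size();
        int next = 0;
        for (int i = 0; i < sorted.size(); i++) {
            int count = perMember + (i < extra ? 1 : 0);
            List<Integer> parts = new ArrayList<>();
            for (int j = 0; j < count; j++) parts.add(next++);
            result.put(sorted.get(i), parts);
        }
        return result;
    }

    public static void main(String[] args) {
        // One member left in generation 5: it should own input-topic 0..4,
        // not just partition 1 as observed in the logs above.
        System.out.println(assign(List.of("consumer-1"), 5));
    }
}
```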
[jira] [Created] (KAFKA-15828) Protect clients from broker hostname reuse
Jason Gustafson created KAFKA-15828: --- Summary: Protect clients from broker hostname reuse Key: KAFKA-15828 URL: https://issues.apache.org/jira/browse/KAFKA-15828 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson In some environments such as k8s, brokers may be assigned to nodes dynamically from an available pool. When a cluster is rolling, it is possible for the client to see the same node advertised for different broker IDs in a short period of time. For example, kafka-1 might be initially assigned to node1. Before the client is able to establish a connection, it could be that kafka-3 is now on node1 instead. Currently there is no protection in the client or in the protocol for this scenario. If the connection succeeds, the client will assume it has a good connection to kafka-1. Until something disrupts the connection, it will continue under this assumption even if the hostname for kafka-1 changes. We have observed this scenario in practice. The client connected to the wrong broker through stale hostname information. It was unable to produce data because of persistent NOT_LEADER errors. The only way to recover in the end was by restarting the client to force a reconnection. We have discussed a couple of potential solutions to this problem: # Let the client be smarter about managing the connection/hostname mapping. When it detects that a hostname has changed, it should force a disconnect to ensure it connects to the right node. # We can modify the protocol to verify that the client has connected to the intended broker. For example, we can add a field to ApiVersions to indicate the intended broker ID. The broker receiving the request can return an error if its ID does not match that in the request. Are there alternatives? -- This message was sent by Atlassian Jira (v8.20.10#820010)
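The first proposed solution can be sketched as client-side bookkeeping (hypothetical names, not actual NetworkClient code): remember the last advertised endpoint for each broker ID and report when it changes, so the client can proactively close the stale connection before reusing it.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of option 1 (illustrative): track the last advertised host:port per
// broker ID and detect changes, so a stale connection can be force-closed.
public class BrokerEndpointTracker {
    private final Map<Integer, String> lastEndpoint = new HashMap<>();

    // Returns true if this broker ID was previously seen at a different
    // endpoint, meaning any existing connection to it should be dropped.
    public boolean endpointChanged(int brokerId, String hostPort) {
        String previous = lastEndpoint.put(brokerId, hostPort);
        return previous != null && !previous.equals(hostPort);
    }
}
```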
[jira] [Commented] (KAFKA-15221) Potential race condition between requests from rebooted followers
[ https://issues.apache.org/jira/browse/KAFKA-15221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774161#comment-17774161 ] Jason Gustafson commented on KAFKA-15221: - We merged the fix to trunk. Given that it is a rare case which has existed for some time, we decided to target 3.7 forward and not cherry pick for older releases. > Potential race condition between requests from rebooted followers > - > > Key: KAFKA-15221 > URL: https://issues.apache.org/jira/browse/KAFKA-15221 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.5.0 >Reporter: Calvin Liu >Assignee: Calvin Liu >Priority: Blocker > Fix For: 3.7.0 > > > When the leader processes the fetch request, it does not acquire locks when > updating the replica fetch state. Then there can be a race between the fetch > requests from a rebooted follower. > T0, broker 1 sends a fetch to broker 0 (the leader). At the moment, broker 1 is > not in the ISR. > T1, broker 1 crashes. > T2, broker 1 is back online and receives a new broker epoch. Also, it sends a > new Fetch request. > T3, broker 0 receives the old fetch request and decides to expand the ISR. > T4, right before broker 0 starts to fill the AlterPartition request, the new > fetch request comes in and overwrites the fetch state. Then broker 0 uses the > new broker epoch on the AlterPartition request. > In this way, the AlterPartition request can get around KIP-903 and wrongly > update the ISR. -- This message was sent by Atlassian Jira (v8.20.10#820010)
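The direction implied by the description can be sketched as a guard on the fetch-state update (hypothetical names; the actual patch involves the leader's replica state and proper locking): an update carrying a broker epoch older than the newest one already seen is rejected instead of applied, so a stale in-flight fetch cannot race with the rebooted follower's newer state.

```java
// Illustrative sketch (not broker code): guard follower fetch-state updates
// with the broker epoch so an older in-flight fetch request cannot overwrite
// state established by a rebooted follower's newer epoch.
public class FetchStateGuard {
    private long brokerEpoch = -1L;

    // Records the epoch and returns true only if the request's epoch is at
    // least as new as the last one seen; stale requests are ignored.
    public synchronized boolean maybeUpdate(long requestBrokerEpoch) {
        if (requestBrokerEpoch < brokerEpoch) {
            return false; // stale fetch from before the follower rebooted
        }
        brokerEpoch = requestBrokerEpoch;
        return true;
    }
}
```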
[jira] [Resolved] (KAFKA-15221) Potential race condition between requests from rebooted followers
[ https://issues.apache.org/jira/browse/KAFKA-15221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-15221. - Fix Version/s: (was: 3.5.2) Resolution: Fixed > Potential race condition between requests from rebooted followers > - > > Key: KAFKA-15221 > URL: https://issues.apache.org/jira/browse/KAFKA-15221 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.5.0 >Reporter: Calvin Liu >Assignee: Calvin Liu >Priority: Blocker > Fix For: 3.7.0 > > > When the leader processes the fetch request, it does not acquire locks when > updating the replica fetch state. Then there can be a race between the fetch > requests from a rebooted follower. > T0, broker 1 sends a fetch to broker 0 (the leader). At the moment, broker 1 is > not in the ISR. > T1, broker 1 crashes. > T2, broker 1 is back online and receives a new broker epoch. Also, it sends a > new Fetch request. > T3, broker 0 receives the old fetch request and decides to expand the ISR. > T4, right before broker 0 starts to fill the AlterPartition request, the new > fetch request comes in and overwrites the fetch state. Then broker 0 uses the > new broker epoch on the AlterPartition request. > In this way, the AlterPartition request can get around KIP-903 and wrongly > update the ISR. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14694) RPCProducerIdManager should not wait for a new block
[ https://issues.apache.org/jira/browse/KAFKA-14694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14694. - Fix Version/s: 3.6.0 Resolution: Fixed > RPCProducerIdManager should not wait for a new block > > > Key: KAFKA-14694 > URL: https://issues.apache.org/jira/browse/KAFKA-14694 > Project: Kafka > Issue Type: Bug >Reporter: Jeff Kim >Assignee: Jeff Kim >Priority: Major > Fix For: 3.6.0 > > > RPCProducerIdManager initiates an async request to the controller to grab a > block of producer IDs and then blocks waiting for a response from the > controller. > This is done in the request handler threads while holding a global lock. This > means that if many producers are requesting producer IDs and the controller > is slow to respond, many threads can get stuck waiting for the lock. > This may also be a deadlock concern under the following scenario: > If the controller has 1 request handler thread (1 chosen for simplicity) and > receives an InitProducerId request, it may deadlock. > Basically, any time the controller has N InitProducerId requests, where N >= the number > of request handler threads, there is the potential to deadlock. > Consider this: > 1. The request handler thread tries to handle an InitProducerId request to > the controller by forwarding an AllocateProducerIds request. > 2. The request handler thread then waits on the controller response (timed > poll on nextProducerIdBlock). > 3. The controller's request handler threads need to pick this request up and > handle it, but the controller's request handler threads are blocked waiting > for the forwarded AllocateProducerIds response. > > We should not block while waiting for a new block; instead, we should return > immediately to free the request handler threads. -- This message was sent by Atlassian Jira (v8.20.10#820010)
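The proposed direction (return immediately rather than park the request handler thread) might look like this sketch; the class and method names are hypothetical, not the actual RPCProducerIdManager API:

```java
import java.util.ArrayDeque;
import java.util.OptionalLong;
import java.util.Queue;

// Illustrative sketch: hand out producer IDs from a locally cached block;
// when the block is exhausted, mark an async refill as requested and return
// empty immediately, so the request handler thread is never parked waiting
// on the controller.
public class NonBlockingIdManager {
    private final Queue<Long> block = new ArrayDeque<>();
    private boolean refillRequested = false;

    public synchronized OptionalLong nextProducerId() {
        Long id = block.poll();
        if (id != null) return OptionalLong.of(id);
        if (!refillRequested) {
            refillRequested = true; // would send AllocateProducerIds asynchronously
        }
        return OptionalLong.empty(); // caller retries later; thread not blocked
    }

    // Invoked from the async response handler when a new block arrives.
    public synchronized void onBlockReceived(long firstId, int size) {
        for (long i = 0; i < size; i++) block.add(firstId + i);
        refillRequested = false;
    }
}
```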
[jira] [Created] (KAFKA-14831) Illegal state errors should be fatal in transactional producer
Jason Gustafson created KAFKA-14831: --- Summary: Illegal state errors should be fatal in transactional producer Key: KAFKA-14831 URL: https://issues.apache.org/jira/browse/KAFKA-14831 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson In KAFKA-14830, the producer hit an illegal state error. The error was propagated to the `Sender` thread and logged, but the producer otherwise continued on. It would be better to make illegal state errors fatal since continuing to write to transactions when the internal state is inconsistent may cause incorrect and unpredictable behavior. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14830) Illegal state error in transactional producer
Jason Gustafson created KAFKA-14830: --- Summary: Illegal state error in transactional producer Key: KAFKA-14830 URL: https://issues.apache.org/jira/browse/KAFKA-14830 Project: Kafka Issue Type: Bug Affects Versions: 3.1.2 Reporter: Jason Gustafson We have seen the following illegal state error in the producer: {code:java} [Producer clientId=client-id2, transactionalId=transactional-id] Transiting to abortable error state due to org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for topic-0:120027 ms has passed since batch creation [Producer clientId=client-id2, transactionalId=transactional-id] Transiting to abortable error state due to org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for topic-1:120026 ms has passed since batch creation [Producer clientId=client-id2, transactionalId=transactional-id] Aborting incomplete transaction [Producer clientId=client-id2, transactionalId=transactional-id] Invoking InitProducerId with current producer ID and epoch ProducerIdAndEpoch(producerId=191799, epoch=0) in order to bump the epoch [Producer clientId=client-id2, transactionalId=transactional-id] ProducerId set to 191799 with epoch 1 [Producer clientId=client-id2, transactionalId=transactional-id] Transiting to abortable error state due to org.apache.kafka.common.errors.NetworkException: Disconnected from node 4 [Producer clientId=client-id2, transactionalId=transactional-id] Transiting to abortable error state due to org.apache.kafka.common.errors.TimeoutException: The request timed out. 
[Producer clientId=client-id2, transactionalId=transactional-id] Uncaught error in request completion: java.lang.IllegalStateException: TransactionalId transactional-id: Invalid transition attempted from state READY to state ABORTABLE_ERROR at org.apache.kafka.clients.producer.internals.TransactionManager.transitionTo(TransactionManager.java:1089) at org.apache.kafka.clients.producer.internals.TransactionManager.transitionToAbortableError(TransactionManager.java:508) at org.apache.kafka.clients.producer.internals.TransactionManager.maybeTransitionToErrorState(TransactionManager.java:734) at org.apache.kafka.clients.producer.internals.TransactionManager.handleFailedBatch(TransactionManager.java:739) at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:753) at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:743) at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:695) at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:634) at org.apache.kafka.clients.producer.internals.Sender.lambda$null$1(Sender.java:575) at java.base/java.util.ArrayList.forEach(ArrayList.java:1541) at org.apache.kafka.clients.producer.internals.Sender.lambda$handleProduceResponse$2(Sender.java:562) at java.base/java.lang.Iterable.forEach(Iterable.java:75) at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:562) at org.apache.kafka.clients.producer.internals.Sender.lambda$sendProduceRequest$5(Sender.java:836) at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109) at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:583) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:575) at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:328) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:243) at java.base/java.lang.Thread.run(Thread.java:829) {code} The producer hits 
timeouts which cause it to abort an active transaction. After aborting, the producer bumps its epoch, which transitions it back to the `READY` state. Following this, there are two errors for inflight requests, which cause an illegal state transition to `ABORTABLE_ERROR`. But how could the transaction ABORT complete if there were still inflight requests? -- This message was sent by Atlassian Jira (v8.20.10#820010)
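The error indicates a transition-table violation. A simplified sketch of such a table check (the states and allowed transitions here are illustrative, not the full TransactionManager state machine) shows why READY to ABORTABLE_ERROR is rejected:

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Simplified sketch of a transaction state-machine check. The transition
// table is illustrative only; the real producer has more states.
public class TxnStateMachine {
    public enum State { READY, IN_TRANSACTION, ABORTING_TRANSACTION, ABORTABLE_ERROR }

    private static final Map<State, Set<State>> VALID = new EnumMap<>(State.class);
    static {
        VALID.put(State.READY, EnumSet.of(State.IN_TRANSACTION));
        VALID.put(State.IN_TRANSACTION, EnumSet.of(State.ABORTING_TRANSACTION, State.ABORTABLE_ERROR));
        VALID.put(State.ABORTING_TRANSACTION, EnumSet.of(State.READY));
        VALID.put(State.ABORTABLE_ERROR, EnumSet.of(State.ABORTING_TRANSACTION));
    }

    // A failed batch arriving after the abort completed attempts
    // READY -> ABORTABLE_ERROR, which this table rejects.
    public static boolean isValidTransition(State from, State to) {
        return VALID.getOrDefault(from, EnumSet.noneOf(State.class)).contains(to);
    }
}
```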
[jira] [Updated] (KAFKA-14664) Raft idle ratio is inaccurate
[ https://issues.apache.org/jira/browse/KAFKA-14664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14664: Fix Version/s: 3.5.0 > Raft idle ratio is inaccurate > - > > Key: KAFKA-14664 > URL: https://issues.apache.org/jira/browse/KAFKA-14664 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 3.5.0 > > > The `poll-idle-ratio-avg` metric is intended to track how idle the raft IO > thread is. When completely idle, it should measure 1. When saturated, it > should measure 0. The problem with the current measurements is that they are > treated equally with respect to time. For example, say we poll twice with the > following durations: > Poll 1: 2s > Poll 2: 0s > Assume that the busy time is negligible, so 2s passes overall. > In the first measurement, 2s is spent waiting, so we compute and record a > ratio of 1.0. In the second measurement, no time passes, and we record 0.0. > The idle ratio is then computed as the average of these two values ((1.0 + 0.0) > / 2 = 0.5), which suggests that the process was busy for 1s and > overestimates the true busy time. > Instead, we should sum up the time waiting over the full interval. 2s passes > total here and 2s is idle, so we should compute 1.0. -- This message was sent by Atlassian Jira (v8.20.10#820010)
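The two averaging schemes from the example above can be checked directly. This standalone sketch computes both over the Poll 1 / Poll 2 durations: the equal-weight average understates idleness at 0.5, while the time-weighted sum gives the true ratio of 1.0.

```java
// Worked example of the two averaging schemes for the poll-idle ratio.
public class IdleRatioExample {
    // Current behavior: average the per-poll ratios with equal weight.
    // A zero-duration poll records a ratio of 0.0, as in the ticket.
    public static double simpleAverage(double[] idleSecs, double[] totalSecs) {
        double sum = 0.0;
        for (int i = 0; i < idleSecs.length; i++) {
            sum += totalSecs[i] == 0.0 ? 0.0 : idleSecs[i] / totalSecs[i];
        }
        return sum / idleSecs.length;
    }

    // Proposed behavior: sum idle time over the full interval.
    public static double timeWeighted(double[] idleSecs, double[] totalSecs) {
        double idleSum = 0.0, totalSum = 0.0;
        for (int i = 0; i < idleSecs.length; i++) {
            idleSum += idleSecs[i];
            totalSum += totalSecs[i];
        }
        return totalSum == 0.0 ? 1.0 : idleSum / totalSum;
    }

    public static void main(String[] args) {
        double[] idle = {2.0, 0.0};   // seconds spent waiting in each poll
        double[] total = {2.0, 0.0};  // seconds elapsed in each poll
        System.out.println(simpleAverage(idle, total)); // prints 0.5 (overstates busy time)
        System.out.println(timeWeighted(idle, total));  // prints 1.0 (true idle ratio)
    }
}
```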
[jira] [Resolved] (KAFKA-14664) Raft idle ratio is inaccurate
[ https://issues.apache.org/jira/browse/KAFKA-14664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14664. - Resolution: Fixed > Raft idle ratio is inaccurate > - > > Key: KAFKA-14664 > URL: https://issues.apache.org/jira/browse/KAFKA-14664 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 3.5.0 > > > The `poll-idle-ratio-avg` metric is intended to track how idle the raft IO > thread is. When completely idle, it should measure 1. When saturated, it > should measure 0. The problem with the current measurements is that they are > treated equally with respect to time. For example, say we poll twice with the > following durations: > Poll 1: 2s > Poll 2: 0s > Assume that the busy time is negligible, so 2s passes overall. > In the first measurement, 2s is spent waiting, so we compute and record a > ratio of 1.0. In the second measurement, no time passes, and we record 0.0. > The idle ratio is then computed as the average of these two values ((1.0 + 0.0) > / 2 = 0.5), which suggests that the process was busy for 1s and > overestimates the true busy time. > Instead, we should sum up the time waiting over the full interval. 2s passes > total here and 2s is idle, so we should compute 1.0. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14664) Raft idle ratio is inaccurate
[ https://issues.apache.org/jira/browse/KAFKA-14664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14664: Affects Version/s: 3.3.2 3.3.1 3.4.0 3.3.0 > Raft idle ratio is inaccurate > - > > Key: KAFKA-14664 > URL: https://issues.apache.org/jira/browse/KAFKA-14664 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.3.0, 3.4.0, 3.3.1, 3.3.2 >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 3.5.0 > > > The `poll-idle-ratio-avg` metric is intended to track how idle the raft IO > thread is. When completely idle, it should measure 1. When saturated, it > should measure 0. The problem with the current measurements is that they are > treated equally with respect to time. For example, say we poll twice with the > following durations: > Poll 1: 2s > Poll 2: 0s > Assume that the busy time is negligible, so 2s passes overall. > In the first measurement, 2s is spent waiting, so we compute and record a > ratio of 1.0. In the second measurement, no time passes, and we record 0.0. > The idle ratio is then computed as the average of these two values ((1.0 + 0.0) > / 2 = 0.5), which suggests that the process was busy for 1s and > overestimates the true busy time. > Instead, we should sum up the time waiting over the full interval. 2s passes > total here and 2s is idle, so we should compute 1.0. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-6793) Unnecessary warning log message
[ https://issues.apache.org/jira/browse/KAFKA-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-6793. Fix Version/s: 3.5.0 Resolution: Fixed We resolved this by changing the logging to info level. It may still be useful in some cases, but most of the time the "warning" is spurious. > Unnecessary warning log message > > > Key: KAFKA-6793 > URL: https://issues.apache.org/jira/browse/KAFKA-6793 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 1.1.0 >Reporter: Anna O >Assignee: Philip Nee >Priority: Minor > Fix For: 3.5.0 > > > When KafkaStreams was upgraded from 0.11.0.2 to 1.1.0, the following warning log > started to appear: > level: WARN > logger: org.apache.kafka.clients.consumer.ConsumerConfig > message: The configuration 'admin.retries' was supplied but isn't a known > config. > The config is not explicitly supplied to the streams. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-13972) Reassignment cancellation causes stray replicas
[ https://issues.apache.org/jira/browse/KAFKA-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-13972. - Resolution: Fixed > Reassignment cancellation causes stray replicas > --- > > Key: KAFKA-13972 > URL: https://issues.apache.org/jira/browse/KAFKA-13972 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 3.4.1 > > > A stray replica is one that is left behind on a broker after the partition > has been reassigned to other brokers or the partition has been deleted. We > found one case where this can happen is after a cancelled reassignment. When > a reassignment is cancelled, the controller sends `StopReplica` requests to > any of the adding replicas, but it does not necessarily bump the leader > epoch. Following > [KIP-570|https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest], > brokers will ignore `StopReplica` requests if the leader epoch matches the > current partition leader epoch. So we need to bump the epoch whenever we need > to ensure that `StopReplica` will be received. -- This message was sent by Atlassian Jira (v8.20.10#820010)
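Per the description above, brokers ignore a `StopReplica` whose leader epoch does not exceed their current epoch for the partition, which is why the controller must bump the epoch for the request to take effect. A tiny illustrative check (hypothetical helper, not broker code):

```java
// Illustrative sketch of the KIP-570 style broker-side check: a StopReplica
// request is processed only when its leader epoch is newer than the broker's
// current epoch for the partition. Bumping the epoch on reassignment
// cancellation makes the request's epoch strictly newer, so it is honored.
public class StopReplicaCheck {
    public static boolean shouldProcess(int requestLeaderEpoch, int currentLeaderEpoch) {
        return requestLeaderEpoch > currentLeaderEpoch;
    }
}
```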
[jira] [Commented] (KAFKA-14672) Producer queue time does not reflect batches expired in the accumulator
[ https://issues.apache.org/jira/browse/KAFKA-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17683183#comment-17683183 ] Jason Gustafson commented on KAFKA-14672: - [~kirktrue] Nope. Please do. > Producer queue time does not reflect batches expired in the accumulator > --- > > Key: KAFKA-14672 > URL: https://issues.apache.org/jira/browse/KAFKA-14672 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Kirk True >Priority: Major > > The producer exposes two metrics for the time a record has spent in the > accumulator waiting to be drained: > * `record-queue-time-avg` > * `record-queue-time-max` > The metric is only updated when a batch is ready to send to a broker. It is > also possible for a batch to be expired before it can be sent, but in this > case, the metric is not updated. This seems surprising and makes the queue > time misleading. The only metric I could find that does reflect batch > expirations in the accumulator is the generic `record-error-rate`. It would > make sense to let the queue-time metrics record the time spent in the queue > regardless of the outcome of the record send attempt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14672) Producer queue time does not reflect batches expired in the accumulator
[ https://issues.apache.org/jira/browse/KAFKA-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14672: Description: The producer exposes two metrics for the time a record has spent in the accumulator waiting to be drained: * `record-queue-time-avg` * `record-queue-time-max` The metric is only updated when a batch is ready to send to a broker. It is also possible for a batch to be expired before it can be sent, but in this case, the metric is not updated. This seems surprising and makes the queue time misleading. The only metric I could find that does reflect batch expirations in the accumulator is the generic `record-error-rate`. It would make sense to let the queue-time metrics record the time spent in the queue regardless of the outcome of the record send attempt. was: The producer exposes two metrics for the time a record has spent in the accumulator waiting to be drained: * `record-queue-time-avg` * `record-queue-time-max` The metric is only updated when a batch is drained for sending to the broker. It is also possible for a batch to be expired before it can be drained, but in this case, the metric is not updated. This seems surprising and makes the queue time misleading. The only metric I could find that does reflect batch expirations in the accumulator is the generic `record-error-rate`. It would make sense to let the queue-time metrics record the time spent in the queue regardless of the outcome of the record send attempt. > Producer queue time does not reflect batches expired in the accumulator > --- > > Key: KAFKA-14672 > URL: https://issues.apache.org/jira/browse/KAFKA-14672 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > > The producer exposes two metrics for the time a record has spent in the > accumulator waiting to be drained: > * `record-queue-time-avg` > * `record-queue-time-max` > The metric is only updated when a batch is ready to send to a broker. 
It is > also possible for a batch to be expired before it can be sent, but in this > case, the metric is not updated. This seems surprising and makes the queue > time misleading. The only metric I could find that does reflect batch > expirations in the accumulator is the generic `record-error-rate`. It would > make sense to let the queue-time metrics record the time spent in the queue > regardless of the outcome of the record send attempt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14672) Producer queue time does not reflect batches expired in the accumulator
Jason Gustafson created KAFKA-14672: --- Summary: Producer queue time does not reflect batches expired in the accumulator Key: KAFKA-14672 URL: https://issues.apache.org/jira/browse/KAFKA-14672 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson The producer exposes two metrics for the time a record has spent in the accumulator waiting to be drained: * `record-queue-time-avg` * `record-queue-time-max` The metric is only updated when a batch is drained for sending to the broker. It is also possible for a batch to be expired before it can be drained, but in this case, the metric is not updated. This seems surprising and makes the queue time misleading. The only metric I could find that does reflect batch expirations in the accumulator is the generic `record-error-rate`. It would make sense to let the queue-time metrics record the time spent in the queue regardless of the outcome of the record send attempt. -- This message was sent by Atlassian Jira (v8.20.10#820010)
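The fix direction suggested in KAFKA-14672 can be sketched as recording the queue time whenever a batch leaves the accumulator, whether it was drained to a broker or expired by the delivery timeout. This is an illustrative simplification, not the actual producer internals; the class and method names are hypothetical.

```java
// Hypothetical sketch: update the queue-time metrics on every batch
// completion, regardless of outcome (drained OR expired in the accumulator).
class QueueTimeMetrics {
    private double sumMs = 0.0;
    private double maxMs = 0.0;
    private long count = 0;

    // Called when a batch leaves the accumulator for any reason:
    // drained for sending to a broker, or expired by the delivery timeout.
    public void recordBatchDone(long createdMs, long nowMs) {
        double queueTimeMs = nowMs - createdMs;
        sumMs += queueTimeMs;
        count++;
        maxMs = Math.max(maxMs, queueTimeMs);
    }

    public double avg() { return count == 0 ? 0.0 : sumMs / count; }
    public double max() { return maxMs; }
}
```

Under the current behavior only the drained batch would be recorded; routing expired batches through the same path keeps `record-queue-time-avg`/`record-queue-time-max` honest about time actually spent in the queue.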
[jira] [Updated] (KAFKA-14664) Raft idle ratio is inaccurate
[ https://issues.apache.org/jira/browse/KAFKA-14664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14664: Description: The `poll-idle-ratio-avg` metric is intended to track how idle the raft IO thread is. When completely idle, it should measure 1. When saturated, it should measure 0. The problem with the current measurements is that they are treated equally with respect to time. For example, say we poll twice with the following durations: Poll 1: 2s Poll 2: 0s Assume that the busy time is negligible, so 2s passes overall. In the first measurement, 2s is spent waiting, so we compute and record a ratio of 1.0. In the second measurement, no time passes, and we record 0.0. The idle ratio is then computed as the average of these two values ((1.0 + 0.0) / 2 = 0.5), which suggests that the process was busy for 1s, which overestimates the true busy time. Instead, we should sum up the time waiting over the full interval. 2s passes total here and 2s is idle, so we should compute 1.0. was: The `poll-idle-ratio-avg` metric is intended to track how idle the raft IO thread is. When completely idle, it should measure 1. When saturated, it should measure 0. The problem with the current measurements is that they are treated equally with respect to time. For example, say we poll twice with the following durations: Poll 1: 2s Poll 2: 0s Assume that the busy time is negligible, so 2s passes overall. In the first measurement, 2s is spent waiting, so we compute and record a ratio of 1.0. In the second measurement, no time passes, and we record 0.0. The idle ratio is then computed as the average of these two values ((1.0 + 0.0) / 2 = 0.5), which suggests that the process was busy for 1s. Instead, we should sum up the time waiting over the full interval. 2s passes total here and 2s is idle, so we should compute 1.0. 
> Raft idle ratio is inaccurate > - > > Key: KAFKA-14664 > URL: https://issues.apache.org/jira/browse/KAFKA-14664 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > > The `poll-idle-ratio-avg` metric is intended to track how idle the raft IO > thread is. When completely idle, it should measure 1. When saturated, it > should measure 0. The problem with the current measurements is that they are > treated equally with respect to time. For example, say we poll twice with the > following durations: > Poll 1: 2s > Poll 2: 0s > Assume that the busy time is negligible, so 2s passes overall. > In the first measurement, 2s is spent waiting, so we compute and record a > ratio of 1.0. In the second measurement, no time passes, and we record 0.0. > The idle ratio is then computed as the average of these two values ((1.0 + 0.0) > / 2 = 0.5), which suggests that the process was busy for 1s, which > overestimates the true busy time. > Instead, we should sum up the time waiting over the full interval. 2s passes > total here and 2s is idle, so we should compute 1.0. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14664) Raft idle ratio is inaccurate
Jason Gustafson created KAFKA-14664: --- Summary: Raft idle ratio is inaccurate Key: KAFKA-14664 URL: https://issues.apache.org/jira/browse/KAFKA-14664 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson Assignee: Jason Gustafson The `poll-idle-ratio-avg` metric is intended to track how idle the raft IO thread is. When completely idle, it should measure 1. When saturated, it should measure 0. The problem with the current measurements is that they are treated equally with respect to time. For example, say we poll twice with the following durations: Poll 1: 2s Poll 2: 0s Assume that the busy time is negligible, so 2s passes overall. In the first measurement, 2s is spent waiting, so we compute and record a ratio of 1.0. In the second measurement, no time passes, and we record 0.0. The idle ratio is then computed as the average of these two values ((1.0 + 0.0) / 2 = 0.5), which suggests that the process was busy for 1s. Instead, we should sum up the time waiting over the full interval. 2s passes total here and 2s is idle, so we should compute 1.0. -- This message was sent by Atlassian Jira (v8.20.10#820010)
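The arithmetic described in KAFKA-14664 can be made concrete with a small sketch comparing the two aggregation strategies. The names below are illustrative, not the raft client's actual code: with polls of 2s idle and 0s idle (negligible busy time), the naive per-poll average reports 0.5 while the time-weighted ratio correctly reports 1.0.

```java
// Illustrative comparison of the two ways to aggregate the idle ratio.
class IdleRatio {
    // Naive: each poll's ratio counts equally, however long it lasted.
    // A 0-duration poll records 0.0 and drags the average down.
    public static double naiveAvg(double[] idleMs, double[] totalMs) {
        double sum = 0.0;
        for (int i = 0; i < idleMs.length; i++) {
            sum += totalMs[i] == 0 ? 0.0 : idleMs[i] / totalMs[i];
        }
        return sum / idleMs.length;
    }

    // Time-weighted: total idle time over total elapsed time across polls.
    public static double timeWeighted(double[] idleMs, double[] totalMs) {
        double idle = 0.0, total = 0.0;
        for (int i = 0; i < idleMs.length; i++) {
            idle += idleMs[i];
            total += totalMs[i];
        }
        return total == 0 ? 0.0 : idle / total;
    }
}
```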
[jira] [Resolved] (KAFKA-14644) Process should stop after failure in raft IO thread
[ https://issues.apache.org/jira/browse/KAFKA-14644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14644. - Fix Version/s: 3.5.0 Resolution: Fixed > Process should stop after failure in raft IO thread > --- > > Key: KAFKA-14644 > URL: https://issues.apache.org/jira/browse/KAFKA-14644 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > Fix For: 3.5.0 > > > We have seen a few cases where an unexpected error in the Raft IO thread > causes the process to enter a zombie state where it is no longer > participating in the raft quorum. In this state, a controller can no longer > become leader or help in elections, and brokers can no longer update > metadata. It may be better to stop the process in this case since there is no > way to recover. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14648) Do not fail clients if bootstrap servers is not immediately resolvable
Jason Gustafson created KAFKA-14648: --- Summary: Do not fail clients if bootstrap servers is not immediately resolvable Key: KAFKA-14648 URL: https://issues.apache.org/jira/browse/KAFKA-14648 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson In dynamic environments, such as system tests, there is sometimes a delay between when a client is initialized and when the configured bootstrap servers become available in DNS. Currently clients will fail immediately if none of the bootstrap servers can resolve. It would be more convenient for these environments to provide a grace period to give more time for initialization. -- This message was sent by Atlassian Jira (v8.20.10#820010)
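The grace period proposed in KAFKA-14648 could look roughly like the following: retry bootstrap DNS resolution until a deadline instead of failing on the first lookup. This is a hypothetical sketch; the class name, signature, and retry policy are illustrative and not the actual client change.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch of a bootstrap resolution grace period.
class BootstrapResolver {
    public static InetAddress[] resolveWithGrace(String host, long graceMs, long backoffMs) {
        long deadline = System.currentTimeMillis() + graceMs;
        while (true) {
            try {
                return InetAddress.getAllByName(host);
            } catch (UnknownHostException e) {
                if (System.currentTimeMillis() >= deadline) {
                    // Grace period exhausted: fail as the client does today.
                    throw new RuntimeException("bootstrap servers unresolvable after grace period", e);
                }
                try {
                    Thread.sleep(backoffMs);  // back off before retrying DNS
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(ie);
                }
            }
        }
    }
}
```

In a system-test environment where DNS entries appear a few seconds after the client starts, a grace period like this avoids spurious startup failures while preserving the eventual error if the names never resolve.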
[jira] [Created] (KAFKA-14644) Process should stop after failure in raft IO thread
Jason Gustafson created KAFKA-14644: --- Summary: Process should stop after failure in raft IO thread Key: KAFKA-14644 URL: https://issues.apache.org/jira/browse/KAFKA-14644 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson We have seen a few cases where an unexpected error in the Raft IO thread causes the process to enter a zombie state where it is no longer participating in the raft quorum. In this state, a controller can no longer become leader or help in elections, and brokers can no longer update metadata. It may be better to stop the process in this case since there is no way to recover. -- This message was sent by Atlassian Jira (v8.20.10#820010)
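The behavior proposed in KAFKA-14644 amounts to a fatal fault handler around the Raft IO loop: on an unexpected error, stop the process rather than linger as a zombie. The sketch below is illustrative (in production the handler would call something like `Runtime.getRuntime().halt(1)`; here the exit action is injected so it can be exercised without killing the JVM).

```java
import java.util.function.IntConsumer;

// Hypothetical sketch: any uncaught Throwable in the raft IO loop
// terminates the process instead of leaving a non-participating zombie.
class RaftIoThread {
    public static void runIoLoop(Runnable pollOnce, IntConsumer halt) {
        try {
            while (true) {
                pollOnce.run();  // one iteration of raft IO work
            }
        } catch (Throwable t) {
            // There is no safe way to rejoin the quorum after an unexpected
            // error, so stop the process; injected here for testability,
            // in production: Runtime.getRuntime().halt(1).
            System.err.println("Fatal error in raft IO thread: " + t);
            halt.accept(1);
        }
    }
}
```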
[jira] [Commented] (KAFKA-13972) Reassignment cancellation causes stray replicas
[ https://issues.apache.org/jira/browse/KAFKA-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678390#comment-17678390 ] Jason Gustafson commented on KAFKA-13972: - The patch still needs to be picked into 3.4 after we have finished with 3.4.0. > Reassignment cancellation causes stray replicas > --- > > Key: KAFKA-13972 > URL: https://issues.apache.org/jira/browse/KAFKA-13972 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 3.4.1 > > > A stray replica is one that is left behind on a broker after the partition > has been reassigned to other brokers or the partition has been deleted. One > case where we found this can happen is after a cancelled reassignment. When > a reassignment is cancelled, the controller sends `StopReplica` requests to > any of the adding replicas, but it does not necessarily bump the leader > epoch. Following > [KIP-570|https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest], > brokers will ignore `StopReplica` requests if the leader epoch matches the > current partition leader epoch. So we need to bump the epoch whenever we need > to ensure that `StopReplica` will be received. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-13972) Reassignment cancellation causes stray replicas
[ https://issues.apache.org/jira/browse/KAFKA-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-13972: Fix Version/s: 3.4.1 > Reassignment cancellation causes stray replicas > --- > > Key: KAFKA-13972 > URL: https://issues.apache.org/jira/browse/KAFKA-13972 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 3.4.1 > > > A stray replica is one that is left behind on a broker after the partition > has been reassigned to other brokers or the partition has been deleted. One > case where we found this can happen is after a cancelled reassignment. When > a reassignment is cancelled, the controller sends `StopReplica` requests to > any of the adding replicas, but it does not necessarily bump the leader > epoch. Following > [KIP-570|https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest], > brokers will ignore `StopReplica` requests if the leader epoch matches the > current partition leader epoch. So we need to bump the epoch whenever we need > to ensure that `StopReplica` will be received. -- This message was sent by Atlassian Jira (v8.20.10#820010)
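The KIP-570 staleness check described in KAFKA-13972 reduces to a simple epoch comparison, sketched below with hypothetical names (not the broker's actual code): a broker only acts on a `StopReplica` whose leader epoch is newer than the one it already knows, which is why a cancelled reassignment must bump the epoch for the request to take effect.

```java
// Illustrative sketch of the KIP-570 fencing rule for StopReplica.
class StopReplicaCheck {
    // Returns true if the broker should act on the StopReplica request.
    public static boolean shouldStopReplica(int requestLeaderEpoch, int currentLeaderEpoch) {
        // If the controller cancels a reassignment without bumping the
        // epoch, the request carries the current epoch, is ignored here,
        // and the adding replica is left behind as a stray.
        return requestLeaderEpoch > currentLeaderEpoch;
    }
}
```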
[jira] [Resolved] (KAFKA-14612) Topic config records written to log even when topic creation fails
[ https://issues.apache.org/jira/browse/KAFKA-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14612. - Fix Version/s: 3.4.0 Resolution: Fixed > Topic config records written to log even when topic creation fails > -- > > Key: KAFKA-14612 > URL: https://issues.apache.org/jira/browse/KAFKA-14612 > Project: Kafka > Issue Type: Bug > Components: kraft >Reporter: Jason Gustafson >Assignee: Andrew Grant >Priority: Major > Fix For: 3.4.0 > > > Config records are added when handling a `CreateTopics` request here: > [https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java#L549.] > If the subsequent validations fail and the topic is not created, these > records will still be written to the log. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14618) Off by one error in generated snapshot IDs causes misaligned fetching
Jason Gustafson created KAFKA-14618: --- Summary: Off by one error in generated snapshot IDs causes misaligned fetching Key: KAFKA-14618 URL: https://issues.apache.org/jira/browse/KAFKA-14618 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson Assignee: Jason Gustafson Fix For: 3.4.0 We implemented new snapshot generation logic here: [https://github.com/apache/kafka/pull/12983.] A few days prior to this patch getting merged, we had changed the `RaftClient` API to pass the _exclusive_ offset when generating snapshots instead of the inclusive offset: [https://github.com/apache/kafka/pull/12981.] Unfortunately, the new snapshot generation logic was not updated accordingly. The consequence of this is that the state on replicas can get out of sync. In the best case, the followers fail replication because the offset after loading a snapshot is no longer aligned on a batch boundary. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14612) Topic config records written to log even when topic creation fails
Jason Gustafson created KAFKA-14612: --- Summary: Topic config records written to log even when topic creation fails Key: KAFKA-14612 URL: https://issues.apache.org/jira/browse/KAFKA-14612 Project: Kafka Issue Type: Bug Components: kraft Reporter: Jason Gustafson Config records are added when handling a `CreateTopics` request here: [https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java#L549.] If the subsequent validations fail and the topic is not created, these records will still be written to the log. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14417) Producer doesn't handle REQUEST_TIMED_OUT for InitProducerIdRequest, treats as fatal error
[ https://issues.apache.org/jira/browse/KAFKA-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14417. - Fix Version/s: 4.0.0 3.3.2 Resolution: Fixed > Producer doesn't handle REQUEST_TIMED_OUT for InitProducerIdRequest, treats > as fatal error > -- > > Key: KAFKA-14417 > URL: https://issues.apache.org/jira/browse/KAFKA-14417 > Project: Kafka > Issue Type: Task >Affects Versions: 3.1.0, 3.0.0, 3.2.0, 3.3.0, 3.4.0 >Reporter: Justine Olshan >Assignee: Justine Olshan >Priority: Blocker > Fix For: 4.0.0, 3.3.2 > > > In TransactionManager we have a handler for InitProducerIdRequests > [https://github.com/apache/kafka/blob/19286449ee20f85cc81860e13df14467d4ce287c/clients/src/main/java/org/apache/kafka/clients/producer/internals/TransactionManager.java#LL1276C14-L1276C14] > However, we have the potential to return a REQUEST_TIMED_OUT error in > RPCProducerIdManager when the BrokerToControllerChannel manager times out: > [https://github.com/apache/kafka/blob/19286449ee20f85cc81860e13df14467d4ce287c/core/src/main/scala/kafka/coordinator/transaction/ProducerIdManager.scala#L236] > > or when the poll returns null: > [https://github.com/apache/kafka/blob/19286449ee20f85cc81860e13df14467d4ce287c/core/src/main/scala/kafka/coordinator/transaction/ProducerIdManager.scala#L170] > Since REQUEST_TIMED_OUT is not handled by the producer, we treat it as a > fatal error and the producer fails. With idempotent producers enabled by > default, this can cause more issues. > See this stack trace from 3.0: > {code:java} > ERROR [Producer clientId=console-producer] Aborting producer batches due to > fatal error (org.apache.kafka.clients.producer.internals.Sender) > org.apache.kafka.common.KafkaException: Unexpected error in > InitProducerIdResponse; The request timed out. 
> at > org.apache.kafka.clients.producer.internals.TransactionManager$InitProducerIdHandler.handleResponse(TransactionManager.java:1390) > at > org.apache.kafka.clients.producer.internals.TransactionManager$TxnRequestHandler.onComplete(TransactionManager.java:1294) > at > org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109) > at > org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:658) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:650) > at > org.apache.kafka.clients.producer.internals.Sender.maybeSendAndPollTransactionalRequest(Sender.java:418) > at > org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:316) > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:256) > at java.lang.Thread.run(Thread.java:748) > {code} > Seems like the commit that introduced the changes was this one: > [https://github.com/apache/kafka/commit/72d108274c98dca44514007254552481c731c958] > so we are vulnerable when the server code is ibp 3.0 and beyond. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14439) Specify returned errors for various APIs and versions
[ https://issues.apache.org/jira/browse/KAFKA-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642726#comment-17642726 ] Jason Gustafson edited comment on KAFKA-14439 at 12/2/22 11:52 PM: --- Yeah, we've had so many compatibility breaks due to error code usage. Putting the errors into the spec would also enable better enforcement. One option could be something like this: {code:java} { "name": "ErrorCode", "type": "enum16", "versions": "0+", "about": "The error code,", "values": [ {"value": 0, "versions": "0+", "about": "The operation completed successfully"}, {"value": 1, "versions": "1+", "about": "The requested offset was out of range"}, {"value": 4, "versions": "1+", "about": "Invalid fetch size"}, ] } {code} Here "enum16" indicates a 2-byte enumeration where the values are provided in the `values` field. New values can only be added with a version bump. was (Author: hachikuji): Yeah, we've had so many compatibility breaks due to error code usage. Putting the errors into the spec would also enable better enforcement. One option could be something like this: {code:java} { "name": "ErrorCode", "type": "enum16", "versions": "0+", "about": "The error code,", "values": [ {"value": 0, "versions": "0+", "about": "The operation completed successfully"}, {"value": 1, "versions": "1+", "about": "The requested offset was out of range"}, ] } {code} Here "enum16" indicates a 2-byte enumeration where the values are provided in the `values` field. New values can only be added with a version bump. > Specify returned errors for various APIs and versions > - > > Key: KAFKA-14439 > URL: https://issues.apache.org/jira/browse/KAFKA-14439 > Project: Kafka > Issue Type: Task >Reporter: Justine Olshan >Priority: Major > > Kafka is known for supporting various clients and being compatible across > different versions. But one thing that is a bit unclear is what errors each > response can send. 
> Knowing what errors can come from each version helps those who implement > clients have a more defined spec for what errors they need to handle. When > new errors are added, it is clearer to the clients that changes need to be > made. > It also helps contributors get a better understanding about how clients are > expected to react and potentially find and prevent gaps like the one found in > https://issues.apache.org/jira/browse/KAFKA-14417 > I briefly synced offline with [~hachikuji] about this and he suggested maybe > adding values for the error codes in the schema definitions of APIs that > specify the error codes and what versions they are returned on. One idea was > creating some enum type to accomplish this. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14438) Stop supporting empty consumer groupId
[ https://issues.apache.org/jira/browse/KAFKA-14438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14438: Priority: Blocker (was: Major) > Stop supporting empty consumer groupId > -- > > Key: KAFKA-14438 > URL: https://issues.apache.org/jira/browse/KAFKA-14438 > Project: Kafka > Issue Type: Task > Components: consumer >Reporter: Philip Nee >Priority: Blocker > Fix For: 4.0.0 > > > Currently, a warning message is logged upon using an empty consumer groupId. > In the next major release, we should drop the support of empty ("") consumer > groupId. > > cc [~hachikuji] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14439) Specify returned errors for various APIs and versions
[ https://issues.apache.org/jira/browse/KAFKA-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642726#comment-17642726 ] Jason Gustafson edited comment on KAFKA-14439 at 12/2/22 11:44 PM: --- Yeah, we've had so many compatibility breaks due to error code usage. Putting the errors into the spec would also enable better enforcement. One option could be something like this: {code:java} { "name": "ErrorCode", "type": "enum16", "versions": "0+", "about": "The error code,", "values": [ {"value": 0, "versions": "0+", "about": "The operation completed successfully"}, {"value": 1, "versions": "1+", "about": "The requested offset was out of range"}, ] } {code} Here "enum16" indicates a 2-byte enumeration where the values are provided in the `values` field. New values can only be added with a version bump. was (Author: hachikuji): Yeah, we've had so many compatibility breaks due to error code usage. Putting the errors into the spec would also enable better enforcement. One option could be something like this: {code:java} { "name": "ErrorCode", "type": "enum16", "versions": "0+", "about": "The error code,", "values": [ {"value": 0, "versions": "0+", "about": "The operation completed successfully"}, {"value": 1, "versions": "1+", about": "The requested offset was out of range"}, ] } {code} Here "enum16" indicates a 2-byte enumeration where the values are provided in the `values` field. New values can only be added with a version bump. > Specify returned errors for various APIs and versions > - > > Key: KAFKA-14439 > URL: https://issues.apache.org/jira/browse/KAFKA-14439 > Project: Kafka > Issue Type: Task >Reporter: Justine Olshan >Priority: Major > > Kafka is known for supporting various clients and being compatible across > different versions. But one thing that is a bit unclear is what errors each > response can send. 
> Knowing what errors can come from each version helps those who implement > clients have a more defined spec for what errors they need to handle. When > new errors are added, it is clearer to the clients that changes need to be > made. > It also helps contributors get a better understanding about how clients are > expected to react and potentially find and prevent gaps like the one found in > https://issues.apache.org/jira/browse/KAFKA-14417 > I briefly synced offline with [~hachikuji] about this and he suggested maybe > adding values for the error codes in the schema definitions of APIs that > specify the error codes and what versions they are returned on. One idea was > creating some enum type to accomplish this. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14439) Specify returned errors for various APIs and versions
[ https://issues.apache.org/jira/browse/KAFKA-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642726#comment-17642726 ] Jason Gustafson commented on KAFKA-14439: - Yeah, we've had so many compatibility breaks due to error code usage. Putting the errors into the spec would also enable better enforcement. One option could be something like this: {code:java} { "name": "ErrorCode", "type": "enum16", "versions": "0+", "about": "The error code,", "values": [ {"value": 0, "versions": "0+", "about": "The operation completed successfully"}, {"value": 1, "versions": "1+", about": "The requested offset was out of range"}, ] } {code} Here "enum16" indicates a 2-byte enumeration where the values are provided in the `values` field. New values can only be added with a version bump. > Specify returned errors for various APIs and versions > - > > Key: KAFKA-14439 > URL: https://issues.apache.org/jira/browse/KAFKA-14439 > Project: Kafka > Issue Type: Task >Reporter: Justine Olshan >Priority: Major > > Kafka is known for supporting various clients and being compatible across > different versions. But one thing that is a bit unclear is what errors each > response can send. > Knowing what errors can come from each version helps those who implement > clients have a more defined spec for what errors they need to handle. When > new errors are added, it is clearer to the clients that changes need to be > made. > It also helps contributors get a better understanding about how clients are > expected to react and potentially find and prevent gaps like the one found in > https://issues.apache.org/jira/browse/KAFKA-14417 > I briefly synced offline with [~hachikuji] about this and he suggested maybe > adding values for the error codes in the schema definitions of APIs that > specify the error codes and what versions they are returned on. One idea was > creating some enum type to accomplish this. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14379) consumer should refresh preferred read replica on update metadata
[ https://issues.apache.org/jira/browse/KAFKA-14379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14379: Priority: Critical (was: Major) > consumer should refresh preferred read replica on update metadata > - > > Key: KAFKA-14379 > URL: https://issues.apache.org/jira/browse/KAFKA-14379 > Project: Kafka > Issue Type: Bug >Reporter: Jeff Kim >Assignee: Jeff Kim >Priority: Critical > Fix For: 3.4.0 > > > The consumer (fetcher) refreshes the preferred read replica only on three > conditions: > # the consumer receives an OFFSET_OUT_OF_RANGE error > # the follower does not exist in the client's metadata (i.e., offline) > # after metadata.max.age.ms (5 min default) > For other errors, it will continue to reach out to the possibly unavailable > follower and only after 5 minutes will it refresh the preferred read replica > and go back to the leader. > A specific example is when a partition is reassigned. the consumer will get > NOT_LEADER_OR_FOLLOWER which triggers a metadata update but the preferred > read replica will not be refreshed as the follower is still online. it will > continue to reach out to the old follower until the preferred read replica > expires. > the consumer can instead refresh its preferred read replica whenever it makes > a metadata update request. so when the consumer receives e.g. > NOT_LEADER_OR_FOLLOWER it can find the new preferred read replica without > waiting for the expiration. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14379) consumer should refresh preferred read replica on update metadata
[ https://issues.apache.org/jira/browse/KAFKA-14379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14379: Fix Version/s: 3.4.0 > consumer should refresh preferred read replica on update metadata > - > > Key: KAFKA-14379 > URL: https://issues.apache.org/jira/browse/KAFKA-14379 > Project: Kafka > Issue Type: Bug >Reporter: Jeff Kim >Assignee: Jeff Kim >Priority: Major > Fix For: 3.4.0 > > > The consumer (fetcher) refreshes the preferred read replica only on three > conditions: > # the consumer receives an OFFSET_OUT_OF_RANGE error > # the follower does not exist in the client's metadata (i.e., offline) > # after metadata.max.age.ms (5 min default) > For other errors, it will continue to reach out to the possibly unavailable > follower and only after 5 minutes will it refresh the preferred read replica > and go back to the leader. > A specific example is when a partition is reassigned. the consumer will get > NOT_LEADER_OR_FOLLOWER which triggers a metadata update but the preferred > read replica will not be refreshed as the follower is still online. it will > continue to reach out to the old follower until the preferred read replica > expires. > the consumer can instead refresh its preferred read replica whenever it makes > a metadata update request. so when the consumer receives e.g. > NOT_LEADER_OR_FOLLOWER it can find the new preferred read replica without > waiting for the expiration. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
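The proposal in KAFKA-14379 can be sketched as invalidating the cached preferred read replica on every metadata update, so the next fetch re-selects a target instead of waiting out metadata.max.age.ms. The class below is a hypothetical simplification, not the fetcher's actual state machine.

```java
import java.util.Optional;

// Hypothetical sketch of preferred-read-replica caching in the fetcher.
class PreferredReplicaState {
    private Optional<Integer> preferredReadReplica = Optional.empty();

    // The leader's FetchResponse can designate a preferred read replica.
    public void onFetchResponse(int preferredReplicaId) {
        preferredReadReplica = Optional.of(preferredReplicaId);
    }

    // Proposed: clear the cache on every metadata update (e.g. triggered by
    // NOT_LEADER_OR_FOLLOWER), not only on OFFSET_OUT_OF_RANGE, an offline
    // follower, or metadata.max.age.ms expiry.
    public void onMetadataUpdate() {
        preferredReadReplica = Optional.empty();
    }

    // Fetch from the preferred replica if one is cached, else the leader.
    public int fetchTarget(int leaderId) {
        return preferredReadReplica.orElse(leaderId);
    }
}
```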
[jira] [Updated] (KAFKA-14397) Idempotent producer may bump epoch and reset sequence numbers prematurely
[ https://issues.apache.org/jira/browse/KAFKA-14397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14397: Description: Suppose that idempotence is enabled in the producer and we send the following single-record batches to a partition leader: * A: epoch=0, seq=0 * B: epoch=0, seq=1 * C: epoch=0, seq=2 The partition leader receives all 3 of these batches and commits them to the log. However, the connection is lost before the `Produce` responses are received by the client. Subsequent retries by the producer all fail to be delivered. It is possible in this scenario for the first batch `A` to reach the delivery timeout before the subsequent batches. This triggers the following check: [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java#L642]. Depending on whether retries are exhausted, we may adjust sequence numbers. The intuition behind this check is that if retries have not been exhausted, then we saw a fatal error and the batch could not have been written to the log. Hence we should bump the epoch and adjust the sequence numbers of the pending batches since they are presumed to be doomed to failure. So in this case, batches B and C might get reset with the bumped epoch: * B: epoch=1, seq=0 * C: epoch=1, seq=1 If the producer is able to reach the partition leader before these batches are expired locally, then they may get written and committed to the log. This can result in duplicates. The root of the issue is that this logic does not account for expiration of the delivery timeout. When the delivery timeout is reached, the number of retries is still likely much lower than the max allowed number of retries (which is `Int.MaxValue` by default). 
was: Suppose that idempotence is enabled in the producer and we send the following single-record batches to a partition leader: * A: epoch=0, seq=0 * B: epoch=0, seq=1 * C: epoch=0, seq=2 The partition leader receives all 3 of these batches and commits them to the log. However, the connection is lost before the `Produce` responses are received by the client. Subsequent retries by the producer all fail to be delivered. It is possible in this scenario for the first batch `A` to reach the delivery timeout before the subsequence batches. This triggers the following check: [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java#L642.] Depending whether retries are exhausted, we may adjust sequence numbers. The intuition behind this check is that if retries have not been exhausted, then we saw a fatal error and the batch could not have been written to the log. Hence we should bump the epoch and adjust the sequence numbers of the pending batches since they are presumed to be doomed to failure. So in this case, batches B and C might get reset with the bumped epoch: * B: epoch=1, seq=0 * C: epoch=1, seq=1 This can result in duplicate records in the log. The root of the issue is that this logic does not account for expiration of the delivery timeout. When the delivery timeout is reached, the number of retries is still likely much lower than the max allowed number of retries (which is `Int.MaxValue` by default). 
> Idempotent producer may bump epoch and reset sequence numbers prematurely > - > > Key: KAFKA-14397 > URL: https://issues.apache.org/jira/browse/KAFKA-14397 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > > Suppose that idempotence is enabled in the producer and we send the following > single-record batches to a partition leader: > * A: epoch=0, seq=0 > * B: epoch=0, seq=1 > * C: epoch=0, seq=2 > The partition leader receives all 3 of these batches and commits them to the > log. However, the connection is lost before the `Produce` responses are > received by the client. Subsequent retries by the producer all fail to be > delivered. > It is possible in this scenario for the first batch `A` to reach the delivery > timeout before the subsequent batches. This triggers the following check: > [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java#L642]. > Depending on whether retries are exhausted, we may adjust sequence numbers. > The intuition behind this check is that if retries have not been exhausted, > then we saw a fatal error and the batch could not have been written to the > log. Hence we should bump the epoch and adjust the sequence numbers of the > pending batches since they are presumed to be doomed to failure. So in this > case, batches B and C might get reset with the bumped epoch: >
[jira] [Created] (KAFKA-14397) Idempotent producer may bump epoch and reset sequence numbers prematurely
Jason Gustafson created KAFKA-14397: --- Summary: Idempotent producer may bump epoch and reset sequence numbers prematurely Key: KAFKA-14397 URL: https://issues.apache.org/jira/browse/KAFKA-14397 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson Assignee: Jason Gustafson Suppose that idempotence is enabled in the producer and we send the following single-record batches to a partition leader: * A: epoch=0, seq=0 * B: epoch=0, seq=1 * C: epoch=0, seq=2 The partition leader receives all 3 of these batches and commits them to the log. However, the connection is lost before the `Produce` responses are received by the client. Subsequent retries by the producer all fail to be delivered. It is possible in this scenario for the first batch `A` to reach the delivery timeout before the subsequent batches. This triggers the following check: [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java#L642]. Depending on whether retries are exhausted, we may adjust sequence numbers. The intuition behind this check is that if retries have not been exhausted, then we saw a fatal error and the batch could not have been written to the log. Hence we should bump the epoch and adjust the sequence numbers of the pending batches since they are presumed to be doomed to failure. So in this case, batches B and C might get reset with the bumped epoch: * B: epoch=1, seq=0 * C: epoch=1, seq=1 This can result in duplicate records in the log. The root of the issue is that this logic does not account for expiration of the delivery timeout. When the delivery timeout is reached, the number of retries is still likely much lower than the max allowed number of retries (which is `Int.MaxValue` by default). -- This message was sent by Atlassian Jira (v8.20.10#820010)
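The flawed heuristic described in this report can be made concrete with a small sketch. This is not the real `Sender` code — the method names and the `deliveryTimedOut` flag are illustrative — but it shows why "retries not exhausted" is a poor proxy for "the batch was never written" when `retries` defaults to `Integer.MAX_VALUE`:

```java
// Sketch of the premature-reset heuristic. With the default
// retries = Integer.MAX_VALUE, an expired batch virtually never has its
// retries exhausted, so the producer wrongly concludes the batch could not
// have been written and resets the sequence numbers of in-flight batches.
// Names are illustrative; this is not the actual Sender implementation.
public class EpochBumpHeuristic {

    // Current logic (simplified): "retries not exhausted" is taken to mean
    // "the batch never reached the log" — false for delivery timeouts.
    static boolean currentShouldReset(int attempts, int maxRetries, boolean deliveryTimedOut) {
        return attempts < maxRetries; // the timeout case slips through here
    }

    // A possible refinement: a delivery timeout is not evidence the batch
    // was never written, so it must not trigger a sequence reset on its own.
    static boolean fixedShouldReset(int attempts, int maxRetries, boolean deliveryTimedOut) {
        return attempts < maxRetries && !deliveryTimedOut;
    }

    public static void main(String[] args) {
        int maxRetries = Integer.MAX_VALUE; // producer default for retries
        // Batch A expired after a handful of attempts with no response:
        boolean current = currentShouldReset(3, maxRetries, true);
        boolean fixed = fixedShouldReset(3, maxRetries, true);
        System.out.println("current resets: " + current + ", fixed resets: " + fixed);
        // current resets: true, fixed resets: false
    }
}
```

Under the current logic, batches B and C get epoch 1 with sequences 0 and 1 even though the originals may already be committed, which is exactly the duplicate scenario above.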
[jira] [Resolved] (KAFKA-13964) kafka-configs.sh end with UnsupportedVersionException when describing TLS user with quotas
[ https://issues.apache.org/jira/browse/KAFKA-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-13964. - Resolution: Duplicate Thanks for reporting the issue. This will be resolved by https://issues.apache.org/jira/browse/KAFKA-14084. > kafka-configs.sh end with UnsupportedVersionException when describing TLS > user with quotas > --- > > Key: KAFKA-13964 > URL: https://issues.apache.org/jira/browse/KAFKA-13964 > Project: Kafka > Issue Type: Bug > Components: admin, kraft >Affects Versions: 3.2.0 > Environment: Kafka 3.2.0 running on OpenShift 4.10 in KRaft mode > managed by Strimzi >Reporter: Jakub Stejskal >Priority: Minor > > Usage of kafka-configs.sh ends with > org.apache.kafka.common.errors.UnsupportedVersionException: > The broker does not support DESCRIBE_USER_SCRAM_CREDENTIALS when describing > a TLS user with quotas enabled. > > {code:java} > bin/kafka-configs.sh --bootstrap-server localhost:9092 --describe --user > CN=encrypted-arnost got status code 1 and stderr: -- Error while > executing config command with args '--bootstrap-server localhost:9092 > --describe --user CN=encrypted-arnost' > java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.UnsupportedVersionException: The broker does > not support DESCRIBE_USER_SCRAM_CREDENTIALS{code} > STDOUT contains all necessary data, but the script itself ends with return > code 1 and the error above. Scram-sha has not been configured anywhere in > that case (not supported by KRaft). This might be fixed by adding support for > scram-sha in the next version (not reproducible without KRaft enabled). > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14316) NoSuchElementException in feature control iterator
[ https://issues.apache.org/jira/browse/KAFKA-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14316. - Fix Version/s: 3.4.0 3.3.2 Resolution: Fixed > NoSuchElementException in feature control iterator > -- > > Key: KAFKA-14316 > URL: https://issues.apache.org/jira/browse/KAFKA-14316 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 3.4.0, 3.3.2 > > > We noticed this exception during testing: > {code:java} > java.util.NoSuchElementException > at > org.apache.kafka.timeline.SnapshottableHashTable$HistoricalIterator.next(SnapshottableHashTable.java:276) > at > org.apache.kafka.timeline.SnapshottableHashTable$HistoricalIterator.next(SnapshottableHashTable.java:189) > at > org.apache.kafka.timeline.TimelineHashMap$EntryIterator.next(TimelineHashMap.java:360) > at > org.apache.kafka.timeline.TimelineHashMap$EntryIterator.next(TimelineHashMap.java:346) > at > org.apache.kafka.controller.FeatureControlManager$FeatureControlIterator.next(FeatureControlManager.java:375) > at > org.apache.kafka.controller.FeatureControlManager$FeatureControlIterator.next(FeatureControlManager.java:347) > at > org.apache.kafka.controller.SnapshotGenerator.generateBatch(SnapshotGenerator.java:109) > at > org.apache.kafka.controller.SnapshotGenerator.generateBatches(SnapshotGenerator.java:126) > at > org.apache.kafka.controller.QuorumController$SnapshotGeneratorManager.run(QuorumController.java:637) > at > org.apache.kafka.controller.QuorumController$ControlEvent.run(QuorumController.java:539) > at > org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121) > at > org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200) > at > org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173) > at java.base/java.lang.Thread.run(Thread.java:833) > 
at org.apache.kafka.common.utils.KafkaThread.run(KafkaThread.java:64) > {code} > The iterator `FeatureControlIterator.hasNext()` checks two conditions: 1) > whether we have already written the metadata version, and 2) whether the > underlying iterator has additional records. However, in `next()`, we also > check that the metadata version is at least high enough to include it in the > log. When this check fails, we can see an unexpected `NoSuchElementException` > if the underlying iterator is empty. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14319) Storage tool format command does not work with old metadata versions
Jason Gustafson created KAFKA-14319: --- Summary: Storage tool format command does not work with old metadata versions Key: KAFKA-14319 URL: https://issues.apache.org/jira/browse/KAFKA-14319 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson When using the format tool with older metadata versions, we see the following error: {code:java} $ bin/kafka-storage.sh format --cluster-id vVlhxM7VT3C-3nz7yEkiCQ --config config/kraft/server.properties --release-version "3.1-IV0" Exception in thread "main" java.lang.RuntimeException: Bootstrap metadata versions before 3.3-IV0 are not supported. Can't load metadata from format command at org.apache.kafka.metadata.bootstrap.BootstrapMetadata.(BootstrapMetadata.java:83) at org.apache.kafka.metadata.bootstrap.BootstrapMetadata.fromVersion(BootstrapMetadata.java:48) at kafka.tools.StorageTool$.$anonfun$formatCommand$2(StorageTool.scala:265) at kafka.tools.StorageTool$.$anonfun$formatCommand$2$adapted(StorageTool.scala:254) at scala.collection.immutable.List.foreach(List.scala:333) at kafka.tools.StorageTool$.formatCommand(StorageTool.scala:254) at kafka.tools.StorageTool$.main(StorageTool.scala:61) at kafka.tools.StorageTool.main(StorageTool.scala) {code} For versions prior to `3.3-IV0`, we should skip creation of the `bootstrap.checkpoint` file instead of failing. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14316) NoSuchElementException in feature control iterator
Jason Gustafson created KAFKA-14316: --- Summary: NoSuchElementException in feature control iterator Key: KAFKA-14316 URL: https://issues.apache.org/jira/browse/KAFKA-14316 Project: Kafka Issue Type: Bug Affects Versions: 3.3.0 Reporter: Jason Gustafson Assignee: Jason Gustafson We noticed this exception during testing: {code:java} java.util.NoSuchElementException at org.apache.kafka.timeline.SnapshottableHashTable$HistoricalIterator.next(SnapshottableHashTable.java:276) at org.apache.kafka.timeline.SnapshottableHashTable$HistoricalIterator.next(SnapshottableHashTable.java:189) at org.apache.kafka.timeline.TimelineHashMap$EntryIterator.next(TimelineHashMap.java:360) at org.apache.kafka.timeline.TimelineHashMap$EntryIterator.next(TimelineHashMap.java:346) at org.apache.kafka.controller.FeatureControlManager$FeatureControlIterator.next(FeatureControlManager.java:375) at org.apache.kafka.controller.FeatureControlManager$FeatureControlIterator.next(FeatureControlManager.java:347) at org.apache.kafka.controller.SnapshotGenerator.generateBatch(SnapshotGenerator.java:109) at org.apache.kafka.controller.SnapshotGenerator.generateBatches(SnapshotGenerator.java:126) at org.apache.kafka.controller.QuorumController$SnapshotGeneratorManager.run(QuorumController.java:637) at org.apache.kafka.controller.QuorumController$ControlEvent.run(QuorumController.java:539) at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121) at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200) at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173) at java.base/java.lang.Thread.run(Thread.java:833) at org.apache.kafka.common.utils.KafkaThread.run(KafkaThread.java:64) {code} The iterator `FeatureControlIterator.hasNext()` checks two conditions: 1) whether we have already written the metadata version, and 2) whether the underlying iterator has additional records. 
However, in `next()`, we also check that the metadata version is at least high enough to include it in the log. When this check fails, we can see an unexpected `NoSuchElementException` if the underlying iterator is empty. -- This message was sent by Atlassian Jira (v8.20.10#820010)
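The root cause is the classic iterator contract violation: `hasNext()` and `next()` apply different conditions, so `hasNext()` can return true for an element `next()` will refuse to emit. A standard remedy is a look-ahead iterator, where `hasNext()` computes and caches exactly the element `next()` will return, so the two can never disagree. The types below are a simplified, hypothetical illustration of that pattern, not the real `FeatureControlIterator`:

```java
// Look-ahead iterator: hasNext() pulls from the underlying iterator through
// the same filter next() uses, caching the result, so the two methods agree.
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

public class LookaheadIterator<T> implements Iterator<T> {
    private final Iterator<T> underlying;
    private final Predicate<T> include;
    private T next; // cached element; null means "not yet computed"

    public LookaheadIterator(Iterator<T> underlying, Predicate<T> include) {
        this.underlying = underlying;
        this.include = include;
    }

    @Override
    public boolean hasNext() {
        // Advance through excluded elements until we cache an included one.
        while (next == null && underlying.hasNext()) {
            T candidate = underlying.next();
            if (include.test(candidate)) next = candidate;
        }
        return next != null;
    }

    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        T result = next;
        next = null;
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for "only emit records the metadata version supports":
        Iterator<Integer> it = new LookaheadIterator<>(
            List.of(1, 2, 3, 4).iterator(), v -> v >= 3);
        while (it.hasNext()) System.out.println(it.next()); // 3 then 4
    }
}
```

With this shape, an empty or fully-filtered underlying iterator simply makes `hasNext()` return false, rather than letting `next()` throw after `hasNext()` said true.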
[jira] [Resolved] (KAFKA-14296) Partition leaders are not demoted during kraft controlled shutdown
[ https://issues.apache.org/jira/browse/KAFKA-14296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14296. - Fix Version/s: 3.4.0 3.3.2 Resolution: Fixed > Partition leaders are not demoted during kraft controlled shutdown > -- > > Key: KAFKA-14296 > URL: https://issues.apache.org/jira/browse/KAFKA-14296 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.3.1 >Reporter: David Jacot >Assignee: David Jacot >Priority: Major > Fix For: 3.4.0, 3.3.2 > > > When the BrokerServer starts its shutting down process, it transitions to > SHUTTING_DOWN and sets isShuttingDown to true. With this state change, the > follower state changes are short-cutted. This means that a broker which was > serving as leader would remain acting as a leader until controlled shutdown > completes. Instead, we want the leader and ISR state to be updated so that > requests will return NOT_LEADER and the client can find the new leader. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14292) KRaft broker controlled shutdown can be delayed indefinitely
[ https://issues.apache.org/jira/browse/KAFKA-14292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14292. - Fix Version/s: 3.4.0 3.3.2 Resolution: Fixed > KRaft broker controlled shutdown can be delayed indefinitely > > > Key: KAFKA-14292 > URL: https://issues.apache.org/jira/browse/KAFKA-14292 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Alyssa Huang >Priority: Major > Fix For: 3.4.0, 3.3.2 > > > We noticed when rolling a kraft cluster that it took an unexpectedly long > time for one of the brokers to shut down. In the logs, we saw the following: > {code:java} > Oct 11, 2022 @ 17:53:38.277 [Controller 1] The request from broker 8 to > shut down can not yet be granted because the lowest active offset 2283357 is > not greater than the broker's shutdown offset 2283358. > org.apache.kafka.controller.BrokerHeartbeatManager DEBUG > Oct 11, 2022 @ 17:53:38.277 [Controller 1] Updated the controlled shutdown > offset for broker 8 to 2283362. > org.apache.kafka.controller.BrokerHeartbeatManager DEBUG > Oct 11, 2022 @ 17:53:40.278 [Controller 1] Updated the controlled shutdown > offset for broker 8 to 2283366. > org.apache.kafka.controller.BrokerHeartbeatManager DEBUG > Oct 11, 2022 @ 17:53:40.278 [Controller 1] The request from broker 8 to > shut down can not yet be granted because the lowest active offset 2283361 is > not greater than the broker's shutdown offset 2283362. > org.apache.kafka.controller.BrokerHeartbeatManager DEBUG > Oct 11, 2022 @ 17:53:42.279 [Controller 1] The request from broker 8 to > shut down can not yet be granted because the lowest active offset 2283365 is > not greater than the broker's shutdown offset 2283366. > org.apache.kafka.controller.BrokerHeartbeatManager DEBUG > Oct 11, 2022 @ 17:53:42.279 [Controller 1] Updated the controlled shutdown > offset for broker 8 to 2283370. 
> org.apache.kafka.controller.BrokerHeartbeatManager DEBUG > Oct 11, 2022 @ 17:53:44.280 [Controller 1] The request from broker 8 to > shut down can not yet be granted because the lowest active offset 2283369 is > not greater than the broker's shutdown offset 2283370. > org.apache.kafka.controller.BrokerHeartbeatManager DEBUG > Oct 11, 2022 @ 17:53:44.281 [Controller 1] Updated the controlled shutdown > offset for broker 8 to 2283374. > org.apache.kafka.controller.BrokerHeartbeatManager DEBUG{code} > From what I can tell, it looks like the controller waits until all brokers > have caught up to the {{controlledShutdownOffset}} of the broker that is > shutting down before allowing it to proceed. Probably the intent is to make > sure they have all the leader and ISR state. > The problem is that the {{controlledShutdownOffset}} seems to be updated > after every heartbeat that the controller receives: > https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/QuorumController.java#L1996. > Unless all other brokers can catch up to that offset before the next > heartbeat from the shutting down broker is received, the broker remains > in the shutting down state indefinitely. > In this case, it took more than 40 minutes before the broker completed > shutdown: > {code:java} > Oct 11, 2022 @ 18:36:36.105 [Controller 1] The request from broker 8 to > shut down has been granted since the lowest active offset 2288510 is now > greater than the broker's controlled shutdown offset 2288510. > org.apache.kafka.controller.BrokerHeartbeatManager INFO > Oct 11, 2022 @ 18:40:35.197 [Controller 1] The request from broker 8 to > unfence has been granted because it has caught up with the offset of its > register broker record 2288906. > org.apache.kafka.controller.BrokerHeartbeatManager INFO{code} > It seems like the bug here is that we should not keep updating > {{controlledShutdownOffset}} if it has already been set. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14292) KRaft broker controlled shutdown can be delayed indefinitely
Jason Gustafson created KAFKA-14292: --- Summary: KRaft broker controlled shutdown can be delayed indefinitely Key: KAFKA-14292 URL: https://issues.apache.org/jira/browse/KAFKA-14292 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson Assignee: Alyssa Huang We noticed when rolling a kraft cluster that it took an unexpectedly long time for one of the brokers to shut down. In the logs, we saw the following: {code:java} Oct 11, 2022 @ 17:53:38.277 [Controller 1] The request from broker 8 to shut down can not yet be granted because the lowest active offset 2283357 is not greater than the broker's shutdown offset 2283358. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG Oct 11, 2022 @ 17:53:38.277 [Controller 1] Updated the controlled shutdown offset for broker 8 to 2283362. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG Oct 11, 2022 @ 17:53:40.278 [Controller 1] Updated the controlled shutdown offset for broker 8 to 2283366. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG Oct 11, 2022 @ 17:53:40.278 [Controller 1] The request from broker 8 to shut down can not yet be granted because the lowest active offset 2283361 is not greater than the broker's shutdown offset 2283362. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG Oct 11, 2022 @ 17:53:42.279 [Controller 1] The request from broker 8 to shut down can not yet be granted because the lowest active offset 2283365 is not greater than the broker's shutdown offset 2283366. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG Oct 11, 2022 @ 17:53:42.279 [Controller 1] Updated the controlled shutdown offset for broker 8 to 2283370. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG Oct 11, 2022 @ 17:53:44.280 [Controller 1] The request from broker 8 to shut down can not yet be granted because the lowest active offset 2283369 is not greater than the broker's shutdown offset 2283370. 
org.apache.kafka.controller.BrokerHeartbeatManager DEBUG Oct 11, 2022 @ 17:53:44.281 [Controller 1] Updated the controlled shutdown offset for broker 8 to 2283374. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG{code} From what I can tell, it looks like the controller waits until all brokers have caught up to the {{controlledShutdownOffset}} of the broker that is shutting down before allowing it to proceed. Probably the intent is to make sure they have all the leader and ISR state. The problem is that the {{controlledShutdownOffset}} seems to be updated after every heartbeat that the controller receives: https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/QuorumController.java#L1996. Unless all other brokers can catch up to that offset before the next heartbeat from the shutting down broker is received, the broker remains in the shutting down state indefinitely. In this case, it took more than 40 minutes before the broker completed shutdown: {code:java} Oct 11, 2022 @ 18:36:36.105 [Controller 1] The request from broker 8 to shut down has been granted since the lowest active offset 2288510 is now greater than the broker's controlled shutdown offset 2288510. org.apache.kafka.controller.BrokerHeartbeatManager INFO Oct 11, 2022 @ 18:40:35.197 [Controller 1] The request from broker 8 to unfence has been granted because it has caught up with the offset of its register broker record 2288906. org.apache.kafka.controller.BrokerHeartbeatManager INFO{code} It seems like the bug here is that we should not keep updating {{controlledShutdownOffset}} if it has already been set. -- This message was sent by Atlassian Jira (v8.20.10#820010)
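The suggested fix at the end of the report — stop advancing `controlledShutdownOffset` once it is set — can be sketched with a small, hypothetical state class (the real logic lives in `QuorumController`/`BrokerHeartbeatManager`; names and signatures here are illustrative):

```java
// Sketch of the proposed fix: record the controlled-shutdown offset only the
// first time, rather than advancing it on every heartbeat, so other brokers
// can actually catch up to a fixed target. Hypothetical, simplified class.
import java.util.OptionalLong;

public class BrokerShutdownState {
    private OptionalLong controlledShutdownOffset = OptionalLong.empty();

    // Called on each heartbeat from the shutting-down broker.
    public void onHeartbeat(long currentEndOffset) {
        // Only pin the target once; later heartbeats must not move it.
        if (controlledShutdownOffset.isEmpty()) {
            controlledShutdownOffset = OptionalLong.of(currentEndOffset);
        }
    }

    // Shutdown is granted once every broker has replicated past the target.
    public boolean mayShutDown(long lowestActiveOffset) {
        return controlledShutdownOffset.isPresent()
            && lowestActiveOffset > controlledShutdownOffset.getAsLong();
    }

    public static void main(String[] args) {
        BrokerShutdownState state = new BrokerShutdownState();
        state.onHeartbeat(2283358L); // first heartbeat pins the target
        state.onHeartbeat(2283362L); // ignored: the target no longer advances
        System.out.println(state.mayShutDown(2283361L)); // true
    }
}
```

With the original behavior, the second heartbeat would have moved the target to 2283362 and the check against 2283361 would fail again, reproducing the chase seen in the log excerpt.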
[jira] [Resolved] (KAFKA-14247) Implement EventHandler interface and DefaultEventHandler
[ https://issues.apache.org/jira/browse/KAFKA-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14247. - Resolution: Fixed > Implement EventHandler interface and DefaultEventHandler > > > Key: KAFKA-14247 > URL: https://issues.apache.org/jira/browse/KAFKA-14247 > Project: Kafka > Issue Type: Sub-task > Components: consumer >Reporter: Philip Nee >Assignee: Philip Nee >Priority: Major > > The polling thread uses events to communicate with the background thread. > The events send to the background thread are the {_}Requests{_}, and the > events send from the background thread to the polling thread are the > {_}Responses{_}. > > Here we have an EventHandler interface and DefaultEventHandler > implementation. The implementation uses two blocking queues to send events > both ways. The two methods, add and poll allows the client, i.e., the > polling thread, to retrieve and add events to the handler. > > PR: https://github.com/apache/kafka/pull/12663 -- This message was sent by Atlassian Jira (v8.20.10#820010)
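The two-queue design described above can be sketched minimally: the polling thread adds request events and polls for responses, while the background thread drains requests and publishes responses. The class and event types below are illustrative stand-ins, not the actual `DefaultEventHandler` implementation:

```java
// Minimal sketch of an event handler backed by two blocking queues, one per
// direction between the polling thread and the background thread.
// Hypothetical types; the real DefaultEventHandler differs.
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class SimpleEventHandler {
    private final BlockingQueue<String> requests = new LinkedBlockingQueue<>();
    private final BlockingQueue<String> responses = new LinkedBlockingQueue<>();

    // Polling-thread side: enqueue a request for the background thread.
    public void add(String request) {
        requests.add(request);
    }

    // Polling-thread side: non-blocking check for a response.
    public String poll() {
        return responses.poll(); // null if nothing is ready yet
    }

    // Background-thread side: drain one request and publish a response.
    public void processOne() throws InterruptedException {
        String request = requests.take();
        responses.add("handled:" + request);
    }

    public static void main(String[] args) throws InterruptedException {
        SimpleEventHandler handler = new SimpleEventHandler();
        handler.add("commit-offsets");
        handler.processOne(); // in reality this runs on the background thread
        System.out.println(handler.poll()); // handled:commit-offsets
    }
}
```

Blocking queues give thread-safe hand-off in both directions without explicit locks, which is the main appeal of the design described in the ticket.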
[jira] [Updated] (KAFKA-10140) Incremental config api excludes plugin config changes
[ https://issues.apache.org/jira/browse/KAFKA-10140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-10140: Description: I was trying to alter the jmx metric filters using the incremental alter config api and hit this error: {code:java} java.util.NoSuchElementException: key not found: metrics.jmx.blacklist at scala.collection.MapLike.default(MapLike.scala:235) at scala.collection.MapLike.default$(MapLike.scala:234) at scala.collection.AbstractMap.default(Map.scala:65) at scala.collection.MapLike.apply(MapLike.scala:144) at scala.collection.MapLike.apply$(MapLike.scala:143) at scala.collection.AbstractMap.apply(Map.scala:65) at kafka.server.AdminManager.listType$1(AdminManager.scala:681) at kafka.server.AdminManager.$anonfun$prepareIncrementalConfigs$1(AdminManager.scala:693) at kafka.server.AdminManager.prepareIncrementalConfigs(AdminManager.scala:687) at kafka.server.AdminManager.$anonfun$incrementalAlterConfigs$1(AdminManager.scala:618) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:273) at scala.collection.immutable.Map$Map1.foreach(Map.scala:154) at scala.collection.TraversableLike.map(TraversableLike.scala:273) at scala.collection.TraversableLike.map$(TraversableLike.scala:266) at scala.collection.AbstractTraversable.map(Traversable.scala:108) at kafka.server.AdminManager.incrementalAlterConfigs(AdminManager.scala:589) at kafka.server.KafkaApis.handleIncrementalAlterConfigsRequest(KafkaApis.scala:2698) at kafka.server.KafkaApis.handle(KafkaApis.scala:188) at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:78) at java.base/java.lang.Thread.run(Thread.java:834) {code} It looks like we are only allowing changes to the keys defined in `KafkaConfig` through this API. This excludes config changes to any plugin components such as `JmxReporter`. Note that I was able to use the regular `alterConfig` API to change this config. 
was: I was trying to alter the jmx metric filters using the incremental alter config api and hit this error: ``` java.util.NoSuchElementException: key not found: metrics.jmx.blacklist at scala.collection.MapLike.default(MapLike.scala:235) at scala.collection.MapLike.default$(MapLike.scala:234) at scala.collection.AbstractMap.default(Map.scala:65) at scala.collection.MapLike.apply(MapLike.scala:144) at scala.collection.MapLike.apply$(MapLike.scala:143) at scala.collection.AbstractMap.apply(Map.scala:65) at kafka.server.AdminManager.listType$1(AdminManager.scala:681) at kafka.server.AdminManager.$anonfun$prepareIncrementalConfigs$1(AdminManager.scala:693) at kafka.server.AdminManager.prepareIncrementalConfigs(AdminManager.scala:687) at kafka.server.AdminManager.$anonfun$incrementalAlterConfigs$1(AdminManager.scala:618) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:273) at scala.collection.immutable.Map$Map1.foreach(Map.scala:154) at scala.collection.TraversableLike.map(TraversableLike.scala:273) at scala.collection.TraversableLike.map$(TraversableLike.scala:266) at scala.collection.AbstractTraversable.map(Traversable.scala:108) at kafka.server.AdminManager.incrementalAlterConfigs(AdminManager.scala:589) at kafka.server.KafkaApis.handleIncrementalAlterConfigsRequest(KafkaApis.scala:2698) at kafka.server.KafkaApis.handle(KafkaApis.scala:188) at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:78) at java.base/java.lang.Thread.run(Thread.java:834) ``` It looks like we are only allowing changes to the keys defined in `KafkaConfig` through this API. This excludes config changes to any plugin components such as `JmxReporter`. Note that I was able to use the regular `alterConfig` API to change this config. 
> Incremental config api excludes plugin config changes > - > > Key: KAFKA-10140 > URL: https://issues.apache.org/jira/browse/KAFKA-10140 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Critical > > I was trying to alter the jmx metric filters using the incremental alter > config api and hit this error: > {code:java} > java.util.NoSuchElementException: key not found: metrics.jmx.blacklist > at scala.collection.MapLike.default(MapLike.scala:235) > at scala.collection.MapLike.default$(MapLike.scala:234) > at scala.collection.AbstractMap.default(Map.scala:65) > at scala.collection.MapLike.apply(MapLike.scala:144) > at scala.collection.MapLike.apply$(MapLike.scala:143) > at scala.collection.AbstractMap.apply(Map.scala:65) > at kafka.server.AdminManager.listType$1(AdminManager.scala:681) > at > kafka.server.AdminManager.$anonfun$prepareIncrementalConfigs$1(AdminManager.scala:693) >
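The `NoSuchElementException` in the trace above comes from a Scala `Map` lookup (`map(key)`) on a key that is absent: `metrics.jmx.blacklist` is a `JmxReporter` config, not a `KafkaConfig` key. One defensive shape for the fix, sketched here in Java with hypothetical names (the real `AdminManager` code is Scala), is to classify unknown keys as pass-through plugin configs instead of throwing:

```java
// Sketch: keys not present in the broker's known ConfigDef are treated as
// plugin configs rather than causing a NoSuchElementException.
// Hypothetical names; not the actual AdminManager logic.
import java.util.Map;

public class ConfigLookupSketch {
    enum ConfigType { BROKER, PLUGIN }

    // Known broker keys, abbreviated stand-in for KafkaConfig's definitions.
    static final Map<String, ConfigType> BROKER_KEYS =
        Map.of("log.retention.ms", ConfigType.BROKER);

    static ConfigType classify(String key) {
        // getOrDefault returns a fallback for absent keys; the Scala code's
        // map(key) threw NoSuchElementException instead.
        return BROKER_KEYS.getOrDefault(key, ConfigType.PLUGIN);
    }

    public static void main(String[] args) {
        System.out.println(classify("log.retention.ms"));      // BROKER
        System.out.println(classify("metrics.jmx.blacklist")); // PLUGIN
    }
}
```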
[jira] [Resolved] (KAFKA-14236) ListGroups request produces too much Denied logs in authorizer
[ https://issues.apache.org/jira/browse/KAFKA-14236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14236. - Fix Version/s: 3.4.0 Resolution: Fixed > ListGroups request produces too much Denied logs in authorizer > -- > > Key: KAFKA-14236 > URL: https://issues.apache.org/jira/browse/KAFKA-14236 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 2.0.0 >Reporter: Alexandre GRIFFAUT >Priority: Major > Labels: patch-available > Fix For: 3.4.0 > > > Context > On a multi-tenant secured cluster, with many consumers, a call to the ListGroups > API will log an authorization error for each consumer group of other tenants. > Reason > The handleListGroupsRequest function first tries to authorize a DESCRIBE > CLUSTER, and if it fails it will then try to authorize a DESCRIBE GROUP on > each consumer group. > Fix > In that case neither the DESCRIBE CLUSTER nor the DESCRIBE GROUP of other > tenants was intended, so both should be specified in the Action using > logIfDenied: false -- This message was sent by Atlassian Jira (v8.20.10#820010)
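The fix direction can be illustrated with a tiny, hypothetical authorizer sketch: speculative checks (the cluster-wide DESCRIBE and the per-group fallback, both expected to fail for other tenants) pass `logIfDenied=false`, so expected denials never reach the log. The mini-API below is illustrative, not Kafka's `Authorizer` interface:

```java
// Sketch: speculative authorization checks suppress denial logging via a
// logIfDenied flag, mirroring the Action option named in the ticket.
// Hypothetical mini-API, not Kafka's real Authorizer.
import java.util.ArrayList;
import java.util.List;

public class AuthSketch {
    final List<String> deniedLog = new ArrayList<>();

    boolean authorize(String resource, boolean allowed, boolean logIfDenied) {
        if (!allowed && logIfDenied) deniedLog.add("Denied: " + resource);
        return allowed;
    }

    // ListGroups handling: the cluster-level check is speculative, don't log it.
    List<String> listGroups(List<String> allGroups, List<String> describableGroups) {
        if (authorize("cluster:DESCRIBE", false, /* logIfDenied= */ false)) {
            return allGroups;
        }
        List<String> visible = new ArrayList<>();
        for (String group : allGroups) {
            // The per-group fallback is also expected to fail for other tenants.
            if (authorize("group:" + group, describableGroups.contains(group), false)) {
                visible.add(group);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        AuthSketch sketch = new AuthSketch();
        List<String> visible = sketch.listGroups(
            List.of("a-group", "b-group"), List.of("a-group"));
        System.out.println(visible);                 // [a-group]
        System.out.println(sketch.deniedLog.size()); // 0: no noisy denials
    }
}
```

The caller still gets the correctly filtered group list; only the log noise for expected denials disappears.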
[jira] [Resolved] (KAFKA-14240) Ensure kraft metadata log dir is initialized with valid snapshot state
[ https://issues.apache.org/jira/browse/KAFKA-14240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14240. - Resolution: Fixed > Ensure kraft metadata log dir is initialized with valid snapshot state > -- > > Key: KAFKA-14240 > URL: https://issues.apache.org/jira/browse/KAFKA-14240 > Project: Kafka > Issue Type: Improvement >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 3.3.0 > > > If the first segment under __cluster_metadata has a base offset greater than > 0, then there must exist at least one snapshot which has a larger offset than > whatever the first segment starts at. We should check for this at startup to > prevent the controller from initialization with invalid state. -- This message was sent by Atlassian Jira (v8.20.10#820010)
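The startup invariant described above — if the metadata log's first segment starts after offset 0, some snapshot must cover the gap, i.e. at least one snapshot offset must be at or beyond the first segment's base offset — can be sketched as a small validation helper. This is a hypothetical illustration, not the real KRaft initialization code:

```java
// Sketch of the startup check: a metadata log that does not start at offset 0
// is only valid if a snapshot covers the offsets before its first segment.
import java.util.List;

public class SnapshotStateCheck {
    static void validate(long firstSegmentBaseOffset, List<Long> snapshotEndOffsets) {
        if (firstSegmentBaseOffset == 0) return; // log starts at the beginning
        boolean covered = snapshotEndOffsets.stream()
            .anyMatch(end -> end >= firstSegmentBaseOffset);
        if (!covered) {
            throw new IllegalStateException(
                "No snapshot covers offsets before " + firstSegmentBaseOffset);
        }
    }

    public static void main(String[] args) {
        validate(0, List.of());                      // ok: log starts at 0
        validate(369668, List.of(503411L, 548746L)); // ok: a snapshot covers the gap
        try {
            validate(369668, List.of());             // invalid: no snapshot at all
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast here prevents the controller from starting with state it cannot serve, which would otherwise surface later as runtime errors like the one in KAFKA-14238 below.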
[jira] [Resolved] (KAFKA-14238) KRaft replicas can delete segments not included in a snapshot
[ https://issues.apache.org/jira/browse/KAFKA-14238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14238. - Resolution: Fixed > KRaft replicas can delete segments not included in a snapshot > - > > Key: KAFKA-14238 > URL: https://issues.apache.org/jira/browse/KAFKA-14238 > Project: Kafka > Issue Type: Bug > Components: core, kraft >Reporter: Jose Armando Garcia Sancio >Assignee: Jose Armando Garcia Sancio >Priority: Blocker > Fix For: 3.3.0 > > > We see this in the log > {code:java} > Deleting segment LogSegment(baseOffset=243864, size=9269150, > lastModifiedTime=1662486784182, largestRecordTimestamp=Some(1662486784160)) > due to retention time 60480ms breach based on the largest record > timestamp in the segment {code} > This then causes {{KafkaRaftClient}} to throw an exception when sending > batches to the listener: > {code:java} > java.lang.IllegalStateException: Snapshot expected since next offset of > org.apache.kafka.controller.QuorumController$QuorumMetaLogListener@195461949 > is 0, log start offset is 369668 and high-watermark is 547379 > at > org.apache.kafka.raft.KafkaRaftClient.lambda$updateListenersProgress$4(KafkaRaftClient.java:312) > at java.base/java.util.Optional.orElseThrow(Optional.java:403) > at > org.apache.kafka.raft.KafkaRaftClient.lambda$updateListenersProgress$5(KafkaRaftClient.java:311) > at java.base/java.util.OptionalLong.ifPresent(OptionalLong.java:165) > at > org.apache.kafka.raft.KafkaRaftClient.updateListenersProgress(KafkaRaftClient.java:309){code} > The on-disk state for the cluster metadata partition confirms this: > {code:java} > ls __cluster_metadata-0/ > 00369668.index > 00369668.log > 00369668.timeindex > 00503411.index > 00503411.log > 00503411.snapshot > 00503411.timeindex > 00548746.snapshot > leader-epoch-checkpoint > partition.metadata > quorum-state{code} > Notice that there are no {{checkpoint}} files and the log doesn't have a > segment at base offset 0. 
> This is happening because the {{LogConfig}} used for KRaft sets the retention > policy to {{delete}}, which causes the method {{deleteOldSegments}} to delete > old segments even if there are no snapshots for them. For KRaft, Kafka should > only delete segments that breach the log start offset. > Log configuration for KRaft: > {code:java} > val props = new Properties() > props.put(LogConfig.MaxMessageBytesProp, > config.maxBatchSizeInBytes.toString) > props.put(LogConfig.SegmentBytesProp, Int.box(config.logSegmentBytes)) > props.put(LogConfig.SegmentMsProp, Long.box(config.logSegmentMillis)) > props.put(LogConfig.FileDeleteDelayMsProp, > Int.box(Defaults.FileDeleteDelayMs)) > LogConfig.validateValues(props) > val defaultLogConfig = LogConfig(props){code} > Segment deletion code: > {code:java} > def deleteOldSegments(): Int = { > if (config.delete) { > deleteLogStartOffsetBreachedSegments() + > deleteRetentionSizeBreachedSegments() + > deleteRetentionMsBreachedSegments() > } else { > deleteLogStartOffsetBreachedSegments() > } > }{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
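The proposed behavior above (only delete metadata-log segments that breach the log start offset, never on retention time or size) can be sketched with a self-contained model. This is illustrative only, assuming a list of segment base offsets; the names are hypothetical and not Kafka's actual `UnifiedLog` API:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the fix: for the KRaft metadata log, only segments
// entirely below the log start offset (i.e. covered by a snapshot) may be
// deleted; retention-time and retention-size breaches must not apply.
class MetadataLogModel {
    final List<Long> segmentBaseOffsets = new ArrayList<>();
    long logStartOffset;

    // A segment is safe to delete when the *next* segment's base offset is
    // still <= logStartOffset, i.e. the whole segment lies below the start.
    int deleteLogStartOffsetBreachedSegments() {
        int deleted = 0;
        while (segmentBaseOffsets.size() > 1
                && segmentBaseOffsets.get(1) <= logStartOffset) {
            segmentBaseOffsets.remove(0);
            deleted++;
        }
        return deleted;
    }

    // KRaft variant of deleteOldSegments(): no retention-based deletion,
    // regardless of the configured cleanup policy.
    int deleteOldSegments() {
        return deleteLogStartOffsetBreachedSegments();
    }
}
```

Using the offsets from the ticket (log start offset 369668), only the segments wholly below 369668 would be removed, so the listener can always replay from a snapshot or from offset 0.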
[jira] [Created] (KAFKA-14240) Ensure kraft metadata log dir is initialized with valid snapshot state
Jason Gustafson created KAFKA-14240: --- Summary: Ensure kraft metadata log dir is initialized with valid snapshot state Key: KAFKA-14240 URL: https://issues.apache.org/jira/browse/KAFKA-14240 Project: Kafka Issue Type: Improvement Reporter: Jason Gustafson Assignee: Jason Gustafson If the first segment under __cluster_metadata has a base offset greater than 0, then there must exist at least one snapshot with a larger offset than whatever the first segment starts at. We should check for this at startup to prevent the controller from initializing with invalid state. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14224) Consumer should auto-commit prior to cooperative partition revocation
[ https://issues.apache.org/jira/browse/KAFKA-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14224: Description: With the old "eager" reassignment logic, we always revoked all partitions prior to each rebalance. When auto-commit is enabled, a part of this process is committing current positions. Under the new "cooperative" logic, we defer revocation until after the rebalance, which means we can continue fetching while the rebalance is in progress. However, when reviewing KAFKA-14196, we noticed that there is no similar logic to commit offsets prior to this deferred revocation. This means that cooperative consumption is more likely to lead to duplicate consumption even when there is no failure involved. (was: With the old "eager" reassignment logic, we always revoked all partitions prior to each rebalance. When auto-commit is enabled, a part of this process is committing current position. Under the new "cooperative" logic, we defer revocation until after the rebalance, which means we can continue fetching while the rebalance is in progress. However, when reviewing KAFKA-14196, we noticed that there is no similar logic to commit offsets prior to this deferred revocation. This means that cooperative consumption is more likely to lead to duplicate consumption even when there is no failure involved.) > Consumer should auto-commit prior to cooperative partition revocation > - > > Key: KAFKA-14224 > URL: https://issues.apache.org/jira/browse/KAFKA-14224 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > Labels: new-consumer-threading-should-fix > > With the old "eager" reassignment logic, we always revoked all partitions > prior to each rebalance. When auto-commit is enabled, a part of this process > is committing current positions. 
Under the new "cooperative" logic, we defer > revocation until after the rebalance, which means we can continue fetching > while the rebalance is in progress. However, when reviewing KAFKA-14196, we > noticed that there is no similar logic to commit offsets prior to this > deferred revocation. This means that cooperative consumption is more likely > to lead to duplicate consumption even when there is no failure involved. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14196) Duplicated consumption during rebalance, causing OffsetValidationTest to act flaky
[ https://issues.apache.org/jira/browse/KAFKA-14196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17603328#comment-17603328 ] Jason Gustafson commented on KAFKA-14196: - > Also, this is currently marked as a blocker. Is there a crisp description of >the regression? Eager rebalance strategies attempt to auto-commit offsets before revoking partitions and joining the rebalance. Originally this logic was synchronous, which meant there was no opportunity for additional data to be returned before the revocation completed. This changed when we introduced asynchronous offset commit logic. Any progress made between the time the asynchronous offset commit was sent and the revocation completed would be lost. This results in duplicate consumption. > Duplicated consumption during rebalance, causing OffsetValidationTest to act > flaky > -- > > Key: KAFKA-14196 > URL: https://issues.apache.org/jira/browse/KAFKA-14196 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Affects Versions: 3.2.1 >Reporter: Philip Nee >Assignee: Philip Nee >Priority: Blocker > Labels: new-consumer-threading-should-fix > Fix For: 3.3.0, 3.2.2 > > > Several flaky tests under OffsetValidationTest are indicating a potential > consumer duplication issue when autocommit is enabled. I believe this is > affecting *3.2* and onward. The failure message is shown below: > > {code:java} > Total consumed records 3366 did not match consumed position 3331 {code} > > After investigating the log, I discovered that the data consumed between the > start of a rebalance event and the async commit was lost for those failing > tests. In the example below, the rebalance event kicks in at around > 1662054846995 (first record), and the async commit of the offset 3739 is > completed at around 1662054847015 (right before partitions_revoked). 
> > {code:java} > {"timestamp":1662054846995,"name":"records_consumed","count":3,"partitions":[{"topic":"test_topic","partition":0,"count":3,"minOffset":3739,"maxOffset":3741}]} > {"timestamp":1662054846998,"name":"records_consumed","count":2,"partitions":[{"topic":"test_topic","partition":0,"count":2,"minOffset":3742,"maxOffset":3743}]} > {"timestamp":1662054847008,"name":"records_consumed","count":2,"partitions":[{"topic":"test_topic","partition":0,"count":2,"minOffset":3744,"maxOffset":3745}]} > {"timestamp":1662054847016,"name":"partitions_revoked","partitions":[{"topic":"test_topic","partition":0}]} > {"timestamp":1662054847031,"name":"partitions_assigned","partitions":[{"topic":"test_topic","partition":0}]} > {"timestamp":1662054847038,"name":"records_consumed","count":23,"partitions":[{"topic":"test_topic","partition":0,"count":23,"minOffset":3739,"maxOffset":3761}]} > {code} > A few things to note here: > # Manually calling commitSync in the onPartitionsRevoked callback seems to > alleviate the issue > # Setting includeMetadataInTimeout to false also seems to alleviate the > issue. > These attempts seem to suggest that the contract between poll() and > asyncCommit() is broken. AFAIK, we implicitly use poll() to ack the > previously fetched data, and the consumer would (try to) commit these offsets > in the current poll() loop. However, it seems like as the poll continues to > loop, the "acked" data isn't being committed. > > I believe this could have been introduced in KAFKA-14024, which originated from > KAFKA-13310. > More specifically (see the comments below), the ConsumerCoordinator will > always return before the async commit, due to the previous incomplete commit. > However, this is a bit contradictory here because: > # I think we want to commit asynchronously while the poll continues, and if > we do that, we are back to KAFKA-14024, where the consumer will hit the rebalance > timeout and get kicked out of the group. 
> # But we also need to commit all the "acked" offsets before revoking the > partition, and this has to block. > *Steps to Reproduce the Issue:* > # Check out AK 3.2 > # Run this several times (recommended: run only the runs with autocommit > enabled in consumer_test.py, to save time): > {code:java} > _DUCKTAPE_OPTIONS="--debug" > TC_PATHS="tests/kafkatest/tests/client/consumer_test.py::OffsetValidationTest.test_consumer_failure" > bash tests/docker/run_tests.sh {code} > > *Steps to Diagnose the Issue:* > # Open the test results in *results/* > # Go to the consumer log. It might look like this: > > {code:java} > results/2022-09-03--005/OffsetValidationTest/test_consumer_failure/clean_shutdown=True.enable_autocommit=True.metadata_quorum=ZK/2/VerifiableConsumer-0-xx/dockerYY > {code} > # Find the docker instance that has the partition getting revoked and rejoined. > Observe the offsets before and after. > *Proposed Fixes:* > TBD -- This message was sent by Atlassian Jira (v8.20.10#820010)
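The duplication described above, and why mitigation #1 (a synchronous commit in the rebalance listener) helps, comes down to simple offset arithmetic: an async commit snapshots the position when it is *sent*, while the consumer keeps advancing until revocation completes. A toy model of that arithmetic (this is not Kafka's consumer code; all names are illustrative):

```java
import java.util.OptionalLong;

// Illustrative model (not Kafka code): offsets consumed after an async
// commit is sent, but before revocation completes, are replayed after the
// rebalance because only the earlier snapshot gets committed.
class RebalanceCommitModel {
    long position = 0;   // next offset to fetch
    long committed = 0;  // last committed offset
    OptionalLong inFlightAsyncCommit = OptionalLong.empty();

    void poll(int records) { position += records; }

    // Async commit: snapshot the position now, complete later.
    void commitAsync() { inFlightAsyncCommit = OptionalLong.of(position); }
    void completeAsyncCommit() {
        inFlightAsyncCommit.ifPresent(o -> committed = o);
        inFlightAsyncCommit = OptionalLong.empty();
    }

    // Synchronous commit of the current position, as the eager protocol
    // effectively did (and as calling commitSync() in onPartitionsRevoked
    // restores).
    void commitSync() { committed = position; }

    // After revocation and reassignment, fetching resumes from the
    // committed offset; anything past it is consumed again.
    long duplicatesAfterRebalance() { return position - committed; }
}
```

In the log excerpt above, the async commit of 3739 lands while the consumer has already fetched through 3745, so offsets 3739-3745 are replayed after reassignment; a blocking commit before revocation would make the duplicate count zero.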
[jira] [Resolved] (KAFKA-14215) KRaft forwarded requests have no quota enforcement
[ https://issues.apache.org/jira/browse/KAFKA-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14215. - Resolution: Fixed > KRaft forwarded requests have no quota enforcement > -- > > Key: KAFKA-14215 > URL: https://issues.apache.org/jira/browse/KAFKA-14215 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.3.0, 3.3 >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Blocker > Fix For: 3.3.0, 3.3 > > > On the broker, the `BrokerMetadataPublisher` is responsible for propagating > quota changes from `ClientQuota` records to `ClientQuotaManager`. On the > controller, there is no similar logic, so no client quotas are enforced on > the controller. > On the broker side, there is no enforcement either, since the broker assumes > that the controller will be the one to do it. Basically it looks at the > throttle time returned in the response from the controller. If it is 0, then > the response is sent immediately without any throttling. > So the consequence of both of these issues is that controller-bound requests > have no throttling today. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
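The two halves of the bug described above compose into "no throttling at all", which a few lines make concrete. This is an illustrative model only, not Kafka's actual forwarding path; the class and method names are hypothetical:

```java
import java.util.Optional;

// Illustrative model of KAFKA-14215 (not Kafka code). The controller has
// no client-quota propagation wired up, so forwarded responses always
// carry throttleTimeMs = 0; the broker only throttles when that value is
// positive, so forwarded (controller-bound) requests are never throttled.
class ForwardedRequestModel {
    // Controller side: no quota manager verdict available -> always 0.
    static int controllerThrottleTimeMs(Optional<Integer> quotaManagerVerdict) {
        return quotaManagerVerdict.orElse(0);
    }

    // Broker side: send the response immediately unless the controller
    // asked for a throttle.
    static boolean brokerThrottles(int controllerThrottleTimeMs) {
        return controllerThrottleTimeMs > 0;
    }
}
```

Either half alone would be enough to enforce quotas; because the controller always reports 0 and the broker defers to it, neither side throttles.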
[jira] [Updated] (KAFKA-14224) Consumer should auto-commit prior to cooperative partition revocation
[ https://issues.apache.org/jira/browse/KAFKA-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14224: Description: With the old "eager" reassignment logic, we always revoked all partitions prior to each rebalance. When auto-commit is enabled, a part of this process is committing current position. Under the new "cooperative" logic, we defer revocation until after the rebalance, which means we can continue fetching while the rebalance is in progress. However, when reviewing KAFKA-14196, we noticed that there is no similar logic to commit offsets prior to this deferred revocation. This means that cooperative consumption is more likely to lead to duplicate consumption even when there is no failure involved. (was: With the old "eager" reassignment logic, we always revoked all partitions prior to each rebalance. When auto-commit is enabled, a part of this process is committing current position. Under the cooperative logic, we defer revocation until after the rebalance, which means we can continue fetching while the rebalance is in progress. However, when reviewing KAFKA-14196, we noticed that there is no similar logic to commit offsets prior to this deferred revocation. This means that cooperative consumption is more likely to lead to duplicate consumption even when there is no failure involved.) > Consumer should auto-commit prior to cooperative partition revocation > - > > Key: KAFKA-14224 > URL: https://issues.apache.org/jira/browse/KAFKA-14224 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > > With the old "eager" reassignment logic, we always revoked all partitions > prior to each rebalance. When auto-commit is enabled, a part of this process > is committing current position. Under the new "cooperative" logic, we defer > revocation until after the rebalance, which means we can continue fetching > while the rebalance is in progress. 
However, when reviewing KAFKA-14196, we > noticed that there is no similar logic to commit offsets prior to this > deferred revocation. This means that cooperative consumption is more likely > to lead to duplicate consumption even when there is no failure involved. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14224) Consumer should auto-commit prior to cooperative partition revocation
Jason Gustafson created KAFKA-14224: --- Summary: Consumer should auto-commit prior to cooperative partition revocation Key: KAFKA-14224 URL: https://issues.apache.org/jira/browse/KAFKA-14224 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson With the old "eager" reassignment logic, we always revoked all partitions prior to each rebalance. When auto-commit is enabled, a part of this process is committing current position. Under the cooperative logic, we defer revocation until after the rebalance, which means we can continue fetching while the rebalance is in progress. However, when reviewing KAFKA-14196, we noticed that there is no similar logic to commit offsets prior to this deferred revocation. This means that cooperative consumption is more likely to lead to duplicate consumption even when there is no failure involved. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14215) KRaft forwarded requests have no quota enforcement
Jason Gustafson created KAFKA-14215: --- Summary: KRaft forwarded requests have no quota enforcement Key: KAFKA-14215 URL: https://issues.apache.org/jira/browse/KAFKA-14215 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson On the broker, the `BrokerMetadataPublisher` is responsible for propagating quota changes from `ClientQuota` records to `ClientQuotaManager`. On the controller, there is no similar logic, so no client quotas are enforced on the controller. On the broker side, there is no enforcement either, since the broker assumes that the controller will be the one to do it. Basically it looks at the throttle time returned in the response from the controller. If it is 0, then the response is sent immediately without any throttling. So the consequence of both of these issues is that controller-bound requests have no throttling today. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14196) Duplicated consumption during rebalance, causing OffsetValidationTest to act flaky
[ https://issues.apache.org/jira/browse/KAFKA-14196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14196: Affects Version/s: 3.2.1 (was: 3.3.0) > Duplicated consumption during rebalance, causing OffsetValidationTest to act > flaky > -- > > Key: KAFKA-14196 > URL: https://issues.apache.org/jira/browse/KAFKA-14196 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Affects Versions: 3.2.1 >Reporter: Philip Nee >Assignee: Philip Nee >Priority: Major > Labels: new-consumer-threading-should-fix > > Several flaky tests under OffsetValidationTest are indicating potential > consumer duplication issue, when autocommit is enabled. I believe this is > affecting *3.2* and onward. Below shows the failure message: > > {code:java} > Total consumed records 3366 did not match consumed position 3331 {code} > > After investigating the log, I discovered that the data consumed between the > start of a rebalance event and the async commit was lost for those failing > tests. In the example below, the rebalance event kicks in at around > 1662054846995 (first record), and the async commit of the offset 3739 is > completed at around 1662054847015 (right before partitions_revoked). 
> > {code:java} > {"timestamp":1662054846995,"name":"records_consumed","count":3,"partitions":[{"topic":"test_topic","partition":0,"count":3,"minOffset":3739,"maxOffset":3741}]} > {"timestamp":1662054846998,"name":"records_consumed","count":2,"partitions":[{"topic":"test_topic","partition":0,"count":2,"minOffset":3742,"maxOffset":3743}]} > {"timestamp":1662054847008,"name":"records_consumed","count":2,"partitions":[{"topic":"test_topic","partition":0,"count":2,"minOffset":3744,"maxOffset":3745}]} > {"timestamp":1662054847016,"name":"partitions_revoked","partitions":[{"topic":"test_topic","partition":0}]} > {"timestamp":1662054847031,"name":"partitions_assigned","partitions":[{"topic":"test_topic","partition":0}]} > {"timestamp":1662054847038,"name":"records_consumed","count":23,"partitions":[{"topic":"test_topic","partition":0,"count":23,"minOffset":3739,"maxOffset":3761}]} > {code} > A few things to note here: > # Manually calling commitSync in the onPartitionsRevoked callback seems to > alleviate the issue > # Setting includeMetadataInTimeout to false also seems to alleviate the > issue. > These attempts seem to suggest that the contract between poll() and > asyncCommit() is broken. AFAIK, we implicitly use poll() to ack the > previously fetched data, and the consumer would (try to) commit these offsets > in the current poll() loop. However, it seems like as the poll continues to > loop, the "acked" data isn't being committed. > > I believe this could have been introduced in KAFKA-14024, which originated from > KAFKA-13310. > More specifically (see the comments below), the ConsumerCoordinator will > always return before the async commit, due to the previous incomplete commit. > However, this is a bit contradictory here because: > # I think we want to commit asynchronously while the poll continues, and if > we do that, we are back to KAFKA-14024, where the consumer will hit the rebalance > timeout and get kicked out of the group. 
> # But we also need to commit all the "acked" offsets before revoking the > partition, and this has to block. > *Steps to Reproduce the Issue:* > # Check out AK 3.2 > # Run this several times (recommended: run only the runs with autocommit > enabled in consumer_test.py, to save time): > {code:java} > _DUCKTAPE_OPTIONS="--debug" > TC_PATHS="tests/kafkatest/tests/client/consumer_test.py::OffsetValidationTest.test_consumer_failure" > bash tests/docker/run_tests.sh {code} > > *Steps to Diagnose the Issue:* > # Open the test results in *results/* > # Go to the consumer log. It might look like this: > > {code:java} > results/2022-09-03--005/OffsetValidationTest/test_consumer_failure/clean_shutdown=True.enable_autocommit=True.metadata_quorum=ZK/2/VerifiableConsumer-0-xx/dockerYY > {code} > # Find the docker instance that has the partition getting revoked and rejoined. > Observe the offsets before and after. > *Proposed Fixes:* > TBD -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14196) Duplicated consumption during rebalance, causing OffsetValidationTest to act flaky
[ https://issues.apache.org/jira/browse/KAFKA-14196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14196: Priority: Blocker (was: Major) > Duplicated consumption during rebalance, causing OffsetValidationTest to act > flaky > -- > > Key: KAFKA-14196 > URL: https://issues.apache.org/jira/browse/KAFKA-14196 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Affects Versions: 3.2.1 >Reporter: Philip Nee >Assignee: Philip Nee >Priority: Blocker > Labels: new-consumer-threading-should-fix > Fix For: 3.3.0, 3.2.2 > > > Several flaky tests under OffsetValidationTest are indicating potential > consumer duplication issue, when autocommit is enabled. I believe this is > affecting *3.2* and onward. Below shows the failure message: > > {code:java} > Total consumed records 3366 did not match consumed position 3331 {code} > > After investigating the log, I discovered that the data consumed between the > start of a rebalance event and the async commit was lost for those failing > tests. In the example below, the rebalance event kicks in at around > 1662054846995 (first record), and the async commit of the offset 3739 is > completed at around 1662054847015 (right before partitions_revoked). 
> > {code:java} > {"timestamp":1662054846995,"name":"records_consumed","count":3,"partitions":[{"topic":"test_topic","partition":0,"count":3,"minOffset":3739,"maxOffset":3741}]} > {"timestamp":1662054846998,"name":"records_consumed","count":2,"partitions":[{"topic":"test_topic","partition":0,"count":2,"minOffset":3742,"maxOffset":3743}]} > {"timestamp":1662054847008,"name":"records_consumed","count":2,"partitions":[{"topic":"test_topic","partition":0,"count":2,"minOffset":3744,"maxOffset":3745}]} > {"timestamp":1662054847016,"name":"partitions_revoked","partitions":[{"topic":"test_topic","partition":0}]} > {"timestamp":1662054847031,"name":"partitions_assigned","partitions":[{"topic":"test_topic","partition":0}]} > {"timestamp":1662054847038,"name":"records_consumed","count":23,"partitions":[{"topic":"test_topic","partition":0,"count":23,"minOffset":3739,"maxOffset":3761}]} > {code} > A few things to note here: > # Manually calling commitSync in the onPartitionsRevoked callback seems to > alleviate the issue > # Setting includeMetadataInTimeout to false also seems to alleviate the > issue. > These attempts seem to suggest that the contract between poll() and > asyncCommit() is broken. AFAIK, we implicitly use poll() to ack the > previously fetched data, and the consumer would (try to) commit these offsets > in the current poll() loop. However, it seems like as the poll continues to > loop, the "acked" data isn't being committed. > > I believe this could have been introduced in KAFKA-14024, which originated from > KAFKA-13310. > More specifically (see the comments below), the ConsumerCoordinator will > always return before the async commit, due to the previous incomplete commit. > However, this is a bit contradictory here because: > # I think we want to commit asynchronously while the poll continues, and if > we do that, we are back to KAFKA-14024, where the consumer will hit the rebalance > timeout and get kicked out of the group. 
> # But we also need to commit all the "acked" offsets before revoking the > partition, and this has to block. > *Steps to Reproduce the Issue:* > # Check out AK 3.2 > # Run this several times (recommended: run only the runs with autocommit > enabled in consumer_test.py, to save time): > {code:java} > _DUCKTAPE_OPTIONS="--debug" > TC_PATHS="tests/kafkatest/tests/client/consumer_test.py::OffsetValidationTest.test_consumer_failure" > bash tests/docker/run_tests.sh {code} > > *Steps to Diagnose the Issue:* > # Open the test results in *results/* > # Go to the consumer log. It might look like this: > > {code:java} > results/2022-09-03--005/OffsetValidationTest/test_consumer_failure/clean_shutdown=True.enable_autocommit=True.metadata_quorum=ZK/2/VerifiableConsumer-0-xx/dockerYY > {code} > # Find the docker instance that has the partition getting revoked and rejoined. > Observe the offsets before and after. > *Proposed Fixes:* > TBD -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14196) Duplicated consumption during rebalance, causing OffsetValidationTest to act flaky
[ https://issues.apache.org/jira/browse/KAFKA-14196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14196: Fix Version/s: 3.3.0 3.2.2 > Duplicated consumption during rebalance, causing OffsetValidationTest to act > flaky > -- > > Key: KAFKA-14196 > URL: https://issues.apache.org/jira/browse/KAFKA-14196 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Affects Versions: 3.2.1 >Reporter: Philip Nee >Assignee: Philip Nee >Priority: Major > Labels: new-consumer-threading-should-fix > Fix For: 3.3.0, 3.2.2 > > > Several flaky tests under OffsetValidationTest are indicating potential > consumer duplication issue, when autocommit is enabled. I believe this is > affecting *3.2* and onward. Below shows the failure message: > > {code:java} > Total consumed records 3366 did not match consumed position 3331 {code} > > After investigating the log, I discovered that the data consumed between the > start of a rebalance event and the async commit was lost for those failing > tests. In the example below, the rebalance event kicks in at around > 1662054846995 (first record), and the async commit of the offset 3739 is > completed at around 1662054847015 (right before partitions_revoked). 
> > {code:java} > {"timestamp":1662054846995,"name":"records_consumed","count":3,"partitions":[{"topic":"test_topic","partition":0,"count":3,"minOffset":3739,"maxOffset":3741}]} > {"timestamp":1662054846998,"name":"records_consumed","count":2,"partitions":[{"topic":"test_topic","partition":0,"count":2,"minOffset":3742,"maxOffset":3743}]} > {"timestamp":1662054847008,"name":"records_consumed","count":2,"partitions":[{"topic":"test_topic","partition":0,"count":2,"minOffset":3744,"maxOffset":3745}]} > {"timestamp":1662054847016,"name":"partitions_revoked","partitions":[{"topic":"test_topic","partition":0}]} > {"timestamp":1662054847031,"name":"partitions_assigned","partitions":[{"topic":"test_topic","partition":0}]} > {"timestamp":1662054847038,"name":"records_consumed","count":23,"partitions":[{"topic":"test_topic","partition":0,"count":23,"minOffset":3739,"maxOffset":3761}]} > {code} > A few things to note here: > # Manually calling commitSync in the onPartitionsRevoked callback seems to > alleviate the issue > # Setting includeMetadataInTimeout to false also seems to alleviate the > issue. > These attempts seem to suggest that the contract between poll() and > asyncCommit() is broken. AFAIK, we implicitly use poll() to ack the > previously fetched data, and the consumer would (try to) commit these offsets > in the current poll() loop. However, it seems like as the poll continues to > loop, the "acked" data isn't being committed. > > I believe this could have been introduced in KAFKA-14024, which originated from > KAFKA-13310. > More specifically (see the comments below), the ConsumerCoordinator will > always return before the async commit, due to the previous incomplete commit. > However, this is a bit contradictory here because: > # I think we want to commit asynchronously while the poll continues, and if > we do that, we are back to KAFKA-14024, where the consumer will hit the rebalance > timeout and get kicked out of the group. 
> # But we also need to commit all the "acked" offsets before revoking the > partition, and this has to block. > *Steps to Reproduce the Issue:* > # Check out AK 3.2 > # Run this several times (recommended: run only the runs with autocommit > enabled in consumer_test.py, to save time): > {code:java} > _DUCKTAPE_OPTIONS="--debug" > TC_PATHS="tests/kafkatest/tests/client/consumer_test.py::OffsetValidationTest.test_consumer_failure" > bash tests/docker/run_tests.sh {code} > > *Steps to Diagnose the Issue:* > # Open the test results in *results/* > # Go to the consumer log. It might look like this: > > {code:java} > results/2022-09-03--005/OffsetValidationTest/test_consumer_failure/clean_shutdown=True.enable_autocommit=True.metadata_quorum=ZK/2/VerifiableConsumer-0-xx/dockerYY > {code} > # Find the docker instance that has the partition getting revoked and rejoined. > Observe the offsets before and after. > *Proposed Fixes:* > TBD -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14196) Duplicated consumption during rebalance, causing OffsetValidationTest to act flaky
[ https://issues.apache.org/jira/browse/KAFKA-14196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14196: Affects Version/s: (was: 3.2.1) > Duplicated consumption during rebalance, causing OffsetValidationTest to act > flaky > -- > > Key: KAFKA-14196 > URL: https://issues.apache.org/jira/browse/KAFKA-14196 > Project: Kafka > Issue Type: Bug > Components: clients, consumer >Affects Versions: 3.3.0 >Reporter: Philip Nee >Assignee: Philip Nee >Priority: Major > Labels: new-consumer-threading-should-fix > > Several flaky tests under OffsetValidationTest are indicating potential > consumer duplication issue, when autocommit is enabled. I believe this is > affecting *3.2* and onward. Below shows the failure message: > > {code:java} > Total consumed records 3366 did not match consumed position 3331 {code} > > After investigating the log, I discovered that the data consumed between the > start of a rebalance event and the async commit was lost for those failing > tests. In the example below, the rebalance event kicks in at around > 1662054846995 (first record), and the async commit of the offset 3739 is > completed at around 1662054847015 (right before partitions_revoked). 
> > {code:java} > {"timestamp":1662054846995,"name":"records_consumed","count":3,"partitions":[{"topic":"test_topic","partition":0,"count":3,"minOffset":3739,"maxOffset":3741}]} > {"timestamp":1662054846998,"name":"records_consumed","count":2,"partitions":[{"topic":"test_topic","partition":0,"count":2,"minOffset":3742,"maxOffset":3743}]} > {"timestamp":1662054847008,"name":"records_consumed","count":2,"partitions":[{"topic":"test_topic","partition":0,"count":2,"minOffset":3744,"maxOffset":3745}]} > {"timestamp":1662054847016,"name":"partitions_revoked","partitions":[{"topic":"test_topic","partition":0}]} > {"timestamp":1662054847031,"name":"partitions_assigned","partitions":[{"topic":"test_topic","partition":0}]} > {"timestamp":1662054847038,"name":"records_consumed","count":23,"partitions":[{"topic":"test_topic","partition":0,"count":23,"minOffset":3739,"maxOffset":3761}]} > {code} > A few things to note here: > # Manually calling commitSync in the onPartitionsRevoked callback seems to > alleviate the issue > # Setting includeMetadataInTimeout to false also seems to alleviate the > issue. > These attempts seem to suggest that the contract between poll() and > asyncCommit() is broken. AFAIK, we implicitly use poll() to ack the > previously fetched data, and the consumer would (try to) commit these offsets > in the current poll() loop. However, it seems like as the poll continues to > loop, the "acked" data isn't being committed. > > I believe this could have been introduced in KAFKA-14024, which originated from > KAFKA-13310. > More specifically (see the comments below), the ConsumerCoordinator will > always return before the async commit, due to the previous incomplete commit. > However, this is a bit contradictory here because: > # I think we want to commit asynchronously while the poll continues, and if > we do that, we are back to KAFKA-14024, where the consumer will hit the rebalance > timeout and get kicked out of the group. 
> # But we also need to commit all the "acked" offsets before revoking the > partition, and this commit has to block. > *Steps to Reproduce the Issue:* > # Check out AK 3.2 > # Run this several times: (Recommended: run only the cases with autocommit > enabled in consumer_test.py to save time) > {code:java} > _DUCKTAPE_OPTIONS="--debug" > TC_PATHS="tests/kafkatest/tests/client/consumer_test.py::OffsetValidationTest.test_consumer_failure" > bash tests/docker/run_tests.sh {code} > > *Steps to Diagnose the Issue:* > # Open the test results in *results/* > # Go to the consumer log. It might look like this: > > {code:java} > results/2022-09-03--005/OffsetValidationTest/test_consumer_failure/clean_shutdown=True.enable_autocommit=True.metadata_quorum=ZK/2/VerifiableConsumer-0-xx/dockerYY > {code} > 3. Find the docker instance that has a partition getting revoked and rejoined. > Observe the offsets before and after. > *Proposed Fixes:* > TBD -- This message was sent by Atlassian Jira (v8.20.10#820010)
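The first workaround noted in the comment (a blocking commit in the revocation callback, so offsets of already-polled records are flushed before partition ownership moves) can be sketched against the kafka-clients API as follows. This is an illustrative sketch of the workaround only, not the eventual fix; the topic name is a placeholder.

```java
import java.util.Collection;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class CommitOnRevoke {
    static void subscribe(KafkaConsumer<String, String> consumer) {
        consumer.subscribe(Collections.singletonList("test_topic"), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                // Block until the offsets of already-polled ("acked") records
                // are committed, so the next owner does not re-consume them.
                consumer.commitSync();
            }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) { }
        });
    }
}
```

The synchronous commit trades a slightly longer rebalance for the guarantee that no acked offsets are lost between revocation and reassignment.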
[jira] [Commented] (KAFKA-14201) Consumer should not send group instance ID if committing with empty member ID
[ https://issues.apache.org/jira/browse/KAFKA-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17599744#comment-17599744 ] Jason Gustafson commented on KAFKA-14201: - Worth mentioning that the alternative is to make the server more permissive and just ignore instance ID if it is set in OffsetCommit while memberId is empty. > Consumer should not send group instance ID if committing with empty member ID > - > > Key: KAFKA-14201 > URL: https://issues.apache.org/jira/browse/KAFKA-14201 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > > The consumer group instance ID is used to support a notion of "static" > consumer groups. The idea is to be able to identify the same group instance > across restarts so that a rebalance is not needed. However, if the user sets > `group.instance.id` in the consumer configuration, but uses "simple" > assignment with `assign()`, then the instance ID nevertheless is sent in the > OffsetCommit request to the coordinator. This may result in a surprising > UNKNOWN_MEMBER_ID error. The consumer should probably be smart enough to only > send the instance ID when committing as part of a consumer group. -- This message was sent by Atlassian Jira (v8.20.10#820010)
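The client-side guard proposed here reduces to a small pure function. The helper name and signature below are hypothetical; the real change would live where the consumer builds its OffsetCommit request.

```java
import java.util.Optional;

public class InstanceIdGuard {
    // Only attach group.instance.id when committing as part of a consumer
    // group, i.e. when the coordinator has assigned a member ID. With
    // "simple" assignment via assign(), the member ID stays empty, and
    // sending the instance ID would trigger UNKNOWN_MEMBER_ID.
    static Optional<String> instanceIdToSend(String memberId, Optional<String> groupInstanceId) {
        return memberId.isEmpty() ? Optional.empty() : groupInstanceId;
    }
}
```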
[jira] [Created] (KAFKA-14201) Consumer should not send group instance ID if committing with empty member ID
Jason Gustafson created KAFKA-14201: --- Summary: Consumer should not send group instance ID if committing with empty member ID Key: KAFKA-14201 URL: https://issues.apache.org/jira/browse/KAFKA-14201 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson The consumer group instance ID is used to support a notion of "static" consumer groups. The idea is to be able to identify the same group instance across restarts so that a rebalance is not needed. However, if the user sets `group.instance.id` in the consumer configuration, but uses "simple" assignment with `assign()`, then the instance ID nevertheless is sent in the OffsetCommit request to the coordinator. This may result in a surprising UNKNOWN_MEMBER_ID error. The consumer should probably be smart enough to only send the instance ID when committing as part of a consumer group. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14177) Correctly support older kraft versions without FeatureLevelRecord
[ https://issues.apache.org/jira/browse/KAFKA-14177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14177. - Resolution: Fixed > Correctly support older kraft versions without FeatureLevelRecord > - > > Key: KAFKA-14177 > URL: https://issues.apache.org/jira/browse/KAFKA-14177 > Project: Kafka > Issue Type: Bug >Reporter: Colin McCabe >Assignee: Colin McCabe >Priority: Blocker > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-14183) Kraft bootstrap metadata file should use snapshot header/footer
[ https://issues.apache.org/jira/browse/KAFKA-14183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson reassigned KAFKA-14183: --- Assignee: Jason Gustafson > Kraft bootstrap metadata file should use snapshot header/footer > --- > > Key: KAFKA-14183 > URL: https://issues.apache.org/jira/browse/KAFKA-14183 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 3.3.0 > > > The bootstrap checkpoint file that we use in kraft is intended to follow the > usual snapshot format, but currently it does not include the header/footer > control records. The main purpose of these at the moment is to set a version > for the checkpoint file itself. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14183) Kraft bootstrap metadata file should use snapshot header/footer
[ https://issues.apache.org/jira/browse/KAFKA-14183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14183: Description: The bootstrap checkpoint file that we use in kraft is intended to follow the usual snapshot format, but currently it does not include the header/footer control records. The main purpose of these at the moment is to set a version for the checkpoint file itself. > Kraft bootstrap metadata file should use snapshot header/footer > --- > > Key: KAFKA-14183 > URL: https://issues.apache.org/jira/browse/KAFKA-14183 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > Fix For: 3.3.0 > > > The bootstrap checkpoint file that we use in kraft is intended to follow the > usual snapshot format, but currently it does not include the header/footer > control records. The main purpose of these at the moment is to set a version > for the checkpoint file itself. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14183) Kraft bootstrap metadata file should use snapshot header/footer
Jason Gustafson created KAFKA-14183: --- Summary: Kraft bootstrap metadata file should use snapshot header/footer Key: KAFKA-14183 URL: https://issues.apache.org/jira/browse/KAFKA-14183 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson Fix For: 3.3.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-13166) EOFException when Controller handles unknown API
[ https://issues.apache.org/jira/browse/KAFKA-13166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17583117#comment-17583117 ] Jason Gustafson edited comment on KAFKA-13166 at 8/22/22 5:42 PM: -- I am going to resolve this jira. There were two issues that we have fixed: * [https://github.com/apache/kafka/pull/12403]: This patch fixes error handling in `ControllerApis` so that we catch errors which occur during serialization and sending of the response. Prior to this, any error that occurred during this phase would be silently ignored and the connection would be left hanging. * [https://github.com/apache/kafka/pull/12538]: This patch fixes a bug during kraft shutdown which can cause a response send to be attempted after the controller's socket server has been shut down. When this occurs, we get an exception trace which matches that in the description above. was (Author: hachikuji): I am going to resolve this jira. There were two issues that we have fixed: * [https://github.com/apache/kafka/pull/12403]: This patch fixes error handling in `ControllerApis` so that we catch errors which occur during serialization and sending of the response. Prior to this, any error that occurred during this phase would be silently ignored and the connection would be left hanging. * [https://github.com/apache/kafka/pull/12538]: This patch fixes a bug during kraft shutdown which can cause a response to be sent after the controller's socket server has been shut down. When this occurs, we get an exception trace which matches that in the description above. > EOFException when Controller handles unknown API > > > Key: KAFKA-13166 > URL: https://issues.apache.org/jira/browse/KAFKA-13166 > Project: Kafka > Issue Type: Bug >Reporter: David Arthur >Assignee: David Arthur >Priority: Minor > Fix For: 3.3.0 > > > When ControllerApis handles an unsupported RPC, it silently drops the request > due to an unhandled exception. 
> The following stack trace was manually printed since this exception was > suppressed on the controller. > {code} > java.util.NoSuchElementException: key not found: UpdateFeatures > at scala.collection.MapOps.default(Map.scala:274) > at scala.collection.MapOps.default$(Map.scala:273) > at scala.collection.AbstractMap.default(Map.scala:405) > at scala.collection.mutable.HashMap.apply(HashMap.scala:425) > at kafka.network.RequestChannel$Metrics.apply(RequestChannel.scala:74) > at > kafka.network.RequestChannel.$anonfun$updateErrorMetrics$1(RequestChannel.scala:458) > at > kafka.network.RequestChannel.$anonfun$updateErrorMetrics$1$adapted(RequestChannel.scala:457) > at > kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355) > at > scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309) > at > kafka.network.RequestChannel.updateErrorMetrics(RequestChannel.scala:457) > at kafka.network.RequestChannel.sendResponse(RequestChannel.scala:388) > at > kafka.server.RequestHandlerHelper.sendErrorOrCloseConnection(RequestHandlerHelper.scala:93) > at > kafka.server.RequestHandlerHelper.sendErrorResponseMaybeThrottle(RequestHandlerHelper.scala:121) > at > kafka.server.RequestHandlerHelper.handleError(RequestHandlerHelper.scala:78) > at kafka.server.ControllerApis.handle(ControllerApis.scala:116) > at > kafka.server.ControllerApis.$anonfun$handleEnvelopeRequest$1(ControllerApis.scala:125) > at > kafka.server.ControllerApis.$anonfun$handleEnvelopeRequest$1$adapted(ControllerApis.scala:125) > at > kafka.server.EnvelopeUtils$.handleEnvelopeRequest(EnvelopeUtils.scala:65) > at > kafka.server.ControllerApis.handleEnvelopeRequest(ControllerApis.scala:125) > at 
kafka.server.ControllerApis.handle(ControllerApis.scala:103) > at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:75) > at java.lang.Thread.run(Thread.java:748) > {code} > This is due to a bug in the metrics code in RequestChannel. > The result is that the request fails, but no indication is given that it was > due to an unsupported API on either the broker, controller, or client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
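The `NoSuchElementException: key not found` in the trace comes from an unguarded Scala `Map.apply` lookup keyed by API name inside the RequestChannel metrics code. A minimal Java analogue of the defensive pattern (illustrative only, not the actual Kafka patch):

```java
import java.util.Map;

public class MetricsLookup {
    // Scala's metrics(apiName) (Map.apply) throws NoSuchElementException for
    // an API the metrics map doesn't know about, such as UpdateFeatures on
    // the controller; a getOrDefault-style lookup degrades gracefully
    // instead of suppressing the whole error-response path.
    static int errorCount(Map<String, Integer> errorCounts, String apiName) {
        return errorCounts.getOrDefault(apiName, 0);
    }
}
```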
[jira] [Commented] (KAFKA-13166) EOFException when Controller handles unknown API
[ https://issues.apache.org/jira/browse/KAFKA-13166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17583117#comment-17583117 ] Jason Gustafson commented on KAFKA-13166: - I am going to resolve this jira. There were two issues that we have fixed: * [https://github.com/apache/kafka/pull/12403]: This patch fixes error handling in `ControllerApis` so that we catch errors which occur during serialization and sending of the response. Prior to this, any error that occurred during this phase would be silently ignored and the connection would be left hanging. * [https://github.com/apache/kafka/pull/12538]: This patch fixes a bug during kraft shutdown which can cause a response to be sent after the controller's socket server has been shut down. When this occurs, we get an exception trace which matches that in the description above. > EOFException when Controller handles unknown API > > > Key: KAFKA-13166 > URL: https://issues.apache.org/jira/browse/KAFKA-13166 > Project: Kafka > Issue Type: Bug >Reporter: David Arthur >Assignee: David Arthur >Priority: Minor > Fix For: 3.3.0, 3.4.0 > > > When ControllerApis handles an unsupported RPC, it silently drops the request > due to an unhandled exception. > The following stack trace was manually printed since this exception was > suppressed on the controller. 
> {code} > java.util.NoSuchElementException: key not found: UpdateFeatures > at scala.collection.MapOps.default(Map.scala:274) > at scala.collection.MapOps.default$(Map.scala:273) > at scala.collection.AbstractMap.default(Map.scala:405) > at scala.collection.mutable.HashMap.apply(HashMap.scala:425) > at kafka.network.RequestChannel$Metrics.apply(RequestChannel.scala:74) > at > kafka.network.RequestChannel.$anonfun$updateErrorMetrics$1(RequestChannel.scala:458) > at > kafka.network.RequestChannel.$anonfun$updateErrorMetrics$1$adapted(RequestChannel.scala:457) > at > kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355) > at > scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309) > at > kafka.network.RequestChannel.updateErrorMetrics(RequestChannel.scala:457) > at kafka.network.RequestChannel.sendResponse(RequestChannel.scala:388) > at > kafka.server.RequestHandlerHelper.sendErrorOrCloseConnection(RequestHandlerHelper.scala:93) > at > kafka.server.RequestHandlerHelper.sendErrorResponseMaybeThrottle(RequestHandlerHelper.scala:121) > at > kafka.server.RequestHandlerHelper.handleError(RequestHandlerHelper.scala:78) > at kafka.server.ControllerApis.handle(ControllerApis.scala:116) > at > kafka.server.ControllerApis.$anonfun$handleEnvelopeRequest$1(ControllerApis.scala:125) > at > kafka.server.ControllerApis.$anonfun$handleEnvelopeRequest$1$adapted(ControllerApis.scala:125) > at > kafka.server.EnvelopeUtils$.handleEnvelopeRequest(EnvelopeUtils.scala:65) > at > kafka.server.ControllerApis.handleEnvelopeRequest(ControllerApis.scala:125) > at kafka.server.ControllerApis.handle(ControllerApis.scala:103) > at 
kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:75) > at java.lang.Thread.run(Thread.java:748) > {code} > This is due to a bug in the metrics code in RequestChannel. > The result is that the request fails, but no indication is given that it was > due to an unsupported API on either the broker, controller, or client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-13166) EOFException when Controller handles unknown API
[ https://issues.apache.org/jira/browse/KAFKA-13166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-13166: Fix Version/s: (was: 3.4.0) > EOFException when Controller handles unknown API > > > Key: KAFKA-13166 > URL: https://issues.apache.org/jira/browse/KAFKA-13166 > Project: Kafka > Issue Type: Bug >Reporter: David Arthur >Assignee: David Arthur >Priority: Minor > Fix For: 3.3.0 > > > When ControllerApis handles an unsupported RPC, it silently drops the request > due to an unhandled exception. > The following stack trace was manually printed since this exception was > suppressed on the controller. > {code} > java.util.NoSuchElementException: key not found: UpdateFeatures > at scala.collection.MapOps.default(Map.scala:274) > at scala.collection.MapOps.default$(Map.scala:273) > at scala.collection.AbstractMap.default(Map.scala:405) > at scala.collection.mutable.HashMap.apply(HashMap.scala:425) > at kafka.network.RequestChannel$Metrics.apply(RequestChannel.scala:74) > at > kafka.network.RequestChannel.$anonfun$updateErrorMetrics$1(RequestChannel.scala:458) > at > kafka.network.RequestChannel.$anonfun$updateErrorMetrics$1$adapted(RequestChannel.scala:457) > at > kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355) > at > scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309) > at > kafka.network.RequestChannel.updateErrorMetrics(RequestChannel.scala:457) > at kafka.network.RequestChannel.sendResponse(RequestChannel.scala:388) > at > kafka.server.RequestHandlerHelper.sendErrorOrCloseConnection(RequestHandlerHelper.scala:93) > at > 
kafka.server.RequestHandlerHelper.sendErrorResponseMaybeThrottle(RequestHandlerHelper.scala:121) > at > kafka.server.RequestHandlerHelper.handleError(RequestHandlerHelper.scala:78) > at kafka.server.ControllerApis.handle(ControllerApis.scala:116) > at > kafka.server.ControllerApis.$anonfun$handleEnvelopeRequest$1(ControllerApis.scala:125) > at > kafka.server.ControllerApis.$anonfun$handleEnvelopeRequest$1$adapted(ControllerApis.scala:125) > at > kafka.server.EnvelopeUtils$.handleEnvelopeRequest(EnvelopeUtils.scala:65) > at > kafka.server.ControllerApis.handleEnvelopeRequest(ControllerApis.scala:125) > at kafka.server.ControllerApis.handle(ControllerApis.scala:103) > at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:75) > at java.lang.Thread.run(Thread.java:748) > {code} > This is due to a bug in the metrics code in RequestChannel. > The result is that the request fails, but no indication is given that it was > due to an unsupported API on either the broker, controller, or client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-13166) EOFException when Controller handles unknown API
[ https://issues.apache.org/jira/browse/KAFKA-13166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-13166. - Resolution: Fixed > EOFException when Controller handles unknown API > > > Key: KAFKA-13166 > URL: https://issues.apache.org/jira/browse/KAFKA-13166 > Project: Kafka > Issue Type: Bug >Reporter: David Arthur >Assignee: David Arthur >Priority: Minor > Fix For: 3.3.0, 3.4.0 > > > When ControllerApis handles an unsupported RPC, it silently drops the request > due to an unhandled exception. > The following stack trace was manually printed since this exception was > suppressed on the controller. > {code} > java.util.NoSuchElementException: key not found: UpdateFeatures > at scala.collection.MapOps.default(Map.scala:274) > at scala.collection.MapOps.default$(Map.scala:273) > at scala.collection.AbstractMap.default(Map.scala:405) > at scala.collection.mutable.HashMap.apply(HashMap.scala:425) > at kafka.network.RequestChannel$Metrics.apply(RequestChannel.scala:74) > at > kafka.network.RequestChannel.$anonfun$updateErrorMetrics$1(RequestChannel.scala:458) > at > kafka.network.RequestChannel.$anonfun$updateErrorMetrics$1$adapted(RequestChannel.scala:457) > at > kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359) > at > scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355) > at > scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309) > at > kafka.network.RequestChannel.updateErrorMetrics(RequestChannel.scala:457) > at kafka.network.RequestChannel.sendResponse(RequestChannel.scala:388) > at > kafka.server.RequestHandlerHelper.sendErrorOrCloseConnection(RequestHandlerHelper.scala:93) > at > 
kafka.server.RequestHandlerHelper.sendErrorResponseMaybeThrottle(RequestHandlerHelper.scala:121) > at > kafka.server.RequestHandlerHelper.handleError(RequestHandlerHelper.scala:78) > at kafka.server.ControllerApis.handle(ControllerApis.scala:116) > at > kafka.server.ControllerApis.$anonfun$handleEnvelopeRequest$1(ControllerApis.scala:125) > at > kafka.server.ControllerApis.$anonfun$handleEnvelopeRequest$1$adapted(ControllerApis.scala:125) > at > kafka.server.EnvelopeUtils$.handleEnvelopeRequest(EnvelopeUtils.scala:65) > at > kafka.server.ControllerApis.handleEnvelopeRequest(ControllerApis.scala:125) > at kafka.server.ControllerApis.handle(ControllerApis.scala:103) > at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:75) > at java.lang.Thread.run(Thread.java:748) > {code} > This is due to a bug in the metrics code in RequestChannel. > The result is that the request fails, but no indication is given that it was > due to an unsupported API on either the broker, controller, or client. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-13914) Implement kafka-metadata-quorum.sh
[ https://issues.apache.org/jira/browse/KAFKA-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-13914. - Fix Version/s: 3.3.0 Resolution: Fixed > Implement kafka-metadata-quorum.sh > -- > > Key: KAFKA-13914 > URL: https://issues.apache.org/jira/browse/KAFKA-13914 > Project: Kafka > Issue Type: Improvement >Reporter: Jason Gustafson >Assignee: dengziming >Priority: Major > Fix For: 3.3.0 > > > KIP-595 documents a tool for describing quorum status > `kafka-metadata-quorum.sh`: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-595%3A+A+Raft+Protocol+for+the+Metadata+Quorum#KIP595:ARaftProtocolfortheMetadataQuorum-ToolingSupport.] > We need to implement this. > Note that this depends on the Admin API for `DescribeQuorum`, which is > proposed in KIP-836: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-836%3A+Addition+of+Information+in+DescribeQuorumResponse+about+Voter+Lag.] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14167) Unexpected UNKNOWN_SERVER_ERROR raised from kraft controller
[ https://issues.apache.org/jira/browse/KAFKA-14167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14167. - Resolution: Fixed > Unexpected UNKNOWN_SERVER_ERROR raised from kraft controller > > > Key: KAFKA-14167 > URL: https://issues.apache.org/jira/browse/KAFKA-14167 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Blocker > Fix For: 3.3.0 > > > In `ControllerApis`, we have callbacks such as the following after completion: > {code:java} > controller.allocateProducerIds(context, allocatedProducerIdsRequest.data) > .handle[Unit] { (results, exception) => > if (exception != null) { > requestHelper.handleError(request, exception) > } else { > requestHelper.sendResponseMaybeThrottle(request, requestThrottleMs > => { > results.setThrottleTimeMs(requestThrottleMs) > new AllocateProducerIdsResponse(results) > }) > } > } {code} > What I see locally is that the underlying exception that gets passed to > `handle` always gets wrapped in a `CompletionException`. When passed to > `getErrorResponse`, this error will get converted to `UNKNOWN_SERVER_ERROR`. > For example, in this case, a `NOT_CONTROLLER` error returned from the > controller would be returned as `UNKNOWN_SERVER_ERROR`. It looks like there > are a few APIs that are potentially affected by this bug, such as > `DeleteTopics` and `UpdateFeatures`. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-13940) DescribeQuorum returns INVALID_REQUEST if not handled by leader
[ https://issues.apache.org/jira/browse/KAFKA-13940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-13940. - Resolution: Fixed > DescribeQuorum returns INVALID_REQUEST if not handled by leader > --- > > Key: KAFKA-13940 > URL: https://issues.apache.org/jira/browse/KAFKA-13940 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 3.3.0 > > > In `KafkaRaftClient.handleDescribeQuorum`, we currently return > INVALID_REQUEST if the node is not the current raft leader. This is > surprising and doesn't work with our general approach for retrying forwarded > APIs. In `BrokerToControllerChannelManager`, we only retry after > `NOT_CONTROLLER` errors. It would be more consistent with the other Raft APIs > if we returned NOT_LEADER_OR_FOLLOWER, but that also means we need additional > logic in `BrokerToControllerChannelManager` to handle that error and retry > correctly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
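The retry behavior discussed above can be summarized as a predicate over the returned error code: today the forwarding channel re-discovers and retries only on NOT_CONTROLLER, and handling DescribeQuorum's leader movement would mean extending that predicate. This is a hypothetical sketch with error names as strings; the real code uses the `Errors` enum.

```java
public class ForwardingRetryPolicy {
    // BrokerToControllerChannelManager currently retries only after
    // NOT_CONTROLLER; accepting NOT_LEADER_OR_FOLLOWER from the Raft-style
    // DescribeQuorum API would require the additional case below.
    static boolean shouldRetryAfterRediscovery(String error) {
        return error.equals("NOT_CONTROLLER") || error.equals("NOT_LEADER_OR_FOLLOWER");
    }
}
```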
[jira] [Updated] (KAFKA-14167) Unexpected UNKNOWN_SERVER_ERROR raised from kraft controller
[ https://issues.apache.org/jira/browse/KAFKA-14167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14167: Priority: Blocker (was: Major) > Unexpected UNKNOWN_SERVER_ERROR raised from kraft controller > > > Key: KAFKA-14167 > URL: https://issues.apache.org/jira/browse/KAFKA-14167 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Blocker > Fix For: 3.3.0 > > > In `ControllerApis`, we have callbacks such as the following after completion: > {code:java} > controller.allocateProducerIds(context, allocatedProducerIdsRequest.data) > .handle[Unit] { (results, exception) => > if (exception != null) { > requestHelper.handleError(request, exception) > } else { > requestHelper.sendResponseMaybeThrottle(request, requestThrottleMs > => { > results.setThrottleTimeMs(requestThrottleMs) > new AllocateProducerIdsResponse(results) > }) > } > } {code} > What I see locally is that the underlying exception that gets passed to > `handle` always gets wrapped in a `CompletionException`. When passed to > `getErrorResponse`, this error will get converted to `UNKNOWN_SERVER_ERROR`. > For example, in this case, a `NOT_CONTROLLER` error returned from the > controller would be returned as `UNKNOWN_SERVER_ERROR`. It looks like there > are a few APIs that are potentially affected by this bug, such as > `DeleteTopics` and `UpdateFeatures`. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14167) Unexpected UNKNOWN_SERVER_ERROR raised from kraft controller
Jason Gustafson created KAFKA-14167: --- Summary: Unexpected UNKNOWN_SERVER_ERROR raised from kraft controller Key: KAFKA-14167 URL: https://issues.apache.org/jira/browse/KAFKA-14167 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson In `ControllerApis`, we have callbacks such as the following after completion: {code:java} controller.allocateProducerIds(context, allocatedProducerIdsRequest.data) .handle[Unit] { (results, exception) => if (exception != null) { requestHelper.handleError(request, exception) } else { requestHelper.sendResponseMaybeThrottle(request, requestThrottleMs => { results.setThrottleTimeMs(requestThrottleMs) new AllocateProducerIdsResponse(results) }) } } {code} What I see locally is that the underlying exception that gets passed to `handle` always gets wrapped in a `CompletionException`. When passed to `getErrorResponse`, this error will get converted to `UNKNOWN_SERVER_ERROR`. For example, in this case, a `NOT_CONTROLLER` error returned from the controller would be returned as `UNKNOWN_SERVER_ERROR`. It looks like there are a few APIs that are potentially affected by this bug, such as `DeleteTopics` and `UpdateFeatures`. -- This message was sent by Atlassian Jira (v8.20.10#820010)
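The wrapping behavior described here is standard `CompletableFuture` semantics and can be reproduced with the JDK alone: an exception thrown inside an async stage reaches `handle` wrapped in a `CompletionException`. The `unwrap` helper below is a hypothetical sketch of the kind of cause-unwrapping the error mapping needs before translating to a Kafka error code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class UnwrapDemo {
    // Error mapping must unwrap the CompletionException cause first;
    // otherwise a NOT_CONTROLLER-style exception is not recognized and
    // falls through to UNKNOWN_SERVER_ERROR.
    static Throwable unwrap(Throwable t) {
        return (t instanceof CompletionException && t.getCause() != null) ? t.getCause() : t;
    }

    public static void main(String[] args) {
        Throwable seen = CompletableFuture
            .supplyAsync(() -> { throw new IllegalStateException("NOT_CONTROLLER"); })
            .handle((result, error) -> error)
            .join();
        System.out.println(seen.getClass().getSimpleName());          // CompletionException
        System.out.println(unwrap(seen).getClass().getSimpleName());  // IllegalStateException
    }
}
```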
[jira] [Updated] (KAFKA-14167) Unexpected UNKNOWN_SERVER_ERROR raised from kraft controller
[ https://issues.apache.org/jira/browse/KAFKA-14167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14167: Fix Version/s: 3.3.0 > Unexpected UNKNOWN_SERVER_ERROR raised from kraft controller > > > Key: KAFKA-14167 > URL: https://issues.apache.org/jira/browse/KAFKA-14167 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > Fix For: 3.3.0 > > > In `ControllerApis`, we have callbacks such as the following after completion: > {code:java} > controller.allocateProducerIds(context, allocatedProducerIdsRequest.data) > .handle[Unit] { (results, exception) => > if (exception != null) { > requestHelper.handleError(request, exception) > } else { > requestHelper.sendResponseMaybeThrottle(request, requestThrottleMs > => { > results.setThrottleTimeMs(requestThrottleMs) > new AllocateProducerIdsResponse(results) > }) > } > } {code} > What I see locally is that the underlying exception that gets passed to > `handle` always gets wrapped in a `CompletionException`. When passed to > `getErrorResponse`, this error will get converted to `UNKNOWN_SERVER_ERROR`. > For example, in this case, a `NOT_CONTROLLER` error returned from the > controller would be returned as `UNKNOWN_SERVER_ERROR`. It looks like there > are a few APIs that are potentially affected by this bug, such as > `DeleteTopics` and `UpdateFeatures`. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-13940) DescribeQuorum returns INVALID_REQUEST if not handled by leader
[ https://issues.apache.org/jira/browse/KAFKA-13940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson reassigned KAFKA-13940: --- Assignee: Jason Gustafson > DescribeQuorum returns INVALID_REQUEST if not handled by leader > --- > > Key: KAFKA-13940 > URL: https://issues.apache.org/jira/browse/KAFKA-13940 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > > In `KafkaRaftClient.handleDescribeQuorum`, we currently return > INVALID_REQUEST if the node is not the current raft leader. This is > surprising and doesn't work with our general approach for retrying forwarded > APIs. In `BrokerToControllerChannelManager`, we only retry after > `NOT_CONTROLLER` errors. It would be more consistent with the other Raft APIs > if we returned NOT_LEADER_OR_FOLLOWER, but that also means we need additional > logic in `BrokerToControllerChannelManager` to handle that error and retry > correctly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14154) Persistent URP after controller soft failure
[ https://issues.apache.org/jira/browse/KAFKA-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14154. - Resolution: Fixed > Persistent URP after controller soft failure > > > Key: KAFKA-14154 > URL: https://issues.apache.org/jira/browse/KAFKA-14154 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Blocker > Fix For: 3.3.0 > > > We ran into a scenario where a partition leader was unable to expand the ISR > after a soft controller failover. Here is what happened: > Initial state: leader=1, isr=[1,2], leader epoch=10. Broker 1 is acting as > the current controller. > 1. Broker 1 loses its session in Zookeeper. > 2. Broker 2 becomes the new controller. > 3. During initialization, controller 2 removes 1 from the ISR. So state is > updated: leader=2, isr=[2], leader epoch=11. > 4. Broker 2 receives `LeaderAndIsr` from the new controller with leader > epoch=11. > 5. Broker 2 immediately tries to add replica 1 back to the ISR since it is > still fetching and is caught up. However, the > `BrokerToControllerChannelManager` is still pointed at controller 1, so that > is where the `AlterPartition` is sent. > 6. Controller 1 does not yet realize that it is not the controller, so it > processes the `AlterPartition` request. It sees the leader epoch of 11, which > is higher than what it has in its own context. Following changes to the > `AlterPartition` validation in > [https://github.com/apache/kafka/pull/12032/files], the controller returns > FENCED_LEADER_EPOCH. > 7. After receiving the FENCED_LEADER_EPOCH from the old controller, the > leader is stuck because it assumes that the error implies that another > LeaderAndIsr request should be sent. > Prior to > [https://github.com/apache/kafka/pull/12032/files], > the way we handled this case was a little different. 
We only verified that > the leader epoch in the request was at least as large as the current epoch in > the controller context. Anything higher was accepted. The controller would > have attempted to write the updated state to Zookeeper. This update would > have failed because of the controller epoch check, however, we would have > returned NOT_CONTROLLER in this case, which is handled in > `AlterPartitionManager`. > It is tempting to revert the logic, but the risk is in the idempotency check: > [https://github.com/apache/kafka/pull/12032/files#diff-3e042c962e80577a4cc9bbcccf0950651c6b312097a86164af50003c00c50d37L2369.] > If the AlterPartition request happened to match the state inside the old > controller, the controller would consider the update successful and return no > error. But if its state was already stale at that point, then that might > cause the leader to incorrectly assume that the state had been updated. > One way to fix this problem without weakening the validation is to rely on > the controller epoch in `AlterPartitionManager`. When we discover a new > controller, we also discover its epoch, so we can pass that through. The > `LeaderAndIsr` request already includes the controller epoch of the > controller that sent it and we already propagate this through to > `AlterPartition.submit`. Hence all we need to do is verify that the epoch of > the current controller target is at least as large as the one discovered > through the `LeaderAndIsr`. -- This message was sent by Atlassian Jira (v8.20.10#820010)
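[Editor's note] The epoch comparison proposed at the end of KAFKA-14154 can be sketched as below. This is an illustrative model, not the actual `AlterPartitionManager` code; the class and method names are assumptions.

```java
// Sketch of the proposed guard: only send an AlterPartition request when the
// epoch of the controller we are currently pointed at is at least as large as
// the controller epoch carried by the LeaderAndIsr request that triggered the
// update. Otherwise we are still talking to a zombie controller and should
// wait for controller rediscovery instead of sending.
final class AlterPartitionEpochGuard {
    static boolean safeToSend(int targetControllerEpoch, int leaderAndIsrControllerEpoch) {
        return targetControllerEpoch >= leaderAndIsrControllerEpoch;
    }
}
```

In the scenario above, broker 2 learned controller epoch 11 from the `LeaderAndIsr` but was still pointed at the old controller (epoch 10), so the guard would hold the request back until controller 2 is discovered.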
[jira] [Created] (KAFKA-14166) Consistent toString implementations for byte arrays in generated messages
Jason Gustafson created KAFKA-14166: --- Summary: Consistent toString implementations for byte arrays in generated messages Key: KAFKA-14166 URL: https://issues.apache.org/jira/browse/KAFKA-14166 Project: Kafka Issue Type: Improvement Reporter: Jason Gustafson In the generated `toString()` implementations for message objects (such as protocol RPCs), we are a little inconsistent in how we display types with raw bytes. If the type is `Array[Byte]`, then we use Arrays.toString. If the type is `ByteBuffer` (i.e. when `zeroCopy` is set), then we use the corresponding `ByteBuffer.toString`, which is not often useful. We should try to be consistent. By default, it is probably not useful to print the full array contents, but we might print summary information (e.g. size, checksum). -- This message was sent by Atlassian Jira (v8.20.10#820010)
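[Editor's note] One possible shape for the summary information suggested in KAFKA-14166 is sketched below. This is not the generated code; the `summarize` helper is hypothetical.

```java
import java.util.zip.CRC32;

// Hypothetical helper: render a raw byte field as summary information
// (size and CRC32 checksum) rather than dumping its full contents, so both
// byte[] and ByteBuffer-backed fields could print consistently.
final class ByteFieldSummary {
    static String summarize(byte[] data) {
        if (data == null) {
            return "null";
        }
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return "bytes(size=" + data.length + ", crc32=" + crc.getValue() + ")";
    }
}
```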
[jira] [Resolved] (KAFKA-13986) DescribeQuorum does not return the observers (brokers) for the Metadata log
[ https://issues.apache.org/jira/browse/KAFKA-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-13986. - Fix Version/s: 3.3.0 Resolution: Fixed > DescribeQuorum does not return the observers (brokers) for the Metadata log > --- > > Key: KAFKA-13986 > URL: https://issues.apache.org/jira/browse/KAFKA-13986 > Project: Kafka > Issue Type: Improvement > Components: kraft >Affects Versions: 3.0.0, 3.0.1 >Reporter: Niket Goel >Assignee: Niket Goel >Priority: Major > Fix For: 3.3.0 > > > h2. Background > While working on the [PR|https://github.com/apache/kafka/pull/12206] for > KIP-836, we realized that the `DescribeQuorum` API does not return the > brokers as observers for the metadata log. > As noted by [~dengziming]: > _We set nodeId=-1 if it's a broker so observers.size==0_ > The related code is: > [https://github.com/apache/kafka/blob/4c9eeef5b2dff9a4f0977fbc5ac7eaaf930d0d0e/core/src/main/scala/kafka/raft/RaftManager.scala#L185-L189] > {code:java} > val nodeId = if (config.processRoles.contains(ControllerRole)) { > OptionalInt.of(config.nodeId) > } else { > OptionalInt.empty() > } > {code} > h2. ToDo > We should fix this and have the DescribeQuorum API return the brokers as > observers for the metadata log. -- This message was sent by Atlassian Jira (v8.20.10#820010)
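[Editor's note] The quoted `RaftManager` snippet and the direction of the fix can be modeled as below. This is an illustrative Java restatement, not the actual Scala code; the class and method names are assumptions.

```java
import java.util.OptionalInt;

// Model of the behavior change KAFKA-13986 asks for: before the fix, only
// controller processes reported a node id, so brokers appeared with nodeId=-1
// and were dropped from the observer list. After the fix, every process
// reports its configured node id, making brokers visible as observers.
final class ObserverNodeId {
    // Before: brokers report no id at all.
    static OptionalInt beforeFix(boolean isController, int configuredNodeId) {
        return isController ? OptionalInt.of(configuredNodeId) : OptionalInt.empty();
    }

    // After: the id is reported regardless of the process roles.
    static OptionalInt afterFix(boolean isController, int configuredNodeId) {
        return OptionalInt.of(configuredNodeId);
    }
}
```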
[jira] [Resolved] (KAFKA-14163) Build failing in streams-scala:compileScala due to zinc compiler cache
[ https://issues.apache.org/jira/browse/KAFKA-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14163. - Resolution: Workaround > Build failing in streams-scala:compileScala due to zinc compiler cache > -- > > Key: KAFKA-14163 > URL: https://issues.apache.org/jira/browse/KAFKA-14163 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > > I have been seeing builds failing recently with the following error: > {code:java} > [2022-08-11T17:08:22.279Z] * What went wrong: > [2022-08-11T17:08:22.279Z] Execution failed for task > ':streams:streams-scala:compileScala'. > [2022-08-11T17:08:22.279Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 > compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It > is currently in use by another Gradle instance. > [2022-08-11T17:08:22.279Z] Owner PID: 25008 > [2022-08-11T17:08:22.279Z] Our PID: 25524 > [2022-08-11T17:08:22.279Z] Owner Operation: > [2022-08-11T17:08:22.279Z] Our operation: > [2022-08-11T17:08:22.279Z] Lock file: > /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock > {code} > And another: > {code:java} > [2022-08-10T21:30:41.779Z] * What went wrong: > [2022-08-10T21:30:41.779Z] Execution failed for task > ':streams:streams-scala:compileScala'. > [2022-08-10T21:30:41.779Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 > compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It > is currently in use by another Gradle instance. > [2022-08-10T21:30:41.779Z] Owner PID: 11022 > [2022-08-10T21:30:41.779Z] Our PID: 11766 > [2022-08-10T21:30:41.779Z] Owner Operation: > [2022-08-10T21:30:41.779Z] Our operation: > [2022-08-10T21:30:41.779Z] Lock file: > /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14163) Build failing in streams-scala:compileScala due to zinc compiler cache
[ https://issues.apache.org/jira/browse/KAFKA-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14163: Summary: Build failing in streams-scala:compileScala due to zinc compiler cache (was: Build failing in scala:compileScala due to zinc compiler cache) > Build failing in streams-scala:compileScala due to zinc compiler cache > -- > > Key: KAFKA-14163 > URL: https://issues.apache.org/jira/browse/KAFKA-14163 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > > I have been seeing builds failing recently with the following error: > {code:java} > [2022-08-11T17:08:22.279Z] * What went wrong: > [2022-08-11T17:08:22.279Z] Execution failed for task > ':streams:streams-scala:compileScala'. > [2022-08-11T17:08:22.279Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 > compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It > is currently in use by another Gradle instance. > [2022-08-11T17:08:22.279Z] Owner PID: 25008 > [2022-08-11T17:08:22.279Z] Our PID: 25524 > [2022-08-11T17:08:22.279Z] Owner Operation: > [2022-08-11T17:08:22.279Z] Our operation: > [2022-08-11T17:08:22.279Z] Lock file: > /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock > {code} > And another: > {code:java} > [2022-08-10T21:30:41.779Z] * What went wrong: > [2022-08-10T21:30:41.779Z] Execution failed for task > ':streams:streams-scala:compileScala'. > [2022-08-10T21:30:41.779Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 > compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It > is currently in use by another Gradle instance. 
> [2022-08-10T21:30:41.779Z] Owner PID: 11022 > [2022-08-10T21:30:41.779Z] Our PID: 11766 > [2022-08-10T21:30:41.779Z] Owner Operation: > [2022-08-10T21:30:41.779Z] Our operation: > [2022-08-10T21:30:41.779Z] Lock file: > /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14163) Build failing in scala:compileScala due to zinc compiler cache
[ https://issues.apache.org/jira/browse/KAFKA-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14163: Description: I have been seeing builds failing recently with the following error: {code:java} [2022-08-11T17:08:22.279Z] * What went wrong: [2022-08-11T17:08:22.279Z] Execution failed for task ':streams:streams-scala:compileScala'. [2022-08-11T17:08:22.279Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It is currently in use by another Gradle instance. [2022-08-11T17:08:22.279Z] Owner PID: 25008 [2022-08-11T17:08:22.279Z] Our PID: 25524 [2022-08-11T17:08:22.279Z] Owner Operation: [2022-08-11T17:08:22.279Z] Our operation: [2022-08-11T17:08:22.279Z] Lock file: /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock {code} And another: {code:java} [2022-08-10T21:30:41.779Z] * What went wrong: [2022-08-10T21:30:41.779Z] Execution failed for task ':streams:streams-scala:compileScala'. [2022-08-10T21:30:41.779Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It is currently in use by another Gradle instance. [2022-08-10T21:30:41.779Z] Owner PID: 11022 [2022-08-10T21:30:41.779Z] Our PID: 11766 [2022-08-10T21:30:41.779Z] Owner Operation: [2022-08-10T21:30:41.779Z] Our operation: [2022-08-10T21:30:41.779Z] Lock file: /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock {code} was: I have been seeing builds failing recently with the following error: {code:java} [2022-08-10T21:30:41.779Z] * What went wrong: [2022-08-10T21:30:41.779Z] Execution failed for task ':streams:streams-scala:compileScala'. [2022-08-10T21:30:41.779Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It is currently in use by another Gradle instance. 
{code} > Build failing in scala:compileScala due to zinc compiler cache > -- > > Key: KAFKA-14163 > URL: https://issues.apache.org/jira/browse/KAFKA-14163 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > > I have been seeing builds failing recently with the following error: > > {code:java} > [2022-08-11T17:08:22.279Z] * What went wrong: > [2022-08-11T17:08:22.279Z] Execution failed for task > ':streams:streams-scala:compileScala'. > [2022-08-11T17:08:22.279Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 > compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It > is currently in use by another Gradle instance. > [2022-08-11T17:08:22.279Z] Owner PID: 25008 > [2022-08-11T17:08:22.279Z] Our PID: 25524 > [2022-08-11T17:08:22.279Z] Owner Operation: > [2022-08-11T17:08:22.279Z] Our operation: > [2022-08-11T17:08:22.279Z] Lock file: > /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock > {code} > And another: > {code:java} > [2022-08-10T21:30:41.779Z] * What went wrong: > [2022-08-10T21:30:41.779Z] Execution failed for task > ':streams:streams-scala:compileScala'. > [2022-08-10T21:30:41.779Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 > compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It > is currently in use by another Gradle instance. > [2022-08-10T21:30:41.779Z] Owner PID: 11022 > [2022-08-10T21:30:41.779Z] Our PID: 11766 > [2022-08-10T21:30:41.779Z] Owner Operation: > [2022-08-10T21:30:41.779Z] Our operation: > [2022-08-10T21:30:41.779Z] Lock file: > /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14163) Build failing in scala:compileScala due to zinc compiler cache
[ https://issues.apache.org/jira/browse/KAFKA-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14163: Description: I have been seeing builds failing recently with the following error: {code:java} [2022-08-11T17:08:22.279Z] * What went wrong: [2022-08-11T17:08:22.279Z] Execution failed for task ':streams:streams-scala:compileScala'. [2022-08-11T17:08:22.279Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It is currently in use by another Gradle instance. [2022-08-11T17:08:22.279Z] Owner PID: 25008 [2022-08-11T17:08:22.279Z] Our PID: 25524 [2022-08-11T17:08:22.279Z] Owner Operation: [2022-08-11T17:08:22.279Z] Our operation: [2022-08-11T17:08:22.279Z] Lock file: /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock {code} And another: {code:java} [2022-08-10T21:30:41.779Z] * What went wrong: [2022-08-10T21:30:41.779Z] Execution failed for task ':streams:streams-scala:compileScala'. [2022-08-10T21:30:41.779Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It is currently in use by another Gradle instance. [2022-08-10T21:30:41.779Z] Owner PID: 11022 [2022-08-10T21:30:41.779Z] Our PID: 11766 [2022-08-10T21:30:41.779Z] Owner Operation: [2022-08-10T21:30:41.779Z] Our operation: [2022-08-10T21:30:41.779Z] Lock file: /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock {code} was: I have been seeing builds failing recently with the following error: {code:java} [2022-08-11T17:08:22.279Z] * What went wrong: [2022-08-11T17:08:22.279Z] Execution failed for task ':streams:streams-scala:compileScala'. [2022-08-11T17:08:22.279Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It is currently in use by another Gradle instance. 
[2022-08-11T17:08:22.279Z] Owner PID: 25008 [2022-08-11T17:08:22.279Z] Our PID: 25524 [2022-08-11T17:08:22.279Z] Owner Operation: [2022-08-11T17:08:22.279Z] Our operation: [2022-08-11T17:08:22.279Z] Lock file: /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock {code} And another: {code:java} [2022-08-10T21:30:41.779Z] * What went wrong: [2022-08-10T21:30:41.779Z] Execution failed for task ':streams:streams-scala:compileScala'. [2022-08-10T21:30:41.779Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It is currently in use by another Gradle instance. [2022-08-10T21:30:41.779Z] Owner PID: 11022 [2022-08-10T21:30:41.779Z] Our PID: 11766 [2022-08-10T21:30:41.779Z] Owner Operation: [2022-08-10T21:30:41.779Z] Our operation: [2022-08-10T21:30:41.779Z] Lock file: /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock {code} > Build failing in scala:compileScala due to zinc compiler cache > -- > > Key: KAFKA-14163 > URL: https://issues.apache.org/jira/browse/KAFKA-14163 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Priority: Major > > I have been seeing builds failing recently with the following error: > {code:java} > [2022-08-11T17:08:22.279Z] * What went wrong: > [2022-08-11T17:08:22.279Z] Execution failed for task > ':streams:streams-scala:compileScala'. > [2022-08-11T17:08:22.279Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 > compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It > is currently in use by another Gradle instance. 
> [2022-08-11T17:08:22.279Z] Owner PID: 25008 > [2022-08-11T17:08:22.279Z] Our PID: 25524 > [2022-08-11T17:08:22.279Z] Owner Operation: > [2022-08-11T17:08:22.279Z] Our operation: > [2022-08-11T17:08:22.279Z] Lock file: > /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock > {code} > And another: > {code:java} > [2022-08-10T21:30:41.779Z] * What went wrong: > [2022-08-10T21:30:41.779Z] Execution failed for task > ':streams:streams-scala:compileScala'. > [2022-08-10T21:30:41.779Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 > compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It > is currently in use by another Gradle instance. > [2022-08-10T21:30:41.779Z] Owner PID: 11022 > [2022-08-10T21:30:41.779Z] Our PID: 11766 > [2022-08-10T21:30:41.779Z] Owner Operation: > [2022-08-10T21:30:41.779Z] Our operation: > [2022-08-10T21:30:41.779Z] Lock file: > /home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17/zinc-1.6.1_2.13.8_17.lock > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14163) Build failing in scala:compileScala due to zinc compiler cache
Jason Gustafson created KAFKA-14163: --- Summary: Build failing in scala:compileScala due to zinc compiler cache Key: KAFKA-14163 URL: https://issues.apache.org/jira/browse/KAFKA-14163 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson I have been seeing builds failing recently with the following error: {code:java} [2022-08-10T21:30:41.779Z] * What went wrong: [2022-08-10T21:30:41.779Z] Execution failed for task ':streams:streams-scala:compileScala'. [2022-08-10T21:30:41.779Z] > Timeout waiting to lock zinc-1.6.1_2.13.8_17 compiler cache (/home/jenkins/.gradle/caches/7.5.1/zinc-1.6.1_2.13.8_17). It is currently in use by another Gradle instance. {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-14154) Persistent URP after controller soft failure
[ https://issues.apache.org/jira/browse/KAFKA-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14154: Description: We ran into a scenario where a partition leader was unable to expand the ISR after a soft controller failover. Here is what happened: Initial state: leader=1, isr=[1,2], leader epoch=10. Broker 1 is acting as the current controller. 1. Broker 1 loses its session in Zookeeper. 2. Broker 2 becomes the new controller. 3. During initialization, controller 2 removes 1 from the ISR. So state is updated: leader=2, isr=[2], leader epoch=11. 4. Broker 2 receives `LeaderAndIsr` from the new controller with leader epoch=11. 5. Broker 2 immediately tries to add replica 1 back to the ISR since it is still fetching and is caught up. However, the `BrokerToControllerChannelManager` is still pointed at controller 1, so that is where the `AlterPartition` is sent. 6. Controller 1 does not yet realize that it is not the controller, so it processes the `AlterPartition` request. It sees the leader epoch of 11, which is higher than what it has in its own context. Following changes to the `AlterPartition` validation in [https://github.com/apache/kafka/pull/12032/files,] the controller returns FENCED_LEADER_EPOCH. 7. After receiving the FENCED_LEADER_EPOCH from the old controller, the leader is stuck because it assumes that the error implies that another LeaderAndIsr request should be sent. Prior to [https://github.com/apache/kafka/pull/12032/files|https://github.com/apache/kafka/pull/12032/files,], the way we handled this case was a little different. We only verified that the leader epoch in the request was at least as large as the current epoch in the controller context. Anything higher was accepted. The controller would have attempted to write the updated state to Zookeeper. 
This update would have failed because of the controller epoch check, however, we would have returned NOT_CONTROLLER in this case, which is handled in `AlterPartitionManager`. It is tempting to revert the logic, but the risk is in the idempotency check: [https://github.com/apache/kafka/pull/12032/files#diff-3e042c962e80577a4cc9bbcccf0950651c6b312097a86164af50003c00c50d37L2369.] If the AlterPartition request happened to match the state inside the old controller, the controller would consider the update successful and return no error. But if its state was already stale at that point, then that might cause the leader to incorrectly assume that the state had been updated. One way to fix this problem without weakening the validation is to rely on the controller epoch in `AlterPartitionManager`. When we discover a new controller, we also discover its epoch, so we can pass that through. The `LeaderAndIsr` request already includes the controller epoch of the controller that sent it and we already propagate this through to `AlterPartition.submit`. Hence all we need to do is verify that the epoch of the current controller target is at least as large as the one discovered through the `LeaderAndIsr`. was: We ran into a scenario where a partition leader was unable to expand the ISR after a soft controller failover. Here is what happened: Initial state: leader=1, isr=[1,2], leader epoch=10. Broker 1 is acting as the current controller. 1. Broker 1 loses its session in Zookeeper. 2. Broker 2 becomes the new controller. 3. During initialization, controller 2 removes 1 from the ISR. So state is updated: leader=2, isr=[1, 2], leader epoch=11. 4. Broker 2 receives `LeaderAndIsr` from the new controller with leader epoch=11. 5. Broker 2 immediately tries to add replica 1 back to the ISR since it is still fetching and is caught up. However, the `BrokerToControllerChannelManager` is still pointed at controller 1, so that is where the `AlterPartition` is sent. 6. 
Controller 1 does not yet realize that it is not the controller, so it processes the `AlterPartition` request. It sees the leader epoch of 11, which is higher than what it has in its own context. Following changes to the `AlterPartition` validation in [https://github.com/apache/kafka/pull/12032/files,] the controller returns FENCED_LEADER_EPOCH. 7. After receiving the FENCED_LEADER_EPOCH from the old controller, the leader is stuck because it assumes that the error implies that another LeaderAndIsr request should be sent. Prior to [https://github.com/apache/kafka/pull/12032/files|https://github.com/apache/kafka/pull/12032/files,], the way we handled this case was a little different. We only verified that the leader epoch in the request was at least as large as the current epoch in the controller context. Anything higher was accepted. The controller would have attempted to write the updated state to Zookeeper. This update would have failed because of the controller epoch check, however, we would have returned NOT_CONTROLLER
[jira] [Updated] (KAFKA-14154) Persistent URP after controller soft failure
[ https://issues.apache.org/jira/browse/KAFKA-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14154: Fix Version/s: 3.3.0 > Persistent URP after controller soft failure > > > Key: KAFKA-14154 > URL: https://issues.apache.org/jira/browse/KAFKA-14154 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Blocker > Fix For: 3.3.0 > > > We ran into a scenario where a partition leader was unable to expand the ISR > after a soft controller failover. Here is what happened: > Initial state: leader=1, isr=[1,2], leader epoch=10. Broker 1 is acting as > the current controller. > 1. Broker 1 loses its session in Zookeeper. > 2. Broker 2 becomes the new controller. > 3. During initialization, controller 2 removes 1 from the ISR. So state is > updated: leader=2, isr=[1, 2], leader epoch=11. > 4. Broker 2 receives `LeaderAndIsr` from the new controller with leader > epoch=11. > 5. Broker 2 immediately tries to add replica 1 back to the ISR since it is > still fetching and is caught up. However, the > `BrokerToControllerChannelManager` is still pointed at controller 1, so that > is where the `AlterPartition` is sent. > 6. Controller 1 does not yet realize that it is not the controller, so it > processes the `AlterPartition` request. It sees the leader epoch of 11, which > is higher than what it has in its own context. Following changes to the > `AlterPartition` validation in > [https://github.com/apache/kafka/pull/12032/files,] the controller returns > FENCED_LEADER_EPOCH. > 7. After receiving the FENCED_LEADER_EPOCH from the old controller, the > leader is stuck because it assumes that the error implies that another > LeaderAndIsr request should be sent. > Prior to > [https://github.com/apache/kafka/pull/12032/files|https://github.com/apache/kafka/pull/12032/files,], > the way we handled this case was a little different. 
We only verified that > the leader epoch in the request was at least as large as the current epoch in > the controller context. Anything higher was accepted. The controller would > have attempted to write the updated state to Zookeeper. This update would > have failed because of the controller epoch check, however, we would have > returned NOT_CONTROLLER in this case, which is handled in > `AlterPartitionManager`. > It is tempting to revert the logic, but the risk is in the idempotency check: > [https://github.com/apache/kafka/pull/12032/files#diff-3e042c962e80577a4cc9bbcccf0950651c6b312097a86164af50003c00c50d37L2369.] > If the AlterPartition request happened to match the state inside the old > controller, the controller would consider the update successful and return no > error. But if its state was already stale at that point, then that might > cause the leader to incorrectly assume that the state had been updated. > One way to fix this problem without weakening the validation is to rely on > the controller epoch in `AlterPartitionManager`. When we discover a new > controller, we also discover its epoch, so we can pass that through. The > `LeaderAndIsr` request already includes the controller epoch of the > controller that sent it and we already propagate this through to > `AlterPartition.submit`. Hence all we need to do is verify that the epoch of > the current controller target is at least as large as the one discovered > through the `LeaderAndIsr`. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14154) Persistent URP after controller soft failure
Jason Gustafson created KAFKA-14154: --- Summary: Persistent URP after controller soft failure Key: KAFKA-14154 URL: https://issues.apache.org/jira/browse/KAFKA-14154 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson Assignee: Jason Gustafson We ran into a scenario where a partition leader was unable to expand the ISR after a soft controller failover. Here is what happened: Initial state: leader=1, isr=[1,2], leader epoch=10. Broker 1 is acting as the current controller. 1. Broker 1 loses its session in Zookeeper. 2. Broker 2 becomes the new controller. 3. During initialization, controller 2 removes 1 from the ISR. So state is updated: leader=2, isr=[1, 2], leader epoch=11. 4. Broker 2 receives `LeaderAndIsr` from the new controller with leader epoch=11. 5. Broker 2 immediately tries to add replica 1 back to the ISR since it is still fetching and is caught up. However, the `BrokerToControllerChannelManager` is still pointed at controller 1, so that is where the `AlterPartition` is sent. 6. Controller 1 does not yet realize that it is not the controller, so it processes the `AlterPartition` request. It sees the leader epoch of 11, which is higher than what it has in its own context. Following changes to the `AlterPartition` validation in [https://github.com/apache/kafka/pull/12032/files,] the controller returns FENCED_LEADER_EPOCH. 7. After receiving the FENCED_LEADER_EPOCH from the old controller, the leader is stuck because it assumes that the error implies that another LeaderAndIsr request should be sent. Prior to [https://github.com/apache/kafka/pull/12032/files|https://github.com/apache/kafka/pull/12032/files,], the way we handled this case was a little different. We only verified that the leader epoch in the request was at least as large as the current epoch in the controller context. Anything higher was accepted. The controller would have attempted to write the updated state to Zookeeper. 
This update would have failed because of the controller epoch check, however, we would have returned NOT_CONTROLLER in this case, which is handled in `AlterPartitionManager`. It is tempting to revert the logic, but the risk is in the idempotency check: [https://github.com/apache/kafka/pull/12032/files#diff-3e042c962e80577a4cc9bbcccf0950651c6b312097a86164af50003c00c50d37L2369.] If the AlterPartition request happened to match the state inside the old controller, the controller would consider the update successful and return no error. But if its state was already stale at that point, then that might cause the leader to incorrectly assume that the state had been updated. One way to fix this problem without weakening the validation is to rely on the controller epoch in `AlterPartitionManager`. When we discover a new controller, we also discover its epoch, so we can pass that through. The `LeaderAndIsr` request already includes the controller epoch of the controller that sent it and we already propagate this through to `AlterPartition.submit`. Hence all we need to do is verify that the epoch of the current controller target is at least as large as the one discovered through the `LeaderAndIsr`. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14152) Add logic to fence kraft brokers which have fallen behind in replication
Jason Gustafson created KAFKA-14152: --- Summary: Add logic to fence kraft brokers which have fallen behind in replication Key: KAFKA-14152 URL: https://issues.apache.org/jira/browse/KAFKA-14152 Project: Kafka Issue Type: Improvement Reporter: Jason Gustafson When a kraft broker registers with the controller, it must catch up to the current metadata before it is unfenced. However, once it has been unfenced, it only needs to continue sending heartbeats to remain unfenced. It can fall arbitrarily behind in the replication of the metadata log and remain unfenced. We should consider whether there is an inverse condition that we can use to fence a broker that has fallen behind. -- This message was sent by Atlassian Jira (v8.20.10#820010)
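[Editor's note] The "inverse condition" KAFKA-14152 asks about could take roughly the following shape. This is a speculative sketch only; the inputs and threshold are assumptions, and no such check exists in Kafka as described in the ticket.

```java
// Hypothetical fencing condition: an unfenced broker whose metadata
// replication lag (distance between the committed metadata offset and the
// offset the broker has applied) exceeds an allowed bound would be fenced,
// mirroring the catch-up requirement enforced at registration time.
final class MetadataLagFence {
    static boolean shouldFence(long committedMetadataOffset,
                               long brokerAppliedOffset,
                               long maxAllowedLag) {
        return committedMetadataOffset - brokerAppliedOffset > maxAllowedLag;
    }
}
```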
[jira] [Resolved] (KAFKA-14144) AlterPartition is not idempotent when requests time out
[ https://issues.apache.org/jira/browse/KAFKA-14144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14144. - Resolution: Fixed > AlterPartition is not idempotent when requests time out > --- > > Key: KAFKA-14144 > URL: https://issues.apache.org/jira/browse/KAFKA-14144 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.3.0 >Reporter: David Mao >Assignee: David Mao >Priority: Blocker > Fix For: 3.3.0 > > > [https://github.com/apache/kafka/pull/12032] changed the validation order of > AlterPartition requests to fence requests with a stale partition epoch before > we compare the leader and ISR contents. > This results in a loss of idempotency if a leader does not receive an > AlterPartition response because retries will receive an > INVALID_UPDATE_VERSION error. -- This message was sent by Atlassian Jira (v8.20.10#820010)
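[Editor's note] The validation-order problem in KAFKA-14144 can be illustrated with a toy model (names and simplified inputs are assumptions, not Kafka's code):

```java
// Toy model: if the controller fences on a stale partition epoch before
// checking whether the request matches the already-applied state, a retry of
// an AlterPartition that actually succeeded (but whose response was lost)
// gets INVALID_UPDATE_VERSION instead of an idempotent success.
final class AlterPartitionValidation {
    enum Result { SUCCESS, INVALID_UPDATE_VERSION }

    // Epoch-first order: a retried, already-applied request is rejected.
    static Result epochFirst(int requestEpoch, int currentEpoch, boolean matchesCurrentState) {
        if (requestEpoch < currentEpoch) {
            return Result.INVALID_UPDATE_VERSION;
        }
        return Result.SUCCESS;
    }

    // Idempotent order: an exact match with current state succeeds even when
    // the request's partition epoch is stale, because the update already took.
    static Result idempotent(int requestEpoch, int currentEpoch, boolean matchesCurrentState) {
        if (matchesCurrentState) {
            return Result.SUCCESS;
        }
        if (requestEpoch < currentEpoch) {
            return Result.INVALID_UPDATE_VERSION;
        }
        return Result.SUCCESS;
    }
}
```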
[jira] [Resolved] (KAFKA-14104) Perform CRC validation on KRaft Batch Records and Snapshots
[ https://issues.apache.org/jira/browse/KAFKA-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14104. - Resolution: Fixed > Perform CRC validation on KRaft Batch Records and Snapshots > --- > > Key: KAFKA-14104 > URL: https://issues.apache.org/jira/browse/KAFKA-14104 > Project: Kafka > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Niket Goel >Assignee: Niket Goel >Priority: Blocker > Fix For: 3.3.0 > > > Today we stamp the BatchRecord header with a CRC [1] and verify that CRC > before the log is written to disk [2]. The CRC should also be verified when > the records are read back from disk. The same procedure should be > followed for KRaft snapshots as well. > [1] > [https://github.com/apache/kafka/blob/6b76c01cf895db0651e2cdcc07c2c392f00a8ceb/clients/src/main/java/org/apache/kafka/common/record/DefaultRecordBatch.java#L501] > > [2] > [https://github.com/apache/kafka/blob/679e9e0cee67e7d3d2ece204a421ea7da31d73e9/core/src/main/scala/kafka/log/UnifiedLog.scala#L1143] -- This message was sent by Atlassian Jira (v8.20.10#820010)
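[Editor's note] The read-path check requested in KAFKA-14104 amounts to recomputing the checksum on fetch and comparing it to the stored value. A minimal sketch (the byte range Kafka's `DefaultRecordBatch` actually covers is simplified here to an opaque payload):

```java
import java.util.zip.CRC32C;

// Sketch of read-path CRC validation: recompute CRC-32C (the algorithm Kafka
// uses for record batches) over the batch payload and compare with the value
// stored in the batch header when it was written.
final class BatchCrcCheck {
    static long crcOf(byte[] payload) {
        CRC32C crc = new CRC32C();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    static boolean isValid(byte[] payload, long storedCrc) {
        return crcOf(payload) == storedCrc;
    }
}
```

A corrupted byte flips the recomputed value, so a batch that passed validation at write time fails it after on-disk corruption.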
[jira] [Updated] (KAFKA-14139) Replaced disk can lead to loss of committed data even with non-empty ISR
[ https://issues.apache.org/jira/browse/KAFKA-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14139:

Description: We have been thinking about disk failure cases recently. Suppose that a disk has failed and the user needs to restart the disk from an empty state. The concern is whether this can lead to the unnecessary loss of committed data.

For normal topic partitions, removal from the ISR during controlled shutdown buys us some protection. After the replica is restarted, it must prove its state to the leader before it can be added back to the ISR. And it cannot become a leader until it does so. An obvious exception to this is when the replica is the last member in the ISR. In this case, the disk failure itself has compromised the committed data, so some amount of loss must be expected.

We have been considering other scenarios in which the loss of one disk can lead to data loss even when there are replicas remaining which have all of the committed entries. One such scenario is this: Suppose we have a partition with two replicas: A and B. Initially A is the leader and it is the only member of the ISR.

# Broker B catches up to A, so A attempts to send an AlterPartition request to the controller to add B into the ISR.
# Before the AlterPartition request is received, replica B has a hard failure.
# The current controller successfully fences broker B. It takes no action on this partition since B is already out of the ISR.
# Before the controller receives the AlterPartition request to add B, it also fails.
# While the new controller is initializing, suppose that replica B finishes startup, but the disk has been replaced (all of the previous state has been lost).
# The new controller sees the registration from broker B first.
# Finally, the AlterPartition from A arrives which adds B back into the ISR even though it has an empty log.

(Credit for coming up with this scenario goes to [~junrao].)

I tested this in KRaft and confirmed that this sequence is possible (even if perhaps unlikely). There are a few ways we could have potentially detected the issue. First, perhaps the leader should have bumped the leader epoch on all partitions when B was fenced. Then the inflight AlterPartition would be doomed no matter when it arrived. Alternatively, we could have relied on the broker epoch to distinguish the dead broker's state from that of the restarted broker. This could be done by including the broker epoch in both the `Fetch` request and in `AlterPartition`. Finally, perhaps even normal Kafka replication should be using a unique identifier for each disk so that we can reliably detect when it has changed. For example, something like what was proposed for the metadata quorum here: [https://cwiki.apache.org/confluence/display/KAFKA/KIP-853%3A+KRaft+Voter+Changes].
[jira] [Created] (KAFKA-14139) Replaced disk can lead to loss of committed data even with non-empty ISR
Jason Gustafson created KAFKA-14139: --- Summary: Replaced disk can lead to loss of committed data even with non-empty ISR Key: KAFKA-14139 URL: https://issues.apache.org/jira/browse/KAFKA-14139 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson

We have been thinking about disk failure cases recently. Suppose that a disk has failed and the user needs to restart the disk from an empty state. The concern is whether this can lead to the unnecessary loss of committed data.

For normal topic partitions, removal from the ISR during controlled shutdown buys us some protection. After the replica is restarted, it must prove its state to the leader before it can be added back to the ISR. And it cannot become a leader until it does so. An obvious exception to this is when the replica is the last member in the ISR. In this case, the disk failure itself has compromised the committed data, so some amount of loss must be expected.

We have been considering other scenarios in which the loss of one disk can lead to data loss even when there are replicas remaining which have all of the committed entries. One such scenario is this: Suppose we have a partition with two replicas: A and B. Initially A is the leader and it is the only member of the ISR.

# Broker B catches up to A, so A attempts to send an AlterPartition request to the controller to add B into the ISR.
# Before the AlterPartition request is received, replica B has a hard failure.
# The current controller successfully fences broker B. It takes no action on this partition since B is already out of the ISR.
# Before the controller receives the AlterPartition request to add B, it also fails.
# While the new controller is initializing, suppose that replica B finishes startup, but the disk has been replaced (all of the previous state has been lost).
# The new controller sees the registration from broker B first.
# Finally, the AlterPartition from A arrives which adds B back into the ISR.

(Credit for coming up with this scenario goes to [~junrao].)

I tested this in KRaft and confirmed that this sequence is possible (even if perhaps unlikely). There are a few ways we could have potentially detected the issue. First, perhaps the leader should have bumped the leader epoch on all partitions when B was fenced. Then the inflight AlterPartition would be doomed no matter when it arrived. Alternatively, we could have relied on the broker epoch to distinguish the dead broker's state from that of the restarted broker. This could be done by including the broker epoch in both the `Fetch` request and in `AlterPartition`. Finally, perhaps even normal Kafka replication should be using a unique identifier for each disk so that we can reliably detect when it has changed. For example, something like what was proposed for the metadata quorum here: [https://cwiki.apache.org/confluence/display/KAFKA/KIP-853%3A+KRaft+Voter+Changes].

-- This message was sent by Atlassian Jira (v8.20.10#820010)
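The broker-epoch alternative mentioned in the description can be sketched as follows: the controller tracks the epoch of each broker's current registration, and AlterPartition carries the broker epoch the leader observed in Fetch requests; the stale request from step 7 is then rejected because the restarted broker has a newer epoch. All names here are hypothetical, not actual controller code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of broker-epoch fencing for AlterPartition, illustrating the
// idea in KAFKA-14139 (hypothetical API shape).
class BrokerEpochFencing {
    // Controller-side view: brokerId -> epoch of the *current* registration.
    final Map<Integer, Long> registeredEpochs = new HashMap<>();

    void registerBroker(int brokerId, long epoch) {
        registeredEpochs.put(brokerId, epoch);
    }

    // The leader reports the broker epoch it observed when the replica
    // caught up. If the broker has since re-registered (e.g. after a disk
    // replacement), the epochs differ and the replica with an empty log
    // cannot slip back into the ISR.
    boolean canAddToIsr(int brokerId, long epochSeenByLeader) {
        Long current = registeredEpochs.get(brokerId);
        return current != null && current == epochSeenByLeader;
    }
}
```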
[jira] [Resolved] (KAFKA-14078) Replica fetches to follower should return NOT_LEADER error
[ https://issues.apache.org/jira/browse/KAFKA-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14078. - Resolution: Fixed > Replica fetches to follower should return NOT_LEADER error > -- > > Key: KAFKA-14078 > URL: https://issues.apache.org/jira/browse/KAFKA-14078 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 3.3.0 > > > After the fix for KAFKA-13837, if a follower receives a request from another > replica, it will return UNKNOWN_LEADER_EPOCH even if the leader epoch > matches. We need to do leader epoch validation first before we check > whether we have a valid replica. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14078) Replica fetches to follower should return NOT_LEADER error
Jason Gustafson created KAFKA-14078: --- Summary: Replica fetches to follower should return NOT_LEADER error Key: KAFKA-14078 URL: https://issues.apache.org/jira/browse/KAFKA-14078 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson Assignee: Jason Gustafson Fix For: 3.3.0 After the fix for KAFKA-13837, if a follower receives a request from another replica, it will return UNKNOWN_LEADER_EPOCH even if the leader epoch matches. We need to do leader epoch validation first before we check whether we have a valid replica. -- This message was sent by Atlassian Jira (v8.20.10#820010)
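The intended ordering can be sketched as follows (hypothetical method; the error names follow Kafka's `Errors` enum conventions, but this is not the actual broker code):

```java
// Sketch of the validation ordering described in KAFKA-14078: validate the
// leader epoch first, and only then check whether this broker is the leader.
class FetchValidation {
    static String validate(boolean isLeader, int currentLeaderEpoch, int requestedEpoch) {
        // 1. Epoch validation first: a genuinely stale or future epoch
        //    should be reported as such.
        if (requestedEpoch < currentLeaderEpoch) return "FENCED_LEADER_EPOCH";
        if (requestedEpoch > currentLeaderEpoch) return "UNKNOWN_LEADER_EPOCH";
        // 2. Then check leadership: a matching epoch on a non-leader means
        //    the replica simply fetched from the wrong node.
        if (!isLeader) return "NOT_LEADER_OR_FOLLOWER";
        return "NONE";
    }
}
```

Reversing the two checks reproduces the bug: a fetch with a matching epoch that lands on a follower would surface as an epoch error rather than a leadership error.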
[jira] [Updated] (KAFKA-13888) KIP-836: Addition of Information in DescribeQuorumResponse about Voter Lag
[ https://issues.apache.org/jira/browse/KAFKA-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-13888: Priority: Blocker (was: Major) > KIP-836: Addition of Information in DescribeQuorumResponse about Voter Lag > -- > > Key: KAFKA-13888 > URL: https://issues.apache.org/jira/browse/KAFKA-13888 > Project: Kafka > Issue Type: Improvement > Components: kraft >Reporter: Niket Goel >Assignee: lqjacklee >Priority: Blocker > Fix For: 3.3.0 > > > Tracking issue for the implementation of KIP:836 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-14077) KRaft should support recovery from failed disk
Jason Gustafson created KAFKA-14077: --- Summary: KRaft should support recovery from failed disk Key: KAFKA-14077 URL: https://issues.apache.org/jira/browse/KAFKA-14077 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson Fix For: 3.3.0

If one of the nodes in the metadata quorum has a disk failure, there is currently no way to safely bring the node back into the quorum. When we lose disk state, we are at risk of losing committed data even if the failure only affects a minority of the cluster.

Here is an example. Suppose that a metadata quorum has 3 members: v1, v2, and v3. Initially, v1 is the leader and writes a record at offset 1. After v2 acknowledges replication of the record, it becomes committed. Suppose that v1 fails before v3 has a chance to replicate this record. As long as v1 remains down, the raft protocol guarantees that only v2 can become leader, so the record cannot be lost. The raft protocol expects that when v1 returns, it will still have that record. But what if there is a disk failure, the state cannot be recovered, and v1 participates in leader election? Then we would have committed data on a minority of the voters.

The main problem here concerns how we recover from this impaired state without risking the loss of this data. Consider a naive solution which brings v1 back with an empty disk. Since the node has lost its prior knowledge of the state of the quorum, it will vote for any candidate that comes along. If v3 becomes a candidate, then it will vote for itself and it just needs the vote from v1 to become leader. If that happens, then the committed data on v2 will be lost.

This is just one scenario. In general, the invariants that the raft protocol is designed to preserve go out the window when disk state is lost. For example, it is also possible to contrive a scenario where the loss of disk state leads to multiple leaders. There is a good reason why raft requires that any vote cast by a voter is written to disk, since otherwise the voter may vote for different candidates in the same epoch.

Many systems solve this problem with a unique identifier which is generated automatically and stored on disk. This identifier is then committed to the raft log. If a disk changes, we would see a new identifier and we can prevent the node from breaking raft invariants. Recovery from a failed disk then requires a quorum reconfiguration. We need something like this in KRaft to make disk recovery possible.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
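The unique-identifier idea can be sketched simply: generate a UUID on first startup, persist it in the log directory, and reuse it on every restart. A replaced disk then yields a fresh identifier, which the quorum can compare against the identifier it committed for that node. The file name and API below are hypothetical:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Sketch of a persistent disk identity (hypothetical "disk.uuid" file;
// illustrating the recovery-detection idea, not KRaft's actual mechanism).
class DiskIdentity {
    static UUID loadOrCreate(Path logDir) throws IOException {
        Path idFile = logDir.resolve("disk.uuid");
        if (Files.exists(idFile)) {
            // Same disk as before: the stored identifier survives restarts.
            return UUID.fromString(Files.readString(idFile).trim());
        }
        // Fresh (or replaced) disk: generate a new identifier. If the quorum
        // has a different identifier committed for this node, the node should
        // be re-added via reconfiguration before it is allowed to vote.
        UUID id = UUID.randomUUID();
        Files.writeString(idFile, id.toString());
        return id;
    }
}
```

Because the identifier changes only when the disk state is lost, comparing it against the value committed in the raft log distinguishes a normal restart from a replaced disk.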
[jira] [Updated] (KAFKA-14055) Transaction markers may be lost during cleaning if data keys conflict with marker keys
[ https://issues.apache.org/jira/browse/KAFKA-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-14055: Affects Version/s: 2.8.1 2.7.2 2.6.3 2.5.1 2.4.1 > Transaction markers may be lost during cleaning if data keys conflict with > marker keys > -- > > Key: KAFKA-14055 > URL: https://issues.apache.org/jira/browse/KAFKA-14055 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.4.1, 2.5.1, 2.6.3, 2.7.2, 2.8.1, 3.0.1, 3.2.0, 3.1.1 >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Critical > Fix For: 3.3.0, 3.0.2, 3.1.2, 3.2.1 > > > We have recently been seeing hanging transactions occur on streams changelog > topics quite frequently. After investigation, we found that the keys used in > the changelog topic conflict with the keys used in the transaction markers > (the schema used in control records is 4 bytes, which happens to be the same > for the changelog topics that we investigated). When we build the offset map > prior to cleaning, we do properly exclude the transaction marker keys, but > the bug is that we do not exclude them during the cleaning phase. > This can result in the marker being removed from the cleaned log before the > corresponding data is removed when there is a user record with a conflicting > key at a higher offset. A side effect of this is a so-called "hanging" > transaction, but the bigger problem is that we lose the atomicity of the > transaction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-14055) Transaction markers may be lost during cleaning if data keys conflict with marker keys
[ https://issues.apache.org/jira/browse/KAFKA-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-14055. - Resolution: Fixed > Transaction markers may be lost during cleaning if data keys conflict with > marker keys > -- > > Key: KAFKA-14055 > URL: https://issues.apache.org/jira/browse/KAFKA-14055 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Critical > Fix For: 3.3.0, 3.0.2, 3.1.2, 3.2.1 > > > We have recently been seeing hanging transactions occur on streams changelog > topics quite frequently. After investigation, we found that the keys used in > the changelog topic conflict with the keys used in the transaction markers > (the schema used in control records is 4 bytes, which happens to be the same > for the changelog topics that we investigated). When we build the offset map > prior to cleaning, we do properly exclude the transaction marker keys, but > the bug is that we do not exclude them during the cleaning phase. > This can result in the marker being removed from the cleaned log before the > corresponding data is removed when there is a user record with a conflicting > key at a higher offset. A side effect of this is a so-called "hanging" > transaction, but the bigger problem is that we lose the atomicity of the > transaction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
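The essence of the fix is that the cleaning phase must not look up control records (transaction markers) in the offset map by key, since a data key can collide with a marker key. A simplified sketch of the retention decision (hypothetical shape; Kafka's actual LogCleaner is Scala and considerably more involved):

```java
import java.util.Map;

// Sketch of the compaction retention decision, illustrating why control
// records need a separate code path from keyed data records.
class CleanerRetention {
    // offsetMap: key -> latest offset observed for that key among DATA records.
    static boolean shouldRetain(boolean isControlRecord, byte[] key, long offset,
                                Map<String, Long> offsetMap, boolean markerStillNeeded) {
        if (isControlRecord) {
            // Markers are never compacted by key; they are retained until the
            // data of the transaction they close has itself been cleaned away.
            // Consulting the offset map here is the bug: a colliding data key
            // at a higher offset would discard the marker.
            return markerStillNeeded;
        }
        // Data records: retain only the latest offset for each key.
        Long latest = offsetMap.get(new String(key));
        return latest == null || offset >= latest;
    }
}
```

In the buggy behavior, a marker whose 4-byte key matched a changelog key with a higher mapped offset was dropped, leaving the transaction's data without its commit/abort marker and breaking atomicity.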