Kamal Chandraprakash created KAFKA-19599:
--------------------------------------------
             Summary: Reduce the frequency of ReplicaNotAvailableException thrown to clients when RLMM is not ready
                 Key: KAFKA-19599
                 URL: https://issues.apache.org/jira/browse/KAFKA-19599
             Project: Kafka
          Issue Type: Task
            Reporter: Kamal Chandraprakash
            Assignee: Kamal Chandraprakash

During broker restarts, the topic-based RemoteLogMetadataManager (RLMM) rebuilds its state by reading the internal {{__remote_log_metadata}} topic. While a partition is not yet ready to perform remote storage operations, a ReplicaNotAvailableException is thrown back to the consumer. The client retries the request immediately. This can result in a large number of FetchConsumer requests on the broker and can tie up the request handler threads.

By using a CountDownLatch, the frequency of ReplicaNotAvailableException thrown back to the clients can be reduced. This will improve request handler thread usage on the broker.

Reproducer:
1. Set up a standalone single-node cluster with LocalTieredStorage.
2. Create a topic with remote storage enabled, RF = 1 and partitionCount = 2.
3. Produce a few messages and ensure that the segments are uploaded to remote storage.
4. Use console-consumer to read the produced messages from the beginning of the topic.
5. Update [RemotePartitionMetadataStore|https://sourcegraph.com/github.com/apache/kafka/-/blob/storage/src/main/java/org/apache/kafka/server/log/remote/metadata/storage/RemotePartitionMetadataStore.java?L166] to mimic that the partition is not ready.
6. Replace the jar and restart the broker.
7. Start the console-consumer to read from the beginning of the topic.
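The CountDownLatch idea from the description could look roughly like the sketch below. It is only an illustration, not the actual patch: the class {{PartitionReadinessSketch}}, its methods, and the {{maxWaitMs}} parameter are hypothetical names, and the nested exception is a local stand-in for Kafka's ReplicaNotAvailableException so that the example compiles on its own. The assumption is that a reader waits a bounded time for the partition to become ready, and the exception is thrown only if that wait times out, so a retrying client sees at most one error per wait period instead of one per immediate retry.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a latch-guarded readiness check; not Kafka's real code.
class PartitionReadinessSketch {

    // Local stand-in for Kafka's ReplicaNotAvailableException (assumption for self-containment).
    static class ReplicaNotAvailableException extends RuntimeException {
        ReplicaNotAvailableException(String message) {
            super(message);
        }
    }

    // Released once, when the partition's remote log metadata has been loaded.
    private final CountDownLatch initialized = new CountDownLatch(1);
    // Upper bound on how long a caller waits before giving up (hypothetical knob).
    private final long maxWaitMs;

    PartitionReadinessSketch(long maxWaitMs) {
        this.maxWaitMs = maxWaitMs;
    }

    // Called by whatever component finishes loading the metadata.
    void markInitialized() {
        initialized.countDown();
    }

    // Waits up to maxWaitMs for readiness; throws only if the wait times out or is interrupted.
    void ensureInitialized() {
        boolean ready = false;
        try {
            ready = initialized.await(maxWaitMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt status and treat the partition as not ready.
            Thread.currentThread().interrupt();
        }
        if (!ready) {
            throw new ReplicaNotAvailableException(
                "Partition is not ready for remote storage operations after " + maxWaitMs + " ms");
        }
    }

    public static void main(String[] args) {
        PartitionReadinessSketch partition = new PartitionReadinessSketch(100);
        try {
            partition.ensureInitialized(); // blocks for about 100 ms, then throws
        } catch (ReplicaNotAvailableException e) {
            System.out.println("not ready: " + e.getMessage());
        }
        partition.markInitialized();
        partition.ensureInitialized(); // returns immediately once the latch is released
        System.out.println("ready");
    }
}
```

The bounded wait is the important design point: an unbounded await could pin a thread indefinitely if the metadata never loads, whereas a timeout keeps the existing error path and only slows down how often it is hit. Which thread performs the wait in the broker is left open here.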
With the reproducer above, ~18K FetchConsumer requests per second are received on the broker for a single consumer:

{code:java}
% sh kafka-topics.sh --bootstrap-server localhost:9092 --topic apple --replication-factor 1 --partitions 2 --create --config segment.bytes=1048576 --config local.retention.ms=60000 --config remote.storage.enable=true

% sh kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic apple --from-beginning --property print.key=false --property print.value=false

# broker logs
% less nohup.out | grep "Error occurred while reading the remote data for 4ChgxqKOTPakBikyo0Thjw" | grep -c "2025-08-12 21:18"
1107088
{code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)