Kamal Chandraprakash created KAFKA-19599:
--------------------------------------------

             Summary: Reduce the frequency of ReplicaNotAvailableException 
thrown to clients when RLMM is not ready
                 Key: KAFKA-19599
                 URL: https://issues.apache.org/jira/browse/KAFKA-19599
             Project: Kafka
          Issue Type: Task
            Reporter: Kamal Chandraprakash
            Assignee: Kamal Chandraprakash


During broker restarts, the topic-based RemoteLogMetadataManager (RLMM) rebuilds 
its state by reading the internal {{__remote_log_metadata}} topic. While a 
partition is not yet ready to perform remote storage operations, a 
ReplicaNotAvailableException is thrown back to the consumer. The client retries 
the request immediately. 

This can result in a large number of FetchConsumer requests on the broker and 
can occupy the request handler threads. Using a CountDownLatch, the frequency of 
ReplicaNotAvailableException thrown back to the clients can be reduced. This 
will improve the request handler thread usage on the broker.
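
A minimal sketch of the CountDownLatch idea, not the actual patch: the caller waits a bounded time for the partition to become ready and only fails once that wait expires, so each retry costs the client at least the timeout instead of returning instantly. The class {{PartitionReadinessGate}}, its methods, the stand-in {{NotReadyException}} and the timeout values are all hypothetical names chosen for illustration. Where the wait should happen in the broker and how long it should last are assumptions left open here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Kafka: gates remote-storage reads on a readiness latch.
class PartitionReadinessGate {

    // Stand-in for Kafka's ReplicaNotAvailableException so the sketch is self-contained.
    static class NotReadyException extends RuntimeException {
        NotReadyException(String message) {
            super(message);
        }
    }

    // Counted down once, when the partition's remote log metadata has been loaded.
    private final CountDownLatch initialized = new CountDownLatch(1);

    // Called by whatever component finishes building the partition state.
    void markInitialized() {
        initialized.countDown();
    }

    // Waits up to timeoutMs for readiness; throws only after the wait expires.
    void ensureReady(long timeoutMs) {
        try {
            if (!initialized.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                throw new NotReadyException(
                    "Partition metadata not ready after " + timeoutMs + " ms");
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt status and report the partition as not ready.
            Thread.currentThread().interrupt();
            throw new NotReadyException("Interrupted while waiting for partition metadata");
        }
    }

    public static void main(String[] args) {
        PartitionReadinessGate gate = new PartitionReadinessGate();
        try {
            gate.ensureReady(100); // blocks for about 100 ms, then fails
        } catch (NotReadyException e) {
            System.out.println("not ready: " + e.getMessage());
        }
        gate.markInitialized();
        gate.ensureReady(100); // returns immediately once the latch is open
        System.out.println("ready");
    }
}
```

With a wait of this kind, a client that retries immediately is throttled to roughly one failed request per timeout period per partition, and requests that arrive just before the partition becomes ready succeed without any retry.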

Reproducer: 
1. Set up a standalone one-node cluster with LocalTieredStorage. 
2. Create a topic with remote storage enabled, RF = 1 and partitionCount = 2.
3. Produce a few messages and ensure that the segments are uploaded to remote 
storage. 
4. Use console-consumer to read the produced messages from the beginning of the 
topic.
5. Update 
[RemotePartitionMetadataStore|https://sourcegraph.com/github.com/apache/kafka/-/blob/storage/src/main/java/org/apache/kafka/server/log/remote/metadata/storage/RemotePartitionMetadataStore.java?L166]
 to mimic that the partition is not ready.
6. Replace the jar and restart the broker. 
7. Start the console-consumer to read from the beginning of the topic.  

~18K FetchConsumer requests per second are received on the broker for one 
consumer:
{code:java}
% sh kafka-topics.sh --bootstrap-server localhost:9092 --topic apple \
    --replication-factor 1 --partitions 2 --create --config segment.bytes=1048576 \
    --config local.retention.ms=60000 --config remote.storage.enable=true
% sh kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic apple \
    --from-beginning --property print.key=false --property print.value=false
# broker logs
% less nohup.out \
    | grep "Error occurred while reading the remote data for 4ChgxqKOTPakBikyo0Thjw" \
    | grep -c "2025-08-12 21:18"
1107088
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
