Divij Vaidya created KAFKA-14038:
------------------------------------

             Summary: Optimize calculation of size for log in remote tier
                 Key: KAFKA-14038
                 URL: https://issues.apache.org/jira/browse/KAFKA-14038
             Project: Kafka
          Issue Type: Improvement
          Components: core
            Reporter: Divij Vaidya
            Assignee: Divij Vaidya
             Fix For: 3.3.0


The Tiered Storage feature introduced in 
[KIP-405|https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage]
 lets users configure retention of the remote tier by time, by size, or 
both. Computing which log segments to delete under the retention config is 
[owned by 
RemoteLogManager|https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage#KIP405:KafkaTieredStorage-1.RemoteLogManager(RLM)ThreadPool]
 (RLM).

To find the remote segments eligible for deletion under size-based 
retention, RLM needs to compute {{total_remote_log_size}}, i.e. the total 
size of the log available in the remote tier for that topic-partition. RLM 
could use {{RemoteLogMetadataManager.listRemoteLogSegments()}} to fetch 
metadata for all the remote segments and then sum the segment sizes 
reported by {{RemoteLogSegmentMetadata.segmentSizeInBytes()}} to find the 
total log size stored in the remote tier.
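
The aggregation described above can be sketched roughly as follows. This is only an illustration, not the RLM implementation: {{SegmentMetadata}} and {{MetadataManager}} are hypothetical stand-ins that model just the two methods named above, and a plain {{String}} is assumed in place of the real topic-partition type.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class RemoteLogSizeSketch {

    // Hypothetical stand-in for RemoteLogSegmentMetadata (only the size accessor is modelled).
    interface SegmentMetadata {
        int segmentSizeInBytes();
    }

    // Hypothetical stand-in for RemoteLogMetadataManager; the String key is an assumption.
    interface MetadataManager {
        Iterator<SegmentMetadata> listRemoteLogSegments(String topicPartition);
    }

    // Linear scan: visits the metadata of every remote segment, so the cost
    // grows with num_remote_segments on every call.
    static long totalRemoteLogSize(MetadataManager rlmm, String topicPartition) {
        long total = 0L;
        Iterator<SegmentMetadata> segments = rlmm.listRemoteLogSegments(topicPartition);
        while (segments.hasNext()) {
            total += segments.next().segmentSizeInBytes();
        }
        return total;
    }

    public static void main(String[] args) {
        // Three made-up segments of 100, 250 and 250 bytes.
        List<SegmentMetadata> stored =
                Arrays.<SegmentMetadata>asList(() -> 100, () -> 250, () -> 250);
        MetadataManager rlmm = tp -> stored.iterator();
        System.out.println(totalRemoteLogSize(rlmm, "orders-0")); // prints 600
    }
}
```

The total is accumulated in a {{long}} because the per-segment size is an {{int}} and the sum over many segments can exceed the {{int}} range.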

The above method iterates through the metadata of every segment, i.e. it is 
O(num_remote_segments) on each execution of the RLM thread. Since the main 
purpose of tiered storage is to store a large amount of data, we expect 
num_remote_segments to be large, and a frequent linear scan could be 
expensive (depending on the underlying storage used by 
RemoteLogMetadataManager).

Segment offloads and segment deletions run together in the same task, and a 
fixed-size thread pool is shared among all topic-partitions. A slow 
calculation of {{total_remote_log_size}} could result in a loss of 
availability, as demonstrated in the following scenario:
 # Calculation of {{total_remote_log_size}} is slow and the threads in the 
thread pool are busy with segment deletions
 # Segment offloads are delayed (since they run together with deletions)
 # Local disk fills up, since local deletion requires the segment to be 
offloaded
 # If local disk is completely full, Kafka fails
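
The queueing effect in the first two steps can be illustrated with a small sketch. It is a toy model, not RLM code: the class and event names are hypothetical, and the order of the steps inside one task (offload first, then size scan and deletion) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedPoolDelaySketch {

    // Submits one combined offload + deletion task per partition to a fixed-size
    // pool and returns the order in which the individual steps ran.
    static List<String> runTasks(int poolSize, List<String> partitions) {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (String tp : partitions) {
            pool.submit(() -> {
                events.add(tp + ":offload");
                // Stands in for the slow size calculation followed by deletions.
                events.add(tp + ":size-scan-and-delete");
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        // With a single worker, the offload for b-0 cannot start until the
        // size scan and deletion for a-0 has finished.
        System.out.println(runTasks(1, Arrays.asList("a-0", "b-0")));
    }
}
```

With a pool of one thread the tasks run strictly in submission order, so the second partition's offload always waits behind the first partition's scan. A larger pool only postpones the same effect until every worker is occupied by a slow scan.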

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
