[ 
https://issues.apache.org/jira/browse/KAFKA-19967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandini Singhal reassigned KAFKA-19967:
---------------------------------------

    Assignee: Nandini Singhal

> Reduce GC pressure in tiered storage read path by using direct memory buffers
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-19967
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19967
>             Project: Kafka
>          Issue Type: Improvement
>          Components: Tiered-Storage
>            Reporter: Nandini Singhal
>            Assignee: Nandini Singhal
>            Priority: Major
>
> Large heap buffer allocations in the tiered storage read path cause 
> significant GC pressure: they create "humongous" objects that bypass the 
> young generation and are allocated directly in the old generation. Using 
> direct buffers for these I/O-centric allocations would move them off the 
> Java heap and avoid this GC overhead.
> Current allocation (RemoteLogManager.java:1718):
> {code:java}
>   int updatedFetchSize = remoteStorageFetchInfo.minOneMessage() && firstBatchSize > maxBytes
>       ? firstBatchSize
>       : maxBytes;
>   ByteBuffer buffer = ByteBuffer.allocate(updatedFetchSize);
> {code}
> Here maxBytes = Math.min(fetchMaxBytes, fetchInfo.maxBytes), which can reach 
> 55MB or more depending on:
>   - replica.fetch.max.bytes (default: 1MB)
>   - replica.fetch.response.max.bytes (default: 10MB)
>   - Client-side max.partition.fetch.bytes
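
A minimal sketch of the proposed direction, assuming the read path can swap ByteBuffer.allocate for ByteBuffer.allocateDirect. The class FetchBufferSketch and the method allocateFetchBuffer are hypothetical names for illustration, not Kafka code; only the size selection mirrors the snippet quoted above.

```java
import java.nio.ByteBuffer;

final class FetchBufferSketch {
    // Hypothetical helper (not part of Kafka): same size selection as the
    // quoted snippet, but the buffer is allocated off-heap.
    static ByteBuffer allocateFetchBuffer(boolean minOneMessage, int firstBatchSize, int maxBytes) {
        int updatedFetchSize = (minOneMessage && firstBatchSize > maxBytes)
                ? firstBatchSize
                : maxBytes;
        // The backing memory of a direct buffer lives outside the Java heap,
        // so it does not become a large object inside a G1 region.
        return ByteBuffer.allocateDirect(updatedFetchSize);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = allocateFetchBuffer(true, 2048, 1024);
        System.out.println("direct=" + buffer.isDirect() + " capacity=" + buffer.capacity());
    }
}
```

Direct memory is bounded separately from the heap (for example via -XX:MaxDirectMemorySize) and is released only after the owning buffer object is collected, so a real change would probably also need pooling or explicit lifetime management.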
>  
> With G1 (Kafka's default collector), objects larger than half a region 
> (~32MB with 64MB regions) are considered "humongous". Such objects:
>   1. Skip eden and the young generation entirely
>   2. Are allocated directly in the old generation
>   3. Can only be reclaimed during expensive full/mixed GCs
>   4. Trigger old-generation GCs more frequently
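
The size rule above can be sketched as a simple comparison. This is an illustration only: HumongousCheck and isHumongous are hypothetical names, the region size is passed in rather than read from the JVM, and the small array header is ignored.

```java
final class HumongousCheck {
    static final long MB = 1024L * 1024L;

    // True if a payload of objectBytes is larger than half of a region of
    // regionBytes, which is the rule stated in the description above.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    public static void main(String[] args) {
        // A 55MB fetch buffer against the 64MB region size used in the description.
        System.out.println(isHumongous(55 * MB, 64 * MB));
        // A 1MB buffer (the replica.fetch.max.bytes default) against the same region size.
        System.out.println(isHumongous(1 * MB, 64 * MB));
    }
}
```

Smaller regions lower the threshold, so with a smaller region size (set explicitly with -XX:G1HeapRegionSize or chosen by the JVM) even much smaller fetch buffers would qualify.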
> Example: With a 4GB heap and InitiatingHeapOccupancyPercent=35, an old GC 
> cycle starts at about 1.4GB of heap occupancy, so approximately 25 
> concurrent tiered storage fetch requests (25 × 55MB = 1.375GB) are enough 
> to trigger one. Under high read load from tiered storage, this creates 
> continuous GC pressure.
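
The arithmetic in this example can be checked with a short calculation. The sketch assumes binary units (1MB = 1024 × 1024 bytes) and an otherwise empty heap; IhopEstimate and its method names are hypothetical.

```java
final class IhopEstimate {
    static final long MB = 1024L * 1024L;

    // Heap occupancy, in bytes, at which concurrent marking would be initiated.
    static long thresholdBytes(long heapBytes, int ihopPercent) {
        return heapBytes * ihopPercent / 100;
    }

    // Number of equally sized buffers that fit below the threshold,
    // assuming nothing else is live on the heap.
    static long buffersBelowThreshold(long heapBytes, int ihopPercent, long bufferBytes) {
        return thresholdBytes(heapBytes, ihopPercent) / bufferBytes;
    }

    public static void main(String[] args) {
        long heap = 4L * 1024 * MB; // 4GB heap
        long buffer = 55 * MB;      // one 55MB fetch buffer
        System.out.println(buffersBelowThreshold(heap, 35, buffer));
    }
}
```

With these assumptions about 26 buffers fit below the threshold. Any other live data on the heap lowers that count, which is consistent with the estimate of roughly 25 concurrent requests.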
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
