[
https://issues.apache.org/jira/browse/KAFKA-19995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nandini Singhal updated KAFKA-19995:
------------------------------------
Description: When tiered storage segment copies are failing, the
RemoteCopyLagBytes and RemoteCopyLagSegments metrics are either not emitted or
remain at zero/stale values. This makes it impossible to detect the growing lag
during copy failures. (was: Large heap buffer allocations in the tiered storage
read path cause significant GC pressure by creating "humongous" objects that
bypass the young generation and land directly in the old generation. Using
direct buffers for these I/O-centric allocations would eliminate this GC
overhead.
The allocation in question (RemoteLogManager.java:1718):
{code:java}
int updatedFetchSize = remoteStorageFetchInfo.minOneMessage() && firstBatchSize > maxBytes
        ? firstBatchSize
        : maxBytes;
ByteBuffer buffer = ByteBuffer.allocate(updatedFetchSize);
{code}
Here maxBytes = Math.min(fetchMaxBytes, fetchInfo.maxBytes), which, depending on
configuration, can reach 55MB or more. The relevant settings are:
- replica.fetch.max.bytes (default: 1MB)
- replica.fetch.response.max.bytes (default: 10MB)
- the client-side max.partition.fetch.bytes
In the G1GC collector (Kafka's default), objects larger than half a region size
(~32MB with 64MB regions) are treated as "humongous" and:
1. skip eden and the young generation entirely,
2. are allocated directly in the old generation,
3. can only be reclaimed during expensive mixed/full GCs, and
4. cause old-generation collections to run more frequently.
Example: with a 4GB heap and -XX:InitiatingHeapOccupancyPercent=35, roughly 25
concurrent tiered storage fetch requests (25 × 55MB ≈ 1.4GB) are enough to cross
the marking threshold and trigger an old-generation collection cycle. Under
sustained tiered storage read load this becomes continuous GC pressure.
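For illustration only, a back-of-the-envelope version of that calculation (the heap
size, IHOP value and buffer size below simply restate the example above; they are
not broker defaults):
{code:java}
public class FetchBufferEstimate {
    public static void main(String[] args) {
        long heapBytes = 4_000_000_000L;          // ~4GB heap
        double ihop = 0.35;                       // -XX:InitiatingHeapOccupancyPercent=35
        long fetchBufferBytes = 55_000_000L;      // ~55MB worst-case fetch buffer

        long markingThreshold = (long) (heapBytes * ihop);            // ~1.4GB
        long concurrentFetches = markingThreshold / fetchBufferBytes; // ~25
        System.out.println(concurrentFetches
                + " concurrent worst-case fetches reach the marking threshold");
    }
}
{code}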
Solution: use ByteBuffer.allocateDirect() for these large fetch buffers in the
tiered storage read path; each buffer is short-lived and used only for a single
fetch request.)
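A minimal sketch of that direct-buffer idea (the readFetch helper and its channel
parameter are illustrative, not the actual RemoteLogManager API):
{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Hypothetical sketch: allocating the per-fetch buffer off-heap means a 55MB
// allocation never becomes a G1 "humongous" object on the Java heap.
final class DirectFetchBuffer {
    static ByteBuffer readFetch(ReadableByteChannel channel, int updatedFetchSize) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(updatedFetchSize); // off-heap allocation
        while (buffer.hasRemaining() && channel.read(buffer) >= 0) {
            // keep reading until the buffer is full or the stream ends
        }
        buffer.flip(); // switch to read mode for the fetch response path
        return buffer;
    }
}
{code}
Direct buffer memory is capped by -XX:MaxDirectMemorySize rather than the heap, so
these allocations stop contributing to humongous-object pressure, at the cost of
managing an off-heap memory limit.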
> Record copy lag metrics during failures
> ---------------------------------------
>
> Key: KAFKA-19995
> URL: https://issues.apache.org/jira/browse/KAFKA-19995
> Project: Kafka
> Issue Type: Improvement
> Components: Tiered-Storage
> Reporter: Nandini Singhal
> Assignee: Nandini Singhal
> Priority: Major
>
> When tiered storage segment copies are failing, the RemoteCopyLagBytes and
> RemoteCopyLagSegments metrics are either not emitted or remain at zero/stale
> values. This makes it impossible to detect the growing lag during copy failures.
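> A hedged sketch of the intended behavior, using hypothetical names (the class
> and method below are illustrative, not the actual RemoteLogManager or metrics
> wiring): increase the lag counters before a copy attempt and only decrease them
> on success, so repeated failures surface as growing RemoteCopyLagBytes and
> RemoteCopyLagSegments values.
> {code:java}
> import java.util.concurrent.atomic.AtomicLong;
>
> // Hypothetical copy-lag tracker; in the broker these values would back the
> // RemoteCopyLagBytes and RemoteCopyLagSegments gauges.
> final class CopyLagSketch {
>     static final AtomicLong lagBytes = new AtomicLong();
>     static final AtomicLong lagSegments = new AtomicLong();
>
>     static void copyWithLagTracking(long segmentSizeBytes, Runnable copy) {
>         lagBytes.addAndGet(segmentSizeBytes);
>         lagSegments.incrementAndGet();
>         try {
>             copy.run();                             // upload the segment to remote storage
>             lagBytes.addAndGet(-segmentSizeBytes);  // shrink the lag only on success
>             lagSegments.decrementAndGet();
>         } catch (RuntimeException e) {
>             // copy failed: leave the lag in place so monitoring sees it keep growing
>         }
>     }
> }
> {code}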
--
This message was sent by Atlassian Jira
(v8.20.10#820010)