GuoHao created HDDS-12087:
-----------------------------
Summary: transactionToDNsCommitMap grows too large, causing long GC
pauses
Key: HDDS-12087
URL: https://issues.apache.org/jira/browse/HDDS-12087
Project: Apache Ozone
Issue Type: Improvement
Reporter: GuoHao
Assignee: GuoHao
Description: While SCM tracks the status of block deletions, it records pending
transactions in the `transactionToDNsCommitMap` structure. We observed this
structure accumulating too many entries and occupying too much memory, which
caused a GC pause of about 372 seconds in the SCM.
GC log:
{code:java}
2025-01-15 08:25:14,789 [JvmPauseMonitor0] ERROR
org.apache.ratis.server.RaftServer: 127e9d82-790c-40c5-af90-050564a06a45: JVM
pause detected 372.305s longer than the close-threshold 120s, shutting down ...
{code}
Solution: Before iterating the deleteBlocks table and sending delete requests
to the DataNodes, check the size of `transactionToDNsCommitMap`. If it exceeds
a given threshold, pause the iteration of the deleteBlocks table for a while
so that pending entries can drain.
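The proposed check could look roughly like the sketch below. It is only an illustration of the idea, not Ozone's actual code: the class `CommitMapThrottle`, its methods, and the `maxPendingTransactions` threshold are all hypothetical names, and the map is a simplified stand-in for `transactionToDNsCommitMap`.

```java
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch of the proposed throttle. All names here are
 * illustrative assumptions and do not mirror the real SCM classes.
 */
final class CommitMapThrottle {
    // Simplified stand-in for transactionToDNsCommitMap:
    // transaction ID -> DataNodes that have acknowledged the deletion.
    private final Map<Long, Set<UUID>> transactionToDNsCommitMap =
        new ConcurrentHashMap<>();

    // Assumed threshold; in practice this would come from configuration.
    private final int maxPendingTransactions;

    CommitMapThrottle(int maxPendingTransactions) {
        this.maxPendingTransactions = maxPendingTransactions;
    }

    /** Record that a delete transaction was sent and awaits DataNode commits. */
    void trackSent(long txId) {
        transactionToDNsCommitMap.putIfAbsent(txId, ConcurrentHashMap.newKeySet());
    }

    /** Drop a transaction once all expected DataNodes have committed it. */
    void onFullyCommitted(long txId) {
        transactionToDNsCommitMap.remove(txId);
    }

    /** Number of transactions still held in memory. */
    int pendingCount() {
        return transactionToDNsCommitMap.size();
    }

    /**
     * Checked before each pass over the deleteBlocks table: when the map
     * holds more entries than the threshold, the caller skips this pass.
     */
    boolean shouldPauseIteration() {
        return transactionToDNsCommitMap.size() > maxPendingTransactions;
    }
}
```

A caller would consult `shouldPauseIteration()` at the start of each deletion pass and return early when it is true, letting DataNode acknowledgements shrink the map before more transactions are sent.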