[ 
https://issues.apache.org/jira/browse/HDDS-12087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GuoHao updated HDDS-12087:
--------------------------
    Description: 
Description: When SCM records block deletion status, it maintains a 
`transactionToDNsCommitMap` structure. We observed this structure accumulating 
so many entries that it consumed excessive memory and caused a JVM pause of 
about 372 seconds on the SCM (see the GC log below).

 

GC log:
{code:java}
2025-01-15 08:25:14,789 [JvmPauseMonitor0] ERROR 
org.apache.ratis.server.RaftServer: 127e9d82-790c-40c5-af90-050564a06a45: JVM 
pause detected 372.305s longer than the close-threshold 120s, shutting down ... 
{code}
 

Solution: Before iterating the deleteBlocks table and sending delete requests 
to the DNs, check the size of `transactionToDNsCommitMap`. If it exceeds a 
certain threshold, pause the iteration of the deleteBlocks table for a while.
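
The proposed throttling could look roughly like the following minimal sketch. 
It is not the actual SCM code: the class `DeleteBlockThrottle`, its methods, 
and the `maxPendingTransactions` threshold are hypothetical names used only 
for illustration, and the map's key/value types are assumed from the name 
`transactionToDNsCommitMap`.

```java
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the proposed throttle; not the real Ozone SCM code. */
class DeleteBlockThrottle {
  // Stand-in for transactionToDNsCommitMap (assumed shape: txID -> DNs that acked).
  private final Map<Long, Set<UUID>> transactionToDNsCommitMap =
      new ConcurrentHashMap<>();
  // Assumed threshold; the issue does not name a concrete value or config key.
  private final int maxPendingTransactions;

  DeleteBlockThrottle(int maxPendingTransactions) {
    this.maxPendingTransactions = maxPendingTransactions;
  }

  /** True when the map has reached the threshold and iteration should pause. */
  boolean shouldPause() {
    return transactionToDNsCommitMap.size() >= maxPendingTransactions;
  }

  /** Records an acknowledgement from one datanode for a tracked transaction. */
  void ack(long txId, UUID datanode) {
    transactionToDNsCommitMap.computeIfPresent(txId, (id, dns) -> {
      dns.add(datanode);
      return dns;
    });
  }

  /** Drops a transaction from the map once it no longer needs tracking. */
  void complete(long txId) {
    transactionToDNsCommitMap.remove(txId);
  }

  /**
   * Walks the candidate transaction IDs (standing in for the deleteBlocks
   * table) and stops as soon as the map reaches the threshold.
   * Returns the number of transactions dispatched in this pass.
   */
  int dispatch(Iterable<Long> deletedBlocksTable) {
    int sent = 0;
    for (long txId : deletedBlocksTable) {
      if (shouldPause()) {
        break; // the caller retries on a later pass
      }
      transactionToDNsCommitMap.putIfAbsent(txId, ConcurrentHashMap.newKeySet());
      sent++;
    }
    return sent;
  }
}
```

With a threshold of 2, a pass over three transactions dispatches only the 
first two; the third is picked up on a later pass once `complete` has removed 
an entry. Checking inside the loop rather than only once per pass is a design 
choice of this sketch, so that a single pass cannot grow the map without bound.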


> TransactionToDNsCommitMap too large causes GC to pause for a long time
> ----------------------------------------------------------------------
>
>                 Key: HDDS-12087
>                 URL: https://issues.apache.org/jira/browse/HDDS-12087
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: SCM
>            Reporter: GuoHao
>            Assignee: GuoHao
>            Priority: Critical
>
> Description: When SCM records block deletion status, it maintains a 
> `transactionToDNsCommitMap` structure. We observed this structure 
> accumulating so many entries that it consumed excessive memory and caused a 
> JVM pause of about 372 seconds on the SCM (see the GC log below).
>  
> GC log:
> {code:java}
> 2025-01-15 08:25:14,789 [JvmPauseMonitor0] ERROR 
> org.apache.ratis.server.RaftServer: 127e9d82-790c-40c5-af90-050564a06a45: JVM 
> pause detected 372.305s longer than the close-threshold 120s, shutting down 
> ... {code}
>  
> Solution: Before iterating the deleteBlocks table and sending delete 
> requests to the DNs, check the size of `transactionToDNsCommitMap`. If it 
> exceeds a certain threshold, pause the iteration of the deleteBlocks table 
> for a while.



