[
https://issues.apache.org/jira/browse/HDDS-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sammi Chen resolved HDDS-7244.
------------------------------
Resolution: Fixed
> Fix multiple reports queued up from same DN and using up heap
> -------------------------------------------------------------
>
> Key: HDDS-7244
> URL: https://issues.apache.org/jira/browse/HDDS-7244
> Project: Apache Ozone
> Issue Type: Improvement
> Environment: Environment observations:
> # Disks are not SSDs
> # 96 DNs, each holding approx. 12K - 15K containers on average
> # Heartbeat interval was initially 60 sec, but the same was observed with 20 minutes as well.
> # Multiple threads were observed waiting on a write lock in the thread dump, but there was no
> deadlock.
> Reporter: Sumit Agrawal
> Assignee: Sumit Agrawal
> Priority: Minor
> Labels: pull-request-available
> Attachments: Handling multiple FCR for the datanode in event queue.docx
>
>
> When processing of FCR/ICR reports from datanodes is slow, the queue keeps
> filling up with these reports. The slowness can be due to:
> * Slow disks (the recommended SSDs are not configured)
> * A large number of datanodes and containers
> * DNs configured to send heartbeats at high frequency
> * Thread contention / starvation where global locks are present
> The issue is observed when several of the above conditions combine, and
> reports keep being appended to the EventQueue for processing. Over time this
> uses up the heap and results in a heap dump.
> This is observed on both the SCM and Recon servers.
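
For illustration only, one way to keep multiple full container reports (FCRs)
from the same DN from piling up is to coalesce them, so that a newer FCR
replaces an older one that has not been processed yet. The sketch below is not
the change made for HDDS-7244 and does not use Ozone's EventQueue API; the
class name CoalescingReportQueue and its methods are hypothetical. It also
assumes that a newer full report fully supersedes an older one, which would not
hold for incremental reports (ICRs).

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: holds at most one pending full report per datanode. */
final class CoalescingReportQueue<R> {
    // Insertion-ordered map: datanode ID -> latest unprocessed report.
    private final Map<String, R> pending = new LinkedHashMap<>();

    /**
     * Queues a (non-null) report. If the datanode already has a pending
     * report, it is replaced and the datanode keeps its place in line.
     * Returns true if an older report was replaced.
     */
    public synchronized boolean offer(String datanodeId, R report) {
        // put() on an existing key does not change its insertion position.
        return pending.put(datanodeId, report) != null;
    }

    /** Removes and returns the oldest pending report, or null if none. */
    public synchronized R poll() {
        Iterator<Map.Entry<String, R>> it = pending.entrySet().iterator();
        if (!it.hasNext()) {
            return null;
        }
        R report = it.next().getValue();
        it.remove();
        return report;
    }

    /** Number of datanodes with a pending report. */
    public synchronized int size() {
        return pending.size();
    }
}
```

With this structure the number of queued reports is bounded by the number of
datanodes (96 in the environment above) no matter how slowly the consumer
drains the queue.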