Sumit Agrawal created HDDS-7244:
-----------------------------------

             Summary: Fix multiple reports queued up from same DN and using up 
heap
                 Key: HDDS-7244
                 URL: https://issues.apache.org/jira/browse/HDDS-7244
             Project: Apache Ozone
          Issue Type: Improvement
         Environment: Environment Observation:
 # Disks are not SSDs
 # DNs: 96; containers per DN: approx. 12K - 15K on average
 # Heartbeat interval was initially 60 sec, but the same behavior was observed 
with 20 minutes as well.
 # Multiple threads were observed waiting on a write lock in the thread dump, 
but there was no deadlock.
            Reporter: Sumit Agrawal
            Assignee: Sumit Agrawal


When processing of FCR/ICR reports from datanodes is slow, the queue keeps 
filling up with these reports. The slowness can be due to:
 * Slow disks (the recommended SSD is not configured)
 * A large number of datanodes and containers
 * Heartbeats from DNs configured with a high frequency
 * Thread contention / starvation where global locks are present

The issue is observed when several of the conditions above occur together: 
reports keep getting appended to the EventQueue faster than they are 
processed, which uses up the heap and eventually results in a heap dump.

This is observed on both the SCM and Recon servers.
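
One possible way to keep the queue bounded is to hold at most one pending full 
report per datanode, so that a newer report from the same DN replaces the 
stale one instead of being appended behind it. The sketch below is only an 
illustration of that idea under stated assumptions: LatestReportQueue and its 
methods are hypothetical names, not the Ozone EventQueue API and not 
necessarily the fix applied for this issue. It also assumes that only full 
reports (FCR) are safe to replace; incremental reports (ICR) carry deltas and 
would likely need different handling.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch: keeps at most one pending full report per datanode,
 * so memory use is bounded by the number of datanodes rather than by the
 * number of reports received.
 */
final class LatestReportQueue<R> {
    // Insertion-ordered map: datanode id -> most recent pending report.
    private final Map<String, R> pending = new LinkedHashMap<>();

    /**
     * Queues a report. If the datanode already has a pending report, the old
     * one is replaced and the datanode keeps its original place in line.
     *
     * @return true if an older pending report was replaced
     */
    synchronized boolean offer(String datanodeId, R report) {
        return pending.put(datanodeId, report) != null;
    }

    /** Removes and returns the oldest pending report, or null if empty. */
    synchronized R poll() {
        Iterator<Map.Entry<String, R>> it = pending.entrySet().iterator();
        if (!it.hasNext()) {
            return null;
        }
        R report = it.next().getValue();
        it.remove();
        return report;
    }

    /** Number of datanodes that currently have a pending report. */
    synchronized int size() {
        return pending.size();
    }
}
```

With this structure, a slow consumer sees only the latest full report from 
each DN, and the queue never holds more entries than there are datanodes.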



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
