Wei-Chiu Chuang created HDDS-4970:
-------------------------------------
Summary: Significant overhead when DataNode is over-subscribed
Key: HDDS-4970
URL: https://issues.apache.org/jira/browse/HDDS-4970
Project: Apache Ozone
Issue Type: Bug
Components: Ozone Datanode
Affects Versions: 1.0.0
Reporter: Wei-Chiu Chuang
Attachments: Screen Shot 2021-03-11 at 11.58.23 PM.png
Ran a microbenchmark in which concurrent clients read chunks from a DataNode.
As the number of clients grows, a significant amount of overhead shows up in
accessing a concurrent hash map, and the overhead grows exponentially with the
number of clients.
{code:java|title=ChunkUtils#processFileExclusively}
@VisibleForTesting
static <T> T processFileExclusively(Path path, Supplier<T> op) {
  // Busy-wait until this thread manages to insert the path into the shared
  // LOCKS set; every failed attempt is another hit on the concurrent hash map.
  for (;;) {
    if (LOCKS.add(path)) {
      break;
    }
  }
  try {
    return op.get();
  } finally {
    LOCKS.remove(path);
  }
}
{code}
In my test, having 64 concurrent clients reading chunks from a 1-disk DataNode
caused the DN to spend nearly half of its time adding entries to the LOCKS
object (a concurrent hash map).
!Screen Shot 2021-03-11 at 11.58.23 PM.png|width=640!
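The profile can be approximated outside Ozone. The snippet below is not the benchmark used here, just a hypothetical stand-alone repro (class name, chunk paths, and iteration counts are made up): 64 threads hammer the same spin-on-a-shared-set pattern as processFileExclusively, so a profiler attached to it should likewise show a large share of samples in adding to the concurrent set.
{code:java|title=Hypothetical stand-alone repro (illustrative only)}
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class LockContentionRepro {
  private static final Set<Path> LOCKS = ConcurrentHashMap.newKeySet();

  // Same shape as ChunkUtils#processFileExclusively.
  static <T> T processFileExclusively(Path path, Supplier<T> op) {
    for (;;) {
      if (LOCKS.add(path)) {
        break;
      }
    }
    try {
      return op.get();
    } finally {
      LOCKS.remove(path);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    int clients = 64;  // mirrors the 64-client test in the description
    // A handful of chunk files standing in for a 1-disk DataNode's hot chunks.
    Path[] chunks = new Path[8];
    for (int i = 0; i < chunks.length; i++) {
      chunks[i] = Paths.get("/data/disk1/chunk-" + i);
    }
    ExecutorService pool = Executors.newFixedThreadPool(clients);
    for (int i = 0; i < clients; i++) {
      pool.submit(() -> {
        for (int j = 0; j < 1_000_000; j++) {
          Path p = chunks[ThreadLocalRandom.current().nextInt(chunks.length)];
          processFileExclusively(p, () -> 0);  // trivial op: cost is pure locking
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.MINUTES);
  }
}
{code}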
Given that it is not uncommon to find HDFS DataNodes with tens of thousands of
incoming client connections, I expect to see similar traffic to an Ozone
DataNode at scale.
We should fix this code.
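One possible direction, sketched below and not necessarily the eventual fix: assuming the only requirement is per-path mutual exclusion, blocking on a per-path lock avoids both the busy-wait and the contention on a single shared set. The sketch uses Guava's Striped locks (Guava is already on the classpath, e.g. for @VisibleForTesting); the class name and stripe count are illustrative.
{code:java|title=Sketch: per-path locking via Guava Striped (illustrative only)}
import com.google.common.util.concurrent.Striped;
import java.nio.file.Path;
import java.util.concurrent.locks.Lock;
import java.util.function.Supplier;

final class StripedFileLocks {
  // 2048 stripes is an arbitrary example value; paths that hash to the same
  // stripe are serialized against each other, everything else runs in parallel.
  private static final Striped<Lock> LOCKS = Striped.lock(2048);

  static <T> T processFileExclusively(Path path, Supplier<T> op) {
    Lock lock = LOCKS.get(path);
    lock.lock();  // parks the thread instead of spinning on a shared set
    try {
      return op.get();
    } finally {
      lock.unlock();
    }
  }
}
{code}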