Wei-Chiu Chuang created HDDS-10908:
--------------------------------------
Summary: Increase DataNode XceiverServerGrpc event loop group size
Key: HDDS-10908
URL: https://issues.apache.org/jira/browse/HDDS-10908
Project: Apache Ozone
Issue Type: Improvement
Components: Ozone Datanode
Reporter: Wei-Chiu Chuang
Currently, the XceiverServerGrpc boss and worker event loop groups share a
single thread pool whose size is (number of volumes *
hdds.datanode.read.chunk.threads.per.volume / 10), while the executor thread
pool size is (number of volumes * hdds.datanode.read.chunk.threads.per.volume).
The event loop group thread pool is too small. With a single volume and the
default setting of 10, that leaves just one thread shared between boss and
worker.
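The arithmetic above can be sketched as follows. This is only an illustration of the two formulas as stated in this issue, not the actual XceiverServerGrpc code; the class and method names are hypothetical, integer division is assumed, and the single-volume case is the hypothetical one mentioned above.

```java
public class PoolSizeSketch {
    // Hypothetical helper: executor pool size as described in this issue.
    public static int executorThreads(int volumes, int threadsPerVolume) {
        return volumes * threadsPerVolume;
    }

    // Hypothetical helper: shared boss/worker event loop group size,
    // assuming integer division by 10 as described in this issue.
    public static int eventLoopThreads(int volumes, int threadsPerVolume) {
        return volumes * threadsPerVolume / 10;
    }

    public static void main(String[] args) {
        int volumes = 1; // assumption: a single data volume
        for (int perVolume : new int[] {10, 20, 40, 100, 1000}) {
            System.out.printf(
                "threads.per.volume=%d -> executor=%d, eventLoop=%d%n",
                perVolume,
                executorThreads(volumes, perVolume),
                eventLoopThreads(volumes, perVolume));
        }
    }
}
```

Under these assumptions, the default of 10 threads per volume gives a 10-thread executor pool but only one event loop thread.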
Using the freon DN Echo tool, I found that slightly increasing the pool size
significantly increases throughput:
{noformat}
sudo -u hdfs ozone freon dne --clients=32 --container-id=1001 -t 32 -n 10000000 \
  --sleep-time-ms=0 --read-only
hdds.datanode.read.chunk.threads.per.volume = 10 (default):
mean rate = 44125.45 calls/second
hdds.datanode.read.chunk.threads.per.volume = 20:
mean rate = 61322.60 calls/second
hdds.datanode.read.chunk.threads.per.volume = 40:
mean rate = 77951.91 calls/second
hdds.datanode.read.chunk.threads.per.volume = 100:
mean rate = 65573.07 calls/second
hdds.datanode.read.chunk.threads.per.volume = 1000:
mean rate = 25079.32 calls/second
{noformat}
So increasing the default value to 40 appears to have a positive impact.
Alternatively, we should consider decoupling the thread pool size from the
number of volumes.
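As a rough way to compare the runs, the sketch below normalizes each measured mean rate against the default run (hdds.datanode.read.chunk.threads.per.volume = 10). The class and method names are hypothetical; the rates are copied from the benchmark output in this issue.

```java
public class ThroughputRatioSketch {
    // Hypothetical helper: throughput relative to a baseline run.
    public static double ratio(double rate, double baseline) {
        return rate / baseline;
    }

    public static void main(String[] args) {
        // Values of hdds.datanode.read.chunk.threads.per.volume that were tested.
        int[] threadsPerVolume = {10, 20, 40, 100, 1000};
        // Mean rates (calls/second) reported for each setting, in the same order.
        double[] meanRate = {44125.45, 61322.60, 77951.91, 65573.07, 25079.32};

        double baseline = meanRate[0]; // the default setting (10)
        for (int i = 0; i < threadsPerVolume.length; i++) {
            System.out.printf(
                "threads.per.volume=%d: %.2f calls/s (%.2fx of default)%n",
                threadsPerVolume[i], meanRate[i], ratio(meanRate[i], baseline));
        }
    }
}
```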
Note:
DN echo in Ratis read-only mode reaches about 83k requests per second on the
same host.
OM echo in read-only mode reaches about 38k requests per second.