Marton Elek created HDDS-4747:
---------------------------------
Summary: Use GRPC for datanode->scm call instead of Hadoop RPC
Key: HDDS-4747
URL: https://issues.apache.org/jira/browse/HDDS-4747
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: SCM
Reporter: Marton Elek
As we discussed during the recent community calls, it might be useful to switch
from Hadoop RPC to gRPC for the datanode heartbeat protocol (between the
datanodes and SCM).
1. It simplifies the architecture, as we have already started to use
mTLS-guarded gRPC for server-to-server communication.
2. It can make the cluster more responsive: today we cache the outgoing
messages on both sides (SCM and datanode).
On the datanode, the reports are cached in the StateContext.*Reports fields
until the next heartbeat. On the SCM side, the commands are cached in
SCMNodeManager.commandQueue until the next heartbeat.
SCM cannot send commands to a datanode immediately; it has to wait until the
next heartbeat from that datanode, at which point the commands (like
closeContainer or createPipeline) are taken from the commandQueue and added to
the heartbeat response.
Two-way async gRPC can simplify the code: at any time, both sides can send
one-way messages. The whole cluster can be more responsive, as any SCM command
will be delivered to the datanode immediately.
As we already have APIs on both sides to queue the messages, this doesn't
require significant changes, but the implementation should be modified to send
out a message immediately instead of adding it to an internal queue.
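A rough sketch of what that change could look like on the SCM side, keeping
the same kind of addCommand entry point but forwarding to an open stream. All
names here are hypothetical, and a java.util.function.Consumer stands in for
the outbound half of a bidirectional gRPC stream (in grpc-java that role would
typically be played by a StreamObserver); handling of disconnected datanodes
is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Illustrative only: hypothetical names, no real gRPC dependency. */
final class ImmediateDispatchSketch {
  // One open outbound stream per connected datanode (stand-in for a gRPC stream).
  private final Map<String, Consumer<String>> streams = new ConcurrentHashMap<>();

  /** Called when a datanode opens its long-lived stream to SCM. */
  public void register(String datanodeId, Consumer<String> stream) {
    streams.put(datanodeId, stream);
  }

  /** Same kind of entry point as the queueing API, but sends right away. */
  public boolean addCommand(String datanodeId, String command) {
    Consumer<String> stream = streams.get(datanodeId);
    if (stream == null) {
      return false; // not connected; a real implementation would need a policy here
    }
    stream.accept(command);
    return true;
  }

  public static void main(String[] args) {
    ImmediateDispatchSketch scm = new ImmediateDispatchSketch();
    List<String> receivedByDn1 = new ArrayList<>();
    scm.register("dn-1", receivedByDn1::add);
    scm.addCommand("dn-1", "closeContainer");
    // The command is already on the datanode side, with no heartbeat involved.
    System.out.println(receivedByDn1); // [closeContainer]
  }
}
```

Callers that queue commands today would not need to change; only the body of
the queueing method differs between the two models.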