Marton Elek created HDDS-4747:
---------------------------------

             Summary: Use GRPC for datanode->scm call instead of Hadoop RPC
                 Key: HDDS-4747
                 URL: https://issues.apache.org/jira/browse/HDDS-4747
             Project: Hadoop Distributed Data Store
          Issue Type: Improvement
          Components: SCM
            Reporter: Marton Elek


As discussed during the recent community calls, it might be useful to switch 
from Hadoop RPC to gRPC for the datanode heartbeat protocol (between the 
datanodes and SCM).

 1. It simplifies the architecture, as we have already started to use 
mTLS-guarded gRPC for server-to-server communication.
 2. It can make the cluster more responsive: today we cache the outgoing 
messages on both sides (SCM and datanode).

On the datanode, reports are cached in the StateContext.*Reports fields until 
the next heartbeat. On the SCM side, commands are cached in 
SCMNodeManager.commandQueue until the next heartbeat.

SCM cannot send commands to a datanode immediately. It has to wait until the 
next heartbeat (HB) from that datanode, and commands (like closeContainer or 
createPipeline) are then added to the heartbeat response from the commandQueue.

Two-way async gRPC can simplify the code: at any time, either side can send 
one-way messages. The whole cluster can be more responsive, as any SCM command 
would be delivered to the datanode immediately.

As we already have APIs on both sides to queue the messages, it does not 
require significant changes, but the implementation should be modified to send 
out a message immediately instead of adding it to an internal queue.
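
A rough sketch of what "send immediately instead of queueing" could look like 
behind an unchanged addCommand-style call. Everything here is hypothetical: 
ImmediateCommandSketch, registerStream, and the Consumer<String> that stands 
in for the outbound side of a bidirectional stream are illustration only, and 
no real gRPC or Ozone API is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical immediate-dispatch variant of the SCM command API. */
class ImmediateCommandSketch {
    // One outbound channel per connected datanode. The Consumer is an
    // assumption standing in for "write one message to the open stream".
    private final Map<String, Consumer<String>> streams = new ConcurrentHashMap<>();

    /** Called when a datanode opens its stream to SCM. */
    void registerStream(String datanodeId, Consumer<String> stream) {
        streams.put(datanodeId, stream);
    }

    /**
     * Same call site as a queueing API, but the command is pushed right away.
     * Returns false when the datanode has no open stream.
     */
    boolean addCommand(String datanodeId, String command) {
        Consumer<String> stream = streams.get(datanodeId);
        if (stream == null) {
            return false;
        }
        stream.accept(command);
        return true;
    }

    public static void main(String[] args) {
        ImmediateCommandSketch scm = new ImmediateCommandSketch();
        List<String> receivedByDatanode = new ArrayList<>();
        scm.registerStream("dn-1", receivedByDatanode::add);
        scm.addCommand("dn-1", "closeContainer");
        // The datanode side has the command without waiting for a heartbeat.
        System.out.println(receivedByDatanode);
    }
}
```

What to do when no stream is open (drop, retry, or fall back to a queue) is 
left open in this sketch.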

  



