[ 
https://issues.apache.org/jira/browse/RATIS-666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917992#comment-16917992
 ] 

Tsz Wo Nicholas Sze commented on RATIS-666:
-------------------------------------------

> ... which makes A very hard to tell which one of the 5 groups is so unhealthy 
> as to drag the communication.  ...

This should not happen: a heartbeat only tells whether the connection between 
two machines is good.  When one group is unhealthy but the connection is still 
good, the appendEntries calls to that group will fail while the heartbeats 
still succeed.
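
To make the distinction concrete, here is a minimal sketch in Python. It is a 
toy model, not Ratis code: the Peer class, the probe function and the group 
ids g1..g5 are all hypothetical names made up for illustration. It assumes one 
heartbeat per machine pair and one appendEntries call per group.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical follower machine hosting several raft groups."""
    name: str
    connection_ok: bool = True
    unhealthy_groups: set = field(default_factory=set)

    def heartbeat(self) -> bool:
        # One heartbeat per machine pair: it reports connection health only.
        return self.connection_ok

    def append_entries(self, group_id: str) -> bool:
        # Per-group call: needs a good connection and a healthy group.
        return self.connection_ok and group_id not in self.unhealthy_groups


def probe(peer: Peer, group_ids: list) -> tuple:
    """Return (connection_alive, sorted groups whose appendEntries failed)."""
    alive = peer.heartbeat()
    failed = sorted(g for g in group_ids if not peer.append_entries(g))
    return alive, failed


groups = ["g1", "g2", "g3", "g4", "g5"]
peer_b = Peer("B", unhealthy_groups={"g3"})
print(probe(peer_b, groups))  # (True, ['g3'])
```

In this model the shared heartbeat stays green while the per-group failures 
still identify which of the 5 groups is unhealthy.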

> Coalesced heartbeat in multiraft
> --------------------------------
>
>                 Key: RATIS-666
>                 URL: https://issues.apache.org/jira/browse/RATIS-666
>             Project: Ratis
>          Issue Type: Improvement
>          Components: raft-group
>            Reporter: Li Cheng
>            Priority: Major
>
> I'm using this issue to discuss the coalesced heartbeat plan in multi-raft. 
> We are looking at incorporating the multi-raft feature of Ratis into Hadoop 
> Ozone. In Ozone, every datanode would then be in multiple raft groups 
> (pipelines), which raises these questions:
>  # Is there any plan for coalesced heartbeat on a single node? 
>  # Are we going to use gRPC to achieve coalesced heartbeat like what 
> cockroach does? Shall we assume only Java APIs are required?
>  # With or without coalesced heartbeat, every node has a chance to be 
> selected as leader in each raft group. In the extreme case, one node, say 
> node A, would be the leader of all raft groups. If we implement coalesced 
> heartbeat, node A would more easily become a performance bottleneck. Any 
> idea on how to avoid this extreme? Maybe do a candidate scrub?
>  # How do we plan to test the 'single node, multi raft groups' scenario? 
> Furthermore, if we make coalesced heartbeat configurable, how do we 
> determine when and whether to use it?
>  
> [~szetszwo] [~Sammi] [~xyao] [~waterlx]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
