[jira] [Commented] (RATIS-556) Detect node failures and close the log to prevent additional writes

Ankit Singhal (JIRA) Sun, 04 Aug 2019 12:32:26 -0700


    [ 
https://issues.apache.org/jira/browse/RATIS-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899675#comment-16899675
 ]


Ankit Singhal commented on RATIS-556:
-------------------------------------

bq.  Currently, when a node goes down we cannot know which logs to close because 
there is no indication of which logs are served by the peer. So the patch passes 
the peer (optional) when creating the log, so that the meta servers track the 
logs served by each peer.
What is this origin peer, and who will pass it? Why can't we build an inverted 
index on top of the existing map below to get the logs hosted on a peer?
{code}
private Map<LogName, RaftGroup> map = new ConcurrentHashMap<>();
{code}
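
To make the inverted-index idea concrete, here is a rough sketch, not taken from the patch. It is only an illustration under assumptions: String stands in for both LogName and the peer id, a Set<String> stands in for the peers of a RaftGroup, and the class and method names (LogPeerIndex, addLog, removeLog, logsOn) are hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: String stands in for LogName and for a peer id,
// and Set<String> stands in for the peers of a RaftGroup.
class LogPeerIndex {
  // Forward map, analogous to the existing Map<LogName, RaftGroup>.
  private final Map<String, Set<String>> logToPeers = new ConcurrentHashMap<>();
  // Inverted index: peer id -> names of the logs hosted on that peer.
  private final Map<String, Set<String>> peerToLogs = new ConcurrentHashMap<>();

  // Registers a log and updates both maps together.
  synchronized void addLog(String logName, Set<String> peers) {
    removeLog(logName); // drop stale entries if the log is re-registered
    logToPeers.put(logName, new HashSet<>(peers));
    for (String peer : peers) {
      peerToLogs.computeIfAbsent(peer, p -> new HashSet<>()).add(logName);
    }
  }

  // Removes a log from the forward map and from every peer's entry.
  synchronized void removeLog(String logName) {
    Set<String> peers = logToPeers.remove(logName);
    if (peers == null) {
      return;
    }
    for (String peer : peers) {
      Set<String> logs = peerToLogs.get(peer);
      if (logs != null) {
        logs.remove(logName);
        if (logs.isEmpty()) {
          peerToLogs.remove(peer);
        }
      }
    }
  }

  // Returns a snapshot of the logs hosted on the given peer.
  synchronized Set<String> logsOn(String peer) {
    Set<String> logs = peerToLogs.get(peer);
    return logs == null
        ? Collections.<String>emptySet()
        : Collections.unmodifiableSet(new HashSet<>(logs));
  }
}
```

With something like this, the meta server could look up logsOn(failedPeer) when a peer is declared dead, without the client having to pass a peer at log creation time.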

Also, is notifySlowness() the right API for declaring a node dead? Can't we use 
a heartbeat mechanism, where every peer regularly sends a heartbeat request to 
the meta quorum from a separate thread? (Or would that create a storm of 
heartbeat requests and congestion at the meta quorum?)
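
As a rough illustration of the heartbeat alternative, the meta side could keep a last-seen timestamp per peer and treat a peer as dead once no heartbeat has arrived within a timeout. This is only a sketch under assumptions: HeartbeatTracker, onHeartbeat and expire are hypothetical names, peers are identified by plain strings, and timestamps are passed in explicitly so the logic is independent of how heartbeats are transported.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of timeout-based failure detection on the meta side.
class HeartbeatTracker {
  private final long timeoutMillis;
  // Peer id -> timestamp (ms) of the most recent heartbeat.
  private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

  HeartbeatTracker(long timeoutMillis) {
    this.timeoutMillis = timeoutMillis;
  }

  // Called whenever a heartbeat request from a peer arrives.
  void onHeartbeat(String peer, long nowMillis) {
    lastSeen.put(peer, nowMillis);
  }

  // Returns the peers whose last heartbeat is older than the timeout and
  // stops tracking them, so each failure is reported once.
  Set<String> expire(long nowMillis) {
    Set<String> dead = new HashSet<>();
    for (Map.Entry<String, Long> e : lastSeen.entrySet()) {
      boolean timedOut = nowMillis - e.getValue() > timeoutMillis;
      // The two-argument remove only succeeds if no newer heartbeat raced in.
      if (timedOut && lastSeen.remove(e.getKey(), e.getValue())) {
        dead.add(e.getKey());
      }
    }
    return dead;
  }
}
```

The load on the meta quorum would then be roughly the number of peers divided by the heartbeat interval, so the interval and timeout would have to be chosen with the cluster size in mind.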





> Detect node failures and close the log to prevent additional writes
> -------------------------------------------------------------------
>
>                 Key: RATIS-556
>                 URL: https://issues.apache.org/jira/browse/RATIS-556
>             Project: Ratis
>          Issue Type: Improvement
>            Reporter: Rajeshbabu Chintaguntla
>            Assignee: Rajeshbabu Chintaguntla
>            Priority: Major
>         Attachments: RATIS-556-wip.patch
>
>
> Currently there is no way to detect the node failures at master log servers 
> and add new nodes to the group serving the log. We need to analyze how Ozone 
> is working in this case.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
