[jira] [Commented] (ASTERIXDB-2284) Ensure Node Failure on Heartbeat Misses

ASF subversion and git services (JIRA) Tue, 13 Feb 2018 21:45:44 -0800

    [ 
https://issues.apache.org/jira/browse/ASTERIXDB-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363524#comment-16363524
 ]


ASF subversion and git services commented on ASTERIXDB-2284:
------------------------------------------------------------

Commit bf74a319dbdfa3fea3007d3286f14a77fecac178 in asterixdb's branch 
refs/heads/master from [~mhubail]
[ https://git-wip-us.apache.org/repos/asf?p=asterixdb.git;h=bf74a31 ]

[ASTERIXDB-2284][CLUS] Ensure Node Failure on Heartbeat Miss

- user model changes: no
- storage format changes: no
- interface changes: no

Details:
- Request the node that exceeded its allowed heartbeat misses
  to shut down, to ensure that it has actually failed.
- Ensure thread safety of lastHeartbeatNanoTime in
  NodeControllerState.

Change-Id: I121f85fd858484377a9d888d18c3069c239f00fc
Reviewed-on: https://asterix-gerrit.ics.uci.edu/2390
Sonar-Qube: Jenkins <[email protected]>
Tested-by: Jenkins <[email protected]>
Contrib: Jenkins <[email protected]>
Integration-Tests: Jenkins <[email protected]>
Reviewed-by: Michael Blow <[email protected]>


> Ensure Node Failure on Heartbeat Misses
> ---------------------------------------
>
>                 Key: ASTERIXDB-2284
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2284
>             Project: Apache AsterixDB
>          Issue Type: Improvement
>            Reporter: Murtadha Hubail
>            Assignee: Murtadha Hubail
>            Priority: Major
>
> Currently, an NC can exceed the allowed period for sending its heartbeat 
> (e.g., due to a garbage collection pause) and still stay up, which leaves 
> the cluster state unusable indefinitely. The proposal is to ensure that a 
> node considered failed has really failed by asking it to shut down. If the 
> shutdown succeeds, the NC will be restarted and the cluster state will be 
> active again when the NC joins.
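
The behavior described in the commit details and the issue text can be
illustrated with a minimal sketch. This is not the code from the commit.
NodeHeartbeatState, HeartbeatMonitor, sweep, and the shutdownRequester
callback are hypothetical names chosen for illustration, and the use of a
volatile field is only one plausible way to make lastHeartbeatNanoTime safe
to share between the thread that receives heartbeats and the thread that
checks for misses. Timestamps are passed in explicitly so the logic can be
exercised without real clocks; a real monitor would likely read
System.nanoTime().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical stand-in for per-node state; NOT the real NodeControllerState.
final class NodeHeartbeatState {
    // volatile (an assumption): written by the heartbeat-receiving thread,
    // read by the monitor thread, so each read sees the latest write.
    private volatile long lastHeartbeatNanoTime;

    NodeHeartbeatState(long nowNanos) {
        this.lastHeartbeatNanoTime = nowNanos;
    }

    void notifyHeartbeat(long nowNanos) {
        lastHeartbeatNanoTime = nowNanos;
    }

    long nanosSinceLastHeartbeat(long nowNanos) {
        return nowNanos - lastHeartbeatNanoTime;
    }
}

// Hypothetical monitor that runs on the cluster controller side.
final class HeartbeatMonitor {
    private final Map<String, NodeHeartbeatState> nodes = new ConcurrentHashMap<>();
    private final long heartbeatPeriodNanos;
    private final int maxMisses;
    // Illustrative callback that asks a node to shut down (e.g. via RPC).
    private final Consumer<String> shutdownRequester;

    HeartbeatMonitor(long heartbeatPeriodNanos, int maxMisses,
                     Consumer<String> shutdownRequester) {
        this.heartbeatPeriodNanos = heartbeatPeriodNanos;
        this.maxMisses = maxMisses;
        this.shutdownRequester = shutdownRequester;
    }

    void register(String nodeId, long nowNanos) {
        nodes.put(nodeId, new NodeHeartbeatState(nowNanos));
    }

    void heartbeat(String nodeId, long nowNanos) {
        NodeHeartbeatState state = nodes.get(nodeId);
        if (state != null) {
            state.notifyHeartbeat(nowNanos);
        }
    }

    // One sweep: find nodes whose missed periods exceed maxMisses, stop
    // tracking them, and ask each one to shut down so that a node which
    // only paused (for example during GC) does not linger half-alive.
    List<String> sweep(long nowNanos) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, NodeHeartbeatState> entry : nodes.entrySet()) {
            long missed = entry.getValue().nanosSinceLastHeartbeat(nowNanos)
                    / heartbeatPeriodNanos;
            if (missed > maxMisses) {
                failed.add(entry.getKey());
            }
        }
        for (String nodeId : failed) {
            nodes.remove(nodeId);
            shutdownRequester.accept(nodeId);
        }
        return failed;
    }
}
```

In this sketch the shutdown request is issued whenever a node is declared
failed, so the failure decision and the node's actual state cannot diverge
for long: if the node is still reachable it stops, and if it is restarted
afterwards it can register again as a fresh node.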



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
