[jira] [Commented] (RATIS-556) Detect node failures and close the log to prevent additional writes

Josh Elser (Jira) Wed, 04 Sep 2019 14:13:12 -0700


    [ 
https://issues.apache.org/jira/browse/RATIS-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922868#comment-16922868
 ]


Josh Elser commented on RATIS-556:
----------------------------------

Success! This worked for me. The steps I followed:
* {{docker-compose up -d}} (after making a total of 5 workers)
* {{./client-env.sh}} and {{./bin/shell}}, creating a log
* Found the metadata quorum leader
* {{docker logs -f metadata_leader}} in one terminal
* {{docker kill log_worker}}
* Validated that the metadata quorum leader closed the log
* Validated that I could no longer interact with the log via the shell

{noformat}
2019-09-04 21:05:10,902 INFO  server.MetaStateMachine 
(MetaStateMachine.java:applyTransactionSerial(153)) - Log LogName['josh'] 
registered at master1.logservice.ratis.org_9999 with group 
group-FE315AA09C9B:[worker1.logservice.ratis.org_9999:worker1.logservice.ratis.org:9999,
 worker3.logservice.ratis.org_9999:worker3.logservice.ratis.org:9999, 
worker5.logservice.ratis.org_9999:worker5.logservice.ratis.org:9999]
2019-09-04 21:08:07,963 WARN  server.MetaStateMachine 
(MetaStateMachine.java:lambda$run$2(445)) - Peer 
worker5.logservice.ratis.org_9999:worker5.logservice.ratis.org:9999 in the 
group 
group-FE315AA09C9B:[worker1.logservice.ratis.org_9999:worker1.logservice.ratis.org:9999,
 worker3.logservice.ratis.org_9999:worker3.logservice.ratis.org:9999, 
worker5.logservice.ratis.org_9999:worker5.logservice.ratis.org:9999] went down. 
Hence closing the log LogName['josh'] serve by the group.
{noformat}
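
For anyone who wants to script the "leader closed the log" check rather than eyeball {{docker logs}}, here is a minimal sketch in Python. It assumes the WARN message keeps exactly the wording shown in the excerpt above; the regex and the function name {{find_closed_logs}} are illustrative only and are not part of Ratis.

```python
import re

# Pattern derived from the single WARN sample above; the message wording is
# an assumption and may differ between Ratis versions.
CLOSE_RE = re.compile(
    r"Peer (?P<peer>\S+) in the group (?P<group>group-[0-9A-F]+):\[[^\]]*\] "
    r"went down\. Hence closing the log LogName\['(?P<log>[^']+)'\]"
)


def find_closed_logs(log_text):
    """Return (peer, group, log_name) for each 'closing the log' message."""
    # Long log lines may arrive wrapped, so collapse whitespace runs first.
    flat = " ".join(log_text.split())
    return [
        (m.group("peer"), m.group("group"), m.group("log"))
        for m in CLOSE_RE.finditer(flat)
    ]


if __name__ == "__main__":
    import sys

    # Example: docker logs metadata_leader 2>&1 | python find_closed.py
    for peer, group, log in find_closed_logs(sys.stdin.read()):
        print(f"log {log!r} closed after peer {peer} left {group}")
```

Piping the metadata leader's container output into the script lists each log that was closed and the peer whose failure triggered it. An empty result means no matching message was found, which could also mean the message format differs from the sample.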

> Detect node failures and close the log to prevent additional writes
> -------------------------------------------------------------------
>
>                 Key: RATIS-556
>                 URL: https://issues.apache.org/jira/browse/RATIS-556
>             Project: Ratis
>          Issue Type: Improvement
>            Reporter: Rajeshbabu Chintaguntla
>            Assignee: Rajeshbabu Chintaguntla
>            Priority: Major
>         Attachments: RATIS-556-wip.patch, RATIS-556_v1.patch, 
> RATIS-556_v2.patch, RATIS-556_v3.patch, RATIS-556_v4.patch
>
>
> Currently there is no way to detect the node failures at master log servers 
> and add new nodes to the group serving the log. We need to analyze how Ozone 
> is working in this case.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
