[jira] [Work logged] (HDDS-1284) Adjust default values of pipline recovery for more resilient service restart

ASF GitHub Bot (JIRA) Wed, 17 Apr 2019 04:28:26 -0700


     [ 
https://issues.apache.org/jira/browse/HDDS-1284?focusedWorklogId=229026&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229026
 ]


ASF GitHub Bot logged work on HDDS-1284:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Apr/19 11:27
            Start Date: 17/Apr/19 11:27
    Worklog Time Spent: 10m 
      Work Description: hadoop-yetus commented on issue #733: HDDS-1284. Adjust 
default values of pipline recovery for more resilient service restart
URL: https://github.com/apache/hadoop/pull/733#issuecomment-484044451
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 57 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 38 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1186 | trunk passed |
   | +1 | compile | 90 | trunk passed |
   | +1 | checkstyle | 35 | trunk passed |
   | +1 | mvnsite | 152 | trunk passed |
   | +1 | shadedclient | 941 | branch has no errors when building and testing our client artifacts. |
   | +1 | findbugs | 199 | trunk passed |
   | +1 | javadoc | 108 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 11 | Maven dependency ordering for patch |
   | +1 | mvninstall | 133 | the patch passed |
   | +1 | compile | 67 | the patch passed |
   | +1 | javac | 67 | the patch passed |
   | -0 | checkstyle | 25 | hadoop-hdds: The patch generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) |
   | +1 | mvnsite | 110 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | xml | 1 | The patch has no ill-formed XML file. |
   | +1 | shadedclient | 816 | patch has no errors when building and testing our client artifacts. |
   | +1 | findbugs | 215 | the patch passed |
   | +1 | javadoc | 97 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 74 | common in the patch passed. |
   | +1 | unit | 31 | framework in the patch passed. |
   | +1 | unit | 55 | container-service in the patch passed. |
   | -1 | unit | 111 | server-scm in the patch failed. |
   | +1 | asflicense | 27 | The patch does not generate ASF License warnings. |
   | | | 4564 | |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdds.scm.node.TestSCMNodeManager |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-733/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/733 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
   | uname | Linux 270564ca6d65 4.4.0-141-generic #167~14.04.1-Ubuntu SMP Mon Dec 10 13:20:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / d608be6 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | findbugs | v3.1.0-RC1 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-733/3/artifact/out/diff-checkstyle-hadoop-hdds.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-733/3/artifact/out/patch-unit-hadoop-hdds_server-scm.txt |
   |  Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-733/3/testReport/ |
   | Max. process+thread count | 349 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/framework hadoop-hdds/container-service hadoop-hdds/server-scm U: hadoop-hdds |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-733/3/console |
   | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 229026)
    Time Spent: 1h 20m  (was: 1h 10m)

> Adjust default values of pipline recovery for more resilient service restart
> ----------------------------------------------------------------------------
>
>                 Key: HDDS-1284
>                 URL: https://issues.apache.org/jira/browse/HDDS-1284
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Elek, Marton
>            Assignee: Elek, Marton
>            Priority: Critical
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> As of now we have the following algorithm to handle node failures:
> 1. In case of a missing node, the leader of the pipeline or the SCM can 
> detect the missing heartbeats.
> 2. SCM will start to close the pipeline (CLOSING state) and try to close the 
> containers with the remaining nodes in the pipeline.
> 3. After 5 minutes the pipeline will be destroyed (CLOSED) and a new pipeline 
> can be created from the healthy nodes (one node can be part of only one 
> pipeline at a time).
> While this algorithm can work well with a big cluster it doesn't provide very 
> good usability on small clusters:
> Use case1:
> Given 3 nodes, in case of a service restart, if the restart takes more than 
> 90s, the pipeline will be moved to the CLOSING state. For the next 5 minutes 
> (ozone.scm.pipeline.destroy.timeout) the container will remain in the CLOSING 
> state. As there are no more nodes and we can't assign the same node to two 
> different pipelines, the cluster will be unavailable for 5 minutes.
> Use case2:
> Given 90 nodes and 30 pipelines where all the pipelines are spread across 3 
> racks. Let's stop one rack. As all the pipelines are affected, all the 
> pipelines will be moved to the CLOSING state. We have no free nodes, 
> therefore we need to wait for 5 minutes to write any data to the cluster.
> These problems can be solved in multiple ways:
> 1.) Instead of waiting 5 minutes, destroy the pipeline when all the 
> containers are reported to be closed. (Most of the time this is enough, but 
> some container reports can be missing.)
> 2.) Support multi-raft and open a pipeline as soon as we have enough nodes 
> (even if the nodes already have CLOSING pipelines).
> Both options require more work on the pipeline management side. For 0.4.0 
> we can adjust the following parameters to get a better user experience:
> {code}
>   <property>
>     <name>ozone.scm.pipeline.destroy.timeout</name>
>     <value>60s</value>
>     <tag>OZONE, SCM, PIPELINE</tag>
>     <description>
>       Once a pipeline is closed, SCM should wait for the above configured time
>       before destroying a pipeline.
>     </description>
>   </property>
>   <property>
>     <name>ozone.scm.stale.node.interval</name>
>     <value>90s</value>
>     <tag>OZONE, MANAGEMENT</tag>
>     <description>
>       The interval for stale node flagging. Please
>       see ozone.scm.heartbeat.thread.interval before changing this value.
>     </description>
>   </property>
>  {code}
> First of all, we can be more optimistic and mark a node as stale only after 5 
> minutes instead of 90s. 5 minutes should be enough most of the time to 
> recover the nodes.
> Second: we can decrease the value of ozone.scm.pipeline.destroy.timeout. 
> Ideally the close command is sent by the SCM to the datanode with a HB. 
> Between two HBs we have enough time to close all the containers via Ratis. 
> With the next HB, the datanode can report the successful close. (If the 
> containers can't be closed, the SCM can manage the QUASI_CLOSED containers.)
> We need to wait 29 seconds (worst case) for the next HB, and 29+30 seconds 
> for the confirmation. --> 66 seconds seems to be a safe choice (assuming that 
> 6 seconds is enough to process the report about the successful closing)
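
The effect of the two parameters on use case 1 in the description above can be
illustrated with a small model. This is only a rough sketch and not SCM code:
the function name write_outage_seconds and the rule that the nodes stay tied to
the CLOSING pipeline for exactly the destroy timeout are simplifying
assumptions, and the values plugged in (90s and 5 minutes today, 5 minutes and
66s as suggested) are taken from the prose of the description.

```python
def write_outage_seconds(restart_seconds, stale_interval, destroy_timeout):
    """Extra time (in seconds) a 3-node cluster cannot open a new pipeline.

    Hypothetical, simplified model of use case 1:
    - if the restart finishes within the stale interval, the pipeline is
      never moved to CLOSING, so there is no extra outage;
    - otherwise the nodes stay bound to the CLOSING pipeline until the
      destroy timeout expires.
    """
    if restart_seconds <= stale_interval:
        return 0
    return destroy_timeout


# Behaviour described in the issue: stale after 90s, destroy after 5 minutes.
current = write_outage_seconds(120, stale_interval=90, destroy_timeout=300)

# Values suggested in the issue text: stale after 5 minutes, destroy after 66s.
proposed = write_outage_seconds(120, stale_interval=300, destroy_timeout=66)

# A restart that is still too slow only pays the shorter destroy timeout.
slow_restart = write_outage_seconds(400, stale_interval=300, destroy_timeout=66)

print(current, proposed, slow_restart)  # 300 0 66
```

Under these assumptions a two-minute restart costs five extra minutes of
unavailability with the current values and none with the suggested ones.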



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
