[jira] [Comment Edited] (YARN-5601) Make the RM epoch base value configurable

Subru Krishnan (JIRA) Wed, 31 Aug 2016 18:43:34 -0700

    [ 
https://issues.apache.org/jira/browse/YARN-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453991#comment-15453991
 ]


Subru Krishnan edited comment on YARN-5601 at 9/1/16 1:43 AM:
--------------------------------------------------------------

[~jianhe], to answer your question, let me start with why we need the epoch in 
a federated cluster.
Currently only a single RM generates containerIDs (applicationID + a sequence 
number), but in a federated cluster multiple RMs generate them concurrently. 
So there will be conflicts if an application spans multiple sub-clusters. To 
avoid these conflicts, we use the epoch in a federated cluster, similar to how 
it is used to prevent conflicts across work-preserving restarts.
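
To illustrate the conflict, here is a minimal sketch in Java. The class and method names are hypothetical and not taken from the YARN code base, and the 40-bit sequence width is an assumption made only so the epoch and the sequence number can be packed into one long.

```java
// Hypothetical sketch; names are illustrative, not YARN's actual classes.
final class ContainerIdCollisionSketch {
    // Assumption: the sequence number takes the low 40 bits, the epoch sits above it.
    static final int SEQUENCE_BITS = 40;

    /** Packs an epoch and a per-application sequence number into one long. */
    static long containerId(long epoch, long sequence) {
        return (epoch << SEQUENCE_BITS) | sequence;
    }

    public static void main(String[] args) {
        // Two RMs, both starting at epoch 0, each issue sequence number 1
        // for the same application: the resulting ids are identical.
        long fromRm1 = containerId(0, 1);
        long fromRm2 = containerId(0, 1);
        System.out.println("same epoch, ids equal: " + (fromRm1 == fromRm2));

        // Giving the second RM a different epoch makes the ids distinct.
        long fromRm2Offset = containerId(10000, 1);
        System.out.println("distinct epoch, ids equal: " + (fromRm1 == fromRm2Offset));
    }
}
```

The point of the sketch is only that the sequence number alone cannot distinguish containers issued by different RMs for the same application, while a distinct epoch per RM can.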

The idea is to set the epoch to 0 for the first sub-cluster RM, 10000 for the 
second, 20000 for the third, and so on. This should be sufficient: epochs are 
represented as a 20-bit integer, so roughly 1M values are available. 
With this, containerIDs will conflict only if *all* of the conditions below 
are satisfied: 
  # The RM of sub-cluster 1 is rebooted over 10000 times 
  # There is an app that is still running (throughout the 10k+ reboots of one 
of the RMs)
  # The app runs across sub-cluster 1 and sub-cluster 2
  # The app is still holding onto containers from sub-cluster 2 issued from 
the first reboot of that sub-cluster
  # The containers have IDs low enough that the newly issued containers from 
RM1 clash with them
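
The arithmetic behind this scheme can be sketched as follows. This is a rough illustration, not the patch itself: the class and method names are hypothetical, the 20-bit width and the 10000 spacing are the figures from this comment, and the sketch assumes the epoch goes up by one on each RM restart.

```java
// Hypothetical sketch of the epoch-base arithmetic; not YARN's actual code.
final class EpochBaseSketch {
    static final int EPOCH_BITS = 20;   // width stated in this comment
    static final int STRIDE = 10000;    // spacing between sub-cluster bases

    /** Epoch base for the sub-cluster at the given zero-based index. */
    static int baseFor(int subClusterIndex) {
        return subClusterIndex * STRIDE;
    }

    /** Number of sub-clusters whose bases fit in the epoch space. */
    static int maxSubClusters() {
        return (1 << EPOCH_BITS) / STRIDE;
    }

    /** Epoch of a sub-cluster RM after a number of restarts (assumed +1 each). */
    static int epochAfterRestarts(int subClusterIndex, int restarts) {
        return baseFor(subClusterIndex) + restarts;
    }

    public static void main(String[] args) {
        System.out.println("base of third sub-cluster: " + baseFor(2));
        System.out.println("sub-clusters that fit: " + maxSubClusters());
        // The first RM only reaches the second RM's base after STRIDE restarts.
        System.out.println("overlap after 10000 restarts: "
                + (epochAfterRestarts(0, STRIDE) == baseFor(1)));
    }
}
```

Under these assumptions the first RM's epoch only reaches the second RM's base after 10000 restarts, which is the first condition in the list above.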
 
Makes sense?



> Make the RM epoch base value configurable
> -----------------------------------------
>
>                 Key: YARN-5601
>                 URL: https://issues.apache.org/jira/browse/YARN-5601
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Subru Krishnan
>            Assignee: Subru Krishnan
>         Attachments: YARN-5601-YARN-2915-v1.patch
>
>
> Currently the epoch always starts from zero. This can cause container ids to 
> conflict for an application under Federation that spans multiple RMs 
> concurrently. This JIRA proposes to make the RM epoch base value configurable 
> which will allow us to avoid conflicts by setting different values for each 
> RM.



