[jira] [Updated] (YARN-3671) Integrate Federation services with ResourceManager

2016-08-29 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-3671:
-
Attachment: YARN-3671-YARN-2915-v5.patch

Attaching v5, which fixes the checkstyle warning that can be addressed.

> Integrate Federation services with ResourceManager
> --
>
> Key: YARN-3671
> URL: https://issues.apache.org/jira/browse/YARN-3671
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Attachments: YARN-3671-YARN-2915-v1.patch, 
> YARN-3671-YARN-2915-v2.patch, YARN-3671-YARN-2915-v3.patch, 
> YARN-3671-YARN-2915-v4.patch, YARN-3671-YARN-2915-v5.patch
>
>
> This JIRA proposes adding the ability to turn on Federation services such as 
> the StateStore and the cluster membership heartbeat in the RM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3671) Integrate Federation services with ResourceManager

2016-08-29 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-3671:
-
Attachment: YARN-3671-YARN-2915-v4.patch

bq. Here, the scheduler is passed in by reference. If the RM switches to 
standby and then switches back to active, the scheduler object will be 
re-created and the reference will be updated in RMContext, but the reference 
here won't be updated, which would leave the scheduler object here outdated? 

Good question [~jianhe]. The scheduler object will *not* be outdated, because 
{{FederationStateStoreService}} is an active service and the 
{{FederationStateStoreHeartbeat}} is initialized in its _serviceStart_, which 
reads the current scheduler from the {{RMContext}}:
{code}
stateStoreHeartbeat = new FederationStateStoreHeartbeat(subClusterId,
    stateStoreClient, rmContext.getScheduler());
{code}

To double-check, I have updated {{TestFederationRMStateStoreService}} with 
an explicit transition to standby and then back to active.
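
The argument above can be illustrated with a minimal, self-contained sketch. All class names below are hypothetical stand-ins rather than the real YARN types, and the real RM lifecycle is more involved; the sketch only models the point that a heartbeat built inside the service's start method picks up whatever scheduler the context holds at that moment.

```java
/** Hypothetical stand-ins for the RM classes; none of these are real YARN types. */
final class ActiveServiceRestartSketch {

  /** Stand-in for a scheduler; instances are compared by identity only. */
  static final class Scheduler { }

  /** Stand-in for RMContext: holds the reference to the current scheduler. */
  static final class Context {
    private Scheduler scheduler;
    Scheduler getScheduler() { return scheduler; }
    void setScheduler(Scheduler s) { scheduler = s; }
  }

  /** Stand-in for the heartbeat: keeps the scheduler it was constructed with. */
  static final class Heartbeat {
    final Scheduler scheduler;
    Heartbeat(Scheduler scheduler) { this.scheduler = scheduler; }
  }

  /** Stand-in for the active service: builds its heartbeat on start, not in the constructor. */
  static final class StateStoreService {
    private final Context context;
    private Heartbeat heartbeat;
    StateStoreService(Context context) { this.context = context; }
    void serviceStart() { heartbeat = new Heartbeat(context.getScheduler()); }
    void serviceStop() { heartbeat = null; }
    Heartbeat getHeartbeat() { return heartbeat; }
  }

  /** Simulates a transition to active: a new scheduler is published, then the service starts. */
  static void transitionToActive(Context context, StateStoreService service) {
    context.setScheduler(new Scheduler());
    service.serviceStart();
  }

  /** Simulates a transition to standby: the service stops and drops its heartbeat. */
  static void transitionToStandby(StateStoreService service) {
    service.serviceStop();
  }

  public static void main(String[] args) {
    Context context = new Context();
    StateStoreService service = new StateStoreService(context);

    transitionToActive(context, service);
    Scheduler first = service.getHeartbeat().scheduler;

    transitionToStandby(service);
    transitionToActive(context, service);
    Scheduler second = service.getHeartbeat().scheduler;

    System.out.println("fresh scheduler after failover: " + (first != second));
    System.out.println("matches context: " + (second == context.getScheduler()));
  }
}
```

Had the heartbeat been created in the constructor instead, it would keep the scheduler from the first activation, which is the scenario the review comment was worried about.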

I have also renamed "state-store.heartbeat-interval" --> 
"state-store.heartbeat-interval-secs" in v4 of the patch as you suggested.







[jira] [Updated] (YARN-3671) Integrate Federation services with ResourceManager

2016-08-26 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-3671:
-
Attachment: YARN-3671-YARN-2915-v3.patch

[~jianhe], I looked into it, and it seems we should be able to reuse 
RM_CLUSTER_ID safely as you suggested. So I have replaced 
FEDERATION_SUBCLUSTER_ID with RM_CLUSTER_ID in v3 of the patch.
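
As a rough illustration of what reusing the cluster id could look like, the sketch below derives the sub-cluster id from the RM cluster id property. It uses {{java.util.Properties}} in place of the Hadoop configuration classes; the helper name and the fail-fast behaviour on a missing value are assumptions, not taken from the patch.

```java
import java.util.Properties;

/** Hypothetical helper; uses java.util.Properties instead of the Hadoop Configuration class. */
final class SubClusterIdSketch {

  /** Property key behind YarnConfiguration.RM_CLUSTER_ID. */
  static final String RM_CLUSTER_ID = "yarn.resourcemanager.cluster-id";

  /**
   * Returns the sub-cluster id, taken directly from the RM cluster id.
   * Failing fast on a missing value is an assumption made for this sketch.
   */
  static String resolveSubClusterId(Properties conf) {
    String id = conf.getProperty(RM_CLUSTER_ID);
    if (id == null || id.trim().isEmpty()) {
      throw new IllegalStateException(RM_CLUSTER_ID + " must be set to derive the sub-cluster id");
    }
    return id.trim();
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty(RM_CLUSTER_ID, "subcluster-1");
    System.out.println(resolveSubClusterId(conf));
  }
}
```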







[jira] [Updated] (YARN-3671) Integrate Federation services with ResourceManager

2016-08-25 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-3671:
-
Attachment: YARN-3671-YARN-2915-v2.patch

Thanks [~jianhe] for the feedback. Updated patch (v2) to remove redundant null 
check and refactor _setStateStoreClient_ as suggested by you.

As to your other questions:

bq. we already have RM_CLUSTER_ID, any chance that this can be used for 
FEDERATION_SUBCLUSTER_ID ?

That's a possibility. The reason I didn't combine them is that RM_CLUSTER_ID is 
currently used for HA, but Federation can work both with and without HA (and RM 
HA can work both with and without Federation). So I felt it would be better to 
keep them separate. Thoughts?

bq. I feel the SubClusterState is a bit redundant in the request object, 
because the API itself already indicates the state such as register / 
deregister.

You are right. We don't want the state to be null in the store, so either the 
store impl can implicitly add SC_NEW/SC_UNREGISTERED on register / deregister, 
or the invoker (which is always the RM) can. I decided to do it in the RM for 2 
reasons:
  1. It is trivial (one line) & needs to be done in a single place (the RM) 
instead of in each store impl we add.
  2. This allows for flexibility in the future, as the RM could potentially 
register / deregister with different states (say SC_DRAINING).

Makes sense?
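
The trade-off described above can be sketched as follows. The request, store and client classes are hypothetical and much simpler than the real Federation API; only the state names SC_NEW, SC_UNREGISTERED and SC_DRAINING come from the discussion. The point is that the store persists whatever state it is handed, and the invoker chooses that state in one place.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch; the real Federation request and store types differ. */
final class RegisterStateSketch {

  /** States named in the discussion; the real enum may contain others. */
  enum SubClusterState { SC_NEW, SC_DRAINING, SC_UNREGISTERED }

  /** Request object: the state is mandatory, so the store never sees null. */
  static final class StateRequest {
    final String subClusterId;
    final SubClusterState state;
    StateRequest(String subClusterId, SubClusterState state) {
      this.subClusterId = Objects.requireNonNull(subClusterId, "subClusterId");
      this.state = Objects.requireNonNull(state, "state");
    }
  }

  /** In-memory store: persists the supplied state without adding any logic of its own. */
  static final class InMemoryStore {
    private final Map<String, SubClusterState> states = new HashMap<>();
    void register(StateRequest request) { states.put(request.subClusterId, request.state); }
    void deregister(StateRequest request) { states.put(request.subClusterId, request.state); }
    SubClusterState get(String subClusterId) { return states.get(subClusterId); }
  }

  /** The invoker (standing in for the RM) picks the state in a single place. */
  static final class RmClient {
    private final String subClusterId;
    private final InMemoryStore store;
    RmClient(String subClusterId, InMemoryStore store) {
      this.subClusterId = subClusterId;
      this.store = store;
    }
    void register() {
      store.register(new StateRequest(subClusterId, SubClusterState.SC_NEW));
    }
    void deregister() {
      deregister(SubClusterState.SC_UNREGISTERED);
    }
    /** Leaves room for other final states, for example SC_DRAINING. */
    void deregister(SubClusterState finalState) {
      store.deregister(new StateRequest(subClusterId, finalState));
    }
  }

  public static void main(String[] args) {
    InMemoryStore store = new InMemoryStore();
    RmClient rm = new RmClient("sc-1", store);
    rm.register();
    System.out.println("after register: " + store.get("sc-1"));
    rm.deregister(SubClusterState.SC_DRAINING);
    System.out.println("after deregister: " + store.get("sc-1"));
  }
}
```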







[jira] [Updated] (YARN-3671) Integrate Federation services with ResourceManager

2016-08-24 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-3671:
-
Attachment: YARN-3671-YARN-2915-v1.patch

Attaching a patch that adds a new _RMActiveService_, 
{{FederationStateStoreService}}, that is optionally turned on by a boolean 
*FEDERATION_ENABLED* flag. When enabled, it starts a periodic heartbeat 
({{FederationStateStoreHeartbeat}}) to the _FederationStateStore_.
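
For readers unfamiliar with the pattern, here is a minimal sketch of a service that starts a periodic heartbeat only when a boolean flag is set. It is not the code in the patch: the class names, the store interface and the interval parameter are assumptions, and it relies on the JDK's {{ScheduledExecutorService}} rather than the Hadoop service framework.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a flag-controlled heartbeat service; not the patch's implementation. */
final class OptionalHeartbeatSketch {

  /** Hypothetical store interface that accepts heartbeats from a sub-cluster. */
  interface StateStore {
    void heartbeat(String subClusterId);
  }

  /** Test-friendly store that only counts the heartbeats it receives. */
  static final class CountingStore implements StateStore {
    final AtomicInteger beats = new AtomicInteger();
    @Override
    public void heartbeat(String subClusterId) { beats.incrementAndGet(); }
  }

  static final class StateStoreService {
    private final boolean federationEnabled;
    private final long intervalSecs;
    private final Runnable heartbeat;
    private ScheduledExecutorService executor;

    StateStoreService(boolean federationEnabled, String subClusterId,
        StateStore store, long intervalSecs) {
      if (intervalSecs <= 0) {
        throw new IllegalArgumentException("intervalSecs must be positive");
      }
      this.federationEnabled = federationEnabled;
      this.intervalSecs = intervalSecs;
      this.heartbeat = () -> store.heartbeat(subClusterId);
    }

    /** Exposes the heartbeat task so it can be triggered synchronously. */
    Runnable heartbeatTask() { return heartbeat; }

    boolean isRunning() { return executor != null; }

    /** Schedules the heartbeat only when the flag is on; otherwise does nothing. */
    void start() {
      if (!federationEnabled || executor != null) {
        return;
      }
      executor = Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "federation-heartbeat");
        t.setDaemon(true); // do not keep the JVM alive for the heartbeat
        return t;
      });
      executor.scheduleWithFixedDelay(heartbeat, intervalSecs, intervalSecs, TimeUnit.SECONDS);
    }

    /** Cancels the periodic heartbeat if it was started. */
    void stop() {
      if (executor != null) {
        executor.shutdownNow();
        executor = null;
      }
    }
  }

  public static void main(String[] args) {
    CountingStore store = new CountingStore();
    StateStoreService service = new StateStoreService(true, "sc-1", store, 300);
    service.start();
    service.heartbeatTask().run(); // trigger one heartbeat by hand
    System.out.println("heartbeats sent: " + store.beats.get());
    service.stop();
  }
}
```

With the flag off, {{start()}} is a no-op, so a non-federated RM pays no cost for the feature.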



