[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2024-01-02 Thread Shilun Fan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17801963#comment-17801963
 ] 

Shilun Fan commented on YARN-7592:
--

I will continue to follow up on this issue in the next 1-2 days.

> yarn.federation.failover.enabled missing in yarn-default.xml
> 
>
> Key: YARN-7592
> URL: https://issues.apache.org/jira/browse/YARN-7592
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: federation
>Affects Versions: 3.0.0-beta1
>Reporter: Gera Shegalov
>Priority: Major
> Attachments: IssueReproduce.patch
>
>
> yarn.federation.failover.enabled should be documented in yarn-default.xml. I 
> am also not sure why it should be true by default and force the HA retry 
> policy in {{RMProxy#createRMProxy}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2023-12-22 Thread yanbin.zhang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799746#comment-17799746
 ] 

yanbin.zhang commented on YARN-7592:


[~slfan1989] Thank you for your prompt reply.




[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2023-12-22 Thread Shilun Fan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799732#comment-17799732
 ] 

Shilun Fan commented on YARN-7592:
--

[~it_singer] Thank you for reporting this issue! I will reply as soon as 
possible on how to handle this issue.




[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2023-12-22 Thread yanbin.zhang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799725#comment-17799725
 ] 

yanbin.zhang commented on YARN-7592:


[~slfan1989] Do you have any thoughts on this? This bug does not seem to have 
been resolved yet.




[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2018-11-02 Thread Subru Krishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673832#comment-16673832
 ] 

Subru Krishnan commented on YARN-7592:
--

Thanks [~rahulanand90] for the clarification. Can you update the patch after 
removing the flag (which I should mention is great) and quickly revalidate that 
there's no regression?

+1 from my side pending that.

 




[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2018-11-01 Thread Rahul Anand (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672567#comment-16672567
 ] 

Rahul Anand commented on YARN-7592:
---

Thanks [~bibinchundatt] and [~subru] for the comment.

Yes, this works well in both HA and non-HA scenarios. As for the 
*yarn.federation.failover.enabled* flag, IIUC, we can remove it too.




[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2018-10-09 Thread Subru Krishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16643949#comment-16643949
 ] 

Subru Krishnan commented on YARN-7592:
--

I want to make sure I fully understand the proposal: we will revert the 
changes in {{RMProxy}} and create the {{FederationClientRMProxy}} (I feel 
we can skip "custom") directly if *yarn.federation.enabled* is set?

I like the idea. Can you ensure a couple of things:
 * This works whether HA is enabled or not (for NM, router and AMRMProxy).
 * Assuming the above is true, can we remove the 
*yarn.federation.failover.enabled* flag completely?

 

Thanks for working on this!

 




[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2018-10-08 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641793#comment-16641793
 ] 

Bibin A Chundatt commented on YARN-7592:


+1 for this change.

[~subru] thoughts??




[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2018-09-24 Thread Rahul Anand (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625870#comment-16625870
 ] 

Rahul Anand commented on YARN-7592:
---

Thanks [~bibinchundatt] and [~subru]. 

Removing *yarn.federation.enabled* from yarn-site.xml can solve this issue but 
would definitely create confusion. So, instead of changing or removing a 
meaningful federation flag or updating the docs, an alternative solution is to 
create a {{FederationCustomClientRMProxy}} which can override the 
{{ClientRMProxy#createRMProxy}} call in {{AMRMClientUtils}} to always select 
{{FederationRMFailoverProxyProvider}} as the *proxy provider* for federation.
{code:java}
public static <T> T createRMProxy(final Configuration configuration,
    final Class<T> protocol, UserGroupInformation user,
    final Token<? extends TokenIdentifier> token) throws IOException {
  ...
      return FederationCustomClientRMProxy.createRMProxy(configuration,
          protocol);
  ...
}
{code}
After this, we can remove the {{isFederationEnabled}} check from 
{{RMProxy.java}}, so that it reads as it did before: 
{code:java}
protected static <T> T createRMProxy(final Configuration configuration,
    final Class<T> protocol, RMProxy<T> instance) throws IOException {
  ...
  RetryPolicy retryPolicy = createRetryPolicy(conf,
      HAUtil.isHAEnabled(conf));
  ...
}
{code}
{code:java}
protected static <T> T createRMProxy(final Configuration configuration,
    final Class<T> protocol, RMProxy<T> instance, final long retryTime,
    final long retryInterval) throws IOException {
  ...
  RetryPolicy retryPolicy = createRetryPolicy(conf, retryTime, retryInterval,
      HAUtil.isHAEnabled(conf));
  ...
}
{code}
With this change, we don't need to separately specify the *proxy provider* for 
HA and non-HA scenarios.
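
A minimal, self-contained sketch of the idea, assuming nothing beyond what 
the comment describes: a federation-specific proxy class overrides provider 
selection so that the configured value no longer matters. {{GenericProxy}}, 
{{FederationProxy}} and {{providerFor}} are hypothetical stand-ins and not 
Hadoop classes; a plain {{Map}} stands in for the YARN {{Configuration}}. 
Only the property key and the two provider class names are taken from this 
thread.

```java
import java.util.HashMap;
import java.util.Map;

public class ProviderSelectionSketch {
    static final String PROVIDER_KEY = "yarn.client.failover-proxy-provider";
    static final String CONFIGURED =
        "org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider";
    static final String FEDERATION =
        "org.apache.hadoop.yarn.server.federation.failover.FederationRMFailoverProxyProvider";

    /** Hypothetical stand-in for the generic proxy: provider comes from the conf. */
    static class GenericProxy {
        String providerFor(Map<String, String> conf) {
            // Falls back to the configured-RM provider when the key is unset.
            return conf.getOrDefault(PROVIDER_KEY, CONFIGURED);
        }
    }

    /** Hypothetical stand-in for the proposed federation-specific proxy. */
    static class FederationProxy extends GenericProxy {
        @Override
        String providerFor(Map<String, String> conf) {
            // Ignores the conf entirely: federation always uses its own provider.
            return FEDERATION;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>(); // provider key left unset
        System.out.println("generic:    " + new GenericProxy().providerFor(conf));
        System.out.println("federation: " + new FederationProxy().providerFor(conf));
    }
}
```

In this model the federation path never depends on what yarn-site sets for 
the provider key, which is the property the proposal is after for both HA 
and non-HA clusters.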




[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2018-09-13 Thread Subru Krishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614143#comment-16614143
 ] 

Subru Krishnan commented on YARN-7592:
--

Thanks [~jira.shegalov] for raising this and [~bibinchundatt] and 
[~rahulanand90] for the detailed analysis.

 

[~bibinchundatt], I agree that this is related to YARN-8434. Looks like in our 
test setup, we specify {{FederationRMFailoverProxyProvider}}  for non-HA setup 
and ConfiguredRMFailoverProxyProvider for HA setup in yarn-site.

Before we change the Server/Client proxies, is it possible to remove the 
*yarn.federation.enabled* flag from yarn-site and check? After (re)looking at 
the code, it may not be necessary in NMs (only in RMs).




[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2018-09-13 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613828#comment-16613828
 ] 

Bibin A Chundatt commented on YARN-7592:


Thank you [~rahulanand90] for the detailed analysis.

[~subru] It seems that in the single-RM case the 
{{FederationRMFailoverProxyProvider}} configuration works for 
{{ResourceTracker}}, but it fails in the case of an *RM HA* cluster.

As discussed in YARN-8434, either separate configurations are required for 
ServerProxy and ClientProxy, or federationUtils should use an extended 
ClientRMProxy.






[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2018-09-07 Thread Rahul Anand (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606769#comment-16606769
 ] 

Rahul Anand commented on YARN-7592:
---

As per my understanding, for a non-HA setup with the default configuration, 
this will always create a problem. My analysis is listed below.

NodeManager registration starts from {{NodeManager#main}} and eventually invokes 
{{NodeStatusUpdaterImpl#serviceStart}}: 
{code:java}
protected void serviceStart() throws Exception {
  ...
  try {
    ...
    this.resourceTracker = getRMClient();
    ...
  } catch (Exception e) {
    String errorMessage = "Unexpected error starting NodeStatusUpdater";
    LOG.error(errorMessage, e);
    throw new YarnRuntimeException(e);
  }
}
{code}
Then, NodeStatusUpdaterImpl#getRMClient tries to create RM proxy for resource 
tracker protocol. Now, the Federation enabled check in RMProxy#newProxyInstance 
{code:java}
if (HAUtil.isHAEnabled(conf) || HAUtil.isFederationEnabled(conf)) {
  RMFailoverProxyProvider<T> provider =
      instance.createRMFailoverProxyProvider(conf, protocol);
  ...
}
{code}
is failing the registration of the nodemanager. By default, 
RMProxy#createRMFailoverProxyProvider will always select 
ConfiguredRMFailoverProxyProvider 
{code:java}
RMFailoverProxyProvider<T> provider = ReflectionUtils.newInstance(
    conf.getClass(YarnConfiguration.CLIENT_FAILOVER_PROXY_PROVIDER,
        defaultProviderClass, RMFailoverProxyProvider.class), conf);
provider.init(conf, (RMProxy<T>) this, protocol);
{code}
and eventually, it will try to get RM's id from 
ConfiguredRMFailoverProxyProvider#init
{code:java}
Collection<String> rmIds = HAUtil.getRMHAIds(conf);
{code}
which would have been set only in the case of an HA setup, according to 
ResourceManager#serviceInit:
{code:java}
this.rmContext.setHAEnabled(HAUtil.isHAEnabled(this.conf));
if (this.rmContext.isHAEnabled()) {
  HAUtil.verifyAndSetConfiguration(this.conf);
}
{code}
 

When I ran with the proxy provider set to FederationRMFailoverProxyProvider, 
the NodeManager started, but ideally this would work only in the case of a 
single RM. 
{code:xml}
<property>
  <name>yarn.client.failover-proxy-provider</name>
  <value>org.apache.hadoop.yarn.server.federation.failover.FederationRMFailoverProxyProvider</value>
</property>
{code}
Please correct me if I am wrong at any point. 
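
The branch at the centre of this analysis can be modeled in a few lines. The 
sketch below is a simplified stand-in and not the Hadoop code: 
{{ProxyPathSketch}}, {{Path}}, {{currentCheck}} and {{haOnlyCheck}} are 
hypothetical names, and the booleans stand for the results of 
{{HAUtil.isHAEnabled(conf)}} and {{HAUtil.isFederationEnabled(conf)}}. It 
only illustrates why a non-HA, federation-enabled NodeManager ends up on the 
failover-provider path.

```java
public class ProxyPathSketch {
    /** The two ways the quoted newProxyInstance branch can build a proxy. */
    enum Path { FAILOVER_PROVIDER, DIRECT_RM_ADDRESS }

    /** Mirrors the quoted condition: HA or federation selects the failover provider. */
    static Path currentCheck(boolean haEnabled, boolean federationEnabled) {
        return (haEnabled || federationEnabled)
            ? Path.FAILOVER_PROVIDER
            : Path.DIRECT_RM_ADDRESS;
    }

    /** Hypothetical variant that consults HA only; federation plays no part. */
    static Path haOnlyCheck(boolean haEnabled) {
        return haEnabled ? Path.FAILOVER_PROVIDER : Path.DIRECT_RM_ADDRESS;
    }

    public static void main(String[] args) {
        // Non-HA cluster with federation enabled, as in the failing setup.
        System.out.println("current check: " + currentCheck(false, true));
        System.out.println("HA-only check: " + haOnlyCheck(false));
    }
}
```

With the combined condition, the non-HA case takes the failover-provider 
path, where the default provider then looks for RM ids that were never 
configured.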

 




[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2018-09-05 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605292#comment-16605292
 ] 

Bibin A Chundatt commented on YARN-7592:


Thank you [~subru] for the comment.

The issue is in the registration of the NodeManager. The NodeManager is not 
able to start.
{code}
2018-09-06 11:09:16,276 INFO  [main] service.AbstractService (AbstractService.java:noteFailure(267)) - Service org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl failed in state STARTED
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.ArrayIndexOutOfBoundsException: 0
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:263)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.yarn.server.TestFederationCluster.testNonHANodeManagerRegistration(TestFederationCluster.java:53)
    ...
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
    at org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:62)
    at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:174)
    at org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:129)
    at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:121)
    at org.apache.hadoop.yarn.server.api.ServerRMProxy.createRMProxy(ServerRMProxy.java:74)
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.getRMClient(NodeStatusUpdaterImpl.java:346)
    at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:256)
    ... 26 more
{code}
Attaching a patch to reproduce the issue.

Currently I haven't added FederationInterceptor to the NodeManager 
configuration. 




[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2018-09-05 Thread Subru Krishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605092#comment-16605092
 ] 

Subru Krishnan commented on YARN-7592:
--

[~bibinchundatt]/[~jira.shegalov], I have tested multiple times with a similar 
setup (for 2.9 release) and never faced any issues.

 

FYI the FEDERATION_FAILOVER_ENABLED is automatically set by 
{{FederationProxyProviderUtil}} if HA is enabled as you can see 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/failover/FederationProxyProviderUtil.java#L128].
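
As a rough illustration of the behaviour described here, and only of that 
one behaviour, the sketch below turns the failover flag on when HA is 
enabled. It is not the code of {{FederationProxyProviderUtil}}: 
{{FailoverFlagSketch}} and {{updateForFederation}} are hypothetical names, 
{{java.util.Properties}} stands in for the YARN {{Configuration}}, and the 
real utility may well change other settings too.

```java
import java.util.Properties;

public class FailoverFlagSketch {
    static final String HA_ENABLED = "yarn.resourcemanager.ha.enabled";
    static final String FEDERATION_FAILOVER_ENABLED =
        "yarn.federation.failover.enabled";

    /** Hypothetical helper: enable federation failover only when HA is enabled. */
    static void updateForFederation(Properties conf) {
        boolean haEnabled =
            Boolean.parseBoolean(conf.getProperty(HA_ENABLED, "false"));
        if (haEnabled) {
            conf.setProperty(FEDERATION_FAILOVER_ENABLED, "true");
        }
        // When HA is off, the flag is left exactly as the caller supplied it.
    }

    public static void main(String[] args) {
        Properties haConf = new Properties();
        haConf.setProperty(HA_ENABLED, "true");
        updateForFederation(haConf);
        System.out.println("HA conf:     "
            + haConf.getProperty(FEDERATION_FAILOVER_ENABLED));

        Properties nonHaConf = new Properties();
        updateForFederation(nonHaConf);
        System.out.println("non-HA conf: "
            + nonHaConf.getProperty(FEDERATION_FAILOVER_ENABLED));
    }
}
```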




[jira] [Commented] (YARN-7592) yarn.federation.failover.enabled missing in yarn-default.xml

2018-09-05 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604773#comment-16604773
 ] 

Bibin A Chundatt commented on YARN-7592:


[~jira.shegalov]

The following is my understanding based on the discussion in YARN-8434.

As per the 
[comment|https://issues.apache.org/jira/browse/YARN-8434?focusedCommentId=16539415=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16539415]
 from [~subru], FederationRMFailoverProxyProvider is internally set for 
connection retry handling.


IIUC, the federation check in {{RMProxy#createRMProxy}} is not required. Also, 
the following code in {{RMProxy#newProxyInstance}} seems to have an issue: 

{code}
private static <T> T newProxyInstance(final YarnConfiguration conf,
    final Class<T> protocol, RMProxy<T> instance, RetryPolicy retryPolicy)
    throws IOException {
  if (HAUtil.isHAEnabled(conf) || HAUtil.isFederationEnabled(conf)) {
    RMFailoverProxyProvider<T> provider =
        instance.createRMFailoverProxyProvider(conf, protocol);
    return (T) RetryProxy.create(protocol, provider, retryPolicy);
  } else {
    InetSocketAddress rmAddress = instance.getRMAddress(conf, protocol);
    LOG.info("Connecting to ResourceManager at " + rmAddress);
    T proxy = instance.getProxy(conf, protocol, rmAddress);
    return (T) RetryProxy.create(protocol, proxy, retryPolicy);
  }
}
{code}

Topology: Router + 1 RM (non-HA) - 2 NM, with federation enabled.
{{ConfiguredRMFailoverProxyProvider}} gets initialized as the failover 
provider for ServerProxy and fails to connect to the RM. The exception is 
thrown at:
{code}
 this.rmServiceIds = rmIds.toArray(new String[rmIds.size()]);
conf.set(YarnConfiguration.RM_HA_ID, rmServiceIds[currentProxyIndex]);
{code}
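
The failure mode of these two statements can be reproduced without Hadoop. 
The sketch below assumes that the collection of RM ids is empty when HA is 
not configured; {{EmptyRmIdsDemo}} and {{firstRmId}} are hypothetical names 
used only for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;

public class EmptyRmIdsDemo {
    /** Mimics the quoted statements: copy the ids to an array, then read one. */
    static String firstRmId(Collection<String> rmIds) {
        String[] rmServiceIds = rmIds.toArray(new String[rmIds.size()]);
        int currentProxyIndex = 0;
        // With an empty collection the array has length 0, so this index is invalid.
        return rmServiceIds[currentProxyIndex];
    }

    public static void main(String[] args) {
        Collection<String> rmIds = new ArrayList<>(); // non-HA: no RM ids set
        try {
            firstRmId(rmIds);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("Lookup failed: " + e);
        }
    }
}
```

An empty id list is enough to trigger the exception, independent of any 
network or RM state.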

cc:/ [~subru]




