[jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

2019-10-14 Thread Wang, Xinglong (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951623#comment-16951623
 ] 

Wang, Xinglong commented on YARN-5718:
--

I went through hdfs code, and also found the issue is only with No-HA hdfs 
setup. The original description is not correct.

As the following code, only in Non_HA case, retry config will be used. In HA 
case, RetryPolicies.failoverOnNetworkException will be used.


{code:java}
public class NameNodeProxies {
public static  ProxyAndInfo createProxy(Configuration conf,
  URI nameNodeUri, Class xface, AtomicBoolean fallbackToSimpleAuth)
  throws IOException {
AbstractNNFailoverProxyProvider failoverProxyProvider =
createFailoverProxyProvider(conf, nameNodeUri, xface, true,
  fallbackToSimpleAuth);
  
if (failoverProxyProvider == null) {
  // Non-HA case
  return createNonHAProxy(conf, NameNode.getAddress(conf, nameNodeUri),
  xface, UserGroupInformation.getCurrentUser(), true,
  fallbackToSimpleAuth);
} else {
  // HA case
  Conf config = new Conf(conf);
  T proxy = (T) RetryProxy.create(xface, failoverProxyProvider,
  RetryPolicies.failoverOnNetworkException(
  RetryPolicies.TRY_ONCE_THEN_FAIL, config.maxFailoverAttempts,
  config.maxRetryAttempts, config.failoverSleepBaseMillis,
  config.failoverSleepMaxMillis));
{code}


> TimelineClient (and other places in YARN) shouldn't over-write HDFS client 
> retry settings which could cause unexpected behavior
> ---
>
> Key: YARN-5718
> URL: https://issues.apache.org/jira/browse/YARN-5718
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, timelineclient
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Major
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5718-v2.1.patch, YARN-5718-v2.patch, YARN-5718.patch
>
>
> In one HA cluster, after NN failed over, we noticed that job is getting 
> failed as TimelineClient failed to retry connection to proper NN. This is 
> because we are overwrite hdfs client settings that hard code retry policy to 
> be enabled that conflict NN failed-over case - hdfs client should fail fast 
> so can retry on another NN.
> We shouldn't assume any retry policy for hdfs client at all places in YARN. 
> This should keep consistent with HDFS settings that has different retry 
> polices in different deployment case. Thus, we should clean up these hard 
> code settings in YARN, include: FileSystemTimelineWriter, 
> FileSystemRMStateStore and FileSystemNodeLabelsStore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

2016-10-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586222#comment-15586222
 ] 

Hudson commented on YARN-5718:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10631 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10631/])
YARN-5718. TimelineClient (and other places in YARN) shouldn't (xgong: rev 
b733a6f86262522e535cebc972baecbe6a6eab50)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/FileSystemNodeLabelsStore.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/FileSystemTimelineWriter.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java


> TimelineClient (and other places in YARN) shouldn't over-write HDFS client 
> retry settings which could cause unexpected behavior
> ---
>
> Key: YARN-5718
> URL: https://issues.apache.org/jira/browse/YARN-5718
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, timelineclient
>Reporter: Junping Du
>Assignee: Junping Du
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5718-v2.1.patch, YARN-5718-v2.patch, YARN-5718.patch
>
>
> In one HA cluster, after NN failed over, we noticed that job is getting 
> failed as TimelineClient failed to retry connection to proper NN. This is 
> because we are overwrite hdfs client settings that hard code retry policy to 
> be enabled that conflict NN failed-over case - hdfs client should fail fast 
> so can retry on another NN.
> We shouldn't assume any retry policy for hdfs client at all places in YARN. 
> This should keep consistent with HDFS settings that has different retry 
> polices in different deployment case. Thus, we should clean up these hard 
> code settings in YARN, include: FileSystemTimelineWriter, 
> FileSystemRMStateStore and FileSystemNodeLabelsStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

2016-10-18 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586188#comment-15586188
 ] 

Xuan Gong commented on YARN-5718:
-

Committed into trunk. Let us continue our discuss on YARN-5748

> TimelineClient (and other places in YARN) shouldn't over-write HDFS client 
> retry settings which could cause unexpected behavior
> ---
>
> Key: YARN-5718
> URL: https://issues.apache.org/jira/browse/YARN-5718
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, timelineclient
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-5718-v2.1.patch, YARN-5718-v2.patch, YARN-5718.patch
>
>
> In one HA cluster, after NN failed over, we noticed that job is getting 
> failed as TimelineClient failed to retry connection to proper NN. This is 
> because we are overwrite hdfs client settings that hard code retry policy to 
> be enabled that conflict NN failed-over case - hdfs client should fail fast 
> so can retry on another NN.
> We shouldn't assume any retry policy for hdfs client at all places in YARN. 
> This should keep consistent with HDFS settings that has different retry 
> polices in different deployment case. Thus, we should clean up these hard 
> code settings in YARN, include: FileSystemTimelineWriter, 
> FileSystemRMStateStore and FileSystemNodeLabelsStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

2016-10-17 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584116#comment-15584116
 ] 

Junping Du commented on YARN-5718:
--

Just filed YARN-5748. We have have more discussion there.

> TimelineClient (and other places in YARN) shouldn't over-write HDFS client 
> retry settings which could cause unexpected behavior
> ---
>
> Key: YARN-5718
> URL: https://issues.apache.org/jira/browse/YARN-5718
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, timelineclient
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-5718-v2.1.patch, YARN-5718-v2.patch, YARN-5718.patch
>
>
> In one HA cluster, after NN failed over, we noticed that job is getting 
> failed as TimelineClient failed to retry connection to proper NN. This is 
> because we are overwrite hdfs client settings that hard code retry policy to 
> be enabled that conflict NN failed-over case - hdfs client should fail fast 
> so can retry on another NN.
> We shouldn't assume any retry policy for hdfs client at all places in YARN. 
> This should keep consistent with HDFS settings that has different retry 
> polices in different deployment case. Thus, we should clean up these hard 
> code settings in YARN, include: FileSystemTimelineWriter, 
> FileSystemRMStateStore and FileSystemNodeLabelsStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

2016-10-17 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584084#comment-15584084
 ] 

Junping Du commented on YARN-5718:
--

Thanks [~xgong] for review and comments. File a separate one for 
branch-2/branch-2.8 sounds good to me. Will do it soon.

> TimelineClient (and other places in YARN) shouldn't over-write HDFS client 
> retry settings which could cause unexpected behavior
> ---
>
> Key: YARN-5718
> URL: https://issues.apache.org/jira/browse/YARN-5718
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, timelineclient
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-5718-v2.1.patch, YARN-5718-v2.patch, YARN-5718.patch
>
>
> In one HA cluster, after NN failed over, we noticed that job is getting 
> failed as TimelineClient failed to retry connection to proper NN. This is 
> because we are overwrite hdfs client settings that hard code retry policy to 
> be enabled that conflict NN failed-over case - hdfs client should fail fast 
> so can retry on another NN.
> We shouldn't assume any retry policy for hdfs client at all places in YARN. 
> This should keep consistent with HDFS settings that has different retry 
> polices in different deployment case. Thus, we should clean up these hard 
> code settings in YARN, include: FileSystemTimelineWriter, 
> FileSystemRMStateStore and FileSystemNodeLabelsStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

2016-10-13 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573792#comment-15573792
 ] 

Xuan Gong commented on YARN-5718:
-

+1 LGTM.

[~djp]
But I am worried about the back-incompatibility issue. Could you file a 
separate ticket to have a discussion on branch-2 and branch-2.8 for this 
ticket, please ?

At the mean time, I will commit the patch into trunk.

> TimelineClient (and other places in YARN) shouldn't over-write HDFS client 
> retry settings which could cause unexpected behavior
> ---
>
> Key: YARN-5718
> URL: https://issues.apache.org/jira/browse/YARN-5718
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, timelineclient
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-5718-v2.1.patch, YARN-5718-v2.patch, YARN-5718.patch
>
>
> In one HA cluster, after NN failed over, we noticed that job is getting 
> failed as TimelineClient failed to retry connection to proper NN. This is 
> because we are overwrite hdfs client settings that hard code retry policy to 
> be enabled that conflict NN failed-over case - hdfs client should fail fast 
> so can retry on another NN.
> We shouldn't assume any retry policy for hdfs client at all places in YARN. 
> This should keep consistent with HDFS settings that has different retry 
> polices in different deployment case. Thus, we should clean up these hard 
> code settings in YARN, include: FileSystemTimelineWriter, 
> FileSystemRMStateStore and FileSystemNodeLabelsStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

2016-10-12 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569902#comment-15569902
 ] 

Vrushali C commented on YARN-5718:
--

Thanks Junping, latest patch v2.1 looks good to me.

> TimelineClient (and other places in YARN) shouldn't over-write HDFS client 
> retry settings which could cause unexpected behavior
> ---
>
> Key: YARN-5718
> URL: https://issues.apache.org/jira/browse/YARN-5718
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, timelineclient
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-5718-v2.1.patch, YARN-5718-v2.patch, YARN-5718.patch
>
>
> In one HA cluster, after NN failed over, we noticed that job is getting 
> failed as TimelineClient failed to retry connection to proper NN. This is 
> because we are overwrite hdfs client settings that hard code retry policy to 
> be enabled that conflict NN failed-over case - hdfs client should fail fast 
> so can retry on another NN.
> We shouldn't assume any retry policy for hdfs client at all places in YARN. 
> This should keep consistent with HDFS settings that has different retry 
> polices in different deployment case. Thus, we should clean up these hard 
> code settings in YARN, include: FileSystemTimelineWriter, 
> FileSystemRMStateStore and FileSystemNodeLabelsStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

2016-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569737#comment-15569737
 ] 

Hadoop QA commented on YARN-5718:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 21s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 34s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
42s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 31s 
{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 28s 
{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 35m 18s 
{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 46s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832957/YARN-5718-v2.1.patch |
| JIRA Issue | YARN-5718 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 1e0d14075cd7 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 6476934 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13364/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 

[jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

2016-10-12 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569568#comment-15569568
 ] 

Junping Du commented on YARN-5718:
--

Thanks Vrushali for quick comments. I think compile error is a bit misleading 
but indeed an issue need to fix in TestFSRMStateStore (due to a stupid mistake 
in generating v2 patch). v2.1 should fix the issue.

> TimelineClient (and other places in YARN) shouldn't over-write HDFS client 
> retry settings which could cause unexpected behavior
> ---
>
> Key: YARN-5718
> URL: https://issues.apache.org/jira/browse/YARN-5718
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, timelineclient
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-5718-v2.1.patch, YARN-5718-v2.patch, YARN-5718.patch
>
>
> In one HA cluster, after NN failed over, we noticed that job is getting 
> failed as TimelineClient failed to retry connection to proper NN. This is 
> because we are overwrite hdfs client settings that hard code retry policy to 
> be enabled that conflict NN failed-over case - hdfs client should fail fast 
> so can retry on another NN.
> We shouldn't assume any retry policy for hdfs client at all places in YARN. 
> This should keep consistent with HDFS settings that has different retry 
> polices in different deployment case. Thus, we should clean up these hard 
> code settings in YARN, include: FileSystemTimelineWriter, 
> FileSystemRMStateStore and FileSystemNodeLabelsStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

2016-10-12 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569537#comment-15569537
 ] 

Vrushali C commented on YARN-5718:
--

Thanks Junping, the updated patch looks good. 

Not sure what the compilation error is:
{code}
[ERROR] COMPILATION ERROR : 
[INFO] -
[ERROR]   where E is a type-variable:
E extends Object declared in method toSet(E...)
/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java:[326,36]
 error: cannot find symbol
[INFO] 1 erro
{code}

Perhaps unrelated. 

> TimelineClient (and other places in YARN) shouldn't over-write HDFS client 
> retry settings which could cause unexpected behavior
> ---
>
> Key: YARN-5718
> URL: https://issues.apache.org/jira/browse/YARN-5718
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, timelineclient
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-5718-v2.patch, YARN-5718.patch
>
>
> In one HA cluster, after NN failed over, we noticed that job is getting 
> failed as TimelineClient failed to retry connection to proper NN. This is 
> because we are overwrite hdfs client settings that hard code retry policy to 
> be enabled that conflict NN failed-over case - hdfs client should fail fast 
> so can retry on another NN.
> We shouldn't assume any retry policy for hdfs client at all places in YARN. 
> This should keep consistent with HDFS settings that has different retry 
> polices in different deployment case. Thus, we should clean up these hard 
> code settings in YARN, include: FileSystemTimelineWriter, 
> FileSystemRMStateStore and FileSystemNodeLabelsStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

2016-10-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15568506#comment-15568506
 ] 

Hadoop QA commented on YARN-5718:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
9s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 23s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
42s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 39s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
55s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 29s 
{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 1m 46s 
{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 46s {color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
38s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 31s 
{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 23s 
{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s 
{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 17s 
{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 29s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 0s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832870/YARN-5718-v2.patch |
| JIRA Issue | YARN-5718 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux e4236da56dea 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 6476934 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| mvninstall | 

[jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

2016-10-10 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563137#comment-15563137
 ] 

Vrushali C commented on YARN-5718:
--

Thanks Junping, I looked at the patch and I think I agree that external YARN 
clients should not set/override HDFS retry settings. 

Now, with this patch, the yarn config variables 
TIMELINE_SERVICE_ENTITYGROUP_FS_STORE_RETRY_POLICY_SPEC, 
FS_NODE_LABELS_STORE_RETRY_POLICY_SPEC and their defaults are no longer used 
anywhere in the code. Should they be removed?
Also, YarnConfiguration.FS_RM_STATE_STORE_RETRY_POLICY_SPEC is used in a test 
case in TestFSRMStateStore.java, so should that be changed too? 

> TimelineClient (and other places in YARN) shouldn't over-write HDFS client 
> retry settings which could cause unexpected behavior
> ---
>
> Key: YARN-5718
> URL: https://issues.apache.org/jira/browse/YARN-5718
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, timelineclient
>Reporter: Junping Du
>Assignee: Junping Du
> Attachments: YARN-5718.patch
>
>
> In one HA cluster, after NN failed over, we noticed that job is getting 
> failed as TimelineClient failed to retry connection to proper NN. This is 
> because we are overwrite hdfs client settings that hard code retry policy to 
> be enabled that conflict NN failed-over case - hdfs client should fail fast 
> so can retry on another NN.
> We shouldn't assume any retry policy for hdfs client at all places in YARN. 
> This should keep consistent with HDFS settings that has different retry 
> polices in different deployment case. Thus, we should clean up these hard 
> code settings in YARN, include: FileSystemTimelineWriter, 
> FileSystemRMStateStore and FileSystemNodeLabelsStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5718) TimelineClient (and other places in YARN) shouldn't over-write HDFS client retry settings which could cause unexpected behavior

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562726#comment-15562726
 ] 

Hadoop QA commented on YARN-5718:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 39s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 17s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
37s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 15s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 15s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 16s 
{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 38m 29s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 49s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832492/YARN-5718.patch |
| JIRA Issue | YARN-5718 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 0ece86db98f6 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / cef61d5 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/13335/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit test logs |