[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-09-03 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729422#comment-14729422
 ] 

Anubhav Dhoot commented on YARN-4087:
-

In general, if we do not fail the daemon when the fail-fast flag is false, we 
still need to ensure that we do not leave inconsistent state in the RM; 
YARN-4032 is one example. YARN-2019 is the other case, where we did not need 
to do anything. This would mean that, from now on, every patch that uses 
fail-fast to avoid crashing the daemon should consider taking corrective 
action to ensure correctness. Does that make sense?
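
One possible shape of such a corrective action, as a minimal sketch: the in-memory change is rolled back when the store write fails and fail-fast is off, so memory and store do not diverge. All names here (CorrectiveAction, Store, update) are hypothetical and are not RM classes; the actual fix in YARN-4032 may look quite different.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not taken from the RM code base.
public class CorrectiveAction {
  // Stand-in for a state store whose writes can fail.
  public interface Store {
    void persist(String key, String value) throws IOException;
  }

  private final Map<String, String> inMemory = new HashMap<>();
  private final Store store;
  private final boolean failFast;

  public CorrectiveAction(Store store, boolean failFast) {
    this.store = store;
    this.failFast = failFast;
  }

  // Applies the update in memory, then persists it. Returns false if the
  // store failed and the in-memory change was rolled back.
  public boolean update(String key, String value) {
    String previous = inMemory.put(key, value);
    try {
      store.persist(key, value);
      return true;
    } catch (IOException e) {
      if (failFast) {
        // fail-fast on: surface the error and let the daemon go down
        throw new IllegalStateException("State store failure", e);
      }
      // fail-fast off: undo the in-memory change instead of crashing
      if (previous == null) {
        inMemory.remove(key);
      } else {
        inMemory.put(key, previous);
      }
      return false;
    }
  }

  public String get(String key) {
    return inMemory.get(key);
  }
}
```

Whether rolling back is the right correction depends on the operation; in some cases continuing with a warning may be preferable.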

> Set YARN_FAIL_FAST to be false by default
> -
>
> Key: YARN-4087
> URL: https://issues.apache.org/jira/browse/YARN-4087
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-4087.1.patch, YARN-4087.2.patch
>
>
> Increasingly, I feel setting this property to be false makes more sense 
> especially in production environment, 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-09-01 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14725761#comment-14725761
 ] 

Vinod Kumar Vavilapalli commented on YARN-4087:
---

Yes, I just checked: YARN-2019 added the config only in 2.8, which is not yet 
released. So we can safely change the default.

bq. Also, maybe we should mark this JIRA as incompatible (for behavior)?
The previous behavior was undesirable, and nobody should depend on it in 
practice.

I think a bigger issue was missed in YARN-2019. If we ignore the failure when 
the config is off, the higher-order operations are stuck in a weird state, as 
there are no retries or explicit app-failures. [~jianhe]?



[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-09-01 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726097#comment-14726097
 ] 

Jian He commented on YARN-4087:
---

bq. as there are no retries or explicit app-failures
Retries already happen internally before the final exception is thrown. 
Right, the app will be stuck in a certain state, since no notification is sent 
back. But explicitly failing the app may be too harsh, since the app itself can 
actually proceed without any impact. I think we can still notify back that the 
store operation is done and let the app continue. We could also print a warning 
message on the application page, something like "Application is not persisted 
in state-store due to state-store error. Application will be lost if RM 
restarted."
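
The proposal above can be sketched roughly as follows. This is a simplified illustration, not RM code: StoreFailureHandling, App, AppState and onStoreFailure are hypothetical names, and only the warning text is taken from the comment.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; all class, enum and method names are hypothetical.
public class StoreFailureHandling {
  public enum AppState { NEW_SAVING, SUBMITTED }

  public static class App {
    public AppState state = AppState.NEW_SAVING;
    public final List<String> diagnostics = new ArrayList<>();
  }

  public static final String NOT_PERSISTED_WARNING =
      "Application is not persisted in state-store due to state-store error. "
          + "Application will be lost if RM restarted.";

  // Invoked once the store's internal retries are exhausted.
  public static void onStoreFailure(App app, boolean failFast, Exception cause) {
    if (failFast) {
      // fail-fast on: propagate the failure so the daemon stops
      throw new IllegalStateException("State store failed", cause);
    }
    // fail-fast off: record the warning, then notify the app exactly as a
    // successful store would, so it does not stay stuck in NEW_SAVING
    app.diagnostics.add(NOT_PERSISTED_WARNING);
    app.state = AppState.SUBMITTED;
  }
}
```

The point of the design is that the app receives the same notification on a tolerated failure as on success, and the lost durability is surfaced only as a diagnostic.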



[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-08-31 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724003#comment-14724003
 ] 

Jian He commented on YARN-4087:
---

bq. In yarn-default.xml the default value for RM_FAIL_FAST is true.
Didn't get you. Isn't the default value set to YARN_FAIL_FAST too?



[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-08-31 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723242#comment-14723242
 ] 

Bibin A Chundatt commented on YARN-4087:


In yarn-default.xml the default value for RM_FAIL_FAST is true.
In code the default value for RM_FAIL_FAST is taken from YARN_FAIL_FAST whose 
value is false.



[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-08-28 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718641#comment-14718641
 ] 

Junping Du commented on YARN-4087:
--

Patch LGTM.
bq. +1, if fail-fast hasn't been in any prior release and we are not 
drastically altering the behavior.
I believe fail-fast was introduced only recently. However, the default behavior 
when the RM/NM state store fails could differ from previous releases: 
previously it brought down the NM/RM daemons; now we tolerate the failure and 
keep running, logging some error messages. We should definitely note this in 
our release notes. Also, maybe we should mark this JIRA as incompatible (for 
behavior)?




[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-08-27 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717131#comment-14717131
 ] 

Jian He commented on YARN-4087:
---

[~bibinchundatt], the logic is that the default value for RM_FAIL_FAST is 
YARN_FAIL_FAST.



[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-08-27 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717526#comment-14717526
 ] 

Hitesh Shah commented on YARN-4087:
---

It would be good to rename the config property to something that provides a bit 
more clarity on what the config knob is meant to control. 



[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-08-27 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717568#comment-14717568
 ] 

Jian He commented on YARN-4087:
---

YARN_FAIL_FAST is a global knob that controls all components, e.g. RM and NM; 
the config description clarifies this. I just can't think of a concise and 
meaningful name. Any naming suggestion is welcome.

Updated the patch to clarify the config description further.



[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-08-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717666#comment-14717666
 ] 

Hadoop QA commented on YARN-4087:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 18m 23s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 7m 51s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 3s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 2m 0s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 3m 12s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 59s | Tests passed in hadoop-yarn-common. |
| | | 46m 22s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12752862/YARN-4087.2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / a9c8ea7 |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8932/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8932/artifact/patchprocess/testrun_hadoop-yarn-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8932/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8932/console |


This message was automatically generated.



[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-08-26 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715829#comment-14715829
 ] 

Karthik Kambatla commented on YARN-4087:


+1, if fail-fast hasn't been in any prior release and we are not drastically 
altering the behavior.

In any case, it would be nice to release note this new behavior for 2.8.0. 



[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-08-26 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715872#comment-14715872
 ] 

Bibin A Chundatt commented on YARN-4087:


So by default, in yarn-default.xml:

yarn.resourcemanager.fail-fast=true
yarn.fail-fast=false

In YarnConfiguration:

{code}
  public static boolean shouldRMFailFast(Configuration conf) {
    // The RM-specific key falls back to the global key, then to its default.
    return conf.getBoolean(YarnConfiguration.RM_FAIL_FAST,
        conf.getBoolean(YarnConfiguration.YARN_FAIL_FAST,
            YarnConfiguration.DEFAULT_YARN_FAIL_FAST));
  }
{code}

Isn't there a mismatch here?

No plans to change YarnConfiguration.RM_FAIL_FAST.
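
To make the lookup order concrete, here is a minimal sketch that mimics the nested getBoolean call with a plain Map instead of Hadoop's Configuration. The class FailFastLookup and its getBoolean helper are hypothetical stand-ins; the property names are the ones quoted above, and DEFAULT_YARN_FAIL_FAST = false is assumed from this discussion.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the lookup in shouldRMFailFast; not Hadoop code.
public class FailFastLookup {
  static final String RM_FAIL_FAST = "yarn.resourcemanager.fail-fast";
  static final String YARN_FAIL_FAST = "yarn.fail-fast";
  // Assumed value, as proposed in this JIRA.
  static final boolean DEFAULT_YARN_FAIL_FAST = false;

  // Hypothetical helper: the parsed value if the key is set, else the fallback.
  static boolean getBoolean(Map<String, String> conf, String key, boolean fallback) {
    String value = conf.get(key);
    return value == null ? fallback : Boolean.parseBoolean(value.trim());
  }

  static boolean shouldRMFailFast(Map<String, String> conf) {
    return getBoolean(conf, RM_FAIL_FAST,
        getBoolean(conf, YARN_FAIL_FAST, DEFAULT_YARN_FAIL_FAST));
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(RM_FAIL_FAST, "true");   // as listed for yarn-default.xml above
    conf.put(YARN_FAIL_FAST, "false");
    System.out.println(shouldRMFailFast(conf)); // true: the RM key is set and wins

    conf.remove(RM_FAIL_FAST);
    System.out.println(shouldRMFailFast(conf)); // false: falls back to yarn.fail-fast
  }
}
```

With an explicit yarn.resourcemanager.fail-fast=true entry loaded from yarn-default.xml, the global key is never consulted for the RM, which is the mismatch being asked about.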





[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-08-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715892#comment-14715892
 ] 

Hadoop QA commented on YARN-4087:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 18m 27s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 8m 2s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 9s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 1m 58s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 3m 10s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 59s | Tests passed in hadoop-yarn-common. |
| | | 46m 40s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12752615/YARN-4087.1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f44b599 |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8922/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8922/artifact/patchprocess/testrun_hadoop-yarn-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8922/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8922/console |


This message was automatically generated.
