[jira] [Commented] (YARN-3626) On Windows localized resources are not moved to the front of the classpath when they should be
[ https://issues.apache.org/jira/browse/YARN-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560210#comment-14560210 ] Craig Welch commented on YARN-3626: --- The checkstyle is insignificant, the rest is all good. On Windows localized resources are not moved to the front of the classpath when they should be -- Key: YARN-3626 URL: https://issues.apache.org/jira/browse/YARN-3626 Project: Hadoop YARN Issue Type: Bug Components: yarn Environment: Windows Reporter: Craig Welch Assignee: Craig Welch Fix For: 2.7.1 Attachments: YARN-3626.0.patch, YARN-3626.11.patch, YARN-3626.14.patch, YARN-3626.15.patch, YARN-3626.16.patch, YARN-3626.4.patch, YARN-3626.6.patch, YARN-3626.9.patch In response to the mapreduce.job.user.classpath.first setting, the classpath is ordered differently so that localized resources will appear before system classpath resources when tasks execute. On Windows this does not work because the localized resources are not linked into their final location when the classpath jar is created. To compensate for that, localized jar resources are added directly to the classpath generated for the jar rather than being discovered from the localized directories. Unfortunately, they are always appended to the classpath, and so are never preferred over system resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
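For context, the ordering under discussion is driven by a per-job property. A minimal client-side sketch of enabling it (the job name and the omitted mapper/reducer setup are placeholders, not part of the patch):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class UserClasspathFirstExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Ask MapReduce to place user (localized) jars ahead of the system classpath.
    conf.setBoolean("mapreduce.job.user.classpath.first", true);
    Job job = Job.getInstance(conf, "classpath-order-example");
    // ... set mapper, reducer, input and output paths here, then submit the job.
    System.out.println("user.classpath.first = "
        + job.getConfiguration().getBoolean("mapreduce.job.user.classpath.first", false));
  }
}
{code}
The bug tracked in this JIRA is that, on Windows, this preference is not honored for localized jars because they end up appended to the generated classpath jar.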
[jira] [Updated] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3721: Attachment: YARN-3721-YARN-2928.001.patch Add an exclusion to resolve the cyclic dependency in timelineserver's pom file. build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560305#comment-14560305 ] Hadoop QA commented on YARN-3721: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 22s | Pre-patch YARN-2928 compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 42s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 43s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 38s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | yarn tests | 1m 11s | Tests failed in hadoop-yarn-server-timelineservice. | | | | 36m 42s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineWriterImpl | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735496/YARN-3721-YARN-2928.001.patch | | Optional Tests | javadoc javac unit | | git revision | YARN-2928 / e19566a | | hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8094/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8094/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8094/console | This message was automatically generated. build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3722) Merge multiple TestWebAppUtils
Masatake Iwasaki created YARN-3722: -- Summary: Merge multiple TestWebAppUtils Key: YARN-3722 URL: https://issues.apache.org/jira/browse/YARN-3722 Project: Hadoop YARN Issue Type: Improvement Components: test Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor The tests in {{o.a.h.yarn.util.TestWebAppUtils}} could be moved to {{o.a.h.yarn.webapp.util.TestWebAppUtils}}. WebAppUtils belongs to {{o.a.h.yarn.webapp.util}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3581) Deprecate -directlyAccessNodeLabelStore in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560179#comment-14560179 ] Wangda Tan commented on YARN-3581: -- [~Naganarasimha], thanks for working on this. Some comments: - {{(Deprecated! Support will be removed in future) Directly access node label store, }}: is it better to make it {{(This is DEPRECATED, will be removed in future releases)...}}? - RMAdminCLI puts the deprecation message in the args option instead of in the help. - printHelp in RMAdminCLI should be consistent with the usage? For changes of {{...}} Deprecate -directlyAccessNodeLabelStore in RMAdminCLI - Key: YARN-3581 URL: https://issues.apache.org/jira/browse/YARN-3581 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Attachments: YARN-3581.20150525-1.patch In 2.6.0, we added an option called -directlyAccessNodeLabelStore so that the RM can start with label-configured queue settings. After YARN-2918, we no longer need this option: an admin can configure queue settings, start the RM, and configure node labels via RMAdminCLI without any error. In addition, this option is very restrictive. First, it needs to run on the same node where the RM is running if the admin configured labels to be stored on local disk. Second, when the admin runs the option while the RM is running, multiple processes can write to the same file, which could leave the node label store invalid. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560178#comment-14560178 ] Jian He commented on YARN-41: - I only briefly scanned the patch and found that UnRegisterNodeManagerRequest/Response had better be abstract classes, to be consistent with the rest of the records. The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41-7.patch, YARN-41.patch Instead of waiting for the NM expiry, the RM should remove and handle an NM that is shut down gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3716) Output node label expression in ResourceRequestPBImpl.toString
[ https://issues.apache.org/jira/browse/YARN-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560182#comment-14560182 ] Wangda Tan commented on YARN-3716: -- Patch LGTM, will commit once Jenkins gets back. Output node label expression in ResourceRequestPBImpl.toString -- Key: YARN-3716 URL: https://issues.apache.org/jira/browse/YARN-3716 Project: Hadoop YARN Issue Type: Sub-task Components: api Reporter: Xianyin Xin Assignee: Xianyin Xin Priority: Minor Attachments: YARN-3716.001.patch It's convenient for debugging and log tracing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
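Purely as an illustration of the kind of change being reviewed (this is not the patch; the class and field names below are made up), surfacing the node label expression in a record's toString could look like:
{code:java}
// Hypothetical sketch: print the node label expression along with the other
// request fields, so scheduler logs can be traced per label. Not the actual
// ResourceRequestPBImpl code.
public class RequestInfo {
  private final String resourceName;
  private final int numContainers;
  private final String nodeLabelExpression;

  public RequestInfo(String resourceName, int numContainers, String nodeLabelExpression) {
    this.resourceName = resourceName;
    this.numContainers = numContainers;
    this.nodeLabelExpression = nodeLabelExpression;
  }

  @Override
  public String toString() {
    return "{Location: " + resourceName
        + ", # Containers: " + numContainers
        + ", Node Label Expression: " + nodeLabelExpression + "}";
  }
}
{code}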
[jira] [Commented] (YARN-3700) ATS Web Performance issue at load time when large number of jobs
[ https://issues.apache.org/jira/browse/YARN-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560222#comment-14560222 ] Hadoop QA commented on YARN-3700: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 52s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 49s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 55s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 4m 9s | Site still builds. | | {color:red}-1{color} | checkstyle | 2m 9s | The applied patch generated 1 new checkstyle issues (total was 215, now 215). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 28s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 57s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 3m 6s | Tests passed in hadoop-yarn-server-applicationhistoryservice. | | {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-server-common. | | | | 55m 32s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735457/YARN-3700.3.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / cdbd66b | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8092/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8092/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8092/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-applicationhistoryservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8092/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt | | hadoop-yarn-server-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8092/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8092/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8092/console | This message was automatically generated. 
ATS Web Performance issue at load time when large number of jobs Key: YARN-3700 URL: https://issues.apache.org/jira/browse/YARN-3700 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager, webapp, yarn Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-3700.1.patch, YARN-3700.2.1.patch, YARN-3700.2.2.patch, YARN-3700.2.patch, YARN-3700.3.patch Currently, we load all the apps when we try to load the yarn timelineservice web page. If we have a large number of jobs, it will be very slow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3467) Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI
[ https://issues.apache.org/jira/browse/YARN-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3467: Attachment: Screen Shot 2015-05-26 at 5.46.54 PM.png Shows the 2 new sortable columns for Allocated memory and cpu Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI --- Key: YARN-3467 URL: https://issues.apache.org/jira/browse/YARN-3467 Project: Hadoop YARN Issue Type: New Feature Components: webapp, yarn Affects Versions: 2.5.0 Reporter: Anthony Rojas Assignee: Anubhav Dhoot Priority: Minor Attachments: ApplicationAttemptPage.png, Screen Shot 2015-05-26 at 5.46.54 PM.png, YARN-3467.001.patch The YARN REST API can report on the following properties: *allocatedMB*: The sum of memory in MB allocated to the application's running containers *allocatedVCores*: The sum of virtual cores allocated to the application's running containers *runningContainers*: The number of containers currently running for the application Currently, the RM Web UI does not report on these items (at least I couldn't find any entries within the Web UI). It would be useful for YARN Application and Resource troubleshooting to have these properties and their corresponding values exposed on the RM WebUI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu reassigned YARN-3721: --- Assignee: Li Lu build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560247#comment-14560247 ] Li Lu commented on YARN-3721: - Hi [~sjlee0], thanks for catching this! Wow, this is a real problem. I can take a look at it. build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3547) FairScheduler: Apps that have no resource demand should not participate scheduling
[ https://issues.apache.org/jira/browse/YARN-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560255#comment-14560255 ] Xianyin Xin commented on YARN-3547: --- Hi [~kasha], [~leftnoteasy], can we reach a consensus, given that the patch is just a simple fix? FairScheduler: Apps that have no resource demand should not participate scheduling -- Key: YARN-3547 URL: https://issues.apache.org/jira/browse/YARN-3547 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Xianyin Xin Assignee: Xianyin Xin Attachments: YARN-3547.001.patch, YARN-3547.002.patch, YARN-3547.003.patch, YARN-3547.004.patch, YARN-3547.005.patch At present, all of the 'running' apps participate in the scheduling process; however, on a production cluster most of them may have no resource demand, since an app spends most of its lifetime running rather than waiting for resources. It is not wise to sort all of the 'running' apps and try to fulfill them, especially on a large-scale cluster with a heavy scheduling load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
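To make the proposal concrete, here is a generic sketch (not the actual FairScheduler code; the class and field names are assumptions) of skipping apps with zero pending demand before the scheduler sorts and offers resources:
{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class DemandFilterSketch {
  // Hypothetical model of a running app and its outstanding demand in MB.
  static class AppInfo {
    final String id;
    final long pendingMemoryMb;
    AppInfo(String id, long pendingMemoryMb) {
      this.id = id;
      this.pendingMemoryMb = pendingMemoryMb;
    }
  }

  // Only apps that still want resources are sorted and considered for assignment;
  // apps with no demand are skipped entirely, which is the idea behind the patch.
  static List<AppInfo> schedulableApps(List<AppInfo> runningApps) {
    List<AppInfo> withDemand = new ArrayList<>();
    for (AppInfo app : runningApps) {
      if (app.pendingMemoryMb > 0) {
        withDemand.add(app);
      }
    }
    // Stand-in for the scheduler's real ordering policy (e.g. fair-share order).
    withDemand.sort(Comparator.comparingLong((AppInfo a) -> -a.pendingMemoryMb));
    return withDemand;
  }
}
{code}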
[jira] [Updated] (YARN-3722) Merge multiple TestWebAppUtils
[ https://issues.apache.org/jira/browse/YARN-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3722: --- Attachment: YARN-3722.001.patch Merge multiple TestWebAppUtils -- Key: YARN-3722 URL: https://issues.apache.org/jira/browse/YARN-3722 Project: Hadoop YARN Issue Type: Improvement Components: test Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3722.001.patch The tests in {{o.a.h.yarn.util.TestWebAppUtils}} could be moved to {{o.a.h.yarn.webapp.util.TestWebAppUtils}}. WebAppUtils belongs to {{o.a.h.yarn.webapp.util}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3722) Merge multiple TestWebAppUtils
[ https://issues.apache.org/jira/browse/YARN-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560321#comment-14560321 ] Hadoop QA commented on YARN-3722: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 5m 13s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 28s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 29s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 23s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 1m 55s | Tests passed in hadoop-yarn-common. | | | | 18m 56s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735502/YARN-3722.001.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / cdbd66b | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8095/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8095/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8095/console | This message was automatically generated. Merge multiple TestWebAppUtils -- Key: YARN-3722 URL: https://issues.apache.org/jira/browse/YARN-3722 Project: Hadoop YARN Issue Type: Improvement Components: test Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3722.001.patch The tests in {{o.a.h.yarn.util.TestWebAppUtils}} could be moved to {{o.a.h.yarn.webapp.util.TestWebAppUtils}}. WebAppUtils belongs to {{o.a.h.yarn.webapp.util}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3682) Decouple PID-file management from ContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3682: -- Attachment: YARN-3682-20150526.1.txt Updated patch. Apparently, I already forgot how to write code that compiles. Decouple PID-file management from ContainerExecutor --- Key: YARN-3682 URL: https://issues.apache.org/jira/browse/YARN-3682 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: YARN-3682-20150526.1.txt, YARN-3682-20150526.txt The PID-files management currently present in ContainerExecutor really doesn't belong there. I know the original history of why we added it; that was about the only right place to put it in at that point of time. Given the evolution of executors for Windows etc., the ContainerExecutor is getting more complicated than is necessary. We should pull the PID-file management into its own entity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS
[ https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560188#comment-14560188 ] Hudson commented on YARN-160: - SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #209 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/209/]) YARN-160. Enhanced NodeManager to automatically obtain cpu/memory values from underlying OS when configured to do so. Contributed by Varun Vasudev. (vinodkv: rev 500a1d9c76ec612b4e737888f4be79951c11591d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/LinuxResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestCgroupsLCEResourcesHandler.java * hadoop-tools/hadoop-gridmix/src/test/java/org/apache/hadoop/mapred/gridmix/DummyResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/NodeManagerHardwareUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerExecutor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestLinuxResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestNodeManagerHardwareUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ResourceCalculatorPlugin.java nodemanagers should obtain cpu/memory values from underlying OS --- Key: YARN-160 URL: https://issues.apache.org/jira/browse/YARN-160 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.3-alpha Reporter: Alejandro Abdelnur Assignee: Varun Vasudev Labels: BB2015-05-TBR Fix For: 2.8.0 Attachments: YARN-160.005.patch, YARN-160.006.patch, YARN-160.007.patch, YARN-160.008.patch, apache-yarn-160.0.patch, apache-yarn-160.1.patch, apache-yarn-160.2.patch, apache-yarn-160.3.patch As mentioned in YARN-2 *NM memory and CPU configs* Currently these values are coming from the config of the NM, we should be able to obtain those values from the OS (ie, in the case of Linux from /proc/meminfo /proc/cpuinfo). 
As this is highly OS dependent, we should have an interface that obtains this information. In addition, implementations of this interface should be able to specify a mem/cpu offset (the amount of mem/cpu not to be made available as YARN resources); this would allow reserving mem/cpu for the OS and other services outside of YARN containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
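The description asks for an interface plus a reservable offset. A minimal, generic illustration of that shape (not the actual ResourceCalculatorPlugin / NodeManagerHardwareUtils code; all names are assumptions):
{code:java}
// Hypothetical sketch: probe the OS for hardware resources, then subtract an
// offset reserved for the OS and other daemons before advertising to YARN.
public class NodeResourceSketch {
  interface ResourceProbe {
    long physicalMemoryBytes(); // e.g. parsed from /proc/meminfo on Linux
    int numProcessors();        // e.g. parsed from /proc/cpuinfo on Linux
  }

  static long yarnMemoryBytes(ResourceProbe probe, long reservedForSystemBytes) {
    return Math.max(0L, probe.physicalMemoryBytes() - reservedForSystemBytes);
  }

  static int yarnVcores(ResourceProbe probe, int reservedForSystemCores) {
    return Math.max(1, probe.numProcessors() - reservedForSystemCores);
  }
}
{code}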
[jira] [Commented] (YARN-3632) Ordering policy should be allowed to reorder an application when demand changes
[ https://issues.apache.org/jira/browse/YARN-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560190#comment-14560190 ] Hudson commented on YARN-3632: -- SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #209 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/209/]) YARN-3632. Ordering policy should be allowed to reorder an application when demand changes. Contributed by Craig Welch (jianhe: rev 10732d515f62258309f98e4d7d23249f80b1847d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/FifoOrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/AbstractComparatorOrderingPolicy.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/OrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/FairOrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java Ordering policy should be allowed to reorder an application when demand changes --- Key: YARN-3632 URL: https://issues.apache.org/jira/browse/YARN-3632 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Craig Welch Assignee: Craig Welch Fix For: 2.8.0 Attachments: YARN-3632.0.patch, YARN-3632.1.patch, YARN-3632.3.patch, YARN-3632.4.patch, YARN-3632.5.patch, YARN-3632.6.patch, YARN-3632.7.patch At present, ordering policies have the option to have an application re-ordered (for allocation and preemption) when it is allocated to or a container is recovered from the application. Some ordering policies may also need to reorder when demand changes if that is part of the ordering comparison, this needs to be made available (and used by the fairorderingpolicy when sizebasedweight is true) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
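A generic sketch of why an explicit re-order hook is needed (this is not the CapacityScheduler/OrderingPolicy code; names are assumptions): comparator-backed collections such as TreeSet do not re-sort in place, so when a field used by the comparator, here the demand, changes, the entity must be removed and re-inserted:
{code:java}
import java.util.TreeSet;

public class ReorderOnDemandChangeSketch {
  static class App {
    final String id;
    long demandMb;
    App(String id, long demandMb) { this.id = id; this.demandMb = demandMb; }
  }

  // Order apps by current demand (size-based-weight style), breaking ties by id.
  private final TreeSet<App> order = new TreeSet<>((a, b) -> {
    int byDemand = Long.compare(a.demandMb, b.demandMb);
    return byDemand != 0 ? byDemand : a.id.compareTo(b.id);
  });

  void add(App app) { order.add(app); }

  // Remove before mutating the sort key, then re-insert so the order stays valid.
  void demandUpdated(App app, long newDemandMb) {
    order.remove(app);
    app.demandMb = newDemandMb;
    order.add(app);
  }
}
{code}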
[jira] [Commented] (YARN-3716) Output node label expression in ResourceRequestPBImpl.toString
[ https://issues.apache.org/jira/browse/YARN-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560224#comment-14560224 ] Xianyin Xin commented on YARN-3716: --- Thanks, [~leftnoteasy]. Output node label expression in ResourceRequestPBImpl.toString -- Key: YARN-3716 URL: https://issues.apache.org/jira/browse/YARN-3716 Project: Hadoop YARN Issue Type: Sub-task Components: api Reporter: Xianyin Xin Assignee: Xianyin Xin Priority: Minor Attachments: YARN-3716.001.patch It's convenient for debugging and log tracing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560234#comment-14560234 ] Sangjin Lee commented on YARN-3721: --- The error message: {panel} Failed to execute goal on project hadoop-yarn-server-timelineservice: Could not resolve dependencies for project org.apache.hadoop:hadoop-yarn-server-timelineservice:jar:3.0.0-SNAPSHOT: Failure to find org.apache.hadoop:hadoop-yarn-server-timelineservice:jar:3.0.0-SNAPSHOT in https://repository.apache.org/content/repositories/snapshots was cached in the local repository, resolution will not be reattempted until the update interval of apache.snapshots.https has elapsed or updates are forced {panel} The dependency cycle is introduced by hbase testing util. It has a transitive dependency on timelineservice (test) itself!
{noformat}
org.apache.hbase:hbase-testing-util:jar:1.0.1:test
org.apache.hbase:hbase-common:jar:tests:1.0.1:runtime
org.apache.hbase:hbase-annotations:jar:tests:1.0.1:test
org.apache.hbase:hbase-hadoop-compat:jar:tests:1.0.1:test
org.apache.hbase:hbase-hadoop2-compat:jar:tests:1.0.1:test
org.apache.hadoop:hadoop-client:jar:3.0.0-SNAPSHOT:compile (version managed from 2.5.1 by org.apache.hadoop:hadoop-project:3.0.0-SNAPSHOT)
org.apache.hadoop:hadoop-mapreduce-client-app:jar:3.0.0-SNAPSHOT:compile
org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:3.0.0-SNAPSHOT:compile (version managed from 2.5.1 by org.apache.hadoop:hadoop-project:3.0.0-SNAPSHOT)
org.apache.hadoop:hadoop-mapreduce-client-common:jar:3.0.0-SNAPSHOT:compile
org.apache.hadoop:hadoop-yarn-client:jar:3.0.0-SNAPSHOT:compile
org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:3.0.0-SNAPSHOT:compile
org.apache.hadoop:hadoop-yarn-server-nodemanager:jar:3.0.0-SNAPSHOT:compile
org.apache.hadoop:hadoop-minicluster:jar:3.0.0-SNAPSHOT:test (version managed from 2.5.1 by org.apache.hadoop:hadoop-project:3.0.0-SNAPSHOT)
org.apache.hadoop:hadoop-yarn-server-tests:jar:tests:3.0.0-SNAPSHOT:test
org.apache.hadoop:hadoop-yarn-server-resourcemanager:jar:3.0.0-SNAPSHOT:test
org.apache.hadoop:hadoop-yarn-server-web-proxy:jar:3.0.0-SNAPSHOT:test
org.apache.zookeeper:zookeeper:jar:tests:3.4.6:test
org.apache.hadoop:hadoop-yarn-server-timelineservice:jar:3.0.0-SNAPSHOT:test
org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:tests:3.0.0-SNAPSHOT:test
org.apache.hadoop:hadoop-mapreduce-client-hs:jar:3.0.0-SNAPSHOT:test
{noformat}
build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Priority: Blocker The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3652) A SchedulerMetrics may be needed for evaluating the scheduler's performance
[ https://issues.apache.org/jira/browse/YARN-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560252#comment-14560252 ] Xianyin Xin commented on YARN-3652: --- Thanks [~vinodkv]. When I said YARN-3293 and {{SchedulerMetrics}} are similar, I meant the two are similar in functional design, and it had not been implemented yet at that time. A simple {{SchedulerMetrics}} was introduced in YARN-3630, where a {{#ofWaitingSchedulerEvent}} metric was used to evaluate the load of the scheduler. [~vvasudev], looking forward to your thoughts. :) A SchedulerMetrics may be needed for evaluating the scheduler's performance - Key: YARN-3652 URL: https://issues.apache.org/jira/browse/YARN-3652 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager, scheduler Reporter: Xianyin Xin As discussed in YARN-3630, a {{SchedulerMetrics}} may be needed for evaluating the scheduler's performance. The performance indicators include the number of events waiting to be handled by the scheduler, the throughput, the scheduling delay, and/or other measures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
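As a rough, generic illustration of the metric being discussed (this is not the YARN-3630 code, and the Hadoop metrics2 wiring is omitted; all names are made up), a scheduler-load gauge over the pending event queue could look like:
{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of scheduler-load metrics: events waiting to be handled
// (the "#ofWaitingSchedulerEvent" idea) plus a simple throughput counter.
public class SchedulerMetricsSketch {
  private final BlockingQueue<Runnable> eventQueue = new LinkedBlockingQueue<>();
  private final AtomicLong handledEvents = new AtomicLong();

  public int waitingSchedulerEvents() {
    return eventQueue.size();
  }

  public long handledSchedulerEvents() {
    return handledEvents.get();
  }

  public void offer(Runnable event) {
    eventQueue.offer(event);
  }

  public void handleOne() throws InterruptedException {
    eventQueue.take().run();
    handledEvents.incrementAndGet();
  }
}
{code}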
[jira] [Commented] (YARN-3685) NodeManager unnecessarily knows about classpath-jars due to Windows limitations
[ https://issues.apache.org/jira/browse/YARN-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560262#comment-14560262 ] Vinod Kumar Vavilapalli commented on YARN-3685: --- bq. This was true of Linux even before YARN-316, so in that sense, YARN did already have some classpath logic indirectly. bq. I was thinking of stuff like yarn.application.classpath, where values are defined in terms of things like the HADOOP_YARN_HOME and HADOOP_COMMON_HOME environment variables, and those values might not match the file system layout at the client side. Hm.. YARN_APPLICATION_CLASSPATH is a simple convenience configuration property that the server *does not* load, but which is used by applications like distributed-shell. And yeah, this convenience property was never assumed to work with variable installation layouts. Increasingly our apps are being migrated to a distributed-cache based deployment so as to avoid the layout issue, so in sum YARN_APPLICATION_CLASSPATH is essentially unused. NodeManager unnecessarily knows about classpath-jars due to Windows limitations --- Key: YARN-3685 URL: https://issues.apache.org/jira/browse/YARN-3685 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Vinod Kumar Vavilapalli Found this while looking at cleaning up ContainerExecutor via YARN-3648, making it a sub-task. YARN *should not* know about classpaths. Our original design was modeled around this. But when we added Windows support, due to classpath issues, we ended up breaking this abstraction via YARN-316. We should clean this up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
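For reference, the property being discussed is read through public YarnConfiguration constants; a small sketch of standard client-side usage, shown only to illustrate what the comment above calls a convenience property:
{code:java}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AppClasspathExample {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();
    // yarn.application.classpath, falling back to the shipped defaults; the values
    // are typically expressed via $HADOOP_COMMON_HOME, $HADOOP_YARN_HOME, etc.,
    // which is exactly why they may not match the client's file system layout.
    String[] classpathEntries = conf.getStrings(
        YarnConfiguration.YARN_APPLICATION_CLASSPATH,
        YarnConfiguration.DEFAULT_YARN_APPLICATION_CLASSPATH);
    for (String entry : classpathEntries) {
      System.out.println(entry);
    }
  }
}
{code}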
[jira] [Commented] (YARN-3581) Deprecate -directlyAccessNodeLabelStore in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560216#comment-14560216 ] Naganarasimha G R commented on YARN-3581: - Hi [~wangda], regarding the deprecation message as an argument: {{-removeFromClusterNodeLabels}} had a comment {{(label splitted by ,)}}, so I thought the important info could be shown in this way. I will move it to the description. One more thing: shall I totally remove the description and have only this Deprecated message, so that no one will use the option? The others will be corrected. Deprecate -directlyAccessNodeLabelStore in RMAdminCLI - Key: YARN-3581 URL: https://issues.apache.org/jira/browse/YARN-3581 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Attachments: YARN-3581.20150525-1.patch In 2.6.0, we added an option called -directlyAccessNodeLabelStore so that the RM can start with label-configured queue settings. After YARN-2918, we no longer need this option: an admin can configure queue settings, start the RM, and configure node labels via RMAdminCLI without any error. In addition, this option is very restrictive. First, it needs to run on the same node where the RM is running if the admin configured labels to be stored on local disk. Second, when the admin runs the option while the RM is running, multiple processes can write to the same file, which could leave the node label store invalid. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3467) Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI
[ https://issues.apache.org/jira/browse/YARN-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3467: Attachment: YARN-3467.001.patch Changes show allocated CPU and memory on the Applications page Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI --- Key: YARN-3467 URL: https://issues.apache.org/jira/browse/YARN-3467 Project: Hadoop YARN Issue Type: New Feature Components: webapp, yarn Affects Versions: 2.5.0 Reporter: Anthony Rojas Assignee: Anubhav Dhoot Priority: Minor Attachments: ApplicationAttemptPage.png, YARN-3467.001.patch The YARN REST API can report on the following properties: *allocatedMB*: The sum of memory in MB allocated to the application's running containers *allocatedVCores*: The sum of virtual cores allocated to the application's running containers *runningContainers*: The number of containers currently running for the application Currently, the RM Web UI does not report on these items (at least I couldn't find any entries within the Web UI). It would be useful for YARN Application and Resource troubleshooting to have these properties and their corresponding values exposed on the RM WebUI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
Sangjin Lee created YARN-3721: - Summary: build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Priority: Blocker The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560352#comment-14560352 ] Li Lu commented on YARN-3721: - It seems the HBase UT is also failing on the YARN-2928 branch. [~vrushalic], would you please take a look at it? The UT failure appears to be unrelated to the changes in this patch (the maven failure is gone and the mini-hbase cluster has been successfully launched). build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560373#comment-14560373 ] Sangjin Lee commented on YARN-3721: --- Thanks for the quick patch [~gtCarrera9]! So we don't need the hadoop mini-cluster part of the dependency from hbase-testing-util at all? Could you elaborate how that still works with the mini-HBase cluster? That might help us make the dependency clearer (or more explicit). build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3685) NodeManager unnecessarily knows about classpath-jars due to Windows limitations
[ https://issues.apache.org/jira/browse/YARN-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560391#comment-14560391 ] Chris Nauroth commented on YARN-3685: - bq. YARN_APPLICATION_CLASSPATH is essentially unused. In that case, this is definitely worth revisiting as part of this issue. Perhaps it's not a problem anymore. This had been used in the past, as seen in bug reports like YARN-1138. NodeManager unnecessarily knows about classpath-jars due to Windows limitations --- Key: YARN-3685 URL: https://issues.apache.org/jira/browse/YARN-3685 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Vinod Kumar Vavilapalli Found this while looking at cleaning up ContainerExecutor via YARN-3648, making it a sub-task. YARN *should not* know about classpaths. Our original design was modeled around this. But when we added Windows support, due to classpath issues, we ended up breaking this abstraction via YARN-316. We should clean this up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560416#comment-14560416 ] Zhijie Shen commented on YARN-3044: --- Naga, sorry for the late reply. The new patch looks much better to me, but I am still concerned about the following change:
{code}
@Override
public Dispatcher getDispatcher() {
  Dispatcher dispatcher = null;

  if (publishContainerMetrics) {
    dispatcher = super.getDispatcher();
  } else {
    // Normal dispatcher is sufficient if container metrics are not required
    // to be published
    dispatcher = new AsyncDispatcher();
  }
  return dispatcher;
}
{code}
I think it's better to retain the multiple dispatchers, which is more flexible for different scales. We can change how many threads we need via configuration. Routing an event to one dispatcher takes constant time in the current multiple-dispatcher implementation. Thoughts? [Event producers] Implement RM writing app lifecycle events to ATS -- Key: YARN-3044 URL: https://issues.apache.org/jira/browse/YARN-3044 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3044-YARN-2928.004.patch, YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, YARN-3044-YARN-2928.007.patch, YARN-3044-YARN-2928.008.patch, YARN-3044.20150325-1.patch, YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch Per design in YARN-2928, implement RM writing app lifecycle events to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
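To make the trade-off concrete, here is a generic sketch of the multi-dispatcher idea (not the actual RM/ATS code; the structure and names are assumptions): events are hashed to one of N single-threaded dispatchers, so routing stays constant time, per-key ordering is preserved, and the thread count can be driven by configuration:
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: route each event to one of N single-threaded "dispatchers"
// by a key hash. Routing is O(1) and events for the same key stay ordered.
public class MultiDispatcherSketch {
  private final ExecutorService[] dispatchers;

  public MultiDispatcherSketch(int numDispatchers) {
    dispatchers = new ExecutorService[numDispatchers];
    for (int i = 0; i < numDispatchers; i++) {
      dispatchers[i] = Executors.newSingleThreadExecutor();
    }
  }

  public void dispatch(String key, Runnable event) {
    int idx = (key.hashCode() & Integer.MAX_VALUE) % dispatchers.length;
    dispatchers[idx].execute(event);
  }

  public void stop() {
    for (ExecutorService dispatcher : dispatchers) {
      dispatcher.shutdown();
    }
  }
}
{code}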
[jira] [Assigned] (YARN-3720) Need comprehensive documentation for configuring CPU/memory resources on NodeManager
[ https://issues.apache.org/jira/browse/YARN-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev reassigned YARN-3720: --- Assignee: Varun Vasudev Need comprehensive documentation for configuring CPU/memory resources on NodeManager -- Key: YARN-3720 URL: https://issues.apache.org/jira/browse/YARN-3720 Project: Hadoop YARN Issue Type: Task Components: documentation, nodemanager Reporter: Vinod Kumar Vavilapalli Assignee: Varun Vasudev Things are getting more and more complex after the likes of YARN-160. We need a document explaining how to configure cpu/memory values on a NodeManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3703) Container Launch fails with exitcode 2 with DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560443#comment-14560443 ] Devaraj K commented on YARN-3703: - I lost the app logs for this issue when it occurred, trying to reproduce this. I am closing it now, will reopen this issue once I get the logs and still feel it an issue. Thanks. Container Launch fails with exitcode 2 with DefaultContainerExecutor Key: YARN-3703 URL: https://issues.apache.org/jira/browse/YARN-3703 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.7.0 Reporter: Devaraj K Priority: Minor Please find the below NM log when the issue occurs. {code:xml} 2015-05-21 20:14:53,907 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1432208816246_0225_01_34 is : 2 2015-05-21 20:14:53,908 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1432208816246_0225_01_34 and exit code: 2 ExitCodeException exitCode=2: at org.apache.hadoop.util.Shell.runCommand(Shell.java:545) at org.apache.hadoop.util.Shell.run(Shell.java:456) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch. 
2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_1432208816246_0225_01_34 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 2 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Stack trace: ExitCodeException exitCode=2: 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.runCommand(Shell.java:545) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.run(Shell.java:456) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.FutureTask.run(FutureTask.java:262) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.lang.Thread.run(Thread.java:745) 2015-05-21 20:14:53,910 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code 2 2015-05-21 20:14:53,911 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1432208816246_0225_01_34 transitioned from RUNNING to EXITED_WITH_FAILURE 2015-05-21 20:14:53,911
[jira] [Resolved] (YARN-3703) Container Launch fails with exitcode 2 with DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K resolved YARN-3703. - Resolution: Not A Problem Container Launch fails with exitcode 2 with DefaultContainerExecutor Key: YARN-3703 URL: https://issues.apache.org/jira/browse/YARN-3703 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.7.0 Reporter: Devaraj K Priority: Minor Please find the below NM log when the issue occurs. {code:xml} 2015-05-21 20:14:53,907 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1432208816246_0225_01_34 is : 2 2015-05-21 20:14:53,908 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1432208816246_0225_01_34 and exit code: 2 ExitCodeException exitCode=2: at org.apache.hadoop.util.Shell.runCommand(Shell.java:545) at org.apache.hadoop.util.Shell.run(Shell.java:456) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch. 
2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_1432208816246_0225_01_34 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 2 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Stack trace: ExitCodeException exitCode=2: 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.runCommand(Shell.java:545) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.run(Shell.java:456) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.FutureTask.run(FutureTask.java:262) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.lang.Thread.run(Thread.java:745) 2015-05-21 20:14:53,910 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code 2 2015-05-21 20:14:53,911 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1432208816246_0225_01_34 transitioned from RUNNING to EXITED_WITH_FAILURE 2015-05-21 20:14:53,911 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1432208816246_0225_01_34 {code} -- This message was sent by Atlassian
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560415#comment-14560415 ] Li Lu commented on YARN-3721: - Also, [~sjlee0], I think I've resolved the problem, but would you please help me verify that my patch actually resolves exactly the problem you raised? Thanks! build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3682) Decouple PID-file management from ContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560369#comment-14560369 ] Hadoop QA commented on YARN-3682: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | patch | 0m 1s | The patch file was not named according to hadoop's naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute for instructions. | | {color:blue}0{color} | pre-patch | 14m 51s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 6 new or modified test files. | | {color:green}+1{color} | javac | 7m 33s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 33s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 19s | The applied patch generated 1 release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 46s | The applied patch generated 7 new checkstyle issues (total was 295, now 298). | | {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 2s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 6m 2s | Tests failed in hadoop-yarn-server-nodemanager. | | | | 42m 19s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.nodemanager.TestContainerManagerWithLCE | | | hadoop.yarn.server.nodemanager.containermanager.container.TestContainer | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735508/YARN-3682-20150526.1.txt | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / cdbd66b | | Release Audit | https://builds.apache.org/job/PreCommit-YARN-Build/8096/artifact/patchprocess/patchReleaseAuditProblems.txt | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8096/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8096/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8096/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8096/console | This message was automatically generated. Decouple PID-file management from ContainerExecutor --- Key: YARN-3682 URL: https://issues.apache.org/jira/browse/YARN-3682 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: YARN-3682-20150526.1.txt, YARN-3682-20150526.txt The PID-files management currently present in ContainerExecutor really doesn't belong there. I know the original history of why we added it, that was about the only right place to put it in at that point of time. 
Given the evolution of executors for Windows and other platforms, the ContainerExecutor is getting more complicated than necessary. We should pull the PID-file management out into its own entity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560406#comment-14560406 ] Li Lu commented on YARN-3721: - Hi [~sjlee0], I'm not 100% sure, but at least on my local machine Maven is not just complaining about a cyclic dependency. The direct cause of the failure is "Failure to find org.apache.hadoop:hadoop-yarn-server-timelineservice:jar:3.0.0-SNAPSHOT". I suspect this is because we have not published anything from the YARN-2928 branch to Apache's snapshot server. Previously, if local builds were cached, hadoop-yarn-server-timelineservice was available for subsequent builds; however, if timelineservice is not in the cache, Maven cannot find it on the snapshot server. Of course, the root cause of this problem is the cyclic dependency from timeline-service to hbase-test-util to the mini Hadoop cluster and back to timeline-service itself. We can exclude the compile-time dependency from hbase-test-util on the mini Hadoop cluster, because the mini Hadoop cluster is available for tests, so I don't think we need to enforce it statically. This is only my hunch; I'm not a Maven expert, so I'd truly appreciate more analysis. Thanks! build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3585) NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled
[ https://issues.apache.org/jira/browse/YARN-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560410#comment-14560410 ] Rohith commented on YARN-3585: -- I will test the YARN-3641 fix against this JIRA's scenario. About the patch, I think calling System.exit() explicitly after the shutdown thread exits is one option. NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled -- Key: YARN-3585 URL: https://issues.apache.org/jira/browse/YARN-3585 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Priority: Critical With NM recovery enabled, after decommission the NodeManager log shows it stopping, but the process cannot end. Non-daemon threads:
{noformat}
DestroyJavaVM prio=10 tid=0x7f3460011800 nid=0x29ec waiting on condition [0x]
leveldb prio=10 tid=0x7f3354001800 nid=0x2a97 runnable [0x]
VM Thread prio=10 tid=0x7f3460167000 nid=0x29f8 runnable
Gang worker#0 (Parallel GC Threads) prio=10 tid=0x7f346002 nid=0x29ed runnable
Gang worker#1 (Parallel GC Threads) prio=10 tid=0x7f3460022000 nid=0x29ee runnable
Gang worker#2 (Parallel GC Threads) prio=10 tid=0x7f3460024000 nid=0x29ef runnable
Gang worker#3 (Parallel GC Threads) prio=10 tid=0x7f3460025800 nid=0x29f0 runnable
Gang worker#4 (Parallel GC Threads) prio=10 tid=0x7f3460027800 nid=0x29f1 runnable
Gang worker#5 (Parallel GC Threads) prio=10 tid=0x7f3460029000 nid=0x29f2 runnable
Gang worker#6 (Parallel GC Threads) prio=10 tid=0x7f346002b000 nid=0x29f3 runnable
Gang worker#7 (Parallel GC Threads) prio=10 tid=0x7f346002d000 nid=0x29f4 runnable
Concurrent Mark-Sweep GC Thread prio=10 tid=0x7f3460120800 nid=0x29f7 runnable
Gang worker#0 (Parallel CMS Threads) prio=10 tid=0x7f346011c800 nid=0x29f5 runnable
Gang worker#1 (Parallel CMS Threads) prio=10 tid=0x7f346011e800 nid=0x29f6 runnable
VM Periodic Task Thread prio=10 tid=0x7f346019f800 nid=0x2a01 waiting on condition
{noformat}
and the JNI leveldb thread stack:
{noformat}
Thread 12 (Thread 0x7f33dd842700 (LWP 10903)):
#0 0x003d8340b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x7f33dfce2a3b in leveldb::(anonymous namespace)::PosixEnv::BGThreadWrapper(void*) () from /tmp/libleveldbjni-64-1-6922178968300745716.8
#2 0x003d83407851 in start_thread () from /lib64/libpthread.so.0
#3 0x003d830e811d in clone () from /lib64/libc.so.6
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
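A minimal sketch of the option mentioned in the comment above: run the shutdown work, wait for it, then exit the JVM explicitly so a lingering non-daemon thread created by native code (such as the leveldb JNI background thread in the dump) cannot keep the process alive. The class and thread names are illustrative, not the actual NodeManager code.
{code}
// Illustrative only: wait for the shutdown work to finish, then exit explicitly.
public class ForcedExitSketch {

  public static void main(String[] args) throws InterruptedException {
    Thread shutdownWorker = new Thread(new Runnable() {
      @Override
      public void run() {
        // In the real NM this is where services and the recovery store would be stopped.
        System.out.println("services stopped");
      }
    }, "nm-shutdown");

    shutdownWorker.start();
    shutdownWorker.join(); // wait until the shutdown thread has finished

    // Without this explicit call, a non-daemon JNI thread can block JVM exit.
    System.exit(0);
  }
}
{code}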
[jira] [Updated] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3711: --- Description: Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. (was: Proper URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._node-id_}} and/or {{yarn.resourcemanager.webapp.https.address._node-id_}} if RM-HA is enabled.) Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Improvement Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558727#comment-14558727 ] zhihai xu commented on YARN-3591: - Yes, I think we can get newErrorDirs and newRepairedDirs by comparing {{postCheckOtherDirs}} and {{preCheckOtherErrorDirs}} in {{DirectoryCollection#checkDirs}}. Can we use {{String}} to store {{DirectoryCollection#errorDirs}} in the statestore, similar to {{storeContainerDiagnostics}}? Resource Localisation on a bad disk causes subsequent containers failure - Key: YARN-3591 URL: https://issues.apache.org/jira/browse/YARN-3591 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Lavkesh Lahngir Assignee: Lavkesh Lahngir Attachments: 0001-YARN-3591.1.patch, 0001-YARN-3591.patch, YARN-3591.2.patch, YARN-3591.3.patch, YARN-3591.4.patch This happens when a resource is localised on a disk and, after localisation, that disk goes bad. The NM keeps paths for localised resources in memory. At the time of a resource request, isResourcePresent(rsrc) is called, which calls file.exists() on the localised path. In some cases when the disk has gone bad, inodes are still cached and file.exists() returns true, but when it is read the file will not open. Note: file.exists() actually calls stat64 natively, which returns true because it was able to find inode information from the OS. A proposal is to call file.list() on the parent path of the resource, which will call open() natively. If the disk is good it should return an array of paths with length at least 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
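The proposal in the description can be illustrated with a small, self-contained sketch using plain {{java.io.File}}; the class and method names are made up for illustration and are not the actual NM localization code.
{code}
import java.io.File;

public class LocalizedResourceCheckSketch {

  // file.exists() may report true from cached inode metadata even on a bad disk;
  // listing the parent directory forces a native open() and is a stronger check.
  static boolean isResourceReadable(File localizedPath) {
    File parent = localizedPath.getParentFile();
    if (parent == null) {
      return false;
    }
    String[] entries = parent.list(); // null if the directory cannot be read
    if (entries == null || entries.length < 1) {
      return false;
    }
    for (String name : entries) {
      if (name.equals(localizedPath.getName())) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    File resource = new File(args.length > 0 ? args[0] : "/tmp/example-resource");
    System.out.println("readable: " + isResourceReadable(resource));
  }
}
{code}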
[jira] [Commented] (YARN-1772) Fair Scheduler documentation should indicate that admin ACLs also give submit permissions
[ https://issues.apache.org/jira/browse/YARN-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558800#comment-14558800 ] Darrell Taylor commented on YARN-1772: -- I'm just reading through FairScheduler.md and found the following on line 197 : {quote} Anybody who may administer a queue may also submit applications to it. {quote} Does it need to be made clearer, or is everybody happy that covers it? Fair Scheduler documentation should indicate that admin ACLs also give submit permissions - Key: YARN-1772 URL: https://issues.apache.org/jira/browse/YARN-1772 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Sandy Ryza Priority: Minor Labels: newbie I can submit to a Fair Scheduler queue if I'm in the submit ACL OR if I'm in the administer ACL. The Fair Scheduler docs seem to leave out the second part. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3714) AM proxy filter can not get proper default proxy address if RM-HA is enabled
Masatake Iwasaki created YARN-3714: -- Summary: AM proxy filter can not get proper default proxy address if RM-HA is enabled Key: YARN-3714 URL: https://issues.apache.org/jira/browse/YARN-3714 Project: Hadoop YARN Issue Type: Bug Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3217) Remove httpclient dependency from hadoop-yarn-server-web-proxy
[ https://issues.apache.org/jira/browse/YARN-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated YARN-3217: Release Note: Removed commons-httpclient dependency from hadoop-yarn-server-web-proxy module. Hadoop Flags: Incompatible change,Reviewed (was: Reviewed) Remove httpclient dependency from hadoop-yarn-server-web-proxy -- Key: YARN-3217 URL: https://issues.apache.org/jira/browse/YARN-3217 Project: Hadoop YARN Issue Type: Task Affects Versions: 2.6.0 Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Fix For: 2.7.0 Attachments: YARN-3217-002.patch, YARN-3217-003.patch, YARN-3217-003.patch, YARN-3217-004.patch, YARN-3217.patch Sub-task of HADOOP-10105. Remove httpclient dependency from WebAppProxyServlet.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3127) Avoid timeline events during RM recovery or restart
[ https://issues.apache.org/jira/browse/YARN-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-3127: Description: 1.Start RM with HA and ATS configured and run some yarn applications 2.Once applications are finished sucessfully start timeline server 3.Now failover HA form active to standby 4.Access timeline server URL IP:PORT/applicationhistory //Note Earlier exception was thrown when accessed. Incomplete information is shown in the ATS web UI. i.e. attempt container and other information is not displayed. Also even if timeline server is started with RM, and on RM restart/ recovery ATS events for the applications already existing in ATS are resent which is not required. was: 1.Start RM with HA and ATS configured and run some yarn applications 2.Once applications are finished sucessfully start timeline server 3.Now failover HA form active to standby 4.Access timeline server URL IP:PORT/applicationhistory Result: Application history URL fails with below info {quote} 2015-02-03 20:28:09,511 ERROR org.apache.hadoop.yarn.webapp.View: Failed to read the applications. java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643) at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:80) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77) at org.apache.hadoop.yarn.webapp.View.render(View.java:235) at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) ... Caused by: org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: The entity for application attempt appattempt_1422972608379_0001_01 doesn't exist in the timeline store at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getApplicationAttempt(ApplicationHistoryManagerOnTimelineStore.java:151) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.generateApplicationReport(ApplicationHistoryManagerOnTimelineStore.java:499) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getAllApplications(ApplicationHistoryManagerOnTimelineStore.java:108) at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:84) at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:81) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) ... 51 more 2015-02-03 20:28:09,512 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /applicationhistory org.apache.hadoop.yarn.webapp.WebAppException: Error rendering block: nestLevel=6 expected 5 at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77) {quote} Behaviour with AHS with file based history store -Apphistory url is working -No attempt entries are shown for each application. Based on inital analysis when RM switches ,application attempts from state store are not replayed but only applications are. 
So when /applicaitonhistory url is accessed it tries for all attempt id and fails Avoid timeline events during RM recovery or restart --- Key: YARN-3127 URL: https://issues.apache.org/jira/browse/YARN-3127 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.6.0 Environment: RM HA with ATS Reporter: Bibin A Chundatt Assignee: Naganarasimha G R Priority: Critical Attachments: YARN-3127.20150213-1.patch, YARN-3127.20150329-1.patch 1.Start RM with HA and ATS configured and run some yarn applications 2.Once applications are finished sucessfully start timeline server 3.Now failover HA form active to standby 4.Access timeline server URL IP:PORT/applicationhistory //Note Earlier exception was thrown when accessed. Incomplete information is shown in the ATS web UI. i.e. attempt container and other information is not displayed. Also even if timeline server is started with RM, and on RM restart/ recovery ATS events for the applications already existing in ATS are resent which is not required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3711: --- Issue Type: Sub-task (was: Improvement) Parent: YARN-149 Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558815#comment-14558815 ] Hadoop QA commented on YARN-3712: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 35s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 36s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 2s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 5s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 41m 58s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735262/YARN-3712.02.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 39077db | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8079/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8079/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8079/console | This message was automatically generated. ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously - Key: YARN-3712 URL: https://issues.apache.org/jira/browse/YARN-3712 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3712.01.patch, YARN-3712.02.patch It will save some time by handling event CLEANUP_CONTAINER asynchronously. This improvement will be useful for cases that cleaning up container cost a little long time(e.g. for our case: we are running Docker container on NM, it will take above 1 seconds to clean up one docker container. ) and many containers to clean up(e.g. NM need clean up all running containers when NM shutdown). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3644) Node manager shuts down if unable to connect with RM
[ https://issues.apache.org/jira/browse/YARN-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raju Bairishetti updated YARN-3644: --- Attachment: YARN-3644.patch Introduced a new config **NODEMANAGER_SHUTDOWN_ON_RM_CONNECTION_FAILURES** to let users decide whether the NM should shut down when it is unable to connect to the RM. Keeping the default value as true to honour the current behavior. Node manager shuts down if unable to connect with RM Key: YARN-3644 URL: https://issues.apache.org/jira/browse/YARN-3644 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Srikanth Sundarrajan Assignee: Raju Bairishetti Attachments: YARN-3644.patch When the NM is unable to connect to the RM, the NM shuts itself down.
{code}
} catch (ConnectException e) {
  //catch and throw the exception if tried MAX wait time to connect RM
  dispatcher.getEventHandler().handle(
      new NodeManagerEvent(NodeManagerEventType.SHUTDOWN));
  throw new YarnRuntimeException(e);
{code}
In large clusters, if the RM is down for maintenance for a longer period, all the NMs shut themselves down, requiring additional work to bring them back up. Setting yarn.resourcemanager.connect.wait-ms to -1 has other side effects, where non-connection failures are retried infinitely by all YarnClients (via RMProxy). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
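A rough sketch of the behaviour the comment describes is shown below; the property name, default value, and surrounding types are assumptions made for illustration, not the contents of the attached patch.
{code}
// Illustrative sketch only: gate the NM shutdown on a configuration flag.
import java.net.ConnectException;

public class ShutdownOnRmFailureSketch {

  // Hypothetical property name and default; the real constant comes from the patch.
  static final String SHUTDOWN_ON_RM_CONNECTION_FAILURES =
      "yarn.nodemanager.shutdown-on-rm-connection-failures";
  static final boolean DEFAULT_SHUTDOWN_ON_RM_CONNECTION_FAILURES = true;

  interface ShutdownHandler {
    void requestShutdown();
  }

  static void handleRmConnectFailure(ConnectException cause,
      boolean shutdownOnFailure, ShutdownHandler handler) {
    if (shutdownOnFailure) {
      // Current behaviour, kept as the default: ask the NM to shut down.
      handler.requestShutdown();
      throw new RuntimeException(cause);
    }
    // When the flag is false, stay up and let the retry policy try again later.
    System.err.println("RM unreachable, NM keeps running and will retry: "
        + cause.getMessage());
  }
}
{code}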
[jira] [Updated] (YARN-3127) Avoid timeline events during RM recovery or restart
[ https://issues.apache.org/jira/browse/YARN-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-3127: Summary: Avoid timeline events during RM recovery or restart (was: Apphistory url crashes when RM switches with ATS enabled) Avoid timeline events during RM recovery or restart --- Key: YARN-3127 URL: https://issues.apache.org/jira/browse/YARN-3127 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.6.0 Environment: RM HA with ATS Reporter: Bibin A Chundatt Assignee: Naganarasimha G R Priority: Critical Attachments: YARN-3127.20150213-1.patch, YARN-3127.20150329-1.patch 1.Start RM with HA and ATS configured and run some yarn applications 2.Once applications are finished sucessfully start timeline server 3.Now failover HA form active to standby 4.Access timeline server URL IP:PORT/applicationhistory Result: Application history URL fails with below info {quote} 2015-02-03 20:28:09,511 ERROR org.apache.hadoop.yarn.webapp.View: Failed to read the applications. java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643) at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:80) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77) at org.apache.hadoop.yarn.webapp.View.render(View.java:235) at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) ... Caused by: org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: The entity for application attempt appattempt_1422972608379_0001_01 doesn't exist in the timeline store at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getApplicationAttempt(ApplicationHistoryManagerOnTimelineStore.java:151) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.generateApplicationReport(ApplicationHistoryManagerOnTimelineStore.java:499) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getAllApplications(ApplicationHistoryManagerOnTimelineStore.java:108) at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:84) at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:81) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) ... 51 more 2015-02-03 20:28:09,512 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /applicationhistory org.apache.hadoop.yarn.webapp.WebAppException: Error rendering block: nestLevel=6 expected 5 at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77) {quote} Behaviour with AHS with file based history store -Apphistory url is working -No attempt entries are shown for each application. Based on inital analysis when RM switches ,application attempts from state store are not replayed but only applications are. So when /applicaitonhistory url is accessed it tries for all attempt id and fails -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3713) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition
[ https://issues.apache.org/jira/browse/YARN-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-3713: Labels: cleanup maintenance (was: ) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition Key: YARN-3713 URL: https://issues.apache.org/jira/browse/YARN-3713 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Labels: cleanup, maintenance remove duplicate function call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition. {{storeContainerDiagnostics}} is already called at ContainerImpl#addDiagnostics. {code} private void addDiagnostics(String... diags) { for (String s : diags) { this.diagnostics.append(s); } try { stateStore.storeContainerDiagnostics(containerId, diagnostics); } catch (IOException e) { LOG.warn(Unable to update diagnostics in state store for + containerId, e); } } {code} So we don't need call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition#transition. {code} container.addDiagnostics(updateEvent.getDiagnosticsUpdate(), \n); try { container.stateStore.storeContainerDiagnostics(container.containerId, container.diagnostics); } catch (IOException e) { LOG.warn(Unable to update state store diagnostics for + container.containerId, e); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3713) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition
zhihai xu created YARN-3713: --- Summary: Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition Key: YARN-3713 URL: https://issues.apache.org/jira/browse/YARN-3713 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Remove the duplicate function call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition. {{storeContainerDiagnostics}} is already called in ContainerImpl#addDiagnostics.
{code}
private void addDiagnostics(String... diags) {
  for (String s : diags) {
    this.diagnostics.append(s);
  }
  try {
    stateStore.storeContainerDiagnostics(containerId, diagnostics);
  } catch (IOException e) {
    LOG.warn("Unable to update diagnostics in state store for " + containerId, e);
  }
}
{code}
So we don't need to call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition#transition.
{code}
container.addDiagnostics(updateEvent.getDiagnosticsUpdate(), "\n");
try {
  container.stateStore.storeContainerDiagnostics(container.containerId,
      container.diagnostics);
} catch (IOException e) {
  LOG.warn("Unable to update state store diagnostics for " + container.containerId, e);
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
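For illustration, a self-contained sketch of the simplified transition after the cleanup might look like the following; the types are stand-ins for ContainerImpl and the NM state store, not the real classes.
{code}
// Illustrative sketch: diagnostics are persisted in exactly one place.
import java.io.IOException;

public class DiagnosticsDedupSketch {

  interface StateStore {
    void storeContainerDiagnostics(String containerId, CharSequence diags) throws IOException;
  }

  static class Container {
    final String containerId;
    final StringBuilder diagnostics = new StringBuilder();
    final StateStore stateStore;

    Container(String containerId, StateStore stateStore) {
      this.containerId = containerId;
      this.stateStore = stateStore;
    }

    void addDiagnostics(String... diags) {
      for (String s : diags) {
        diagnostics.append(s);
      }
      try {
        // The single store call; callers no longer repeat it.
        stateStore.storeContainerDiagnostics(containerId, diagnostics);
      } catch (IOException e) {
        System.err.println("Unable to update diagnostics in state store for " + containerId);
      }
    }
  }

  // The diagnostics-update transition now only appends.
  static void onDiagnosticsUpdate(Container container, String update) {
    container.addDiagnostics(update, "\n");
  }
}
{code}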
[jira] [Updated] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3711: --- Attachment: YARN-3711.002.patch I attached patch. 002 fixes markdown formatting nits too. Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3711.002.patch Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3711: --- Description: There should be explanation about webapp address in addition to RPC address. AM proxy filter needs explicit definition of {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} to get proper default addresses in RM-HA mode now. was:Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3711.002.patch There should be explanation about webapp address in addition to RPC address. AM proxy filter needs explicit definition of {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} to get proper default addresses in RM-HA mode now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
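As a concrete illustration of the settings the description refers to, the snippet below sets the per-RM webapp addresses programmatically; the rm-ids ({{rm1}}/{{rm2}}), hostnames, and ports are made-up examples, and in a real cluster these entries would normally live in yarn-site.xml.
{code}
import org.apache.hadoop.conf.Configuration;

public class RmHaWebappAddressExample {

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Per-RM HTTP webapp addresses; hostnames and rm-ids are illustrative.
    conf.set("yarn.resourcemanager.webapp.address.rm1", "rm1.example.com:8088");
    conf.set("yarn.resourcemanager.webapp.address.rm2", "rm2.example.com:8088");
    // HTTPS variants, if the RM web UI is served over HTTPS.
    conf.set("yarn.resourcemanager.webapp.https.address.rm1", "rm1.example.com:8090");
    conf.set("yarn.resourcemanager.webapp.https.address.rm2", "rm2.example.com:8090");

    System.out.println(conf.get("yarn.resourcemanager.webapp.address.rm1"));
  }
}
{code}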
[jira] [Commented] (YARN-3713) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition
[ https://issues.apache.org/jira/browse/YARN-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558837#comment-14558837 ] Hadoop QA commented on YARN-3713: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 44s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 37s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 51s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 36s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 3s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 11s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 42m 36s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735265/YARN-3713.000.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 56996a6 | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8080/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8080/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8080/console | This message was automatically generated. Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition Key: YARN-3713 URL: https://issues.apache.org/jira/browse/YARN-3713 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Labels: cleanup Attachments: YARN-3713.000.patch remove duplicate function call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition. {{storeContainerDiagnostics}} is already called at ContainerImpl#addDiagnostics. {code} private void addDiagnostics(String... diags) { for (String s : diags) { this.diagnostics.append(s); } try { stateStore.storeContainerDiagnostics(containerId, diagnostics); } catch (IOException e) { LOG.warn(Unable to update diagnostics in state store for + containerId, e); } } {code} So we don't need call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition#transition. 
{code}
container.addDiagnostics(updateEvent.getDiagnosticsUpdate(), "\n");
try {
  container.stateStore.storeContainerDiagnostics(container.containerId,
      container.diagnostics);
} catch (IOException e) {
  LOG.warn("Unable to update state store diagnostics for " + container.containerId, e);
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3713) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition
[ https://issues.apache.org/jira/browse/YARN-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-3713: Attachment: YARN-3713.000.patch Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition Key: YARN-3713 URL: https://issues.apache.org/jira/browse/YARN-3713 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Labels: cleanup, maintenance Attachments: YARN-3713.000.patch remove duplicate function call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition. {{storeContainerDiagnostics}} is already called at ContainerImpl#addDiagnostics. {code} private void addDiagnostics(String... diags) { for (String s : diags) { this.diagnostics.append(s); } try { stateStore.storeContainerDiagnostics(containerId, diagnostics); } catch (IOException e) { LOG.warn(Unable to update diagnostics in state store for + containerId, e); } } {code} So we don't need call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition#transition. {code} container.addDiagnostics(updateEvent.getDiagnosticsUpdate(), \n); try { container.stateStore.storeContainerDiagnostics(container.containerId, container.diagnostics); } catch (IOException e) { LOG.warn(Unable to update state store diagnostics for + container.containerId, e); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3713) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition
[ https://issues.apache.org/jira/browse/YARN-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-3713: Labels: cleanup (was: cleanup maintenance) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition Key: YARN-3713 URL: https://issues.apache.org/jira/browse/YARN-3713 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Labels: cleanup Attachments: YARN-3713.000.patch remove duplicate function call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition. {{storeContainerDiagnostics}} is already called at ContainerImpl#addDiagnostics. {code} private void addDiagnostics(String... diags) { for (String s : diags) { this.diagnostics.append(s); } try { stateStore.storeContainerDiagnostics(containerId, diagnostics); } catch (IOException e) { LOG.warn(Unable to update diagnostics in state store for + containerId, e); } } {code} So we don't need call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition#transition. {code} container.addDiagnostics(updateEvent.getDiagnosticsUpdate(), \n); try { container.stateStore.storeContainerDiagnostics(container.containerId, container.diagnostics); } catch (IOException e) { LOG.warn(Unable to update state store diagnostics for + container.containerId, e); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3711: --- Attachment: YARN-3711.001.patch Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3711.001.patch Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3714) AM proxy filter can not get proper default proxy address if RM-HA is enabled
[ https://issues.apache.org/jira/browse/YARN-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3714: --- Description: Default proxy address could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} explicitly if RM-HA is enabled. AM proxy filter can not get proper default proxy address if RM-HA is enabled Key: YARN-3714 URL: https://issues.apache.org/jira/browse/YARN-3714 Project: Hadoop YARN Issue Type: Bug Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Default proxy address could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} explicitly if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3712: --- Attachment: YARN-3712.02.patch Fix checkstyle warnings. ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously - Key: YARN-3712 URL: https://issues.apache.org/jira/browse/YARN-3712 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3712.01.patch, YARN-3712.02.patch Handling the CLEANUP_CONTAINER event asynchronously will save some time. This improvement is useful when cleaning up a container takes a fairly long time (e.g., in our case we run Docker containers on the NM, and it takes over 1 second to clean up one Docker container) and there are many containers to clean up (e.g., the NM needs to clean up all running containers when it shuts down). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
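A minimal sketch of the idea in the description, assuming a small dedicated thread pool so that CLEANUP_CONTAINER work runs off the event-dispatching thread; the class, method names, and pool size are illustrative, not the actual ContainersLauncher change.
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncContainerCleanupSketch {

  // Dedicated pool so slow cleanups (e.g. stopping a Docker container)
  // do not block the event dispatcher.
  private final ExecutorService cleanupPool = Executors.newFixedThreadPool(4);

  // Called from the event-handling thread; returns immediately.
  public void handleCleanupEvent(final String containerId) {
    cleanupPool.submit(new Runnable() {
      @Override
      public void run() {
        // The slow cleanup work runs here, off the dispatcher thread.
        System.out.println("cleaning up " + containerId);
      }
    });
  }

  // On NM shutdown, stop accepting new work and let queued cleanups drain.
  public void stop() {
    cleanupPool.shutdown();
  }
}
{code}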
[jira] [Updated] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3711: --- Attachment: (was: YARN-3711.001.patch) Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558831#comment-14558831 ] Hadoop QA commented on YARN-3711: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 2m 54s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 2m 56s | Site still builds. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | | | 6m 14s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735274/YARN-3711.002.patch | | Optional Tests | site | | git revision | trunk / 56996a6 | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8081/console | This message was automatically generated. Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3711.002.patch Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3644) Node manager shuts down if unable to connect with RM
[ https://issues.apache.org/jira/browse/YARN-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558902#comment-14558902 ] Hadoop QA commented on YARN-3644: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 38s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 35s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 43s | The applied patch generated 1 new checkstyle issues (total was 214, now 215). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 48s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 26s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 56s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 6m 15s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 49m 2s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735276/YARN-3644.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 56996a6 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8082/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8082/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8082/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8082/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8082/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8082/console | This message was automatically generated. Node manager shuts down if unable to connect with RM Key: YARN-3644 URL: https://issues.apache.org/jira/browse/YARN-3644 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Srikanth Sundarrajan Assignee: Raju Bairishetti Attachments: YARN-3644.patch When NM is unable to connect to RM, NM shuts itself down. 
{code}
} catch (ConnectException e) {
  //catch and throw the exception if tried MAX wait time to connect RM
  dispatcher.getEventHandler().handle(
      new NodeManagerEvent(NodeManagerEventType.SHUTDOWN));
  throw new YarnRuntimeException(e);
{code}
In large clusters, if the RM is down for maintenance for a longer period, all the NMs shut themselves down, requiring additional work to bring up the NMs. Setting yarn.resourcemanager.connect.wait-ms to -1 has other side effects, where non-connection failures are retried infinitely by all YarnClients (via RMProxy). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3644) Node manager shuts down if unable to connect with RM
[ https://issues.apache.org/jira/browse/YARN-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558955#comment-14558955 ] Hadoop QA commented on YARN-3644: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 36s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 32s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 20s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 49s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 27s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 57s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 6m 26s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 49m 19s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735288/YARN-3644.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 56996a6 | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8083/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8083/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8083/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8083/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8083/console | This message was automatically generated. Node manager shuts down if unable to connect with RM Key: YARN-3644 URL: https://issues.apache.org/jira/browse/YARN-3644 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Srikanth Sundarrajan Assignee: Raju Bairishetti Attachments: YARN-3644.001.patch, YARN-3644.patch When NM is unable to connect to RM, NM shuts itself down. {code} } catch (ConnectException e) { //catch and throw the exception if tried MAX wait time to connect RM dispatcher.getEventHandler().handle( new NodeManagerEvent(NodeManagerEventType.SHUTDOWN)); throw new YarnRuntimeException(e); {code} In large clusters, if RM is down for maintenance for longer period, all the NMs shuts themselves down, requiring additional work to bring up the NMs. 
Setting the yarn.resourcemanager.connect.wait-ms to -1 has other side effects, where non connection failures are being retried infinitely by all YarnClients (via RMProxy). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3644) Node manager shuts down if unable to connect with RM
[ https://issues.apache.org/jira/browse/YARN-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raju Bairishetti updated YARN-3644: --- Attachment: YARN-3644.001.patch Node manager shuts down if unable to connect with RM Key: YARN-3644 URL: https://issues.apache.org/jira/browse/YARN-3644 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Srikanth Sundarrajan Assignee: Raju Bairishetti Attachments: YARN-3644.001.patch, YARN-3644.patch When NM is unable to connect to RM, NM shuts itself down. {code} } catch (ConnectException e) { //catch and throw the exception if tried MAX wait time to connect RM dispatcher.getEventHandler().handle( new NodeManagerEvent(NodeManagerEventType.SHUTDOWN)); throw new YarnRuntimeException(e); {code} In large clusters, if RM is down for maintenance for longer period, all the NMs shuts themselves down, requiring additional work to bring up the NMs. Setting the yarn.resourcemanager.connect.wait-ms to -1 has other side effects, where non connection failures are being retried infinitely by all YarnClients (via RMProxy). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558941#comment-14558941 ] Tsuyoshi Ozawa commented on YARN-2336: -- +1, committing this shortly. Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub queues in Fair Scheduler, the REST api returns a missing '[' bracket JSON for childQueues. This issue was found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558946#comment-14558946 ] Hudson commented on YARN-2336: -- FAILURE: Integrated in Hadoop-trunk-Commit #7901 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7901/]) YARN-2336. Fair scheduler's REST API returns a missing '[' bracket JSON for deep queue tree. Contributed by Kenji Kikushima and Akira Ajisaka. (ozawa: rev 9a3d617b6325d8918f2833c3e9ce329ecada9242) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesFairScheduler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfoList.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/JAXBContextResolver.java Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub queues in Fair Scheduler, REST api returns a missing '[' blacket JSON for childQueues. This issue found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559003#comment-14559003 ] Hudson commented on YARN-2336: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #208 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/208/]) YARN-2336. Fair scheduler's REST API returns a missing '[' bracket JSON for deep queue tree. Contributed by Kenji Kikushima and Akira Ajisaka. (ozawa: rev 9a3d617b6325d8918f2833c3e9ce329ecada9242) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfoList.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/JAXBContextResolver.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesFairScheduler.java * hadoop-yarn-project/CHANGES.txt Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Fix For: 2.8.0 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub queues in Fair Scheduler, REST api returns a missing '[' blacket JSON for childQueues. This issue found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2238) filtering on UI sticks even if I move away from the page
[ https://issues.apache.org/jira/browse/YARN-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559005#comment-14559005 ] Hudson commented on YARN-2238: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #208 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/208/]) YARN-2238. Filtering on UI sticks even if I move away from the page. (xgong: rev 39077dba2e877420e7470df253f6154f6ecc64ec) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/JQueryUI.java filtering on UI sticks even if I move away from the page Key: YARN-2238 URL: https://issues.apache.org/jira/browse/YARN-2238 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Jian He Labels: usability Fix For: 2.7.1 Attachments: YARN-2238.patch, YARN-2238.png, filtered.png The main data table in many web pages (RM, AM, etc.) seems to show an unexpected filtering behavior. If I filter the table by typing something in the key or value field (or I suspect any search field), the data table gets filtered. The example I used is the job configuration page for a MR job. That is expected. However, when I move away from that page and visit any other web page of the same type (e.g. a job configuration page), the page is rendered with the filtering! That is unexpected. What's even stranger is that it does not render the filtering term. As a result, I have a page that's mysteriously filtered but doesn't tell me what it's filtering on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3716) Output node label expression in ResourceRequestPBImpl.toString
[ https://issues.apache.org/jira/browse/YARN-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559065#comment-14559065 ] Hadoop QA commented on YARN-3716: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 38s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 39s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 52s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 24s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 1m 56s | Tests passed in hadoop-yarn-common. | | | | 38m 39s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735305/YARN-3716.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 9a3d617 | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8084/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8084/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8084/console | This message was automatically generated. Output node label expression in ResourceRequestPBImpl.toString -- Key: YARN-3716 URL: https://issues.apache.org/jira/browse/YARN-3716 Project: Hadoop YARN Issue Type: Improvement Components: api Reporter: Xianyin Xin Assignee: Xianyin Xin Priority: Minor Attachments: YARN-3716.001.patch It's convenient for debug and log trace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3715) Oozie jobs are failed with IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') on se
Sergey Svinarchuk created YARN-3715: --- Summary: Oozie jobs are failed with IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') on secure cluster with RM HA Key: YARN-3715 URL: https://issues.apache.org/jira/browse/YARN-3715 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Sergey Svinarchuk 2015-05-21 16:06:55,887 WARN ActionStartXCommand:544 - SERVER[centos6.localdomain] USER[mapr] GROUP[-] TOKEN[] APP[Hive] JOB[001-150521123655733-oozie-mapr-W] ACTION[001-150521123655733-oozie-mapr-W@Hive] Error starting action [Hive]. ErrorType [ERROR], ErrorCode [IllegalArgumentException], Message [IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address')] org.apache.oozie.action.ActionExecutorException: IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:401) at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:979) at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1134) at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228) at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63) at org.apache.oozie.command.XCommand.call(XCommand.java:281) at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:323) at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:252) at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:211) at org.apache.hadoop.conf.Configuration.getSocketAddr(Configuration.java:1788) at org.apache.hadoop.mapred.Master.getMasterAddress(Master.java:58) at org.apache.hadoop.mapred.Master.getMasterPrincipal(Master.java:67) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:114) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80) at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:127) at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:460) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562) at 
org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548) at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:964) ... 10 more 2015-05-21 16:06:55,889 WARN ActionStartXCommand:544 - SERVER[centos6.localdomain] USER[mapr] GROUP[-] TOKEN[] APP[Hive] JOB[001-150521123655733-oozie-mapr-W] ACTION[001-150521123655733-oozie-mapr-W@Hive] Setting Action Status to [DONE] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3716) Output node label expression in ResourceRequestPBImpl.toString
Xianyin Xin created YARN-3716: - Summary: Output node label expression in ResourceRequestPBImpl.toString Key: YARN-3716 URL: https://issues.apache.org/jira/browse/YARN-3716 Project: Hadoop YARN Issue Type: Improvement Components: api Reporter: Xianyin Xin Assignee: Xianyin Xin Priority: Minor It's convenient for debug and log trace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3715) Oozie jobs are failed with IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') on
[ https://issues.apache.org/jira/browse/YARN-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558994#comment-14558994 ] Sergey Svinarchuk commented on YARN-3715: - There are problem in yarn.resourcemanager.address property. When we try submitting regular job this property set to 0.0.0.0:8032, but when Oozie submitting job this property set to jobtracker property from file job.propertires. In case with RM HA we set to job.properties jobTracker=maprfs:/// and then yarn.resourcemanager.address also set to maprfs:///. Then Master.getMasterAddress get socket address from Configuration as maprfs:/// and call NetUtils.createSocketAddr(address, defaultPort, name), but NetUtils.createSocketAddr can work only with format “hostname:port”. I think that for case when using RM HA need call getSocketAddr(String name, String defaultAddress, int defaultPort) from YarnConfiguration class. Oozie jobs are failed with IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') on secure cluster with RM HA -- Key: YARN-3715 URL: https://issues.apache.org/jira/browse/YARN-3715 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Sergey Svinarchuk 2015-05-21 16:06:55,887 WARN ActionStartXCommand:544 - SERVER[centos6.localdomain] USER[mapr] GROUP[-] TOKEN[] APP[Hive] JOB[001-150521123655733-oozie-mapr-W] ACTION[001-150521123655733-oozie-mapr-W@Hive] Error starting action [Hive]. ErrorType [ERROR], ErrorCode [IllegalArgumentException], Message [IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address')] org.apache.oozie.action.ActionExecutorException: IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:401) at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:979) at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1134) at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228) at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63) at org.apache.oozie.command.XCommand.call(XCommand.java:281) at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:323) at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:252) at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:211) at org.apache.hadoop.conf.Configuration.getSocketAddr(Configuration.java:1788) at org.apache.hadoop.mapred.Master.getMasterAddress(Master.java:58) at org.apache.hadoop.mapred.Master.getMasterPrincipal(Master.java:67) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:114) at 
org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80) at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:127) at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:460) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) at
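A sketch of the fix direction suggested in the comment above: resolve the RM address through YarnConfiguration, whose getSocketAddr override is RM-HA aware, rather than feeding the raw yarn.resourcemanager.address value (maprfs:/// in this setup) into NetUtils.createSocketAddr. This is only an illustration under that assumption, not the actual Master.getMasterAddress code.
{code}
// Illustration only; not the actual Master.getMasterAddress fix.
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

final class RmAddressLookupSketch {
  static InetSocketAddress resolveRmAddress(Configuration jobConf) {
    // Re-wrap the job configuration so the HA-aware override of getSocketAddr is used.
    YarnConfiguration yarnConf = new YarnConfiguration(jobConf);
    return yarnConf.getSocketAddr(
        YarnConfiguration.RM_ADDRESS,
        YarnConfiguration.DEFAULT_RM_ADDRESS,
        YarnConfiguration.DEFAULT_RM_PORT);
  }
}
{code}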
[jira] [Updated] (YARN-3716) Output node label expression in ResourceRequestPBImpl.toString
[ https://issues.apache.org/jira/browse/YARN-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xianyin Xin updated YARN-3716: -- Attachment: YARN-3716.001.patch Output node label expression in ResourceRequestPBImpl.toString -- Key: YARN-3716 URL: https://issues.apache.org/jira/browse/YARN-3716 Project: Hadoop YARN Issue Type: Improvement Components: api Reporter: Xianyin Xin Assignee: Xianyin Xin Priority: Minor Attachments: YARN-3716.001.patch It's convenient for debug and log trace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
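The improvement is small enough to sketch. A standalone illustration of what the enriched toString output could contain; the attached YARN-3716.001.patch changes ResourceRequestPBImpl.toString() directly and may format differently.
{code}
// Standalone illustration; the attached patch may format the output differently.
import org.apache.hadoop.yarn.api.records.ResourceRequest;

final class ResourceRequestToStringSketch {
  static String describe(ResourceRequest req) {
    return "{Priority: " + req.getPriority()
        + ", Capability: " + req.getCapability()
        + ", # Containers: " + req.getNumContainers()
        + ", Location: " + req.getResourceName()
        + ", Relax Locality: " + req.getRelaxLocality()
        + ", Node Label Expression: " + req.getNodeLabelExpression() + "}";
  }
}
{code}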
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559017#comment-14559017 ] Hudson commented on YARN-2336: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #939 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/939/]) YARN-2336. Fair scheduler's REST API returns a missing '[' bracket JSON for deep queue tree. Contributed by Kenji Kikushima and Akira Ajisaka. (ozawa: rev 9a3d617b6325d8918f2833c3e9ce329ecada9242) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesFairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfoList.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/JAXBContextResolver.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Fix For: 2.8.0 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub queues in Fair Scheduler, REST api returns a missing '[' blacket JSON for childQueues. This issue found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2238) filtering on UI sticks even if I move away from the page
[ https://issues.apache.org/jira/browse/YARN-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559019#comment-14559019 ] Hudson commented on YARN-2238: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #939 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/939/]) YARN-2238. Filtering on UI sticks even if I move away from the page. (xgong: rev 39077dba2e877420e7470df253f6154f6ecc64ec) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/JQueryUI.java * hadoop-yarn-project/CHANGES.txt filtering on UI sticks even if I move away from the page Key: YARN-2238 URL: https://issues.apache.org/jira/browse/YARN-2238 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Jian He Labels: usability Fix For: 2.7.1 Attachments: YARN-2238.patch, YARN-2238.png, filtered.png The main data table in many web pages (RM, AM, etc.) seems to show an unexpected filtering behavior. If I filter the table by typing something in the key or value field (or I suspect any search field), the data table gets filtered. The example I used is the job configuration page for a MR job. That is expected. However, when I move away from that page and visit any other web page of the same type (e.g. a job configuration page), the page is rendered with the filtering! That is unexpected. What's even stranger is that it does not render the filtering term. As a result, I have a page that's mysteriously filtered but doesn't tell me what it's filtering on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559099#comment-14559099 ] Akira AJISAKA commented on YARN-2336: - Thanks [~ozawa] and [~kj-ki]! Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Fix For: 2.8.0 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub queues in Fair Scheduler, REST api returns a missing '[' blacket JSON for childQueues. This issue found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS
[ https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559475#comment-14559475 ] Varun Vasudev commented on YARN-160: The change for the Windows cpu limits fixes a bug in the current implementation. The current implementation allows YARN containers to exceed the configured cpu limit in some cases. nodemanagers should obtain cpu/memory values from underlying OS --- Key: YARN-160 URL: https://issues.apache.org/jira/browse/YARN-160 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.3-alpha Reporter: Alejandro Abdelnur Assignee: Varun Vasudev Labels: BB2015-05-TBR Attachments: YARN-160.005.patch, YARN-160.006.patch, YARN-160.007.patch, YARN-160.008.patch, apache-yarn-160.0.patch, apache-yarn-160.1.patch, apache-yarn-160.2.patch, apache-yarn-160.3.patch As mentioned in YARN-2 *NM memory and CPU configs* Currently these values are coming from the config of the NM, we should be able to obtain those values from the OS (ie, in the case of Linux from /proc/meminfo /proc/cpuinfo). As this is highly OS dependent we should have an interface that obtains this information. In addition implementations of this interface should be able to specify a mem/cpu offset (amount of mem/cpu not to be avail as YARN resource), this would allow to reserve mem/cpu for the OS and other services outside of YARN containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3719) Improve Solaris support in YARN
[ https://issues.apache.org/jira/browse/YARN-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Burlison updated YARN-3719: Summary: Improve Solaris support in YARN (was: Improve Solaris support in HDFS) Improve Solaris support in YARN --- Key: YARN-3719 URL: https://issues.apache.org/jira/browse/YARN-3719 Project: Hadoop YARN Issue Type: Task Components: build Affects Versions: 2.7.0 Environment: Solaris x86, Solaris sparc Reporter: Alan Burlison At present the YARN native components aren't fully supported on Solaris primarily due to differences between Linux and Solaris. This top-level task will be used to group together both existing and new issues related to this work. A second goal is to improve YARN performance and functionality on Solaris wherever possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3719) Improve Solaris support in HDFS
Alan Burlison created YARN-3719: --- Summary: Improve Solaris support in HDFS Key: YARN-3719 URL: https://issues.apache.org/jira/browse/YARN-3719 Project: Hadoop YARN Issue Type: Task Components: build Affects Versions: 2.7.0 Environment: Solaris x86, Solaris sparc Reporter: Alan Burlison At present the YARN native components aren't fully supported on Solaris primarily due to differences between Linux and Solaris. This top-level task will be used to group together both existing and new issues related to this work. A second goal is to improve YARN performance and functionality on Solaris wherever possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2238) filtering on UI sticks even if I move away from the page
[ https://issues.apache.org/jira/browse/YARN-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559582#comment-14559582 ] Sangjin Lee commented on YARN-2238: --- Sorry for the belated comment. The changes look good to me. Thanks for working on this [~jianhe]! filtering on UI sticks even if I move away from the page Key: YARN-2238 URL: https://issues.apache.org/jira/browse/YARN-2238 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Jian He Labels: usability Fix For: 2.7.1 Attachments: YARN-2238.patch, YARN-2238.png, filtered.png The main data table in many web pages (RM, AM, etc.) seems to show an unexpected filtering behavior. If I filter the table by typing something in the key or value field (or I suspect any search field), the data table gets filtered. The example I used is the job configuration page for a MR job. That is expected. However, when I move away from that page and visit any other web page of the same type (e.g. a job configuration page), the page is rendered with the filtering! That is unexpected. What's even stranger is that it does not render the filtering term. As a result, I have a page that's mysteriously filtered but doesn't tell me what it's filtering on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-41: -- Attachment: YARN-41-7.patch [~djp] I have updated the patch with the review comments. Could you take a look? In the latest patch I have added a new NodeState, i.e. SHUTDOWN. bq. Add tests for new PB objects UnRegisterNodeManagerRequestPBImpl, UnRegisterNodeManagerResponsePBImpl into TestYarnServerApiClasses.java. I have added a test for UnRegisterNodeManagerRequestPBImpl but not for UnRegisterNodeManagerResponsePBImpl, since the latter has no state to verify and a test would add no value. The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41-7.patch, YARN-41.patch Instead of waiting for the NM expiry, the RM should remove and handle the NM which is shut down gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
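For readers unfamiliar with the PB record tests mentioned above, a round-trip test for the new request record typically looks like the sketch below; the accessor names are assumptions about the patch, not code copied from it.
{code}
// Hedged sketch of a PB round-trip test for the new request record. The
// setNodeId/getProto accessors and the proto-based constructor follow the usual
// *PBImpl pattern and are assumptions about the patch.
// UnRegisterNodeManagerRequestPBImpl itself is introduced by YARN-41-7.patch,
// so its import is omitted here.
import static org.junit.Assert.assertEquals;

import org.apache.hadoop.yarn.api.records.NodeId;
import org.junit.Test;

public class TestUnRegisterNodeManagerRequestPBImplSketch {
  @Test
  public void testRoundTrip() {
    UnRegisterNodeManagerRequestPBImpl original = new UnRegisterNodeManagerRequestPBImpl();
    original.setNodeId(NodeId.newInstance("host1", 1234));

    // Serialize to the protobuf message and rebuild the record from it.
    UnRegisterNodeManagerRequestPBImpl copy =
        new UnRegisterNodeManagerRequestPBImpl(original.getProto());
    assertEquals(original.getNodeId(), copy.getNodeId());
  }
}
{code}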
[jira] [Updated] (YARN-3719) Improve Solaris support in YARN
[ https://issues.apache.org/jira/browse/YARN-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Burlison updated YARN-3719: Issue Type: New Feature (was: Task) Improve Solaris support in YARN --- Key: YARN-3719 URL: https://issues.apache.org/jira/browse/YARN-3719 Project: Hadoop YARN Issue Type: New Feature Components: build Affects Versions: 2.7.0 Environment: Solaris x86, Solaris sparc Reporter: Alan Burlison At present the YARN native components aren't fully supported on Solaris primarily due to differences between Linux and Solaris. This top-level task will be used to group together both existing and new issues related to this work. A second goal is to improve YARN performance and functionality on Solaris wherever possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3720) Need comprehensive documentation for configuration CPU/memory resources on NodeManager
Vinod Kumar Vavilapalli created YARN-3720: - Summary: Need comprehensive documentation for configuration CPU/memory resources on NodeManager Key: YARN-3720 URL: https://issues.apache.org/jira/browse/YARN-3720 Project: Hadoop YARN Issue Type: Task Components: documentation, nodemanager Reporter: Vinod Kumar Vavilapalli Things are getting more and more complex after the likes of YARN-160. We need a document explaining how to configure cpu/memory values on a NodeManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3718) hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable
[ https://issues.apache.org/jira/browse/YARN-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559617#comment-14559617 ] Alan Burlison commented on YARN-3718: - Yes, I created a new top-level task and moved it under there as there are a couple of other YARN-related issues as well. -- Alan Burlison -- hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable - Key: YARN-3718 URL: https://issues.apache.org/jira/browse/YARN-3718 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Environment: BSD OSX Solaris Windows Linux Reporter: Alan Burlison hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c makes use of the Linux-only Cgroups feature (http://en.wikipedia.org/wiki/Cgroups) when Hadoop is built on Linux, but there is no corresponding functionality for non-Linux platforms. Other platforms provide similar functionality, e.g. Solaris has an extensive range of resource management features (http://docs.oracle.com/cd/E23824_01/html/821-1460/index.html). Work is needed to abstract the resource management features of Yarn so that the same facilities for resource management can be provided on all platforms that provide the requisite functionality, -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.
[ https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559577#comment-14559577 ] Xuan Gong commented on YARN-221: bq. All the known policies will be part of YARN including SampleRateContainerLogAggregationPolicy. So we still need to config sample rate for that policy. If we don't put it in YarnConfiguration, where can we put it? It seems we already have a bunch of configuration properties in YarnConfiguration that are specific the plugin implementation such as container executor properties. I thought about this. How about adding a new protocol field: String ContainerLogAggregationPolicyParameter along with ContainerLogAggregationPolicy in logAggregationContext. In ContainerLogAggregationPolicyParameter, users can define any parameter format which their ContainerLogAggregationPolicy can understand. For example, we could define ContainerLogAggregationPolicyParameter as SR:0.2 and in SampleRateContainerLogAggregationPolicy, we could add implementation to understand and parse the parameter. Also, we could change to {code} public interface ContainerLogAggregationPolicy { public boolean shouldDoLogAggregation(ContainerId containerId, int exitCode); public void parseParameters(String parameters) } {code} bq. How MR overrides the default policy. Maybe we can have YarnRunner at MR level honor yarn property yarn.container-log-aggregation-policy.class on per job level when it creates the ApplicationSubmissionContext with the proper LogAggregationContext. In that way we don't have to create extra log aggregation properties specific at MR layer. Good question. Another possible solution could be parsing them from command-line if users use ToolRunner.run to launch their MR application. NM should provide a way for AM to tell it not to aggregate logs. Key: YARN-221 URL: https://issues.apache.org/jira/browse/YARN-221 Project: Hadoop YARN Issue Type: Sub-task Components: log-aggregation, nodemanager Reporter: Robert Joseph Evans Assignee: Ming Ma Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, YARN-221-trunk-v3.patch, YARN-221-trunk-v4.patch, YARN-221-trunk-v5.patch The NodeManager should provide a way for an AM to tell it that either the logs should not be aggregated, that they should be aggregated with a high priority, or that they should be aggregated but with a lower priority. The AM should be able to do this in the ContainerLaunch context to provide a default value, but should also be able to update the value when the container is released. This would allow for the NM to not aggregate logs in some cases, and avoid connection to the NN at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
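To make the proposal above concrete, here is a hedged sketch of a sample-rate policy that understands a parameter string such as SR:0.2. The interface shape follows the comment; the parsing format and the per-container sampling choice are assumptions for illustration only.
{code}
// Hedged sketch of the "policy parameters" idea; not part of any attached patch.
import org.apache.hadoop.yarn.api.records.ContainerId;

interface ContainerLogAggregationPolicy {
  boolean shouldDoLogAggregation(ContainerId containerId, int exitCode);
  void parseParameters(String parameters);
}

class SampleRateContainerLogAggregationPolicy implements ContainerLogAggregationPolicy {
  private float sampleRate = 1.0f;          // default: aggregate everything

  @Override
  public void parseParameters(String parameters) {
    // Expected form "SR:<fraction>", e.g. "SR:0.2".
    if (parameters != null && parameters.startsWith("SR:")) {
      sampleRate = Float.parseFloat(parameters.substring(3));
    }
  }

  @Override
  public boolean shouldDoLogAggregation(ContainerId containerId, int exitCode) {
    // Always keep logs of failed containers; sample the successful ones
    // deterministically per container id rather than randomly.
    if (exitCode != 0) {
      return true;
    }
    long id = containerId.getContainerId();
    return (id % 100) < (long) (sampleRate * 100);
  }
}
{code}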
[jira] [Commented] (YARN-3718) hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable
[ https://issues.apache.org/jira/browse/YARN-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559584#comment-14559584 ] Karthik Kambatla commented on YARN-3718: We have container executors for each OS - default for all unix-based, Linux for linux, Windows for windows. Are you proposing adding a new executor for Solaris? If yes, we should mark it a new feature (instead of a bug) and update the title accordingly. hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable - Key: YARN-3718 URL: https://issues.apache.org/jira/browse/YARN-3718 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Environment: BSD OSX Solaris Windows Linux Reporter: Alan Burlison hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c makes use of the Linux-only Cgroups feature (http://en.wikipedia.org/wiki/Cgroups) when Hadoop is built on Linux, but there is no corresponding functionality for non-Linux platforms. Other platforms provide similar functionality, e.g. Solaris has an extensive range of resource management features (http://docs.oracle.com/cd/E23824_01/html/821-1460/index.html). Work is needed to abstract the resource management features of Yarn so that the same facilities for resource management can be provided on all platforms that provide the requisite functionality, -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3719) Improve Solaris support in YARN
[ https://issues.apache.org/jira/browse/YARN-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559598#comment-14559598 ] Alan Burlison commented on YARN-3719: - Solaris-related changes to HADOOP and HDFS are covered under the two top-level issues: HADOOP-11985 Improve Solaris support in Hadoop HDFS-8478 Improve Solaris support in HDFS Improve Solaris support in YARN --- Key: YARN-3719 URL: https://issues.apache.org/jira/browse/YARN-3719 Project: Hadoop YARN Issue Type: New Feature Components: build Affects Versions: 2.7.0 Environment: Solaris x86, Solaris sparc Reporter: Alan Burlison At present the YARN native components aren't fully supported on Solaris primarily due to differences between Linux and Solaris. This top-level task will be used to group together both existing and new issues related to this work. A second goal is to improve YARN performance and functionality on Solaris wherever possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS
[ https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559599#comment-14559599 ] Hudson commented on YARN-160: - SUCCESS: Integrated in Hadoop-trunk-Commit #7903 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7903/]) YARN-160. Enhanced NodeManager to automatically obtain cpu/memory values from underlying OS when configured to do so. Contributed by Varun Vasudev. (vinodkv: rev 500a1d9c76ec612b4e737888f4be79951c11591d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/NodeManagerHardwareUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java * hadoop-tools/hadoop-gridmix/src/test/java/org/apache/hadoop/mapred/gridmix/DummyResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestCgroupsLCEResourcesHandler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/LinuxResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerExecutor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestNodeManagerHardwareUtils.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestLinuxResourceCalculatorPlugin.java nodemanagers should obtain cpu/memory values from underlying OS --- Key: YARN-160 URL: https://issues.apache.org/jira/browse/YARN-160 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.3-alpha Reporter: Alejandro Abdelnur Assignee: Varun Vasudev Labels: BB2015-05-TBR Fix For: 2.8.0 Attachments: YARN-160.005.patch, YARN-160.006.patch, YARN-160.007.patch, YARN-160.008.patch, apache-yarn-160.0.patch, apache-yarn-160.1.patch, apache-yarn-160.2.patch, apache-yarn-160.3.patch As mentioned in YARN-2 *NM memory and CPU configs* Currently these values are coming from the config of the NM, we should be able to obtain those values from the OS (ie, in the case of Linux from /proc/meminfo /proc/cpuinfo). 
As this is highly OS dependent we should have an interface that obtains this information. In addition implementations of this interface should be able to specify a mem/cpu offset (amount of mem/cpu not to be avail as YARN resource), this would allow to reserve mem/cpu for the OS and other services outside of YARN containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
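As a rough illustration of what obtaining cpu/memory values from the underlying OS means in practice, the sketch below derives NM capacity from ResourceCalculatorPlugin and subtracts a reserved offset for the OS and other daemons; the configuration keys and defaults are invented for this example, and the committed NodeManagerHardwareUtils logic differs in detail.
{code}
// Illustrative sketch only; not the committed NodeManagerHardwareUtils code.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.util.ResourceCalculatorPlugin;

final class NodeResourcesSketch {
  static int containersMemoryMb(ResourceCalculatorPlugin plugin, Configuration conf) {
    long physicalMb = plugin.getPhysicalMemorySize() / (1024 * 1024);
    // Reserve some memory for the OS and services outside YARN containers.
    long reservedMb = conf.getLong("example.nm.reserved-memory-mb", 2048);
    return (int) Math.max(physicalMb - reservedMb, 0);
  }

  static int containersVcores(ResourceCalculatorPlugin plugin, Configuration conf) {
    int logicalCores = plugin.getNumProcessors();
    // Optionally leave a core for the node's own daemons.
    int reservedCores = conf.getInt("example.nm.reserved-vcores", 1);
    return Math.max(logicalCores - reservedCores, 1);
  }
}
{code}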
[jira] [Commented] (YARN-3715) Oozie jobs are failed with IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') on
[ https://issues.apache.org/jira/browse/YARN-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559479#comment-14559479 ] Sergey Svinarchuk commented on YARN-3715: - Yes, it was configuration issue. Thanks Oozie jobs are failed with IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') on secure cluster with RM HA -- Key: YARN-3715 URL: https://issues.apache.org/jira/browse/YARN-3715 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Sergey Svinarchuk 2015-05-21 16:06:55,887 WARN ActionStartXCommand:544 - SERVER[centos6.localdomain] USER[mapr] GROUP[-] TOKEN[] APP[Hive] JOB[001-150521123655733-oozie-mapr-W] ACTION[001-150521123655733-oozie-mapr-W@Hive] Error starting action [Hive]. ErrorType [ERROR], ErrorCode [IllegalArgumentException], Message [IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address')] org.apache.oozie.action.ActionExecutorException: IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:401) at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:979) at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1134) at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228) at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63) at org.apache.oozie.command.XCommand.call(XCommand.java:281) at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:323) at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:252) at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:211) at org.apache.hadoop.conf.Configuration.getSocketAddr(Configuration.java:1788) at org.apache.hadoop.mapred.Master.getMasterAddress(Master.java:58) at org.apache.hadoop.mapred.Master.getMasterPrincipal(Master.java:67) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:114) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80) at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:127) at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:460) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548) at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:964) ... 10 more 2015-05-21 16:06:55,889 WARN ActionStartXCommand:544 - SERVER[centos6.localdomain] USER[mapr] GROUP[-] TOKEN[] APP[Hive] JOB[001-150521123655733-oozie-mapr-W] ACTION[001-150521123655733-oozie-mapr-W@Hive]
[jira] [Commented] (YARN-1012) NM should report resource utilization of running containers to RM in heartbeat
[ https://issues.apache.org/jira/browse/YARN-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559496#comment-14559496 ] Hadoop QA commented on YARN-1012: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 53s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 35s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 5s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 48s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 22s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 58s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 6m 17s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 49m 10s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735357/YARN-1012-7.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 022f49d | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8088/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8088/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8088/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8088/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8088/console | This message was automatically generated. NM should report resource utilization of running containers to RM in heartbeat -- Key: YARN-1012 URL: https://issues.apache.org/jira/browse/YARN-1012 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Reporter: Arun C Murthy Assignee: Inigo Goiri Attachments: YARN-1012-1.patch, YARN-1012-2.patch, YARN-1012-3.patch, YARN-1012-4.patch, YARN-1012-5.patch, YARN-1012-6.patch, YARN-1012-7.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3718) hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable
[ https://issues.apache.org/jira/browse/YARN-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Burlison updated YARN-3718: Issue Type: Sub-task (was: Bug) Parent: YARN-3719 hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable - Key: YARN-3718 URL: https://issues.apache.org/jira/browse/YARN-3718 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Environment: BSD OSX Solaris Windows Linux Reporter: Alan Burlison hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c makes use of the Linux-only Cgroups feature (http://en.wikipedia.org/wiki/Cgroups) when Hadoop is built on Linux, but there is no corresponding functionality for non-Linux platforms. Other platforms provide similar functionality, e.g. Solaris has an extensive range of resource management features (http://docs.oracle.com/cd/E23824_01/html/821-1460/index.html). Work is needed to abstract the resource management features of Yarn so that the same facilities for resource management can be provided on all platforms that provide the requisite functionality, -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS
[ https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559579#comment-14559579 ] Vinod Kumar Vavilapalli commented on YARN-160: -- Tx for the explanation, Varun. The new logic definitely makes sense to me. The patch looks good. Checking this in. nodemanagers should obtain cpu/memory values from underlying OS --- Key: YARN-160 URL: https://issues.apache.org/jira/browse/YARN-160 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.3-alpha Reporter: Alejandro Abdelnur Assignee: Varun Vasudev Labels: BB2015-05-TBR Attachments: YARN-160.005.patch, YARN-160.006.patch, YARN-160.007.patch, YARN-160.008.patch, apache-yarn-160.0.patch, apache-yarn-160.1.patch, apache-yarn-160.2.patch, apache-yarn-160.3.patch As mentioned in YARN-2 *NM memory and CPU configs* Currently these values are coming from the config of the NM, we should be able to obtain those values from the OS (ie, in the case of Linux from /proc/meminfo /proc/cpuinfo). As this is highly OS dependent we should have an interface that obtains this information. In addition implementations of this interface should be able to specify a mem/cpu offset (amount of mem/cpu not to be avail as YARN resource), this would allow to reserve mem/cpu for the OS and other services outside of YARN containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3718) hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable
[ https://issues.apache.org/jira/browse/YARN-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559590#comment-14559590 ] Karthik Kambatla commented on YARN-3718: Never mind. I see this is a subtask of YARN-3719. hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable - Key: YARN-3718 URL: https://issues.apache.org/jira/browse/YARN-3718 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Environment: BSD OSX Solaris Windows Linux Reporter: Alan Burlison hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c makes use of the Linux-only Cgroups feature (http://en.wikipedia.org/wiki/Cgroups) when Hadoop is built on Linux, but there is no corresponding functionality for non-Linux platforms. Other platforms provide similar functionality, e.g. Solaris has an extensive range of resource management features (http://docs.oracle.com/cd/E23824_01/html/821-1460/index.html). Work is needed to abstract the resource management features of Yarn so that the same facilities for resource management can be provided on all platforms that provide the requisite functionality, -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559345#comment-14559345 ] Vinod Kumar Vavilapalli commented on YARN-3712: --- What is the effect of today's way of doing it synchronously? Interesting you mention time taking for cleaning docker containers. /cc [~ashahab], [~sidharta-s] who are looking into that area. ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously - Key: YARN-3712 URL: https://issues.apache.org/jira/browse/YARN-3712 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3712.01.patch, YARN-3712.02.patch It will save some time by handling event CLEANUP_CONTAINER asynchronously. This improvement will be useful for cases that cleaning up container cost a little long time(e.g. for our case: we are running Docker container on NM, it will take above 1 seconds to clean up one docker container. ) and many containers to clean up(e.g. NM need clean up all running containers when NM shutdown). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
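A minimal sketch of the asynchronous cleanup being discussed, assuming a plain thread pool takes over the CLEANUP_CONTAINER work so the launcher's event thread is not blocked by slow (e.g. Docker) teardown; names and pool size are illustrative and this is not the attached patch.
{code}
// Illustrative sketch; not the attached YARN-3712 patch.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class AsyncCleanupSketch {
  private final ExecutorService cleanupPool = Executors.newFixedThreadPool(4);

  void onCleanupContainerEvent(Runnable cleanupTask) {
    // Returns immediately; the event loop can process the next event while the
    // container (and, say, its Docker resources) is cleaned up in the background.
    cleanupPool.execute(cleanupTask);
  }

  void shutdown() throws InterruptedException {
    cleanupPool.shutdown();
    // Give in-flight cleanups a bounded amount of time during NM shutdown.
    cleanupPool.awaitTermination(30, TimeUnit.SECONDS);
  }
}
{code}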
[jira] [Commented] (YARN-1012) NM should report resource utilization of running containers to RM in heartbeat
[ https://issues.apache.org/jira/browse/YARN-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559351#comment-14559351 ] Inigo Goiri commented on YARN-1012: --- I checked the issue with testContainerStatusPBImpl and I cannot figure out what's wrong there. Am I missing any method in ResourceUtilization? I also updated the interfaces and made them Unstable and Private (which I think matches our scope). Regarding the unit test, how would you check? Would you check against context.getContainers()? This relates to your original question of where we should store this information (ContainerMetrics or ContainerStatus). NM should report resource utilization of running containers to RM in heartbeat -- Key: YARN-1012 URL: https://issues.apache.org/jira/browse/YARN-1012 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Reporter: Arun C Murthy Assignee: Inigo Goiri Attachments: YARN-1012-1.patch, YARN-1012-2.patch, YARN-1012-3.patch, YARN-1012-4.patch, YARN-1012-5.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
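For readers following the ResourceUtilization discussion above, a bare-bones value object carrying the heartbeat utilization fields could look like this; the real record added by the patch is a PB-backed YARN API class, and its exact fields are not reproduced here.
{code}
// Bare-bones illustration of the shape of the data; field choices are assumptions.
final class ResourceUtilizationSketch {
  private final int physicalMemoryMb;  // pmem actually used by the node's containers
  private final int virtualMemoryMb;   // vmem actually used by the node's containers
  private final float cpuFraction;     // CPU used, as a fraction of the node's capacity

  ResourceUtilizationSketch(int physicalMemoryMb, int virtualMemoryMb, float cpuFraction) {
    this.physicalMemoryMb = physicalMemoryMb;
    this.virtualMemoryMb = virtualMemoryMb;
    this.cpuFraction = cpuFraction;
  }

  int getPhysicalMemoryMb() { return physicalMemoryMb; }
  int getVirtualMemoryMb() { return virtualMemoryMb; }
  float getCpuFraction() { return cpuFraction; }
}
{code}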
[jira] [Commented] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559366#comment-14559366 ] Abin Shahab commented on YARN-3712: --- You can try changing the file system to aufs or overlayfs from the default devmapper. ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously - Key: YARN-3712 URL: https://issues.apache.org/jira/browse/YARN-3712 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3712.01.patch, YARN-3712.02.patch It will save some time by handling event CLEANUP_CONTAINER asynchronously. This improvement will be useful for cases that cleaning up container cost a little long time(e.g. for our case: we are running Docker container on NM, it will take above 1 seconds to clean up one docker container. ) and many containers to clean up(e.g. NM need clean up all running containers when NM shutdown). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3518) default rm/am expire interval should not be less than default resourcemanager connect wait time
[ https://issues.apache.org/jira/browse/YARN-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sandflee updated YARN-3518: --- Attachment: YARN-3518.003.patch default rm/am expire interval should not be less than default resourcemanager connect wait time Key: YARN-3518 URL: https://issues.apache.org/jira/browse/YARN-3518 Project: Hadoop YARN Issue Type: Bug Components: nodemanager, resourcemanager Reporter: sandflee Assignee: sandflee Labels: BB2015-05-TBR, configuration, newbie Attachments: YARN-3518.001.patch, YARN-3518.002.patch, YARN-3518.003.patch Take the AM for example: if the AM can't connect to the RM, then after the AM expiry interval (600s) the RM relaunches the AM, and there will be two AMs at the same time until the resourcemanager connect max wait time (900s) has passed. DEFAULT_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS = 15 * 60 * 1000; DEFAULT_RM_AM_EXPIRY_INTERVAL_MS = 600000; DEFAULT_RM_NM_EXPIRY_INTERVAL_MS = 600000; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
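To make the mismatch concrete, a hedged, standalone sanity check (not part of any attached patch) that compares the RM connect max-wait against the AM expiry interval using the existing YarnConfiguration keys. If the expiry interval is the shorter of the two, a second AM can be launched while the first is still retrying its RM connection.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Illustrative check only: warn when the AM expiry interval is shorter than
// the time a client keeps retrying the RM connection.
public class ExpiryIntervalCheck {
  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();
    long connectMaxWait = conf.getLong(
        YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
        YarnConfiguration.DEFAULT_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS);
    long amExpiry = conf.getLong(
        YarnConfiguration.RM_AM_EXPIRY_INTERVAL_MS,
        YarnConfiguration.DEFAULT_RM_AM_EXPIRY_INTERVAL_MS);
    if (amExpiry < connectMaxWait) {
      System.err.println("AM expiry interval (" + amExpiry
          + " ms) is less than RM connect max wait (" + connectMaxWait
          + " ms); a second AM may run while the first is still retrying.");
    }
  }
}
{code}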
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559260#comment-14559260 ] Hudson commented on YARN-2336: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2155 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2155/]) YARN-2336. Fair scheduler's REST API returns a missing '[' bracket JSON for deep queue tree. Contributed by Kenji Kikushima and Akira Ajisaka. (ozawa: rev 9a3d617b6325d8918f2833c3e9ce329ecada9242) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfoList.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesFairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/JAXBContextResolver.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Fix For: 2.8.0 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub-queues in the Fair Scheduler, the REST API returns JSON with a missing '[' bracket for childQueues. This issue was found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
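A rough sketch of the wrapper-list idea suggested by the new FairSchedulerQueueInfoList DAO in the commit: exposing child queues through a dedicated list type gives the JSON marshaller an explicitly collection-valued element, so childQueues is emitted as a proper array even when there is only one child. The class and field names below are placeholders, not the committed code.
{code:java}
import java.util.ArrayList;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

// Illustrative wrapper-list sketch; the real DAO holds FairSchedulerQueueInfo
// objects rather than strings.
@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
public class QueueInfoListSketch {
  private ArrayList<String> queue = new ArrayList<String>();

  public ArrayList<String> getQueueInfoList() {
    return queue;
  }

  public void add(String q) {
    queue.add(q);
  }
}
{code}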
[jira] [Commented] (YARN-2238) filtering on UI sticks even if I move away from the page
[ https://issues.apache.org/jira/browse/YARN-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559262#comment-14559262 ] Hudson commented on YARN-2238: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2155 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2155/]) YARN-2238. Filtering on UI sticks even if I move away from the page. (xgong: rev 39077dba2e877420e7470df253f6154f6ecc64ec) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/JQueryUI.java filtering on UI sticks even if I move away from the page Key: YARN-2238 URL: https://issues.apache.org/jira/browse/YARN-2238 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Jian He Labels: usability Fix For: 2.7.1 Attachments: YARN-2238.patch, YARN-2238.png, filtered.png The main data table in many web pages (RM, AM, etc.) seems to show an unexpected filtering behavior. If I filter the table by typing something in the key or value field (or I suspect any search field), the data table gets filtered. The example I used is the job configuration page for a MR job. That is expected. However, when I move away from that page and visit any other web page of the same type (e.g. a job configuration page), the page is rendered with the filtering! That is unexpected. What's even stranger is that it does not render the filtering term. As a result, I have a page that's mysteriously filtered but doesn't tell me what it's filtering on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1012) NM should report resource utilization of running containers to RM in heartbeat
[ https://issues.apache.org/jira/browse/YARN-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Inigo Goiri updated YARN-1012: -- Attachment: YARN-1012-6.patch Changed annotations for ResourceUtilization. NM should report resource utilization of running containers to RM in heartbeat -- Key: YARN-1012 URL: https://issues.apache.org/jira/browse/YARN-1012 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Reporter: Arun C Murthy Assignee: Inigo Goiri Attachments: YARN-1012-1.patch, YARN-1012-2.patch, YARN-1012-3.patch, YARN-1012-4.patch, YARN-1012-5.patch, YARN-1012-6.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559362#comment-14559362 ] Sidharta Seethana commented on YARN-3712: - [~hex108] Are you referring to cleaning up the Docker image or the container instance itself? Which of these takes 1 second? If I remember correctly, the docker container executor uses a docker run option that automatically cleans up the container once it exits, so that cleanup becomes part of the container lifetime as far as the node manager is concerned. thanks, -Sidharta ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously - Key: YARN-3712 URL: https://issues.apache.org/jira/browse/YARN-3712 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3712.01.patch, YARN-3712.02.patch Handling the CLEANUP_CONTAINER event asynchronously will save some time. This improvement is useful when cleaning up a container takes a fairly long time (e.g. in our case we run Docker containers on the NM, and it takes more than 1 second to clean up one Docker container) and when there are many containers to clean up (e.g. the NM needs to clean up all running containers when it shuts down). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3685) NodeManager unnecessarily knows about classpath-jars due to Windows limitations
[ https://issues.apache.org/jira/browse/YARN-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559309#comment-14559309 ] Vinod Kumar Vavilapalli commented on YARN-3685: --- bq. Perhaps it's possible to move the classpath jar generation to the MR client or AM. It's not immediately obvious to me which of those 2 choices is better. For the AM container, the client is the right place. For the rest of the tasks, the AM is. bq. We'd need to change the manifest to use relative paths in the Class-Path attribute instead of absolute paths. (The client and AM are not aware of the exact layout of the NodeManager's yarn.nodemanager.local-dirs, so the client can't predict the absolute paths at time of container launch.) I think this was one of the chief issues in the original patches - we need to investigate whether the manifest file can have relative paths or not. Otherwise, it's ugly but we can still get YARN to replace some sort of markers only in specific files like the manifest. bq. Some classpath entries are defined in terms of environment variables. These environment variables are expanded at the NodeManager via the container launch scripts. This was true of Linux even before YARN-316, so in that sense, YARN did already have some classpath logic indirectly. Which ones are these? bq. If we do move classpath handling out of the NodeManager, then it would be a backwards-incompatible change, and so it could not be shipped in the 2.x release line. It's not clear whether this is true. We'd have to see the final solution/patch to realistically reason about this. NodeManager unnecessarily knows about classpath-jars due to Windows limitations --- Key: YARN-3685 URL: https://issues.apache.org/jira/browse/YARN-3685 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Found this while looking at cleaning up ContainerExecutor via YARN-3648, making it a sub-task. YARN *should not* know about classpaths. Our original design was modeled around this. But when we added Windows support, due to classpath issues, we ended up breaking this abstraction via YARN-316. We should clean this up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
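On the open question of relative paths in the manifest, a small hypothetical sketch (not from any patch) of writing a classpath jar whose Class-Path attribute uses relative entries. The jar specification resolves Class-Path entries relative to the jar's own location, so relative entries are allowed in principle; whether that works with YARN's localization layout is exactly what would need to be verified.
{code:java}
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Illustrative only: a classpath jar that exists solely to carry a manifest
// with relative Class-Path entries.
public class ClasspathJarSketch {
  public static void main(String[] args) throws IOException {
    Manifest manifest = new Manifest();
    Attributes attrs = manifest.getMainAttributes();
    attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
    // Hypothetical relative layout under the container's working directory.
    attrs.put(Attributes.Name.CLASS_PATH, "lib/dep1.jar lib/dep2.jar conf/");
    try (JarOutputStream jos =
             new JarOutputStream(new FileOutputStream("classpath.jar"), manifest)) {
      // No entries needed; the jar only carries the manifest.
    }
  }
}
{code}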
[jira] [Commented] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS
[ https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559338#comment-14559338 ] Hadoop QA commented on YARN-160: \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 40s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 5 new or modified test files. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 6s | The applied patch generated 1 new checkstyle issues (total was 214, now 215). | | {color:green}+1{color} | whitespace | 0m 28s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | tools/hadoop tests | 14m 39s | Tests passed in hadoop-gridmix. | | {color:green}+1{color} | yarn tests | 0m 24s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 58s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 6m 8s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 65m 20s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735336/YARN-160.008.patch | | Optional Tests | javac unit findbugs checkstyle javadoc | | git revision | trunk / 022f49d | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8085/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-gridmix test log | https://builds.apache.org/job/PreCommit-YARN-Build/8085/artifact/patchprocess/testrun_hadoop-gridmix.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8085/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8085/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8085/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8085/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8085/console | This message was automatically generated. 
nodemanagers should obtain cpu/memory values from underlying OS --- Key: YARN-160 URL: https://issues.apache.org/jira/browse/YARN-160 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.3-alpha Reporter: Alejandro Abdelnur Assignee: Varun Vasudev Labels: BB2015-05-TBR Attachments: YARN-160.005.patch, YARN-160.006.patch, YARN-160.007.patch, YARN-160.008.patch, apache-yarn-160.0.patch, apache-yarn-160.1.patch, apache-yarn-160.2.patch, apache-yarn-160.3.patch As mentioned in YARN-2 *NM memory and CPU configs*: currently these values come from the NM's config, but we should be able to obtain them from the OS (i.e., in the case of Linux, from /proc/meminfo and /proc/cpuinfo). As this is highly OS dependent, we should have an interface that obtains this information. In addition, implementations of this interface should be able to specify a mem/cpu offset (the amount of mem/cpu not to be made available as a YARN resource); this would allow reserving mem/cpu for the OS and other services outside of YARN containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
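A hypothetical sketch of the kind of interface the description asks for: OS-specific detection of totals plus a configurable offset reserved for the OS and other daemons. The names and the offset handling below are assumptions for illustration, not the API from the attached patches.
{code:java}
// Hypothetical interface sketch; an OS-specific implementation would read
// totals from the platform (e.g. /proc/meminfo and /proc/cpuinfo on Linux).
public interface NodeResourceDetectorSketch {

  /** Total physical memory reported by the OS, in MB. */
  long getTotalMemoryMB();

  /** Number of processors reported by the OS. */
  int getNumProcessors();

  /** Memory to keep out of YARN's hands, in MB. */
  long getMemoryOffsetMB();

  /** Processors to keep out of YARN's hands. */
  int getProcessorOffset();

  /** Memory the NM should advertise to the RM. */
  default long getAvailableMemoryMB() {
    return Math.max(0, getTotalMemoryMB() - getMemoryOffsetMB());
  }

  /** Vcores the NM should advertise to the RM. */
  default int getAvailableVcores() {
    return Math.max(0, getNumProcessors() - getProcessorOffset());
  }
}
{code}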
[jira] [Commented] (YARN-3652) A SchedulerMetrics may be needed for evaluating the scheduler's performance
[ https://issues.apache.org/jira/browse/YARN-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559347#comment-14559347 ] Vinod Kumar Vavilapalli commented on YARN-3652: --- I haven't looked at the original SchedulerMetrics patches, but pointed them out as they seemed relevant. [~vvasudev], can you please comment on this? A SchedulerMetrics may be needed for evaluating the scheduler's performance - Key: YARN-3652 URL: https://issues.apache.org/jira/browse/YARN-3652 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager, scheduler Reporter: Xianyin Xin As discussed in YARN-3630, a {{SchedulerMetrics}} may be needed for evaluating the scheduler's performance. The performance indexes include the number of events waiting to be handled by the scheduler, the throughput, the scheduling delay, and/or other indicators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
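To make the proposal concrete, a hypothetical sketch using Hadoop's metrics2 library of the indicators the description lists (pending scheduler events, per-event handling time, scheduling delay). All class, metric, and method names here are invented for illustration and are not from any attached patch.
{code:java}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableGaugeInt;
import org.apache.hadoop.metrics2.lib.MutableRate;

// Illustrative metrics source only.
@Metrics(context = "yarn")
public class SchedulerMetricsSketch {

  @Metric("Events waiting in the scheduler dispatcher queue")
  MutableGaugeInt pendingSchedulerEvents;

  @Metric("Time taken to handle one scheduler event (ms)")
  MutableRate eventHandlingTime;

  @Metric("Delay between a request arriving and a container being allocated (ms)")
  MutableRate schedulingDelay;

  public static SchedulerMetricsSketch create() {
    return DefaultMetricsSystem.instance().register(
        "SchedulerMetricsSketch", "Scheduler performance metrics",
        new SchedulerMetricsSketch());
  }

  public void setPendingEvents(int n) { pendingSchedulerEvents.set(n); }
  public void recordEventHandlingTime(long ms) { eventHandlingTime.add(ms); }
  public void recordSchedulingDelay(long ms) { schedulingDelay.add(ms); }
}
{code}
Throughput could then be derived from the event-handling rate, while the gauge exposes backlog directly, which covers the indicators mentioned in the description.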