[jira] [Updated] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect
[ https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-3601: -- Attachment: YARN-3601.001.patch This patch fixes UT TestRMFailover.testRMWebAppRedirect; it covers all the tests for RM web app redirection. Fix UT TestRMFailover.testRMWebAppRedirect -- Key: YARN-3601 URL: https://issues.apache.org/jira/browse/YARN-3601 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, webapp Environment: Red Hat Enterprise Linux Workstation release 6.5 (Santiago) Reporter: Weiwei Yang Assignee: Weiwei Yang Priority: Critical Labels: test Attachments: YARN-3601.001.patch This test case has not been working since the commit from YARN-2605; it fails with a NullPointerException (NPE). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3654) ContainerLogsPage web UI should not have meta-refresh
[ https://issues.apache.org/jira/browse/YARN-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-3654: Attachment: YARN-3654.2.patch ContainerLogsPage web UI should not have meta-refresh - Key: YARN-3654 URL: https://issues.apache.org/jira/browse/YARN-3654 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.7.1 Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-3654.1.patch, YARN-3654.2.patch Currently, when we try to find the container logs for a finished application, the page redirects to the URL configured as yarn.log.server.url in yarn-site.xml. But in ContainerLogsPage, we are using meta-refresh: {code} set(TITLE, join("Redirecting to log server for ", $(CONTAINER_ID))); html.meta_http("refresh", "1; url=" + redirectUrl); {code} which does not work well in browsers that require meta-refresh to be enabled in their security settings, especially IE, where meta-refresh is considered a security hole. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
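For illustration, one alternative to the meta-refresh is to issue the redirect server-side as an HTTP 302, so the browser's meta-refresh security setting never comes into play. A minimal servlet sketch follows; the class and field names are hypothetical, and this is not necessarily the approach taken in the attached patches.
{code}
import java.io.IOException;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical sketch: send an HTTP 302 to the log server instead of emitting
// a meta-refresh tag in the page.
public class LogServerRedirectServlet extends HttpServlet {

  // Assumed to be resolved elsewhere from yarn.log.server.url plus the container id.
  private final String redirectUrl;

  public LogServerRedirectServlet(String redirectUrl) {
    this.redirectUrl = redirectUrl;
  }

  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws IOException {
    // 302 Found with the Location header set; no client-side refresh is required.
    resp.sendRedirect(resp.encodeRedirectURL(redirectUrl));
  }
}
{code}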
[jira] [Updated] (YARN-3560) Not able to navigate to the cluster from tracking url (proxy) generated after submission of job
[ https://issues.apache.org/jira/browse/YARN-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anushri updated YARN-3560: -- Assignee: Anushri Not able to navigate to the cluster from tracking url (proxy) generated after submission of job --- Key: YARN-3560 URL: https://issues.apache.org/jira/browse/YARN-3560 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Anushri Assignee: Anushri Priority: Minor Attachments: YARN-3560.patch A standalone web proxy server is enabled in the cluster. When a job is submitted, the generated tracking URL goes through the proxy. If we open this URL in the browser and try to navigate to the cluster links [about, applications, or scheduler], we get redirected to some default port instead of the RM web port actually configured, and the browser reports "webpage not available". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues
[ https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549949#comment-14549949 ] Arun Suresh commented on YARN-3633: --- Thanks for the patch [~ragarwal]. Assuming we allow, as per the patch, the first AM to be scheduled, then, as per the example you specified in the description, the AM will take up 3GB in a 5GB queue... presuming each worker task requires more resources than the AM (I am guessing this should be true for most cases), then no other task can be scheduled on that queue, and the remaining queues are anyway log-jammed since the maxAMShare logic would kick in. Wondering if it's a valid scenario.. With Fair Scheduler, cluster can logjam when there are too many queues -- Key: YARN-3633 URL: https://issues.apache.org/jira/browse/YARN-3633 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Rohit Agarwal Assignee: Rohit Agarwal Priority: Critical Attachments: YARN-3633.patch It's possible to logjam a cluster by submitting many applications at once in different queues. For example, let's say there is a cluster with 20GB of total memory. Let's say 4 users submit applications at the same time. The fair share of each queue is 5GB. Let's say that maxAMShare is 0.5. So, each queue has at most 2.5GB memory for AMs. If all the users requested AMs of size 3GB - the cluster logjams. Nothing gets scheduled even when 20GB of resources are available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
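To make the arithmetic concrete: with a 20GB cluster split into four queues, each queue's fair share is 5GB, and maxAMShare 0.5 caps AM usage at 2.5GB per queue, so a 3GB AM can never start anywhere. A hedged sketch of the kind of check under discussion follows; the identifiers are illustrative, not the exact FSLeafQueue code or the submitted patch.
{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

// Illustrative sketch: admit an AM if it fits under fairShare * maxAMShare, or --
// the relaxation discussed in this JIRA -- if the queue has no running AMs yet,
// so a queue can never be starved of its very first AM.
final class AmShareCheck {
  static boolean canRunAppAM(Resource fairShare, double maxAMShare,
      Resource amResourceUsage, Resource amResource, int runningAMs) {
    Resource maxAMResource = Resources.multiply(fairShare, maxAMShare);
    Resource ifStarted = Resources.add(amResourceUsage, amResource);
    return Resources.fitsIn(ifStarted, maxAMResource) || runningAMs == 0;
  }
}
{code}
With the numbers from the description, the first 3GB AM in each 5GB queue would then be admitted even though 3GB exceeds the 2.5GB AM share.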
[jira] [Commented] (YARN-3654) ContainerLogsPage web UI should not have meta-refresh
[ https://issues.apache.org/jira/browse/YARN-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549967#comment-14549967 ] Hadoop QA commented on YARN-3654: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 41s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 36s | The applied patch generated 2 new checkstyle issues (total was 12, now 13). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 4s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 6s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 42m 14s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12733729/YARN-3654.2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 0790275 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7991/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7991/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7991/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7991/console | This message was automatically generated. ContainerLogsPage web UI should not have meta-refresh - Key: YARN-3654 URL: https://issues.apache.org/jira/browse/YARN-3654 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.7.1 Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-3654.1.patch, YARN-3654.2.patch Currently, When we try to find the container logs for the finished application, it will re-direct to the url which we re-configured for yarn.log.server.url in yarn-site.xml. But in ContainerLogsPage, we are using meta-refresh: {code} set(TITLE, join(Redirecting to log server for , $(CONTAINER_ID))); html.meta_http(refresh, 1; url= + redirectUrl); {code} which is not good for some browsers which need to enable the meta-refresh in their security setting, especially for IE which meta-refresh is considered a security hole. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect
[ https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549983#comment-14549983 ] Hadoop QA commented on YARN-3601: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 5m 10s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 27s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 19s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 33s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 42s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 50s | Tests passed in hadoop-yarn-client. | | | | 23m 9s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12733727/YARN-3601.001.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / 93972a3 | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/7992/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7992/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7992/console | This message was automatically generated. Fix UT TestRMFailover.testRMWebAppRedirect -- Key: YARN-3601 URL: https://issues.apache.org/jira/browse/YARN-3601 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, webapp Environment: Red Hat Enterprise Linux Workstation release 6.5 (Santiago) Reporter: Weiwei Yang Assignee: Weiwei Yang Priority: Critical Labels: test Attachments: YARN-3601.001.patch This test case was not working since the commit from YARN-2605. It failed with NPE exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3126) FairScheduler: queue's usedResource is always more than the maxResource limit
[ https://issues.apache.org/jira/browse/YARN-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xia Hu updated YARN-3126: - Attachment: resourcelimit-test.patch Added a unit test for this patch. FairScheduler: queue's usedResource is always more than the maxResource limit - Key: YARN-3126 URL: https://issues.apache.org/jira/browse/YARN-3126 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.3.0 Environment: hadoop2.3.0. fair scheduler. spark 1.1.0. Reporter: Xia Hu Labels: BB2015-05-TBR, assignContainer, fairscheduler, resources Fix For: trunk-win Attachments: resourcelimit-02.patch, resourcelimit-test.patch, resourcelimit.patch When submitting a Spark application (in both the spark-on-yarn-cluster and spark-on-yarn-client modes), the queue's usedResources assigned by the FairScheduler can always exceed the queue's maxResources limit. From reading the FairScheduler code, I suppose this happens because the requested resources are not checked when assigning a container. Here is the detail: 1. Choose a queue. In this step, assignContainerPreCheck checks whether the queue's usedResource is already bigger than its max. 2. Then choose an app in that queue. 3. Then choose a container. And here is the problem: there is no check whether this container would push the queue's resources over its max limit. If a queue's usedResource is 13G and the maxResource limit is 16G, a container asking for 4G may still be assigned successfully. This problem shows up easily with Spark applications, because we can ask for different container resources in different applications. By the way, I have already applied the patch from YARN-2083. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
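A hedged sketch of the missing check described above (the names are illustrative, not the exact FairScheduler code): before assigning, verify that the queue's current usage plus the new container still fits under the queue's maximum.
{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

// Illustrative sketch: with usedResource = 13G and maxResource = 16G, a 4G request
// fails this check (13G + 4G does not fit in 16G) -- exactly the assignment the
// description says is currently allowed through.
final class QueueMaxCheck {
  static boolean fitsUnderMaxShare(Resource queueUsage, Resource request,
      Resource queueMaxShare) {
    return Resources.fitsIn(Resources.add(queueUsage, request), queueMaxShare);
  }
}
{code}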
[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage
[ https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549848#comment-14549848 ] Zhijie Shen commented on YARN-3411: --- [~vrushalic], thanks for working on the patch. Some comments from my side: 1. I saw that in the HBase implementation the flow version is not included as part of the row key. This is a bit different from the primary key design of the Phoenix implementation. Would you mind elaborating your rationale a bit? 2. Shall we make the constants in TimelineEntitySchemaConstants follow the Hadoop convention? We can keep them in this class for now. Once we decide to move on with the HBase impl, we should move (some of) them into YarnConfiguration as API. 3. I saw the classes are marked \@Public, but they're backend classes and not accessible by the user directly. In fact, you can leave these classes unannotated. 4. According to TimelineSchemaCreator, we need to run a command-line tool to create the table when we set up the backend, right? Can we include creating the table in the lifecycle of HBaseTimelineWriterImpl? [Storage implementation] explore the native HBase write schema for storage -- Key: YARN-3411 URL: https://issues.apache.org/jira/browse/YARN-3411 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Vrushali C Priority: Critical Attachments: ATSv2BackendHBaseSchemaproposal.pdf, YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, YARN-3411.poc.7.txt, YARN-3411.poc.txt There is work in progress to implement the storage based on a Phoenix schema (YARN-3134). In parallel, we would like to explore an implementation based on a native HBase schema for the write path. Such a schema does not exclude using Phoenix, especially for reads and offline queries. Once we have basic implementations of both options, we could evaluate them in terms of performance, scalability, usability, etc. and make a call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
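On point 4, a hedged sketch of what creating the table from the writer's own lifecycle could look like; the table name and column family below are assumptions for illustration, not the schema from the attached patches.
{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

// Illustrative sketch: create the entity table lazily from the writer itself
// (e.g. during serviceInit) instead of requiring a separate TimelineSchemaCreator run.
final class TimelineTableBootstrap {
  static void ensureEntityTable(Configuration conf) throws IOException {
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      TableName entityTable = TableName.valueOf("timelineservice.entity"); // assumed name
      if (!admin.tableExists(entityTable)) {
        HTableDescriptor desc = new HTableDescriptor(entityTable);
        desc.addFamily(new HColumnDescriptor("i")); // assumed "info" column family
        admin.createTable(desc);
      }
    }
  }
}
{code}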
[jira] [Commented] (YARN-3674) YARN application disappears from view
[ https://issues.apache.org/jira/browse/YARN-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549858#comment-14549858 ] Siddharth Seth commented on YARN-3674: -- Clicking on a specific queue on the scheduler page, followed by a click on the 'Applications' / 'RUNNING' / etc links - ends up on a page which shows no indication that a queue has been selected. It ends up looking like the cluster isn't RUNNING anything, or hasn't run anything, if the queue isn't used. For [~sershe] - this was worse. Going back and selecting the default queue made no difference to the apps listing. YARN application disappears from view - Key: YARN-3674 URL: https://issues.apache.org/jira/browse/YARN-3674 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.8.0 Reporter: Sergey Shelukhin I have 2 tabs open at the exact same URL with the RUNNING applications view. There is an application that is, in fact, running, that is visible in one tab but not the other. This persists across refreshes. If I open a new tab from the tab where the application is not visible, in that tab it shows up ok. I didn't change scheduler/queue settings before this behavior happened; on [~sseth]'s advice I went and tried to click the root node of the scheduler on the scheduler page; the app still does not become visible. Something got stuck somewhere... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3654) ContainerLogsPage web UI should not have meta-refresh
[ https://issues.apache.org/jira/browse/YARN-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549875#comment-14549875 ] Xuan Gong commented on YARN-3654: - Fixed the -1 on findbugs. ContainerLogsPage web UI should not have meta-refresh - Key: YARN-3654 URL: https://issues.apache.org/jira/browse/YARN-3654 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.7.1 Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-3654.1.patch, YARN-3654.2.patch Currently, when we try to find the container logs for a finished application, the page redirects to the URL configured as yarn.log.server.url in yarn-site.xml. But in ContainerLogsPage, we are using meta-refresh: {code} set(TITLE, join("Redirecting to log server for ", $(CONTAINER_ID))); html.meta_http("refresh", "1; url=" + redirectUrl); {code} which does not work well in browsers that require meta-refresh to be enabled in their security settings, especially IE, where meta-refresh is considered a security hole. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3560) Not able to navigate to the cluster from tracking url (proxy) generated after submission of job
[ https://issues.apache.org/jira/browse/YARN-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anushri updated YARN-3560: -- Assignee: (was: Anushri) Not able to navigate to the cluster from tracking url (proxy) generated after submission of job --- Key: YARN-3560 URL: https://issues.apache.org/jira/browse/YARN-3560 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Anushri Priority: Minor Attachments: YARN-3560.patch A standalone web proxy server is enabled in the cluster. When a job is submitted, the generated tracking URL goes through the proxy. If we open this URL in the browser and try to navigate to the cluster links [about, applications, or scheduler], we get redirected to some default port instead of the RM web port actually configured, and the browser reports "webpage not available". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3646) Applications are getting stuck some times in case of retry policy forever
[ https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raju Bairishetti updated YARN-3646: --- Attachment: YARN-3646.patch Applications are getting stuck some times in case of retry policy forever - Key: YARN-3646 URL: https://issues.apache.org/jira/browse/YARN-3646 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Raju Bairishetti Attachments: YARN-3646.patch We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use FOREVER retry policy. Yarn client is infinitely retrying in case of exceptions from the RM as it is using retrying policy as FOREVER. The problem is it is retrying for all kinds of exceptions (like ApplicationNotFoundException), even though it is not a connection failure. Due to this my application is not progressing further. *Yarn client should not retry infinitely in case of non connection failures.* We have written a simple yarn-client which is trying to get an application report for an invalid or older appId. ResourceManager is throwing an ApplicationNotFoundException as this is an invalid or older appId. But because of retry policy FOREVER, client is keep on retrying for getting the application report and ResourceManager is throwing ApplicationNotFoundException continuously. {code} private void testYarnClientRetryPolicy() throws Exception{ YarnConfiguration conf = new YarnConfiguration(); conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1); YarnClient yarnClient = YarnClient.createYarnClient(); yarnClient.init(conf); yarnClient.start(); ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645); ApplicationReport report = yarnClient.getApplicationReport(appId); } {code} *RM logs:* {noformat} 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 10.14.120.231:61621 Call#875162 Retry#0 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1430126768987_10645' doesn't exist in RM. at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 10.14.120.231:61621 Call#875163 Retry#0 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3677) Fix findbugs warnings in FileSystemRMStateStore.java
Akira AJISAKA created YARN-3677: --- Summary: Fix findbugs warnings in FileSystemRMStateStore.java Key: YARN-3677 URL: https://issues.apache.org/jira/browse/YARN-3677 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Akira AJISAKA Priority: Minor There is 1 findbugs warning in FileSystemRMStateStore.java. {noformat} Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of time Unsynchronized access at FileSystemRMStateStore.java: [line 156] Field org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS Synchronized 66% of the time Synchronized access at FileSystemRMStateStore.java: [line 148] Synchronized access at FileSystemRMStateStore.java: [line 859] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3674) YARN application disappears from view
[ https://issues.apache.org/jira/browse/YARN-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549928#comment-14549928 ] Rohith commented on YARN-3674: -- Is this dup of YARN-2238? YARN application disappears from view - Key: YARN-3674 URL: https://issues.apache.org/jira/browse/YARN-3674 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.8.0 Reporter: Sergey Shelukhin I have 2 tabs open at exact same URL with RUNNING applications view. There is an application that is, in fact, running, that is visible in one tab but not the other. This persists across refreshes. If I open new tab from the tab where the application is not visible, in that tab it shows up ok. I didn't change scheduler/queue settings before this behavior happened; on [~sseth]'s advice I went and tried to click the root node of the scheduler on scheduler page; the app still does not become visible. Something got stuck somewhere... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect
[ https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549943#comment-14549943 ] Weiwei Yang commented on YARN-3601: --- I set a flag to false so that HttpURLConnection does NOT automatically follow the redirect; this fixes the "too many redirections" problem. (In the past this was not a problem because there was a refresh time of 3 seconds, so the client was still able to retrieve the redirect URL from the HTTP header.) I am now able to retrieve the redirection URL from the Location header field, and get null if there is no redirection. The overall logic is not changed, and the test case passes now. Fix UT TestRMFailover.testRMWebAppRedirect -- Key: YARN-3601 URL: https://issues.apache.org/jira/browse/YARN-3601 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, webapp Environment: Red Hat Enterprise Linux Workstation release 6.5 (Santiago) Reporter: Weiwei Yang Assignee: Weiwei Yang Priority: Critical Labels: test Attachments: YARN-3601.001.patch This test case has not been working since the commit from YARN-2605; it fails with a NullPointerException (NPE). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
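A minimal sketch of the mechanism described in the comment above (not the patch verbatim): with automatic redirect-following disabled, the redirect target can be read straight from the Location header, and a null return means no redirection happened.
{code}
import java.net.HttpURLConnection;
import java.net.URL;

// Illustrative sketch: probe a web endpoint for its redirect target without following it.
final class RedirectProbe {
  static String getRedirectUrl(String url) throws Exception {
    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
    conn.setInstanceFollowRedirects(false); // the "false flag": do not follow redirects
    conn.connect();
    String location = conn.getHeaderField("Location"); // null if there is no redirection
    conn.disconnect();
    return location;
  }
}
{code}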
[jira] [Commented] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues
[ https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550008#comment-14550008 ] Rohit Agarwal commented on YARN-3633: - Other non-AM containers can be scheduled in the queue - unlike the maxAMShare limit, the fair share is not a hard limit. So, the FS will schedule non-AM containers in this queue when it cannot schedule AM containers in other queues. I gave a walkthrough in this comment: https://issues.apache.org/jira/browse/YARN-3633?focusedCommentId=14542895page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14542895 With Fair Scheduler, cluster can logjam when there are too many queues -- Key: YARN-3633 URL: https://issues.apache.org/jira/browse/YARN-3633 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Rohit Agarwal Assignee: Rohit Agarwal Priority: Critical Attachments: YARN-3633.patch It's possible to logjam a cluster by submitting many applications at once in different queues. For example, let's say there is a cluster with 20GB of total memory. Let's say 4 users submit applications at the same time. The fair share of each queue is 5GB. Let's say that maxAMShare is 0.5. So, each queue has at most 2.5GB memory for AMs. If all the users requested AMs of size 3GB - the cluster logjams. Nothing gets scheduled even when 20GB of resources are available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3126) FairScheduler: queue's usedResource is always more than the maxResource limit
[ https://issues.apache.org/jira/browse/YARN-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550007#comment-14550007 ] Xia Hu commented on YARN-3126: -- I have submitted a unit test just now, review it again, thx~ FairScheduler: queue's usedResource is always more than the maxResource limit - Key: YARN-3126 URL: https://issues.apache.org/jira/browse/YARN-3126 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.3.0 Environment: hadoop2.3.0. fair scheduler. spark 1.1.0. Reporter: Xia Hu Labels: BB2015-05-TBR, assignContainer, fairscheduler, resources Fix For: trunk-win Attachments: resourcelimit-02.patch, resourcelimit-test.patch, resourcelimit.patch When submitting spark application(both spark-on-yarn-cluster and spark-on-yarn-cleint model), the queue's usedResources assigned by fairscheduler always can be more than the queue's maxResources limit. And by reading codes of fairscheduler, I suppose this issue happened because of ignore to check the request resources when assign Container. Here is the detail: 1. choose a queue. In this process, it will check if queue's usedResource is bigger than its max, with assignContainerPreCheck. 2. then choose a app in the certain queue. 3. then choose a container. And here is the question, there is no check whether this container would make the queue sources over its max limit. If a queue's usedResource is 13G, the maxResource limit is 16G, then a container which asking for 4G resources may be assigned successful. This problem will always happen in spark application, cause we can ask for different container resources in different applications. By the way, I have already use the patch from YARN-2083. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3677) Fix findbugs warnings in FileSystemRMStateStore.java
[ https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550015#comment-14550015 ] Akira AJISAKA commented on YARN-3677: - The setIsHDFS method should be synchronized. {code} @VisibleForTesting void setIsHDFS(boolean isHDFS) { this.isHDFS = isHDFS; } {code} Looks like this issue was introduced by commit 9a2a95, but there is no issue id in the commit message. Hi [~vinodkv], would you point to the JIRA related to that commit? Fix findbugs warnings in FileSystemRMStateStore.java Key: YARN-3677 URL: https://issues.apache.org/jira/browse/YARN-3677 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Akira AJISAKA Priority: Minor Labels: newbie There is 1 findbugs warning in FileSystemRMStateStore.java. {noformat} Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of time Unsynchronized access at FileSystemRMStateStore.java: [line 156] Field org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS Synchronized 66% of the time Synchronized access at FileSystemRMStateStore.java: [line 148] Synchronized access at FileSystemRMStateStore.java: [line 859] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
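A sketch of the one-line fix suggested in the comment above, mirroring the quoted snippet: marking the test-only setter synchronized so every access to isHDFS is consistently locked.
{code}
// Suggested change (sketch): add the synchronized keyword to the setter.
@VisibleForTesting
synchronized void setIsHDFS(boolean isHDFS) {
  this.isHDFS = isHDFS;
}
{code}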
[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever
[ https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550092#comment-14550092 ] Hadoop QA commented on YARN-3646: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 43s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 7m 37s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 44s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 1s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 2s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 22m 17s | Tests passed in hadoop-common. | | {color:green}+1{color} | yarn tests | 1m 56s | Tests passed in hadoop-yarn-common. | | | | 63m 53s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12733743/YARN-3646.patch | | Optional Tests | javac unit findbugs checkstyle javadoc | | git revision | trunk / 93972a3 | | hadoop-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7994/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7994/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7994/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7994/console | This message was automatically generated. Applications are getting stuck some times in case of retry policy forever - Key: YARN-3646 URL: https://issues.apache.org/jira/browse/YARN-3646 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Raju Bairishetti Attachments: YARN-3646.patch We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use FOREVER retry policy. Yarn client is infinitely retrying in case of exceptions from the RM as it is using retrying policy as FOREVER. The problem is it is retrying for all kinds of exceptions (like ApplicationNotFoundException), even though it is not a connection failure. Due to this my application is not progressing further. *Yarn client should not retry infinitely in case of non connection failures.* We have written a simple yarn-client which is trying to get an application report for an invalid or older appId. ResourceManager is throwing an ApplicationNotFoundException as this is an invalid or older appId. 
But because of retry policy FOREVER, client is keep on retrying for getting the application report and ResourceManager is throwing ApplicationNotFoundException continuously. {code} private void testYarnClientRetryPolicy() throws Exception{ YarnConfiguration conf = new YarnConfiguration(); conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1); YarnClient yarnClient = YarnClient.createYarnClient(); yarnClient.init(conf); yarnClient.start(); ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645); ApplicationReport report = yarnClient.getApplicationReport(appId); } {code} *RM logs:* {noformat} 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 10.14.120.231:61621 Call#875162 Retry#0 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1430126768987_10645' doesn't exist in RM. at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284) at
[jira] [Commented] (YARN-3126) FairScheduler: queue's usedResource is always more than the maxResource limit
[ https://issues.apache.org/jira/browse/YARN-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550119#comment-14550119 ] Hadoop QA commented on YARN-3126: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 5m 19s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 31s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 44s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 1m 16s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 60m 19s | Tests failed in hadoop-yarn-server-resourcemanager. | | | | 77m 34s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-server-resourcemanager | | | Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS; locked 66% of time Unsynchronized access at FileSystemRMStateStore.java:66% of time Unsynchronized access at FileSystemRMStateStore.java:[line 156] | | Timed out tests | org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12733746/resourcelimit-test.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / 93972a3 | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/7993/artifact/patchprocess/whitespace.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/7993/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7993/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7993/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7993/console | This message was automatically generated. FairScheduler: queue's usedResource is always more than the maxResource limit - Key: YARN-3126 URL: https://issues.apache.org/jira/browse/YARN-3126 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.3.0 Environment: hadoop2.3.0. fair scheduler. spark 1.1.0. 
Reporter: Xia Hu Labels: BB2015-05-TBR, assignContainer, fairscheduler, resources Fix For: trunk-win Attachments: resourcelimit-02.patch, resourcelimit-test.patch, resourcelimit.patch When submitting spark application(both spark-on-yarn-cluster and spark-on-yarn-cleint model), the queue's usedResources assigned by fairscheduler always can be more than the queue's maxResources limit. And by reading codes of fairscheduler, I suppose this issue happened because of ignore to check the request resources when assign Container. Here is the detail: 1. choose a queue. In this process, it will check if queue's usedResource is bigger than its max, with assignContainerPreCheck. 2. then choose a app in the certain queue. 3. then choose a container. And here is the question, there is no check whether this container would make the queue sources over its max limit. If a queue's usedResource is 13G, the maxResource limit is 16G, then a container which asking for 4G resources may be assigned successful. This problem will always happen in spark application, cause we can ask for different container resources in different applications. By the way, I have already use the patch from YARN-2083. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3677) Fix findbugs warnings in FileSystemRMStateStore.java
[ https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550051#comment-14550051 ] Tsuyoshi Ozawa commented on YARN-3677: -- [~ajisakaa] thank you for finding the issue. The commit message says that the contribution was done by [~asuresh]. I think we should revert the change if the JIRA has not been opened yet - we should discuss that point. IMHO, we shouldn't switch the behaviour based on whether HDFS is used or not without a special reason. Fix findbugs warnings in FileSystemRMStateStore.java Key: YARN-3677 URL: https://issues.apache.org/jira/browse/YARN-3677 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Akira AJISAKA Priority: Minor Labels: newbie There is 1 findbugs warning in FileSystemRMStateStore.java. {noformat} Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of time Unsynchronized access at FileSystemRMStateStore.java: [line 156] Field org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS Synchronized 66% of the time Synchronized access at FileSystemRMStateStore.java: [line 148] Synchronized access at FileSystemRMStateStore.java: [line 859] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever
[ https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550233#comment-14550233 ] Rohith commented on YARN-3646: -- bq. Seems we do not even require exceptionToPolicy for FOREVER policy if we catch the exception in shouldRetry method. make sense to me,will reveiw the patch, thanks Applications are getting stuck some times in case of retry policy forever - Key: YARN-3646 URL: https://issues.apache.org/jira/browse/YARN-3646 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Raju Bairishetti Attachments: YARN-3646.patch We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use FOREVER retry policy. Yarn client is infinitely retrying in case of exceptions from the RM as it is using retrying policy as FOREVER. The problem is it is retrying for all kinds of exceptions (like ApplicationNotFoundException), even though it is not a connection failure. Due to this my application is not progressing further. *Yarn client should not retry infinitely in case of non connection failures.* We have written a simple yarn-client which is trying to get an application report for an invalid or older appId. ResourceManager is throwing an ApplicationNotFoundException as this is an invalid or older appId. But because of retry policy FOREVER, client is keep on retrying for getting the application report and ResourceManager is throwing ApplicationNotFoundException continuously. {code} private void testYarnClientRetryPolicy() throws Exception{ YarnConfiguration conf = new YarnConfiguration(); conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1); YarnClient yarnClient = YarnClient.createYarnClient(); yarnClient.init(conf); yarnClient.start(); ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645); ApplicationReport report = yarnClient.getApplicationReport(appId); } {code} *RM logs:* {noformat} 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 10.14.120.231:61621 Call#875162 Retry#0 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1430126768987_10645' doesn't exist in RM. at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 10.14.120.231:61621 Call#875163 Retry#0 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
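A hedged sketch of the idea quoted in the comment above (not the submitted patch): a forever-style retry policy whose shouldRetry keeps retrying only on connection-level failures and fails fast on application-level exceptions such as ApplicationNotFoundException.
{code}
import java.net.ConnectException;

import org.apache.hadoop.io.retry.RetryPolicy;

// Illustrative sketch: retry forever only while the RM is unreachable.
class RetryForeverOnConnectionFailure implements RetryPolicy {
  @Override
  public RetryAction shouldRetry(Exception e, int retries, int failovers,
      boolean isIdempotentOrAtMostOnce) throws Exception {
    if (e instanceof ConnectException) {
      return RetryAction.RETRY; // connection failure: keep waiting for the RM
    }
    return RetryAction.FAIL;    // e.g. ApplicationNotFoundException: give up immediately
  }
}
{code}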
[jira] [Updated] (YARN-2821) Distributed shell app master becomes unresponsive sometimes
[ https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-2821: Attachment: YARN-2821.005.patch Uploaded 005.patch which adds the tests requested by [~jianhe]. Distributed shell app master becomes unresponsive sometimes --- Key: YARN-2821 URL: https://issues.apache.org/jira/browse/YARN-2821 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Affects Versions: 2.5.1 Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: YARN-2821.002.patch, YARN-2821.003.patch, YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, apache-yarn-2821.1.patch We've noticed that once in a while the distributed shell app master becomes unresponsive and is eventually killed by the RM. snippet of the logs - {noformat} 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: appattempt_1415123350094_0017_01 received 0 previous attempts' running containers on AM registration. 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[memory:10, vCores:1]Priority[0] 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[memory:10, vCores:1]Priority[0] 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[memory:10, vCores:1]Priority[0] 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[memory:10, vCores:1]Priority[0] 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[memory:10, vCores:1]Priority[0] 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : onprem-tez2:45454 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, allocatedCnt=1 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell command on a new container., containerId=container_1415123350094_0017_01_02, containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, containerResourceMemory1024, containerResourceVirtualCores1 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up container launch container for containerid=container_1415123350094_0017_01_02 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: START_CONTAINER for Container container_1415123350094_0017_01_02 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : onprem-tez2:45454 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: QUERY_CONTAINER for Container container_1415123350094_0017_01_02 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : onprem-tez2:45454 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : onprem-tez3:45454 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : onprem-tez4:45454 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, allocatedCnt=3 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell command on a new container., containerId=container_1415123350094_0017_01_03, containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, containerResourceMemory1024, containerResourceVirtualCores1 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell command on a new container., containerId=container_1415123350094_0017_01_04, containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, containerResourceMemory1024, 
containerResourceVirtualCores1 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell command on a new container., containerId=container_1415123350094_0017_01_05, containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, containerResourceMemory1024, containerResourceVirtualCores1 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Setting up container launch container for containerid=container_1415123350094_0017_01_03 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Setting up container launch container for containerid=container_1415123350094_0017_01_05 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Setting up container launch container for containerid=container_1415123350094_0017_01_04 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: START_CONTAINER for Container container_1415123350094_0017_01_05 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: START_CONTAINER for Container container_1415123350094_0017_01_03 14/11/04 18:21:39 INFO
[jira] [Updated] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-41: -- Attachment: YARN-41-5.patch I am attaching a patch updated to the latest source code, with the above comments addressed as well. The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41.patch Instead of waiting for the NM expiry, the RM should remove and handle an NM that has been shut down gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes
[ https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550216#comment-14550216 ] Hadoop QA commented on YARN-2821: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 40s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 18s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 35s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 56s | Tests passed in hadoop-yarn-applications-distributedshell. | | | | 42m 15s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12733765/YARN-2821.005.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / eb4c9dd | | hadoop-yarn-applications-distributedshell test log | https://builds.apache.org/job/PreCommit-YARN-Build/7995/artifact/patchprocess/testrun_hadoop-yarn-applications-distributedshell.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7995/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7995/console | This message was automatically generated. Distributed shell app master becomes unresponsive sometimes --- Key: YARN-2821 URL: https://issues.apache.org/jira/browse/YARN-2821 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Affects Versions: 2.5.1 Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: YARN-2821.002.patch, YARN-2821.003.patch, YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, apache-yarn-2821.1.patch We've noticed that once in a while the distributed shell app master becomes unresponsive and is eventually killed by the RM. snippet of the logs - {noformat} 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: appattempt_1415123350094_0017_01 received 0 previous attempts' running containers on AM registration. 
14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[memory:10, vCores:1]Priority[0] 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[memory:10, vCores:1]Priority[0] 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[memory:10, vCores:1]Priority[0] 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[memory:10, vCores:1]Priority[0] 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested container ask: Capability[memory:10, vCores:1]Priority[0] 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : onprem-tez2:45454 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, allocatedCnt=1 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell command on a new container., containerId=container_1415123350094_0017_01_02, containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, containerResourceMemory1024, containerResourceVirtualCores1 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up container launch container for containerid=container_1415123350094_0017_01_02 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: START_CONTAINER for Container container_1415123350094_0017_01_02 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : onprem-tez2:45454 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: QUERY_CONTAINER for Container
[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever
[ https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550256#comment-14550256 ] Rohith commented on YARN-3646: -- Thanks for working on this issue.. The patch overall looks good to me. nit : Can the test moved to Yarn package since issue is in Yarn? Otherwise if there is any changed in the RMProxy, test will not run. Applications are getting stuck some times in case of retry policy forever - Key: YARN-3646 URL: https://issues.apache.org/jira/browse/YARN-3646 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Raju Bairishetti Attachments: YARN-3646.patch We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use FOREVER retry policy. Yarn client is infinitely retrying in case of exceptions from the RM as it is using retrying policy as FOREVER. The problem is it is retrying for all kinds of exceptions (like ApplicationNotFoundException), even though it is not a connection failure. Due to this my application is not progressing further. *Yarn client should not retry infinitely in case of non connection failures.* We have written a simple yarn-client which is trying to get an application report for an invalid or older appId. ResourceManager is throwing an ApplicationNotFoundException as this is an invalid or older appId. But because of retry policy FOREVER, client is keep on retrying for getting the application report and ResourceManager is throwing ApplicationNotFoundException continuously. {code} private void testYarnClientRetryPolicy() throws Exception{ YarnConfiguration conf = new YarnConfiguration(); conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1); YarnClient yarnClient = YarnClient.createYarnClient(); yarnClient.init(conf); yarnClient.start(); ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645); ApplicationReport report = yarnClient.getApplicationReport(appId); } {code} *RM logs:* {noformat} 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 10.14.120.231:61621 Call#875162 Retry#0 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1430126768987_10645' doesn't exist in RM. at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 10.14.120.231:61621 Call#875163 Retry#0 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever
[ https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550258#comment-14550258 ] Rohith commented on YARN-3646: -- And I verified in one node cluster by enabling and disabling retryforever policy. Applications are getting stuck some times in case of retry policy forever - Key: YARN-3646 URL: https://issues.apache.org/jira/browse/YARN-3646 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Raju Bairishetti Attachments: YARN-3646.patch We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use FOREVER retry policy. Yarn client is infinitely retrying in case of exceptions from the RM as it is using retrying policy as FOREVER. The problem is it is retrying for all kinds of exceptions (like ApplicationNotFoundException), even though it is not a connection failure. Due to this my application is not progressing further. *Yarn client should not retry infinitely in case of non connection failures.* We have written a simple yarn-client which is trying to get an application report for an invalid or older appId. ResourceManager is throwing an ApplicationNotFoundException as this is an invalid or older appId. But because of retry policy FOREVER, client is keep on retrying for getting the application report and ResourceManager is throwing ApplicationNotFoundException continuously. {code} private void testYarnClientRetryPolicy() throws Exception{ YarnConfiguration conf = new YarnConfiguration(); conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1); YarnClient yarnClient = YarnClient.createYarnClient(); yarnClient.init(conf); yarnClient.start(); ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645); ApplicationReport report = yarnClient.getApplicationReport(appId); } {code} *RM logs:* {noformat} 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 10.14.120.231:61621 Call#875162 Retry#0 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1430126768987_10645' doesn't exist in RM. at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 10.14.120.231:61621 Call#875163 Retry#0 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.
[ https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-3543: - Attachment: 0004-YARN-3543.patch ApplicationReport should be able to tell whether the Application is AM managed or not. --- Key: YARN-3543 URL: https://issues.apache.org/jira/browse/YARN-3543 Project: Hadoop YARN Issue Type: Improvement Components: api Affects Versions: 2.6.0 Reporter: Spandan Dutta Assignee: Rohith Labels: BB2015-05-TBR Attachments: 0001-YARN-3543.patch, 0001-YARN-3543.patch, 0002-YARN-3543.patch, 0002-YARN-3543.patch, 0003-YARN-3543.patch, 0003-YARN-3543.patch, 0004-YARN-3543.patch, YARN-3543-AH.PNG, YARN-3543-RM.PNG Currently we can know whether the application submitted by the user is AM managed from the applicationSubmissionContext. This can be only done at the time when the user submits the job. We should have access to this info from the ApplicationReport as well so that we can check whether an app is AM managed or not anytime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API
[ https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550268#comment-14550268 ] Hudson commented on YARN-3541: -- FAILURE: Integrated in Hadoop-Yarn-trunk #932 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/932/]) YARN-3541. Add version info on timeline service / generic history web UI and REST API. Contributed by Zhijie Shen (xgong: rev 76afd28862c1f27011273659a82cd45903a77170) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelineAbout.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/NavBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSController.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/timeline/TimelineUtils.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java Add version info on timeline service / generic history web UI and REST API -- Key: YARN-3541 URL: https://issues.apache.org/jira/browse/YARN-3541 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.8.0 Attachments: YARN-3541.1.patch, YARN-3541.2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever
[ https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550288#comment-14550288 ] Raju Bairishetti commented on YARN-3646: Thanks [~rohithsharma] for the review. Looks like it is mainly an issue with retry policy. Applications are getting stuck some times in case of retry policy forever - Key: YARN-3646 URL: https://issues.apache.org/jira/browse/YARN-3646 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Raju Bairishetti Attachments: YARN-3646.patch We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use FOREVER retry policy. Yarn client is infinitely retrying in case of exceptions from the RM as it is using retrying policy as FOREVER. The problem is it is retrying for all kinds of exceptions (like ApplicationNotFoundException), even though it is not a connection failure. Due to this my application is not progressing further. *Yarn client should not retry infinitely in case of non connection failures.* We have written a simple yarn-client which is trying to get an application report for an invalid or older appId. ResourceManager is throwing an ApplicationNotFoundException as this is an invalid or older appId. But because of retry policy FOREVER, client is keep on retrying for getting the application report and ResourceManager is throwing ApplicationNotFoundException continuously. {code} private void testYarnClientRetryPolicy() throws Exception{ YarnConfiguration conf = new YarnConfiguration(); conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1); YarnClient yarnClient = YarnClient.createYarnClient(); yarnClient.init(conf); yarnClient.start(); ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645); ApplicationReport report = yarnClient.getApplicationReport(appId); } {code} *RM logs:* {noformat} 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 10.14.120.231:61621 Call#875162 Retry#0 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1430126768987_10645' doesn't exist in RM. at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 10.14.120.231:61621 Call#875163 Retry#0 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API
[ https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550299#comment-14550299 ] Hudson commented on YARN-3541: -- SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #201 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/201/]) YARN-3541. Add version info on timeline service / generic history web UI and REST API. Contributed by Zhijie Shen (xgong: rev 76afd28862c1f27011273659a82cd45903a77170) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelineAbout.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/timeline/TimelineUtils.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSController.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/NavBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebApp.java Add version info on timeline service / generic history web UI and REST API -- Key: YARN-3541 URL: https://issues.apache.org/jira/browse/YARN-3541 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.8.0 Attachments: YARN-3541.1.patch, YARN-3541.2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550314#comment-14550314 ] Hadoop QA commented on YARN-41: --- \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 10s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 9 new or modified test files. | | {color:green}+1{color} | javac | 7m 40s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 40s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 30s | The applied patch generated 18 new checkstyle issues (total was 15, now 33). | | {color:green}+1{color} | whitespace | 0m 15s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 45s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 24s | Tests passed in hadoop-yarn-server-common. | | {color:green}+1{color} | yarn tests | 5m 57s | Tests passed in hadoop-yarn-server-nodemanager. | | {color:green}+1{color} | yarn tests | 49m 59s | Tests passed in hadoop-yarn-server-resourcemanager. | | {color:green}+1{color} | yarn tests | 1m 53s | Tests passed in hadoop-yarn-server-tests. 
| | | | 99m 18s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-server-resourcemanager | | | Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS; locked 66% of time Unsynchronized access at FileSystemRMStateStore.java:66% of time Unsynchronized access at FileSystemRMStateStore.java:[line 156] | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12733771/YARN-41-5.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / eb4c9dd | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7996/artifact/patchprocess/diffcheckstylehadoop-yarn-server-common.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/7996/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html | | hadoop-yarn-server-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7996/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7996/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7996/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | hadoop-yarn-server-tests test log | https://builds.apache.org/job/PreCommit-YARN-Build/7996/artifact/patchprocess/testrun_hadoop-yarn-server-tests.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7996/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7996/console | This message was automatically generated. The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41.patch Instead of waiting for the NM expiry, RM should remove and handle the NM, which is shutdown gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-41: -- Attachment: (was: YARN-41-5.patch) The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41.patch Instead of waiting for the NM expiry, RM should remove and handle the NM, which is shutdown gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-41: -- Attachment: YARN-41-5.patch The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41.patch Instead of waiting for the NM expiry, RM should remove and handle the NM, which is shutdown gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3626) On Windows localized resources are not moved to the front of the classpath when they should be
[ https://issues.apache.org/jira/browse/YARN-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551790#comment-14551790 ] Hadoop QA commented on YARN-3626: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 42s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 33s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 22s | The applied patch generated 1 new checkstyle issues (total was 240, now 238). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 21s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | mapreduce tests | 9m 26s | Tests passed in hadoop-mapreduce-client-app. | | {color:green}+1{color} | mapreduce tests | 0m 45s | Tests passed in hadoop-mapreduce-client-common. | | {color:green}+1{color} | yarn tests | 0m 30s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 6m 5s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 58m 2s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12734054/YARN-3626.14.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / ce53c8e | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8015/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-mapreduce-client-app test log | https://builds.apache.org/job/PreCommit-YARN-Build/8015/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt | | hadoop-mapreduce-client-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8015/artifact/patchprocess/testrun_hadoop-mapreduce-client-common.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8015/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8015/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8015/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8015/console | This message was automatically generated. 
On Windows localized resources are not moved to the front of the classpath when they should be -- Key: YARN-3626 URL: https://issues.apache.org/jira/browse/YARN-3626 Project: Hadoop YARN Issue Type: Bug Components: yarn Environment: Windows Reporter: Craig Welch Assignee: Craig Welch Fix For: 2.7.1 Attachments: YARN-3626.0.patch, YARN-3626.11.patch, YARN-3626.14.patch, YARN-3626.4.patch, YARN-3626.6.patch, YARN-3626.9.patch In response to the mapreduce.job.user.classpath.first setting the classpath is ordered differently so that localized resources will appear before system classpath resources when tasks execute. On Windows this does not work because the localized resources are not linked into their final location when the classpath jar is created. To compensate for that localized jar resources are added directly to the classpath generated for the jar rather than being discovered from the localized directories. Unfortunately, they are always appended to the classpath, and so are never preferred over system resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
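The ordering problem described above reduces to putting localized entries at the front of the generated classpath when mapreduce.job.user.classpath.first is set, instead of appending them. A minimal, illustrative sketch (names are invented; this is not the attached patch):
{code}
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Sketch only: order classpath entries so that localized resources are
// preferred over system resources when the user asks for it.
public final class ClasspathOrdering {

  private ClasspathOrdering() {
  }

  static List<String> orderClasspath(List<String> systemEntries,
      List<String> localizedEntries, boolean userClasspathFirst) {
    // LinkedHashSet keeps insertion order and drops duplicates that occur
    // when an entry is both localized and already on the system classpath.
    LinkedHashSet<String> ordered = new LinkedHashSet<String>();
    if (userClasspathFirst) {
      ordered.addAll(localizedEntries); // localized resources win
      ordered.addAll(systemEntries);
    } else {
      ordered.addAll(systemEntries);    // default: system resources win
      ordered.addAll(localizedEntries);
    }
    return new ArrayList<String>(ordered);
  }
}
{code}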
[jira] [Created] (YARN-3688) Remove unimplemented option for `hadoop fs -ls` from document in branch-2.7
Akira AJISAKA created YARN-3688: --- Summary: Remove unimplemented option for `hadoop fs -ls` from document in branch-2.7 Key: YARN-3688 URL: https://issues.apache.org/jira/browse/YARN-3688 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Akira AJISAKA {{-t}}, {{-s}}, {{-R}}, and {{-u}} option for {{hadoop fs -ls}} are unimplemented in 2.7.0 but documented in http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/FileSystemShell.html#ls We should fix the document. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3651) Tracking url in ApplicationCLI wrong for running application
[ https://issues.apache.org/jira/browse/YARN-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551793#comment-14551793 ] Bibin A Chundatt commented on YARN-3651: Hi [~devaraj.k] and [~jianhe], I agree with the comments, but I feel further improvement can still be made here with respect to security. After discussion with some members: *MR can define a separate config and have NM localize the key files for the AM*. So I am reopening this as an improvement; we can discuss it further. Tracking url in ApplicationCLI wrong for running application Key: YARN-3651 URL: https://issues.apache.org/jira/browse/YARN-3651 Project: Hadoop YARN Issue Type: Bug Components: applications, resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 Reporter: Bibin A Chundatt Priority: Minor Application URL in Application CLI wrong Steps to reproduce == 1. Start HA setup insecure mode 2.Configure HTTPS_ONLY 3.Submit application to cluster 4.Execute command ./yarn application -list 5.Observer tracking URL shown {code} 15/05/15 13:34:38 INFO client.AHSProxy: Connecting to Application History server at /IP:45034 Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):1 Application-Id --- Tracking-URL application_1431672734347_0003 *http://host-10-19-92-117:13013* {code} *Expected* https://IP:64323/proxy/application_1431672734347_0003 / -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3651) Support SSL for AM webapp
[ https://issues.apache.org/jira/browse/YARN-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-3651: --- Summary: Support SSL for AM webapp (was: Tracking url in ApplicationCLI wrong for running application) Support SSL for AM webapp - Key: YARN-3651 URL: https://issues.apache.org/jira/browse/YARN-3651 Project: Hadoop YARN Issue Type: Improvement Components: applications, resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 Reporter: Bibin A Chundatt Priority: Minor Application URL in Application CLI wrong Steps to reproduce == 1. Start HA setup insecure mode 2.Configure HTTPS_ONLY 3.Submit application to cluster 4.Execute command ./yarn application -list 5.Observer tracking URL shown {code} 15/05/15 13:34:38 INFO client.AHSProxy: Connecting to Application History server at /IP:45034 Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):1 Application-Id --- Tracking-URL application_1431672734347_0003 *http://host-10-19-92-117:13013* {code} *Expected* https://IP:64323/proxy/application_1431672734347_0003 / -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3651) Tracking url in ApplicationCLI wrong for running application
[ https://issues.apache.org/jira/browse/YARN-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-3651: --- Issue Type: Improvement (was: Bug) Tracking url in ApplicationCLI wrong for running application Key: YARN-3651 URL: https://issues.apache.org/jira/browse/YARN-3651 Project: Hadoop YARN Issue Type: Improvement Components: applications, resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 Reporter: Bibin A Chundatt Priority: Minor Application URL in Application CLI wrong Steps to reproduce == 1. Start HA setup insecure mode 2.Configure HTTPS_ONLY 3.Submit application to cluster 4.Execute command ./yarn application -list 5.Observer tracking URL shown {code} 15/05/15 13:34:38 INFO client.AHSProxy: Connecting to Application History server at /IP:45034 Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):1 Application-Id --- Tracking-URL application_1431672734347_0003 *http://host-10-19-92-117:13013* {code} *Expected* https://IP:64323/proxy/application_1431672734347_0003 / -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (YARN-3651) Support SSL for AM webapp
[ https://issues.apache.org/jira/browse/YARN-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt reopened YARN-3651: Support SSL for AM webapp - Key: YARN-3651 URL: https://issues.apache.org/jira/browse/YARN-3651 Project: Hadoop YARN Issue Type: Improvement Components: applications, resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 Reporter: Bibin A Chundatt Priority: Minor Application URL in Application CLI wrong Steps to reproduce == 1. Start HA setup insecure mode 2.Configure HTTPS_ONLY 3.Submit application to cluster 4.Execute command ./yarn application -list 5.Observer tracking URL shown {code} 15/05/15 13:34:38 INFO client.AHSProxy: Connecting to Application History server at /IP:45034 Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):1 Application-Id --- Tracking-URL application_1431672734347_0003 *http://host-10-19-92-117:13013* {code} *Expected* https://IP:64323/proxy/application_1431672734347_0003 / -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3646) Applications are getting stuck some times in case of retry policy forever
[ https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raju Bairishetti updated YARN-3646: --- Attachment: YARN-3646.001.patch Added a new unit test in hadoop-yarn-client. [~rohithsharma] Could you please review? Ran the test without starting the RM and then test was getting timeout. Ran the test by starting the RM then client is getting ApplicationNotFoundException for older/invalid appId. {code} rm = new ResourceManager(); rm.init(conf); rm.start(); {code} Applications are getting stuck some times in case of retry policy forever - Key: YARN-3646 URL: https://issues.apache.org/jira/browse/YARN-3646 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Raju Bairishetti Attachments: YARN-3646.001.patch, YARN-3646.patch We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use FOREVER retry policy. Yarn client is infinitely retrying in case of exceptions from the RM as it is using retrying policy as FOREVER. The problem is it is retrying for all kinds of exceptions (like ApplicationNotFoundException), even though it is not a connection failure. Due to this my application is not progressing further. *Yarn client should not retry infinitely in case of non connection failures.* We have written a simple yarn-client which is trying to get an application report for an invalid or older appId. ResourceManager is throwing an ApplicationNotFoundException as this is an invalid or older appId. But because of retry policy FOREVER, client is keep on retrying for getting the application report and ResourceManager is throwing ApplicationNotFoundException continuously. {code} private void testYarnClientRetryPolicy() throws Exception{ YarnConfiguration conf = new YarnConfiguration(); conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1); YarnClient yarnClient = YarnClient.createYarnClient(); yarnClient.init(conf); yarnClient.start(); ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645); ApplicationReport report = yarnClient.getApplicationReport(appId); } {code} *RM logs:* {noformat} 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 10.14.120.231:61621 Call#875162 Retry#0 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1430126768987_10645' doesn't exist in RM. 
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from 10.14.120.231:61621 Call#875163 Retry#0 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
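Based on the description of the test, its shape is roughly the following. This is a hedged reconstruction, not the attached YARN-3646.001.patch; the class and test names are invented, and it assumes an RM has been started in the test setup as in the snippet quoted above.
{code}
import static org.junit.Assert.fail;

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;
import org.junit.Test;

public class TestYarnClientRetryForever {

  // With an RM running, looking up an unknown application id should surface
  // ApplicationNotFoundException instead of retrying forever; the timeout
  // guards against the old behaviour of infinite retries.
  @Test(timeout = 30000)
  public void testNoInfiniteRetryOnApplicationNotFound() throws Exception {
    YarnConfiguration conf = new YarnConfiguration();
    conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1);

    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(conf);
    yarnClient.start();
    try {
      ApplicationId unknownAppId = ApplicationId.newInstance(1430126768987L, 10645);
      yarnClient.getApplicationReport(unknownAppId);
      fail("Expected ApplicationNotFoundException for an unknown application id");
    } catch (ApplicationNotFoundException expected) {
      // desired behaviour: a definitive failure, not endless retries
    } finally {
      yarnClient.stop();
    }
  }
}
{code}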
[jira] [Commented] (YARN-3685) NodeManager unnecessarily knows about classpath-jars due to Windows limitations
[ https://issues.apache.org/jira/browse/YARN-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551482#comment-14551482 ] Vinod Kumar Vavilapalli commented on YARN-3685: --- My first reaction is that this jar can be generated by the app (say MR client or AM) and passed to YARN. In the worst case, this should get thrown out of ContainerExecutor and instead become a specific implementation only on the Windows-executor which inspects env variables and does this mangling. /cc [~cnauroth], [~rusanu] who worked in this area before. Thoughts? NodeManager unnecessarily knows about classpath-jars due to Windows limitations --- Key: YARN-3685 URL: https://issues.apache.org/jira/browse/YARN-3685 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Found this while looking at cleaning up ContainerExecutor via YARN-3648, making it a sub-task. YARN *should not* know about classpaths. Our original design modeled around this. But when we added windows suppport, due to classpath issues, we ended up breaking this abstraction via YARN-316. We should clean this up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2268) Disallow formatting the RMStateStore when there is an RM running
[ https://issues.apache.org/jira/browse/YARN-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551826#comment-14551826 ] Rohith commented on YARN-2268: -- Thanks [~sunilg] [~jianhe] [~kasha] for sharing your thoughts. bq. Given we recommend using the ZK-store when using HA, how about adding this for the ZK-store using an ephemeral znode for lock first? +1, given the ZK-store (ZKRMStateStore) is the recommended state store. bq. How about creating a lock file and declaring it stale after a stipulated period of time. If we use a stipulated period, I am thinking that within that period neither can the RM be started nor can the state store be formatted. Also, the lock file has to be stored in HDFS regardless of which RMStateStore is used, which is an extra binding to the filesystem. I am thinking, why can't we use the general approach of polling the RM web service? It would give a more accurate state. Disallow formatting the RMStateStore when there is an RM running Key: YARN-2268 URL: https://issues.apache.org/jira/browse/YARN-2268 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.6.0 Reporter: Karthik Kambatla Assignee: Rohith Attachments: 0001-YARN-2268.patch YARN-2131 adds a way to format the RMStateStore. However, it can be a problem if we format the store while an RM is actively using it. It would be nice to fail the format if there is an RM running and using this store. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
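The ephemeral-znode idea being discussed would roughly work as follows: the active RM creates an ephemeral lock znode, and the format command refuses to run while that znode exists; if the RM dies, its session expires and the znode disappears, so a crashed RM never blocks a later format. A sketch using Apache Curator with invented paths and method names, not code from any patch:
{code}
import org.apache.curator.framework.CuratorFramework;
import org.apache.zookeeper.CreateMode;

// Sketch only: guard state-store formatting with an ephemeral "active RM" znode.
public class ZKStoreFormatGuard {

  private static final String LOCK_PATH = "/rmstore/ZKRMStateRoot/RM_ACTIVE_LOCK";

  // Called by a running RM once it is connected to ZooKeeper.
  static void acquireActiveLock(CuratorFramework zk) throws Exception {
    // EPHEMERAL: the znode goes away with the RM's session, so a dead RM
    // cannot permanently block formatting.
    zk.create().creatingParentsIfNeeded()
        .withMode(CreateMode.EPHEMERAL)
        .forPath(LOCK_PATH);
  }

  // Called by the format command before it touches the store.
  static void checkNoActiveRM(CuratorFramework zk) throws Exception {
    if (zk.checkExists().forPath(LOCK_PATH) != null) {
      throw new IllegalStateException(
          "An RM appears to be running and using this state store; refusing to format");
    }
  }
}
{code}
The format path would call checkNoActiveRM before clearing the root znode, which avoids both the stale-lock-file and the stipulated-period problems raised above.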
[jira] [Commented] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup
[ https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551841#comment-14551841 ] Hadoop QA commented on YARN-2923: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 38s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 35s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 38s | The applied patch generated 1 new checkstyle issues (total was 214, now 215). | | {color:green}+1{color} | whitespace | 0m 3s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 47s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 21s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 58s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 6m 8s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 48m 52s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12733391/YARN-2923.20150517-1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / ce53c8e | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8016/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8016/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8016/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8016/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8016/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8016/console | This message was automatically generated. Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup - Key: YARN-2923 URL: https://issues.apache.org/jira/browse/YARN-2923 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Naganarasimha G R Assignee: Naganarasimha G R Fix For: 2.8.0 Attachments: YARN-2923.20141204-1.patch, YARN-2923.20141210-1.patch, YARN-2923.20150328-1.patch, YARN-2923.20150404-1.patch, YARN-2923.20150517-1.patch As part of Distributed Node Labels configuration we need to support Node labels to be configured in Yarn-site.xml. 
And on modification of Node Labels configuration in yarn-site.xml, NM should be able to get modified Node labels from this NodeLabelsprovider service without NM restart -- This message was sent by Atlassian JIRA (v6.3.4#6332)
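One way the requirement above could be met is a provider service inside the NM that periodically re-reads the configured labels, so edits to yarn-site.xml take effect without an NM restart. The sketch below is speculative: the property name, class name and refresh mechanism are all assumptions, not the YARN-2923 design.
{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.Timer;
import java.util.TimerTask;

import org.apache.hadoop.conf.Configuration;

// Sketch only: periodically re-read a (hypothetical) node-labels property.
public class ConfigurationNodeLabelsProvider {

  // Hypothetical property: comma-separated labels for this NodeManager.
  static final String NM_NODE_LABELS =
      "yarn.nodemanager.node-labels.provider.configured-node-labels";

  private volatile Set<String> currentLabels = new HashSet<String>();

  public void start(long intervalMs) {
    new Timer("node-labels-refresh", true).schedule(new TimerTask() {
      @Override
      public void run() {
        // A fresh Configuration re-parses yarn-site.xml, so edited labels
        // become visible without restarting the NM.
        Configuration conf = new Configuration();
        conf.addResource("yarn-site.xml");
        String raw = conf.get(NM_NODE_LABELS, "").trim();
        Set<String> labels = new HashSet<String>();
        if (!raw.isEmpty()) {
          labels.addAll(Arrays.asList(raw.split("\\s*,\\s*")));
        }
        currentLabels = labels;
      }
    }, 0, intervalMs);
  }

  public Set<String> getNodeLabels() {
    return currentLabels;
  }
}
{code}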
[jira] [Commented] (YARN-3678) DelayedProcessKiller may kill other process other than container
[ https://issues.apache.org/jira/browse/YARN-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551854#comment-14551854 ] Varun Saxena commented on YARN-3678: [~vinodkv], as this issue happened in our customer deployment, I will explain the issue. We got an issue wherein NM was being randomly killed at one of the places where Hadoop distribution is deployed. In logs, we could see NM being killed immediately after {{signalContainer}} is called. What happens is as under : # LCE sends a SIGTERM to the container and waits for 250 ms # Probably within this 250 ms period, container processes the signal and exits gracefully. # Now it is possible the pid assigned to container is taken up by some other process or thread(which run as light weight processes in Linux). # When LCE again tries to send a SIGKILL to the same pid, it might actually be sending it to another process or thread. # As we could not find any other reason for NM going randomly down, we suspect it may have gone down because some new thread of NM took up this pid and SIGKILL may have been sent to it, which may have crashed NM. This is more based on suspicion though rather than fool proof analysis. Not sure how to verify if this indeed happened. Pls note that {{pid_max}} in the deployment was {{32768}}. I am not sure about which user was the process owner though. Probably [~gu chi] can shed some light on that. An additional check can be done IMHO. DelayedProcessKiller may kill other process other than container Key: YARN-3678 URL: https://issues.apache.org/jira/browse/YARN-3678 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: gu-chi Priority: Critical Suppose one container finished, then it will do clean up, the PID file still exist and will trigger once singalContainer, this will kill the process with the pid in PID file, but as container already finished, so this PID may be occupied by other process, this may cause serious issue. As I know, my NM was killed unexpectedly, what I described can be the cause. Even rarely occur. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
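The additional check mentioned at the end of the comment could, on Linux, be as simple as confirming that the pid still belongs to the container's user before the delayed SIGKILL is sent. A /proc-based sketch with invented names; this is not the actual ContainerExecutor code:
{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch only: verify the owner of a pid before sending the delayed SIGKILL,
// so a reused pid belonging to another user (or to the NM itself) is skipped.
public final class PidOwnerCheck {

  private PidOwnerCheck() {
  }

  // Returns the real uid owning /proc/[pid], or -1 if the process is gone.
  static int ownerUidOf(int pid) throws IOException {
    Path status = Paths.get("/proc", Integer.toString(pid), "status");
    if (!Files.exists(status)) {
      return -1; // pid already exited; nothing to kill
    }
    for (String line : Files.readAllLines(status, StandardCharsets.UTF_8)) {
      if (line.startsWith("Uid:")) {
        // Line format: "Uid:  <real> <effective> <saved> <fs>"
        return Integer.parseInt(line.split("\\s+")[1]);
      }
    }
    return -1;
  }

  static boolean safeToSignal(int pid, int expectedContainerUid) throws IOException {
    // If the pid was reused by a process of a different user, skip the kill.
    return ownerUidOf(pid) == expectedContainerUid;
  }
}
{code}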
[jira] [Commented] (YARN-3655) FairScheduler: potential livelock due to maxAMShare limitation and container reservation
[ https://issues.apache.org/jira/browse/YARN-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551506#comment-14551506 ] Karthik Kambatla commented on YARN-3655: If allocating a container is going to take the amShare over the maxAMShare, not allocating and hence unreserving resources seems reasonable. That said, we should also add the same check before making such a reservation in FSAppAttempt#assignContainer. There is already a check to ensure we won't go over maxShare. In terms of code organization, I would like for us to create a helper method (okayToReserveResources) that would check the maxShare for all containers and maxAMShare for AM containers. Also, looking at the code, I see fitsInMaxShare method is a static in FairScheduler. We should just make it a non-static method in FSQueue, it can call parent.fitsInMaxShare. Can we file a follow-up JIRA for it? FairScheduler: potential livelock due to maxAMShare limitation and container reservation - Key: YARN-3655 URL: https://issues.apache.org/jira/browse/YARN-3655 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-3655.000.patch, YARN-3655.001.patch FairScheduler: potential livelock due to maxAMShare limitation and container reservation. If a node is reserved by an application, all the other applications don't have any chance to assign a new container on this node, unless the application which reserves the node assigns a new container on this node or releases the reserved container on this node. The problem is if an application tries to call assignReservedContainer and fail to get a new container due to maxAMShare limitation, it will block all other applications to use the nodes it reserves. If all other running applications can't release their AM containers due to being blocked by these reserved containers. A livelock situation can happen. The following is the code at FSAppAttempt#assignContainer which can cause this potential livelock. {code} // Check the AM resource usage for the leaf queue if (!isAmRunning() !getUnmanagedAM()) { ListResourceRequest ask = appSchedulingInfo.getAllResourceRequests(); if (ask.isEmpty() || !getQueue().canRunAppAM( ask.get(0).getCapability())) { if (LOG.isDebugEnabled()) { LOG.debug(Skipping allocation because maxAMShare limit would + be exceeded); } return Resources.none(); } } {code} To fix this issue, we can unreserve the node if we can't allocate the AM container on the node due to Max AM share limitation and the node is reserved by the application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
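The helper suggested here (okayToReserveResources) would essentially apply the same headroom checks before holding a reservation as before an allocation. A hedged sketch of that check, with invented parameter names and detached from the real FSAppAttempt/FSQueue plumbing:
{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

// Sketch only: refuse to hold a reservation that could never be satisfied.
final class ReservationChecks {

  private ReservationChecks() {
  }

  static boolean okayToReserveResources(Resource requested, boolean isAmContainer,
      Resource queueUsage, Resource queueMaxShare,
      Resource amResourceUsage, Resource maxAMResource) {
    // Never reserve beyond the queue's maxShare.
    if (!Resources.fitsIn(Resources.add(queueUsage, requested), queueMaxShare)) {
      return false;
    }
    // For AM containers, also respect maxAMShare; otherwise the reservation
    // pins the node while the AM can never actually be launched there.
    if (isAmContainer
        && !Resources.fitsIn(Resources.add(amResourceUsage, requested), maxAMResource)) {
      return false;
    }
    return true;
  }
}
{code}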
[jira] [Updated] (YARN-3645) ResourceManager can't start success if attribute value of aclSubmitApps is null in fair-scheduler.xml
[ https://issues.apache.org/jira/browse/YARN-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Liptak updated YARN-3645: --- Attachment: YARN-3645.patch ResourceManager can't start success if attribute value of aclSubmitApps is null in fair-scheduler.xml Key: YARN-3645 URL: https://issues.apache.org/jira/browse/YARN-3645 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.5.2 Reporter: zhoulinlin Attachments: YARN-3645.patch The aclSubmitApps is configured in fair-scheduler.xml like below: queue name=mr aclSubmitApps/aclSubmitApps /queue The resourcemanager log: 2015-05-14 12:59:48,623 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed to initialize FairScheduler org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed to initialize FairScheduler at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:493) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:920) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:240) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1159) Caused by: java.io.IOException: Failed to initialize FairScheduler at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1301) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1318) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) ... 7 more Caused by: java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:458) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:337) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1299) ... 
9 more 2015-05-14 12:59:48,623 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to standby state 2015-05-14 12:59:48,623 INFO com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory: plugin transitionToStandbyIn 2015-05-14 12:59:48,623 WARN org.apache.hadoop.service.AbstractService: When stopping the service ResourceManager : java.lang.NullPointerException java.lang.NullPointerException at com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory.transitionToStandbyIn(YarnPlatformPluginProxyFactory.java:71) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToStandby(ResourceManager.java:997) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1058) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:171) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1159) 2015-05-14 12:59:48,623 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed to initialize FairScheduler at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:493)
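The failing configuration appears to be an empty element (JIRA formatting strips the angle brackets), i.e. an aclSubmitApps element with no text. An empty element has no text child, so getFirstChild() presumably returns null inside AllocationFileLoaderService.loadQueue and the NullPointerException above follows. A hedged sketch of the kind of null-safe text extraction the fix needs (the helper name is invented):
{code}
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

// Sketch only: treat an empty ACL element as an empty string instead of
// dereferencing a missing text node.
final class AclText {

  private AclText() {
  }

  static String textOrEmpty(Element element) {
    Node child = element.getFirstChild();
    if (child instanceof Text) {
      return ((Text) child).getData().trim();
    }
    return ""; // empty element: behave as if no ACL were specified
  }
}
{code}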
[jira] [Commented] (YARN-3678) DelayedProcessKiller may kill other process other than container
[ https://issues.apache.org/jira/browse/YARN-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551666#comment-14551666 ] Vinod Kumar Vavilapalli commented on YARN-3678: --- The default delay is 250 milliseconds. So it is very hard to hit this condition. At least when LinuxContainerExecutor is used, the kill is done as the user itself, so it's unlikely it will affect other users' processes. Other than also doing a user-check to ensure its the same user's container, I am not sure what else can be done. DelayedProcessKiller may kill other process other than container Key: YARN-3678 URL: https://issues.apache.org/jira/browse/YARN-3678 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: gu-chi Priority: Critical Suppose one container finished, then it will do clean up, the PID file still exist and will trigger once singalContainer, this will kill the process with the pid in PID file, but as container already finished, so this PID may be occupied by other process, this may cause serious issue. As I know, my NM was killed unexpectedly, what I described can be the cause. Even rarely occur. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3645) ResourceManager can't start success if attribute value of aclSubmitApps is null in fair-scheduler.xml
[ https://issues.apache.org/jira/browse/YARN-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551742#comment-14551742 ] Hadoop QA commented on YARN-3645: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 50s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 37s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 47s | The applied patch generated 7 new checkstyle issues (total was 27, now 33). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 15s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 50m 11s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 86m 47s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12734036/YARN-3645.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / ce53c8e | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8014/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8014/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8014/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8014/console | This message was automatically generated. 
ResourceManager can't start success if attribute value of aclSubmitApps is null in fair-scheduler.xml Key: YARN-3645 URL: https://issues.apache.org/jira/browse/YARN-3645 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.5.2 Reporter: zhoulinlin Attachments: YARN-3645.patch The aclSubmitApps is configured in fair-scheduler.xml like below: queue name=mr aclSubmitApps/aclSubmitApps /queue The resourcemanager log: 2015-05-14 12:59:48,623 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed to initialize FairScheduler org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed to initialize FairScheduler at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:493) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:920) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:240) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1159) Caused by: java.io.IOException: Failed to initialize FairScheduler at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1301) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1318) at
[jira] [Commented] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.
[ https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551759#comment-14551759 ] Ming Ma commented on YARN-221: -- Thanks [~xgong]. You raise some valid points about abstraction. Here are my takes on this. It appears the main requirements are:
* There needs to be a cluster-wide default log aggregation policy at the YARN layer. That should be extensible. To change it and add a new policy, it is ok to require an NM restart, given the NM needs to load the policy object.
* Any YARN application can override the default YARN policy with its own log aggregation policy. This application-specific policy can come from the list of available policies provided at the YARN layer. There is no need to provide the ability for the application to submit a new policy implementation on the fly.
Given these:
* Abstraction via an interface seems like a good idea. The ContainerLogAggregationPolicy interface can include the following method to address all the policies that we know of so far. However, it seems we might end up with many policies given the possible permutations, e.g., AMContainerLogAndFailWorkerContainerOnlyLogAggregationPolicy, AMContainerLogAndFailOrKilledWorkerContainerOnlyLogAggregationPolicy, etc.
{noformat}
public interface ContainerLogAggregationPolicy {
  public boolean shouldDoLogAggregation(ContainerId containerId, int exitCode);
}
{noformat}
* The cluster-wide default policy at the YARN layer is configurable.
{noformat}
<property>
  <name>yarn.nodemanager.container-log-aggregation-policy.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.container-log-aggregation-policy.AllContainerLogAggregationPolicy</value>
</property>
{noformat}
* All the known policies will be part of YARN, including SampleRateContainerLogAggregationPolicy. So we still need to configure the sample rate for that policy. If we don't put it in YarnConfiguration, where can we put it? It seems we already have a bunch of configuration properties in YarnConfiguration that are specific to a plugin implementation, such as the container executor properties.
* Should ContainerLogAggregationPolicy be part of ContainerLaunchContext or LogAggregationContext? It seems LogAggregationContext is a better fit. That also means ContainerLogAggregationPolicy will be specified as part of ApplicationSubmissionContext. For an application to specify a log policy, the policy class needs to be loadable by the NM. So LogAggregationContext will have new methods like:
{noformat}
public abstract class LogAggregationContext {
  public void setContainerLogPolicyClass(Class<? extends ContainerLogAggregationPolicy> logPolicy);
  public Class<? extends ContainerLogAggregationPolicy> getContainerLogPolicyClass();
}
{noformat}
* How MR overrides the default policy: maybe we can have YarnRunner at the MR level honor the yarn property yarn.container-log-aggregation-policy.class at the per-job level when it creates the ApplicationSubmissionContext with the proper LogAggregationContext. That way we don't have to create extra log aggregation properties specific to the MR layer.
NM should provide a way for AM to tell it not to aggregate logs. 
Key: YARN-221 URL: https://issues.apache.org/jira/browse/YARN-221 Project: Hadoop YARN Issue Type: Sub-task Components: log-aggregation, nodemanager Reporter: Robert Joseph Evans Assignee: Ming Ma Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, YARN-221-trunk-v3.patch, YARN-221-trunk-v4.patch, YARN-221-trunk-v5.patch The NodeManager should provide a way for an AM to tell it that either the logs should not be aggregated, that they should be aggregated with a high priority, or that they should be aggregated but with a lower priority. The AM should be able to do this in the ContainerLaunch context to provide a default value, but should also be able to update the value when the container is released. This would allow for the NM to not aggregate logs in some cases, and avoid connection to the NN at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
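As a concrete illustration of the interface proposed in the comment above, here is a minimal sketch of one possible policy; the class name FailedContainerOnlyLogAggregationPolicy is hypothetical and is not part of any patch attached to this issue.
{code}
import org.apache.hadoop.yarn.api.records.ContainerId;

// Hypothetical example against the proposed ContainerLogAggregationPolicy
// interface: aggregate logs only for containers that exited abnormally.
public class FailedContainerOnlyLogAggregationPolicy
    implements ContainerLogAggregationPolicy {

  @Override
  public boolean shouldDoLogAggregation(ContainerId containerId, int exitCode) {
    // Exit code 0 means the container succeeded; only keep logs for failures.
    return exitCode != 0;
  }
}
{code}
The cluster-wide default would then be selected by pointing the proposed yarn.nodemanager.container-log-aggregation-policy.class property at such a class.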
[jira] [Created] (YARN-3685) NodeManager unnecessarily knows about classpath-jars due to Windows limitations
Vinod Kumar Vavilapalli created YARN-3685: - Summary: NodeManager unnecessarily knows about classpath-jars due to Windows limitations Key: YARN-3685 URL: https://issues.apache.org/jira/browse/YARN-3685 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Found this while looking at cleaning up ContainerExecutor via YARN-3648, making it a sub-task. YARN *should not* know about classpaths. Our original design was modeled around this. But when we added Windows support, due to classpath issues, we ended up breaking this abstraction via YARN-316. We should clean this up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551706#comment-14551706 ] Hadoop QA commented on YARN-2336: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 59s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 47s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 58s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 3m 0s | Site still builds. | | {color:green}+1{color} | checkstyle | 0m 47s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 17s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 49m 59s | Tests failed in hadoop-yarn-server-resourcemanager. | | | | 93m 19s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12734018/YARN-2336.009.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / 7401e5b | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8013/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8013/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8013/console | This message was automatically generated. Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub queues in Fair Scheduler, REST api returns a missing '[' blacket JSON for childQueues. This issue found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String
[ https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551477#comment-14551477 ] Hudson commented on YARN-3565: -- SUCCESS: Integrated in Hadoop-trunk-Commit #7870 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7870/]) YARN-3565. NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String. (Naganarasimha G R via wangda) (wangda: rev b37da52a1c4fb3da2bd21bfadc5ec61c5f953a59) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/TestYarnServerApiClasses.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdaterForLabels.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/NodeLabelsProvider.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String - Key: YARN-3565 URL: https://issues.apache.org/jira/browse/YARN-3565 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Priority: Blocker Fix For: 2.8.0 Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, YARN-3565.20150516-1.patch, YARN-3565.20150519-1.patch Now NM HB/Register uses SetString, it will be hard to add new fields if we want to support specifying NodeLabel type such as exclusivity/constraints, etc. We need to make sure rolling upgrade works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager
[ https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi Ozawa updated YARN-3677: - Summary: Fix findbugs warnings in yarn-server-resourcemanager (was: Fix findbugs warnings in FileSystemRMStateStore) Fix findbugs warnings in yarn-server-resourcemanager Key: YARN-3677 URL: https://issues.apache.org/jira/browse/YARN-3677 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Akira AJISAKA Assignee: Vinod Kumar Vavilapalli Priority: Minor Labels: newbie Attachments: YARN-3677-20150519.txt There is 1 findbugs warning in FileSystemRMStateStore.java. {noformat} Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of time Unsynchronized access at FileSystemRMStateStore.java: [line 156] Field org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS Synchronized 66% of the time Synchronized access at FileSystemRMStateStore.java: [line 148] Synchronized access at FileSystemRMStateStore.java: [line 859] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
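For readers unfamiliar with this findbugs category, the sketch below shows the two usual ways to make access to such a flag consistent. It is only an illustration under the assumption of a simple boolean field, not the actual YARN-3677 change.
{code}
// Illustrative only; not the YARN-3677 patch.
public class SynchronizationExample {

  // Option 1: declare the flag volatile so unsynchronized reads stay safe.
  private volatile boolean isHDFS;

  public boolean isHDFS() {
    return isHDFS;
  }

  // Option 2: route every read and write through the same lock
  // (here, synchronized methods), so locking is consistent everywhere.
  private boolean secureMode;

  public synchronized boolean isSecureMode() {
    return secureMode;
  }

  public synchronized void setSecureMode(boolean secureMode) {
    this.secureMode = secureMode;
  }
}
{code}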
[jira] [Updated] (YARN-3677) Fix findbugs warnings in FileSystemRMStateStore
[ https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi Ozawa updated YARN-3677: - Summary: Fix findbugs warnings in FileSystemRMStateStore (was: Fix findbugs warnings in yarn-server-resourcemanager) Fix findbugs warnings in FileSystemRMStateStore --- Key: YARN-3677 URL: https://issues.apache.org/jira/browse/YARN-3677 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Akira AJISAKA Assignee: Vinod Kumar Vavilapalli Priority: Minor Labels: newbie Attachments: YARN-3677-20150519.txt There is 1 findbugs warning in FileSystemRMStateStore.java. {noformat} Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of time Unsynchronized access at FileSystemRMStateStore.java: [line 156] Field org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS Synchronized 66% of the time Synchronized access at FileSystemRMStateStore.java: [line 148] Synchronized access at FileSystemRMStateStore.java: [line 859] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager
[ https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551526#comment-14551526 ] Hudson commented on YARN-3677: -- FAILURE: Integrated in Hadoop-trunk-Commit #7872 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7872/]) YARN-3677. Fix findbugs warnings in yarn-server-resourcemanager. Contributed by Vinod Kumar Vavilapalli. (ozawa: rev 7401e5b5e8060b6b027d714b5ceb641fcfe5b598) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java Fix findbugs warnings in yarn-server-resourcemanager Key: YARN-3677 URL: https://issues.apache.org/jira/browse/YARN-3677 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Akira AJISAKA Assignee: Vinod Kumar Vavilapalli Priority: Minor Labels: newbie Fix For: 2.7.1 Attachments: YARN-3677-20150519.txt There is 1 findbugs warning in FileSystemRMStateStore.java. {noformat} Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of time Unsynchronized access at FileSystemRMStateStore.java: [line 156] Field org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS Synchronized 66% of the time Synchronized access at FileSystemRMStateStore.java: [line 148] Synchronized access at FileSystemRMStateStore.java: [line 859] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3687) We should be able to remove node-label if there's no queue can use it.
Wangda Tan created YARN-3687: Summary: We should be able to remove node-label if there's no queue can use it. Key: YARN-3687 URL: https://issues.apache.org/jira/browse/YARN-3687 Project: Hadoop YARN Issue Type: Sub-task Reporter: Wangda Tan Assignee: Wangda Tan Currently, we cannot remove a node label from the cluster even if there is no queue configured to use it, but actually we should be able to remove it if the capacity on that node label in the root queue is 0. This avoids pain when a user wants to reconfigure node labels. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3686) CapacityScheduler should trim default_node_label_expression
[ https://issues.apache.org/jira/browse/YARN-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G reassigned YARN-3686: - Assignee: Sunil G CapacityScheduler should trim default_node_label_expression --- Key: YARN-3686 URL: https://issues.apache.org/jira/browse/YARN-3686 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Sunil G Priority: Critical We should trim default_node_label_expression for queue before using it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3678) DelayedProcessKiller may kill other process other than container
[ https://issues.apache.org/jira/browse/YARN-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551681#comment-14551681 ] gu-chi commented on YARN-3678: -- I see the possibility is low, but with a heavy task load it occurs frequently. I would suggest adding a check before the kill to verify that the process ID still belongs to the container. DelayedProcessKiller may kill other process other than container Key: YARN-3678 URL: https://issues.apache.org/jira/browse/YARN-3678 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: gu-chi Priority: Critical Suppose one container finished; it will then do cleanup, but the PID file still exists and will trigger signalContainer once. This kills the process with the PID from the PID file, but as the container has already finished, that PID may by then be occupied by another process, which may cause serious issues. As far as I know, my NM was killed unexpectedly, and what I described can be the cause, even if it occurs rarely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
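The suggested pre-kill check could look roughly like the sketch below on Linux; the class and method names are hypothetical and this is not code from any attached patch.
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical Linux-only check: before signalling, verify that the pid read
// from the pid file still looks like the container's process.
public final class PidOwnershipCheck {

  public static boolean pidStillBelongsToContainer(String pid, String containerId) {
    Path cmdline = Paths.get("/proc", pid, "cmdline");
    if (!Files.exists(cmdline)) {
      return false; // process already gone, nothing to kill
    }
    try {
      // /proc/<pid>/cmdline is NUL-separated; a container launch command
      // normally embeds the container id in its arguments.
      String cmd = new String(Files.readAllBytes(cmdline)).replace('\0', ' ');
      return cmd.contains(containerId);
    } catch (IOException e) {
      return false; // be conservative: do not signal if we cannot verify
    }
  }
}
{code}
A check like this narrows the window, but PID reuse between the check and the kill is still possible, so it is a mitigation rather than a complete fix.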
[jira] [Updated] (YARN-3626) On Windows localized resources are not moved to the front of the classpath when they should be
[ https://issues.apache.org/jira/browse/YARN-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-3626: -- Attachment: YARN-3626.14.patch On Windows localized resources are not moved to the front of the classpath when they should be -- Key: YARN-3626 URL: https://issues.apache.org/jira/browse/YARN-3626 Project: Hadoop YARN Issue Type: Bug Components: yarn Environment: Windows Reporter: Craig Welch Assignee: Craig Welch Fix For: 2.7.1 Attachments: YARN-3626.0.patch, YARN-3626.11.patch, YARN-3626.14.patch, YARN-3626.4.patch, YARN-3626.6.patch, YARN-3626.9.patch In response to the mapreduce.job.user.classpath.first setting the classpath is ordered differently so that localized resources will appear before system classpath resources when tasks execute. On Windows this does not work because the localized resources are not linked into their final location when the classpath jar is created. To compensate for that localized jar resources are added directly to the classpath generated for the jar rather than being discovered from the localized directories. Unfortunately, they are always appended to the classpath, and so are never preferred over system resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
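A minimal sketch of the ordering rule being described, assuming we simply have the list of localized jars and the system classpath entries in hand; names are illustrative and this is not the attached patch.
{code}
import java.util.ArrayList;
import java.util.List;

// Illustration of the intended behavior: with user-classpath-first enabled,
// localized jars are placed before system entries instead of being appended.
public final class ClasspathOrderingExample {

  public static List<String> buildClasspath(List<String> systemEntries,
      List<String> localizedJars, boolean userClasspathFirst) {
    List<String> cp = new ArrayList<>();
    if (userClasspathFirst) {
      cp.addAll(localizedJars);   // localized resources win on conflicts
      cp.addAll(systemEntries);
    } else {
      cp.addAll(systemEntries);
      cp.addAll(localizedJars);   // default: system resources win
    }
    return cp;
  }
}
{code}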
[jira] [Commented] (YARN-3681) yarn cmd says could not find main class 'queue' in windows
[ https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551749#comment-14551749 ] Craig Welch commented on YARN-3681: --- Tested my own version of this patch yesterday which does the same thing and works, so +1 LGTM yarn cmd says could not find main class 'queue' in windows Key: YARN-3681 URL: https://issues.apache.org/jira/browse/YARN-3681 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.7.0 Environment: Windows Only Reporter: Sumana Sathish Assignee: Varun Saxena Priority: Blocker Labels: windows, yarn-client Attachments: YARN-3681.01.patch, yarncmd.png Attached the screenshot of the command prompt in windows running yarn queue command. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2268) Disallow formatting the RMStateStore when there is an RM running
[ https://issues.apache.org/jira/browse/YARN-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551745#comment-14551745 ] Karthik Kambatla commented on YARN-2268: Given we recommend using the ZK-store when using HA, how about adding this for the ZK-store first, using an ephemeral znode as a lock? We could think of alternate ways for the other stores. How about creating a lock file and declaring it stale after a stipulated period of time? It is a hacky approach, but it might suffice. Disallow formatting the RMStateStore when there is an RM running Key: YARN-2268 URL: https://issues.apache.org/jira/browse/YARN-2268 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.6.0 Reporter: Karthik Kambatla Assignee: Rohith Attachments: 0001-YARN-2268.patch YARN-2131 adds a way to format the RMStateStore. However, it can be a problem if we format the store while an RM is actively using it. It would be nice to fail the format if there is an RM running and using this store. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
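A rough sketch of the ephemeral-znode idea using the plain ZooKeeper client; the lock path and class name are made up for illustration and do not come from any patch here.
{code}
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Hypothetical sketch: whichever process needs exclusive use of the store
// (a running RM, or a format command) tries to create an ephemeral lock node.
public final class FormatLockSketch {

  private static final String LOCK_PATH = "/rmstore/format-lock";

  public static boolean tryAcquireFormatLock(ZooKeeper zk)
      throws KeeperException, InterruptedException {
    try {
      zk.create(LOCK_PATH, new byte[0],
          ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
      return true;
    } catch (KeeperException.NodeExistsException e) {
      return false; // another process already holds the lock
    }
  }
}
{code}
Because the znode is ephemeral it disappears when the owner's session ends, so there is no stale lock to clean up, which is the main advantage over a time-based lock file.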
[jira] [Updated] (YARN-3684) Change ContainerExecutor's primary lifecycle methods to use a more extensible mechanism for passing information.
[ https://issues.apache.org/jira/browse/YARN-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sidharta Seethana updated YARN-3684: Attachment: YARN-3684.002.patch Attaching a (correctly-named) patch with fixes to findbugs issues Change ContainerExecutor's primary lifecycle methods to use a more extensible mechanism for passing information. - Key: YARN-3684 URL: https://issues.apache.org/jira/browse/YARN-3684 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Sidharta Seethana Assignee: Sidharta Seethana Attachments: YARN-3648.001.patch, YARN-3684.002.patch As per description in parent JIRA : Adding additional arguments to key ContainerExecutor methods ( e.g startLocalizer or launchContainer ) would break the existing ContainerExecutor interface and would require changes to all executor implementations in YARN. In order to make this interface less brittle in the future, it would make sense to encapsulate arguments in some kind of a ‘context’ object which could be modified/extended without breaking the ContainerExecutor interface in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
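A minimal sketch of the "context object" pattern described above, assuming a builder-style holder for the launch parameters; the names are illustrative and do not match the classes in the attached patches.
{code}
// Hypothetical: bundle the arguments of a lifecycle method into one
// extensible holder so new fields can be added later without changing
// the ContainerExecutor method signature.
public final class ContainerStartContextSketch {

  private final String user;
  private final String containerId;

  private ContainerStartContextSketch(Builder b) {
    this.user = b.user;
    this.containerId = b.containerId;
  }

  public String getUser() { return user; }
  public String getContainerId() { return containerId; }

  public static final class Builder {
    private String user;
    private String containerId;

    public Builder setUser(String user) { this.user = user; return this; }
    public Builder setContainerId(String id) { this.containerId = id; return this; }

    public ContainerStartContextSketch build() {
      return new ContainerStartContextSketch(this);
    }
  }
}
{code}
A caller would then build the context once and pass a single argument, e.g. new ContainerStartContextSketch.Builder().setUser("alice").setContainerId("container_1").build(), and adding a new field later only touches the context class.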
[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.
[ https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551513#comment-14551513 ] Hudson commented on YARN-3583: -- FAILURE: Integrated in Hadoop-trunk-Commit #7871 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7871/]) YARN-3583. Support of NodeLabel object instead of plain String in YarnClient side. (Sunil G via wangda) (wangda: rev 563eb1ad2ae848a23bbbf32ebfaf107e8fa14e87) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/ReplaceLabelsOnNodeRequestPBImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetNodesToLabelsResponsePBImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetLabelsToNodesResponsePBImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetNodesToLabelsResponse.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetLabelsToNodesResponse.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java Support of NodeLabel object instead of plain String in YarnClient side. --- Key: YARN-3583 URL: https://issues.apache.org/jira/browse/YARN-3583 Project: Hadoop YARN Issue Type: Sub-task Components: client Affects Versions: 2.6.0 Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 0003-YARN-3583.patch, 0004-YARN-3583.patch Similar to YARN-3521, use NodeLabel objects in YarnClient side apis. getLabelsToNodes/getNodeToLabels api's can use NodeLabel object instead of using plain label name. This will help to bring other label details such as Exclusivity to client side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3667) Fix findbugs warning Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
[ https://issues.apache.org/jira/browse/YARN-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551543#comment-14551543 ] Tsuyoshi Ozawa commented on YARN-3667: -- [~zxu] [~leftnoteasy] thank you for taking this issue. I've missed this issue - YARN-3677 has been committed a few minutes ago. Can we close this as duplicated? Fix findbugs warning Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS - Key: YARN-3667 URL: https://issues.apache.org/jira/browse/YARN-3667 Project: Hadoop YARN Issue Type: Bug Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Fix For: 2.8.0 Attachments: YARN-3667.000.patch Fix findbugs warning Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS This findbugs warning is reported at https://builds.apache.org/job/PreCommit-YARN-Build/7956/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect
[ https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551651#comment-14551651 ] Weiwei Yang commented on YARN-3601: --- Thank you [~xgong] Fix UT TestRMFailover.testRMWebAppRedirect -- Key: YARN-3601 URL: https://issues.apache.org/jira/browse/YARN-3601 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, webapp Environment: Red Hat Enterprise Linux Workstation release 6.5 (Santiago) Reporter: Weiwei Yang Assignee: Weiwei Yang Priority: Critical Labels: test Fix For: 2.7.1 Attachments: YARN-3601.001.patch This test case was not working since the commit from YARN-2605. It failed with NPE exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3681) yarn cmd says could not find main class 'queue' in windows
[ https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3681: -- Priority: Blocker (was: Critical) Target Version/s: 2.7.1 (was: 2.8.0) Sounds like a 2.7.1 blocker to me.. yarn cmd says could not find main class 'queue' in windows Key: YARN-3681 URL: https://issues.apache.org/jira/browse/YARN-3681 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.7.0 Environment: Windows Only Reporter: Sumana Sathish Assignee: Varun Saxena Priority: Blocker Labels: windows, yarn-client Attachments: YARN-3681.01.patch, yarncmd.png Attached the screenshot of the command prompt in windows running yarn queue command. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2729) Support script based NodeLabelsProvider Interface in Distributed Node Label Configuration Setup
[ https://issues.apache.org/jira/browse/YARN-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551769#comment-14551769 ] Naganarasimha G R commented on YARN-2729: - Hi [~wangda] [~vinodkv], Any further thoughts on the above comments ? Support script based NodeLabelsProvider Interface in Distributed Node Label Configuration Setup --- Key: YARN-2729 URL: https://issues.apache.org/jira/browse/YARN-2729 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Naganarasimha G R Assignee: Naganarasimha G R Attachments: YARN-2729.20141023-1.patch, YARN-2729.20141024-1.patch, YARN-2729.20141031-1.patch, YARN-2729.20141120-1.patch, YARN-2729.20141210-1.patch, YARN-2729.20150309-1.patch, YARN-2729.20150322-1.patch, YARN-2729.20150401-1.patch, YARN-2729.20150402-1.patch, YARN-2729.20150404-1.patch, YARN-2729.20150517-1.patch Support script based NodeLabelsProvider Interface in Distributed Node Label Configuration Setup . -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3667) Fix findbugs warning Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
[ https://issues.apache.org/jira/browse/YARN-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551426#comment-14551426 ] Wangda Tan commented on YARN-3667: -- +1, will commit it later. Fix findbugs warning Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS - Key: YARN-3667 URL: https://issues.apache.org/jira/browse/YARN-3667 Project: Hadoop YARN Issue Type: Bug Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Fix For: 2.8.0 Attachments: YARN-3667.000.patch Fix findbugs warning Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS This findbugs warning is reported at https://builds.apache.org/job/PreCommit-YARN-Build/7956/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3667) Fix findbugs warning Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
[ https://issues.apache.org/jira/browse/YARN-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3667: - Fix Version/s: 2.8.0 Fix findbugs warning Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS - Key: YARN-3667 URL: https://issues.apache.org/jira/browse/YARN-3667 Project: Hadoop YARN Issue Type: Bug Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Fix For: 2.8.0 Attachments: YARN-3667.000.patch Fix findbugs warning Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS This findbugs warning is reported at https://builds.apache.org/job/PreCommit-YARN-Build/7956/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3609) Move load labels from storage from serviceInit to serviceStart to make it works with RM HA case.
[ https://issues.apache.org/jira/browse/YARN-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551428#comment-14551428 ] Wangda Tan commented on YARN-3609: -- Findbugs warning is tracked by: https://issues.apache.org/jira/browse/YARN-3667 Move load labels from storage from serviceInit to serviceStart to make it works with RM HA case. Key: YARN-3609 URL: https://issues.apache.org/jira/browse/YARN-3609 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3609.1.preliminary.patch, YARN-3609.2.patch, YARN-3609.3.patch Now RMNodeLabelsManager loads label when serviceInit, but RMActiveService.start() is called when RM HA transition happens. We haven't done this before because queue's initialization happens in serviceInit as well, we need make sure labels added to system before init queue, after YARN-2918, we should be able to do this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
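A small sketch of the lifecycle change being described, assuming a service built on Hadoop's AbstractService; the class name and the loadLabelsFromStore method are placeholders, not the actual RMNodeLabelsManager code.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.service.AbstractService;

// Illustration: recovery from the store happens in serviceStart() (run on
// every transition to active) rather than serviceInit() (run only once).
public class LabelManagerSketch extends AbstractService {

  public LabelManagerSketch() {
    super("LabelManagerSketch");
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    // configuration only; no store access here
    super.serviceInit(conf);
  }

  @Override
  protected void serviceStart() throws Exception {
    loadLabelsFromStore(); // deferred until the service actually starts
    super.serviceStart();
  }

  private void loadLabelsFromStore() {
    // placeholder for the recovery logic
  }
}
{code}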
[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager
[ https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551467#comment-14551467 ] Hadoop QA commented on YARN-3677: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | patch | 0m 1s | The patch file was not named according to hadoop's naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute for instructions. | | {color:blue}0{color} | pre-patch | 14m 35s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 33s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 25s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 15s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 50m 3s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 85m 55s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12733965/YARN-3677-20150519.txt | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 7438966 | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8011/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8011/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8011/console | This message was automatically generated. Fix findbugs warnings in yarn-server-resourcemanager Key: YARN-3677 URL: https://issues.apache.org/jira/browse/YARN-3677 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Akira AJISAKA Assignee: Vinod Kumar Vavilapalli Priority: Minor Labels: newbie Attachments: YARN-3677-20150519.txt There is 1 findbugs warning in FileSystemRMStateStore.java. {noformat} Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of time Unsynchronized access at FileSystemRMStateStore.java: [line 156] Field org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS Synchronized 66% of the time Synchronized access at FileSystemRMStateStore.java: [line 148] Synchronized access at FileSystemRMStateStore.java: [line 859] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager
[ https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551494#comment-14551494 ] Tsuyoshi Ozawa commented on YARN-3677: -- +1, committing this shortly. Fix findbugs warnings in yarn-server-resourcemanager Key: YARN-3677 URL: https://issues.apache.org/jira/browse/YARN-3677 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Akira AJISAKA Assignee: Vinod Kumar Vavilapalli Priority: Minor Labels: newbie Attachments: YARN-3677-20150519.txt There is 1 findbugs warning in FileSystemRMStateStore.java. {noformat} Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of time Unsynchronized access at FileSystemRMStateStore.java: [line 156] Field org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS Synchronized 66% of the time Synchronized access at FileSystemRMStateStore.java: [line 148] Synchronized access at FileSystemRMStateStore.java: [line 859] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3684) Change ContainerExecutor's primary lifecycle methods to use a more extensible mechanism for passing information.
[ https://issues.apache.org/jira/browse/YARN-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551566#comment-14551566 ] Sidharta Seethana commented on YARN-3684: - [~vinodkv] , could you please review this patch? Thanks. Change ContainerExecutor's primary lifecycle methods to use a more extensible mechanism for passing information. - Key: YARN-3684 URL: https://issues.apache.org/jira/browse/YARN-3684 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Sidharta Seethana Assignee: Sidharta Seethana Attachments: YARN-3648.001.patch, YARN-3684.002.patch As per description in parent JIRA : Adding additional arguments to key ContainerExecutor methods ( e.g startLocalizer or launchContainer ) would break the existing ContainerExecutor interface and would require changes to all executor implementations in YARN. In order to make this interface less brittle in the future, it would make sense to encapsulate arguments in some kind of a ‘context’ object which could be modified/extended without breaking the ContainerExecutor interface in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3686) CapacityScheduler should trim default_node_label_expression
Wangda Tan created YARN-3686: Summary: CapacityScheduler should trim default_node_label_expression Key: YARN-3686 URL: https://issues.apache.org/jira/browse/YARN-3686 Project: Hadoop YARN Issue Type: Sub-task Reporter: Wangda Tan Priority: Critical We should trim default_node_label_expression for queue before using it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi Ozawa updated YARN-2336: - Attachment: YARN-2336.009.patch Submitting Akira's patch again since YARN-3677 is fixed now. Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub queues in Fair Scheduler, the REST api returns JSON with a missing '[' bracket for childQueues. This issue was found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3645) ResourceManager can't start success if attribute value of aclSubmitApps is null in fair-scheduler.xml
[ https://issues.apache.org/jira/browse/YARN-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551645#comment-14551645 ] Gabor Liptak commented on YARN-3645: I attached a patch (having trouble running unit tests locally ...) ResourceManager can't start success if attribute value of aclSubmitApps is null in fair-scheduler.xml Key: YARN-3645 URL: https://issues.apache.org/jira/browse/YARN-3645 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.5.2 Reporter: zhoulinlin Attachments: YARN-3645.patch The aclSubmitApps is configured in fair-scheduler.xml like below:
<queue name="mr">
  <aclSubmitApps></aclSubmitApps>
</queue>
The resourcemanager log:
2015-05-14 12:59:48,623 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed to initialize FairScheduler
org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed to initialize FairScheduler
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:493)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:920)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:240)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1159)
Caused by: java.io.IOException: Failed to initialize FairScheduler
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1301)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1318)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
... 7 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:458)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:337)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1299)
... 9 more
2015-05-14 12:59:48,623 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to standby state
2015-05-14 12:59:48,623 INFO com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory: plugin transitionToStandbyIn
2015-05-14 12:59:48,623 WARN org.apache.hadoop.service.AbstractService: When stopping the service ResourceManager : java.lang.NullPointerException
java.lang.NullPointerException
at com.zte.zdh.platformplugin.factory.YarnPlatformPluginProxyFactory.transitionToStandbyIn(YarnPlatformPluginProxyFactory.java:71)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToStandby(ResourceManager.java:997)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1058)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:171)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1159)
2015-05-14 12:59:48,623 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
org.apache.hadoop.service.ServiceStateException: java.io.IOException: Failed to initialize FairScheduler
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at
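For illustration, an empty <aclSubmitApps></aclSubmitApps> element has no child text node, so code that reads the element text blindly dereferences null, which matches the NullPointerException in loadQueue above. A hypothetical guard (not the attached patch) could look like this:
{code}
import org.w3c.dom.Element;
import org.w3c.dom.Text;

// Hypothetical helper: read the text of an ACL element, falling back to a
// default when the element is empty instead of throwing an NPE.
public final class AclParsingSketch {

  public static String readAclText(Element element, String defaultValue) {
    Text textNode = (Text) element.getFirstChild();
    if (textNode == null) {
      return defaultValue; // empty element: no text node to read
    }
    return textNode.getData().trim();
  }
}
{code}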
[jira] [Commented] (YARN-3678) DelayedProcessKiller may kill other process other than container
[ https://issues.apache.org/jira/browse/YARN-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551756#comment-14551756 ] gu-chi commented on YARN-3678: -- The PID may not be reused by a process only; it can also be a thread. Linux treats processes and threads the same way, and killing one thread of a process may kill the whole process too. For a thread, starting within 250ms is possible, right? DelayedProcessKiller may kill other process other than container Key: YARN-3678 URL: https://issues.apache.org/jira/browse/YARN-3678 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: gu-chi Priority: Critical Suppose one container finished; it will then do cleanup, but the PID file still exists and will trigger signalContainer once. This kills the process with the PID from the PID file, but as the container has already finished, that PID may by then be occupied by another process, which may cause serious issues. As far as I know, my NM was killed unexpectedly, and what I described can be the cause, even if it occurs rarely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String
[ https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551755#comment-14551755 ] Naganarasimha G R commented on YARN-3565: - Thanks for Reviewing and Committing [~wangda] [~vinodkv] NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String - Key: YARN-3565 URL: https://issues.apache.org/jira/browse/YARN-3565 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Priority: Blocker Fix For: 2.8.0 Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, YARN-3565.20150516-1.patch, YARN-3565.20150519-1.patch Now NM HB/Register uses Set<String>, it will be hard to add new fields if we want to support specifying NodeLabel type such as exclusivity/constraints, etc. We need to make sure rolling upgrade works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3684) Change ContainerExecutor's primary lifecycle methods to use a more extensible mechanism for passing information.
[ https://issues.apache.org/jira/browse/YARN-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551557#comment-14551557 ] Hadoop QA commented on YARN-3684: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 40s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 10 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 22s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 49s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 5s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 9s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 42m 52s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12733995/YARN-3684.002.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / b37da52 | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8012/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8012/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8012/console | This message was automatically generated. Change ContainerExecutor's primary lifecycle methods to use a more extensible mechanism for passing information. - Key: YARN-3684 URL: https://issues.apache.org/jira/browse/YARN-3684 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Sidharta Seethana Assignee: Sidharta Seethana Attachments: YARN-3648.001.patch, YARN-3684.002.patch As per description in parent JIRA : Adding additional arguments to key ContainerExecutor methods ( e.g startLocalizer or launchContainer ) would break the existing ContainerExecutor interface and would require changes to all executor implementations in YARN. In order to make this interface less brittle in the future, it would make sense to encapsulate arguments in some kind of a ‘context’ object which could be modified/extended without breaking the ContainerExecutor interface in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3667) Fix findbugs warning Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
[ https://issues.apache.org/jira/browse/YARN-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551673#comment-14551673 ] zhihai xu commented on YARN-3667: - [~ozawa], yes, please go ahead and close it as duplicated. thanks Fix findbugs warning Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS - Key: YARN-3667 URL: https://issues.apache.org/jira/browse/YARN-3667 Project: Hadoop YARN Issue Type: Bug Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Fix For: 2.8.0 Attachments: YARN-3667.000.patch Fix findbugs warning Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS This findbugs warning is reported at https://builds.apache.org/job/PreCommit-YARN-Build/7956/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.
[ https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551687#comment-14551687 ] Sunil G commented on YARN-3583: --- Thank you very much [~leftnoteasy] for reviewing and committing the same! Support of NodeLabel object instead of plain String in YarnClient side. --- Key: YARN-3583 URL: https://issues.apache.org/jira/browse/YARN-3583 Project: Hadoop YARN Issue Type: Sub-task Components: client Affects Versions: 2.6.0 Reporter: Sunil G Assignee: Sunil G Fix For: 2.8.0 Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 0003-YARN-3583.patch, 0004-YARN-3583.patch Similar to YARN-3521, use NodeLabel objects in YarnClient side apis. getLabelsToNodes/getNodeToLabels api's can use NodeLabel object instead of using plain label name. This will help to bring other label details such as Exclusivity to client side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager
[ https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551465#comment-14551465 ] Tsuyoshi Ozawa commented on YARN-3677: -- Talked with Vinod and Arun offline. I understood that it's necessary change. Fix findbugs warnings in yarn-server-resourcemanager Key: YARN-3677 URL: https://issues.apache.org/jira/browse/YARN-3677 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Akira AJISAKA Assignee: Vinod Kumar Vavilapalli Priority: Minor Labels: newbie Attachments: YARN-3677-20150519.txt There is 1 findbugs warning in FileSystemRMStateStore.java. {noformat} Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of time Unsynchronized access at FileSystemRMStateStore.java: [line 156] Field org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS Synchronized 66% of the time Synchronized access at FileSystemRMStateStore.java: [line 148] Synchronized access at FileSystemRMStateStore.java: [line 859] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3686) CapacityScheduler should trim default_node_label_expression
[ https://issues.apache.org/jira/browse/YARN-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-3686: -- Attachment: 0001-YARN-3686.patch Hi [~leftnoteasy], I would like to take up this problem if that's fine. IMO we can trim the label in CapacitySchedulerConfiguration#getDefaultNodeLabelExpression to avoid this problem. Sharing a patch. Please share your opinion. CapacityScheduler should trim default_node_label_expression --- Key: YARN-3686 URL: https://issues.apache.org/jira/browse/YARN-3686 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Sunil G Priority: Critical Attachments: 0001-YARN-3686.patch We should trim default_node_label_expression for a queue before using it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
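A minimal sketch of the trimming idea from the comment above, assuming the raw configured value is already in hand; this is illustrative only and not the attached 0001-YARN-3686.patch.
{code}
// Illustrative only; the real getter lives in CapacitySchedulerConfiguration.
public final class NodeLabelExpressionTrim {

  public static String normalize(String configuredExpression) {
    if (configuredExpression == null) {
      return null;
    }
    String trimmed = configuredExpression.trim();
    // Treat a whitespace-only expression the same as an unset one.
    return trimmed.isEmpty() ? null : trimmed;
  }
}
{code}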
[jira] [Commented] (YARN-3678) DelayedProcessKiller may kill other process other than container
[ https://issues.apache.org/jira/browse/YARN-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550390#comment-14550390 ] gu-chi commented on YARN-3678: -- I think decreasing the max_pid setting in the OS can increase the possibility of reproducing this; working on it. DelayedProcessKiller may kill other process other than container Key: YARN-3678 URL: https://issues.apache.org/jira/browse/YARN-3678 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: gu-chi Priority: Critical Suppose one container finished; it will then do cleanup, but the PID file still exists and will trigger signalContainer once. This kills the process with the PID from the PID file, but as the container has already finished, that PID may by then be occupied by another process, which may cause serious issues. As far as I know, my NM was killed unexpectedly, and what I described can be the cause, even if it occurs rarely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3679) Add documentation for timeline server filter ordering
[ https://issues.apache.org/jira/browse/YARN-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai reassigned YARN-3679: --- Assignee: Mit Desai Add documentation for timeline server filter ordering - Key: YARN-3679 URL: https://issues.apache.org/jira/browse/YARN-3679 Project: Hadoop YARN Issue Type: Bug Reporter: Mit Desai Assignee: Mit Desai Currently the auth filter is placed before the static user filter by default. After YARN-3624, the filter order is no longer reversed, so the pseudo auth's allow-anonymous config has no effect with both filters loaded in the new order, because the static user will be created before the request is presented to the auth filter. The user can remove the static user filter from the config to make anonymous users work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
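As a concrete illustration of the "remove the static user filter" workaround (an assumption about what the documentation would describe, not its final wording): the static user filter is loaded through hadoop.http.filter.initializers, whose shipped default is org.apache.hadoop.http.lib.StaticUserWebFilter, so overriding that property with an empty value leaves only the auth filter in the chain.
{code}
<!-- core-site.xml (sketch): drop the StaticUserWebFilter so the pseudo auth
     filter can see, and allow, anonymous users. The empty value overrides the
     default of org.apache.hadoop.http.lib.StaticUserWebFilter. -->
<property>
  <name>hadoop.http.filter.initializers</name>
  <value></value>
</property>
{code}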
[jira] [Comment Edited] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550380#comment-14550380 ] Devaraj K edited comment on YARN-41 at 5/19/15 12:53 PM: - Updated the patch with checkstyle fixes. was (Author: devaraj.k): Updated the patch checkstyle fixes. The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41.patch Instead of waiting for the NM expiry, RM should remove and handle the NM, which is shutdown gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API
[ https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550500#comment-14550500 ] Hudson commented on YARN-3541: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2130 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2130/]) YARN-3541. Add version info on timeline service / generic history web UI and REST API. Contributed by Zhijie Shen (xgong: rev 76afd28862c1f27011273659a82cd45903a77170) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/NavBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSController.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelineAbout.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/timeline/TimelineUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebApp.java Add version info on timeline service / generic history web UI and REST API -- Key: YARN-3541 URL: https://issues.apache.org/jira/browse/YARN-3541 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.8.0 Attachments: YARN-3541.1.patch, YARN-3541.2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3630) YARN should suggest a heartbeat interval for applications
[ https://issues.apache.org/jira/browse/YARN-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xianyin Xin updated YARN-3630: -- Attachment: YARN-3630.001.patch.patch Initial patch, with the adaptive heartbeat policy left unimplemented. If we decide to implement a good enough adaptive heartbeat policy, this jira would depend on YARN-3652, where we have enough information about the scheduler's load to determine the heartbeat interval. YARN should suggest a heartbeat interval for applications - Key: YARN-3630 URL: https://issues.apache.org/jira/browse/YARN-3630 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager, scheduler Affects Versions: 2.7.0 Reporter: Zoltán Zvara Assignee: Xianyin Xin Priority: Minor Attachments: YARN-3630.001.patch.patch It seems applications - for example Spark - are currently not adaptive to the RM regarding heartbeat intervals. The RM should be able to suggest a desired heartbeat interval to applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
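Since the adaptive policy itself is left unimplemented in the attached patch, the following is only a hypothetical sketch of what such a policy could look like; the class name, the thresholds, and the idea of deriving a 0-1 load factor from the metrics YARN-3652 would expose are all assumptions.
{code}
public final class HeartbeatIntervalPolicy {
  private final long minIntervalMs;
  private final long maxIntervalMs;

  public HeartbeatIntervalPolicy(long minIntervalMs, long maxIntervalMs) {
    this.minIntervalMs = minIntervalMs;
    this.maxIntervalMs = maxIntervalMs;
  }

  /**
   * @param schedulerLoad a 0.0-1.0 load factor, e.g. derived from the scheduler
   *                      metrics YARN-3652 would expose (assumption)
   * @return the interval the RM would suggest to the application
   */
  public long suggestIntervalMs(double schedulerLoad) {
    double clamped = Math.max(0.0, Math.min(1.0, schedulerLoad));
    // Busier scheduler -> longer interval, i.e. fewer allocate() round trips per second.
    return minIntervalMs + Math.round(clamped * (maxIntervalMs - minIntervalMs));
  }

  public static void main(String[] args) {
    HeartbeatIntervalPolicy policy = new HeartbeatIntervalPolicy(100, 3000);
    System.out.println(policy.suggestIntervalMs(0.1)); // lightly loaded -> 390 ms
    System.out.println(policy.suggestIntervalMs(0.9)); // heavily loaded -> 2710 ms
  }
}
{code}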
[jira] [Updated] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lavkesh Lahngir updated YARN-3591: -- Target Version/s: 2.8.0 (was: 2.7.1) Affects Version/s: (was: 2.6.0) 2.7.0 Resource Localisation on a bad disk causes subsequent containers failure - Key: YARN-3591 URL: https://issues.apache.org/jira/browse/YARN-3591 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Lavkesh Lahngir Assignee: Lavkesh Lahngir Attachments: 0001-YARN-3591.1.patch, 0001-YARN-3591.patch, YARN-3591.2.patch It happens when a resource is localised on a disk and, after localisation, that disk goes bad. The NM keeps paths for localised resources in memory. At the time of a resource request, isResourcePresent(rsrc) will be called, which calls file.exists() on the localised path. In some cases when the disk has gone bad, inodes are still cached and file.exists() returns true, but at the time of reading, the file will not open. Note: file.exists() actually calls stat64 natively, which returns true because it was able to find inode information from the OS. A proposal is to call file.list() on the parent path of the resource, which will call open() natively. If the disk is good it should return an array of paths with length at least 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
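A minimal sketch of the proposed check (not the attached patches): replace the file.exists() call with a listing of the parent directory, which forces a native open() and therefore fails on a bad disk instead of answering from cached inode data.
{code}
import java.io.File;

public class LocalResourceCheck {
  public static boolean isResourcePresent(File localizedPath) {
    File parent = localizedPath.getParentFile();
    if (parent == null) {
      return false;
    }
    // list() opens the directory natively; on a failed disk this returns null
    // instead of the stale "true" that exists() can report from cached inodes.
    String[] children = parent.list();
    if (children == null || children.length == 0) {
      return false;
    }
    for (String child : children) {
      if (child.equals(localizedPath.getName())) {
        return true;
      }
    }
    return false;
  }
}
{code}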
[jira] [Assigned] (YARN-3605) _ as method name may not be supported much longer
[ https://issues.apache.org/jira/browse/YARN-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K reassigned YARN-3605: --- Assignee: Devaraj K _ as method name may not be supported much longer - Key: YARN-3605 URL: https://issues.apache.org/jira/browse/YARN-3605 Project: Hadoop YARN Issue Type: Bug Reporter: Robert Joseph Evans Assignee: Devaraj K I was trying to run the precommit test on my mac under JDK8, and I got the following error related to javadocs. (use of '_' as an identifier might not be supported in releases after Java SE 8) It looks like we need to at least change the method name to not be '_' any more, or possibly replace the HTML generation with something more standard. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
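A minimal illustration of the warning; the renamed method below is just an illustrative spelling, not necessarily the one the project chose.
{code}
public class HamletLikeExample {
  // javac 8 warns: "'_' used as an identifier ... may not be supported in
  // releases after Java SE 8"; later JDKs reject it outright.
  public HamletLikeExample _() {
    return this; // closes the current element in the fluent HTML-generation style
  }

  // A forward-compatible spelling of the same "close element" method.
  public HamletLikeExample __() {
    return this;
  }
}
{code}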
[jira] [Created] (YARN-3678) DelayedProcessKiller may kill other process other than container
gu-chi created YARN-3678: Summary: DelayedProcessKiller may kill other process other than container Key: YARN-3678 URL: https://issues.apache.org/jira/browse/YARN-3678 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: gu-chi Priority: Critical Suppose one container finished and then does cleanup: the PID file still exists and will trigger a signalContainer, which kills the process with the PID from the PID file. But as the container has already finished, that PID may have been reused by another process, and this can cause serious issues. As far as I know, my NM was killed unexpectedly, and what I described can be the cause, even if it occurs only rarely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3624) ApplicationHistoryServer reverses the order of the filters it gets
[ https://issues.apache.org/jira/browse/YARN-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550479#comment-14550479 ] Mit Desai commented on YARN-3624: - Correction: YARN-3679 ApplicationHistoryServer reverses the order of the filters it gets -- Key: YARN-3624 URL: https://issues.apache.org/jira/browse/YARN-3624 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: YARN-3624.patch ApplicationHistoryServer should not alter the order in which it gets the filter chain. Additional filters should be added at the end of the chain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-41: -- Attachment: YARN-41-6.patch Updated the patch checkstyle fixes. The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41.patch Instead of waiting for the NM expiry, RM should remove and handle the NM, which is shutdown gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3679) Add documentation for timeline server filter ordering
Mit Desai created YARN-3679: --- Summary: Add documentation for timeline server filter ordering Key: YARN-3679 URL: https://issues.apache.org/jira/browse/YARN-3679 Project: Hadoop YARN Issue Type: Bug Reporter: Mit Desai Currently the auth filter is placed before the static user filter by default. After YARN-3624, the filter order is no longer reversed, so the pseudo auth's allow-anonymous config has no effect with both filters loaded in the new order, because the static user will be created before the request is presented to the auth filter. The user can remove the static user filter from the config to make anonymous users work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3624) ApplicationHistoryServer reverses the order of the filters it gets
[ https://issues.apache.org/jira/browse/YARN-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550447#comment-14550447 ] Mit Desai commented on YARN-3624: - Filed YARN-2679 ApplicationHistoryServer reverses the order of the filters it gets -- Key: YARN-3624 URL: https://issues.apache.org/jira/browse/YARN-3624 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: YARN-3624.patch ApplicationHistoryServer should not alter the order in which it gets the filter chain. Additional filters should be added at the end of the chain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API
[ https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550529#comment-14550529 ] Hudson commented on YARN-3541: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #190 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/190/]) YARN-3541. Add version info on timeline service / generic history web UI and REST API. Contributed by Zhijie Shen (xgong: rev 76afd28862c1f27011273659a82cd45903a77170) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/timeline/TimelineUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSController.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelineAbout.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/NavBlock.java Add version info on timeline service / generic history web UI and REST API -- Key: YARN-3541 URL: https://issues.apache.org/jira/browse/YARN-3541 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.8.0 Attachments: YARN-3541.1.patch, YARN-3541.2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3679) Add documentation for timeline server filter ordering
[ https://issues.apache.org/jira/browse/YARN-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550614#comment-14550614 ] Hadoop QA commented on YARN-3679: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 2m 54s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 2m 55s | Site still builds. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | | | 6m 15s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12733835/YARN-3679.patch | | Optional Tests | site | | git revision | trunk / de30d66 | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8000/console | This message was automatically generated. Add documentation for timeline server filter ordering - Key: YARN-3679 URL: https://issues.apache.org/jira/browse/YARN-3679 Project: Hadoop YARN Issue Type: Bug Reporter: Mit Desai Assignee: Mit Desai Attachments: YARN-3679.patch Currently the auth filter is before static user filter by default. After YARN-3624, the filter order is no longer reversed. So the pseudo auth's allowing anonymous config is useless with both filters loaded in the new order, because static user will be created before presenting it to auth filter. The user can remove static user filter from the config to get anonymous user work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550558#comment-14550558 ] Hadoop QA commented on YARN-41: --- \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 43s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 9 new or modified test files. | | {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 54s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 16s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 49s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-server-common. | | {color:green}+1{color} | yarn tests | 5m 59s | Tests passed in hadoop-yarn-server-nodemanager. | | {color:green}+1{color} | yarn tests | 50m 15s | Tests passed in hadoop-yarn-server-resourcemanager. | | {color:green}+1{color} | yarn tests | 1m 51s | Tests passed in hadoop-yarn-server-tests. | | | | 99m 0s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-server-resourcemanager | | | Inconsistent synchronization of org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS; locked 66% of time Unsynchronized access at FileSystemRMStateStore.java:66% of time Unsynchronized access at FileSystemRMStateStore.java:[line 156] | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12733802/YARN-41-6.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / de30d66 | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/7998/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html | | hadoop-yarn-server-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7998/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7998/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7998/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | hadoop-yarn-server-tests test log | https://builds.apache.org/job/PreCommit-YARN-Build/7998/artifact/patchprocess/testrun_hadoop-yarn-server-tests.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7998/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7998/console | This message was automatically 
generated. The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41.patch Instead of waiting for the NM expiry, RM should remove and handle the NM, which is shutdown gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3541) Add version info on timeline service / generic history web UI and REST API
[ https://issues.apache.org/jira/browse/YARN-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550589#comment-14550589 ] Hudson commented on YARN-3541: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #200 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/200/]) YARN-3541. Add version info on timeline service / generic history web UI and REST API. Contributed by Zhijie Shen (xgong: rev 76afd28862c1f27011273659a82cd45903a77170) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutPage.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSController.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AboutBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/NavBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelineAbout.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/timeline/TimelineUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/webapp/TimelineWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java Add version info on timeline service / generic history web UI and REST API -- Key: YARN-3541 URL: https://issues.apache.org/jira/browse/YARN-3541 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.8.0 Attachments: YARN-3541.1.patch, YARN-3541.2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)