[jira] [Commented] (YARN-2912) Jersey Tests failing with port in use

2014-12-11 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242485#comment-14242485
 ] 

Steve Loughran commented on YARN-2912:
--

let's try it and see
+1

 Jersey Tests failing with port in use
 -

 Key: YARN-2912
 URL: https://issues.apache.org/jira/browse/YARN-2912
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
 Environment: jenkins on java 8
Reporter: Steve Loughran
Assignee: Varun Saxena
 Fix For: 2.7.0

 Attachments: YARN-2912.patch


 Jersey tests like TestNMWebServicesApps are failing with port in use.
 The Jersey test runner appears to always use the same port unless a system 
 property is set to point to a different one. Every test should really be 
 changing that sysprop in a @Before method.
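 As a rough illustration of the idea above (a sketch only: it assumes the Jersey 1.x test framework honours the {{jersey.test.port}} system property, and the class/method names here are made up, not the actual patch):
 {code}
 import java.io.IOException;
 import java.net.ServerSocket;
 import org.junit.Before;
 
 // Hypothetical base class for web-services tests: pick a free port before
 // each test and point the Jersey test runner at it, so parallel test classes
 // do not collide on a hard-coded port.
 public abstract class PortIsolatedJerseyTestBase {
 
   @Before
   public void pickFreeJerseyTestPort() throws IOException {
     try (ServerSocket socket = new ServerSocket(0)) {
       // Port 0 asks the OS for any free port; publish it via the sysprop
       // that the Jersey test runner is assumed to read.
       System.setProperty("jersey.test.port",
           String.valueOf(socket.getLocalPort()));
     }
   }
 }
 {code}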



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2917) Potential deadlock in AsyncDispatcher when system.exit called in AsyncDispatcher#dispatch and AsyscDispatcher#serviceStop from shutdown hook

2014-12-11 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242519#comment-14242519
 ] 

Rohith commented on YARN-2917:
--

Thanks for your explanation about the necessity of draining events. I do not 
think there are any other side effects. I will upload a new patch soon.

 Potential deadlock in AsyncDispatcher when system.exit called in 
 AsyncDispatcher#dispatch and AsyscDispatcher#serviceStop from shutdown hook
 

 Key: YARN-2917
 URL: https://issues.apache.org/jira/browse/YARN-2917
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Critical
 Attachments: 0001-YARN-2917.patch


 I encountered a scenario where the RM hung while shutting down and kept logging 
 {{2014-12-03 19:32:44,283 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Waiting for AsyncDispatcher to drain.}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2917) Potential deadlock in AsyncDispatcher when system.exit called in AsyncDispatcher#dispatch and AsyscDispatcher#serviceStop from shutdown hook

2014-12-11 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2917:
-
Attachment: 0002-YARN-2917.patch

 Potential deadlock in AsyncDispatcher when system.exit called in 
 AsyncDispatcher#dispatch and AsyscDispatcher#serviceStop from shutdown hook
 

 Key: YARN-2917
 URL: https://issues.apache.org/jira/browse/YARN-2917
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Critical
 Attachments: 0001-YARN-2917.patch, 0002-YARN-2917.patch


 I encountered a scenario where the RM hung while shutting down and kept logging 
 {{2014-12-03 19:32:44,283 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Waiting for AsyncDispatcher to drain.}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2917) Potential deadlock in AsyncDispatcher when system.exit called in AsyncDispatcher#dispatch and AsyscDispatcher#serviceStop from shutdown hook

2014-12-11 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242534#comment-14242534
 ] 

Rohith commented on YARN-2917:
--

Kindly review the attached patch. I have manually verified the problematic 
scenario, as mentioned in my [previous 
comment|https://issues.apache.org/jira/browse/YARN-2917?focusedCommentId=14233971&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14233971], 
by deploying on a 1-node cluster. The RM is able to shut down gracefully by calling the 
ShutdownHook.


Not related to this JIRA: I observed that ResourceManager#rmDispatcher does 
not drain; is it a bug?

 Potential deadlock in AsyncDispatcher when system.exit called in 
 AsyncDispatcher#dispatch and AsyscDispatcher#serviceStop from shutdown hook
 

 Key: YARN-2917
 URL: https://issues.apache.org/jira/browse/YARN-2917
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Critical
 Attachments: 0001-YARN-2917.patch, 0002-YARN-2917.patch


 I encountered a scenario where the RM hung while shutting down and kept logging 
 {{2014-12-03 19:32:44,283 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Waiting for AsyncDispatcher to drain.}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2356) yarn status command for non-existent application/application attempt/container is too verbose

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242549#comment-14242549
 ] 

Hadoop QA commented on YARN-2356:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12682693/0002-YARN-2356.patch
  against trunk revision 390642a.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6084//console

This message is automatically generated.

 yarn status command for non-existent application/application 
 attempt/container is too verbose 
 --

 Key: YARN-2356
 URL: https://issues.apache.org/jira/browse/YARN-2356
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Reporter: Sunil G
Assignee: Sunil G
Priority: Minor
 Attachments: 0001-YARN-2356.patch, 0002-YARN-2356.patch, 
 Yarn-2356.1.patch


 *yarn application -status* or *applicationattempt -status* or *container 
 status* commands can suppress exceptions such as ApplicationNotFound, 
 ApplicationAttemptNotFound and ContainerNotFound for non-existent entries in 
 the RM or History Server. 
 For example, the exception below could be suppressed better:
 sunildev@host-a:~/hadoop/hadoop/bin ./yarn application -status 
 application_1402668848165_0015
 No GC_PROFILE is given. Defaults to medium.
 14/07/25 16:21:45 INFO client.RMProxy: Connecting to ResourceManager at 
 /10.18.40.77:45022
 Exception in thread "main" 
 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
 with id 'application_1402668848165_0015' doesn't exist in RM.
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:285)
 at 
 org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
 at 
 org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:607)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:932)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2099)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2095)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1626)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2093)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at 
 org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
 at 
 org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101)
 at 
 org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:166)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
 at $Proxy12.getApplicationReport(Unknown Source)
 at 
 org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:291)
 at 
 org.apache.hadoop.yarn.client.cli.ApplicationCLI.printApplicationReport(ApplicationCLI.java:428)
 at 
 org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:153)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at 
 org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:76)
 Caused by: 
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException):
  Application with id 'application_1402668848165_0015' doesn't exist in RM.

[jira] [Commented] (YARN-2917) Potential deadlock in AsyncDispatcher when system.exit called in AsyncDispatcher#dispatch and AsyscDispatcher#serviceStop from shutdown hook

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242552#comment-14242552
 ] 

Hadoop QA commented on YARN-2917:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12686586/0002-YARN-2917.patch
  against trunk revision 390642a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 25 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6083//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6083//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6083//console

This message is automatically generated.

 Potential deadlock in AsyncDispatcher when system.exit called in 
 AsyncDispatcher#dispatch and AsyscDispatcher#serviceStop from shutdown hook
 

 Key: YARN-2917
 URL: https://issues.apache.org/jira/browse/YARN-2917
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Critical
 Attachments: 0001-YARN-2917.patch, 0002-YARN-2917.patch


 I encountered a scenario where the RM hung while shutting down and kept logging 
 {{2014-12-03 19:32:44,283 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Waiting for AsyncDispatcher to drain.}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2917) Potential deadlock in AsyncDispatcher when system.exit called in AsyncDispatcher#dispatch and AsyscDispatcher#serviceStop from shutdown hook

2014-12-11 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242564#comment-14242564
 ] 

Rohith commented on YARN-2917:
--

The Findbugs warnings are unrelated to this patch.

 Potential deadlock in AsyncDispatcher when system.exit called in 
 AsyncDispatcher#dispatch and AsyscDispatcher#serviceStop from shutdown hook
 

 Key: YARN-2917
 URL: https://issues.apache.org/jira/browse/YARN-2917
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Critical
 Attachments: 0001-YARN-2917.patch, 0002-YARN-2917.patch


 I encountered a scenario where the RM hung while shutting down and kept logging 
 {{2014-12-03 19:32:44,283 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Waiting for AsyncDispatcher to drain.}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242578#comment-14242578
 ] 

Hudson commented on YARN-2437:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #35 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/35/])
YARN-2437. start-yarn.sh/stop-yarn should give info (Varun Saxena via aw) (aw: 
rev 59cb8b9123fac725660fc7cfbaaad3d1aa3e3bd7)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/bin/start-yarn.sh
* hadoop-yarn-project/hadoop-yarn/bin/stop-yarn.sh


 start-yarn.sh/stop-yarn should give info
 

 Key: YARN-2437
 URL: https://issues.apache.org/jira/browse/YARN-2437
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
  Labels: newbie
 Fix For: 3.0.0

 Attachments: YARN-2437.001.patch, YARN-2437.002.patch, YARN-2437.patch


 With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
 longer prints Starting information.  This should be made more of an analog 
 of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242592#comment-14242592
 ] 

Hudson commented on YARN-2437:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1969 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1969/])
YARN-2437. start-yarn.sh/stop-yarn should give info (Varun Saxena via aw) (aw: 
rev 59cb8b9123fac725660fc7cfbaaad3d1aa3e3bd7)
* hadoop-yarn-project/hadoop-yarn/bin/start-yarn.sh
* hadoop-yarn-project/hadoop-yarn/bin/stop-yarn.sh
* hadoop-yarn-project/CHANGES.txt


 start-yarn.sh/stop-yarn should give info
 

 Key: YARN-2437
 URL: https://issues.apache.org/jira/browse/YARN-2437
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
  Labels: newbie
 Fix For: 3.0.0

 Attachments: YARN-2437.001.patch, YARN-2437.002.patch, YARN-2437.patch


 With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
 longer prints Starting information.  This should be made more of an analog 
 of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2356) yarn status command for non-existent application/application attempt/container is too verbose

2014-12-11 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated YARN-2356:

Target Version/s: 2.7.0

 yarn status command for non-existent application/application 
 attempt/container is too verbose 
 --

 Key: YARN-2356
 URL: https://issues.apache.org/jira/browse/YARN-2356
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Reporter: Sunil G
Assignee: Sunil G
Priority: Minor
 Attachments: 0001-YARN-2356.patch, 0002-YARN-2356.patch, 
 Yarn-2356.1.patch


 *yarn application -status* or *applicationattempt -status* or *container 
 status* commands can suppress exceptions such as ApplicationNotFound, 
 ApplicationAttemptNotFound and ContainerNotFound for non-existent entries in 
 the RM or History Server. 
 For example, the exception below could be suppressed better:
 sunildev@host-a:~/hadoop/hadoop/bin ./yarn application -status 
 application_1402668848165_0015
 No GC_PROFILE is given. Defaults to medium.
 14/07/25 16:21:45 INFO client.RMProxy: Connecting to ResourceManager at 
 /10.18.40.77:45022
 Exception in thread "main" 
 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
 with id 'application_1402668848165_0015' doesn't exist in RM.
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:285)
 at 
 org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
 at 
 org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:607)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:932)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2099)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2095)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1626)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2093)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at 
 org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
 at 
 org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101)
 at 
 org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:166)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
 at $Proxy12.getApplicationReport(Unknown Source)
 at 
 org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:291)
 at 
 org.apache.hadoop.yarn.client.cli.ApplicationCLI.printApplicationReport(ApplicationCLI.java:428)
 at 
 org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:153)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at 
 org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:76)
 Caused by: 
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException):
  Application with id 'application_1402668848165_0015' doesn't exist in RM.
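 A rough sketch of what suppressing the exception better could look like (illustrative only, not the attached patches; the class and method here are made up):
 {code}
 import org.apache.hadoop.yarn.api.records.ApplicationId;
 import org.apache.hadoop.yarn.api.records.ApplicationReport;
 import org.apache.hadoop.yarn.client.api.YarnClient;
 import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;
 
 // Catch the not-found exception in the CLI path and print a one-line message
 // instead of letting the full RPC/reflection stack trace reach the console.
 public class QuietStatusExample {
   public static int printStatus(YarnClient yarnClient, ApplicationId appId)
       throws Exception {
     try {
       ApplicationReport report = yarnClient.getApplicationReport(appId);
       System.out.println("Application " + appId + " is in state "
           + report.getYarnApplicationState());
       return 0;
     } catch (ApplicationNotFoundException e) {
       System.err.println("Application with id '" + appId
           + "' doesn't exist in RM.");
       return -1;
     }
   }
 }
 {code}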



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242666#comment-14242666
 ] 

Hudson commented on YARN-2437:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #39 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/39/])
YARN-2437. start-yarn.sh/stop-yarn should give info (Varun Saxena via aw) (aw: 
rev 59cb8b9123fac725660fc7cfbaaad3d1aa3e3bd7)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/bin/start-yarn.sh
* hadoop-yarn-project/hadoop-yarn/bin/stop-yarn.sh


 start-yarn.sh/stop-yarn should give info
 

 Key: YARN-2437
 URL: https://issues.apache.org/jira/browse/YARN-2437
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
  Labels: newbie
 Fix For: 3.0.0

 Attachments: YARN-2437.001.patch, YARN-2437.002.patch, YARN-2437.patch


 With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
 longer prints Starting information.  This should be made more of an analog 
 of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2946:
-
Attachment: TestYARN2946.java

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a ZK Disconnected event.
 # When ZKRMStateStore#runWithCheck() does wait(zkSessionTimeout) for zkClient to 
 re-establish the ZooKeeper connection via either a SyncConnected or an Expired event, 
 it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2946:
-
Attachment: 0001-YARN-2946.patch

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a ZK Disconnected event.
 # When ZKRMStateStore#runWithCheck() does wait(zkSessionTimeout) for zkClient to 
 re-establish the ZooKeeper connection via either a SyncConnected or an Expired event, 
 it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2437) start-yarn.sh/stop-yarn should give info

2014-12-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242690#comment-14242690
 ] 

Hudson commented on YARN-2437:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1989 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1989/])
YARN-2437. start-yarn.sh/stop-yarn should give info (Varun Saxena via aw) (aw: 
rev 59cb8b9123fac725660fc7cfbaaad3d1aa3e3bd7)
* hadoop-yarn-project/hadoop-yarn/bin/start-yarn.sh
* hadoop-yarn-project/hadoop-yarn/bin/stop-yarn.sh
* hadoop-yarn-project/CHANGES.txt


 start-yarn.sh/stop-yarn should give info
 

 Key: YARN-2437
 URL: https://issues.apache.org/jira/browse/YARN-2437
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scripts
Reporter: Allen Wittenauer
Assignee: Varun Saxena
  Labels: newbie
 Fix For: 3.0.0

 Attachments: YARN-2437.001.patch, YARN-2437.002.patch, YARN-2437.patch


 With the merger and cleanup of the daemon launch code, yarn-daemons.sh no 
 longer prints Starting information.  This should be made more of an analog 
 of start-dfs.sh/stop-dfs.sh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242695#comment-14242695
 ] 

Rohith commented on YARN-2946:
--

I wrote a small program (TestYARN2946.java, attached) to simulate the exact deadlock 
scenario. For easier understanding, it uses the same naming conventions as the classes 
involved in the deadlock, and the same implementation logic. Running TestYARN2946.java 
with the synchronized keyword on updateFencedState() causes the deadlock; after the fix, 
i.e. removing the synchronized keyword, the program runs through its while loop without 
deadlocking. This is only a simulation.

In the attached patch, I have made 2 changes:
# Removed the *synchronized* keyword from updateFencedState().
# Changed the updateFencedState() modifier from public to private, since it is only 
called from notifyStoreOperationFailed().

Kindly review the analysis and attached patch.

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a ZK Disconnected event.
 # When ZKRMStateStore#runWithCheck() does wait(zkSessionTimeout) for zkClient to 
 re-establish the ZooKeeper connection via either a SyncConnected or an Expired event, 
 it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242694#comment-14242694
 ] 

Rohith commented on YARN-2946:
--

Thanks [~varun_saxena] for your suggestion.


 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a ZK Disconnected event.
 # When ZKRMStateStore#runWithCheck() does wait(zkSessionTimeout) for zkClient to 
 re-establish the ZooKeeper connection via either a SyncConnected or an Expired event, 
 it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1935) Security for timeline server

2014-12-11 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242764#comment-14242764
 ] 

Hitesh Shah commented on YARN-1935:
---

[~vinodkv] [~zjshen] Wasn't most of the secure support for timeline with 
respect to application data already introduced in 2.5 and 2.6? If yes, does 
this jira need to be closed out as it confuses users as to whether Timeline 
is/isn't supported in a secure environment?  

 Security for timeline server
 

 Key: YARN-1935
 URL: https://issues.apache.org/jira/browse/YARN-1935
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Arun C Murthy
Assignee: Zhijie Shen
 Attachments: Timeline Security Diagram.pdf, 
 Timeline_Kerberos_DT_ACLs.2.patch, Timeline_Kerberos_DT_ACLs.patch


 Jira to track work to secure the ATS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2912) Jersey Tests failing with port in use

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242771#comment-14242771
 ] 

Hadoop QA commented on YARN-2912:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12685569/YARN-2912.patch
  against trunk revision 390642a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 14 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 70 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6085//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6085//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6085//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-applicationhistoryservice.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6085//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6085//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6085//console

This message is automatically generated.

 Jersey Tests failing with port in use
 -

 Key: YARN-2912
 URL: https://issues.apache.org/jira/browse/YARN-2912
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
 Environment: jenkins on java 8
Reporter: Steve Loughran
Assignee: Varun Saxena
 Fix For: 2.7.0

 Attachments: YARN-2912.patch


 Jersey tests like TestNMWebServicesApps are failing with port in use.
 The Jersey test runner appears to always use the same port unless a system 
 property is set to point to a different one. Every test should really be 
 changing that sysprop in a @Before method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2950) Change message to mandate, not suggest JS requirement on UI

2014-12-11 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242829#comment-14242829
 ] 

Harsh J commented on YARN-2950:
---

The file containing the message is 
{{hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/JQueryUI.java}}.
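
For reference, the kind of one-line wording change being proposed, using the strings quoted in the description below (illustrative only; the exact constant/markup inside JQueryUI.java is not reproduced here):
{code}
// Illustrative only -- the real markup lives in JQueryUI.java as noted above.
public class JsNoticeWording {
  // Current wording (a suggestion):
  static final String CURRENT =
      "This page works best with javascript enabled.";
  // Proposed wording (a mandate), per the issue description:
  static final String PROPOSED =
      "This page will not function without javascript enabled."
      + " Please enable javascript on your browser.";
}
{code}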

 Change message to mandate, not suggest JS requirement on UI
 ---

 Key: YARN-2950
 URL: https://issues.apache.org/jira/browse/YARN-2950
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: webapp
Reporter: Harsh J
Priority: Minor

 Most of YARN's UIs do not work with JavaScript disabled in the browser, because 
 they appear to send back data as JS arrays instead of within the actual HTML 
 content.
 The JQueryUI view prints only a mild warning about this, suggesting that {{This 
 page works best with javascript enabled.}}, when in fact it ought to say 
 {{This page will not function without javascript enabled. Please enable 
 javascript on your browser.}} or something to that effect (more direct).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2950) Change message to mandate, not suggest JS requirement on UI

2014-12-11 Thread Harsh J (JIRA)
Harsh J created YARN-2950:
-

 Summary: Change message to mandate, not suggest JS requirement on 
UI
 Key: YARN-2950
 URL: https://issues.apache.org/jira/browse/YARN-2950
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: webapp
Reporter: Harsh J
Priority: Minor


Most of YARN's UIs do not work with JavaScript disabled in the browser, because 
they appear to send back data as JS arrays instead of within the actual HTML 
content.

The JQueryUI view prints only a mild warning about this, suggesting that {{This page 
works best with javascript enabled.}}, when in fact it ought to say {{This page 
will not function without javascript enabled. Please enable javascript on your 
browser.}} or something to that effect (more direct).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2950) Change message to mandate, not suggest JS requirement on UI

2014-12-11 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated YARN-2950:
--
Labels: newbie  (was: )

 Change message to mandate, not suggest JS requirement on UI
 ---

 Key: YARN-2950
 URL: https://issues.apache.org/jira/browse/YARN-2950
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: webapp
Reporter: Harsh J
Priority: Minor
  Labels: newbie

 Most of YARN's UIs do not work with JavaScript disabled in the browser, because 
 they appear to send back data as JS arrays instead of within the actual HTML 
 content.
 The JQueryUI view prints only a mild warning about this, suggesting that {{This 
 page works best with javascript enabled.}}, when in fact it ought to say 
 {{This page will not function without javascript enabled. Please enable 
 javascript on your browser.}} or something to that effect (more direct).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is > minimumAllocation

2014-12-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242846#comment-14242846
 ] 

Wangda Tan commented on YARN-2637:
--

[~cwelch], I will take a look at this patch today as well.

Thanks,

 maximum-am-resource-percent could be violated when resource of AM is 
 > minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.2.patch, YARN-2637.6.patch, 
 YARN-2637.7.patch, YARN-2637.9.patch


 Currently, the number of AMs in a leaf queue is calculated in the following way:
 {code}
 max_am_resource = queue_max_capacity * maximum_am_resource_percent
 #max_am_number = max_am_resource / minimum_allocation
 #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor
 {code}
 And when a new application is submitted to the RM, it will check whether an app can be 
 activated in the following way:
 {code}
 for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator(); 
      i.hasNext(); ) {
   FiCaSchedulerApp application = i.next();
   
   // Check queue limit
   if (getNumActiveApplications() >= getMaximumActiveApplications()) {
     break;
   }
   
   // Check user limit
   User user = getUser(application.getUser());
   if (user.getActiveApplications() < 
       getMaximumActiveApplicationsPerUser()) {
     user.activateApplication();
     activeApplications.add(application);
     i.remove();
     LOG.info("Application " + application.getApplicationId() +
         " from user: " + application.getUser() + 
         " activated in queue: " + getQueueName());
   }
 }
 {code}
 An example:
 If a queue has capacity = 1G and max_am_resource_percent = 0.2, the maximum 
 resource that AMs can use is 200M. Assuming minimum_allocation = 1M, the number 
 of AMs that can be launched is 200, and if a user uses 5M for each AM 
 (> minimum_allocation), all apps can still be activated, and they will occupy all 
 the resources of the queue instead of only max_am_resource_percent of the queue.
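 Restating the arithmetic of that example as a small self-contained snippet (the numbers are the illustrative ones from the description, not from a real cluster):
 {code}
 // Why counting AMs by minimum_allocation under-accounts for AM resource usage.
 public class AmLimitExample {
   public static void main(String[] args) {
     long queueCapacityMB = 1000;          // queue capacity = 1G
     double maxAmResourcePercent = 0.2;    // maximum-am-resource-percent
     long minimumAllocationMB = 1;         // minimum_allocation = 1M
 
     long maxAmResourceMB = (long) (queueCapacityMB * maxAmResourcePercent); // 200M
     long maxAmNumber = maxAmResourceMB / minimumAllocationMB;               // 200 AMs
 
     long actualAmMB = 5;                  // each AM really asks for 5M (> minimum)
     long actualAmUsageMB = maxAmNumber * actualAmMB;                        // 1000M
 
     System.out.println("AMs may occupy " + actualAmUsageMB + "M of a "
         + queueCapacityMB + "M queue, far above the intended "
         + maxAmResourceMB + "M cap.");
   }
 }
 {code}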



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2950) Change message to mandate, not suggest JS requirement on UI

2014-12-11 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated YARN-2950:
--
Affects Version/s: 2.5.0

 Change message to mandate, not suggest JS requirement on UI
 ---

 Key: YARN-2950
 URL: https://issues.apache.org/jira/browse/YARN-2950
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: webapp
Affects Versions: 2.5.0
Reporter: Harsh J
Priority: Minor
  Labels: newbie

 Most of YARN's UIs do not work with JavaScript disabled in the browser, because 
 they appear to send back data as JS arrays instead of within the actual HTML 
 content.
 The JQueryUI view prints only a mild warning about this, suggesting that {{This 
 page works best with javascript enabled.}}, when in fact it ought to say 
 {{This page will not function without javascript enabled. Please enable 
 javascript on your browser.}} or something to that effect (more direct).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2917) Potential deadlock in AsyncDispatcher when system.exit called in AsyncDispatcher#dispatch and AsyscDispatcher#serviceStop from shutdown hook

2014-12-11 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242853#comment-14242853
 ] 

Jian He commented on YARN-2917:
---

Patch looks good, thanks Rohith!
Committing this.

bq. I observed that ResourceManager#rmDispatcher does not drain; is it a bug?
I think the decision was to drain only the state-store-relevant dispatcher. 
rmDispatcher is a global dispatcher; draining it may take more time depending 
on how busy the cluster is. I think it's still fine to drain the rmDispatcher, 
but we need to evaluate the pros and cons.
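
For readers following along, a self-contained sketch (a hypothetical class, not the actual AsyncDispatcher code) of the hang pattern this issue fixes: the dispatcher thread calls System.exit(), which blocks until all shutdown hooks finish, while the shutdown hook (the serviceStop() analogue) waits for the dispatcher to drain, which can never happen because the dispatcher thread is stuck inside System.exit().
{code}
public class DispatcherExitHang {
  private static volatile boolean drained = false;

  public static void main(String[] args) throws Exception {
    // serviceStop() analogue registered as a shutdown hook: wait for drain.
    Runtime.getRuntime().addShutdownHook(new Thread() {
      @Override
      public void run() {
        while (!drained) {
          System.out.println("Waiting for AsyncDispatcher to drain.");
          try {
            Thread.sleep(1000);
          } catch (InterruptedException ignored) {
          }
        }
      }
    });

    // dispatch() analogue: a fatal event triggers System.exit(), so the line
    // that would mark the queue as drained is never reached.
    Thread dispatcher = new Thread() {
      @Override
      public void run() {
        System.exit(1);
        // drained = true;  // unreachable once exit() is in progress
      }
    };
    dispatcher.start();
    dispatcher.join();
  }
}
{code}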

 Potential deadlock in AsyncDispatcher when system.exit called in 
 AsyncDispatcher#dispatch and AsyscDispatcher#serviceStop from shutdown hook
 

 Key: YARN-2917
 URL: https://issues.apache.org/jira/browse/YARN-2917
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Critical
 Attachments: 0001-YARN-2917.patch, 0002-YARN-2917.patch


 I encountered a scenario where the RM hung while shutting down and kept logging 
 {{2014-12-03 19:32:44,283 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Waiting for AsyncDispatcher to drain.}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2749) Some testcases from TestLogAggregationService fails in trunk

2014-12-11 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242861#comment-14242861
 ] 

Xuan Gong commented on YARN-2749:
-

The -1 on findbugs is unrelated.

 Some testcases from TestLogAggregationService fails in trunk
 

 Key: YARN-2749
 URL: https://issues.apache.org/jira/browse/YARN-2749
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-2749.1.patch, YARN-2749.2.patch, YARN-2749.2.patch


 Some test cases from TestLogAggregationService fail in trunk. 
 They can be reproduced on CentOS.
 Stack Trace:
 java.lang.AssertionError: null
   at org.junit.Assert.fail(Assert.java:86)
   at org.junit.Assert.assertTrue(Assert.java:41)
   at org.junit.Assert.assertTrue(Assert.java:52)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLogAggregationService(TestLogAggregationService.java:1362)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLogAggregationServiceWithRetention(TestLogAggregationService.java:1290)
 Stack Trace:
 java.lang.AssertionError: null
   at org.junit.Assert.fail(Assert.java:86)
   at org.junit.Assert.assertTrue(Assert.java:41)
   at org.junit.Assert.assertTrue(Assert.java:52)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLogAggregationService(TestLogAggregationService.java:1362)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLogAggregationServiceWithRetention(TestLogAggregationService.java:1290)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2951) 20 Findbugs warnings on trunk in hadoop-yarn-server-nodemanager

2014-12-11 Thread Xuan Gong (JIRA)
Xuan Gong created YARN-2951:
---

 Summary: 20 Findbugs warnings on trunk in 
hadoop-yarn-server-nodemanager
 Key: YARN-2951
 URL: https://issues.apache.org/jira/browse/YARN-2951
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2951) 20 Findbugs warnings on trunk in hadoop-yarn-server-nodemanager

2014-12-11 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-2951:

Attachment: FindBugs_Report.html

 20 Findbugs warnings on trunk in hadoop-yarn-server-nodemanager
 ---

 Key: YARN-2951
 URL: https://issues.apache.org/jira/browse/YARN-2951
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong
 Attachments: FindBugs_Report.html


 There are 20 findbugs warnings on trunk. See attached html file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2951) 20 Findbugs warnings on trunk in hadoop-yarn-server-nodemanager

2014-12-11 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-2951:

Description: There are 20 findbugs warnings on trunk. See attached html 
file. 

 20 Findbugs warnings on trunk in hadoop-yarn-server-nodemanager
 ---

 Key: YARN-2951
 URL: https://issues.apache.org/jira/browse/YARN-2951
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong
 Attachments: FindBugs_Report.html


 There are 20 findbugs warnings on trunk. See attached html file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2951) 20 Findbugs warnings on trunk in hadoop-yarn-server-nodemanager

2014-12-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242875#comment-14242875
 ] 

Varun Saxena commented on YARN-2951:


[~xgong], this is a duplicate of YARN-2937

 20 Findbugs warnings on trunk in hadoop-yarn-server-nodemanager
 ---

 Key: YARN-2951
 URL: https://issues.apache.org/jira/browse/YARN-2951
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong
 Attachments: FindBugs_Report.html


 There are 20 findbugs warnings on trunk. See attached html file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2951) 20 Findbugs warnings on trunk in hadoop-yarn-server-nodemanager

2014-12-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242879#comment-14242879
 ] 

Varun Saxena commented on YARN-2951:


JIRAs YARN-2937 to YARN-2940 will address the new findbugs warnings that 
appeared after bumping the findbugs version to 3.0.0.

 20 Findbugs warnings on trunk in hadoop-yarn-server-nodemanager
 ---

 Key: YARN-2951
 URL: https://issues.apache.org/jira/browse/YARN-2951
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong
 Attachments: FindBugs_Report.html


 There are 20 findbugs warnings on trunk. See attached html file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-2951) 20 Findbugs warnings on trunk in hadoop-yarn-server-nodemanager

2014-12-11 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong resolved YARN-2951.
-
Resolution: Duplicate

 20 Findbugs warnings on trunk in hadoop-yarn-server-nodemanager
 ---

 Key: YARN-2951
 URL: https://issues.apache.org/jira/browse/YARN-2951
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong
 Attachments: FindBugs_Report.html


 There are 20 findbugs warnings on trunk. See attached html file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2951) 20 Findbugs warnings on trunk in hadoop-yarn-server-nodemanager

2014-12-11 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242882#comment-14242882
 ] 

Xuan Gong commented on YARN-2951:
-

Yes, it is. Closing this as a duplicate. Thanks, [~varun_saxena].

 20 Findbugs warnings on trunk in hadoop-yarn-server-nodemanager
 ---

 Key: YARN-2951
 URL: https://issues.apache.org/jira/browse/YARN-2951
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong
 Attachments: FindBugs_Report.html


 There are 20 findbugs warnings on trunk. See attached html file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2749) Some testcases from TestLogAggregationService fails in trunk

2014-12-11 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242884#comment-14242884
 ] 

Xuan Gong commented on YARN-2749:
-

The findbugs warnings will be fixed in https://issues.apache.org/jira/browse/YARN-2937

 Some testcases from TestLogAggregationService fails in trunk
 

 Key: YARN-2749
 URL: https://issues.apache.org/jira/browse/YARN-2749
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-2749.1.patch, YARN-2749.2.patch, YARN-2749.2.patch


 Some test cases from TestLogAggregationService fail in trunk. 
 They can be reproduced on CentOS.
 Stack Trace:
 java.lang.AssertionError: null
   at org.junit.Assert.fail(Assert.java:86)
   at org.junit.Assert.assertTrue(Assert.java:41)
   at org.junit.Assert.assertTrue(Assert.java:52)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLogAggregationService(TestLogAggregationService.java:1362)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLogAggregationServiceWithRetention(TestLogAggregationService.java:1290)
 Stack Trace:
 java.lang.AssertionError: null
   at org.junit.Assert.fail(Assert.java:86)
   at org.junit.Assert.assertTrue(Assert.java:41)
   at org.junit.Assert.assertTrue(Assert.java:52)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLogAggregationService(TestLogAggregationService.java:1362)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLogAggregationServiceWithRetention(TestLogAggregationService.java:1290)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-2950) Change message to mandate, not suggest JS requirement on UI

2014-12-11 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote reassigned YARN-2950:
-

Assignee: Dustin Cote

 Change message to mandate, not suggest JS requirement on UI
 ---

 Key: YARN-2950
 URL: https://issues.apache.org/jira/browse/YARN-2950
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: webapp
Affects Versions: 2.5.0
Reporter: Harsh J
Assignee: Dustin Cote
Priority: Minor
  Labels: newbie

 Most of YARN's UIs do not work with JavaScript disabled in the browser, because 
 they appear to send back data as JS arrays instead of within the actual HTML 
 content.
 The JQueryUI view prints only a mild warning about this, suggesting that {{This 
 page works best with javascript enabled.}}, when in fact it ought to say 
 {{This page will not function without javascript enabled. Please enable 
 javascript on your browser.}} or something to that effect (more direct).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2940) Fix new findbugs warnings in rest of the hadoop-yarn components

2014-12-11 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242890#comment-14242890
 ] 

Li Lu commented on YARN-2940:
-

Similar to the patch for YARN-2939, I could not reproduce the test failures 
locally. From the results I can see that all of them are connection related, which 
appears to be unrelated to the changes in this patch.

 Fix new findbugs warnings in rest of the hadoop-yarn components
 ---

 Key: YARN-2940
 URL: https://issues.apache.org/jira/browse/YARN-2940
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Varun Saxena
Assignee: Li Lu
 Attachments: YARN-2940-121014-1.patch, YARN-2940-121014.patch


 Fix findbugs warnings in the following YARN components:
 hadoop-yarn-applications-distributedshell
 hadoop-yarn-applications-unmanaged-am-launcher
 hadoop-yarn-server-web-proxy
 hadoop-yarn-registry
 hadoop-yarn-server-common
 hadoop-yarn-client



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2943) Add a node-labels page in RM web UI

2014-12-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2943:
-
Attachment: YARN-2943.1.patch

And also the patch.

 Add a node-labels page in RM web UI
 ---

 Key: YARN-2943
 URL: https://issues.apache.org/jira/browse/YARN-2943
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: Node-labels-page.png, Nodes-page-with-label-filter.png, 
 YARN-2943.1.patch


 Now we have node labels in the system, but there's no very convenient way to 
 get information like "how many active NMs are assigned to a given label?", "how 
 much total resource is there for a given label?", "for a given label, which queues 
 can access it?", etc.
 It would be better to add a node-labels page in the RM web UI, so users/admins can 
 have a centralized view of such information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2943) Add a node-labels page in RM web UI

2014-12-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2943:
-
Attachment: Nodes-page-with-label-filter.png
Node-labels-page.png

Attached screenshots for review.

 Add a node-labels page in RM web UI
 ---

 Key: YARN-2943
 URL: https://issues.apache.org/jira/browse/YARN-2943
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: Node-labels-page.png, Nodes-page-with-label-filter.png, 
 YARN-2943.1.patch


 Now we have node labels in the system, but there's no very convenient way to 
 get information like "how many active NMs are assigned to a given label?", "how 
 much total resource is there for a given label?", "for a given label, which queues 
 can access it?", etc.
 It would be better to add a node-labels page in the RM web UI, so users/admins can 
 have a centralized view of such information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2932) Add entry for preemption setting to queue status screen and startup/refresh logging

2014-12-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242909#comment-14242909
 ] 

Wangda Tan commented on YARN-2932:
--

bq. Actually, rather than getting the queue's 'preemption-disable' status, I 
think it would make more sense to get the queue's preemption status. So, 
something like getPreemptionStatus. It would return true or false, depending on 
whether the queue is preemptable or not. What do you think?
Makes sense to me.

 Add entry for preemption setting to queue status screen and startup/refresh 
 logging
 ---

 Key: YARN-2932
 URL: https://issues.apache.org/jira/browse/YARN-2932
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 2.7.0
Reporter: Eric Payne
Assignee: Eric Payne

 YARN-2056 enables the ability to turn preemption on or off on a per-queue 
 level. This JIRA will provide the preemption status for each queue in the 
 {{HOST:8088/cluster/scheduler}} UI and in the RM log during startup/queue 
 refresh.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2950) Change message to mandate, not suggest JS requirement on UI

2014-12-11 Thread Joao Salcedo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242929#comment-14242929
 ] 

Joao Salcedo commented on YARN-2950:


Improved messaging about the JavaScript requirement is needed.

 Change message to mandate, not suggest JS requirement on UI
 ---

 Key: YARN-2950
 URL: https://issues.apache.org/jira/browse/YARN-2950
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: webapp
Affects Versions: 2.5.0
Reporter: Harsh J
Assignee: Dustin Cote
Priority: Minor
  Labels: newbie

 Most of YARN's UIs do not work with JavaScript disabled in the browser, because 
 they appear to send back data as JS arrays instead of within the actual HTML 
 content.
 The JQueryUI view prints only a mild warning about this, suggesting that {{This 
 page works best with javascript enabled.}}, when in fact it ought to say 
 {{This page will not function without javascript enabled. Please enable 
 javascript on your browser.}} or something to that effect (more direct).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2912) Jersey Tests failing with port in use

2014-12-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242946#comment-14242946
 ] 

Varun Saxena commented on YARN-2912:


The test failure is unrelated.
The findbugs warnings were introduced by bumping the findbugs version to 3.0.0 and 
will be addressed by different JIRAs (already raised).


 Jersey Tests failing with port in use
 -

 Key: YARN-2912
 URL: https://issues.apache.org/jira/browse/YARN-2912
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
 Environment: jenkins on java 8
Reporter: Steve Loughran
Assignee: Varun Saxena
 Fix For: 2.7.0

 Attachments: YARN-2912.patch


 Jersey tests like TestNMWebServicesApps are failing with port in use.
 The Jersey test runner appears to always use the same port unless a system 
 property is set to point to a different one. Every test should really be 
 changing that sysprop in a @Before method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1244) Missing yarn queue-cli

2014-12-11 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242968#comment-14242968
 ] 

Devaraj K commented on YARN-1244:
-

It was handled in YARN-2647; resolving this as a duplicate of YARN-2647.

 Missing yarn queue-cli
 --

 Key: YARN-1244
 URL: https://issues.apache.org/jira/browse/YARN-1244
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Vinod Kumar Vavilapalli
  Labels: newbie
 Attachments: YARN-1244.1.patch


 We don't have a yarn queue CLI. For now mapred still has one that is 
 working, but we need to move over that functionality to yarn CLI itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1244) Missing yarn queue-cli

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242974#comment-14242974
 ] 

Hadoop QA commented on YARN-1244:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12665668/YARN-1244.1.patch
  against trunk revision 8e9a266.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6087//console

This message is automatically generated.

 Missing yarn queue-cli
 --

 Key: YARN-1244
 URL: https://issues.apache.org/jira/browse/YARN-1244
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Vinod Kumar Vavilapalli
  Labels: newbie
 Attachments: YARN-1244.1.patch


 We don't have a yarn queue CLI. For now mapred still has one that is 
 working, but we need to move over that functionality to yarn CLI itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2917) Potential deadlock in AsyncDispatcher when system.exit called in AsyncDispatcher#dispatch and AsyncDispatcher#serviceStop from shutdown hook

2014-12-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14242988#comment-14242988
 ] 

Hudson commented on YARN-2917:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6698 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6698/])
YARN-2917. Fixed potential deadlock when system.exit is called in 
AsyncDispatcher. Contributed by Rohith Sharmaks (jianhe: rev 
614b6afea450ebb897fbb2519c6f02e13b9bd12d)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java
* hadoop-yarn-project/CHANGES.txt


 Potential deadlock in AsyncDispatcher when system.exit called in 
 AsyncDispatcher#dispatch and AsyncDispatcher#serviceStop from shutdown hook
 

 Key: YARN-2917
 URL: https://issues.apache.org/jira/browse/YARN-2917
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Critical
 Fix For: 2.7.0

 Attachments: 0001-YARN-2917.patch, 0002-YARN-2917.patch


 I encountered a scenario where the RM hung while shutting down and kept logging 
 {{2014-12-03 19:32:44,283 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Waiting for AsyncDispatcher to drain.}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-1244) Missing yarn queue-cli

2014-12-11 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K resolved YARN-1244.
-
Resolution: Duplicate

 Missing yarn queue-cli
 --

 Key: YARN-1244
 URL: https://issues.apache.org/jira/browse/YARN-1244
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Vinod Kumar Vavilapalli
  Labels: newbie
 Attachments: YARN-1244.1.patch


 We don't have a yarn queue CLI. For now mapred still has one that is 
 working, but we need to move over that functionality to yarn CLI itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2920) CapacityScheduler should be notified when labels on nodes changed

2014-12-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2920:
-
Attachment: YARN-2920.4.patch

[~jianhe],
Thanks for your comments, I've updated the patch to address all your 
suggestions. Regarding your comment:

bq. how about containers running on a node without label, and now we are adding 
a label.
Now we will also kill containers on that node; this will be changed after we 
get YARN-2498 in. I added a TODO note at 
{{CapacityScheduler.updateLabelsOnNode}}.

Please kindly review,

Wangda


 CapacityScheduler should be notified when labels on nodes changed
 -

 Key: YARN-2920
 URL: https://issues.apache.org/jira/browse/YARN-2920
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2920.1.patch, YARN-2920.2.patch, YARN-2920.3.patch, 
 YARN-2920.4.patch


 Currently, changes to labels on nodes are only handled by 
 RMNodeLabelsManager, but that is not enough when labels on nodes change:
 - Scheduler should be able to take actions on running containers (like 
 kill/preempt/do-nothing).
 - Used / available capacity in the scheduler should be updated for future 
 planning.
 We need to add a new event to pass such updates to the scheduler.
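
As an aside, a small sketch of the kind of scheduler event the description calls for (hypothetical class and method names, not the actual patch):

{code}
// Sketch only: carry the updated labels per node so the scheduler can react
// (kill/preempt/no-op) and refresh its per-label used/available capacity.
import java.util.Map;
import java.util.Set;
import org.apache.hadoop.yarn.api.records.NodeId;

public class NodeLabelsUpdateSchedulerEvent {
  private final Map<NodeId, Set<String>> updatedNodeToLabels;

  public NodeLabelsUpdateSchedulerEvent(
      Map<NodeId, Set<String>> updatedNodeToLabels) {
    this.updatedNodeToLabels = updatedNodeToLabels;
  }

  public Map<NodeId, Set<String>> getUpdatedNodeToLabels() {
    return updatedNodeToLabels;
  }
}
{code}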



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2943) Add a node-labels page in RM web UI

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243056#comment-14243056
 ] 

Hadoop QA commented on YARN-2943:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12686641/YARN-2943.1.patch
  against trunk revision 8e9a266.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 40 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
  
org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart
  org.apache.hadoop.yarn.server.resourcemanager.TestRM

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6086//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6086//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6086//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6086//console

This message is automatically generated.

 Add a node-labels page in RM web UI
 ---

 Key: YARN-2943
 URL: https://issues.apache.org/jira/browse/YARN-2943
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: Node-labels-page.png, Nodes-page-with-label-filter.png, 
 YARN-2943.1.patch


 Now we have node labels in the system, but there's no very convenient way to 
 get information like: how many active NMs are assigned to a given label? how 
 much total resource is there for a given label? for a given label, which 
 queues can access it? etc.
 It would be better to add a node-labels page in the RM web UI so users/admins 
 can have a centralized view of such information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is > minimumAllocation

2014-12-11 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243065#comment-14243065
 ] 

Craig Welch commented on YARN-2637:
---

I double checked - none of the findbugs warnings are related to my change and 
the tests actually pass on my box with the change - and are unrelated in any 
case, as far as I can see.  There's plenty of chatter on other JIRAs that this 
is related to the jdk/findbugs update... so, I believe these can be ignored.

 maximum-am-resource-percent could be violated when resource of AM is > 
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.2.patch, YARN-2637.6.patch, 
 YARN-2637.7.patch, YARN-2637.9.patch


 Currently, the number of AMs in a leaf queue is calculated in the following way:
 {code}
 max_am_resource = queue_max_capacity * maximum_am_resource_percent
 #max_am_number = max_am_resource / minimum_allocation
 #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor
 {code}
 And when a new application is submitted to the RM, it will check if the app 
 can be activated in the following way:
 {code}
 for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator();
      i.hasNext(); ) {
   FiCaSchedulerApp application = i.next();

   // Check queue limit
   if (getNumActiveApplications() >= getMaximumActiveApplications()) {
     break;
   }

   // Check user limit
   User user = getUser(application.getUser());
   if (user.getActiveApplications() <
       getMaximumActiveApplicationsPerUser()) {
     user.activateApplication();
     activeApplications.add(application);
     i.remove();
     LOG.info("Application " + application.getApplicationId() +
         " from user: " + application.getUser() +
         " activated in queue: " + getQueueName());
   }
 }
 {code}
 An example: if a queue has capacity = 1G and max_am_resource_percent = 0.2, 
 the maximum resource that AMs can use is 200M. Assuming minimum_allocation = 1M, 
 the number of AMs that can be launched is 200, and if a user uses 5M for each 
 AM (> minimum_allocation), all apps can still be activated, and they will 
 occupy all of the queue's resource instead of only max_am_resource_percent of 
 the queue.
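
For illustration only, a fragment sketching the direction the description implies: cap activation by the AM resource actually requested rather than by an AM count derived from minimum_allocation. The surrounding variables and the getAMResource() accessor are assumptions for the sketch, not the committed fix:

{code}
// Illustrative direction only (assumed names, not the committed patch): stop
// activating pending applications once their AMs' requested resource would
// exceed max_am_resource.
Resource maxAMResource =
    Resources.multiply(queueMaxCapacity, maxAMResourcePercent);  // e.g. 1G * 0.2
Resource usedAMResource = Resource.newInstance(0, 0);

for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator(); i.hasNext(); ) {
  FiCaSchedulerApp application = i.next();
  Resource amResource = application.getAMResource();   // assumed accessor

  // Stop as soon as the next AM would push AM usage past the cap.
  if (!Resources.fitsIn(Resources.add(usedAMResource, amResource), maxAMResource)) {
    break;
  }
  Resources.addTo(usedAMResource, amResource);
  activeApplications.add(application);
  i.remove();
}
{code}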



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice

2014-12-11 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2938:
---
Attachment: YARN-2938.002.patch

Kick Jenkins with new patch. Same changes as the previous patch.

 Fix new findbugs warnings in hadoop-yarn-resourcemanager and 
 hadoop-yarn-applicationhistoryservice
 --

 Key: YARN-2938
 URL: https://issues.apache.org/jira/browse/YARN-2938
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Varun Saxena
Assignee: Varun Saxena
 Fix For: 2.7.0

 Attachments: YARN-2938.001.patch, YARN-2938.002.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2937) Fix new findbugs warnings in hadoop-yarn-nodemanager

2014-12-11 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2937:
---
Attachment: YARN-2937.002.patch

Kick Jenkins. Patch same as previous one.

 Fix new findbugs warnings in hadoop-yarn-nodemanager
 

 Key: YARN-2937
 URL: https://issues.apache.org/jira/browse/YARN-2937
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Varun Saxena
Assignee: Varun Saxena
 Fix For: 2.7.0

 Attachments: HADOOP-11373.patch, YARN-2937.001.patch, 
 YARN-2937.002.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2920) CapacityScheduler should be notified when labels on nodes changed

2014-12-11 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243130#comment-14243130
 ] 

Jian He commented on YARN-2920:
---

- getUsedResources -> getUsedResourcesByLabel
- RMNodeLabelsManager constructor can pass rmContext, instead of a separate 
setRMDispatcher method
- AM container is killed as well, should we not kill the am container until the 
max-am-percentage is met, similar to preemption?

 CapacityScheduler should be notified when labels on nodes changed
 -

 Key: YARN-2920
 URL: https://issues.apache.org/jira/browse/YARN-2920
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2920.1.patch, YARN-2920.2.patch, YARN-2920.3.patch, 
 YARN-2920.4.patch


 Currently, changes to labels on nodes are only handled by 
 RMNodeLabelsManager, but that is not enough when labels on nodes change:
 - Scheduler should be able to take actions on running containers (like 
 kill/preempt/do-nothing).
 - Used / available capacity in the scheduler should be updated for future 
 planning.
 We need to add a new event to pass such updates to the scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2003) Support to process Job priority from Submission Context in AppAttemptAddedSchedulerEvent [RM side]

2014-12-11 Thread Ashwin Shankar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243145#comment-14243145
 ] 

Ashwin Shankar commented on YARN-2003:
--

Hi [~sunilg], is there anything that needs to be done on the RM side for app 
priority to work in fair scheduler ?
There is already a patch for app priority from the fair sched side and was 
wondering if anything in this jira is blocking it.

 Support to process Job priority from Submission Context in 
 AppAttemptAddedSchedulerEvent [RM side]
 --

 Key: YARN-2003
 URL: https://issues.apache.org/jira/browse/YARN-2003
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Sunil G
Assignee: Sunil G

 AppAttemptAddedSchedulerEvent should be able to receive the Job Priority from 
 the Submission Context and store it.
 Later this can be used by the Scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2920) CapacityScheduler should be notified when labels on nodes changed

2014-12-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2920:
-
Attachment: YARN-2920.5.patch

bq. getUsedResources -> getUsedResourcesByLabel
And
bq. RMNodeLabelsManager constructor can pass rmContext, instead of a separate 
setRMDispatcher method
Makes sense to me, updated.

bq. AM container is killed as well, should we not kill the am container until 
the max-am-percentage is met, similar to preemption?
This needs updating the internal used resource for LeafQueue/ParentQueue. With 
YARN-2498, containers will not be immediately killed and the preemption policy 
will handle that; the AM is already the last to be killed by the preemption policy.

Thanks,
Wangda

 CapacityScheduler should be notified when labels on nodes changed
 -

 Key: YARN-2920
 URL: https://issues.apache.org/jira/browse/YARN-2920
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2920.1.patch, YARN-2920.2.patch, YARN-2920.3.patch, 
 YARN-2920.4.patch, YARN-2920.5.patch


 Currently, changes to labels on nodes are only handled by 
 RMNodeLabelsManager, but that is not enough when labels on nodes change:
 - Scheduler should be able to take actions on running containers (like 
 kill/preempt/do-nothing).
 - Used / available capacity in the scheduler should be updated for future 
 planning.
 We need to add a new event to pass such updates to the scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2014-12-11 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243171#comment-14243171
 ] 

Craig Welch commented on YARN-2495:
---

Sorry if I'm jumping in late and asking redundant questions as a result, but 
I've gone through the various related jiras and the design documents (incl 
updates) ( and the patch :-) ) and I have some requirements related questions 
as a result.  Just a bit of background to make sure I understand things - it 
appears that we've settled on two different but related features here:

1. To enable node labels to be added or removed for a given node without 
validation against a centralized list of node labels (not on this patch, but 
relevant to the discussion) and
2. To enable node managers to specify their node labels based on local 
configuration and scripting (this patch is specific to that feature).

These are, strictly speaking, orthogonal, but may be used together and will 
provide something more in a 'combined feature'.

A couple things about this feature (2) - I don't believe that it is necessary 
to add the node label configuration to the local configuration (yarn-site) or 
the heartbeat as such to enable configuration of labels for a node from the 
node in a decentralized fashion (e.g. a script on the node saying put these 
labels on me).  This can already be accomplished using the admin cli from a 
script or calling a web service from the node (most likely the former, but 
either is possible...), so I don't think we need this change to support the 
script case, it's already possible to write a script to add a label to a node 
on the fly today without any changes.  To make this dynamic we would need 
feature 1, from the above, but that's not covered in this patch / is a separate 
discussion / and also does not require this change.  Also, I don't see how this 
change allows a script to dynamically configure labels unless it was changing 
the yarn-site or the like (I may have missed it, but I don't see that logic 
here) - and in any case, it would not be necessary to add this logic to support 
that sort of configuration as I pointed out.  Is this all just to support 
putting labels into the node-managers configuration file and introducing them 
that way?  Do we have a solid need for that?  It's not needed for the dynamic 
script case, which is all I've seen discussed here from a requirements 
perspective (putting it into the config file / adding it to the heartbeat is 
implementation, I don't see a requirement for it as such).

In a nutshell - do we need change 2 (this), or do we really just need change 1 
(eliminating validation of labels against a centralized list, at least as a 
configurable option)?


 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
 YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admins to specify labels on each NM; this covers:
 - User can set labels on each NM (by setting yarn-site.xml (YARN-2923) or 
 using the script suggested by [~aw] (YARN-2729))
 - NM will send labels to RM via the ResourceTracker API
 - RM will set labels in NodeLabelManager when the NM registers/updates labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2944) SCMStore/InMemorySCMStore is not currently compatible with ReflectionUtils#newInstance

2014-12-11 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated YARN-2944:
---
Attachment: YARN-2944-trunk-v1.patch

Attached is v1 trunk patch.

 SCMStore/InMemorySCMStore is not currently compatible with 
 ReflectionUtils#newInstance
 --

 Key: YARN-2944
 URL: https://issues.apache.org/jira/browse/YARN-2944
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Chris Trezzo
Assignee: Chris Trezzo
Priority: Minor
 Attachments: YARN-2944-trunk-v1.patch


 Currently the Shared Cache Manager uses ReflectionUtils#newInstance to create 
 the SCMStore service. Unfortunately the SCMStore class does not have a 
 0-argument constructor.
 On startup, the SCM fails with the following:
 {noformat}
 14/12/09 16:10:53 INFO service.AbstractService: Service SharedCacheManager 
 failed in state INITED; cause: java.lang.RuntimeException: 
 java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 java.lang.RuntimeException: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.createSCMStoreService(SharedCacheManager.java:103)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.serviceInit(SharedCacheManager.java:65)
 at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.main(SharedCacheManager.java:156)
 Caused by: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 at java.lang.Class.getConstructor0(Class.java:2763)
 at java.lang.Class.getDeclaredConstructor(Class.java:2021)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)
 ... 4 more
 14/12/09 16:10:53 FATAL sharedcachemanager.SharedCacheManager: Error starting 
 SharedCacheManager
 java.lang.RuntimeException: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.createSCMStoreService(SharedCacheManager.java:103)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.serviceInit(SharedCacheManager.java:65)
 at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.main(SharedCacheManager.java:156)
 Caused by: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 at java.lang.Class.getConstructor0(Class.java:2763)
 at java.lang.Class.getDeclaredConstructor(Class.java:2021)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)
 ... 4 more
 {noformat}
 This JIRA is to add a 0-argument constructor to SCMStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-2168) SCM/Client/NM/Admin protocols

2014-12-11 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo resolved YARN-2168.

Resolution: Fixed

Closing issue as comments have been addressed in other subtasks of YARN-1492.

 SCM/Client/NM/Admin protocols
 -

 Key: YARN-2168
 URL: https://issues.apache.org/jira/browse/YARN-2168
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Chris Trezzo
Assignee: Chris Trezzo
 Attachments: YARN-2168-trunk-v1.patch, YARN-2168-trunk-v2.patch


 This jira is meant to be used to review the main shared cache APIs. They are 
 as follows:
 * ClientSCMProtocol - The protocol between the yarn client and the cache 
 manager. This protocol controls how resources in the cache are claimed and 
 released.
 ** UseSharedCacheResourceRequest
 ** UseSharedCacheResourceResponse
 ** ReleaseSharedCacheResourceRequest
 ** ReleaseSharedCacheResourceResponse
 * SCMAdminProtocol - This is an administrative protocol for the cache 
 manager. It allows administrators to manually trigger cleaner runs.
 ** RunSharedCacheCleanerTaskRequest
 ** RunSharedCacheCleanerTaskResponse
 * NMCacheUploaderSCMProtocol - The protocol between the NodeManager and the 
 cache manager. This allows the NodeManager to coordinate with the cache 
 manager when uploading new resources to the shared cache.
 ** NotifySCMRequest
 ** NotifySCMResponse



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2920) CapacityScheduler should be notified when labels on nodes changed

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243180#comment-14243180
 ] 

Hadoop QA commented on YARN-2920:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12686656/YARN-2920.4.patch
  against trunk revision 614b6af.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 28 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-tools/hadoop-sls 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService
org.apache.hadoop.yarn.server.resourcemanager.scheduler.TestAbstractYarnScheduler

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6088//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6088//artifact/patchprocess/newPatchFindbugsWarningshadoop-sls.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6088//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6088//console

This message is automatically generated.

 CapacityScheduler should be notified when labels on nodes changed
 -

 Key: YARN-2920
 URL: https://issues.apache.org/jira/browse/YARN-2920
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2920.1.patch, YARN-2920.2.patch, YARN-2920.3.patch, 
 YARN-2920.4.patch, YARN-2920.5.patch


 Currently, changes to labels on nodes are only handled by 
 RMNodeLabelsManager, but that is not enough when labels on nodes change:
 - Scheduler should be able to take actions on running containers (like 
 kill/preempt/do-nothing).
 - Used / available capacity in the scheduler should be updated for future 
 planning.
 We need to add a new event to pass such updates to the scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2937) Fix new findbugs warnings in hadoop-yarn-nodemanager

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243187#comment-14243187
 ] 

Hadoop QA commented on YARN-2937:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12686671/YARN-2937.002.patch
  against trunk revision b9f6d0c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6090//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6090//console

This message is automatically generated.

 Fix new findbugs warnings in hadoop-yarn-nodemanager
 

 Key: YARN-2937
 URL: https://issues.apache.org/jira/browse/YARN-2937
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Varun Saxena
Assignee: Varun Saxena
 Fix For: 2.7.0

 Attachments: HADOOP-11373.patch, YARN-2937.001.patch, 
 YARN-2937.002.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2014-12-11 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243197#comment-14243197
 ] 

Craig Welch commented on YARN-2495:
---

So, I see the language around the script-based vs the conf-based provider, etc., 
so I assume that's where the scripting side comes in.  However, it's still not 
clear to me that it's really a good idea to add all of this when there is 
already a way to accomplish the activity with a script - and already ways to 
run scripts from the node manager...


 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
 YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admins to specify labels on each NM; this covers:
 - User can set labels on each NM (by setting yarn-site.xml (YARN-2923) or 
 using the script suggested by [~aw] (YARN-2729))
 - NM will send labels to RM via the ResourceTracker API
 - RM will set labels in NodeLabelManager when the NM registers/updates labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is > minimumAllocation

2014-12-11 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243208#comment-14243208
 ] 

Junping Du commented on YARN-2637:
--

Ok. Thanks for double-check it. I will wait [~leftnoteasy] for more review 
comments and may commit it tomorrow if no future comments.

 maximum-am-resource-percent could be violated when resource of AM is > 
 minimumAllocation
 

 Key: YARN-2637
 URL: https://issues.apache.org/jira/browse/YARN-2637
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Wangda Tan
Assignee: Craig Welch
Priority: Critical
 Attachments: YARN-2637.0.patch, YARN-2637.1.patch, 
 YARN-2637.12.patch, YARN-2637.13.patch, YARN-2637.15.patch, 
 YARN-2637.16.patch, YARN-2637.17.patch, YARN-2637.2.patch, YARN-2637.6.patch, 
 YARN-2637.7.patch, YARN-2637.9.patch


 Currently, the number of AMs in a leaf queue is calculated in the following way:
 {code}
 max_am_resource = queue_max_capacity * maximum_am_resource_percent
 #max_am_number = max_am_resource / minimum_allocation
 #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor
 {code}
 And when a new application is submitted to the RM, it will check if the app 
 can be activated in the following way:
 {code}
 for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator();
      i.hasNext(); ) {
   FiCaSchedulerApp application = i.next();

   // Check queue limit
   if (getNumActiveApplications() >= getMaximumActiveApplications()) {
     break;
   }

   // Check user limit
   User user = getUser(application.getUser());
   if (user.getActiveApplications() <
       getMaximumActiveApplicationsPerUser()) {
     user.activateApplication();
     activeApplications.add(application);
     i.remove();
     LOG.info("Application " + application.getApplicationId() +
         " from user: " + application.getUser() +
         " activated in queue: " + getQueueName());
   }
 }
 {code}
 An example: if a queue has capacity = 1G and max_am_resource_percent = 0.2, 
 the maximum resource that AMs can use is 200M. Assuming minimum_allocation = 1M, 
 the number of AMs that can be launched is 200, and if a user uses 5M for each 
 AM (> minimum_allocation), all apps can still be activated, and they will 
 occupy all of the queue's resource instead of only max_am_resource_percent of 
 the queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2944) SCMStore/InMemorySCMStore is not currently compatible with ReflectionUtils#newInstance

2014-12-11 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243213#comment-14243213
 ] 

Sangjin Lee commented on YARN-2944:
---

Thanks for the patch [~ctrezzo]!

The patch looks good to me. One small nit: the InMemorySCMStore (and SCMStore) 
constructor that takes an AppChecker instance is needed only for unit testing 
purposes, right? If so, can we make it default scope instead of public and mark 
it as visible for testing?
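
For readers following along, a minimal sketch of the constructor arrangement being discussed (hypothetical class names; not the actual SCMStore/InMemorySCMStore API):

{code}
// Rough sketch of the pattern under discussion: ReflectionUtils#newInstance
// needs a public 0-argument constructor, so the constructor that injects the
// checker dependency becomes a package-private, test-only overload marked
// @VisibleForTesting, as suggested above.
import com.google.common.annotations.VisibleForTesting;

public class ExampleSCMStore {

  /** Placeholder for the AppChecker dependency discussed above. */
  interface Checker { }

  private Checker checker;

  /** Required so ReflectionUtils.newInstance(ExampleSCMStore.class, conf) works. */
  public ExampleSCMStore() {
  }

  /** Test-only: default (package-private) scope instead of public. */
  @VisibleForTesting
  ExampleSCMStore(Checker checker) {
    this.checker = checker;
  }
}
{code}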

 SCMStore/InMemorySCMStore is not currently compatible with 
 ReflectionUtils#newInstance
 --

 Key: YARN-2944
 URL: https://issues.apache.org/jira/browse/YARN-2944
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Chris Trezzo
Assignee: Chris Trezzo
Priority: Minor
 Attachments: YARN-2944-trunk-v1.patch


 Currently the Shared Cache Manager uses ReflectionUtils#newInstance to create 
 the SCMStore service. Unfortunately the SCMStore class does not have a 
 0-argument constructor.
 On startup, the SCM fails with the following:
 {noformat}
 14/12/09 16:10:53 INFO service.AbstractService: Service SharedCacheManager 
 failed in state INITED; cause: java.lang.RuntimeException: 
 java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 java.lang.RuntimeException: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.createSCMStoreService(SharedCacheManager.java:103)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.serviceInit(SharedCacheManager.java:65)
 at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.main(SharedCacheManager.java:156)
 Caused by: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 at java.lang.Class.getConstructor0(Class.java:2763)
 at java.lang.Class.getDeclaredConstructor(Class.java:2021)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)
 ... 4 more
 14/12/09 16:10:53 FATAL sharedcachemanager.SharedCacheManager: Error starting 
 SharedCacheManager
 java.lang.RuntimeException: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.createSCMStoreService(SharedCacheManager.java:103)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.serviceInit(SharedCacheManager.java:65)
 at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.main(SharedCacheManager.java:156)
 Caused by: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 at java.lang.Class.getConstructor0(Class.java:2763)
 at java.lang.Class.getDeclaredConstructor(Class.java:2021)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)
 ... 4 more
 {noformat}
 This JIRA is to add a 0-argument constructor to SCMStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2014-12-11 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243217#comment-14243217
 ] 

Allen Wittenauer commented on YARN-2495:


bq.  This can already be accomplished using the admin cli from a script or 
calling a web service from the node

Think about the secure cluster case. A whole new level of complexity is 
required to get this functionality using your proposed method vs. having the NM 
just run the script itself.

bq.  Is this all just to support putting labels into the node-managers 
configuration file and introducing them that way? Do we have a solid need for 
that? 

No, this is so we *don't* have to have hard-coded labels in a file.  If we are 
doing an external software change, we need to be able to reflect that change up 
the chain.  Think rolling upgrade.  Think multiple service owners. 

FWIW, yes, we do have a solid need for this feature.  Almost every ops person I 
talked to has said they'd likely make use of it for the exact use cases I've 
highlighted above. Being able to roll out new versions of Java and make 
scheduling decisions on its installation is *extremely* powerful and useful.

 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
 YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admins to specify labels on each NM; this covers:
 - User can set labels on each NM (by setting yarn-site.xml (YARN-2923) or 
 using the script suggested by [~aw] (YARN-2729))
 - NM will send labels to RM via the ResourceTracker API
 - RM will set labels in NodeLabelManager when the NM registers/updates labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2284) Find missing config options in YarnConfiguration and yarn-default.xml

2014-12-11 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243237#comment-14243237
 ] 

Robert Kanter commented on YARN-2284:
-

Overall looks good, a few minor things:
- For the methods in {{Configuration}} that are only meant for testing, can you 
annotate them as {{@VisibleForTesting}}?
- In {{TestConfigurationFieldsBase.compareConfigurationToXmlFields}} we can use 
a HashSet instead of a TreeSet
- Can you add some Javadoc to the top of {{TestMapreduceConfigFields}} to 
explain what this class is for and how to use it?  It should be clear enough 
that somebody can later make a subclass to check their project, without having 
to look into the other code too much.
- We should turn this into a parent JIRA in HADOOP, and then have separate 
child JIRAs for YARN, MAPREDUCE, and HDFS to add the subclasses for yarn-site, 
mapred-site, and hdfs-site.

Don't worry about the findbugs warnings as long as they're not from this patch; 
they recently upgraded the findbugs version and it's found some new ones -- 
there's a bunch of JIRAs to fix those.
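
For context, a rough sketch of the per-project subclass being asked for; the base-class member names (xmlFilename, configurationClasses, initializeMemberVariables) and packages are assumptions, not the final patch:

{code}
// Illustrative sketch: point the comparison at yarn-default.xml and at
// YarnConfiguration, and let the base class flag properties that exist in
// one place but not the other.
import org.apache.hadoop.conf.TestConfigurationFieldsBase;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class TestYarnConfigurationFields extends TestConfigurationFieldsBase {

  @Override
  public void initializeMemberVariables() {
    xmlFilename = "yarn-default.xml";                             // assumed field
    configurationClasses = new Class[] { YarnConfiguration.class };  // assumed field
  }
}
{code}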

 Find missing config options in YarnConfiguration and yarn-default.xml
 -

 Key: YARN-2284
 URL: https://issues.apache.org/jira/browse/YARN-2284
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.4.1
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor
  Labels: supportability
 Attachments: YARN-2284-04.patch, YARN-2284-05.patch, 
 YARN-2284-06.patch, YARN-2284-07.patch, YARN-2284-08.patch, 
 YARN-2284-09.patch, YARN2284-01.patch, YARN2284-02.patch, YARN2284-03.patch


 YarnConfiguration has one set of properties.  yarn-default.xml has another 
 set of properties.  Ideally, there should be an automatic way to find missing 
 properties in either location.
 This is analogous to MAPREDUCE-5130, but for yarn-default.xml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2284) Find missing config options in YarnConfiguration and yarn-default.xml

2014-12-11 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243241#comment-14243241
 ] 

Robert Kanter commented on YARN-2284:
-

Sorry, those should be yarn-default, mapred-default, and hdfs-default; not 
*-site.

 Find missing config options in YarnConfiguration and yarn-default.xml
 -

 Key: YARN-2284
 URL: https://issues.apache.org/jira/browse/YARN-2284
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.4.1
Reporter: Ray Chiang
Assignee: Ray Chiang
Priority: Minor
  Labels: supportability
 Attachments: YARN-2284-04.patch, YARN-2284-05.patch, 
 YARN-2284-06.patch, YARN-2284-07.patch, YARN-2284-08.patch, 
 YARN-2284-09.patch, YARN2284-01.patch, YARN2284-02.patch, YARN2284-03.patch


 YarnConfiguration has one set of properties.  yarn-default.xml has another 
 set of properties.  Ideally, there should be an automatic way to find missing 
 properties in either location.
 This is analogous to MAPREDUCE-5130, but for yarn-default.xml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243268#comment-14243268
 ] 

Hadoop QA commented on YARN-2938:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12686669/YARN-2938.002.patch
  against trunk revision b9f6d0c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6089//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6089//console

This message is automatically generated.

 Fix new findbugs warnings in hadoop-yarn-resourcemanager and 
 hadoop-yarn-applicationhistoryservice
 --

 Key: YARN-2938
 URL: https://issues.apache.org/jira/browse/YARN-2938
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Varun Saxena
Assignee: Varun Saxena
 Fix For: 2.7.0

 Attachments: YARN-2938.001.patch, YARN-2938.002.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2944) SCMStore/InMemorySCMStore is not currently compatible with ReflectionUtils#newInstance

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243280#comment-14243280
 ] 

Hadoop QA commented on YARN-2944:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12686681/YARN-2944-trunk-v1.patch
  against trunk revision 0bcea11.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager:

  
org.apache.hadoop.yarn.server.sharedcachemanager.TestSharedCacheUploaderService
  
org.apache.hadoop.yarn.server.sharedcachemanager.TestClientSCMProtocolService

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6092//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6092//console

This message is automatically generated.

 SCMStore/InMemorySCMStore is not currently compatible with 
 ReflectionUtils#newInstance
 --

 Key: YARN-2944
 URL: https://issues.apache.org/jira/browse/YARN-2944
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Chris Trezzo
Assignee: Chris Trezzo
Priority: Minor
 Attachments: YARN-2944-trunk-v1.patch


 Currently the Shared Cache Manager uses ReflectionUtils#newInstance to create 
 the SCMStore service. Unfortunately the SCMStore class does not have a 
 0-argument constructor.
 On startup, the SCM fails with the following:
 {noformat}
 14/12/09 16:10:53 INFO service.AbstractService: Service SharedCacheManager 
 failed in state INITED; cause: java.lang.RuntimeException: 
 java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 java.lang.RuntimeException: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.createSCMStoreService(SharedCacheManager.java:103)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.serviceInit(SharedCacheManager.java:65)
 at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.main(SharedCacheManager.java:156)
 Caused by: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 at java.lang.Class.getConstructor0(Class.java:2763)
 at java.lang.Class.getDeclaredConstructor(Class.java:2021)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)
 ... 4 more
 14/12/09 16:10:53 FATAL sharedcachemanager.SharedCacheManager: Error starting 
 SharedCacheManager
 java.lang.RuntimeException: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.createSCMStoreService(SharedCacheManager.java:103)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.serviceInit(SharedCacheManager.java:65)
 at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.main(SharedCacheManager.java:156)
 Caused by: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.<init>()
 at java.lang.Class.getConstructor0(Class.java:2763)
 at 

[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2014-12-11 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243283#comment-14243283
 ] 

Craig Welch commented on YARN-2495:
---

Other comments on the patch as such, assuming we really do need this part of 
the change...

DECENTRALIZED_CONFIGURATION_ENABLED et al. - I do see the basis for enabling 
and disabling the enforcement of a centralized list and discussion around that, 
but I don't see any reason to have conditional enablement of the node manager 
side of things as well as a provider specification and I think it just adds 
unnecessary complexity and possible surprise at configuration time - at the 
level of this configuration I think it should just be enabled (and I don't mean 
just by default, I mean if we add this, it should just be a way to manage node 
labels, not conditionally enabled or disabled, any more than the web service or 
cli are conditionally enabled or disabled, and so we don't have this parameter 
/ its associated branching at all).

I think the default node labels provider service should definitely be a null 
provider that always returns an empty list and areLabelsUpdated false - this 
takes out the need to decide which is default, a no-op one is, and it allows 
us to get rid of the extra enabled/disabled configuration above without adding 
a new configuration (the provider will be specified anyway if the feature is 
going to be used)

NodeHeartbeatRequest - 

isNodeLabelsUpdated - I would go with areNodeLabelsSet (all isNodeLabels = 
areNodeLabels wherever it appears, actually)  - wrt Set vs Updated - this 
is primarily a workaround for the null/empty ambiguity and I think this name 
better reflects what is really going on (am I sending a value to act on or 
not), but I also think that this is a better contract, the receiver (rm) 
shouldn't really care about the logic the nm side is using to decide whether or 
not to set its labels (freshness, updatedness, whatever), so all that should 
be communicated in the api is whether or not the value is set, not whether it's 
an update/whether it's checking freshness, etc.  that's a nit, but I think it's 
a clearer name.

RegisterNodeLabelManagerResponse - get/set IsNodeLabelsAcceptedByRM - I would 
make it get/set AreNodeLabelsAcceptedByRM (and on impls, etc., of course)

RegisterNodeManagerRequest - missing spaces in args (l 42)  also, assuming we 
drop the distributed on/off config as I'm suggesting, you'll need the 
areNodeLabelsSet to be passed here as well.  (I also like this better because 
it harmonizes the api between registration and heartbeat, which is easier to 
understand b/c they are doing the same thing / should do it the same way).
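
To make the "null provider" default suggested above concrete, a minimal sketch (hypothetical class and method names, not the actual patch):

{code}
// Minimal sketch: a provider that never contributes labels and never reports
// an update, so the distributed-configuration path stays inert unless a real
// provider is configured.
import java.util.Collections;
import java.util.Set;

public class NullNodeLabelsProvider {

  public Set<String> getNodeLabels() {
    return Collections.emptySet();   // no labels for this node
  }

  public boolean areLabelsUpdated() {
    return false;                    // heartbeat never signals a change
  }
}
{code}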


 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
 YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admins to specify labels on each NM; this covers:
 - User can set labels on each NM (by setting yarn-site.xml (YARN-2923) or 
 using the script suggested by [~aw] (YARN-2729))
 - NM will send labels to RM via the ResourceTracker API
 - RM will set labels in NodeLabelManager when the NM registers/updates labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice

2014-12-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243289#comment-14243289
 ] 

Varun Saxena commented on YARN-2938:


[~zjshen], kindly review.

 Fix new findbugs warnings in hadoop-yarn-resourcemanager and 
 hadoop-yarn-applicationhistoryservice
 --

 Key: YARN-2938
 URL: https://issues.apache.org/jira/browse/YARN-2938
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Varun Saxena
Assignee: Varun Saxena
 Fix For: 2.7.0

 Attachments: YARN-2938.001.patch, YARN-2938.002.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2937) Fix new findbugs warnings in hadoop-yarn-nodemanager

2014-12-11 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-2937:
---
Attachment: YARN-2937.003.patch

 Fix new findbugs warnings in hadoop-yarn-nodemanager
 

 Key: YARN-2937
 URL: https://issues.apache.org/jira/browse/YARN-2937
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Varun Saxena
Assignee: Varun Saxena
 Fix For: 2.7.0

 Attachments: HADOOP-11373.patch, YARN-2937.001.patch, 
 YARN-2937.002.patch, YARN-2937.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243320#comment-14243320
 ] 

Hadoop QA commented on YARN-2946:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12686604/0001-YARN-2946.patch
  against trunk revision 0bcea11.

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6093//console

This message is automatically generated.

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a zk disconnected event.
 # When ZKRMStateStore#runWithCheck() waits (zkSessionTimeout) for zkClient to 
 re-establish the zookeeper connection via either a SyncConnected or an Expired 
 event, it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2920) CapacityScheduler should be notified when labels on nodes changed

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243338#comment-14243338
 ] 

Hadoop QA commented on YARN-2920:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12686676/YARN-2920.5.patch
  against trunk revision 0bcea11.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 28 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-tools/hadoop-sls 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService
org.apache.hadoop.yarn.server.resourcemanager.scheduler.TestAbstractYarnScheduler

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6091//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6091//artifact/patchprocess/newPatchFindbugsWarningshadoop-sls.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6091//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6091//console

This message is automatically generated.

 CapacityScheduler should be notified when labels on nodes changed
 -

 Key: YARN-2920
 URL: https://issues.apache.org/jira/browse/YARN-2920
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2920.1.patch, YARN-2920.2.patch, YARN-2920.3.patch, 
 YARN-2920.4.patch, YARN-2920.5.patch


 Currently, label changes on nodes will only be handled by 
 RMNodeLabelsManager, but that is not enough when labels on nodes change:
 - Scheduler should be able to take actions on running containers (like 
 kill/preempt/do-nothing).
 - Used / available capacity in the scheduler should be updated for future 
 planning.
 We need to add a new event to pass such updates to the scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice

2014-12-11 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243358#comment-14243358
 ] 

Zhijie Shen commented on YARN-2938:
---

will review

 Fix new findbugs warnings in hadoop-yarn-resourcemanager and 
 hadoop-yarn-applicationhistoryservice
 --

 Key: YARN-2938
 URL: https://issues.apache.org/jira/browse/YARN-2938
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Varun Saxena
Assignee: Varun Saxena
 Fix For: 2.7.0

 Attachments: YARN-2938.001.patch, YARN-2938.002.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2944) SCMStore/InMemorySCMStore is not currently compatible with ReflectionUtils#newInstance

2014-12-11 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243365#comment-14243365
 ] 

Chris Trezzo commented on YARN-2944:


Thanks Sangjin. Ack, will adjust scope of the InMemorySCMStore constructor and 
fix the unit tests.

 SCMStore/InMemorySCMStore is not currently compatible with 
 ReflectionUtils#newInstance
 --

 Key: YARN-2944
 URL: https://issues.apache.org/jira/browse/YARN-2944
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Chris Trezzo
Assignee: Chris Trezzo
Priority: Minor
 Attachments: YARN-2944-trunk-v1.patch


 Currently the Shared Cache Manager uses ReflectionUtils#newInstance to create 
 the SCMStore service. Unfortunately the SCMStore class does not have a 
 0-argument constructor.
 On startup, the SCM fails with the following:
 {noformat}
 14/12/09 16:10:53 INFO service.AbstractService: Service SharedCacheManager 
 failed in state INITED; cause: java.lang.RuntimeException: 
 java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.init()
 java.lang.RuntimeException: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.init()
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.createSCMStoreService(SharedCacheManager.java:103)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.serviceInit(SharedCacheManager.java:65)
 at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.main(SharedCacheManager.java:156)
 Caused by: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.init()
 at java.lang.Class.getConstructor0(Class.java:2763)
 at java.lang.Class.getDeclaredConstructor(Class.java:2021)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)
 ... 4 more
 14/12/09 16:10:53 FATAL sharedcachemanager.SharedCacheManager: Error starting 
 SharedCacheManager
 java.lang.RuntimeException: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.init()
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.createSCMStoreService(SharedCacheManager.java:103)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.serviceInit(SharedCacheManager.java:65)
 at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 at 
 org.apache.hadoop.yarn.server.sharedcachemanager.SharedCacheManager.main(SharedCacheManager.java:156)
 Caused by: java.lang.NoSuchMethodException: 
 org.apache.hadoop.yarn.server.sharedcachemanager.store.InMemorySCMStore.init()
 at java.lang.Class.getConstructor0(Class.java:2763)
 at java.lang.Class.getDeclaredConstructor(Class.java:2021)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:125)
 ... 4 more
 {noformat}
 This JIRA is to add a 0-argument constructor to SCMStore.
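 For context, a minimal sketch of the constraint (using a hypothetical stand-in class, not the real SCMStore): ReflectionUtils#newInstance reflects on the declared 0-argument constructor, so the class it instantiates must provide one.
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.util.ReflectionUtils;

 // Hypothetical stand-in for a store class; the no-argument constructor is
 // exactly what ReflectionUtils#newInstance looks up via reflection.
 class DemoStore {
   DemoStore() { }
 }

 public class NewInstanceSketch {
   public static void main(String[] args) {
     Configuration conf = new Configuration();
     DemoStore store = ReflectionUtils.newInstance(DemoStore.class, conf);
     System.out.println(store.getClass().getName());
   }
 }
 {code}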



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243364#comment-14243364
 ] 

Jian He commented on YARN-2946:
---

looks good,  +1

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a zk disconnected event.
 # When ZKRMStateStore#runWithCheck() waits (zkSessionTimeout) for zkClient to 
 re-establish the zookeeper connection via either a SyncConnected or an Expired 
 event, it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2943) Add a node-labels page in RM web UI

2014-12-11 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243371#comment-14243371
 ] 

Jian He commented on YARN-2943:
---

- {{LOG.info(yyy);}} ?
- typo {{// Nodes will show need include Non-empty label filter}} 
- If I restart NM, “num of active NMs” is incorrect; this is probably because 
NM is using ephemeral ports; 
- “NO_LABEL” - “N/A”, similarly, update the previous nodes page to show 
“N/A” for not labeled nodes;
- usability: num of active NMs link can point to a filtered list of nodes by 
label ?
- pullRMNodeLabelsInfo - it’s calculated on demand and each time loops over all the 
nodes in the cluster; we can probably promote Label to a separate class and 
internally bookkeep the number of NMs (a rough sketch follows below).
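
A rough sketch of that bookkeeping idea (class and method names are illustrative, not existing code):
{code}
import java.util.HashMap;
import java.util.Map;

// Keep a per-label record that is updated as NMs register/unregister,
// instead of recomputing the count by looping over all nodes on demand.
class LabelInfo {
  private final String name;
  private int numActiveNMs;

  LabelInfo(String name) { this.name = name; }

  void nmAdded()        { numActiveNMs++; }
  void nmRemoved()      { numActiveNMs--; }
  int getNumActiveNMs() { return numActiveNMs; }
  String getName()      { return name; }
}

public class LabelBookkeepingSketch {
  public static void main(String[] args) {
    Map<String, LabelInfo> labels = new HashMap<String, LabelInfo>();
    LabelInfo gpu = labels.get("gpu");
    if (gpu == null) {
      gpu = new LabelInfo("gpu");
      labels.put("gpu", gpu);
    }
    gpu.nmAdded();
    gpu.nmAdded();
    System.out.println(labels.get("gpu").getNumActiveNMs()); // prints 2
  }
}
{code}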

 Add a node-labels page in RM web UI
 ---

 Key: YARN-2943
 URL: https://issues.apache.org/jira/browse/YARN-2943
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: Node-labels-page.png, Nodes-page-with-label-filter.png, 
 YARN-2943.1.patch


 Now we have node labels in the system, but there's no very convenient way to 
 get information like how many active NM(s) are assigned to a given label?, 
 how much total resource is there for a given label?, for a given label, 
 which queues can access it?, etc.
 It would be better to add a node-labels page in the RM web UI, so users/admins 
 can have a centralized view of such information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243378#comment-14243378
 ] 

Rohith commented on YARN-2946:
--

Thanks [~jianhe] for reviewing the analysis and patch. It seems there is some 
compilation error; I will take a look at this and update the patch.

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a zk disconnected event.
 # When ZKRMStateStore#runWithCheck() waits (zkSessionTimeout) for zkClient to 
 re-establish the zookeeper connection via either a SyncConnected or an Expired 
 event, it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2937) Fix new findbugs warnings in hadoop-yarn-nodemanager

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243381#comment-14243381
 ] 

Hadoop QA commented on YARN-2937:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12686704/YARN-2937.003.patch
  against trunk revision 0bcea11.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6094//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6094//console

This message is automatically generated.

 Fix new findbugs warnings in hadoop-yarn-nodemanager
 

 Key: YARN-2937
 URL: https://issues.apache.org/jira/browse/YARN-2937
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Varun Saxena
Assignee: Varun Saxena
 Fix For: 2.7.0

 Attachments: HADOOP-11373.patch, YARN-2937.001.patch, 
 YARN-2937.002.patch, YARN-2937.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243384#comment-14243384
 ] 

Rohith commented on YARN-2946:
--

I had changed the method modifier from public to private, causing the compilation 
error, but the method is directly used from the test. I will update the patch 
without modifying method modifiers.

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a zk disconnected event.
 # When ZKRMStateStore#runWithCheck() waits (zkSessionTimeout) for zkClient to 
 re-establish the zookeeper connection via either a SyncConnected or an Expired 
 event, it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243385#comment-14243385
 ] 

Hadoop QA commented on YARN-2946:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12686604/0001-YARN-2946.patch
  against trunk revision 0bcea11.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6095//console

This message is automatically generated.

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a zk disconnected event.
 # When ZKRMStateStore#runWithCheck() waits (zkSessionTimeout) for zkClient to 
 re-establish the zookeeper connection via either a SyncConnected or an Expired 
 event, it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243391#comment-14243391
 ] 

Varun Saxena commented on YARN-2946:


[~rohithsharma], making the method private makes sense. You can probably use 
VisibleForTesting annotation.

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a zk disconnected event.
 # When ZKRMStateStore#runWithCheck() waits (zkSessionTimeout) for zkClient to 
 re-establish the zookeeper connection via either a SyncConnected or an Expired 
 event, it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2920) CapacityScheduler should be notified when labels on nodes changed

2014-12-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2920:
-
Attachment: YARN-2920.6.patch

Fixed the test failure (TestResourceTracker is related, but the 
TestAbstractYarnScheduler failure cannot be reproduced locally).

Findbugs warning not related.

 CapacityScheduler should be notified when labels on nodes changed
 -

 Key: YARN-2920
 URL: https://issues.apache.org/jira/browse/YARN-2920
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2920.1.patch, YARN-2920.2.patch, YARN-2920.3.patch, 
 YARN-2920.4.patch, YARN-2920.5.patch, YARN-2920.6.patch


 Currently, label changes on nodes will only be handled by 
 RMNodeLabelsManager, but that is not enough when labels on nodes change:
 - Scheduler should be able to take actions on running containers (like 
 kill/preempt/do-nothing).
 - Used / available capacity in the scheduler should be updated for future 
 planning.
 We need to add a new event to pass such updates to the scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243396#comment-14243396
 ] 

Varun Saxena commented on YARN-2946:


I guess package-level access can be given instead of public. Private may not 
work because the annotation is only for documentation purposes. 
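
For example, a minimal sketch of that combination (package-private access plus the annotation, which only documents the intent); the class and method names are illustrative:
{code}
import com.google.common.annotations.VisibleForTesting;

public class StoreExample {
  // Package-private so a test in the same package can call it; the annotation
  // merely documents why the visibility is wider than private.
  @VisibleForTesting
  void runWithCheck() {
    // ...
  }
}
{code}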

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a zk disconnected event.
 # When ZKRMStateStore#runWithCheck() waits (zkSessionTimeout) for zkClient to 
 re-establish the zookeeper connection via either a SyncConnected or an Expired 
 event, it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2243) Order of arguments for Preconditions.checkNotNull() is wrong in SchedulerApplicationAttempt ctor

2014-12-11 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243409#comment-14243409
 ] 

Tsuyoshi OZAWA commented on YARN-2243:
--

Good catch, +1.

From the javadoc of [Google 
Guava|https://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/base/Preconditions.html]:
{code}
checkArgument(boolean expression, Object errorMessage) 
{code}

 Order of arguments for Preconditions.checkNotNull() is wrong in 
 SchedulerApplicationAttempt ctor
 

 Key: YARN-2243
 URL: https://issues.apache.org/jira/browse/YARN-2243
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.5.1
Reporter: Ted Yu
Assignee: Devaraj K
Priority: Minor
 Attachments: YARN-2243.patch, YARN-2243.patch


 {code}
   public SchedulerApplicationAttempt(ApplicationAttemptId applicationAttemptId, 
       String user, Queue queue, ActiveUsersManager activeUsersManager,
       RMContext rmContext) {
     Preconditions.checkNotNull("RMContext should not be null", rmContext);
 {code}
 Order of arguments is wrong for Preconditions.checkNotNull().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2243) Order of arguments for Preconditions.checkNotNull() is wrong in SchedulerApplicationAttempt ctor

2014-12-11 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243412#comment-14243412
 ] 

Tsuyoshi OZAWA commented on YARN-2243:
--

{code}
checkNotNull(T reference, Object errorMessage)
{code}
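
For illustration, a minimal runnable sketch of the corrected call order (the reference to check comes first, the error message second); the variable name is taken from the snippet in the description:
{code}
import com.google.common.base.Preconditions;

public class CheckNotNullOrderSketch {
  public static void main(String[] args) {
    Object rmContext = new Object();
    // Corrected argument order: the reference first, then the message.
    Preconditions.checkNotNull(rmContext, "RMContext should not be null");
  }
}
{code}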

 Order of arguments for Preconditions.checkNotNull() is wrong in 
 SchedulerApplicationAttempt ctor
 

 Key: YARN-2243
 URL: https://issues.apache.org/jira/browse/YARN-2243
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.5.1
Reporter: Ted Yu
Assignee: Devaraj K
Priority: Minor
 Attachments: YARN-2243.patch, YARN-2243.patch


 {code}
   public SchedulerApplicationAttempt(ApplicationAttemptId applicationAttemptId, 
       String user, Queue queue, ActiveUsersManager activeUsersManager,
       RMContext rmContext) {
     Preconditions.checkNotNull("RMContext should not be null", rmContext);
 {code}
 Order of arguments is wrong for Preconditions.checkNotNull().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2942) Aggregated Log Files should be compacted

2014-12-11 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243413#comment-14243413
 ] 

Robert Kanter commented on YARN-2942:
-

Thanks for taking a look at the proposal Zhijie.  

Ya, it looks like YARN-2548 is related.  That one looks to be more about long 
running jobs, and for this one I hadn't really considered those; this only 
works after the job finishes.

1. That's true.  This design doesn't currently address that.  However, the 
format used by the compacted files isn't anything special; the data is just 
dumped into the file and an index written to the index file for each 
container.  As far as this format is concerned, we should be able to append 
more logs and indices to it.  We would just need to figure out a good way to 
manage when they're appended and how this compaction process is triggered.  

2. Yes.  We'd leave the original aggregated logs until the compacted log is 
available.  The JHS would continue using the aggregated log files until the 
compacted log file is ready.  

3. I might not have been clear about that in the design.  The RM would be the 
one to figure out when the app is done and the aggregated logs can be 
compacted.  We'd run the actual compacting code in one of the NMs, so that the 
RM isn't spending cycles doing that, and so that we don't end up with a replica 
of each compacted log on one datanode (in other words, the RM would choose, at 
random or round-robin, an NM to do each app's compaction; this will cause the 
replicas to be spread around the cluster).

4. That's a good question; though I don't think the index is the problem here.  
It's small enough that we could always just rewrite a new index to replace the 
stale one.  I think the problem would be with the compacted log file itself 
because we can't simply delete a chunk of it on HDFS; and it's big enough that 
there would be a lot of overhead to rewriting it.  One solution here is to 
write a new compacted log file every N containers or after reaching some file 
size, and we can do cleanup by deleting an earlier compacted log file and 
updating the index.  The downside to this is that the lifetimes of containers 
in a compacted log file would not all be equal, but that's probably okay.

Perhaps we can start out with this design, and then modify it for long running 
jobs that support YARN-2468 to have some other way of:
- Triggering/Managing the compaction process (#1)
- Deleting old logs (#4)

Perhaps we can use this JIRA for normal jobs and then use YARN-2548 to add 
support to it for long running jobs?  What do you think [~zjshen] and [~xgong]?

 Aggregated Log Files should be compacted
 

 Key: YARN-2942
 URL: https://issues.apache.org/jira/browse/YARN-2942
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.6.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Attachments: CompactedAggregatedLogsProposal_v1.pdf, 
 YARN-2942-preliminary.001.patch


 Turning on log aggregation allows users to easily store container logs in 
 HDFS and subsequently view them in the YARN web UIs from a central place.  
 Currently, there is a separate log file for each Node Manager.  This can be a 
 problem for HDFS if you have a cluster with many nodes as you’ll slowly start 
 accumulating many (possibly small) files per YARN application.  The current 
 “solution” for this problem is to configure YARN (actually the JHS) to 
 automatically delete these files after some amount of time.  
 We should improve this by compacting the per-node aggregated log files into 
 one log file per application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2946:
-
Attachment: 0002-YARN-2946.patch

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, 0002-YARN-2946.patch, 
 TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a zk disconnected event.
 # When ZKRMStateStore#runWithCheck() waits (zkSessionTimeout) for zkClient to 
 re-establish the zookeeper connection via either a SyncConnected or an Expired 
 event, it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243452#comment-14243452
 ] 

Rohith commented on YARN-2946:
--

I updated the patch by changing the test case. It should be fine now.

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, 0002-YARN-2946.patch, 
 TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a zk disconnected event.
 # When ZKRMStateStore#runWithCheck() waits (zkSessionTimeout) for zkClient to 
 re-establish the zookeeper connection via either a SyncConnected or an Expired 
 event, it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243458#comment-14243458
 ] 

Rohith commented on YARN-2946:
--

Kindly review the updated patch, which fixes the compilation errors.

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, 0002-YARN-2946.patch, 
 TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a zk disconnected event.
 # When ZKRMStateStore#runWithCheck() waits (zkSessionTimeout) for zkClient to 
 re-establish the zookeeper connection via either a SyncConnected or an Expired 
 event, it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2929) Adding separator ApplicationConstants.FILE_PATH_SEPARATOR for better Windows support

2014-12-11 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243482#comment-14243482
 ] 

Chris Nauroth commented on YARN-2929:
-

I apologize, but I'm still having trouble seeing the usefulness of this.  Using 
{{Path}} as I described earlier effectively changes any file path into valid 
URI syntax, using forward slashes instead of back slashes.  I expect this would 
then be a valid, usable path at the NodeManager regardless of its OS.  Even if 
the path originates from a shell script, I don't see how that would make a 
difference.
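
For illustration, a minimal sketch of the {{Path}} usage described above (the literal path is hypothetical, and the normalization is as described in this comment rather than verified here):
{code}
import org.apache.hadoop.fs.Path;

public class PathNormalizeSketch {
  public static void main(String[] args) {
    // When constructed on Windows, Path normalizes back slashes to forward
    // slashes, so the string/URI form should be usable by a NodeManager on
    // any OS.
    Path p = new Path("C:\\hadoop\\share\\lib");
    System.out.println(p);          // expected on Windows: C:/hadoop/share/lib
    System.out.println(p.toUri());  // URI form with forward slashes
  }
}
{code}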

Do you have an example YARN application submission that would demonstrate the 
problem in more detail?  Alternatively, if you could point out a spot in 
Spark's YARN application submission code that demonstrates the problem, then I 
could look at that.  I am assuming here that a path originating from a shell 
script would get passed into a Spark Java process, where Spark code would have 
an opportunity to use the {{Path}} class like I described.  Please let me know 
if my assumption is wrong.

There isn't anything necessarily wrong with the patch posted.  It just looks to 
me at this point like it isn't required.  By minimizing token replacement rules 
like this, we'd reduce the number of special cases that YARN application 
writers would need to consider.

 Adding separator ApplicationConstants.FILE_PATH_SEPARATOR for better Windows 
 support
 

 Key: YARN-2929
 URL: https://issues.apache.org/jira/browse/YARN-2929
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-2929.001.patch


 Some frameworks like Spark are working to run jobs on Windows (SPARK-1825). 
 For better multi-platform support, we should introduce 
 ApplicationConstants.FILE_PATH_SEPARATOR to make file paths 
 platform-independent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2920) CapacityScheduler should be notified when labels on nodes changed

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243568#comment-14243568
 ] 

Hadoop QA commented on YARN-2920:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12686715/YARN-2920.6.patch
  against trunk revision f6f2a3f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 28 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-tools/hadoop-sls 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
  org.apache.hadoop.yarn.server.resourcemanager.TestRM
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler

  The following test timeouts occurred in 
hadoop-tools/hadoop-sls 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6096//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6096//artifact/patchprocess/newPatchFindbugsWarningshadoop-sls.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6096//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6096//console

This message is automatically generated.

 CapacityScheduler should be notified when labels on nodes changed
 -

 Key: YARN-2920
 URL: https://issues.apache.org/jira/browse/YARN-2920
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2920.1.patch, YARN-2920.2.patch, YARN-2920.3.patch, 
 YARN-2920.4.patch, YARN-2920.5.patch, YARN-2920.6.patch


 Currently, label changes on nodes will only be handled by 
 RMNodeLabelsManager, but that is not enough when labels on nodes change:
 - Scheduler should be able to take actions on running containers (like 
 kill/preempt/do-nothing).
 - Used / available capacity in the scheduler should be updated for future 
 planning.
 We need to add a new event to pass such updates to the scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM

2014-12-11 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243567#comment-14243567
 ] 

Jian He commented on YARN-2762:
---

looks good, one minor comment:
we can have a common method for this:
{code}
Set<String> labels = new HashSet<String>();
for (String p : args.split(",")) {
  if (!p.trim().isEmpty()) {
    labels.add(p.trim());
  }
}
if (labels.isEmpty()) {
  throw new IllegalArgumentException(NO_LABEL_ERR_MSG);
}
{code}
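
For instance, a rough sketch of that common helper (the class and method names and the error-message constant are illustrative, not existing code):
{code}
import java.util.HashSet;
import java.util.Set;

public class LabelArgs {
  private static final String NO_LABEL_ERR_MSG =
      "No node label specified";  // illustrative constant

  // Shared parsing/validation for the node-labels-related CLI args.
  static Set<String> parseLabels(String args) {
    Set<String> labels = new HashSet<String>();
    for (String p : args.split(",")) {
      if (!p.trim().isEmpty()) {
        labels.add(p.trim());
      }
    }
    if (labels.isEmpty()) {
      throw new IllegalArgumentException(NO_LABEL_ERR_MSG);
    }
    return labels;
  }
}
{code}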

 RMAdminCLI node-labels-related args should be trimmed and checked before 
 sending to RM
 --

 Key: YARN-2762
 URL: https://issues.apache.org/jira/browse/YARN-2762
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Minor
 Attachments: YARN-2762.1.patch, YARN-2762.2.patch, YARN-2762.2.patch, 
 YARN-2762.3.patch, YARN-2762.4.patch, YARN-2762.5.patch, YARN-2762.6.patch, 
 YARN-2762.patch


 All NodeLabel args validations are done at the server side. The same can be done 
 at RMAdminCLI so that unnecessary RPC calls can be avoided.
 And for input such as x,y,,z,, there is no need to add an empty string; it can 
 instead be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2942) Aggregated Log Files should be compacted

2014-12-11 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243569#comment-14243569
 ] 

Zhijie Shen commented on YARN-2942:
---

bq. Perhaps we can use this JIRA for normal jobs and then use YARN-2548 to add 
support to it for long running jobs?

It makes sense to separate normal applications and long running services, but 
we need to make sure the logs from long running services are not affected. In 
other words, compaction won't happen on the log files of long running services.

 Aggregated Log Files should be compacted
 

 Key: YARN-2942
 URL: https://issues.apache.org/jira/browse/YARN-2942
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.6.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Attachments: CompactedAggregatedLogsProposal_v1.pdf, 
 YARN-2942-preliminary.001.patch


 Turning on log aggregation allows users to easily store container logs in 
 HDFS and subsequently view them in the YARN web UIs from a central place.  
 Currently, there is a separate log file for each Node Manager.  This can be a 
 problem for HDFS if you have a cluster with many nodes as you’ll slowly start 
 accumulating many (possibly small) files per YARN application.  The current 
 “solution” for this problem is to configure YARN (actually the JHS) to 
 automatically delete these files after some amount of time.  
 We should improve this by compacting the per-node aggregated log files into 
 one log file per application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243578#comment-14243578
 ] 

Hadoop QA commented on YARN-2946:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12686723/0002-YARN-2946.patch
  against trunk revision f6f2a3f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 15 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6097//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6097//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6097//console

This message is automatically generated.

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, 0002-YARN-2946.patch, 
 TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a zk disconnected event.
 # When ZKRMStateStore#runWithCheck() waits (zkSessionTimeout) for zkClient to 
 re-establish the zookeeper connection via either a SyncConnected or an Expired 
 event, it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2952) Incorrect version check in RMStateStore

2014-12-11 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243585#comment-14243585
 ] 

Jian He commented on YARN-2952:
---

we may change {{loadedVersion = Version.newInstance(1, 0);}} to 
{{getCurrentVersion()}}
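
A rough sketch of that change, reusing the snippet from the description (the surrounding checkVersion structure is assumed):
{code}
// if there is no version info, treat it as the current version instead of a
// hard-coded 1.0, per the suggestion above
if (loadedVersion == null) {
  loadedVersion = getCurrentVersion();
}
if (loadedVersion.isCompatibleTo(getCurrentVersion())) {
  LOG.info("Storing RM state version info " + getCurrentVersion());
  storeVersion();
}
{code}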

 Incorrect version check in RMStateStore
 ---

 Key: YARN-2952
 URL: https://issues.apache.org/jira/browse/YARN-2952
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He

 In RMStateStore#checkVersion: if we modify CURRENT_VERSION_INFO to 2.0, 
 it'll still store the version as 1.0, which is incorrect; 
 {code}
 // if there is no version info, treat it as 1.0;
 if (loadedVersion == null) {
   loadedVersion = Version.newInstance(1, 0);
 }
 if (loadedVersion.isCompatibleTo(getCurrentVersion())) {
   LOG.info("Storing RM state version info " + getCurrentVersion());
   storeVersion();
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2952) Incorrect version check in RMStateStore

2014-12-11 Thread Jian He (JIRA)
Jian He created YARN-2952:
-

 Summary: Incorrect version check in RMStateStore
 Key: YARN-2952
 URL: https://issues.apache.org/jira/browse/YARN-2952
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He


In RMStateStore#checkVersion: if we modify CURRENT_VERSION_INFO to 2.0, 
it'll still store the version as 1.0, which is incorrect; 
{code}
// if there is no version info, treat it as 1.0;
if (loadedVersion == null) {
  loadedVersion = Version.newInstance(1, 0);
}
if (loadedVersion.isCompatibleTo(getCurrentVersion())) {
  LOG.info("Storing RM state version info " + getCurrentVersion());
  storeVersion();
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2952) Incorrect version check in RMStateStore

2014-12-11 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-2952:
--
Description: 
In RMStateStore#checkVersion: if we modify CURRENT_VERSION_INFO to 2.0, 
it'll still store the version as 1.0, which is incorrect; The same thing might 
happen to the NM store and timeline store.
{code}
// if there is no version info, treat it as 1.0;
if (loadedVersion == null) {
  loadedVersion = Version.newInstance(1, 0);
}
if (loadedVersion.isCompatibleTo(getCurrentVersion())) {
  LOG.info("Storing RM state version info " + getCurrentVersion());
  storeVersion();
{code}

  was:
In RMStateStore#checkVersion: if we modify CURRENT_VERSION_INFO to 2.0, 
it'll still store the version as 1.0, which is incorrect; 
{code}
// if there is no version info, treat it as 1.0;
if (loadedVersion == null) {
  loadedVersion = Version.newInstance(1, 0);
}
if (loadedVersion.isCompatibleTo(getCurrentVersion())) {
  LOG.info("Storing RM state version info " + getCurrentVersion());
  storeVersion();
{code}


 Incorrect version check in RMStateStore
 ---

 Key: YARN-2952
 URL: https://issues.apache.org/jira/browse/YARN-2952
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He

 In RMStateStore#checkVersion: if we modify CURRENT_VERSION_INFO to 2.0, 
 it'll still store the version as 1.0, which is incorrect; The same thing might 
 happen to the NM store and timeline store.
 {code}
 // if there is no version info, treat it as 1.0;
 if (loadedVersion == null) {
   loadedVersion = Version.newInstance(1, 0);
 }
 if (loadedVersion.isCompatibleTo(getCurrentVersion())) {
   LOG.info("Storing RM state version info " + getCurrentVersion());
   storeVersion();
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2946) Deadlock in ZKRMStateStore

2014-12-11 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-2946:
--
Affects Version/s: (was: 2.6.0)
   2.7.0

 Deadlock in ZKRMStateStore
 --

 Key: YARN-2946
 URL: https://issues.apache.org/jira/browse/YARN-2946
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: Rohith
Assignee: Rohith
Priority: Blocker
 Attachments: 0001-YARN-2946.patch, 0002-YARN-2946.patch, 
 TestYARN2946.java


 Found one deadlock in ZKRMStateStore.
 # In the initial stage, zkClient is null because of a zk disconnected event.
 # When ZKRMStateStore#runWithCheck() waits (zkSessionTimeout) for zkClient to 
 re-establish the zookeeper connection via either a SyncConnected or an Expired 
 event, it is highly possible that any other thread can obtain the lock on 
 {{ZKRMStateStore.this}} from state machine transition events. This causes a 
 deadlock in ZKRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2929) Adding separator ApplicationConstants.FILE_PATH_SEPARATOR for better Windows support

2014-12-11 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243590#comment-14243590
 ] 

Tsuyoshi OZAWA commented on YARN-2929:
--

Sorry for the lack of explanation, and thanks for your clarification.

{quote}
I am assuming here that a path originating from a shell script would get passed 
into a Spark Java process, where Spark code would have an opportunity to use 
the Path class like I described.
{quote}

In fact, this problem happens when launching the AM - before the JVM of the 
AM is launched.
For instance, the AM needs a classpath to launch. Sometimes the classpath includes 
subdirectories with path separators. In that case, the path separators cannot 
be parsed on the OS and the jar cannot be found.

{code}
// JVM option in launch-container.sh
-classpath=local/lib/hadoop # cannot be parsed on Windows
{code}

For that case, the following path is converted into platform-dependent paths:
{code}
// JVM option in launch-container.sh
-classpath=local<FPS>lib<FPS>hadoop # can be converted into platform-dependent 
paths by expandEnvironments
{code}
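
A self-contained sketch of the idea (the token literal and the expansion point are assumptions for illustration; the real constant is what this JIRA proposes to add to ApplicationConstants):
{code}
public class FilePathSeparatorSketch {
  // Hypothetical token written into the launch script by the AM submitter.
  static final String FILE_PATH_SEPARATOR = "<FPS>";

  // The NodeManager would perform an expansion like this at container launch,
  // substituting the platform-specific file separator.
  static String expand(String s, boolean windows) {
    return s.replace(FILE_PATH_SEPARATOR, windows ? "\\" : "/");
  }

  public static void main(String[] args) {
    String cp = "local" + FILE_PATH_SEPARATOR + "lib" + FILE_PATH_SEPARATOR + "hadoop";
    System.out.println(expand(cp, false)); // local/lib/hadoop
    System.out.println(expand(cp, true));  // local\lib\hadoop
  }
}
{code}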

{quote}
By minimizing token replacement rules like this, we'd reduce the number of 
special cases that YARN application writers would need to consider.
{quote}

Yes, I agree with you. If we don't need it, we shouldn't include the 
modifications like this.

 Adding separator ApplicationConstants.FILE_PATH_SEPARATOR for better Windows 
 support
 

 Key: YARN-2929
 URL: https://issues.apache.org/jira/browse/YARN-2929
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-2929.001.patch


 Some frameworks like Spark are working to run jobs on Windows (SPARK-1825). 
 For better multi-platform support, we should introduce 
 ApplicationConstants.FILE_PATH_SEPARATOR to make file paths 
 platform-independent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2943) Add a node-labels page in RM web UI

2014-12-11 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2943:
-
Attachment: YARN-2943.2.patch

Hi [~jianhe],
Thanks for your comments,
bq. If I restart NM, “num of active NMs” is incorrect; this is probably because 
NM is using ephemeral ports;
This is what the RM can see about the NM; after some time has elapsed, the lost NM 
will be marked LOST and will not be counted on the RM side either.

bq. “NO_LABEL” - “N/A”, similarly, update the previous nodes page to show 
“N/A” for not labeled nodes;
IMO, N/A is not clear enough. Per Wikipedia (http://en.wikipedia.org/wiki/N/a), 
N/A means not available, not applicable, or no answer. Here no-label is 
just a kind of special label, since NO_LABEL has all the same characteristics as 
other normal labels, like exclusivity, etc. 

bq. usability: num of active NMs link can point to a filtered list of nodes 
by label ?
It already does what you suggested: when a user/admin clicks the number, it 
links to the nodes page and shows only the NMs that have that label. 

And I've addressed the rest of your comments.

 Add a node-labels page in RM web UI
 ---

 Key: YARN-2943
 URL: https://issues.apache.org/jira/browse/YARN-2943
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: Node-labels-page.png, Nodes-page-with-label-filter.png, 
 YARN-2943.1.patch, YARN-2943.2.patch


 Now we have node labels in the system, but there's no very convenient way to 
 get information like how many active NM(s) are assigned to a given label?, 
 how much total resource is there for a given label?, for a given label, 
 which queues can access it?, etc.
 It would be better to add a node-labels page in the RM web UI, so users/admins 
 can have a centralized view of such information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2009) Priority support for preemption in ProportionalCapacityPreemptionPolicy

2014-12-11 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243608#comment-14243608
 ] 

Carlo Curino commented on YARN-2009:


[~sunilg], pardon the long delay... From what you say, it seems like the 
priority issue within a queue is important for you and you observe 
non-trivial delays. 
If that is the case, I think it is fine to venture into adding within-queue 
cross-app preemption. I would argue in favor of a conservative policy with 
several built-in dampers (like max per-round preemptions, deadzones, fraction 
of imbalance) like we did for cross-queue. Also we should be careful not to 
make it too expensive (if we have thousands of apps in the queues, we should 
be mindful of not overloading the RM with costly rebalancing algos and the 
extra scheduling decisions derived from preemption).

What [~eepayne] says also makes sense: if we have preemptions triggered by 
cross-queue imbalances, it would be good to spend them to correct the issue 
you observed.


 Priority support for preemption in ProportionalCapacityPreemptionPolicy
 ---

 Key: YARN-2009
 URL: https://issues.apache.org/jira/browse/YARN-2009
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Devaraj K
Assignee: Sunil G

 While preempting containers based on the queue ideal assignment, we may need 
 to consider preempting the low priority application containers first.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2356) yarn status command for non-existent application/application attempt/container is too verbose

2014-12-11 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-2356:
--
Attachment: 0003-YARN-2356.patch

Thank you [~devaraj.k]

I have updated the patch against trunk. Kindly check

 yarn status command for non-existent application/application 
 attempt/container is too verbose 
 --

 Key: YARN-2356
 URL: https://issues.apache.org/jira/browse/YARN-2356
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Reporter: Sunil G
Assignee: Sunil G
Priority: Minor
 Attachments: 0001-YARN-2356.patch, 0002-YARN-2356.patch, 
 0003-YARN-2356.patch, Yarn-2356.1.patch


 *yarn application -status* or *applicationattempt -status* or *container 
 status* commands can suppress exceptions such as ApplicationNotFound, 
 ApplicationAttemptNotFound and ContainerNotFound for non-existent entries in 
 the RM or History Server. 
 For example, the exception below can be suppressed better:
 sunildev@host-a:~/hadoop/hadoop/bin ./yarn application -status 
 application_1402668848165_0015
 No GC_PROFILE is given. Defaults to medium.
 14/07/25 16:21:45 INFO client.RMProxy: Connecting to ResourceManager at 
 /10.18.40.77:45022
 Exception in thread main 
 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
 with id 'application_1402668848165_0015' doesn't exist in RM.
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:285)
 at 
 org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
 at 
 org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:607)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:932)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2099)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2095)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1626)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2093)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
 Method)
 at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at 
 org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
 at 
 org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101)
 at 
 org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:166)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
 at $Proxy12.getApplicationReport(Unknown Source)
 at 
 org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:291)
 at 
 org.apache.hadoop.yarn.client.cli.ApplicationCLI.printApplicationReport(ApplicationCLI.java:428)
 at 
 org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:153)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at 
 org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:76)
 Caused by: 
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException):
  Application with id 'application_1402668848165_0015' doesn't exist in RM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  1   2   >