[jira] [Commented] (YARN-1813) Better error message for yarn logs when permission denied

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186428#comment-14186428
 ] 

Hadoop QA commented on YARN-1813:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12661008/YARN-1813.4.patch
  against trunk revision 971e91c.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5600//console

This message is automatically generated.

 Better error message for yarn logs when permission denied
 ---

 Key: YARN-1813
 URL: https://issues.apache.org/jira/browse/YARN-1813
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.3.0
Reporter: Andrew Wang
Assignee: Tsuyoshi OZAWA
Priority: Minor
 Attachments: YARN-1813.1.patch, YARN-1813.2.patch, YARN-1813.2.patch, 
 YARN-1813.3.patch, YARN-1813.4.patch


 I ran some MR jobs as the hdfs user, and then forgot to sudo -u when 
 grabbing the logs. yarn logs prints an error message like the following:
 {noformat}
 [andrew.wang@a2402 ~]$ yarn logs -applicationId application_1394482121761_0010
 14/03/10 16:05:10 INFO client.RMProxy: Connecting to ResourceManager at 
 a2402.halxg.cloudera.com/10.20.212.10:8032
 Logs not available at 
 /tmp/logs/andrew.wang/logs/application_1394482121761_0010
 Log aggregation has not completed or is not enabled.
 {noformat}
 It'd be nicer if it said "Permission denied" or AccessControlException or 
 something like that instead, since that's the real issue.
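 
 A possible shape for such a message, sketched below - this is not the attached patch, and the helper name printDiagnostics is an assumption; it only shows how a permission failure can be told apart from genuinely missing aggregated logs:
 {code}
 import java.io.FileNotFoundException;
 import java.io.IOException;
 
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.security.AccessControlException;
 
 public class LogDirDiagnostics {
   // Hypothetical helper: report why the aggregated logs could not be read.
   static void printDiagnostics(Configuration conf, Path remoteAppLogDir)
       throws IOException {
     FileSystem fs = remoteAppLogDir.getFileSystem(conf);
     try {
       // listStatus() throws AccessControlException when the caller lacks
       // permission, and FileNotFoundException when the directory is missing.
       fs.listStatus(remoteAppLogDir);
     } catch (AccessControlException e) {
       System.err.println("Permission denied while accessing " + remoteAppLogDir
           + ": " + e.getMessage());
     } catch (FileNotFoundException e) {
       System.err.println("Logs not available at " + remoteAppLogDir);
       System.err.println("Log aggregation has not completed or is not enabled.");
     }
   }
 }
 {code}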



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2753) Shouldn't change the value in labelCollections if the key already exists and potential NPE at CommonNodeLabelsManager.

2014-10-28 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2753:

Description: 
CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in 
labelCollections if the key already exists; otherwise the Label.resource will be 
changed (reset).

There is also a potential NPE (NullPointerException) in checkRemoveLabelsFromNode 
of CommonNodeLabelsManager: when a Node is created, Node.labels can be null, so 
nm.labels may be null. We need to check that originalLabels is not null before 
using it (originalLabels.containsAll).

  was:
potential NPE(NullPointerException) in checkRemoveLabelsFromNode of 
CommonNodeLabelsManager.
It because when a Node is created, Node.labels can be null.
In this case, nm.labels; may be null.
So we need check originalLabels not null before use 
it(originalLabels.containsAll).

Summary: Shouldn't change the value in labelCollections if the key 
already exists and potential NPE at CommonNodeLabelsManager.  (was: potential 
NPE in checkRemoveLabelsFromNode of CommonNodeLabelsManager)

 Shouldn't change the value in labelCollections if the key already exists and 
 potential NPE at CommonNodeLabelsManager.
 --

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch


 CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in 
 labelCollections if the key already exists; otherwise the Label.resource will 
 be changed (reset).
 There is also a potential NPE (NullPointerException) in checkRemoveLabelsFromNode 
 of CommonNodeLabelsManager: when a Node is created, Node.labels can be null, so 
 nm.labels may be null. We need to check that originalLabels is not null before 
 using it (originalLabels.containsAll).
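 
 A minimal sketch of the two fixes described above, with simplified stand-in types (this is not one of the attached patches):
 {code}
 import java.util.Collections;
 import java.util.Map;
 import java.util.Set;
 import java.util.concurrent.ConcurrentHashMap;
 
 class NodeLabelsSketch {
   static class NodeLabel {
     final String name;
     NodeLabel(String name) { this.name = name; }
   }
 
   final Map<String, NodeLabel> labelCollections = new ConcurrentHashMap<>();
 
   void addToCluserNodeLabels(Set<String> labels) {
     for (String label : labels) {
       // putIfAbsent keeps the existing entry (and whatever resource it has
       // accumulated) instead of overwriting it with a fresh instance.
       labelCollections.putIfAbsent(label, new NodeLabel(label));
     }
   }
 
   void checkRemoveLabelsFromNode(Set<String> originalLabels, Set<String> toRemove) {
     // Node.labels can be null right after a Node is created, so normalize
     // before calling containsAll() to avoid the NPE.
     Set<String> existing =
         originalLabels == null ? Collections.<String>emptySet() : originalLabels;
     if (!existing.containsAll(toRemove)) {
       throw new IllegalArgumentException("Some labels are not present on the node");
     }
   }
 }
 {code}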



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2753) Shouldn't change the value in labelCollections if the key already exists and potential NPE at CommonNodeLabelsManager.

2014-10-28 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2753:

Description: 
CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in 
labelCollections if the key already exists; otherwise the Label.resource will be 
changed (reset).

There is also a potential NPE (NullPointerException) in checkRemoveLabelsFromNode 
of CommonNodeLabelsManager: when a Node is created, Node.labels can be null, so 
nm.labels may be null. We need to check that originalLabels is not null before 
using it (originalLabels.containsAll).

  was:
CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in 
labelCollections if the key already exists otherwise the Label.resource will be 
changed(reset).

potential NPE(NullPointerException) in checkRemoveLabelsFromNode of 
CommonNodeLabelsManager.
It because when a Node is created, Node.labels can be null.
In this case, nm.labels; may be null.
So we need check originalLabels not null before use 
it(originalLabels.containsAll).


 Shouldn't change the value in labelCollections if the key already exists and 
 potential NPE at CommonNodeLabelsManager.
 --

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch


 CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in 
 labelCollections if the key already exists; otherwise the Label.resource will 
 be changed (reset).
 There is also a potential NPE (NullPointerException) in checkRemoveLabelsFromNode 
 of CommonNodeLabelsManager: when a Node is created, Node.labels can be null, so 
 nm.labels may be null. We need to check that originalLabels is not null before 
 using it (originalLabels.containsAll).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2759) addToCluserNodeLabels should not change the value in labelCollections if the key already exists to avoid the Label.resource being reset.

2014-10-28 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2759:

Description: addToCluserNodeLabels should not change the value in 
labelCollections if the key already exists to avoid the Label.resource being 
reset.  (was: addToCluserNodeLabels should not change the value in 
labelCollections if the key already exists to avoid the Label.resource is 
reset.)
Summary: addToCluserNodeLabels should not change the value in 
labelCollections if the key already exists to avoid the Label.resource being 
reset.  (was: addToCluserNodeLabels should not change the value in 
labelCollections if the key already exists to avoid the Label.resource is 
reset.)

 addToCluserNodeLabels should not change the value in labelCollections if the 
 key already exists to avoid the Label.resource being reset.
 

 Key: YARN-2759
 URL: https://issues.apache.org/jira/browse/YARN-2759
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2759.000.patch


 addToCluserNodeLabels should not change the value in labelCollections if the 
 key already exists to avoid the Label.resource being reset.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN

2014-10-28 Thread Janos Matyas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186446#comment-14186446
 ] 

Janos Matyas commented on YARN-1964:


Hi Abin,

I applied the changes to the image the next day, per your request - sorry 
for the late reply; Docker.io does not send notifications about comments. Should 
you need anything in the future, please drop me a direct email - 
janos.mat...@sequenceiq.com - or open a GitHub issue.

Janos

 Create Docker analog of the LinuxContainerExecutor in YARN
 --

 Key: YARN-1964
 URL: https://issues.apache.org/jira/browse/YARN-1964
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.2.0
Reporter: Arun C Murthy
Assignee: Abin Shahab
 Attachments: YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, 
 YARN-1964.patch, YARN-1964.patch, yarn-1964-branch-2.2.0-docker.patch, 
 yarn-1964-branch-2.2.0-docker.patch, yarn-1964-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch, yarn-1964-docker.patch, 
 yarn-1964-docker.patch


 Docker (https://www.docker.io/) is, increasingly, a very popular container 
 technology.
 In the context of YARN, the support for Docker will provide a very elegant 
 solution to allow applications to *package* their software into a Docker 
 container (entire Linux file system incl. custom versions of perl, python 
 etc.) and use it as a blueprint to launch all their YARN containers with 
 requisite software environment. This provides both consistency (all YARN 
 containers will have the same software environment) and isolation (no 
 interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2669) FairScheduler: queueName shouldn't allow periods in the allocation.xml

2014-10-28 Thread bc Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186448#comment-14186448
 ] 

bc Wong commented on YARN-2669:
---

Replacing . with \_dot\_ sounds fine here. While it doesn't eliminate 
collisions, it makes them unlikely. Again, I'd leave it for another patch to do 
the real fix, which is more involved.
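
A tiny illustration of the replacement discussed above - only the mapping itself, not the FairScheduler change:
{code}
class QueueNameSketch {
  // "first.last" -> "first_dot_last"; collisions remain possible but unlikely.
  static String cleanName(String userName) {
    return userName.replace(".", "_dot_");
  }

  public static void main(String[] args) {
    System.out.println(cleanName("first.last")); // first_dot_last
  }
}
{code}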

 FairScheduler: queueName shouldn't allow periods in the allocation.xml
 ---

 Key: YARN-2669
 URL: https://issues.apache.org/jira/browse/YARN-2669
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch


 For an allocation file like:
 {noformat}
 <allocations>
   <queue name="root.q1">
     <minResources>4096mb,4vcores</minResources>
   </queue>
 </allocations>
 {noformat}
 Users may wish to configure minResources for a queue with the full path root.q1. 
 However, right now, the fair scheduler will treat this configuration as being for 
 the queue with full name root.root.q1. We need to print out a warning msg to 
 notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2762) Provide RMAdminCLI args validation for NodeLabelManager operations

2014-10-28 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2762:
-
Attachment: YARN-2762.patch

 Provide RMAdminCLI args validation for NodeLabelManager operations
 --

 Key: YARN-2762
 URL: https://issues.apache.org/jira/browse/YARN-2762
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Minor
 Attachments: YARN-2762.patch


 All NodeLabel args validations are done on the server side. The same can be done 
 at RMAdminCLI so that unnecessary RPC calls can be avoided.
 And for input such as x,y,,z,, there is no need to add an empty string; it can 
 simply be skipped.
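 
 A minimal sketch of such client-side validation, assuming a hypothetical parseLabels() helper in the CLI (this is not the attached patch):
 {code}
 import java.util.ArrayList;
 import java.util.List;
 
 class NodeLabelArgsSketch {
   static List<String> parseLabels(String arg) {
     if (arg == null || arg.trim().isEmpty()) {
       throw new IllegalArgumentException("No node label specified");
     }
     List<String> labels = new ArrayList<String>();
     for (String piece : arg.split(",")) {
       String label = piece.trim();
       if (!label.isEmpty()) {   // skip empty strings produced by ",," or a trailing ","
         labels.add(label);
       }
     }
     if (labels.isEmpty()) {
       throw new IllegalArgumentException("No node label specified");
     }
     return labels;
   }
 
   public static void main(String[] args) {
     System.out.println(parseLabels("x,y,,z,")); // [x, y, z]
   }
 }
 {code}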



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2741) Windows: Node manager cannot serve up log files via the web user interface when yarn.nodemanager.log-dirs is set to any drive letter other than C: (or, the drive that nodemanager is running on)

2014-10-28 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186494#comment-14186494
 ] 

Varun Vasudev commented on YARN-2741:
-

+1, patch looks good.

 Windows: Node manager cannot serve up log files via the web user interface 
 when yarn.nodemanager.log-dirs is set to any drive letter other than C: (or, the 
 drive that nodemanager is running on)
 --

 Key: YARN-2741
 URL: https://issues.apache.org/jira/browse/YARN-2741
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
 Environment: Windows
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-2741.1.patch, YARN-2741.6.patch


 PROBLEM: User is getting "No Logs available for Container Container_number" 
 when setting yarn.nodemanager.log-dirs to any drive letter other than C:
 STEPS TO REPRODUCE:
 On Windows
 1) Run NodeManager on C:
 2) Create two local drive partitions D: and E:
 3) Put yarn.nodemanager.log-dirs = D:\nmlogs or E:\nmlogs (see the snippet below)
 4) Run an MR job that will last at least 5 minutes
 5) While the job is in flight, log into the Yarn web UI, 
 resource_manager_server:8088/cluster
 6) Click on the application_idnumber
 7) Click on the logs link; you will get "No Logs available for Container 
 Container_number"
 ACTUAL BEHAVIOR: Getting an error message when viewing the container logs
 EXPECTED BEHAVIOR: Able to use different drive letters in 
 yarn.nodemanager.log-dirs and not get an error
 NOTE: If we use the drive letter C: in yarn.nodemanager.log-dirs, we are able 
 to see the container logs while the MR job is in flight.
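 
 Step 3 above, expressed as a yarn-site.xml property for reference; only the D:\nmlogs value comes from the steps, everything else about the cluster layout is assumed:
 {noformat}
 <property>
   <name>yarn.nodemanager.log-dirs</name>
   <value>D:\nmlogs</value>
 </property>
 {noformat}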



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1813) Better error message for yarn logs when permission denied

2014-10-28 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-1813:
-
Attachment: YARN-1813.5.patch

Refreshed a patch.

 Better error message for yarn logs when permission denied
 ---

 Key: YARN-1813
 URL: https://issues.apache.org/jira/browse/YARN-1813
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.3.0
Reporter: Andrew Wang
Assignee: Tsuyoshi OZAWA
Priority: Minor
 Attachments: YARN-1813.1.patch, YARN-1813.2.patch, YARN-1813.2.patch, 
 YARN-1813.3.patch, YARN-1813.4.patch, YARN-1813.5.patch


 I ran some MR jobs as the hdfs user, and then forgot to sudo -u when 
 grabbing the logs. yarn logs prints an error message like the following:
 {noformat}
 [andrew.wang@a2402 ~]$ yarn logs -applicationId application_1394482121761_0010
 14/03/10 16:05:10 INFO client.RMProxy: Connecting to ResourceManager at 
 a2402.halxg.cloudera.com/10.20.212.10:8032
 Logs not available at 
 /tmp/logs/andrew.wang/logs/application_1394482121761_0010
 Log aggregation has not completed or is not enabled.
 {noformat}
 It'd be nicer if it said "Permission denied" or AccessControlException or 
 something like that instead, since that's the real issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1813) Better error message for yarn logs when permission denied

2014-10-28 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-1813:
-
Affects Version/s: 2.4.1
   2.5.1

 Better error message for yarn logs when permission denied
 ---

 Key: YARN-1813
 URL: https://issues.apache.org/jira/browse/YARN-1813
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.3.0, 2.4.1, 2.5.1
Reporter: Andrew Wang
Assignee: Tsuyoshi OZAWA
Priority: Minor
 Attachments: YARN-1813.1.patch, YARN-1813.2.patch, YARN-1813.2.patch, 
 YARN-1813.3.patch, YARN-1813.4.patch, YARN-1813.5.patch


 I ran some MR jobs as the hdfs user, and then forgot to sudo -u when 
 grabbing the logs. yarn logs prints an error message like the following:
 {noformat}
 [andrew.wang@a2402 ~]$ yarn logs -applicationId application_1394482121761_0010
 14/03/10 16:05:10 INFO client.RMProxy: Connecting to ResourceManager at 
 a2402.halxg.cloudera.com/10.20.212.10:8032
 Logs not available at 
 /tmp/logs/andrew.wang/logs/application_1394482121761_0010
 Log aggregation has not completed or is not enabled.
 {noformat}
 It'd be nicer if it said "Permission denied" or AccessControlException or 
 something like that instead, since that's the real issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1813) Better error message for yarn logs when permission denied

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186523#comment-14186523
 ] 

Hadoop QA commented on YARN-1813:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677550/YARN-1813.5.patch
  against trunk revision 0398db1.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5601//console

This message is automatically generated.

 Better error message for yarn logs when permission denied
 ---

 Key: YARN-1813
 URL: https://issues.apache.org/jira/browse/YARN-1813
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.3.0, 2.4.1, 2.5.1
Reporter: Andrew Wang
Assignee: Tsuyoshi OZAWA
Priority: Minor
 Attachments: YARN-1813.1.patch, YARN-1813.2.patch, YARN-1813.2.patch, 
 YARN-1813.3.patch, YARN-1813.4.patch, YARN-1813.5.patch


 I ran some MR jobs as the hdfs user, and then forgot to sudo -u when 
 grabbing the logs. yarn logs prints an error message like the following:
 {noformat}
 [andrew.wang@a2402 ~]$ yarn logs -applicationId application_1394482121761_0010
 14/03/10 16:05:10 INFO client.RMProxy: Connecting to ResourceManager at 
 a2402.halxg.cloudera.com/10.20.212.10:8032
 Logs not available at 
 /tmp/logs/andrew.wang/logs/application_1394482121761_0010
 Log aggregation has not completed or is not enabled.
 {noformat}
 It'd be nicer if it said "Permission denied" or AccessControlException or 
 something like that instead, since that's the real issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2737) Misleading msg in LogCLI when app is not successfully submitted

2014-10-28 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186548#comment-14186548
 ] 

Tsuyoshi OZAWA commented on YARN-2737:
--

YARN-1813 is addressing the issue of handling AccessControlException correctly.

 Misleading msg in LogCLI when app is not successfully submitted 
 

 Key: YARN-2737
 URL: https://issues.apache.org/jira/browse/YARN-2737
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, resourcemanager
Reporter: Jian He
Assignee: Rohith

 {{LogCLiHelpers#logDirNotExist}} prints the msg {{Log aggregation has not 
 completed or is not enabled.}} if the app log file doesn't exist. This is 
 misleading when the application was not submitted successfully; clearly, 
 we won't have logs for such an application. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2763) TestNMSimulator fails sometimes due to timing issue

2014-10-28 Thread Varun Vasudev (JIRA)
Varun Vasudev created YARN-2763:
---

 Summary: TestNMSimulator fails sometimes due to timing issue
 Key: YARN-2763
 URL: https://issues.apache.org/jira/browse/YARN-2763
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Varun Vasudev
Assignee: Varun Vasudev


TestNMSimulator fails sometimes due to timing issues. From a failure -
{noformat}
2014-10-16 23:21:42,343 INFO  resourcemanager.ResourceTrackerService 
(ResourceTrackerService.java:registerNodeManager(337)) - NodeManager from node 
node1(cmPort: 0 httpPort: 80) registered with capability: memory:10240, 
vCores:10, assigned nodeId node1:0
2014-10-16 23:21:42,397 ERROR delegation.AbstractDelegationTokenSecretManager 
(AbstractDelegationTokenSecretManager.java:run(642)) - ExpiredTokenRemover 
received java.lang.InterruptedException: sleep interrupted
2014-10-16 23:21:42,400 INFO  rmnode.RMNodeImpl (RMNodeImpl.java:handle(423)) - 
node1:0 Node Transitioned from NEW to RUNNING
2014-10-16 23:21:42,404 INFO  fair.FairScheduler 
(FairScheduler.java:addNode(825)) - Added node node1:0 cluster capacity: 
memory:10240, vCores:10
2014-10-16 23:21:42,407 INFO  mortbay.log (Slf4jLog.java:info(67)) - Stopped 
HttpServer2$SelectChannelConnectorWithSafeStartup@localhost:18088
2014-10-16 23:21:42,409 ERROR delegation.AbstractDelegationTokenSecretManager 
(AbstractDelegationTokenSecretManager.java:run(642)) - ExpiredTokenRemover 
received java.lang.InterruptedException: sleep interrupted
2014-10-16 23:21:42,410 INFO  ipc.Server (Server.java:stop(2437)) - Stopping 
server on 18032
2014-10-16 23:21:42,412 INFO  ipc.Server (Server.java:run(706)) - Stopping IPC 
Server listener on 18032
2014-10-16 23:21:42,412 INFO  ipc.Server (Server.java:run(832)) - Stopping IPC 
Server Responder
{noformat}
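
One common remedy for this kind of flakiness, sketched below, is to poll for the NEW -> RUNNING transition instead of asserting immediately after registration; this is an assumption about the general technique, not a description of apache-yarn-2763.0.patch:
{code}
import java.util.function.IntSupplier;

final class WaitForNodeSketch {
  // Poll until the RM reports the expected number of RUNNING nodes, or fail.
  static void waitForRunningNodes(IntSupplier runningNodeCount, int expected,
      long timeoutMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (runningNodeCount.getAsInt() < expected) {
      if (System.currentTimeMillis() > deadline) {
        throw new AssertionError("Timed out waiting for " + expected
            + " RUNNING nodes, saw " + runningNodeCount.getAsInt());
      }
      Thread.sleep(100);  // re-check until the node transition completes
    }
  }
}
{code}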



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2763) TestNMSimulator fails sometimes due to timing issue

2014-10-28 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev updated YARN-2763:

Attachment: apache-yarn-2763.0.patch

Attached patch with fix.

 TestNMSimulator fails sometimes due to timing issue
 ---

 Key: YARN-2763
 URL: https://issues.apache.org/jira/browse/YARN-2763
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: apache-yarn-2763.0.patch


 TestNMSimulator fails sometimes due to timing issues. From a failure -
 {noformat}
 2014-10-16 23:21:42,343 INFO  resourcemanager.ResourceTrackerService 
 (ResourceTrackerService.java:registerNodeManager(337)) - NodeManager from 
 node node1(cmPort: 0 httpPort: 80) registered with capability: memory:10240, 
 vCores:10, assigned nodeId node1:0
 2014-10-16 23:21:42,397 ERROR delegation.AbstractDelegationTokenSecretManager 
 (AbstractDelegationTokenSecretManager.java:run(642)) - ExpiredTokenRemover 
 received java.lang.InterruptedException: sleep interrupted
 2014-10-16 23:21:42,400 INFO  rmnode.RMNodeImpl (RMNodeImpl.java:handle(423)) 
 - node1:0 Node Transitioned from NEW to RUNNING
 2014-10-16 23:21:42,404 INFO  fair.FairScheduler 
 (FairScheduler.java:addNode(825)) - Added node node1:0 cluster capacity: 
 memory:10240, vCores:10
 2014-10-16 23:21:42,407 INFO  mortbay.log (Slf4jLog.java:info(67)) - Stopped 
 HttpServer2$SelectChannelConnectorWithSafeStartup@localhost:18088
 2014-10-16 23:21:42,409 ERROR delegation.AbstractDelegationTokenSecretManager 
 (AbstractDelegationTokenSecretManager.java:run(642)) - ExpiredTokenRemover 
 received java.lang.InterruptedException: sleep interrupted
 2014-10-16 23:21:42,410 INFO  ipc.Server (Server.java:stop(2437)) - Stopping 
 server on 18032
 2014-10-16 23:21:42,412 INFO  ipc.Server (Server.java:run(706)) - Stopping 
 IPC Server listener on 18032
 2014-10-16 23:21:42,412 INFO  ipc.Server (Server.java:run(832)) - Stopping 
 IPC Server Responder
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2763) TestNMSimulator fails sometimes due to timing issue

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186574#comment-14186574
 ] 

Hadoop QA commented on YARN-2763:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12677561/apache-yarn-2763.0.patch
  against trunk revision 0398db1.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5602//console

This message is automatically generated.

 TestNMSimulator fails sometimes due to timing issue
 ---

 Key: YARN-2763
 URL: https://issues.apache.org/jira/browse/YARN-2763
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: apache-yarn-2763.0.patch


 TestNMSimulator fails sometimes due to timing issues. From a failure -
 {noformat}
 2014-10-16 23:21:42,343 INFO  resourcemanager.ResourceTrackerService 
 (ResourceTrackerService.java:registerNodeManager(337)) - NodeManager from 
 node node1(cmPort: 0 httpPort: 80) registered with capability: memory:10240, 
 vCores:10, assigned nodeId node1:0
 2014-10-16 23:21:42,397 ERROR delegation.AbstractDelegationTokenSecretManager 
 (AbstractDelegationTokenSecretManager.java:run(642)) - ExpiredTokenRemover 
 received java.lang.InterruptedException: sleep interrupted
 2014-10-16 23:21:42,400 INFO  rmnode.RMNodeImpl (RMNodeImpl.java:handle(423)) 
 - node1:0 Node Transitioned from NEW to RUNNING
 2014-10-16 23:21:42,404 INFO  fair.FairScheduler 
 (FairScheduler.java:addNode(825)) - Added node node1:0 cluster capacity: 
 memory:10240, vCores:10
 2014-10-16 23:21:42,407 INFO  mortbay.log (Slf4jLog.java:info(67)) - Stopped 
 HttpServer2$SelectChannelConnectorWithSafeStartup@localhost:18088
 2014-10-16 23:21:42,409 ERROR delegation.AbstractDelegationTokenSecretManager 
 (AbstractDelegationTokenSecretManager.java:run(642)) - ExpiredTokenRemover 
 received java.lang.InterruptedException: sleep interrupted
 2014-10-16 23:21:42,410 INFO  ipc.Server (Server.java:stop(2437)) - Stopping 
 server on 18032
 2014-10-16 23:21:42,412 INFO  ipc.Server (Server.java:run(706)) - Stopping 
 IPC Server listener on 18032
 2014-10-16 23:21:42,412 INFO  ipc.Server (Server.java:run(832)) - Stopping 
 IPC Server Responder
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2762) Provide RMAdminCLI args validation for NodeLabelManager operations

2014-10-28 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186617#comment-14186617
 ] 

Rohith commented on YARN-2762:
--

Added a sanity check at the client that validates the arguments.

 Provide RMAdminCLI args validation for NodeLabelManager operations
 --

 Key: YARN-2762
 URL: https://issues.apache.org/jira/browse/YARN-2762
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Minor
 Attachments: YARN-2762.patch


 All NodeLabel args validations are done on the server side. The same can be done 
 at RMAdminCLI so that unnecessary RPC calls can be avoided.
 And for input such as x,y,,z,, there is no need to add an empty string; it can 
 simply be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2762) Provide RMAdminCLI args validation for NodeLabelManager operations

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186655#comment-14186655
 ] 

Hadoop QA commented on YARN-2762:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677545/YARN-2762.patch
  against trunk revision 0398db1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client:

  org.apache.hadoop.yarn.client.TestResourceTrackerOnHA
  
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5603//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5603//console

This message is automatically generated.

 Provide RMAdminCLI args validation for NodeLabelManager operations
 --

 Key: YARN-2762
 URL: https://issues.apache.org/jira/browse/YARN-2762
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Minor
 Attachments: YARN-2762.patch


 All NodeLabel args validations are done on the server side. The same can be done 
 at RMAdminCLI so that unnecessary RPC calls can be avoided.
 And for input such as x,y,,z,, there is no need to add an empty string; it can 
 simply be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1813) Better error message for yarn logs when permission denied

2014-10-28 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186688#comment-14186688
 ] 

Rohith commented on YARN-1813:
--

Thanks Tsuyoshi for rebasing the patch!!
Some minor comments:
1. I think logging both messages "Logs not available at ..." and "Permission 
denied" is contradictory. "Permission denied" alone is sufficient.
2. AggregatedLogsBlock.java is not changed, but it imports AccessControlException. 
Is that import required?
3. What if the app is submitted but not yet started? Here also the log will be 
displayed as "Log aggregation has not completed or is not enabled." I think we 
can log all the possible reasons why the logs are not available.



 Better error message for yarn logs when permission denied
 ---

 Key: YARN-1813
 URL: https://issues.apache.org/jira/browse/YARN-1813
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.3.0, 2.4.1, 2.5.1
Reporter: Andrew Wang
Assignee: Tsuyoshi OZAWA
Priority: Minor
 Attachments: YARN-1813.1.patch, YARN-1813.2.patch, YARN-1813.2.patch, 
 YARN-1813.3.patch, YARN-1813.4.patch, YARN-1813.5.patch


 I ran some MR jobs as the hdfs user, and then forgot to sudo -u when 
 grabbing the logs. yarn logs prints an error message like the following:
 {noformat}
 [andrew.wang@a2402 ~]$ yarn logs -applicationId application_1394482121761_0010
 14/03/10 16:05:10 INFO client.RMProxy: Connecting to ResourceManager at 
 a2402.halxg.cloudera.com/10.20.212.10:8032
 Logs not available at 
 /tmp/logs/andrew.wang/logs/application_1394482121761_0010
 Log aggregation has not completed or is not enabled.
 {noformat}
 It'd be nicer if it said "Permission denied" or AccessControlException or 
 something like that instead, since that's the real issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2601) RMs(HA RMS) can't enter active state

2014-10-28 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186695#comment-14186695
 ] 

Rohith commented on YARN-2601:
--

[~cindy2012] This will be fixed in YARN-2010. Please follow that jira.

 RMs(HA RMS) can't enter active state
 

 Key: YARN-2601
 URL: https://issues.apache.org/jira/browse/YARN-2601
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Cindy Li

 2014-09-24 15:04:04,527 DEBUG 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Processing 
 event for application_1409048687352_0552 of type APP_REJECTED
 2014-09-24 15:04:04,528 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: 
 application_1409048687352_0552 State change from NEW to FAILED
 2014-09-24 15:04:04,528 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.AppRemovedSchedulerEvent.EventType:
  APP_REMOVED
 2014-09-24 15:04:04,528 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: 
 Dispatching the event 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManagerEvent.EventType: 
 APP_COMPLETED
 2014-09-24 15:04:04,528 DEBUG 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: RMAppManager 
 processing event for application_1409048687352_0552 of type APP_COMPLETED
 2014-09-24 15:04:04,528 WARN 
 org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=b_hiveperf0 
  OPERATION=Application Finished - Failed TARGET=RMAppManager 
 RESULT=FAILURE  DESCRIPTION=App failed with state: FAILED   
 PERMISSIONS=hadoop tried to renew an expired token
 at 
 org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:366)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:6279)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:488)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:923)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2020)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2016)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1650)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2014)
 APPID=application_1409048687352_0552
 2014-09-24 15:04:04,529 DEBUG org.apache.hadoop.service.AbstractService: 
 Service: RMActiveServices entered state STOPPED
 
 2014-09-24 15:04:04,538 WARN 
 org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop   
 OPERATION=transitionToActiveTARGET=RMHAProtocolService  
 RESULT=FAILURE  DESCRIPTION=Exception transitioning to active   
 PERMISSIONS=Users [hadoop] are allowed
 2014-09-24 15:04:04,539 WARN org.apache.hadoop.ha.ActiveStandbyElector: 
 Exception handling the winning of election
 org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
 at 
 org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:118)
 at 
 org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:804)
 at 
 org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
 at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:596)
 at 
 org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when 
 transitioning to Active mode
 at 
 org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:292)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:116)
 ... 4 more
 Caused by: org.apache.hadoop.service.ServiceStateException: 
 org.apache.hadoop.security.token.SecretManager$InvalidToken: hadoop tried to 
 renew an expired token
 at 
 org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:366)
 at 
 

[jira] [Commented] (YARN-2762) Provide RMAdminCLI args validation for NodeLabelManager operations

2014-10-28 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186703#comment-14186703
 ] 

Rohith commented on YARN-2762:
--

Test case failures are not related to this fix.

 Provide RMAdminCLI args validation for NodeLabelManager operations
 --

 Key: YARN-2762
 URL: https://issues.apache.org/jira/browse/YARN-2762
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Reporter: Rohith
Assignee: Rohith
Priority: Minor
 Attachments: YARN-2762.patch


 All NodeLabel args validations are done on the server side. The same can be done 
 at RMAdminCLI so that unnecessary RPC calls can be avoided.
 And for input such as x,y,,z,, there is no need to add an empty string; it can 
 simply be skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2726) CapacityScheduler should explicitly log when an accessible label has no capacity

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186725#comment-14186725
 ] 

Hudson commented on YARN-2726:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #726 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/726/])
YARN-2726. CapacityScheduler should explicitly log when an accessible label has 
no capacity. Contributed by Wangda Tan (xgong: rev 
ce1a4419a6c938447a675c416567db56bf9cb29e)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java


 CapacityScheduler should explicitly log when an accessible label has no 
 capacity
 

 Key: YARN-2726
 URL: https://issues.apache.org/jira/browse/YARN-2726
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Phil D'Amore
Assignee: Wangda Tan
Priority: Minor
 Fix For: 2.6.0

 Attachments: YARN-2726-20141023-1.patch, YARN-2726-20141023-2.patch


 Given:
 - Node label defined: test-label
 - Two queues defined: a, b
 - label accessibility and capacity defined as follows (properties 
 abbreviated for readability):
 root.a.accessible-node-labels = test-label
 root.a.accessible-node-labels.test-label.capacity = 100
 If you restart the RM or do a 'rmadmin -refreshQueues' you will get a stack 
 trace with the following error buried within:
 Illegal capacity of -1.0 for label=test-label in queue=root.b
 This of course occurs because test-label is accessible to b due to 
 inheritance from the root, and -1 is the UNDEFINED value.  To my mind this 
 might not be obvious to the admin, and the error message which results does 
 not help guide someone to the source of the issue.
 I propose that this situation be updated so that when the capacity on an 
 accessible label is undefined, it is explicitly called out instead of falling 
 through to the illegal capacity check.  Something like:
 {code}
 if (capacity == UNDEFINED) {
   throw new IllegalArgumentException("Configuration issue: " + " label=" +
       label + " is accessible from queue=" + queue + " but has no capacity set.");
 }
 {code}
 I'll leave it to better judgement than mine as to whether I'm throwing the 
 appropriate exception there.  I think this check should be added to both 
 getNodeLabelCapacities and getMaximumNodeLabelCapacities in 
 CapacitySchedulerConfiguration.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2591) AHSWebServices should return FORBIDDEN(403) if the request user doesn't have access to the history data

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186730#comment-14186730
 ] 

Hudson commented on YARN-2591:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #726 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/726/])
YARN-2591. Fixed AHSWebServices to return FORBIDDEN(403) if the request user 
doesn't have access to the history data. Contributed by Zhijie Shen (jianhe: 
rev c05b581a5522eed499d3ba16af9fa6dc694563f6)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryManagerOnTimelineStore.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryManagerOnTimelineStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/WebServices.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/AuthorizationException.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java


 AHSWebServices should return FORBIDDEN(403) if the request user doesn't have 
 access to the history data
 ---

 Key: YARN-2591
 URL: https://issues.apache.org/jira/browse/YARN-2591
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: 3.0.0, 2.6.0
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.6.0

 Attachments: YARN-2591.1.patch, YARN-2591.2.patch


 AHSWebServices should return FORBIDDEN(403) if the request user doesn't have 
 access to the history data. Currently, it is going to return 
 INTERNAL_SERVER_ERROR(500).
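 
 A minimal sketch of the requested behaviour - not the committed change; fetchApplication() is a hypothetical stand-in for the history lookup:
 {code}
 import javax.ws.rs.WebApplicationException;
 import javax.ws.rs.core.Response;
 
 import org.apache.hadoop.security.authorize.AuthorizationException;
 
 class AhsWebServicesSketch {
   Object getApp(String appId) {
     try {
       return fetchApplication(appId);   // may throw AuthorizationException
     } catch (AuthorizationException e) {
       // Surface the access problem as 403 FORBIDDEN instead of a 500.
       throw new WebApplicationException(e, Response.Status.FORBIDDEN);
     }
   }
 
   private Object fetchApplication(String appId) throws AuthorizationException {
     throw new AuthorizationException("User is not allowed to view " + appId);
   }
 }
 {code}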



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2502) Changes in distributed shell to support specifying labels

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186720#comment-14186720
 ] 

Hudson commented on YARN-2502:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #726 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/726/])
YARN-2502. Changed DistributedShell to support node labels. Contributed by 
Wangda Tan (jianhe: rev f6b963fdfc517429149165e4bb6fb947be6e3c99)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java


 Changes in distributed shell to support specifying labels
 --

 Key: YARN-2502
 URL: https://issues.apache.org/jira/browse/YARN-2502
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2502-20141009.1.patch, YARN-2502-20141009.2.patch, 
 YARN-2502-20141013.1.patch, YARN-2502-20141017-1.patch, 
 YARN-2502-20141017-2.patch, YARN-2502-20141027-2.patch, YARN-2502.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2704) Localization and log-aggregation will fail if hdfs delegation token expired after token-max-life-time

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186726#comment-14186726
 ] 

Hudson commented on YARN-2704:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #726 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/726/])
YARN-2704. Changed ResourceManager to optionally obtain tokens itself for the 
sake of localization and log-aggregation for long-running services. Contributed 
by Jian He. (vinodkv: rev a16d022ca4313a41425c8e97841c841a2d6f2f54)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalCacheDirectoryManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/DummyContainerManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/api/protocolrecords/TestProtocolRecords.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManagerRecovery.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/Context.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java


  Localization and log-aggregation will fail if hdfs delegation token expired 
 after token-max-life-time
 --

 Key: YARN-2704
 URL: https://issues.apache.org/jira/browse/YARN-2704
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Jian He
Assignee: Jian He
Priority: Critical
  

[jira] [Updated] (YARN-2760) Completely remove word 'experimental' from FairScheduler docs

2014-10-28 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated YARN-2760:
--
Attachment: YARN-2760.patch

Re-uploading patch to retry after the patching issue was fixed in buildbot.

 Completely remove word 'experimental' from FairScheduler docs
 -

 Key: YARN-2760
 URL: https://issues.apache.org/jira/browse/YARN-2760
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.1.0-beta
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Attachments: YARN-2760.patch, YARN-2760.patch


 After YARN-1034, FairScheduler has not been 'experimental' in any aspect of 
 use, but the doc change done in that did not entirely cover removal of that 
 word, leaving a remnant in the preemption sub-point. This needs to be removed 
 as well, as the feature has been good to use for a long time now, and is not 
 experimental.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2726) CapacityScheduler should explicitly log when an accessible label has no capacity

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186824#comment-14186824
 ] 

Hudson commented on YARN-2726:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1940 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1940/])
YARN-2726. CapacityScheduler should explicitly log when an accessible label has 
no capacity. Contributed by Wangda Tan (xgong: rev 
ce1a4419a6c938447a675c416567db56bf9cb29e)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java


 CapacityScheduler should explicitly log when an accessible label has no 
 capacity
 

 Key: YARN-2726
 URL: https://issues.apache.org/jira/browse/YARN-2726
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Phil D'Amore
Assignee: Wangda Tan
Priority: Minor
 Fix For: 2.6.0

 Attachments: YARN-2726-20141023-1.patch, YARN-2726-20141023-2.patch


 Given:
 - Node label defined: test-label
 - Two queues defined: a, b
 - label accessibility and capacity defined as follows (properties 
 abbreviated for readability):
 root.a.accessible-node-labels = test-label
 root.a.accessible-node-labels.test-label.capacity = 100
 If you restart the RM or do a 'rmadmin -refreshQueues' you will get a stack 
 trace with the following error buried within:
 Illegal capacity of -1.0 for label=test-label in queue=root.b
 This of course occurs because test-label is accessible to b due to 
 inheritance from the root, and -1 is the UNDEFINED value.  To my mind this 
 might not be obvious to the admin, and the error message which results does 
 not help guide someone to the source of the issue.
 I propose that this situation be updated so that when the capacity on an 
 accessible label is undefined, it is explicitly called out instead of falling 
 through to the illegal capacity check.  Something like:
 {code}
 if (capacity == UNDEFINED) {
   throw new IllegalArgumentException("Configuration issue: " + " label=" +
       label + " is accessible from queue=" + queue + " but has no capacity set.");
 }
 {code}
 I'll leave it to better judgement than mine as to whether I'm throwing the 
 appropriate exception there.  I think this check should be added to both 
 getNodeLabelCapacities and getMaximumNodeLabelCapacities in 
 CapacitySchedulerConfiguration.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2502) Changes in distributed shell to support specifying labels

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186819#comment-14186819
 ] 

Hudson commented on YARN-2502:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1940 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1940/])
YARN-2502. Changed DistributedShell to support node labels. Contributed by 
Wangda Tan (jianhe: rev f6b963fdfc517429149165e4bb6fb947be6e3c99)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java


 Changes in distributed shell to support specifying labels
 --

 Key: YARN-2502
 URL: https://issues.apache.org/jira/browse/YARN-2502
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2502-20141009.1.patch, YARN-2502-20141009.2.patch, 
 YARN-2502-20141013.1.patch, YARN-2502-20141017-1.patch, 
 YARN-2502-20141017-2.patch, YARN-2502-20141027-2.patch, YARN-2502.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2591) AHSWebServices should return FORBIDDEN(403) if the request user doesn't have access to the history data

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186829#comment-14186829
 ] 

Hudson commented on YARN-2591:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1940 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1940/])
YARN-2591. Fixed AHSWebServices to return FORBIDDEN(403) if the request user 
doesn't have access to the history data. Contributed by Zhijie Shen (jianhe: 
rev c05b581a5522eed499d3ba16af9fa6dc694563f6)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/AuthorizationException.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/WebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryManagerOnTimelineStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryManagerOnTimelineStore.java


 AHSWebServices should return FORBIDDEN(403) if the request user doesn't have 
 access to the history data
 ---

 Key: YARN-2591
 URL: https://issues.apache.org/jira/browse/YARN-2591
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: 3.0.0, 2.6.0
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.6.0

 Attachments: YARN-2591.1.patch, YARN-2591.2.patch


 AHSWebServices should return FORBIDDEN(403) if the request user doesn't have 
 access to the history data. Currently, it is going to return 
 INTERNAL_SERVER_ERROR(500).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2760) Completely remove word 'experimental' from FairScheduler docs

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14186841#comment-14186841
 ] 

Hadoop QA commented on YARN-2760:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677597/YARN-2760.patch
  against trunk revision c9bec46.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5604//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5604//console

This message is automatically generated.

 Completely remove word 'experimental' from FairScheduler docs
 -

 Key: YARN-2760
 URL: https://issues.apache.org/jira/browse/YARN-2760
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.1.0-beta
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Attachments: YARN-2760.patch, YARN-2760.patch


 After YARN-1034, FairScheduler has not been 'experimental' in any aspect of 
 use, but the doc change done in that did not entirely cover removal of that 
 word, leaving a remnant in the preemption sub-point. This needs to be removed 
 as well, as the feature has been good to use for a long time now, and is not 
 experimental.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2704) Localization and log-aggregation will fail if hdfs delegation token expired after token-max-life-time

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186825#comment-14186825
 ] 

Hudson commented on YARN-2704:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1940 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1940/])
YARN-2704. Changed ResourceManager to optionally obtain tokens itself for the 
sake of localization and log-aggregation for long-running services. Contributed 
by Jian He. (vinodkv: rev a16d022ca4313a41425c8e97841c841a2d6f2f54)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/api/protocolrecords/TestProtocolRecords.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalCacheDirectoryManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/Context.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManagerRecovery.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/DummyContainerManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java


  Localization and log-aggregation will fail if hdfs delegation token expired 
 after token-max-life-time
 --

 Key: YARN-2704
 URL: https://issues.apache.org/jira/browse/YARN-2704
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Jian He
Assignee: Jian He

[jira] [Commented] (YARN-2704) Localization and log-aggregation will fail if hdfs delegation token expired after token-max-life-time

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186887#comment-14186887
 ] 

Hudson commented on YARN-2704:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1915 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1915/])
YARN-2704. Changed ResourceManager to optionally obtain tokens itself for the 
sake of localization and log-aggregation for long-running services. Contributed 
by Jian He. (vinodkv: rev a16d022ca4313a41425c8e97841c841a2d6f2f54)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManagerRecovery.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/Context.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/DummyContainerManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/LogAggregationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestLocalCacheDirectoryManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/api/protocolrecords/TestProtocolRecords.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java


  Localization and log-aggregation will fail if hdfs delegation token expired 
 after token-max-life-time
 --

 Key: YARN-2704
 URL: https://issues.apache.org/jira/browse/YARN-2704
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Jian He
Assignee: Jian He
Priority: Critical

[jira] [Commented] (YARN-2726) CapacityScheduler should explicitly log when an accessible label has no capacity

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186886#comment-14186886
 ] 

Hudson commented on YARN-2726:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1915 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1915/])
YARN-2726. CapacityScheduler should explicitly log when an accessible label has 
no capacity. Contributed by Wangda Tan (xgong: rev 
ce1a4419a6c938447a675c416567db56bf9cb29e)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
* hadoop-yarn-project/CHANGES.txt


 CapacityScheduler should explicitly log when an accessible label has no 
 capacity
 

 Key: YARN-2726
 URL: https://issues.apache.org/jira/browse/YARN-2726
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Phil D'Amore
Assignee: Wangda Tan
Priority: Minor
 Fix For: 2.6.0

 Attachments: YARN-2726-20141023-1.patch, YARN-2726-20141023-2.patch


 Given:
 - Node label defined: test-label
 - Two queues defined: a, b
 label accessibility and capacity defined as follows (properties 
 abbreviated for readability):
 root.a.accessible-node-labels = test-label
 root.a.accessible-node-labels.test-label.capacity = 100
 If you restart the RM or do a 'rmadmin -refreshQueues' you will get a stack 
 trace with the following error buried within:
 Illegal capacity of -1.0 for label=test-label in queue=root.b
 This of course occurs because test-label is accessible to b due to 
 inheritance from the root, and -1 is the UNDEFINED value.  To my mind this 
 might not be obvious to the admin, and the error message which results does 
 not help guide someone to the source of the issue.
 I propose that this situation be updated so that when the capacity on an 
 accessible label is undefined, it is explicitly called out instead of falling 
 through to the illegal capacity check.  Something like:
 {code}
 if (capacity == UNDEFINED) {
   throw new IllegalArgumentException("Configuration issue: " + " label=" +
     label + " is accessible from queue=" + queue + " but has no capacity set.");
 }
 {code}
 I'll leave it to better judgement than mine as to whether I'm throwing the 
 appropriate exception there.  I think this check should be added to both 
 getNodeLabelCapacities and getMaximumNodeLabelCapacities in 
 CapacitySchedulerConfiguration.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2591) AHSWebServices should return FORBIDDEN(403) if the request user doesn't have access to the history data

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186892#comment-14186892
 ] 

Hudson commented on YARN-2591:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1915 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1915/])
YARN-2591. Fixed AHSWebServices to return FORBIDDEN(403) if the request user 
doesn't have access to the history data. Contributed by Zhijie Shen (jianhe: 
rev c05b581a5522eed499d3ba16af9fa6dc694563f6)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/WebServices.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryManagerOnTimelineStore.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/AuthorizationException.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryManagerOnTimelineStore.java


 AHSWebServices should return FORBIDDEN(403) if the request user doesn't have 
 access to the history data
 ---

 Key: YARN-2591
 URL: https://issues.apache.org/jira/browse/YARN-2591
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: 3.0.0, 2.6.0
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.6.0

 Attachments: YARN-2591.1.patch, YARN-2591.2.patch


 AHSWebServices should return FORBIDDEN(403) if the request user doesn't have 
 access to the history data. Currently, it is going to return 
 INTERNAL_SERVER_ERROR(500).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2502) Changes in distributed shell to support specify labels

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186881#comment-14186881
 ] 

Hudson commented on YARN-2502:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1915 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1915/])
YARN-2502. Changed DistributedShell to support node labels. Contributed by 
Wangda Tan (jianhe: rev f6b963fdfc517429149165e4bb6fb947be6e3c99)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java
* hadoop-yarn-project/CHANGES.txt


 Changes in distributed shell to support specify labels
 --

 Key: YARN-2502
 URL: https://issues.apache.org/jira/browse/YARN-2502
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2502-20141009.1.patch, YARN-2502-20141009.2.patch, 
 YARN-2502-20141013.1.patch, YARN-2502-20141017-1.patch, 
 YARN-2502-20141017-2.patch, YARN-2502-20141027-2.patch, YARN-2502.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2742) FairSchedulerConfiguration fails to parse if there is extra space between value and unit

2014-10-28 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-2742:
--
Assignee: Wei Yan

 FairSchedulerConfiguration fails to parse if there is extra space between 
 value and unit
 

 Key: YARN-2742
 URL: https://issues.apache.org/jira/browse/YARN-2742
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-2742-1.patch


 FairSchedulerConfiguration is very strict about the number of space 
 characters between the value and the unit: 0 or 1 space.
 For example, for values like the following:
 {noformat}
 <maxResources>4096  mb, 2 vcores</maxResources>
 {noformat}
 (note 2 spaces)
 The above line fails to parse:
 {noformat}
 2014-10-24 22:56:40,802 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService:
  Failed to reload fair scheduler config file - will use existing allocations.
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException:
  Missing resource: mb
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.findResource(FairSchedulerConfiguration.java:247)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.parseResourceConfigValue(FairSchedulerConfiguration.java:231)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:347)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:381)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:293)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService$1.run(AllocationFileLoaderService.java:117)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2742) FairSchedulerConfiguration fails to parse if there is extra space between value and unit

2014-10-28 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186983#comment-14186983
 ] 

Sangjin Lee commented on YARN-2742:
---

Thanks for the patch [~ywskycn]!

+1 in terms of using \s* to cover these cases.

I'm comfortable with also making the unit match case-insensitive, but I'd like 
to hear others' thoughts on that.
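
For illustration only (class and method names here are made up, not the actual 
FairSchedulerConfiguration code), a whitespace-tolerant and case-insensitive 
parse could look roughly like this:

{code}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResourceStringParser {
  public static int findResource(String value, String unit) {
    // \s* tolerates any number of spaces between the number and the unit;
    // CASE_INSENSITIVE accepts "MB", "mb", "Mb", etc.
    Pattern p = Pattern.compile("(\\d+)\\s*" + Pattern.quote(unit),
        Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(value);
    if (!m.find()) {
      throw new IllegalArgumentException("Missing resource: " + unit);
    }
    return Integer.parseInt(m.group(1));
  }

  public static void main(String[] args) {
    // Both lookups succeed despite the two spaces before "mb".
    System.out.println(findResource("4096  mb, 2 vcores", "mb"));     // 4096
    System.out.println(findResource("4096  mb, 2 VCores", "vcores")); // 2
  }
}
{code}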

 FairSchedulerConfiguration fails to parse if there is extra space between 
 value and unit
 

 Key: YARN-2742
 URL: https://issues.apache.org/jira/browse/YARN-2742
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-2742-1.patch


 FairSchedulerConfiguration is very strict about the number of space 
 characters between the value and the unit: 0 or 1 space.
 For example, for values like the following:
 {noformat}
 <maxResources>4096  mb, 2 vcores</maxResources>
 {noformat}
 (note 2 spaces)
 The above line fails to parse:
 {noformat}
 2014-10-24 22:56:40,802 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService:
  Failed to reload fair scheduler config file - will use existing allocations.
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException:
  Missing resource: mb
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.findResource(FairSchedulerConfiguration.java:247)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.parseResourceConfigValue(FairSchedulerConfiguration.java:231)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:347)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:381)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:293)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService$1.run(AllocationFileLoaderService.java:117)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2760) Completely remove word 'experimental' from FairScheduler docs

2014-10-28 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2760:
---
Target Version/s: 2.6.0

 Completely remove word 'experimental' from FairScheduler docs
 -

 Key: YARN-2760
 URL: https://issues.apache.org/jira/browse/YARN-2760
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.1.0-beta
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Attachments: YARN-2760.patch, YARN-2760.patch


 After YARN-1034, FairScheduler has not been 'experimental' in any aspect of 
 use, but the doc change done in that did not entirely cover removal of that 
 word, leaving a remnant in the preemption sub-point. This needs to be removed 
 as well, as the feature has been good to use for a long time now, and is not 
 experimental.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN

2014-10-28 Thread Abin Shahab (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abin Shahab updated YARN-1964:
--
Attachment: YARN-1964.patch

Patch with merge conflicts fixed. Docs pointing back to 
sequenceiq/hadoop-docker:2.4.1

 Create Docker analog of the LinuxContainerExecutor in YARN
 --

 Key: YARN-1964
 URL: https://issues.apache.org/jira/browse/YARN-1964
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.2.0
Reporter: Arun C Murthy
Assignee: Abin Shahab
 Attachments: YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, 
 YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, 
 yarn-1964-branch-2.2.0-docker.patch, yarn-1964-branch-2.2.0-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch, yarn-1964-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch


 Docker (https://www.docker.io/) is, increasingly, a very popular container 
 technology.
 In context of YARN, the support for Docker will provide a very elegant 
 solution to allow applications to *package* their software into a Docker 
 container (entire Linux file system incl. custom versions of perl, python 
 etc.) and use it as a blueprint to launch all their YARN containers with 
 requisite software environment. This provides both consistency (all YARN 
 containers will have the same software environment) and isolation (no 
 interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2755) NM fails to clean up usercache_DEL_timestamp dirs after YARN-661

2014-10-28 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186991#comment-14186991
 ] 

Sangjin Lee commented on YARN-2755:
---

[~l201514], the patch looks good. It might be good to add a small unit test to 
demonstrate the bug/fix.

 NM fails to clean up usercache_DEL_timestamp dirs after YARN-661
 --

 Key: YARN-2755
 URL: https://issues.apache.org/jira/browse/YARN-2755
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
Priority: Critical
 Attachments: YARN-2755.v1.patch


 When the NM restarts frequently for some reason, a large number of directories 
 like these are left in /data/disk$num/yarn/local/:
 /data/disk1/yarn/local/usercache_DEL_1414372756105
 /data/disk1/yarn/local/usercache_DEL_1413557901696
 /data/disk1/yarn/local/usercache_DEL_1413657004894
 /data/disk1/yarn/local/usercache_DEL_1413675321860
 /data/disk1/yarn/local/usercache_DEL_1414093167936
 /data/disk1/yarn/local/usercache_DEL_1413565841271
 These directories are empty, but they take up 100M+ of space because of their 
 sheer number: there were 38714 per data disk on the machine I looked at.
 It appears to be a regression introduced by YARN-661



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187014#comment-14187014
 ] 

Hadoop QA commented on YARN-1964:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677620/YARN-1964.patch
  against trunk revision 58c0bb9.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5605//console

This message is automatically generated.

 Create Docker analog of the LinuxContainerExecutor in YARN
 --

 Key: YARN-1964
 URL: https://issues.apache.org/jira/browse/YARN-1964
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.2.0
Reporter: Arun C Murthy
Assignee: Abin Shahab
 Attachments: YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, 
 YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, 
 yarn-1964-branch-2.2.0-docker.patch, yarn-1964-branch-2.2.0-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch, yarn-1964-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch


 Docker (https://www.docker.io/) is, increasingly, a very popular container 
 technology.
 In context of YARN, the support for Docker will provide a very elegant 
 solution to allow applications to *package* their software into a Docker 
 container (entire Linux file system incl. custom versions of perl, python 
 etc.) and use it as a blueprint to launch all their YARN containers with 
 requisite software environment. This provides both consistency (all YARN 
 containers will have the same software environment) and isolation (no 
 interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2764) counters.LimitExceededException shouldn't abort AsyncDispatcher

2014-10-28 Thread Ted Yu (JIRA)
Ted Yu created YARN-2764:


 Summary: counters.LimitExceededException shouldn't abort 
AsyncDispatcher
 Key: YARN-2764
 URL: https://issues.apache.org/jira/browse/YARN-2764
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.5.1
Reporter: Ted Yu


I saw the following in container log:
{code}
2014-10-25 10:28:55,052 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with 
attemptattempt_1414221548789_0023_r_03_0
2014-10-25 10:28:55,052 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
task_1414221548789_0023_r_03 Task Transitioned from RUNNING to SUCCEEDED
2014-10-25 10:28:55,052 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 24
2014-10-25 10:28:55,053 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1414221548789_0023Job 
Transitioned from RUNNING to COMMITTING
2014-10-25 10:28:55,054 INFO [CommitterEvent Processor #1] 
org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the 
event EventType: JOB_COMMIT
2014-10-25 10:28:55,177 FATAL [AsyncDispatcher event handler] 
org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 
121 max=120
  at org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:101)
  at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:108)
  at 
org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounter(AbstractCounterGroup.java:78)
  at 
org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:95)
  at 
org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:106)
  at 
org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.incrAllCounters(AbstractCounterGroup.java:203)
  at 
org.apache.hadoop.mapreduce.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:348)
  at 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.constructFinalFullcounters(JobImpl.java:1754)
  at 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.mayBeConstructFinalFullCounters(JobImpl.java:1737)
  at 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.createJobFinishedEvent(JobImpl.java:1718)
  at 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.logJobHistoryFinishedEvent(JobImpl.java:1089)
  at 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$CommitSucceededTransition.transition(JobImpl.java:2049)
  at 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$CommitSucceededTransition.transition(JobImpl.java:2045)
  at 
org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
  at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
  at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
  at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
  at 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
  at 
org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
  at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1289)
  at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1285)
  at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
  at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
  at java.lang.Thread.run(Thread.java:745)
2014-10-25 10:28:55,185 INFO [AsyncDispatcher event handler] 
org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
{code}
The counter limit was exceeded when the JobFinishedEvent was created.
Better handling of LimitExceededException should be provided so that the 
AsyncDispatcher can continue functioning.
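
One possible direction, sketched only (the call below mirrors the method named 
in the stack trace; the surrounding structure is an assumption, not a patch):

{code}
// Catch the limit violation where the job-finished counters are assembled,
// so it never propagates into the AsyncDispatcher thread and kills it.
try {
  constructFinalFullcounters();
} catch (LimitExceededException e) {
  LOG.warn("Too many counters while building the JobFinishedEvent; "
      + "continuing with the counters collected so far", e);
}
{code}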



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2742) FairSchedulerConfiguration fails to parse if there is extra space between value and unit

2014-10-28 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187042#comment-14187042
 ] 

Wei Yan commented on YARN-2742:
---

Thanks for the comments, [~sjlee0]. Any comments, [~kasha]?

 FairSchedulerConfiguration fails to parse if there is extra space between 
 value and unit
 

 Key: YARN-2742
 URL: https://issues.apache.org/jira/browse/YARN-2742
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-2742-1.patch


 FairSchedulerConfiguration is very strict about the number of space 
 characters between the value and the unit: 0 or 1 space.
 For example, for values like the following:
 {noformat}
 <maxResources>4096  mb, 2 vcores</maxResources>
 {noformat}
 (note 2 spaces)
 The above line fails to parse:
 {noformat}
 2014-10-24 22:56:40,802 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService:
  Failed to reload fair scheduler config file - will use existing allocations.
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException:
  Missing resource: mb
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.findResource(FairSchedulerConfiguration.java:247)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.parseResourceConfigValue(FairSchedulerConfiguration.java:231)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:347)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:381)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:293)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService$1.run(AllocationFileLoaderService.java:117)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2761) potential race condition in SchedulingPolicy

2014-10-28 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187058#comment-14187058
 ] 

Karthik Kambatla commented on YARN-2761:


I am all for fixing this, but the race itself shouldn't be a big deal. The 
worst thing that can happen is that multiple threads create multiple 
instances of the policy; only one of them eventually goes into the map, and the 
rest get GCed eventually. [~zhiguohong] - do you agree or am I missing 
something? 
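
For reference, one lock-free way to close that window (a sketch only, not a 
reviewed patch) is ConcurrentHashMap#putIfAbsent:

{code}
public static SchedulingPolicy getInstance(
    Class<? extends SchedulingPolicy> clazz) {
  SchedulingPolicy policy = instances.get(clazz);
  if (policy == null) {
    SchedulingPolicy created = ReflectionUtils.newInstance(clazz, null);
    // putIfAbsent makes the check-then-put atomic; if another thread won the
    // race, keep its instance and let ours be garbage collected.
    policy = instances.putIfAbsent(clazz, created);
    if (policy == null) {
      policy = created;
    }
  }
  return policy;
}
{code}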

 potential race condition in SchedulingPolicy
 

 Key: YARN-2761
 URL: https://issues.apache.org/jira/browse/YARN-2761
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Hong Zhiguo
Assignee: Hong Zhiguo
Priority: Minor

 Reported by findbugs. 
 In SchedulingPolicy.getInstance, ConcurrentHashMap.get and 
 ConcurrentHashMap.put are called. These two operations together should be 
 atomic, but using ConcurrentHashMap alone doesn't guarantee this. 
 {code} 
 public static SchedulingPolicy getInstance(
     Class<? extends SchedulingPolicy> clazz) {
   SchedulingPolicy policy = instances.get(clazz);
   if (policy == null) {
     policy = ReflectionUtils.newInstance(clazz, null);
     instances.put(clazz, policy);
   }
   return policy;
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2760) Completely remove word 'experimental' from FairScheduler docs

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187061#comment-14187061
 ] 

Hudson commented on YARN-2760:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6368 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6368/])
YARN-2760. Remove 'experimental' from FairScheduler docs. (Harsh J via kasha) 
(kasha: rev ade3727ecb092935dcc0f1291c1e6cf43d764a03)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm
* hadoop-yarn-project/CHANGES.txt


 Completely remove word 'experimental' from FairScheduler docs
 -

 Key: YARN-2760
 URL: https://issues.apache.org/jira/browse/YARN-2760
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.1.0-beta
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Fix For: 2.6.0

 Attachments: YARN-2760.patch, YARN-2760.patch


 After YARN-1034, FairScheduler has not been 'experimental' in any aspect of 
 use, but the doc change done in that did not entirely cover removal of that 
 word, leaving a remnant in the preemption sub-point. This needs to be removed 
 as well, as the feature has been good to use for a long time now, and is not 
 experimental.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2761) potential race condition in SchedulingPolicy

2014-10-28 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187073#comment-14187073
 ] 

Wei Yan commented on YARN-2761:
---

Good catch, [~zhiguohong].

 potential race condition in SchedulingPolicy
 

 Key: YARN-2761
 URL: https://issues.apache.org/jira/browse/YARN-2761
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Hong Zhiguo
Assignee: Hong Zhiguo
Priority: Minor

 Reported by findbugs. 
 In SchedulingPolicy.getInstance, ConcurrentHashMap.get and 
 ConcurrentHashMap.put are called. These two operations together should be 
 atomic, but using ConcurrentHashMap alone doesn't guarantee this. 
 {code} 
 public static SchedulingPolicy getInstance(
     Class<? extends SchedulingPolicy> clazz) {
   SchedulingPolicy policy = instances.get(clazz);
   if (policy == null) {
     policy = ReflectionUtils.newInstance(clazz, null);
     instances.put(clazz, policy);
   }
   return policy;
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN

2014-10-28 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187081#comment-14187081
 ] 

Abin Shahab commented on YARN-1964:
---

This says it passed:
https://builds.apache.org/job/PreCommit-YARN-Build/5605/artifact/patchprocess/trunkJavacWarnings.txt/*view*/

Is there another patch that's making it fail? 
https://issues.apache.org/jira/browse/HADOOP-10926 ?

 Create Docker analog of the LinuxContainerExecutor in YARN
 --

 Key: YARN-1964
 URL: https://issues.apache.org/jira/browse/YARN-1964
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.2.0
Reporter: Arun C Murthy
Assignee: Abin Shahab
 Attachments: YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, 
 YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, 
 yarn-1964-branch-2.2.0-docker.patch, yarn-1964-branch-2.2.0-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch, yarn-1964-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch


 Docker (https://www.docker.io/) is, increasingly, a very popular container 
 technology.
 In context of YARN, the support for Docker will provide a very elegant 
 solution to allow applications to *package* their software into a Docker 
 container (entire Linux file system incl. custom versions of perl, python 
 etc.) and use it as a blueprint to launch all their YARN containers with 
 requisite software environment. This provides both consistency (all YARN 
 containers will have the same software environment) and isolation (no 
 interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager

2014-10-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2753:
-
Summary: Fix potential issues and code clean up for *NodeLabelsManager  
(was: Shouldn't change the value in labelCollections if the key already exists 
and potential NPE at CommonNodeLabelsManager.)

 Fix potential issues and code clean up for *NodeLabelsManager
 -

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch


 CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in 
 labelCollections if the key already exists; otherwise the Label.resource will 
 be changed (reset).
 There is a potential NPE (NullPointerException) in checkRemoveLabelsFromNode of 
 CommonNodeLabelsManager, because when a Node is created, Node.labels can be 
 null. In this case, nm.labels may be null, so we need to check that 
 originalLabels is not null before using it (originalLabels.containsAll).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager

2014-10-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2753:
-
Description: 
Issues include:

* CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in 
labelCollections if the key already exists; otherwise the Label.resource will be 
changed (reset).
* potential NPE (NullPointerException) in checkRemoveLabelsFromNode of 
CommonNodeLabelsManager.
** When a Node is created, Node.labels can be null.
** In this case, nm.labels may be null, so we need to check that originalLabels 
is not null before using it (originalLabels.containsAll).

  was:
CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in 
labelCollections if the key already exists otherwise the Label.resource will be 
changed(reset).

potential NPE(NullPointerException) in checkRemoveLabelsFromNode of 
CommonNodeLabelsManager.
because when a Node is created, Node.labels can be null.
In this case, nm.labels; may be null.
So we need check originalLabels not null before use 
it(originalLabels.containsAll).


 Fix potential issues and code clean up for *NodeLabelsManager
 -

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch


 Issues include:
 * CommonNodeLabelsManager#addToCluserNodeLabels should not change the value 
 in labelCollections if the key already exists; otherwise the Label.resource 
 will be changed (reset).
 * potential NPE (NullPointerException) in checkRemoveLabelsFromNode of 
 CommonNodeLabelsManager.
 ** When a Node is created, Node.labels can be null.
 ** In this case, nm.labels may be null, so we need to check that originalLabels 
 is not null before using it (originalLabels.containsAll).
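
 As a rough illustration of the second point (nm and originalLabels follow the 
 text above; labelsToRemove and nodeId are made-up names, and this is not the 
 attached patch):
 {code}
 // In checkRemoveLabelsFromNode: nm.labels may be null for a freshly created
 // Node, so guard before calling containsAll on it.
 Set<String> originalLabels = nm.labels;
 if (originalLabels == null || !originalLabels.containsAll(labelsToRemove)) {
   throw new IOException("Node " + nodeId
       + " does not carry all of the labels being removed");
 }
 {code}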



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2754) addToCluserNodeLabels should be protected by writeLock in RMNodeLabelsManager.java.

2014-10-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2754:
-
Issue Type: Bug  (was: Sub-task)
Parent: (was: YARN-2492)

 addToCluserNodeLabels should be protected by writeLock in 
 RMNodeLabelsManager.java.
 ---

 Key: YARN-2754
 URL: https://issues.apache.org/jira/browse/YARN-2754
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2754.000.patch


 addToCluserNodeLabels should be protected by writeLock in 
 RMNodeLabelsManager.java, because we should protect labelCollections in 
 RMNodeLabelsManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager

2014-10-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187114#comment-14187114
 ] 

Wangda Tan commented on YARN-2753:
--

[~zxu],
I've changed the title and description. I suggest merging the other 
RMNodeLabelsManager fixes here as well, since they belong to the same module, 
and closing the others as duplicates. Please note such issues here, which will 
make it easier to review and commit this quickly.

Thanks,
Wangda

 Fix potential issues and code clean up for *NodeLabelsManager
 -

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch


 Issues include:
 * CommonNodeLabelsManager#addToCluserNodeLabels should not change the value 
 in labelCollections if the key already exists; otherwise the Label.resource 
 will be changed (reset).
 * potential NPE (NullPointerException) in checkRemoveLabelsFromNode of 
 CommonNodeLabelsManager.
 ** When a Node is created, Node.labels can be null.
 ** In this case, nm.labels may be null, so we need to check that originalLabels 
 is not null before using it (originalLabels.containsAll).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2757) potential NPE in checkNodeLabelExpression of SchedulerUtils for nodeLabels.

2014-10-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2757:
-
Priority: Minor  (was: Major)

 potential NPE in checkNodeLabelExpression of SchedulerUtils for nodeLabels.
 ---

 Key: YARN-2757
 URL: https://issues.apache.org/jira/browse/YARN-2757
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor
 Attachments: YARN-2757.000.patch


 potential NPE in checkNodeLabelExpression of SchedulerUtils for nodeLabels, 
 since we check whether nodeLabels is null at 
 {code}
 if (!str.trim().isEmpty()
     && (nodeLabels == null || !nodeLabels.contains(str.trim()))) {
   return false;
 }
 {code}
 We should also check that nodeLabels is not null at 
 {code}
 if (!nodeLabels.isEmpty()) {
   return false;
 }
 {code}
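
 Combined, a null-safe version of that second check might read (illustrative 
 only, not the attached patch):
 {code}
 // Treat a null label set on the node like an empty one, so the node is
 // only rejected when it really carries labels.
 if (nodeLabels != null && !nodeLabels.isEmpty()) {
   return false;
 }
 {code}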



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2757) potential NPE in checkNodeLabelExpression of SchedulerUtils for nodeLabels.

2014-10-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187131#comment-14187131
 ] 

Wangda Tan commented on YARN-2757:
--

[~zxu], this method is invoked by RMNodeLabelsManager#getLabelsOnNode, and that 
method will never return null, so I changed the priority of this issue to minor; 
please raise it if you don't agree. 

 potential NPE in checkNodeLabelExpression of SchedulerUtils for nodeLabels.
 ---

 Key: YARN-2757
 URL: https://issues.apache.org/jira/browse/YARN-2757
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor
 Attachments: YARN-2757.000.patch


 potential NPE in checkNodeLabelExpression of SchedulerUtils for nodeLabels, 
 since we check whether nodeLabels is null at 
 {code}
 if (!str.trim().isEmpty()
     && (nodeLabels == null || !nodeLabels.contains(str.trim()))) {
   return false;
 }
 {code}
 We should also check that nodeLabels is not null at 
 {code}
 if (!nodeLabels.isEmpty()) {
   return false;
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2279) Add UTs to cover timeline server authentication

2014-10-28 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187139#comment-14187139
 ] 

Xuan Gong commented on YARN-2279:
-

+1 Looks good. Will commit it after Jenkins gives +1.

 Add UTs to cover timeline server authentication
 ---

 Key: YARN-2279
 URL: https://issues.apache.org/jira/browse/YARN-2279
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
  Labels: test
 Attachments: YARN-2279.1.patch


 Currently, timeline server authentication is lacking unit tests. We have to 
 verify each incremental patch manually. It's good to add some unit tests here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2279) Add UTs to cover timeline server authentication

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187190#comment-14187190
 ] 

Hadoop QA commented on YARN-2279:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12676042/YARN-2279.1.patch
  against trunk revision ade3727.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5606//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5606//console

This message is automatically generated.

 Add UTs to cover timeline server authentication
 ---

 Key: YARN-2279
 URL: https://issues.apache.org/jira/browse/YARN-2279
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
  Labels: test
 Attachments: YARN-2279.1.patch


 Currently, timeline server authentication is lacking unit tests. We have to 
 verify each incremental patch manually. It's good to add some unit tests here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1813) Better error message for yarn logs when permission denied

2014-10-28 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-1813:
-
Attachment: YARN-1813.6.patch

Thanks for your review, Rohith! Updated:

1. Changed the log message to "Permission denied. : /path/to/dir" (a rough 
sketch of the idea is shown below).
2. Removed a needless change in AggregatedLogsBlock.
3. Updated log message in {{logDirNotExist}}.
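
A rough sketch of the idea behind change 1 (the exception type and variable 
names are assumptions, not the actual patch):

{code}
// If listing the aggregated-log directory fails due to permissions, report
// that directly instead of the generic log-aggregation message.
try {
  nodeFiles = fs.listStatus(remoteAppLogDir);
} catch (AccessControlException e) {
  System.err.println("Permission denied. : " + remoteAppLogDir);
  return -1;
}
{code}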


 Better error message for yarn logs when permission denied
 ---

 Key: YARN-1813
 URL: https://issues.apache.org/jira/browse/YARN-1813
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.3.0, 2.4.1, 2.5.1
Reporter: Andrew Wang
Assignee: Tsuyoshi OZAWA
Priority: Minor
 Attachments: YARN-1813.1.patch, YARN-1813.2.patch, YARN-1813.2.patch, 
 YARN-1813.3.patch, YARN-1813.4.patch, YARN-1813.5.patch, YARN-1813.6.patch


 I ran some MR jobs as the hdfs user, and then forgot to sudo -u when 
 grabbing the logs. yarn logs prints an error message like the following:
 {noformat}
 [andrew.wang@a2402 ~]$ yarn logs -applicationId application_1394482121761_0010
 14/03/10 16:05:10 INFO client.RMProxy: Connecting to ResourceManager at 
 a2402.halxg.cloudera.com/10.20.212.10:8032
 Logs not available at 
 /tmp/logs/andrew.wang/logs/application_1394482121761_0010
 Log aggregation has not completed or is not enabled.
 {noformat}
 It'd be nicer if it said Permission denied or AccessControlException or 
 something like that instead, since that's the real issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN

2014-10-28 Thread sidharta seethana (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187219#comment-14187219
 ] 

sidharta seethana commented on YARN-1964:
-

This commit in branch-2.6, https://github.com/apache/hadoop/commit/29d0164e, 
changed the signature of an abstract function in ContainerExecutor. It 
looks like your latest patch fixes this, though. We'll take a look, thanks.
 

 Create Docker analog of the LinuxContainerExecutor in YARN
 --

 Key: YARN-1964
 URL: https://issues.apache.org/jira/browse/YARN-1964
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.2.0
Reporter: Arun C Murthy
Assignee: Abin Shahab
 Attachments: YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, 
 YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, 
 yarn-1964-branch-2.2.0-docker.patch, yarn-1964-branch-2.2.0-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch, yarn-1964-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch


 Docker (https://www.docker.io/) is, increasingly, a very popular container 
 technology.
 In context of YARN, the support for Docker will provide a very elegant 
 solution to allow applications to *package* their software into a Docker 
 container (entire Linux file system incl. custom versions of perl, python 
 etc.) and use it as a blueprint to launch all their YARN containers with 
 requisite software environment. This provides both consistency (all YARN 
 containers will have the same software environment) and isolation (no 
 interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2279) Add UTs to cover timeline server authentication

2014-10-28 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187228#comment-14187228
 ] 

Xuan Gong commented on YARN-2279:
-

Committed to trunk/branch-2/branch-2.6. Thanks, Zhijie!

 Add UTs to cover timeline server authentication
 ---

 Key: YARN-2279
 URL: https://issues.apache.org/jira/browse/YARN-2279
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
  Labels: test
 Attachments: YARN-2279.1.patch


 Currently, timeline server authentication is lacking unit tests. We have to 
 verify each incremental patch manually. It's good to add some unit tests here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2279) Add UTs to cover timeline server authentication

2014-10-28 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-2279:

Fix Version/s: 2.6.0

 Add UTs to cover timeline server authentication
 ---

 Key: YARN-2279
 URL: https://issues.apache.org/jira/browse/YARN-2279
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
  Labels: test
 Fix For: 2.6.0

 Attachments: YARN-2279.1.patch


 Currently, timeline server authentication is lacking unit tests. We have to 
 verify each incremental patch manually. It's good to add some unit tests here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2737) Misleading msg in LogCLI when app is not successfully submitted

2014-10-28 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187253#comment-14187253
 ] 

Tsuyoshi OZAWA commented on YARN-2737:
--

[~jianhe], [~rohithsharma] reviewed a patch on YARN-1813. It includes the 
comment about this issue. Could you take a look?

 Misleading msg in LogCLI when app is not successfully submitted 
 

 Key: YARN-2737
 URL: https://issues.apache.org/jira/browse/YARN-2737
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, resourcemanager
Reporter: Jian He
Assignee: Rohith

 {{LogCLiHelpers#logDirNotExist}} prints the message {{Log aggregation has not 
 completed or is not enabled.}} if the app log file doesn't exist. This is 
 misleading when the application was not submitted successfully; clearly, 
 we won't have logs for such an application. 
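
 A minimal sketch of the kind of wording change being discussed (illustrative 
 only; the method shape and class name below are assumptions, not the actual 
 patch):
 {code}
public class LogDirMessageSketch {
  // Hypothetical variant of a logDirNotExist-style helper: instead of claiming
  // that log aggregation is the problem, list the possible causes explicitly.
  static void logDirNotExist(String remoteAppLogDir) {
    System.out.println(remoteAppLogDir + " does not exist.");
    System.out.println("Possible causes: log aggregation has not completed, "
        + "log aggregation is not enabled, or the application was never "
        + "submitted successfully.");
  }

  public static void main(String[] args) {
    logDirNotExist("/tmp/logs/andrew.wang/logs/application_1394482121761_0010");
  }
}
 {code}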



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2742) FairSchedulerConfiguration fails to parse if there is extra space between value and unit

2014-10-28 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187264#comment-14187264
 ] 

Tsuyoshi OZAWA commented on YARN-2742:
--

[~ywskycn], thanks for the contribution. How about adding a test case that 
includes a trailing space, like " 1024 mb, 4 core"?

 FairSchedulerConfiguration fails to parse if there is extra space between 
 value and unit
 

 Key: YARN-2742
 URL: https://issues.apache.org/jira/browse/YARN-2742
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-2742-1.patch


 FairSchedulerConfiguration is very strict about the number of space 
 characters between the value and the unit: 0 or 1 space.
 For example, for values like the following:
 {noformat}
 <maxResources>4096  mb, 2 vcores</maxResources>
 {noformat}
 (note the two spaces)
 The line above fails to parse:
 {noformat}
 2014-10-24 22:56:40,802 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService:
  Failed to reload fair scheduler config file - will use existing allocations.
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException:
  Missing resource: mb
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.findResource(FairSchedulerConfiguration.java:247)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.parseResourceConfigValue(FairSchedulerConfiguration.java:231)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:347)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:381)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:293)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService$1.run(AllocationFileLoaderService.java:117)
 {noformat}
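
 A whitespace-tolerant parse is straightforward with a regex. The sketch below 
 is illustrative only; ResourceValueParser and its behaviour are assumptions, 
 not the actual FairSchedulerConfiguration code:
 {code}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResourceValueParser {
  // Accept any amount of whitespace between the numeric value and its unit,
  // so "4096  mb", "4096 mb", and "4096mb" all parse the same way.
  private static final Pattern RESOURCE =
      Pattern.compile("(\\d+)\\s*(mb|vcores)");

  static int parse(String value, String unit) {
    Matcher m = RESOURCE.matcher(value.toLowerCase());
    while (m.find()) {
      if (m.group(2).equals(unit)) {
        return Integer.parseInt(m.group(1));
      }
    }
    throw new IllegalArgumentException("Missing resource: " + unit);
  }

  public static void main(String[] args) {
    System.out.println(parse("4096  mb, 2 vcores", "mb"));     // 4096
    System.out.println(parse("4096  mb, 2 vcores", "vcores")); // 2
  }
}
 {code}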



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1813) Better error message for yarn logs when permission denied

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187293#comment-14187293
 ] 

Hadoop QA commented on YARN-1813:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677648/YARN-1813.6.patch
  against trunk revision 0d3e7e2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5607//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5607//console

This message is automatically generated.

 Better error message for yarn logs when permission denied
 ---

 Key: YARN-1813
 URL: https://issues.apache.org/jira/browse/YARN-1813
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.3.0, 2.4.1, 2.5.1
Reporter: Andrew Wang
Assignee: Tsuyoshi OZAWA
Priority: Minor
 Attachments: YARN-1813.1.patch, YARN-1813.2.patch, YARN-1813.2.patch, 
 YARN-1813.3.patch, YARN-1813.4.patch, YARN-1813.5.patch, YARN-1813.6.patch


 I ran some MR jobs as the hdfs user, and then forgot to sudo -u when 
 grabbing the logs. yarn logs prints an error message like the following:
 {noformat}
 [andrew.wang@a2402 ~]$ yarn logs -applicationId application_1394482121761_0010
 14/03/10 16:05:10 INFO client.RMProxy: Connecting to ResourceManager at 
 a2402.halxg.cloudera.com/10.20.212.10:8032
 Logs not available at 
 /tmp/logs/andrew.wang/logs/application_1394482121761_0010
 Log aggregation has not completed or is not enabled.
 {noformat}
 It'd be nicer if it said Permission denied or AccessControlException or 
 something like that instead, since that's the real issue.
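
 For illustration, a hedged sketch of how the CLI could surface the underlying 
 access error instead of the generic message (class and method names here are 
 assumed, not the actual YARN-1813 patch; it relies on Hadoop's 
 AccessControlException being an IOException):
 {code}
import java.io.IOException;
import org.apache.hadoop.security.AccessControlException;

public class LogFetchErrorReporting {
  // Sketch: distinguish a permissions failure from genuinely missing logs so
  // the user sees the real cause ("Permission denied") rather than the
  // generic log-aggregation message.
  static void reportFetchError(String remoteAppLogDir, IOException cause) {
    if (cause instanceof AccessControlException) {
      System.err.println("Permission denied while reading " + remoteAppLogDir
          + ": " + cause.getMessage());
    } else {
      System.err.println("Logs not available at " + remoteAppLogDir);
      System.err.println("Log aggregation has not completed or is not enabled.");
    }
  }

  public static void main(String[] args) {
    reportFetchError("/tmp/logs/andrew.wang/logs/application_1394482121761_0010",
        new AccessControlException("Permission denied: user=andrew.wang"));
  }
}
 {code}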



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2758) Update TestApplicationHistoryClientService to use the new generic history store

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187295#comment-14187295
 ] 

Hadoop QA commented on YARN-2758:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677493/YARN-2758.1.patch
  against trunk revision 0d3e7e2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5608//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5608//console

This message is automatically generated.

 Update TestApplicationHistoryClientService to use the new generic history 
 store
 ---

 Key: YARN-2758
 URL: https://issues.apache.org/jira/browse/YARN-2758
 Project: Hadoop YARN
  Issue Type: Test
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2758.1.patch


 TestApplicationHistoryClientService is still testing against the mock data in 
 the old MemoryApplicationHistoryStore; hence it needs to be updated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2742) FairSchedulerConfiguration fails to parse if there is extra space between value and unit

2014-10-28 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-2742:
--
Attachment: YARN-2742-2.patch

Thanks, [~ozawa]. Updated the patch to add that test case.

 FairSchedulerConfiguration fails to parse if there is extra space between 
 value and unit
 

 Key: YARN-2742
 URL: https://issues.apache.org/jira/browse/YARN-2742
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-2742-1.patch, YARN-2742-2.patch


 FairSchedulerConfiguration is very strict about the number of space 
 characters between the value and the unit: 0 or 1 space.
 For example, for values like the following:
 {noformat}
 <maxResources>4096  mb, 2 vcores</maxResources>
 {noformat}
 (note the two spaces)
 The line above fails to parse:
 {noformat}
 2014-10-24 22:56:40,802 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService:
  Failed to reload fair scheduler config file - will use existing allocations.
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException:
  Missing resource: mb
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.findResource(FairSchedulerConfiguration.java:247)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.parseResourceConfigValue(FairSchedulerConfiguration.java:231)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:347)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:381)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:293)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService$1.run(AllocationFileLoaderService.java:117)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2755) NM fails to clean up usercache_DEL_timestamp dirs after YARN-661

2014-10-28 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-2755:
--
Attachment: YARN-2755.v2.patch

 NM fails to clean up usercache_DEL_timestamp dirs after YARN-661
 --

 Key: YARN-2755
 URL: https://issues.apache.org/jira/browse/YARN-2755
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
Priority: Critical
 Attachments: YARN-2755.v1.patch, YARN-2755.v2.patch


 When the NM restarts frequently for some reason, a large number of directories 
 like these are left in /data/disk$num/yarn/local/:
 /data/disk1/yarn/local/usercache_DEL_1414372756105
 /data/disk1/yarn/local/usercache_DEL_1413557901696
 /data/disk1/yarn/local/usercache_DEL_1413657004894
 /data/disk1/yarn/local/usercache_DEL_1413675321860
 /data/disk1/yarn/local/usercache_DEL_1414093167936
 /data/disk1/yarn/local/usercache_DEL_1413565841271
 These directories are empty, but take up 100M+ due to their sheer number; 
 there were 38714 per data disk on the machine I looked at.
 It appears to be a regression introduced by YARN-661.
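
 As a rough illustration of the cleanup expected here (a standalone sketch, not 
 the NodeManager's deletion-service code; the directory layout follows the 
 example above):
 {code}
import java.io.File;

public class StaleUsercacheCleaner {
  // Remove empty usercache_DEL_<timestamp> directories left behind in a
  // NodeManager local dir. Purely illustrative of the expected behaviour.
  static void clean(File localDir) {
    File[] stale = localDir.listFiles(
        (dir, name) -> name.startsWith("usercache_DEL_"));
    if (stale == null) {
      return;
    }
    for (File dir : stale) {
      String[] contents = dir.list();
      if (contents != null && contents.length == 0 && dir.delete()) {
        System.out.println("Removed stale directory " + dir);
      }
    }
  }

  public static void main(String[] args) {
    clean(new File("/data/disk1/yarn/local"));
  }
}
 {code}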



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2755) NM fails to clean up usercache_DEL_timestamp dirs after YARN-661

2014-10-28 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-2755:
--
Attachment: YARN-2755.v3.patch

 NM fails to clean up usercache_DEL_timestamp dirs after YARN-661
 --

 Key: YARN-2755
 URL: https://issues.apache.org/jira/browse/YARN-2755
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
Priority: Critical
 Attachments: YARN-2755.v1.patch, YARN-2755.v2.patch, 
 YARN-2755.v3.patch


 When the NM restarts frequently for some reason, a large number of directories 
 like these are left in /data/disk$num/yarn/local/:
 /data/disk1/yarn/local/usercache_DEL_1414372756105
 /data/disk1/yarn/local/usercache_DEL_1413557901696
 /data/disk1/yarn/local/usercache_DEL_1413657004894
 /data/disk1/yarn/local/usercache_DEL_1413675321860
 /data/disk1/yarn/local/usercache_DEL_1414093167936
 /data/disk1/yarn/local/usercache_DEL_1413565841271
 These directories are empty, but take up 100M+ due to their sheer number; 
 there were 38714 per data disk on the machine I looked at.
 It appears to be a regression introduced by YARN-661.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2758) Update TestApplicationHistoryClientService to use the new generic history store

2014-10-28 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187387#comment-14187387
 ] 

Xuan Gong commented on YARN-2758:
-

+1 Looks good to me. Will commit

 Update TestApplicationHistoryClientService to use the new generic history 
 store
 ---

 Key: YARN-2758
 URL: https://issues.apache.org/jira/browse/YARN-2758
 Project: Hadoop YARN
  Issue Type: Test
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2758.1.patch


 TestApplicationHistoryClientService is still testing against the mock data in 
 the old MemoryApplicationHistoryStore; hence it needs to be updated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2758) Update TestApplicationHistoryClientService to use the new generic history store

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187401#comment-14187401
 ] 

Hudson commented on YARN-2758:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6372 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6372/])
YARN-2758. Update TestApplicationHistoryClientService to use the new generic 
history store. Contributed by Zhijie Shen (xgong: rev 
69f79bee8b3da07bf42e22e35e58c7719782e31f)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryClientService.java
* hadoop-yarn-project/CHANGES.txt


 Update TestApplicationHistoryClientService to use the new generic history 
 store
 ---

 Key: YARN-2758
 URL: https://issues.apache.org/jira/browse/YARN-2758
 Project: Hadoop YARN
  Issue Type: Test
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Attachments: YARN-2758.1.patch


 TestApplicationHistoryClientService is still testing against the mock data in 
 the old MemoryApplicationHistoryStore; hence it needs to be updated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2758) Update TestApplicationHistoryClientService to use the new generic history store

2014-10-28 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187416#comment-14187416
 ] 

Xuan Gong commented on YARN-2758:
-

Committed to trunk/branch-2/branch-2.6. Thanks, Zhijie!

 Update TestApplicationHistoryClientService to use the new generic history 
 store
 ---

 Key: YARN-2758
 URL: https://issues.apache.org/jira/browse/YARN-2758
 Project: Hadoop YARN
  Issue Type: Test
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Zhijie Shen
 Fix For: 2.6.0

 Attachments: YARN-2758.1.patch


 TestApplicationHistoryClientService is still testing against the mock data in 
 the old MemoryApplicationHistoryStore; hence it needs to be updated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2742) FairSchedulerConfiguration fails to parse if there is extra space between value and unit

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187426#comment-14187426
 ] 

Hadoop QA commented on YARN-2742:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677668/YARN-2742-2.patch
  against trunk revision 371a3b8.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5609//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5609//console

This message is automatically generated.

 FairSchedulerConfiguration fails to parse if there is extra space between 
 value and unit
 

 Key: YARN-2742
 URL: https://issues.apache.org/jira/browse/YARN-2742
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-2742-1.patch, YARN-2742-2.patch


 FairSchedulerConfiguration is very strict about the number of space 
 characters between the value and the unit: 0 or 1 space.
 For example, for values like the following:
 {noformat}
 <maxResources>4096  mb, 2 vcores</maxResources>
 {noformat}
 (note the two spaces)
 The line above fails to parse:
 {noformat}
 2014-10-24 22:56:40,802 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService:
  Failed to reload fair scheduler config file - will use existing allocations.
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException:
  Missing resource: mb
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.findResource(FairSchedulerConfiguration.java:247)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.parseResourceConfigValue(FairSchedulerConfiguration.java:231)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:347)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:381)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:293)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService$1.run(AllocationFileLoaderService.java:117)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2755) NM fails to clean up usercache_DEL_timestamp dirs after YARN-661

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187429#comment-14187429
 ] 

Hadoop QA commented on YARN-2755:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677674/YARN-2755.v2.patch
  against trunk revision e226b5b.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5610//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5610//console

This message is automatically generated.

 NM fails to clean up usercache_DEL_timestamp dirs after YARN-661
 --

 Key: YARN-2755
 URL: https://issues.apache.org/jira/browse/YARN-2755
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
Priority: Critical
 Attachments: YARN-2755.v1.patch, YARN-2755.v2.patch, 
 YARN-2755.v3.patch


 When the NM restarts frequently for some reason, a large number of directories 
 like these are left in /data/disk$num/yarn/local/:
 /data/disk1/yarn/local/usercache_DEL_1414372756105
 /data/disk1/yarn/local/usercache_DEL_1413557901696
 /data/disk1/yarn/local/usercache_DEL_1413657004894
 /data/disk1/yarn/local/usercache_DEL_1413675321860
 /data/disk1/yarn/local/usercache_DEL_1414093167936
 /data/disk1/yarn/local/usercache_DEL_1413565841271
 These directories are empty, but take up 100M+ due to their sheer number; 
 there were 38714 per data disk on the machine I looked at.
 It appears to be a regression introduced by YARN-661.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2742) FairSchedulerConfiguration fails to parse if there is extra space between value and unit

2014-10-28 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187428#comment-14187428
 ] 

Tsuyoshi OZAWA commented on YARN-2742:
--

LGTM.

 FairSchedulerConfiguration fails to parse if there is extra space between 
 value and unit
 

 Key: YARN-2742
 URL: https://issues.apache.org/jira/browse/YARN-2742
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-2742-1.patch, YARN-2742-2.patch


 FairSchedulerConfiguration is very strict about the number of space 
 characters between the value and the unit: 0 or 1 space.
 For example, for values like the following:
 {noformat}
 maxResources4096  mb, 2 vcoresmaxResources
 {noformat}
 (note 2 spaces)
 This above line fails to parse:
 {noformat}
 2014-10-24 22:56:40,802 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService:
  Failed to reload fair scheduler config file - will use existing allocations.
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException:
  Missing resource: mb
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.findResource(FairSchedulerConfiguration.java:247)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.parseResourceConfigValue(FairSchedulerConfiguration.java:231)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:347)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:381)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:293)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService$1.run(AllocationFileLoaderService.java:117)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2755) NM fails to clean up usercache_DEL_timestamp dirs after YARN-661

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187432#comment-14187432
 ] 

Hadoop QA commented on YARN-2755:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677678/YARN-2755.v3.patch
  against trunk revision e226b5b.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5611//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5611//console

This message is automatically generated.

 NM fails to clean up usercache_DEL_timestamp dirs after YARN-661
 --

 Key: YARN-2755
 URL: https://issues.apache.org/jira/browse/YARN-2755
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
Assignee: Siqi Li
Priority: Critical
 Attachments: YARN-2755.v1.patch, YARN-2755.v2.patch, 
 YARN-2755.v3.patch


 When the NM restarts frequently for some reason, a large number of directories 
 like these are left in /data/disk$num/yarn/local/:
 /data/disk1/yarn/local/usercache_DEL_1414372756105
 /data/disk1/yarn/local/usercache_DEL_1413557901696
 /data/disk1/yarn/local/usercache_DEL_1413657004894
 /data/disk1/yarn/local/usercache_DEL_1413675321860
 /data/disk1/yarn/local/usercache_DEL_1414093167936
 /data/disk1/yarn/local/usercache_DEL_1413565841271
 These directories are empty, but take up 100M+ due to their sheer number; 
 there were 38714 per data disk on the machine I looked at.
 It appears to be a regression introduced by YARN-661.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2742) FairSchedulerConfiguration fails to parse if there is extra space between value and unit

2014-10-28 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187433#comment-14187433
 ] 

Sangjin Lee commented on YARN-2742:
---

+1

 FairSchedulerConfiguration fails to parse if there is extra space between 
 value and unit
 

 Key: YARN-2742
 URL: https://issues.apache.org/jira/browse/YARN-2742
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.0
Reporter: Sangjin Lee
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-2742-1.patch, YARN-2742-2.patch


 FairSchedulerConfiguration is very strict about the number of space 
 characters between the value and the unit: 0 or 1 space.
 For example, for values like the following:
 {noformat}
 <maxResources>4096  mb, 2 vcores</maxResources>
 {noformat}
 (note the two spaces)
 The line above fails to parse:
 {noformat}
 2014-10-24 22:56:40,802 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService:
  Failed to reload fair scheduler config file - will use existing allocations.
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException:
  Missing resource: mb
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.findResource(FairSchedulerConfiguration.java:247)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration.parseResourceConfigValue(FairSchedulerConfiguration.java:231)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:347)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:381)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:293)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService$1.run(AllocationFileLoaderService.java:117)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2741) Windows: Node manager cannot serve up log files via the web user interface when yarn.nodemanager.log-dirs to any drive letter other than C: (or, the drive that nodemanag

2014-10-28 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187455#comment-14187455
 ] 

Zhijie Shen commented on YARN-2741:
---

+1 LGTM. The test case verifies that the drive letter is not skipped on both 
Linux and Windows. Will commit the patch.

 Windows: Node manager cannot serve up log files via the web user interface 
 when yarn.nodemanager.log-dirs to any drive letter other than C: (or, the 
 drive that nodemanager is running on)
 --

 Key: YARN-2741
 URL: https://issues.apache.org/jira/browse/YARN-2741
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
 Environment: Windows
Reporter: Craig Welch
Assignee: Craig Welch
 Attachments: YARN-2741.1.patch, YARN-2741.6.patch


 PROBLEM: User is getting No Logs available for Container Container_number 
 when setting the yarn.nodemanager.log-dirs to any drive letter other than C:
 STEPS TO REPRODUCE:
 On Windows
 1) Run NodeManager on C:
 2) Create two local drive partitions D: and E:
 3) Put yarn.nodemanager.log-dirs = D:\nmlogs or E:\nmlogs
 4) Run an MR job that will last at least 5 minutes
 5) While the job is in flight, log into the YARN web UI, 
 resource_manager_server:8088/cluster
 6) Click on the application_idnumber
 7) Click on the logs link; you will get No Logs available for Container 
 Container_number
 ACTUAL BEHAVIOR: Getting an error message when viewing the container logs
 EXPECTED BEHAVIOR: Able to use different drive letters in 
 yarn.nodemanager.log-dirs and not get an error
 NOTE: If we use the drive letter C: in yarn.nodemanager.log-dirs, we are able 
 to see the container logs while the MR job is in flight.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2741) Windows: Node manager cannot serve up log files via the web user interface when yarn.nodemanager.log-dirs to any drive letter other than C: (or, the drive that nodemanag

2014-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187497#comment-14187497
 ] 

Hudson commented on YARN-2741:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6374 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6374/])
YARN-2741. Made NM web UI serve logs on the drive other than C: on Windows. 
Contributed by Craig Welch. (zjshen: rev 
8984e9b1774033e379b57da1bd30a5c81888c7a3)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java


 Windows: Node manager cannot serve up log files via the web user interface 
 when yarn.nodemanager.log-dirs to any drive letter other than C: (or, the 
 drive that nodemanager is running on)
 --

 Key: YARN-2741
 URL: https://issues.apache.org/jira/browse/YARN-2741
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.0
 Environment: Windows
Reporter: Craig Welch
Assignee: Craig Welch
 Fix For: 2.6.0

 Attachments: YARN-2741.1.patch, YARN-2741.6.patch


 PROBLEM: User is getting No Logs available for Container Container_number 
 when setting the yarn.nodemanager.log-dirs to any drive letter other than C:
 STEPS TO REPRODUCE:
 On Windows
 1) Run NodeManager on C:
 2) Create two local drive partitions D: and E:
 3) Put yarn.nodemanager.log-dirs = D:\nmlogs or E:\nmlogs
 4) Run an MR job that will last at least 5 minutes
 5) While the job is in flight, log into the YARN web UI, 
 resource_manager_server:8088/cluster
 6) Click on the application_idnumber
 7) Click on the logs link; you will get No Logs available for Container 
 Container_number
 ACTUAL BEHAVIOR: Getting an error message when viewing the container logs
 EXPECTED BEHAVIOR: Able to use different drive letters in 
 yarn.nodemanager.log-dirs and not get an error
 NOTE: If we use the drive letter C: in yarn.nodemanager.log-dirs, we are able 
 to see the container logs while the MR job is in flight.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2647) Add yarn queue CLI to get queue infos

2014-10-28 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187521#comment-14187521
 ] 

Xuan Gong commented on YARN-2647:
-

Thanks for the patch, [~sunilg].

The patch looks good overall. I only have one comment.
The following check seems unnecessary:
{code}
if (args.length > 0 && args[0].equalsIgnoreCase(QUEUE)) {
 
} else {
  syserr.println("Invalid Command usage. Command must start with 'queue'");
  return exitCode;
}
{code}

If we did not call the command "yarn queue", this class will not be used, so this 
check is not necessary. We do have such a check in ApplicationCLI, because the 
commands "yarn application" and "yarn applicationattempt" use the 
same ApplicationCLI class, so it is needed there.

Also, could you create a patch for branch-2, please? The latest patch cannot be 
applied to branch-2. 

 Add yarn queue CLI to get queue infos
 -

 Key: YARN-2647
 URL: https://issues.apache.org/jira/browse/YARN-2647
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Reporter: Wangda Tan
Assignee: Sunil G
 Attachments: 0001-YARN-2647.patch, 0002-YARN-2647.patch, 
 0003-YARN-2647.patch, 0004-YARN-2647.patch, 0005-YARN-2647.patch, 
 0006-YARN-2647.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager

2014-10-28 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2753:

Attachment: YARN-2753.004.patch

 Fix potential issues and code clean up for *NodeLabelsManager
 -

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch, YARN-2753.004.patch


 Issues include:
 * CommonNodeLabelsManager#addToCluserNodeLabels should not change the value 
 in labelCollections if the key already exists; otherwise Label.resource 
 will be changed (reset).
 * Potential NPE (NullPointerException) in checkRemoveLabelsFromNode of 
 CommonNodeLabelsManager:
 ** when a Node is created, Node.labels can be null;
 ** in this case, nm.labels may be null, so we need to check that originalLabels 
 is not null before using it (originalLabels.containsAll).
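
 For illustration only, the shape of the two fixes (the real changes are in the 
 attached patches; the helper names below are made up):
 {code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class NodeLabelFixSketch {
  // Fix 1: only insert a label entry when the key is absent, so an existing
  // label's associated info (e.g. its resource) is not reset.
  static <K, V> void addIfAbsent(Map<K, V> labelCollections, K label, V info) {
    if (!labelCollections.containsKey(label)) {
      labelCollections.put(label, info);
    }
  }

  // Fix 2: null-safe containment check; labels on a freshly created node can
  // be null, so guard before calling containsAll.
  static boolean containsAllLabels(Set<String> originalLabels,
      Set<String> labelsToRemove) {
    return originalLabels != null && originalLabels.containsAll(labelsToRemove);
  }

  public static void main(String[] args) {
    Map<String, String> labels = new HashMap<String, String>();
    addIfAbsent(labels, "gpu", "existing-info");
    addIfAbsent(labels, "gpu", "new-info");   // ignored, existing value kept
    System.out.println(labels.get("gpu"));    // prints existing-info
    System.out.println(containsAllLabels(null,
        Collections.singleton("gpu")));       // prints false instead of NPE
  }
}
 {code}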



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager

2014-10-28 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2753:

Description: 
Issues include:

* CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in 
labelCollections if the key already exists; otherwise Label.resource will be 
changed (reset).
* Potential NPE (NullPointerException) in checkRemoveLabelsFromNode of 
CommonNodeLabelsManager:
** when a Node is created, Node.labels can be null;
** in this case, nm.labels may be null, so we need to check that originalLabels 
is not null before using it (originalLabels.containsAll).
* addToCluserNodeLabels should be protected by the writeLock in 
RMNodeLabelsManager.java, because we should protect labelCollections in 
RMNodeLabelsManager.
* Use the static variable (Resources.none()) for a not-running Node.resource in 
CommonNodeLabelsManager to save memory.
** When a Node is not activated, the resource is never used; when a Node is 
activated, a new resource is assigned to it in 
RMNodeLabelsManager#activateNode (nm.resource = resource). So it would be better 
to use the static variable Resources.none() instead of allocating a new 
variable (Resource.newInstance(0, 0)) for each node deactivation.

  was:
Issues include:

* CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in 
labelCollections if the key already exists otherwise the Label.resource will be 
changed(reset).
* potential NPE(NullPointerException) in checkRemoveLabelsFromNode of 
CommonNodeLabelsManager.
** because when a Node is created, Node.labels can be null.
** In this case, nm.labels; may be null. So we need check originalLabels not 
null before use it(originalLabels.containsAll).


 Fix potential issues and code clean up for *NodeLabelsManager
 -

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch, YARN-2753.004.patch


 Issues include:
 * CommonNodeLabelsManager#addToCluserNodeLabels should not change the value 
 in labelCollections if the key already exists otherwise the Label.resource 
 will be changed(reset).
 * potential NPE(NullPointerException) in checkRemoveLabelsFromNode of 
 CommonNodeLabelsManager.
 ** because when a Node is created, Node.labels can be null.
 ** In this case, nm.labels; may be null. So we need check originalLabels not 
 null before use it(originalLabels.containsAll).
 * addToCluserNodeLabels should be protected by writeLock in 
 RMNodeLabelsManager.java. because we should protect labelCollections in 
 RMNodeLabelsManager.
 * use static variable (Resources.none()) for not-running Node.resource in 
 CommonNodeLabelsManager to save memory.
 ** When a Node is not activated, the resource is never used, When a Node is 
 activated, a new resource will be assigned to it in 
 RMNodeLabelsManager#activateNode (nm.resource = resource) So it would be 
 better to use static variable Resources.none() instead of allocating a new 
 variable(Resource.newInstance(0, 0)) for each node deactivation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2556) Tool to measure the performance of the timeline server

2014-10-28 Thread chang li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chang li updated YARN-2556:
---
Attachment: yarn2556_wip.patch

Current work-in-progress patch. It implements the measurement of I/O rate and 
transaction rate.

 Tool to measure the performance of the timeline server
 --

 Key: YARN-2556
 URL: https://issues.apache.org/jira/browse/YARN-2556
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: chang li
 Attachments: YARN-2556-WIP.patch, yarn2556_wip.patch, 
 yarn2556_wip.patch


 We need to be able to understand the capacity model for the timeline server 
 to give users the tools they need to deploy a timeline server with the 
 correct capacity.
 I propose we create a MapReduce job that can measure timeline server write 
 and read performance. Transactions per second and I/O for both read and write 
 would be a good start.
 This could be done as an example or test job that could be tied into gridmix.
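
 A bare-bones sketch of the transactions-per-second measurement such a job would 
 report (illustrative only; the real tool would drive the timeline client from 
 inside a MapReduce job):
 {code}
public class SimpleRateMeter {
  // Time a repeated operation and report its rate; a stand-in for the write
  // path of the proposed timeline-server benchmark.
  static double transactionsPerSecond(Runnable op, int iterations) {
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) {
      op.run();
    }
    double seconds = (System.nanoTime() - start) / 1e9;
    return iterations / seconds;
  }

  public static void main(String[] args) {
    double tps = transactionsPerSecond(() -> {
      // placeholder for "put one timeline entity"
    }, 100000);
    System.out.printf("%.1f transactions/sec%n", tps);
  }
}
 {code}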



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager

2014-10-28 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187575#comment-14187575
 ] 

zhihai xu commented on YARN-2753:
-

Hi [~leftnoteasy],
thanks for your suggestion. I also merged YARN-2754 and YARN-2756 into this JIRA.
zhihai



 Fix potential issues and code clean up for *NodeLabelsManager
 -

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch, YARN-2753.004.patch


 Issues include:
 * CommonNodeLabelsManager#addToCluserNodeLabels should not change the value 
 in labelCollections if the key already exists otherwise the Label.resource 
 will be changed(reset).
 * potential NPE(NullPointerException) in checkRemoveLabelsFromNode of 
 CommonNodeLabelsManager.
 ** because when a Node is created, Node.labels can be null.
 ** In this case, nm.labels; may be null. So we need check originalLabels not 
 null before use it(originalLabels.containsAll).
 * addToCluserNodeLabels should be protected by writeLock in 
 RMNodeLabelsManager.java. because we should protect labelCollections in 
 RMNodeLabelsManager.
 * use static variable (Resources.none()) for not-running Node.resource in 
 CommonNodeLabelsManager to save memory.
 ** When a Node is not activated, the resource is never used, When a Node is 
 activated, a new resource will be assigned to it in 
 RMNodeLabelsManager#activateNode (nm.resource = resource) So it would be 
 better to use static variable Resources.none() instead of allocating a new 
 variable(Resource.newInstance(0, 0)) for each node deactivation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN

2014-10-28 Thread Abin Shahab (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abin Shahab updated YARN-1964:
--
Attachment: YARN-1964.patch

Fixed rebase issue.

 Create Docker analog of the LinuxContainerExecutor in YARN
 --

 Key: YARN-1964
 URL: https://issues.apache.org/jira/browse/YARN-1964
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.2.0
Reporter: Arun C Murthy
Assignee: Abin Shahab
 Attachments: YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, 
 YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, 
 yarn-1964-branch-2.2.0-docker.patch, yarn-1964-branch-2.2.0-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch, yarn-1964-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch


 Docker (https://www.docker.io/) is, increasingly, a very popular container 
 technology.
 In context of YARN, the support for Docker will provide a very elegant 
 solution to allow applications to *package* their software into a Docker 
 container (entire Linux file system incl. custom versions of perl, python 
 etc.) and use it as a blueprint to launch all their YARN containers with 
 requisite software environment. This provides both consistency (all YARN 
 containers will have the same software environment) and isolation (no 
 interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager

2014-10-28 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187579#comment-14187579
 ] 

zhihai xu commented on YARN-2753:
-

The new patch YARN-2753.004.patch includes 4 patches: YARN-2753, YARN-2759, 
YARN-2754, and YARN-2756.

 Fix potential issues and code clean up for *NodeLabelsManager
 -

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch, YARN-2753.004.patch


 Issues include:
 * CommonNodeLabelsManager#addToCluserNodeLabels should not change the value 
 in labelCollections if the key already exists otherwise the Label.resource 
 will be changed(reset).
 * potential NPE(NullPointerException) in checkRemoveLabelsFromNode of 
 CommonNodeLabelsManager.
 ** because when a Node is created, Node.labels can be null.
 ** In this case, nm.labels; may be null. So we need check originalLabels not 
 null before use it(originalLabels.containsAll).
 * addToCluserNodeLabels should be protected by writeLock in 
 RMNodeLabelsManager.java. because we should protect labelCollections in 
 RMNodeLabelsManager.
 * use static variable (Resources.none()) for not-running Node.resource in 
 CommonNodeLabelsManager to save memory.
 ** When a Node is not activated, the resource is never used, When a Node is 
 activated, a new resource will be assigned to it in 
 RMNodeLabelsManager#activateNode (nm.resource = resource) So it would be 
 better to use static variable Resources.none() instead of allocating a new 
 variable(Resource.newInstance(0, 0)) for each node deactivation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager

2014-10-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187584#comment-14187584
 ] 

Wangda Tan commented on YARN-2753:
--

[~zxu],
I agree with the other fixes, but not this one:
bq. use static variable (Resources.none()) for not-running Node.resource in 
CommonNodeLabelsManager to save memory.
Resources.none() is used for checks such as if 
(Resources.greaterThan(resource, Resources.none())) { .. do something }. Even 
if it is always replaced nowadays, you cannot rule out that in the future someone 
may write something like node.resource.setMemory(...); basically I think it's 
bad style.
That would throw a runtime exception and destroy YARN daemons; compared to the 
memory it can save, the risk is much more serious, do you agree?

Thanks,
Wangda
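
To make the risk concrete, a small sketch (assuming 2.x-era code where 
Resources.none() returns a shared, write-protected singleton; this is not part 
of any patch here):
{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

public class SharedNoneRisk {
  public static void main(String[] args) {
    // If a deactivated node's resource aliases the shared constant...
    Resource nodeResource = Resources.none();
    // ...then any later write to node.resource either mutates a global
    // constant or, with the write-protected implementation, throws a
    // RuntimeException at runtime and can take the daemon down.
    nodeResource.setMemory(4096);
  }
}
{code}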


 Fix potential issues and code clean up for *NodeLabelsManager
 -

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch, YARN-2753.004.patch


 Issues include:
 * CommonNodeLabelsManager#addToCluserNodeLabels should not change the value 
 in labelCollections if the key already exists otherwise the Label.resource 
 will be changed(reset).
 * potential NPE(NullPointerException) in checkRemoveLabelsFromNode of 
 CommonNodeLabelsManager.
 ** because when a Node is created, Node.labels can be null.
 ** In this case, nm.labels; may be null. So we need check originalLabels not 
 null before use it(originalLabels.containsAll).
 * addToCluserNodeLabels should be protected by writeLock in 
 RMNodeLabelsManager.java. because we should protect labelCollections in 
 RMNodeLabelsManager.
 * use static variable (Resources.none()) for not-running Node.resource in 
 CommonNodeLabelsManager to save memory.
 ** When a Node is not activated, the resource is never used, When a Node is 
 activated, a new resource will be assigned to it in 
 RMNodeLabelsManager#activateNode (nm.resource = resource) So it would be 
 better to use static variable Resources.none() instead of allocating a new 
 variable(Resource.newInstance(0, 0)) for each node deactivation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1964) Create Docker analog of the LinuxContainerExecutor in YARN

2014-10-28 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187587#comment-14187587
 ] 

Abin Shahab commented on YARN-1964:
---

Fixed.
I had tested the patch on my box, and that passed. Not sure how it passed
when there was such an obvious error.




 Create Docker analog of the LinuxContainerExecutor in YARN
 --

 Key: YARN-1964
 URL: https://issues.apache.org/jira/browse/YARN-1964
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.2.0
Reporter: Arun C Murthy
Assignee: Abin Shahab
 Attachments: YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, 
 YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, YARN-1964.patch, 
 yarn-1964-branch-2.2.0-docker.patch, yarn-1964-branch-2.2.0-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch, yarn-1964-docker.patch, 
 yarn-1964-docker.patch, yarn-1964-docker.patch


 Docker (https://www.docker.io/) is, increasingly, a very popular container 
 technology.
 In context of YARN, the support for Docker will provide a very elegant 
 solution to allow applications to *package* their software into a Docker 
 container (entire Linux file system incl. custom versions of perl, python 
 etc.) and use it as a blueprint to launch all their YARN containers with 
 requisite software environment. This provides both consistency (all YARN 
 containers will have the same software environment) and isolation (no 
 interference with whatever is installed on the physical machine).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (YARN-2756) use static variable (Resources.none()) for not-running Node.resource in CommonNodeLabelsManager to save memory.

2014-10-28 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu reopened YARN-2756:
-

 use static variable (Resources.none()) for not-running Node.resource in 
 CommonNodeLabelsManager to save memory.
 ---

 Key: YARN-2756
 URL: https://issues.apache.org/jira/browse/YARN-2756
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor
 Attachments: YARN-2756.000.patch


 Use the static variable (Resources.none()) for a not-running Node.resource in 
 CommonNodeLabelsManager to save memory. When a Node is not activated, the 
 resource is never used; when a Node is activated, a new resource is 
 assigned to it in RMNodeLabelsManager#activateNode (nm.resource = resource). 
 So it would be better to use the static variable Resources.none() instead of 
 allocating a new variable (Resource.newInstance(0, 0)) for each node 
 deactivation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2756) use static variable (Resources.none()) for not-running Node.resource in CommonNodeLabelsManager to save memory.

2014-10-28 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187595#comment-14187595
 ] 

zhihai xu commented on YARN-2756:
-

Separate this JIRA from YARN-2753 for discussion.

 use static variable (Resources.none()) for not-running Node.resource in 
 CommonNodeLabelsManager to save memory.
 ---

 Key: YARN-2756
 URL: https://issues.apache.org/jira/browse/YARN-2756
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor
 Attachments: YARN-2756.000.patch


 Use the static variable (Resources.none()) for a not-running Node.resource in 
 CommonNodeLabelsManager to save memory. When a Node is not activated, the 
 resource is never used; when a Node is activated, a new resource is 
 assigned to it in RMNodeLabelsManager#activateNode (nm.resource = resource). 
 So it would be better to use the static variable Resources.none() instead of 
 allocating a new variable (Resource.newInstance(0, 0)) for each node 
 deactivation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager

2014-10-28 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2753:

Description: 
Issues include:

* CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in 
labelCollections if the key already exists; otherwise Label.resource will be 
changed (reset).
* Potential NPE (NullPointerException) in checkRemoveLabelsFromNode of 
CommonNodeLabelsManager:
** when a Node is created, Node.labels can be null;
** in this case, nm.labels may be null, so we need to check that originalLabels 
is not null before using it (originalLabels.containsAll).
* addToCluserNodeLabels should be protected by the writeLock in 
RMNodeLabelsManager.java, because we should protect labelCollections in 
RMNodeLabelsManager.

  was:
Issues include:

* CommonNodeLabelsManager#addToCluserNodeLabels should not change the value in 
labelCollections if the key already exists; otherwise the Label.resource will be 
changed (reset).
* potential NPE (NullPointerException) in checkRemoveLabelsFromNode of 
CommonNodeLabelsManager.
** When a Node is created, Node.labels can be null.
** In this case, nm.labels may be null, so we need to check that originalLabels 
is not null before using it (originalLabels.containsAll).
* addToCluserNodeLabels should be protected by writeLock in 
RMNodeLabelsManager.java, because we should protect labelCollections in 
RMNodeLabelsManager.
* use static variable (Resources.none()) for not-running Node.resource in 
CommonNodeLabelsManager to save memory.
** When a Node is not activated, the resource is never used; when a Node is 
activated, a new resource is assigned to it in 
RMNodeLabelsManager#activateNode (nm.resource = resource), so it would be better 
to use the static variable Resources.none() instead of allocating a new 
Resource (Resource.newInstance(0, 0)) for each node deactivation.


 Fix potential issues and code clean up for *NodeLabelsManager
 -

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch, YARN-2753.004.patch


 Issues include:
 * CommonNodeLabelsManager#addToCluserNodeLabels should not change the value 
 in labelCollections if the key already exists; otherwise the Label.resource 
 will be changed (reset).
 * potential NPE (NullPointerException) in checkRemoveLabelsFromNode of 
 CommonNodeLabelsManager.
 ** When a Node is created, Node.labels can be null.
 ** In this case, nm.labels may be null, so we need to check that originalLabels 
 is not null before using it (originalLabels.containsAll).
 * addToCluserNodeLabels should be protected by writeLock in 
 RMNodeLabelsManager.java, because we should protect labelCollections in 
 RMNodeLabelsManager.
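
A minimal sketch of the first two fixes listed above, assuming a simplified label map and a plain Set for the node's labels (the class and helper names are hypothetical, not the exact Hadoop code):

{code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class NodeLabelsSketch {
  static final class Label { /* placeholder for the real label record */ }

  // Stands in for labelCollections in CommonNodeLabelsManager.
  private final Map<String, Label> labelCollections = new HashMap<String, Label>();

  // Fix 1: only insert when the key is absent, so an existing Label
  // (and the resource already attached to it) is never replaced.
  void addToCluserNodeLabels(Set<String> labels) {
    for (String label : labels) {
      if (!labelCollections.containsKey(label)) {
        labelCollections.put(label, new Label());
      }
    }
  }

  // Fix 2: a freshly created node may still have null labels, so guard
  // before calling containsAll on it.
  boolean checkRemoveLabelsFromNode(Set<String> originalLabels,
      Set<String> labelsToRemove) {
    Set<String> existing = (originalLabels == null)
        ? Collections.<String>emptySet() : originalLabels;
    return existing.containsAll(labelsToRemove);
  }
}
{code}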



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager

2014-10-28 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2753:

Attachment: YARN-2753.005.patch

 Fix potential issues and code clean up for *NodeLabelsManager
 -

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch, YARN-2753.004.patch, 
 YARN-2753.005.patch


 Issues include:
 * CommonNodeLabelsManager#addToCluserNodeLabels should not change the value 
 in labelCollections if the key already exists; otherwise the Label.resource 
 will be changed (reset).
 * potential NPE (NullPointerException) in checkRemoveLabelsFromNode of 
 CommonNodeLabelsManager.
 ** When a Node is created, Node.labels can be null.
 ** In this case, nm.labels may be null, so we need to check that originalLabels 
 is not null before using it (originalLabels.containsAll).
 * addToCluserNodeLabels should be protected by writeLock in 
 RMNodeLabelsManager.java, because we should protect labelCollections in 
 RMNodeLabelsManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager

2014-10-28 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187619#comment-14187619
 ] 

zhihai xu commented on YARN-2753:
-

I removed YARN-2756 from this JIRA (YARN-2753) so we can discuss YARN-2756 
separately.
The new patch YARN-2753.005.patch will include three patches: YARN-2753, 
YARN-2759 and YARN-2754.

Hi [~leftnoteasy],
thanks for reviewing the patch.
I will move the discussion to YARN-2756.
zhihai

 Fix potential issues and code clean up for *NodeLabelsManager
 -

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch, YARN-2753.004.patch, 
 YARN-2753.005.patch


 Issues include:
 * CommonNodeLabelsManager#addToCluserNodeLabels should not change the value 
 in labelCollections if the key already exists; otherwise the Label.resource 
 will be changed (reset).
 * potential NPE (NullPointerException) in checkRemoveLabelsFromNode of 
 CommonNodeLabelsManager.
 ** When a Node is created, Node.labels can be null.
 ** In this case, nm.labels may be null, so we need to check that originalLabels 
 is not null before using it (originalLabels.containsAll).
 * addToCluserNodeLabels should be protected by writeLock in 
 RMNodeLabelsManager.java, because we should protect labelCollections in 
 RMNodeLabelsManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2503) Changes in RM Web UI to better show labels to end users

2014-10-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2503:
-
Attachment: YARN-2503-20141028-1.patch

Thanks [~jianhe] for the comments; uploaded a new patch that addresses them.

 Changes in RM Web UI to better show labels to end users
 ---

 Key: YARN-2503
 URL: https://issues.apache.org/jira/browse/YARN-2503
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-2503-20141022-1.patch, YARN-2503-20141028-1.patch, 
 YARN-2503.patch


 Include but not limited to:
 - Show labels of nodes in RM/nodes page
 - Show labels of queue in RM/scheduler page



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2647) Add yarn queue CLI to get queue infos

2014-10-28 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187648#comment-14187648
 ] 

Craig Welch commented on YARN-2647:
---

[~sunilg], in the yarn file you can drop the (s) from information(s) 
(information is both singular and plural :-)).

In QueueCLI.listQueues I think it is safe not to check for a null queue list 
(not entirely sure), but in printQueueInfo I do think you need to check that 
the nodeLabels list is not null.

Otherwise, +1 from me
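
A rough sketch of the kind of guard being requested in printQueueInfo (the method and output format below are illustrative, not the actual patch code):

{code}
import java.io.PrintWriter;
import java.util.List;

class QueueInfoPrintSketch {
  // nodeLabels can be null or empty for a queue with no accessible-label
  // configuration, so print an empty value instead of dereferencing it.
  static void printAccessibleLabels(PrintWriter out, List<String> nodeLabels) {
    StringBuilder sb = new StringBuilder();
    if (nodeLabels != null) {
      for (String label : nodeLabels) {
        if (sb.length() > 0) {
          sb.append(",");
        }
        sb.append(label);
      }
    }
    out.println("Accessible Node Labels : " + sb);
  }
}
{code}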

 Add yarn queue CLI to get queue infos
 -

 Key: YARN-2647
 URL: https://issues.apache.org/jira/browse/YARN-2647
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Reporter: Wangda Tan
Assignee: Sunil G
 Attachments: 0001-YARN-2647.patch, 0002-YARN-2647.patch, 
 0003-YARN-2647.patch, 0004-YARN-2647.patch, 0005-YARN-2647.patch, 
 0006-YARN-2647.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2756) use static variable (Resources.none()) for not-running Node.resource in CommonNodeLabelsManager to save memory.

2014-10-28 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187660#comment-14187660
 ] 

zhihai xu commented on YARN-2756:
-

Hi [~leftnoteasy]

bq. but you cannot say in the future, some guys may write like 
node.resource.setMemory(...), basically I think it's a bad style. That will 
throw runtime exception and destroy YARN daemons, comparing to memory it can 
save, the risk is much more serious, do you agree?

IMO we should not permit people to call node.resource.setMemory(...) to change 
the node memory when the node is not running. Currently the only way to change 
the node memory from the scheduler is via activateNode/deactivateNode.
The patch enforces this constraint: when the node is not running, the resource 
in the node can't be changed; we can only change it while the node is running. 
In the future, if we really want to change this rule/constraint, we can change 
the implementation/architecture, but I don't see a need to do so now or in the 
near future.
Saving memory is a secondary benefit of this patch.

thanks
zhihai
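
To make the constraint concrete, a small sketch of what happens if someone does try to mutate the resource of a not-running node once it holds the shared instance (the exact exception type and message are whatever the Hadoop Resources implementation throws, per the runtime-exception point quoted above):

{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

public class NoneResourceDemo {
  public static void main(String[] args) {
    // With the patch, a deactivated node holds the shared Resources.none()
    // instance, so attempts to mutate it are rejected at runtime.
    Resource notRunning = Resources.none();
    try {
      notRunning.setMemory(4096);
      System.out.println("Unexpected: modification was allowed");
    } catch (RuntimeException expected) {
      System.out.println("Rejected as expected: " + expected.getMessage());
    }
  }
}
{code}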

 use static variable (Resources.none()) for not-running Node.resource in 
 CommonNodeLabelsManager to save memory.
 ---

 Key: YARN-2756
 URL: https://issues.apache.org/jira/browse/YARN-2756
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor
 Attachments: YARN-2756.000.patch


 use static variable (Resources.none()) for not-running Node.resource in 
 CommonNodeLabelsManager to save memory. When a Node is not activated, the 
 resource is never used; when a Node is activated, a new resource is 
 assigned to it in RMNodeLabelsManager#activateNode (nm.resource = resource). 
 So it would be better to use the static variable Resources.none() instead of 
 allocating a new Resource (Resource.newInstance(0, 0)) for each node 
 deactivation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2757) potential NPE in checkNodeLabelExpression of SchedulerUtils for nodeLabels.

2014-10-28 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187672#comment-14187672
 ] 

zhihai xu commented on YARN-2757:
-

Hi [~leftnoteasy],

thanks for reviewing the patch. I agree with changing the priority to Minor.
I just want to make sure the code is consistent: either both places check for a 
null pointer or neither does.

zhihai

 potential NPE in checkNodeLabelExpression of SchedulerUtils for nodeLabels.
 ---

 Key: YARN-2757
 URL: https://issues.apache.org/jira/browse/YARN-2757
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor
 Attachments: YARN-2757.000.patch


 potential NPE in checkNodeLabelExpression of SchedulerUtils for nodeLabels.
 Since we check nodeLabels for null at
 {code}
 if (!str.trim().isEmpty()
     && (nodeLabels == null || !nodeLabels.contains(str.trim()))) {
   return false;
 }
 {code}
 we should also check nodeLabels for null at
 {code}
 if (!nodeLabels.isEmpty()) {
   return false;
 }
 {code}
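
For reference, a minimal sketch of what a consistent guard might look like, with the surrounding SchedulerUtils code simplified away (the signature and the "&&" splitting of the label expression are assumptions based on the snippets above):

{code}
import java.util.Set;

class NodeLabelExpressionSketch {
  // Both branches treat a null nodeLabels set the same as an empty one.
  static boolean checkNodeLabelExpression(Set<String> nodeLabels,
      String labelExpression) {
    if (labelExpression != null && !labelExpression.trim().isEmpty()) {
      for (String str : labelExpression.split("&&")) {
        if (!str.trim().isEmpty()
            && (nodeLabels == null || !nodeLabels.contains(str.trim()))) {
          return false;
        }
      }
      return true;
    }
    // Empty expression: only nodes without labels should match, and a null
    // set must not cause an NPE here.
    return nodeLabels == null || nodeLabels.isEmpty();
  }
}
{code}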



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2556) Tool to measure the performance of the timeline server

2014-10-28 Thread chang li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chang li updated YARN-2556:
---
Attachment: (was: yarn2556_wip.patch)

 Tool to measure the performance of the timeline server
 --

 Key: YARN-2556
 URL: https://issues.apache.org/jira/browse/YARN-2556
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: chang li
 Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, 
 yarn2556_wip.patch


 We need to be able to understand the capacity model for the timeline server 
 to give users the tools they need to deploy a timeline server with the 
 correct capacity.
 I propose we create a mapreduce job that can measure timeline server write 
 and read performance. Transactions per second, I/O for both read and write 
 would be a good start.
 This could be done as an example or test job that could be tied into gridmix.
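
A very rough sketch of the write-side measurement described above, driven through the public TimelineClient API (the entity type, id scheme and count are made up for illustration; a real version would run as a MapReduce job and also cover reads):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.client.api.TimelineClient;

public class TimelineWriteBench {
  public static void main(String[] args) throws Exception {
    int numEntities = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(new Configuration());
    client.start();
    try {
      long start = System.currentTimeMillis();
      for (int i = 0; i < numEntities; i++) {
        TimelineEntity entity = new TimelineEntity();
        entity.setEntityType("BENCH_ENTITY");      // illustrative entity type
        entity.setEntityId("bench_entity_" + i);   // illustrative id scheme
        entity.setStartTime(System.currentTimeMillis());
        client.putEntities(entity);                // one write per iteration
      }
      long elapsedMs = Math.max(1, System.currentTimeMillis() - start);
      System.out.println(numEntities + " writes in " + elapsedMs + " ms ("
          + (numEntities * 1000L / elapsedMs) + " entities/sec)");
    } finally {
      client.stop();
    }
  }
}
{code}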



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2556) Tool to measure the performance of the timeline server

2014-10-28 Thread chang li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chang li updated YARN-2556:
---
Attachment: YARN-2556-WIP.patch

 Tool to measure the performance of the timeline server
 --

 Key: YARN-2556
 URL: https://issues.apache.org/jira/browse/YARN-2556
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: chang li
 Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, 
 yarn2556_wip.patch


 We need to be able to understand the capacity model for the timeline server 
 to give users the tools they need to deploy a timeline server with the 
 correct capacity.
 I propose we create a mapreduce job that can measure timeline server write 
 and read performance. Transactions per second, I/O for both read and write 
 would be a good start.
 This could be done as an example or test job that could be tied into gridmix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2757) potential NPE in checkNodeLabelExpression of SchedulerUtils for nodeLabels.

2014-10-28 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2757:

Issue Type: Sub-task  (was: Bug)
Parent: YARN-2492

 potential NPE in checkNodeLabelExpression of SchedulerUtils for nodeLabels.
 ---

 Key: YARN-2757
 URL: https://issues.apache.org/jira/browse/YARN-2757
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor
 Attachments: YARN-2757.000.patch


 potential NPE in checkNodeLabelExpression of SchedulerUtils for nodeLabels.
 Since we check nodeLabels for null at
 {code}
 if (!str.trim().isEmpty()
     && (nodeLabels == null || !nodeLabels.contains(str.trim()))) {
   return false;
 }
 {code}
 we should also check nodeLabels for null at
 {code}
 if (!nodeLabels.isEmpty()) {
   return false;
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2756) use static variable (Resources.none()) for not-running Node.resource in CommonNodeLabelsManager to save memory.

2014-10-28 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2756:

Issue Type: Sub-task  (was: Improvement)
Parent: YARN-2492

 use static variable (Resources.none()) for not-running Node.resource in 
 CommonNodeLabelsManager to save memory.
 ---

 Key: YARN-2756
 URL: https://issues.apache.org/jira/browse/YARN-2756
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor
 Attachments: YARN-2756.000.patch


 use static variable (Resources.none()) for not-running Node.resource in 
 CommonNodeLabelsManager to save memory. When a Node is not activated, the 
 resource is never used; when a Node is activated, a new resource is 
 assigned to it in RMNodeLabelsManager#activateNode (nm.resource = resource). 
 So it would be better to use the static variable Resources.none() instead of 
 allocating a new Resource (Resource.newInstance(0, 0)) for each node 
 deactivation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-2698) Move getClusterNodeLabels and getNodeToLabels to YARN CLI instead of RMAdminCLI

2014-10-28 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan reassigned YARN-2698:


Assignee: Wangda Tan  (was: Mayank Bansal)

 Move getClusterNodeLabels and getNodeToLabels to YARN CLI instead of 
 RMAdminCLI
 ---

 Key: YARN-2698
 URL: https://issues.apache.org/jira/browse/YARN-2698
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
Priority: Critical

 YARN RMAdminCLI and AdminService should have write API only, for other read 
 APIs, they should be located at YARNCLI and RMClientService.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2698) Move getClusterNodeLabels and getNodeToLabels to YARN CLI instead of RMAdminCLI

2014-10-28 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187691#comment-14187691
 ] 

Wangda Tan commented on YARN-2698:
--

[~mayank_bansal], taking this over since we need to get this done today; will 
upload a patch soon.

Thanks,

 Move getClusterNodeLabels and getNodeToLabels to YARN CLI instead of 
 RMAdminCLI
 ---

 Key: YARN-2698
 URL: https://issues.apache.org/jira/browse/YARN-2698
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
Priority: Critical

 YARN RMAdminCLI and AdminService should have write API only, for other read 
 APIs, they should be located at YARNCLI and RMClientService.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2753) Fix potential issues and code clean up for *NodeLabelsManager

2014-10-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187703#comment-14187703
 ] 

Hadoop QA commented on YARN-2753:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677715/YARN-2753.004.patch
  against trunk revision 8984e9b.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5612//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5612//console

This message is automatically generated.

 Fix potential issues and code clean up for *NodeLabelsManager
 -

 Key: YARN-2753
 URL: https://issues.apache.org/jira/browse/YARN-2753
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-2753.000.patch, YARN-2753.001.patch, 
 YARN-2753.002.patch, YARN-2753.003.patch, YARN-2753.004.patch, 
 YARN-2753.005.patch


 Issues include:
 * CommonNodeLabelsManager#addToCluserNodeLabels should not change the value 
 in labelCollections if the key already exists; otherwise the Label.resource 
 will be changed (reset).
 * potential NPE (NullPointerException) in checkRemoveLabelsFromNode of 
 CommonNodeLabelsManager.
 ** When a Node is created, Node.labels can be null.
 ** In this case, nm.labels may be null, so we need to check that originalLabels 
 is not null before using it (originalLabels.containsAll).
 * addToCluserNodeLabels should be protected by writeLock in 
 RMNodeLabelsManager.java, because we should protect labelCollections in 
 RMNodeLabelsManager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

