[jira] [Commented] (YARN-3840) Resource Manager web ui issue when sorting application by id (with application having id 9999)

2015-06-27 Thread Mohammad Shahid Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604020#comment-14604020
 ] 

Mohammad Shahid Khan commented on YARN-3840:


The DataTables string sort algorithm has a limitation:
it cannot properly sort strings that combine text and a numeric value, 
such as the application id (application_numericValue).
That is why the sort is not working properly.

To fix this, we can use the DataTables natural sort plugin:
{code}
sb.append("[\n")
  .append("{'sType':'natural', 'aTargets': [0]")
  .append(", 'mRender': parseHadoopID }")
{code}
plugin - ref: 
https://github.com/DataTables/Plugins/blob/1.10.7/sorting/natural.js

 Resource Manager web ui issue when sorting application by id (with 
 application having id 9999)
 

 Key: YARN-3840
 URL: https://issues.apache.org/jira/browse/YARN-3840
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
 Environment: Centos 6.6
 Java 1.7
Reporter: LINTE
 Attachments: RMApps.png


 On the web UI, the global main view page 
 http://resourcemanager:8088/cluster/apps doesn't display applications over 
 9999.
 With command line it works (# yarn application -list).
 Regards,
 Alexandre



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3768) Index out of range exception with environment variables without values

2015-06-27 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604027#comment-14604027
 ] 

zhihai xu commented on YARN-3768:
-

[~jira.shegalov], thanks for the suggestion! The current code will only look up 
the pattern {{getEnvironmentVariableRegex}} in the value ({{parts[1]}}) and replace 
the matched substring with the stored environment variable's value. I looked at the Java 
Matcher class, but I couldn't find a way to do the capture and the replacement at the 
same time with a single regex. Is it possible to use a single regex with 
capture groups to do both the split and the replacement with different variables? If it 
is possible, could you tell me how to do that?
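
For reference, here is a minimal self-contained sketch (not the actual {{Apps.java}} code; the input format and the rewrite applied are invented for illustration) of how {{java.util.regex.Matcher}} can capture groups and build the replacement in the same pass via {{appendReplacement}}/{{appendTail}}:
{code}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EnvVarRegexSketch {
  public static void main(String[] args) {
    // Hypothetical input: comma-separated NAME=VALUE pairs, values may be empty.
    String env = "FOO=bar,EMPTY=,JAVA_HOME=/usr/java";
    // One pattern that both "splits" the string into pairs and captures name and value.
    Pattern pair = Pattern.compile("([A-Za-z_][A-Za-z0-9_]*)=([^,]*)");
    Matcher m = pair.matcher(env);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String name = m.group(1);   // capture group 1: the variable name
      String value = m.group(2);  // capture group 2: the value, possibly empty
      // appendReplacement rewrites the current match while scanning, so the
      // capture and the replacement happen in the same pass over the input.
      m.appendReplacement(out,
          Matcher.quoteReplacement(name + "=" + value.toUpperCase()));
    }
    m.appendTail(out);
    System.out.println(out);  // FOO=BAR,EMPTY=,JAVA_HOME=/USR/JAVA
  }
}
{code}
Whether this shape fits the actual env-var parsing is for the patch discussion; the sketch only shows that a single regex can drive both steps.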

 Index out of range exception with environment variables without values
 --

 Key: YARN-3768
 URL: https://issues.apache.org/jira/browse/YARN-3768
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.5.0
Reporter: Joe Ferner
Assignee: zhihai xu
 Attachments: YARN-3768.000.patch, YARN-3768.001.patch


 Looking at line 80 of org.apache.hadoop.yarn.util.Apps, an index out of range 
 exception occurs if an environment variable is encountered without a value.
 I believe this occurs because Java will not return empty strings from the 
 split method. Similar to this: 
 http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2871) TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604173#comment-14604173
 ] 

Hudson commented on YARN-2871:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #2169 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2169/])
YARN-2871. TestRMRestart#testRMRestartGetApplicationList sometime fails (xgong: 
rev fe6c1bd73aee188ed58df4d33bbc2d2fe0779a97)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
* hadoop-yarn-project/CHANGES.txt


 TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk
 -

 Key: YARN-2871
 URL: https://issues.apache.org/jira/browse/YARN-2871
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Ted Yu
Assignee: zhihai xu
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-2871.000.patch, YARN-2871.001.patch, 
 YARN-2871.002.patch


 From trunk build #746 (https://builds.apache.org/job/Hadoop-Yarn-trunk/746):
 {code}
 Failed tests:
   TestRMRestart.testRMRestartGetApplicationList:957
 rMAppManager.logApplicationSummary(
 isA(org.apache.hadoop.yarn.api.records.ApplicationId)
 );
 Wanted 3 times:
 - at 
 org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:957)
 But was 2 times:
 - at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:66)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3850) NM fails to read files from full disks which can lead to container logs being lost and other issues

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604112#comment-14604112
 ] 

Hudson commented on YARN-3850:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #971 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/971/])
YARN-3850. NM fails to read files from full disks which can lead to container 
logs being lost and other issues. Contributed by Varun Saxena (jlowe: rev 
40b256949ad6f6e0dbdd248f2d257b05899f4332)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/RecoveredContainerLaunch.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java


 NM fails to read files from full disks which can lead to container logs being 
 lost and other issues
 ---

 Key: YARN-3850
 URL: https://issues.apache.org/jira/browse/YARN-3850
 Project: Hadoop YARN
  Issue Type: Bug
  Components: log-aggregation, nodemanager
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
Priority: Blocker
 Fix For: 2.7.1

 Attachments: YARN-3850.01.patch, YARN-3850.02.patch


 *Container logs* can be lost if the disk has become full (~90% full).
 When an application finishes, we upload logs after aggregation by calling 
 {{AppLogAggregatorImpl#uploadLogsForContainers}}. But this call in turn 
 checks the eligible directories via a call to 
 {{LocalDirsHandlerService#getLogDirs}}, which in case of a full disk would 
 return nothing. So none of the container logs are aggregated and uploaded.
 But on application finish, we also call 
 {{AppLogAggregatorImpl#doAppLogAggregationPostCleanUp()}}. This deletes the 
 application directory which contains the container logs, because it calls 
 {{LocalDirsHandlerService#getLogDirsForCleanup}}, which returns the full disks 
 as well.
 So we are left with neither the aggregated logs for the app nor the individual 
 container logs for the app.
 In addition to this, there are 2 more issues:
 # {{ContainerLogsUtils#getContainerLogDirs}} does not consider full disks, so 
 the NM will fail to serve logs from full disks via its web interfaces.
 # {{RecoveredContainerLaunch#locatePidFile}} also does not consider full 
 disks, so it is possible that on container recovery the PID file is not found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2871) TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604111#comment-14604111
 ] 

Hudson commented on YARN-2871:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #971 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/971/])
YARN-2871. TestRMRestart#testRMRestartGetApplicationList sometime fails (xgong: 
rev fe6c1bd73aee188ed58df4d33bbc2d2fe0779a97)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
* hadoop-yarn-project/CHANGES.txt


 TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk
 -

 Key: YARN-2871
 URL: https://issues.apache.org/jira/browse/YARN-2871
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Ted Yu
Assignee: zhihai xu
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-2871.000.patch, YARN-2871.001.patch, 
 YARN-2871.002.patch


 From trunk build #746 (https://builds.apache.org/job/Hadoop-Yarn-trunk/746):
 {code}
 Failed tests:
   TestRMRestart.testRMRestartGetApplicationList:957
 rMAppManager.logApplicationSummary(
 isA(org.apache.hadoop.yarn.api.records.ApplicationId)
 );
 Wanted 3 times:
 - at 
 org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:957)
 But was 2 times:
 - at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:66)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3846) RM Web UI queue filter not working

2015-06-27 Thread Mohammad Shahid Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604131#comment-14604131
 ] 

Mohammad Shahid Khan commented on YARN-3846:


https://issues.apache.org/jira/browse/YARN-2238 is for a different fix.
The label "Queue:" was added before the queue name in 
https://issues.apache.org/jira/browse/YARN-3362. After that the search was not 
working, and that has been handled in 
https://issues.apache.org/jira/browse/YARN-3707.
That fix is OK for the first child of the queue 
but will not work for deeper children, 
for example Queue: b.x:

  Queue: root
    Queue: a    _*For queue a it will work fine*_
    Queue: b
      Queue: b.x   _*But for queues b.x and b.y this will not work*_
      Queue: b.y

*My question:*
  What is the significance of adding the label *Queue:* before the queue name?
 



 RM Web UI queue filter not working
 ---

 Key: YARN-3846
 URL: https://issues.apache.org/jira/browse/YARN-3846
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
Reporter: Mohammad Shahid Khan
Assignee: Mohammad Shahid Khan

 Clicking on the root queue shows all applications,
 but clicking on a leaf queue does not filter the applications related to 
 the clicked queue.
 The regular expression seems to be wrong: 
 {code}
 q = '^' + q.substr(q.lastIndexOf(':') + 2) + '$';
 {code}
 For example:
 1. Suppose the queue name is b.
 Then the above expression will substr at index 1, because 
 q.lastIndexOf(':') = -1 and 
 -1 + 2 = 1, 
 which is wrong; it should look at index 0.
 2. If the queue name is ab.x, 
 then it will be parsed to .x, 
 but it should be x.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3850) NM fails to read files from full disks which can lead to container logs being lost and other issues

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604166#comment-14604166
 ] 

Hudson commented on YARN-3850:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #230 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/230/])
YARN-3850. NM fails to read files from full disks which can lead to container 
logs being lost and other issues. Contributed by Varun Saxena (jlowe: rev 
40b256949ad6f6e0dbdd248f2d257b05899f4332)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/RecoveredContainerLaunch.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java


 NM fails to read files from full disks which can lead to container logs being 
 lost and other issues
 ---

 Key: YARN-3850
 URL: https://issues.apache.org/jira/browse/YARN-3850
 Project: Hadoop YARN
  Issue Type: Bug
  Components: log-aggregation, nodemanager
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
Priority: Blocker
 Fix For: 2.7.1

 Attachments: YARN-3850.01.patch, YARN-3850.02.patch


 *Container logs* can be lost if the disk has become full (~90% full).
 When an application finishes, we upload logs after aggregation by calling 
 {{AppLogAggregatorImpl#uploadLogsForContainers}}. But this call in turn 
 checks the eligible directories via a call to 
 {{LocalDirsHandlerService#getLogDirs}}, which in case of a full disk would 
 return nothing. So none of the container logs are aggregated and uploaded.
 But on application finish, we also call 
 {{AppLogAggregatorImpl#doAppLogAggregationPostCleanUp()}}. This deletes the 
 application directory which contains the container logs, because it calls 
 {{LocalDirsHandlerService#getLogDirsForCleanup}}, which returns the full disks 
 as well.
 So we are left with neither the aggregated logs for the app nor the individual 
 container logs for the app.
 In addition to this, there are 2 more issues:
 # {{ContainerLogsUtils#getContainerLogDirs}} does not consider full disks, so 
 the NM will fail to serve logs from full disks via its web interfaces.
 # {{RecoveredContainerLaunch#locatePidFile}} also does not consider full 
 disks, so it is possible that on container recovery the PID file is not found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3846) RM Web UI queue filter not working

2015-06-27 Thread Mohammad Shahid Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604132#comment-14604132
 ] 

Mohammad Shahid Khan commented on YARN-3846:


Please confirm whether we have to keep the label "Queue:" or not.
Then I will submit the patch accordingly.


 RM Web UI queue filter not working
 ---

 Key: YARN-3846
 URL: https://issues.apache.org/jira/browse/YARN-3846
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
Reporter: Mohammad Shahid Khan
Assignee: Mohammad Shahid Khan

 Clicking on the root queue shows all applications,
 but clicking on a leaf queue does not filter the applications related to 
 the clicked queue.
 The regular expression seems to be wrong: 
 {code}
 q = '^' + q.substr(q.lastIndexOf(':') + 2) + '$';
 {code}
 For example:
 1. Suppose the queue name is b.
 Then the above expression will substr at index 1, because 
 q.lastIndexOf(':') = -1 and 
 -1 + 2 = 1, 
 which is wrong; it should look at index 0.
 2. If the queue name is ab.x, 
 then it will be parsed to .x, 
 but it should be x.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2871) TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604165#comment-14604165
 ] 

Hudson commented on YARN-2871:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #230 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/230/])
YARN-2871. TestRMRestart#testRMRestartGetApplicationList sometime fails (xgong: 
rev fe6c1bd73aee188ed58df4d33bbc2d2fe0779a97)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
* hadoop-yarn-project/CHANGES.txt


 TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk
 -

 Key: YARN-2871
 URL: https://issues.apache.org/jira/browse/YARN-2871
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Ted Yu
Assignee: zhihai xu
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-2871.000.patch, YARN-2871.001.patch, 
 YARN-2871.002.patch


 From trunk build #746 (https://builds.apache.org/job/Hadoop-Yarn-trunk/746):
 {code}
 Failed tests:
   TestRMRestart.testRMRestartGetApplicationList:957
 rMAppManager.logApplicationSummary(
 isA(org.apache.hadoop.yarn.api.records.ApplicationId)
 );
 Wanted 3 times:
 - at 
 org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:957)
 But was 2 times:
 - at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:66)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3850) NM fails to read files from full disks which can lead to container logs being lost and other issues

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604105#comment-14604105
 ] 

Hudson commented on YARN-3850:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #241 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/241/])
YARN-3850. NM fails to read files from full disks which can lead to container 
logs being lost and other issues. Contributed by Varun Saxena (jlowe: rev 
40b256949ad6f6e0dbdd248f2d257b05899f4332)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/RecoveredContainerLaunch.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java


 NM fails to read files from full disks which can lead to container logs being 
 lost and other issues
 ---

 Key: YARN-3850
 URL: https://issues.apache.org/jira/browse/YARN-3850
 Project: Hadoop YARN
  Issue Type: Bug
  Components: log-aggregation, nodemanager
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
Priority: Blocker
 Fix For: 2.7.1

 Attachments: YARN-3850.01.patch, YARN-3850.02.patch


 *Container logs* can be lost if the disk has become full (~90% full).
 When an application finishes, we upload logs after aggregation by calling 
 {{AppLogAggregatorImpl#uploadLogsForContainers}}. But this call in turn 
 checks the eligible directories via a call to 
 {{LocalDirsHandlerService#getLogDirs}}, which in case of a full disk would 
 return nothing. So none of the container logs are aggregated and uploaded.
 But on application finish, we also call 
 {{AppLogAggregatorImpl#doAppLogAggregationPostCleanUp()}}. This deletes the 
 application directory which contains the container logs, because it calls 
 {{LocalDirsHandlerService#getLogDirsForCleanup}}, which returns the full disks 
 as well.
 So we are left with neither the aggregated logs for the app nor the individual 
 container logs for the app.
 In addition to this, there are 2 more issues:
 # {{ContainerLogsUtils#getContainerLogDirs}} does not consider full disks, so 
 the NM will fail to serve logs from full disks via its web interfaces.
 # {{RecoveredContainerLaunch#locatePidFile}} also does not consider full 
 disks, so it is possible that on container recovery the PID file is not found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2871) TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604104#comment-14604104
 ] 

Hudson commented on YARN-2871:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #241 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/241/])
YARN-2871. TestRMRestart#testRMRestartGetApplicationList sometime fails (xgong: 
rev fe6c1bd73aee188ed58df4d33bbc2d2fe0779a97)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
* hadoop-yarn-project/CHANGES.txt


 TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk
 -

 Key: YARN-2871
 URL: https://issues.apache.org/jira/browse/YARN-2871
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Ted Yu
Assignee: zhihai xu
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-2871.000.patch, YARN-2871.001.patch, 
 YARN-2871.002.patch


 From trunk build #746 (https://builds.apache.org/job/Hadoop-Yarn-trunk/746):
 {code}
 Failed tests:
   TestRMRestart.testRMRestartGetApplicationList:957
 rMAppManager.logApplicationSummary(
 isA(org.apache.hadoop.yarn.api.records.ApplicationId)
 );
 Wanted 3 times:
 - at 
 org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:957)
 But was 2 times:
 - at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:66)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3848) TestNodeLabelContainerAllocation is timing out

2015-06-27 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3848:
---
Attachment: YARN-3848.01.patch

I could have put a sleep in the test, but checked for the dispatcher queue 
being drained instead.
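
As a minimal sketch of that idea (this is not the MockRM/DrainDispatcher API; the queue and timeout below are invented for illustration), polling until an event queue is empty instead of sleeping for a fixed time could look like this:
{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class DrainCheckSketch {
  // Poll until the queue is empty instead of sleeping a fixed amount of time.
  static void waitForDrained(BlockingQueue<?> events, long timeoutMs)
      throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (!events.isEmpty()) {
      if (System.nanoTime() > deadline) {
        throw new AssertionError("events not drained within " + timeoutMs + " ms");
      }
      Thread.sleep(10);  // short poll interval, bounded by the deadline above
    }
  }

  public static void main(String[] args) throws Exception {
    final BlockingQueue<String> events = new LinkedBlockingQueue<String>();
    events.add("NODE_UPDATE");
    // Simulate a dispatcher thread consuming the pending event.
    new Thread(new Runnable() {
      public void run() {
        try { Thread.sleep(50); } catch (InterruptedException e) { return; }
        events.clear();
      }
    }).start();
    waitForDrained(events, 1000);  // returns once empty; safe to stop the service now
    System.out.println("drained");
  }
}
{code}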

 TestNodeLabelContainerAllocation is timing out
 --

 Key: YARN-3848
 URL: https://issues.apache.org/jira/browse/YARN-3848
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Reporter: Jason Lowe
Assignee: Varun Saxena
 Attachments: YARN-3848.01.patch, test_output.txt


 A number of builds, pre-commit and otherwise, have been failing recently 
 because TestNodeLabelContainerAllocation has timed out.  See 
 https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, 
 or YARN-3826 for examples.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3859) LeafQueue doesn't print user properly for application add

2015-06-27 Thread Devaraj K (JIRA)
Devaraj K created YARN-3859:
---

 Summary: LeafQueue doesn't print user properly for application add
 Key: YARN-3859
 URL: https://issues.apache.org/jira/browse/YARN-3859
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.7.0
Reporter: Devaraj K
Priority: Minor


{code:xml}
2015-06-28 04:36:22,721 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
Application added - appId: application_1435446241489_0003 user: 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8,
 leaf-queue: default #user-pending-applications: 2 #user-active-applications: 1 
#queue-pending-applications: 2 #queue-active-applications: 1
{code}
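
The {{LeafQueue$User@e8fb7a8}} in the log above is the default {{Object.toString()}} output (class name plus hash code), which is what shows up when the {{User}} object itself is logged rather than the user name. A minimal illustration with a hypothetical {{User}} class, not the actual LeafQueue code:
{code}
public class UserToStringSketch {
  static class User {
    private final String userName;
    User(String userName) { this.userName = userName; }
    String getUserName() { return userName; }
    @Override
    public String toString() { return userName; }  // one possible fix shape
  }

  public static void main(String[] args) {
    Object raw = new Object();
    System.out.println(raw);                         // e.g. java.lang.Object@e8fb7a8
    User u = new User("hadoopuser");
    System.out.println("user: " + u);                // user: hadoopuser
    System.out.println("user: " + u.getUserName());  // or log the name directly
  }
}
{code}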



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3859) LeafQueue doesn't print user properly for application add

2015-06-27 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena reassigned YARN-3859:
--

Assignee: Varun Saxena

 LeafQueue doesn't print user properly for application add
 -

 Key: YARN-3859
 URL: https://issues.apache.org/jira/browse/YARN-3859
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.7.0
Reporter: Devaraj K
Assignee: Varun Saxena
Priority: Minor

 {code:xml}
 2015-06-28 04:36:22,721 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
 Application added - appId: application_1435446241489_0003 user: 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8,
  leaf-queue: default #user-pending-applications: 2 #user-active-applications: 
 1 #queue-pending-applications: 2 #queue-active-applications: 1
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3859) LeafQueue doesn't print user properly for application add

2015-06-27 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3859:
---
Attachment: YARN-3859.01.patch

[~devaraj.k], kindly review

 LeafQueue doesn't print user properly for application add
 -

 Key: YARN-3859
 URL: https://issues.apache.org/jira/browse/YARN-3859
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.7.0
Reporter: Devaraj K
Assignee: Varun Saxena
Priority: Minor
 Attachments: YARN-3859.01.patch


 {code:xml}
 2015-06-28 04:36:22,721 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
 Application added - appId: application_1435446241489_0003 user: 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8,
  leaf-queue: default #user-pending-applications: 2 #user-active-applications: 
 1 #queue-pending-applications: 2 #queue-active-applications: 1
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3830) AbstractYarnScheduler.createReleaseCache may try to clean a null attempt

2015-06-27 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604299#comment-14604299
 ] 

Devaraj K commented on YARN-3830:
-

Nice catch [~nijel]. Thanks for working on this.

Can you add a test to cover this change? Thanks.


 AbstractYarnScheduler.createReleaseCache may try to clean a null attempt
 

 Key: YARN-3830
 URL: https://issues.apache.org/jira/browse/YARN-3830
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: nijel
Assignee: nijel
 Attachments: YARN-3830_1.patch, YARN-3830_2.patch, YARN-3830_3.patch


 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.createReleaseCache()
 {code}
 protected void createReleaseCache() {
   // Cleanup the cache after nm expire interval.
   new Timer().schedule(new TimerTask() {
     @Override
     public void run() {
       for (SchedulerApplication<T> app : applications.values()) {
         T attempt = app.getCurrentAppAttempt();
         synchronized (attempt) {
           for (ContainerId containerId : attempt.getPendingRelease()) {
             RMAuditLogger.logFailure(
 {code}
 Here the attempt can be null since the attempt is created later, so a null 
 pointer exception will occur:
 {code}
 2015-06-19 09:29:16,195 | ERROR | Timer-3 | Thread Thread[Timer-3,5,main] 
 threw an Exception. | YarnUncaughtExceptionHandler.java:68
 java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler$1.run(AbstractYarnScheduler.java:457)
   at java.util.TimerThread.mainLoop(Timer.java:555)
   at java.util.TimerThread.run(Timer.java:505)
 {code}
 This will skip the other applications in this run.
 We can add a null check and continue with the other applications.
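
A minimal, self-contained sketch of that suggested fix shape (the types below are hypothetical stand-ins, not the actual {{AbstractYarnScheduler}} code): skip applications whose current attempt does not exist yet and keep processing the rest.
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ReleaseCacheCleanupSketch {
  static class Attempt { }                 // stand-in for the app attempt type
  static class App {                       // stand-in for SchedulerApplication<T>
    volatile Attempt current;              // may still be null right after submission
    Attempt getCurrentAppAttempt() { return current; }
  }

  static final Map<String, App> applications = new ConcurrentHashMap<String, App>();

  static void cleanupReleaseCache() {
    for (App app : applications.values()) {
      Attempt attempt = app.getCurrentAppAttempt();
      if (attempt == null) {
        // Attempt not created yet: skip this app instead of throwing an NPE,
        // so the remaining applications in this run are still processed.
        continue;
      }
      synchronized (attempt) {
        // ... clean up pending releases for this attempt ...
      }
    }
  }

  public static void main(String[] args) {
    applications.put("app_1", new App());  // application with no attempt yet
    cleanupReleaseCache();                 // completes without a NullPointerException
    System.out.println("cleanup finished");
  }
}
{code}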



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3850) NM fails to read files from full disks which can lead to container logs being lost and other issues

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604191#comment-14604191
 ] 

Hudson commented on YARN-3850:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #239 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/239/])
YARN-3850. NM fails to read files from full disks which can lead to container 
logs being lost and other issues. Contributed by Varun Saxena (jlowe: rev 
40b256949ad6f6e0dbdd248f2d257b05899f4332)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/RecoveredContainerLaunch.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java
* hadoop-yarn-project/CHANGES.txt


 NM fails to read files from full disks which can lead to container logs being 
 lost and other issues
 ---

 Key: YARN-3850
 URL: https://issues.apache.org/jira/browse/YARN-3850
 Project: Hadoop YARN
  Issue Type: Bug
  Components: log-aggregation, nodemanager
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
Priority: Blocker
 Fix For: 2.7.1

 Attachments: YARN-3850.01.patch, YARN-3850.02.patch


 *Container logs* can be lost if the disk has become full (~90% full).
 When an application finishes, we upload logs after aggregation by calling 
 {{AppLogAggregatorImpl#uploadLogsForContainers}}. But this call in turn 
 checks the eligible directories via a call to 
 {{LocalDirsHandlerService#getLogDirs}}, which in case of a full disk would 
 return nothing. So none of the container logs are aggregated and uploaded.
 But on application finish, we also call 
 {{AppLogAggregatorImpl#doAppLogAggregationPostCleanUp()}}. This deletes the 
 application directory which contains the container logs, because it calls 
 {{LocalDirsHandlerService#getLogDirsForCleanup}}, which returns the full disks 
 as well.
 So we are left with neither the aggregated logs for the app nor the individual 
 container logs for the app.
 In addition to this, there are 2 more issues:
 # {{ContainerLogsUtils#getContainerLogDirs}} does not consider full disks, so 
 the NM will fail to serve logs from full disks via its web interfaces.
 # {{RecoveredContainerLaunch#locatePidFile}} also does not consider full 
 disks, so it is possible that on container recovery the PID file is not found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2871) TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604190#comment-14604190
 ] 

Hudson commented on YARN-2871:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #239 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/239/])
YARN-2871. TestRMRestart#testRMRestartGetApplicationList sometime fails (xgong: 
rev fe6c1bd73aee188ed58df4d33bbc2d2fe0779a97)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java


 TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk
 -

 Key: YARN-2871
 URL: https://issues.apache.org/jira/browse/YARN-2871
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Ted Yu
Assignee: zhihai xu
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-2871.000.patch, YARN-2871.001.patch, 
 YARN-2871.002.patch


 From trunk build #746 (https://builds.apache.org/job/Hadoop-Yarn-trunk/746):
 {code}
 Failed tests:
   TestRMRestart.testRMRestartGetApplicationList:957
 rMAppManager.logApplicationSummary(
 isA(org.apache.hadoop.yarn.api.records.ApplicationId)
 );
 Wanted 3 times:
 - at 
 org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:957)
 But was 2 times:
 - at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:66)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3850) NM fails to read files from full disks which can lead to container logs being lost and other issues

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604200#comment-14604200
 ] 

Hudson commented on YARN-3850:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2187 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2187/])
YARN-3850. NM fails to read files from full disks which can lead to container 
logs being lost and other issues. Contributed by Varun Saxena (jlowe: rev 
40b256949ad6f6e0dbdd248f2d257b05899f4332)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/RecoveredContainerLaunch.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java
* hadoop-yarn-project/CHANGES.txt


 NM fails to read files from full disks which can lead to container logs being 
 lost and other issues
 ---

 Key: YARN-3850
 URL: https://issues.apache.org/jira/browse/YARN-3850
 Project: Hadoop YARN
  Issue Type: Bug
  Components: log-aggregation, nodemanager
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
Priority: Blocker
 Fix For: 2.7.1

 Attachments: YARN-3850.01.patch, YARN-3850.02.patch


 *Container logs* can be lost if the disk has become full (~90% full).
 When an application finishes, we upload logs after aggregation by calling 
 {{AppLogAggregatorImpl#uploadLogsForContainers}}. But this call in turn 
 checks the eligible directories via a call to 
 {{LocalDirsHandlerService#getLogDirs}}, which in case of a full disk would 
 return nothing. So none of the container logs are aggregated and uploaded.
 But on application finish, we also call 
 {{AppLogAggregatorImpl#doAppLogAggregationPostCleanUp()}}. This deletes the 
 application directory which contains the container logs, because it calls 
 {{LocalDirsHandlerService#getLogDirsForCleanup}}, which returns the full disks 
 as well.
 So we are left with neither the aggregated logs for the app nor the individual 
 container logs for the app.
 In addition to this, there are 2 more issues:
 # {{ContainerLogsUtils#getContainerLogDirs}} does not consider full disks, so 
 the NM will fail to serve logs from full disks via its web interfaces.
 # {{RecoveredContainerLaunch#locatePidFile}} also does not consider full 
 disks, so it is possible that on container recovery the PID file is not found.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode

2015-06-27 Thread Devaraj K (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604236#comment-14604236
 ] 

Devaraj K commented on YARN-3857:
-

Thanks [~mujunchao] for reporting and working on this. I am assigning this 
issue to you.

Adding to [~zxu]'s comments, can you also take care of the naming convention for 
the patch, like JIRA-ID-patch-version.patch?


 Memory leak in ResourceManager with SIMPLE mode
 ---

 Key: YARN-3857
 URL: https://issues.apache.org/jira/browse/YARN-3857
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: mujunchao
Assignee: mujunchao
Priority: Critical
 Attachments: hadoop-yarn-server-resourcemanager.patch


  We register the ClientTokenMasterKey to avoid the client holding an invalid 
 ClientToken after the RM restarts. In SIMPLE mode, we register 
 Pair<ApplicationAttemptId, null>, but we never remove it from the HashMap, as 
 unregister only runs in secure mode, so there is a memory leak. 
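
For illustration only (hypothetical class and method names, not the actual RM token code), the leak shape described above is: every attempt is put into the map, but removal is guarded by the security check, so in SIMPLE mode entries accumulate.
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ClientTokenRegistrySketch {
  private final Map<String, byte[]> masterKeys = new ConcurrentHashMap<String, byte[]>();
  private final boolean securityEnabled;

  ClientTokenRegistrySketch(boolean securityEnabled) {
    this.securityEnabled = securityEnabled;
  }

  void registerApplication(String attemptId) {
    // Registered in both SIMPLE and secure mode (value may be a placeholder).
    masterKeys.put(attemptId, new byte[0]);
  }

  void unregisterApplication(String attemptId) {
    if (securityEnabled) {        // leak: in SIMPLE mode the entry is never removed
      masterKeys.remove(attemptId);
    }
  }

  int size() { return masterKeys.size(); }

  public static void main(String[] args) {
    ClientTokenRegistrySketch simple = new ClientTokenRegistrySketch(false);
    simple.registerApplication("appattempt_1");
    simple.unregisterApplication("appattempt_1");
    System.out.println(simple.size());  // 1 -> the entry lingers, illustrating the leak
  }
}
{code}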



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode

2015-06-27 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated YARN-3857:

Assignee: mujunchao

 Memory leak in ResourceManager with SIMPLE mode
 ---

 Key: YARN-3857
 URL: https://issues.apache.org/jira/browse/YARN-3857
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: mujunchao
Assignee: mujunchao
Priority: Critical
 Attachments: hadoop-yarn-server-resourcemanager.patch


  We register the ClientTokenMasterKey to avoid the client holding an invalid 
 ClientToken after the RM restarts. In SIMPLE mode, we register 
 Pair<ApplicationAttemptId, null>, but we never remove it from the HashMap, as 
 unregister only runs in secure mode, so there is a memory leak. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode

2015-06-27 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604240#comment-14604240
 ] 

Brahma Reddy Battula commented on YARN-3857:


Nice Catch!!

 Memory leak in ResourceManager with SIMPLE mode
 ---

 Key: YARN-3857
 URL: https://issues.apache.org/jira/browse/YARN-3857
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: mujunchao
Assignee: mujunchao
Priority: Critical
 Attachments: hadoop-yarn-server-resourcemanager.patch


  We register the ClientTokenMasterKey to avoid the client holding an invalid 
 ClientToken after the RM restarts. In SIMPLE mode, we register 
 Pair<ApplicationAttemptId, null>, but we never remove it from the HashMap, as 
 unregister only runs in secure mode, so there is a memory leak. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3848) TestNodeLabelContainerAllocation is timing out

2015-06-27 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena reassigned YARN-3848:
--

Assignee: Varun Saxena

 TestNodeLabelContainerAllocation is timing out
 --

 Key: YARN-3848
 URL: https://issues.apache.org/jira/browse/YARN-3848
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Reporter: Jason Lowe
Assignee: Varun Saxena

 A number of builds, pre-commit and otherwise, have been failing recently 
 because TestNodeLabelContainerAllocation has timed out.  See 
 https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, 
 or YARN-3826 for examples.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3848) TestNodeLabelContainerAllocation is timing out

2015-06-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604255#comment-14604255
 ] 

Varun Saxena commented on YARN-3848:


I mean the test does not have a timeout.

 TestNodeLabelContainerAllocation is timing out
 --

 Key: YARN-3848
 URL: https://issues.apache.org/jira/browse/YARN-3848
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Reporter: Jason Lowe
Assignee: Varun Saxena
 Attachments: test_output.txt


 A number of builds, pre-commit and otherwise, have been failing recently 
 because TestNodeLabelContainerAllocation has timed out.  See 
 https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, 
 or YARN-3826 for examples.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2871) TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604199#comment-14604199
 ] 

Hudson commented on YARN-2871:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2187 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2187/])
YARN-2871. TestRMRestart#testRMRestartGetApplicationList sometime fails (xgong: 
rev fe6c1bd73aee188ed58df4d33bbc2d2fe0779a97)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
* hadoop-yarn-project/CHANGES.txt


 TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk
 -

 Key: YARN-2871
 URL: https://issues.apache.org/jira/browse/YARN-2871
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Ted Yu
Assignee: zhihai xu
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-2871.000.patch, YARN-2871.001.patch, 
 YARN-2871.002.patch


 From trunk build #746 (https://builds.apache.org/job/Hadoop-Yarn-trunk/746):
 {code}
 Failed tests:
   TestRMRestart.testRMRestartGetApplicationList:957
 rMAppManager.logApplicationSummary(
 isA(org.apache.hadoop.yarn.api.records.ApplicationId)
 );
 Wanted 3 times:
 - at 
 org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:957)
 But was 2 times:
 - at 
 org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:66)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3848) TestNodeLabelContainerAllocation is timing out

2015-06-27 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3848:
---
Attachment: test_output.txt

 TestNodeLabelContainerAllocation is timing out
 --

 Key: YARN-3848
 URL: https://issues.apache.org/jira/browse/YARN-3848
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Reporter: Jason Lowe
Assignee: Varun Saxena
 Attachments: test_output.txt


 A number of builds, pre-commit and otherwise, have been failing recently 
 because TestNodeLabelContainerAllocation has timed out.  See 
 https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, 
 or YARN-3826 for examples.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3848) TestNodeLabelContainerAllocation is timing out

2015-06-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604254#comment-14604254
 ] 

Varun Saxena commented on YARN-3848:


The test which is failing is 
{{testQueueMaxCapacitiesWillNotBeHonoredWhenNotRespectingExclusivity}}. The test 
output has been attached.
Basically, MockRM is being stopped while the dispatcher still has events in its 
queue, which leads to an InterruptedException. JUnit wrongly interprets this as 
a timeout, even though it isn't.

 TestNodeLabelContainerAllocation is timing out
 --

 Key: YARN-3848
 URL: https://issues.apache.org/jira/browse/YARN-3848
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Reporter: Jason Lowe
Assignee: Varun Saxena
 Attachments: test_output.txt


 A number of builds, pre-commit and otherwise, have been failing recently 
 because TestNodeLabelContainerAllocation has timed out.  See 
 https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, 
 or YARN-3826 for examples.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3848) TestNodeLabelContainerAllocation is timing out

2015-06-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604309#comment-14604309
 ] 

Hadoop QA commented on YARN-3848:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 20s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 38s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 34s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 30s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 59s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   1m 57s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |  50m 53s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  94m 27s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12742338/YARN-3848.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 79ed0f9 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8362/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8362/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8362/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8362/console |


This message was automatically generated.

 TestNodeLabelContainerAllocation is timing out
 --

 Key: YARN-3848
 URL: https://issues.apache.org/jira/browse/YARN-3848
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Reporter: Jason Lowe
Assignee: Varun Saxena
 Attachments: YARN-3848.01.patch, test_output.txt


 A number of builds, pre-commit and otherwise, have been failing recently 
 because TestNodeLabelContainerAllocation has timed out.  See 
 https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, 
 or YARN-3826 for examples.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3859) LeafQueue doesn't print user properly for application add

2015-06-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604322#comment-14604322
 ] 

Hadoop QA commented on YARN-3859:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 51s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 34s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 46s | The applied patch generated  1 
new checkstyle issues (total was 151, now 151). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 25s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  50m 49s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  88m 29s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12742339/YARN-3859.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 79ed0f9 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8363/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8363/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8363/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8363/console |


This message was automatically generated.

 LeafQueue doesn't print user properly for application add
 -

 Key: YARN-3859
 URL: https://issues.apache.org/jira/browse/YARN-3859
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.7.0
Reporter: Devaraj K
Assignee: Varun Saxena
Priority: Minor
 Attachments: YARN-3859.01.patch


 {code:xml}
 2015-06-28 04:36:22,721 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
 Application added - appId: application_1435446241489_0003 user: 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8,
  leaf-queue: default #user-pending-applications: 2 #user-active-applications: 
 1 #queue-pending-applications: 2 #queue-active-applications: 1
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3859) LeafQueue doesn't print user properly for application add

2015-06-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604323#comment-14604323
 ] 

Varun Saxena commented on YARN-3859:


The checkstyle issue is related to the file length.

 LeafQueue doesn't print user properly for application add
 -

 Key: YARN-3859
 URL: https://issues.apache.org/jira/browse/YARN-3859
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.7.0
Reporter: Devaraj K
Assignee: Varun Saxena
Priority: Minor
 Attachments: YARN-3859.01.patch


 {code:xml}
 2015-06-28 04:36:22,721 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
 Application added - appId: application_1435446241489_0003 user: 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8,
  leaf-queue: default #user-pending-applications: 2 #user-active-applications: 
 1 #queue-pending-applications: 2 #queue-active-applications: 1
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3768) Index out of range exception with environment variables without values

2015-06-27 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-3768:

Attachment: YARN-3768.002.patch

You are right [~zxu], and I actually meant to combine matching k=v pairs and 
capturing k and v in one shot.

 Index out of range exception with environment variables without values
 --

 Key: YARN-3768
 URL: https://issues.apache.org/jira/browse/YARN-3768
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.5.0
Reporter: Joe Ferner
Assignee: zhihai xu
 Attachments: YARN-3768.000.patch, YARN-3768.001.patch, 
 YARN-3768.002.patch


 Looking at line 80 of org.apache.hadoop.yarn.util.Apps, an index out of range 
 exception occurs if an environment variable is encountered without a value.
 I believe this occurs because Java will not return empty strings from the 
 split method. Similar to this: 
 http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3768) Index out of range exception with environment variables without values

2015-06-27 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604372#comment-14604372
 ] 

Gera Shegalov commented on YARN-3768:
-

002 attached, with this idea and proper name validation.

 Index out of range exception with environment variables without values
 --

 Key: YARN-3768
 URL: https://issues.apache.org/jira/browse/YARN-3768
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.5.0
Reporter: Joe Ferner
Assignee: zhihai xu
 Attachments: YARN-3768.000.patch, YARN-3768.001.patch, 
 YARN-3768.002.patch


 Looking at line 80 of org.apache.hadoop.yarn.util.Apps, an index out of range 
 exception occurs if an environment variable is encountered without a value.
 I believe this occurs because Java will not return empty strings from the 
 split method. Similar to this: 
 http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3768) Index out of range exception with environment variables without values

2015-06-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604401#comment-14604401
 ] 

Hadoop QA commented on YARN-3768:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 45s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 47s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 25s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  22m  1s | Tests passed in 
hadoop-common. |
| {color:green}+1{color} | yarn tests |   1m 56s | Tests passed in 
hadoop-yarn-common. |
| | |  66m 36s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12742353/YARN-3768.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 79ed0f9 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8364/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8364/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8364/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8364/console |


This message was automatically generated.

 Index out of range exception with environment variables without values
 --

 Key: YARN-3768
 URL: https://issues.apache.org/jira/browse/YARN-3768
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.5.0
Reporter: Joe Ferner
Assignee: zhihai xu
 Attachments: YARN-3768.000.patch, YARN-3768.001.patch, 
 YARN-3768.002.patch


 Looking at line 80 of org.apache.hadoop.yarn.util.Apps, an index out of range 
 exception occurs if an environment variable is encountered without a value.
 I believe this occurs because Java will not return empty strings from the 
 split method. Similar to this: 
 http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3859) LeafQueue doesn't print user properly for application add

2015-06-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604528#comment-14604528
 ] 

Hudson commented on YARN-3859:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8078 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8078/])
YARN-3859. LeafQueue doesn't print user properly for application add. (devaraj: 
rev b543d1a390a67e5e92fea67d3a2635058c29e9da)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
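
For readers following along: the odd {{LeafQueue$User@e8fb7a8}} in the log 
above is Java's default {{Object#toString}} output. A tiny, self-contained 
illustration of the pattern (hypothetical classes, not the committed change):
{code}
// Logging the User object falls back to Object#toString and prints
// "User@<hashcode>"; logging the user name prints what operators expect.
public class UserLoggingSketch {
  static class User {
    private final String userName;
    User(String userName) { this.userName = userName; }
    String getUserName() { return userName; }
  }

  public static void main(String[] args) {
    User user = new User("alice");
    // Default Object#toString: prints something like User@1b6d3586.
    System.out.println("Application added - user: " + user);
    // Explicitly logging the name gives the intended output.
    System.out.println("Application added - user: " + user.getUserName());
  }
}
{code}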


 LeafQueue doesn't print user properly for application add
 -

 Key: YARN-3859
 URL: https://issues.apache.org/jira/browse/YARN-3859
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.7.0
Reporter: Devaraj K
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3859.01.patch


 {code:xml}
 2015-06-28 04:36:22,721 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
 Application added - appId: application_1435446241489_0003 user: 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8,
  leaf-queue: default #user-pending-applications: 2 #user-active-applications: 
 1 #queue-pending-applications: 2 #queue-active-applications: 1
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node

2015-06-27 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated YARN-3860:
---
Attachment: YARN-3860.002.patch

I fixed the inappropriate name of the argument variable of getTargetIds in 002.

 rmadmin -transitionToActive should check the state of non-target node
 -

 Key: YARN-3860
 URL: https://issues.apache.org/jira/browse/YARN-3860
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
 Attachments: YARN-3860.001.patch, YARN-3860.002.patch


 Users can make both ResourceManagers active with {{rmadmin -transitionToActive}} 
 even if the {{\--forceactive}} option is not given. {{haadmin 
 -transitionToActive}} of HDFS checks whether non-target nodes are already 
 active, but {{rmadmin -transitionToActive}} does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node

2015-06-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604506#comment-14604506
 ] 

Hadoop QA commented on YARN-3860:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 41s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 29s | The applied patch generated  1 
new checkstyle issues (total was 38, now 39). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 51s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  19m 52s | Tests failed in 
hadoop-yarn-client. |
| | |  56m 34s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | 
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12742365/YARN-3860.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 79ed0f9 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8365/artifact/patchprocess/diffcheckstylehadoop-yarn-client.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/8365/artifact/patchprocess/whitespace.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8365/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8365/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8365/console |


This message was automatically generated.

 rmadmin -transitionToActive should check the state of non-target node
 -

 Key: YARN-3860
 URL: https://issues.apache.org/jira/browse/YARN-3860
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
 Attachments: YARN-3860.001.patch, YARN-3860.002.patch


 Users can make both ResourceManagers active with {{rmadmin -transitionToActive}} 
 even if the {{\--forceactive}} option is not given. {{haadmin 
 -transitionToActive}} of HDFS checks whether non-target nodes are already 
 active, but {{rmadmin -transitionToActive}} does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3859) LeafQueue doesn't print user properly for application add

2015-06-27 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated YARN-3859:

Hadoop Flags: Reviewed

+1 for the patch, committing it.

 LeafQueue doesn't print user properly for application add
 -

 Key: YARN-3859
 URL: https://issues.apache.org/jira/browse/YARN-3859
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.7.0
Reporter: Devaraj K
Assignee: Varun Saxena
Priority: Minor
 Attachments: YARN-3859.01.patch


 {code:xml}
 2015-06-28 04:36:22,721 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
 Application added - appId: application_1435446241489_0003 user: 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8,
  leaf-queue: default #user-pending-applications: 2 #user-active-applications: 
 1 #queue-pending-applications: 2 #queue-active-applications: 1
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3859) LeafQueue doesn't print user properly for application add

2015-06-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604540#comment-14604540
 ] 

Varun Saxena commented on YARN-3859:


Thanks [~devaraj.k] for the review and commit.

 LeafQueue doesn't print user properly for application add
 -

 Key: YARN-3859
 URL: https://issues.apache.org/jira/browse/YARN-3859
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.7.0
Reporter: Devaraj K
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3859.01.patch


 {code:xml}
 2015-06-28 04:36:22,721 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
 Application added - appId: application_1435446241489_0003 user: 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8,
  leaf-queue: default #user-pending-applications: 2 #user-active-applications: 
 1 #queue-pending-applications: 2 #queue-active-applications: 1
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node

2015-06-27 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated YARN-3860:
---
Attachment: YARN-3860.001.patch

 rmadmin -transitionToActive should check the state of non-target node
 -

 Key: YARN-3860
 URL: https://issues.apache.org/jira/browse/YARN-3860
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
 Attachments: YARN-3860.001.patch


 {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are 
 already active, but {{rmadmin -transitionToActive}} does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node

2015-06-27 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated YARN-3860:
---
Attachment: YARN-3860.003.patch

Attached 003, addressing the checkstyle and whitespace warnings.

 rmadmin -transitionToActive should check the state of non-target node
 -

 Key: YARN-3860
 URL: https://issues.apache.org/jira/browse/YARN-3860
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
 Attachments: YARN-3860.001.patch, YARN-3860.002.patch, 
 YARN-3860.003.patch


 Users can make both ResourceManagers active with {{rmadmin -transitionToActive}} 
 even if the {{\--forceactive}} option is not given. {{haadmin 
 -transitionToActive}} of HDFS checks whether non-target nodes are already 
 active, but {{rmadmin -transitionToActive}} does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node

2015-06-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604531#comment-14604531
 ] 

Hadoop QA commented on YARN-3860:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 15s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 31s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 28s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 52s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m 55s | Tests passed in 
hadoop-yarn-client. |
| | |  43m  9s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12742371/YARN-3860.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 79ed0f9 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8367/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8367/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8367/console |


This message was automatically generated.

 rmadmin -transitionToActive should check the state of non-target node
 -

 Key: YARN-3860
 URL: https://issues.apache.org/jira/browse/YARN-3860
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
 Attachments: YARN-3860.001.patch, YARN-3860.002.patch, 
 YARN-3860.003.patch


 Users can make both ResourceManagers active with {{rmadmin -transitionToActive}} 
 even if the {{\--forceactive}} option is not given. {{haadmin 
 -transitionToActive}} of HDFS checks whether non-target nodes are already 
 active, but {{rmadmin -transitionToActive}} does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node

2015-06-27 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated YARN-3860:
---
Description: Users can make both ResourceManagers active with {{rmadmin 
-transitionToActive}} even if the {{\--forceactive}} option is not given. {{haadmin 
-transitionToActive}} of HDFS checks whether non-target nodes are already 
active, but {{rmadmin -transitionToActive}} does not.  (was: {{haadmin 
-transitionToActive}} of HDFS checks whether non-target nodes are already 
active, but {{rmadmin -transitionToActive}} does not.)

 rmadmin -transitionToActive should check the state of non-target node
 -

 Key: YARN-3860
 URL: https://issues.apache.org/jira/browse/YARN-3860
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
 Attachments: YARN-3860.001.patch


 Users can make both ResourceManagers active with {{rmadmin -transitionToActive}} 
 even if the {{\--forceactive}} option is not given. {{haadmin 
 -transitionToActive}} of HDFS checks whether non-target nodes are already 
 active, but {{rmadmin -transitionToActive}} does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node

2015-06-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604509#comment-14604509
 ] 

Hadoop QA commented on YARN-3860:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 37s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 36s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 39s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 30s | The applied patch generated  1 
new checkstyle issues (total was 38, now 39). |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 52s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m 55s | Tests passed in 
hadoop-yarn-client. |
| | |  43m 42s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12742367/YARN-3860.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 79ed0f9 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8366/artifact/patchprocess/diffcheckstylehadoop-yarn-client.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/8366/artifact/patchprocess/whitespace.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8366/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8366/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8366/console |


This message was automatically generated.

 rmadmin -transitionToActive should check the state of non-target node
 -

 Key: YARN-3860
 URL: https://issues.apache.org/jira/browse/YARN-3860
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
 Attachments: YARN-3860.001.patch, YARN-3860.002.patch


 Users can make both ResourceManagers active with {{rmadmin -transitionToActive}} 
 even if the {{\--forceactive}} option is not given. {{haadmin 
 -transitionToActive}} of HDFS checks whether non-target nodes are already 
 active, but {{rmadmin -transitionToActive}} does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node

2015-06-27 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604539#comment-14604539
 ] 

zhihai xu commented on YARN-3860:
-

[~iwasakims], thanks for working on this issue. This looks like a good catch.
One nit: I think times(1) is used by default. Can we just use 
{{verify(haadmin).getServiceStatus();}}, since none of the other tests pass 
times(1) to verify?
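
A small illustration of the nit (assuming Mockito on the classpath; not part 
of the patch): {{verify(mock)}} defaults to {{times(1)}}, so the explicit 
argument is redundant.
{code}
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.times;
import static org.mockito.Mockito.verify;

import java.util.List;

public class VerifyTimesSketch {
  public static void main(String[] args) {
    @SuppressWarnings("unchecked")
    List<String> list = mock(List.class);
    list.add("one");

    verify(list, times(1)).add("one"); // explicit
    verify(list).add("one");           // same check; times(1) is the default
  }
}
{code}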

 rmadmin -transitionToActive should check the state of non-target node
 -

 Key: YARN-3860
 URL: https://issues.apache.org/jira/browse/YARN-3860
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
 Attachments: YARN-3860.001.patch, YARN-3860.002.patch, 
 YARN-3860.003.patch


 Users can make both ResourceManagers active with {{rmadmin -transitionToActive}} 
 even if the {{\--forceactive}} option is not given. {{haadmin 
 -transitionToActive}} of HDFS checks whether non-target nodes are already 
 active, but {{rmadmin -transitionToActive}} does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node

2015-06-27 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604474#comment-14604474
 ] 

Masatake Iwasaki commented on YARN-3860:


HAAdmin#isOtherTargetNodeActive does not check whether the other nodes are 
active unless HAAdmin#getTargetIds is overridden, because the default 
implementation returns a list containing only the given target id. RMAdminCLI 
should have a getTargetIds method that returns the list of all node ids, as 
DFSHAAdmin does.
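
To make that concrete, here is a self-contained sketch (hypothetical classes, 
not YARN code) of why the check is a no-op when getTargetIds returns only the 
given target id, and how overriding it changes the outcome:
{code}
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class TransitionCheckSketch {
  static class BaseAdminSketch {
    protected Collection<String> getTargetIds(String target) {
      // Default behaviour being described: only the requested node id.
      return Collections.singletonList(target);
    }

    boolean isOtherNodeActive(Map<String, String> states, String target) {
      for (String id : getTargetIds(target)) {
        if (!id.equals(target) && "ACTIVE".equals(states.get(id))) {
          return true;
        }
      }
      return false;
    }
  }

  static class RmAdminSketch extends BaseAdminSketch {
    private final Collection<String> allIds;
    RmAdminSketch(Collection<String> allIds) { this.allIds = allIds; }

    @Override
    protected Collection<String> getTargetIds(String target) {
      // Returning every node id makes the non-target check meaningful.
      return allIds;
    }
  }

  public static void main(String[] args) {
    Map<String, String> states = new HashMap<String, String>();
    states.put("rm1", "ACTIVE");
    states.put("rm2", "STANDBY");

    // Without the override the check never sees rm1, so it passes (false).
    System.out.println(new BaseAdminSketch().isOtherNodeActive(states, "rm2"));
    // With the override rm1 is seen as active, so the transition is refused.
    System.out.println(new RmAdminSketch(Arrays.asList("rm1", "rm2"))
        .isOtherNodeActive(states, "rm2"));
  }
}
{code}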

 rmadmin -transitionToActive should check the state of non-target node
 -

 Key: YARN-3860
 URL: https://issues.apache.org/jira/browse/YARN-3860
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor

 {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are 
 already active, but {{rmadmin -transitionToActive}} does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node

2015-06-27 Thread Masatake Iwasaki (JIRA)
Masatake Iwasaki created YARN-3860:
--

 Summary: rmadmin -transitionToActive should check the state of 
non-target node
 Key: YARN-3860
 URL: https://issues.apache.org/jira/browse/YARN-3860
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor


{{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are 
already active, but {{rmadmin -transitionToActive}} does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3017) ContainerID in ResourceManager Log Has Slightly Different Format From AppAttemptID

2015-06-27 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604503#comment-14604503
 ] 

zhihai xu commented on YARN-3017:
-

I just found that this change may cause a problem in LogAggregation during a 
rolling upgrade with supervised NM recovery enabled.
The following code in 
{{AggregatedLogFormat#getPendingLogFilesToUploadForThisContainer}} uploads the 
logs based on the containerId string, so we may miss uploading the old log 
files after the upgrade.
{code}
File containerLogDir =
    new File(appLogDir, ConverterUtils.toString(this.containerId));
if (!containerLogDir.isDirectory()) {
  continue; // ContainerDir may have been deleted by the user.
}
pendingUploadFiles
    .addAll(getPendingLogFilesToUpload(containerLogDir));
{code}
To handle this, we would also need to change 
{{getPendingLogFilesToUploadForThisContainer}} to compare the containerId using 
{{ContainerId#fromString}}.
It looks like it makes sense to keep the old format for compatibility.
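
A self-contained sketch of why a plain string comparison can break across a 
format change (hypothetical id renderings, not Hadoop code): two renderings of 
the same container id fail a string comparison but match once parsed back into 
their numeric fields, which is what comparing via {{ContainerId#fromString}} 
would achieve.
{code}
import java.util.Arrays;

public class ContainerIdCompareSketch {
  // Parse "container_<clusterTs>_<appId>_<attemptId>_<containerId>" into its
  // numeric fields, so zero-padding differences between formats do not matter.
  static long[] parse(String s) {
    String[] parts = s.split("_");
    return new long[] {
        Long.parseLong(parts[1]), Long.parseLong(parts[2]),
        Long.parseLong(parts[3]), Long.parseLong(parts[4])
    };
  }

  public static void main(String[] args) {
    // Two hypothetical renderings of the same container id.
    String oldFormat = "container_1412150883650_0001_02_000001";
    String newFormat = "container_1412150883650_0001_02_1";

    System.out.println(oldFormat.equals(newFormat));                       // false
    System.out.println(Arrays.equals(parse(oldFormat), parse(newFormat))); // true
  }
}
{code}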

 ContainerID in ResourceManager Log Has Slightly Different Format From 
 AppAttemptID
 --

 Key: YARN-3017
 URL: https://issues.apache.org/jira/browse/YARN-3017
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.8.0
Reporter: MUFEED USMAN
Assignee: Mohammad Shahid Khan
Priority: Minor
  Labels: PatchAvailable
 Attachments: YARN-3017.patch, YARN-3017_1.patch, YARN-3017_2.patch, 
 YARN-3017_3.patch


 Not sure if this should be filed as a bug or not.
 In the ResourceManager log in the events surrounding the creation of a new
 application attempt,
 ...
 ...
 2014-11-14 17:45:37,258 INFO
 org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching
 masterappattempt_1412150883650_0001_02
 ...
 ...
 The application attempt has the ID format _1412150883650_0001_02.
 Whereas the associated ContainerID goes by _1412150883650_0001_02_.
 ...
 ...
 2014-11-14 17:45:37,260 INFO
 org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting 
 up
 container Container: [ContainerId: container_1412150883650_0001_02_01,
 NodeId: n67:55933, NodeHttpAddress: n67:8042, Resource: memory:2048, 
 vCores:1,
 disks:0.0, Priority: 0, Token: Token { kind: ContainerToken, service:
 10.10.70.67:55933 }, ] for AM appattempt_1412150883650_0001_02
 ...
 ...
 Curious to know if this is kept like that for a reason. If not, then while
 using filtering tools to, say, grep events surrounding a specific attempt by
 the numeric ID part, information may slip out during troubleshooting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)