[jira] [Created] (YARN-1912) ResourceLocalizer started without any jvm memory control

2014-04-08 Thread stanley shi (JIRA)
stanley shi created YARN-1912:
-

 Summary: ResourceLocalizer started without any jvm memory control
 Key: YARN-1912
 URL: https://issues.apache.org/jira/browse/YARN-1912
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.2.0
Reporter: stanley shi


In LinuxContainerExecutor.java#startLocalizer, the command is built without any 
-Xmx option, so the ResourceLocalizer is started with the JVM's default memory 
settings.
On server-class hardware the JVM will use 25% of the system memory as the max 
heap size, which can cause memory issues in some cases.
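
For illustration, here is a minimal sketch of the idea of passing an explicit heap cap when the localizer JVM command is assembled. This is not the actual LinuxContainerExecutor code; the class, method, and value below are hypothetical.
{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; not the real startLocalizer implementation.
public class LocalizerCommandSketch {
  // Build the localizer JVM command with an explicit -Xmx instead of relying on
  // the JVM's default heap ergonomics.
  public static List<String> buildCommand(String localizerClass, int heapMb) {
    List<String> command = new ArrayList<String>();
    command.add("java");
    command.add("-Xmx" + heapMb + "m");
    command.add(localizerClass);
    return command;
  }
}
{code}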



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1878) Yarn standby RM taking long to transition to active

2014-04-08 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-1878:
-

Issue Type: Sub-task  (was: Bug)
Parent: YARN-149

 Yarn standby RM taking long to transition to active
 ---

 Key: YARN-1878
 URL: https://issues.apache.org/jira/browse/YARN-1878
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.4.0
Reporter: Arpit Gupta
Assignee: Xuan Gong
 Attachments: YARN-1878.1.patch


 In our HA tests we are noticing that sometimes it can take up to 10s for the 
 standby RM to transition to active.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1912) ResourceLocalizer started without any jvm memory control

2014-04-08 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963010#comment-13963010
 ] 

Nathan Roberts commented on YARN-1912:
--

Doesn't it default to MIN(25%_of_memory,1GB)? May not be too bad on modern 
server class machines, but probably best to explicitly call out a maximum.

 ResourceLocalizer started without any jvm memory control
 

 Key: YARN-1912
 URL: https://issues.apache.org/jira/browse/YARN-1912
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.2.0
Reporter: stanley shi

 In LinuxContainerExecutor.java#startLocalizer, the command is built without any 
 -Xmx option, so the ResourceLocalizer is started with the JVM's default memory 
 settings.
 On server-class hardware the JVM will use 25% of the system memory as the max 
 heap size, which can cause memory issues in some cases.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1906) TestRMRestart#testQueueMetricsOnRMRestart fails intermittently on trunk and branch2

2014-04-08 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963114#comment-13963114
 ] 

Mit Desai commented on YARN-1906:
-

Thanks for the feedback Jon. I will take a look today

 TestRMRestart#testQueueMetricsOnRMRestart fails intermittently on trunk and 
 branch2
 ---

 Key: YARN-1906
 URL: https://issues.apache.org/jira/browse/YARN-1906
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Mit Desai
Assignee: Mit Desai
 Fix For: 3.0.0, 2.5.0

 Attachments: YARN-1906.patch


 Here is the output of the format
 {noformat}
 testQueueMetricsOnRMRestart(org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart)
   Time elapsed: 9.757 sec  <<< FAILURE!
 java.lang.AssertionError: expected:<2> but was:<1>
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.assertQueueMetrics(TestRMRestart.java:1735)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testQueueMetricsOnRMRestart(TestRMRestart.java:1706)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1913) Cluster logjam when all resources are consumed by AM

2014-04-08 Thread bc Wong (JIRA)
bc Wong created YARN-1913:
-

 Summary: Cluster logjam when all resources are consumed by AM
 Key: YARN-1913
 URL: https://issues.apache.org/jira/browse/YARN-1913
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.3.0
Reporter: bc Wong


It's possible to deadlock a cluster by submitting many applications at once, 
and have all cluster resources taken up by AMs.

One solution is for the scheduler to limit resources taken up by AMs, as a 
percentage of total cluster resources, via a maxApplicationMasterShare config.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1913) Cluster logjam when all resources are consumed by AM

2014-04-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963143#comment-13963143
 ] 

Jason Lowe commented on YARN-1913:
--

Which scheduler are you using?  The CapacityScheduler already has a 
yarn.scheduler.capacity.maximum-am-resource-percent property, and there's a 
per-queue form of it as well.
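
As a sketch of that property in use (the 10% value is only illustrative, not a recommended setting):
{code}
import org.apache.hadoop.conf.Configuration;

// Cap the share of cluster resources that ApplicationMasters may occupy under
// the CapacityScheduler.
public class AmResourceLimitSketch {
  public static Configuration withAmLimit() {
    Configuration conf = new Configuration();
    conf.setFloat("yarn.scheduler.capacity.maximum-am-resource-percent", 0.1f);
    return conf;
  }
}
{code}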

 Cluster logjam when all resources are consumed by AM
 

 Key: YARN-1913
 URL: https://issues.apache.org/jira/browse/YARN-1913
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.3.0
Reporter: bc Wong

 It's possible to deadlock a cluster by submitting many applications at once, 
 and have all cluster resources taken up by AMs.
 One solution is for the scheduler to limit resources taken up by AMs, as a 
 percentage of total cluster resources, via a maxApplicationMasterShare 
 config.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs

2014-04-08 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963164#comment-13963164
 ] 

Sandy Ryza commented on YARN-1913:
--

This is a Fair Scheduler issue.  We need to add an equivalent property to the 
Fair Scheduler.

 With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
 --

 Key: YARN-1913
 URL: https://issues.apache.org/jira/browse/YARN-1913
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.3.0
Reporter: bc Wong

 It's possible to deadlock a cluster by submitting many applications at once, 
 and have all cluster resources taken up by AMs.
 One solution is for the scheduler to limit resources taken up by AMs, as a 
 percentage of total cluster resources, via a maxApplicationMasterShare 
 config.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs

2014-04-08 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1913:
-

Summary: With Fair Scheduler, cluster can logjam when all resources are 
consumed by AMs  (was: Cluster logjam when all resources are consumed by AM)

 With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
 --

 Key: YARN-1913
 URL: https://issues.apache.org/jira/browse/YARN-1913
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.3.0
Reporter: bc Wong

 It's possible to deadlock a cluster by submitting many applications at once, 
 and have all cluster resources taken up by AMs.
 One solution is for the scheduler to limit resources taken up by AMs, as a 
 percentage of total cluster resources, via a maxApplicationMasterShare 
 config.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1757) NM Recovery. Auxiliary service support.

2014-04-08 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1757:
---

Summary: NM Recovery. Auxiliary service support.  (was: Auxiliary service 
support for nodemanager recovery)

 NM Recovery. Auxiliary service support.
 ---

 Key: YARN-1757
 URL: https://issues.apache.org/jira/browse/YARN-1757
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-1757-v2.patch, YARN-1757.patch, YARN-1757.patch


 There needs to be a mechanism for communicating to auxiliary services whether 
 nodemanager recovery is enabled and where they should store their state.
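
As a purely hypothetical sketch of what such a hook could look like (this is not the actual AuxiliaryService API; the name is made up for illustration):
{code}
import java.io.File;

// Hypothetical hook shape; not part of the real Hadoop API.
public interface RecoverableAuxServiceSketch {
  // Called by the NodeManager before the service starts when recovery is enabled,
  // telling the service where it may persist state to be recovered after restart.
  void setRecoveryPath(File recoveryDir);
}
{code}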



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1757) NM Recovery. Auxiliary service support.

2014-04-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963218#comment-13963218
 ] 

Hudson commented on YARN-1757:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5469 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5469/])
YARN-1757. NM Recovery. Auxiliary service support. (Jason Lowe via kasha) 
(kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1585783)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/AuxiliaryService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestAuxServices.java


 NM Recovery. Auxiliary service support.
 ---

 Key: YARN-1757
 URL: https://issues.apache.org/jira/browse/YARN-1757
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Fix For: 2.5.0

 Attachments: YARN-1757-v2.patch, YARN-1757.patch, YARN-1757.patch


 There needs to be a mechanism for communicating to auxiliary services whether 
 nodemanager recovery is enabled and where they should store their state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1910) TestAMRMTokens fails on windows

2014-04-08 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963259#comment-13963259
 ] 

Varun Vasudev commented on YARN-1910:
-

Patch looks good. Just one suggestion - rename maxWaitingTime to 
maxWaitAttempts.

 TestAMRMTokens fails on windows
 ---

 Key: YARN-1910
 URL: https://issues.apache.org/jira/browse/YARN-1910
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong
Assignee: Xuan Gong
 Fix For: 2.4.0

 Attachments: YARN-1910.1.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1537) TestLocalResourcesTrackerImpl.testLocalResourceCache often failed

2014-04-08 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963268#comment-13963268
 ] 

Varun Vasudev commented on YARN-1537:
-

+1 looks fine to me.

 TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
 -

 Key: YARN-1537
 URL: https://issues.apache.org/jira/browse/YARN-1537
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.2.0
Reporter: shenhong
Assignee: Xuan Gong
 Attachments: YARN-1537.1.patch


 Here is the error log
 {code}
 Results :
 Failed tests: 
   TestLocalResourcesTrackerImpl.testLocalResourceCache:351 
 Wanted but not invoked:
 eventHandler.handle(
 
 isA(org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerResourceLocalizedEvent)
 );
 - at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl.testLocalResourceCache(TestLocalResourcesTrackerImpl.java:351)
 However, there were other interactions with this mock:
 - at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
 - at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions

2014-04-08 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963271#comment-13963271
 ] 

Sandy Ryza commented on YARN-596:
-

bq. Sandy Ryza, so here the ''safe'' means usage.memory < fairshare.memory and 
usage.vcores < fairshare.vcores? 
Right, but with those <s as <=s.

bq. But the fairshare.vcores for FSQueue (except root) is always 0. And 
fairscheduler only considers memory when doing scheduling, so do we still need 
to consider vcores here?
That is only the case when the FairSharePolicy is used.  When the 
DominantResourceFairnessPolicy is used, the vcores fair share is greater than 0 
and the Fair Scheduler considers both memory and vcores for scheduling.
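
A small sketch of the check being described, with the comparisons as <= (the names are illustrative):
{code}
// An app or queue is "safe" when it is at or below its fair share in both dimensions.
public class FairShareSafetySketch {
  public static boolean isAtOrBelowFairShare(long usedMemoryMb, long fairMemoryMb,
                                             int usedVcores, int fairVcores) {
    return usedMemoryMb <= fairMemoryMb && usedVcores <= fairVcores;
  }
}
{code}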

 In fair scheduler, intra-application container priorities affect 
 inter-application preemption decisions
 ---

 Key: YARN-596
 URL: https://issues.apache.org/jira/browse/YARN-596
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-596.patch, YARN-596.patch, YARN-596.patch, 
 YARN-596.patch, YARN-596.patch


 In the fair scheduler, containers are chosen for preemption in the following 
 way:
 All containers for all apps that are in queues that are over their fair share 
 are put in a list.
 The list is sorted in order of the priority that the container was requested 
 in.
 This means that an application can shield itself from preemption by requesting 
 its containers at higher priorities, which doesn't really make sense.
 Also, an application that is not over its fair share, but that is in a queue 
 that is over its fair share, is just as likely to have containers preempted 
 as an application that is over its fair share.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1914) Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows

2014-04-08 Thread Varun Vasudev (JIRA)
Varun Vasudev created YARN-1914:
---

 Summary: Test TestFSDownload.testDownloadPublicWithStatCache fails 
on Windows
 Key: YARN-1914
 URL: https://issues.apache.org/jira/browse/YARN-1914
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Varun Vasudev


The TestFSDownload.testDownloadPublicWithStatCache test in hadoop-yarn-common 
consistently fails on Windows environments.

The root cause is that the test checks for execute permission for all users on 
every ancestor of the target directory. On Windows, by default, the Everyone 
group has no permissions on any directory in the install drive. It's 
unreasonable to expect this test to pass, so we should skip it on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1914) Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows

2014-04-08 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev updated YARN-1914:


Attachment: apache-yarn-1914.0.patch

Patch skipping the test on Windows.
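
For reference, a skip guard of this kind typically looks roughly like the sketch below (this is an illustration, not necessarily what the attached patch does):
{code}
import org.apache.hadoop.util.Shell;
import org.junit.Assume;
import org.junit.Test;

// Illustrative shape of a Windows skip guard.
public class WindowsSkipSketch {
  @Test
  public void testDownloadPublicWithStatCache() throws Exception {
    // JUnit marks the test as skipped when the assumption fails, i.e. on Windows.
    Assume.assumeTrue(!Shell.WINDOWS);
    // ... original test body would follow ...
  }
}
{code}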

 Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows
 

 Key: YARN-1914
 URL: https://issues.apache.org/jira/browse/YARN-1914
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Varun Vasudev
 Attachments: apache-yarn-1914.0.patch


 The TestFSDownload.testDownloadPublicWithStatCache test in hadoop-yarn-common 
 consistently fails on Windows environments.
 The root cause is that the test checks for execute permission for all users 
 on every ancestor of the target directory. On Windows, by default, the Everyone 
 group has no permissions on any directory in the install drive. It's 
 unreasonable to expect this test to pass, so we should skip it on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (YARN-1914) Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows

2014-04-08 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev reassigned YARN-1914:
---

Assignee: Varun Vasudev

 Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows
 

 Key: YARN-1914
 URL: https://issues.apache.org/jira/browse/YARN-1914
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: apache-yarn-1914.0.patch


 The TestFSDownload.testDownloadPublicWithStatCache test in hadoop-yarn-common 
 consistently fails on Windows environments.
 The root cause is that the test checks for execute permission for all users 
 on every ancestor of the target directory. On Windows, by default, the Everyone 
 group has no permissions on any directory in the install drive. It's 
 unreasonable to expect this test to pass, so we should skip it on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1908) Distributed shell with custom script has permission error.

2014-04-08 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963382#comment-13963382
 ] 

Xuan Gong commented on YARN-1908:
-

+1 LGTM

 Distributed shell with custom script has permission error.
 --

 Key: YARN-1908
 URL: https://issues.apache.org/jira/browse/YARN-1908
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.4.0
Reporter: Tassapol Athiapinya
Assignee: Vinod Kumar Vavilapalli
 Attachments: YARN-1908.1.patch, YARN-1908.2.patch, YARN-1908.3.patch, 
 YARN-1908.4.patch


 Create test1.sh having pwd.
 Run this command as user1:
 hadoop jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar 
 -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar 
 -shell_script test1.sh
 NM is run by yarn user. An exception is thrown because yarn user has no 
 permissions on custom script in hdfs path. The custom script is created with 
 distributed shell app.
 {code}
 Caused by: 
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
  Permission denied: user=yarn, access=WRITE, 
 inode=/user/user1/DistributedShell/70:user1:user1:drwxr-xr-x
   at 
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1910) TestAMRMTokens fails on windows

2014-04-08 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1910:


Attachment: YARN-1910.2.patch

 TestAMRMTokens fails on windows
 ---

 Key: YARN-1910
 URL: https://issues.apache.org/jira/browse/YARN-1910
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong
Assignee: Xuan Gong
 Fix For: 2.4.0

 Attachments: YARN-1910.1.patch, YARN-1910.2.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1910) TestAMRMTokens fails on windows

2014-04-08 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963396#comment-13963396
 ] 

Varun Vasudev commented on YARN-1910:
-

+1 looks good to me.

 TestAMRMTokens fails on windows
 ---

 Key: YARN-1910
 URL: https://issues.apache.org/jira/browse/YARN-1910
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong
Assignee: Xuan Gong
 Fix For: 2.4.0

 Attachments: YARN-1910.1.patch, YARN-1910.2.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1914) Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows

2014-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963399#comment-13963399
 ] 

Hadoop QA commented on YARN-1914:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12639253/apache-yarn-1914.0.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3531//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3531//console

This message is automatically generated.

 Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows
 

 Key: YARN-1914
 URL: https://issues.apache.org/jira/browse/YARN-1914
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: apache-yarn-1914.0.patch


 The TestFSDownload.testDownloadPublicWithStatCache test in hadoop-yarn-common 
 consistently fails on Windows environments.
 The root cause is that the test checks for execute permission for all users 
 on every ancestor of the target directory. On Windows, by default, the Everyone 
 group has no permissions on any directory in the install drive. It's 
 unreasonable to expect this test to pass, so we should skip it on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1908) Distributed shell with custom script has permission error.

2014-04-08 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963401#comment-13963401
 ] 

Jian He commented on YARN-1908:
---

Patch looks good to me, + 1

 Distributed shell with custom script has permission error.
 --

 Key: YARN-1908
 URL: https://issues.apache.org/jira/browse/YARN-1908
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.4.0
Reporter: Tassapol Athiapinya
Assignee: Vinod Kumar Vavilapalli
 Attachments: YARN-1908.1.patch, YARN-1908.2.patch, YARN-1908.3.patch, 
 YARN-1908.4.patch


 Create test1.sh having pwd.
 Run this command as user1:
 hadoop jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar 
 -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar 
 -shell_script test1.sh
 NM is run by yarn user. An exception is thrown because yarn user has no 
 permissions on custom script in hdfs path. The custom script is created with 
 distributed shell app.
 {code}
 Caused by: 
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
  Permission denied: user=yarn, access=WRITE, 
 inode=/user/user1/DistributedShell/70:user1:user1:drwxr-xr-x
   at 
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1910) TestAMRMTokens fails on windows

2014-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963457#comment-13963457
 ] 

Hadoop QA commented on YARN-1910:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12639256/YARN-1910.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3532//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3532//console

This message is automatically generated.

 TestAMRMTokens fails on windows
 ---

 Key: YARN-1910
 URL: https://issues.apache.org/jira/browse/YARN-1910
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong
Assignee: Xuan Gong
 Fix For: 2.4.0

 Attachments: YARN-1910.1.patch, YARN-1910.2.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1784) TestContainerAllocation assumes CapacityScheduler

2014-04-08 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-1784:


Attachment: YARN-1784.patch

The patch configures the tests to always use the CapacityScheduler.

 TestContainerAllocation assumes CapacityScheduler
 -

 Key: YARN-1784
 URL: https://issues.apache.org/jira/browse/YARN-1784
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Karthik Kambatla
Assignee: Robert Kanter
Priority: Minor
 Attachments: YARN-1784.patch


 TestContainerAllocation assumes CapacityScheduler



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1907) TestRMApplicationHistoryWriter#testRMWritingMassiveHistory runs slow and intermittently fails

2014-04-08 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963480#comment-13963480
 ] 

Zhijie Shen commented on YARN-1907:
---

The patch makes sense to me. I tried the test in Eclipse as well. Sometimes it 
would fail after 200 rounds of node heartbeat, with the containers still not 
completely cleaned up.

However, would it be better practice to loop until all the containers are 
cleaned up (removing the 200-round bound) and set a suitable timeout for this 
test case?
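
A rough sketch of such a loop, using a hypothetical cluster helper rather than any actual RM/NM test API:
{code}
// Hypothetical helper illustrating the suggestion above.
public class CleanupWaitSketch {
  public interface Cluster {
    void nodeHeartbeat();
    int remainingContainers();
  }

  // Heartbeat until no containers remain; an enclosing JUnit timeout would bound this.
  public static void waitForContainerCleanup(Cluster cluster, long intervalMs)
      throws InterruptedException {
    while (cluster.remainingContainers() > 0) {
      cluster.nodeHeartbeat();
      Thread.sleep(intervalMs);
    }
  }
}
{code}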

 TestRMApplicationHistoryWriter#testRMWritingMassiveHistory runs slow and 
 intermittently fails
 -

 Key: YARN-1907
 URL: https://issues.apache.org/jira/browse/YARN-1907
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 2.5.0
Reporter: Mit Desai
Assignee: Mit Desai
 Attachments: HDFS-6195.patch


 The test has 1 containers that it tries to cleanup.
 The cleanup has a timeout of 2ms in which the test sometimes cannot do 
 the cleanup completely and gives out an Assertion Failure.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1906) TestRMRestart#testQueueMetricsOnRMRestart fails intermittently on trunk and branch2

2014-04-08 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963491#comment-13963491
 ] 

Zhijie Shen commented on YARN-1906:
---

Maybe we can wait until, for example, appSubmitted is changed (increased or 
decreased by 1)?

 TestRMRestart#testQueueMetricsOnRMRestart fails intermittently on trunk and 
 branch2
 ---

 Key: YARN-1906
 URL: https://issues.apache.org/jira/browse/YARN-1906
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Mit Desai
Assignee: Mit Desai
 Fix For: 3.0.0, 2.5.0

 Attachments: YARN-1906.patch


 Here is the output of the format
 {noformat}
 testQueueMetricsOnRMRestart(org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart)
   Time elapsed: 9.757 sec  <<< FAILURE!
 java.lang.AssertionError: expected:<2> but was:<1>
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.assertQueueMetrics(TestRMRestart.java:1735)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testQueueMetricsOnRMRestart(TestRMRestart.java:1706)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1784) TestContainerAllocation assumes CapacityScheduler

2014-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963512#comment-13963512
 ] 

Hadoop QA commented on YARN-1784:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12639268/YARN-1784.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3533//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3533//console

This message is automatically generated.

 TestContainerAllocation assumes CapacityScheduler
 -

 Key: YARN-1784
 URL: https://issues.apache.org/jira/browse/YARN-1784
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Karthik Kambatla
Assignee: Robert Kanter
Priority: Minor
 Attachments: YARN-1784.patch


 TestContainerAllocation assumes CapacityScheduler



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1908) Distributed shell with custom script has permission error.

2014-04-08 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963517#comment-13963517
 ] 

Vinod Kumar Vavilapalli commented on YARN-1908:
---

Tx for the reviews [~xgong] and [~jianhe]! Checking this in...

 Distributed shell with custom script has permission error.
 --

 Key: YARN-1908
 URL: https://issues.apache.org/jira/browse/YARN-1908
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.4.0
Reporter: Tassapol Athiapinya
Assignee: Vinod Kumar Vavilapalli
 Attachments: YARN-1908.1.patch, YARN-1908.2.patch, YARN-1908.3.patch, 
 YARN-1908.4.patch


 Create test1.sh having pwd.
 Run this command as user1:
 hadoop jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar 
 -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar 
 -shell_script test1.sh
 NM is run by yarn user. An exception is thrown because yarn user has no 
 permissions on custom script in hdfs path. The custom script is created with 
 distributed shell app.
 {code}
 Caused by: 
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
  Permission denied: user=yarn, access=WRITE, 
 inode=/user/user1/DistributedShell/70:user1:user1:drwxr-xr-x
   at 
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1341) Recover NMTokens upon nodemanager restart

2014-04-08 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-1341:
-

Attachment: YARN-1341v3.patch

Updating patch after YARN-1757 was committed.

 Recover NMTokens upon nodemanager restart
 -

 Key: YARN-1341
 URL: https://issues.apache.org/jira/browse/YARN-1341
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-1341.patch, YARN-1341v2.patch, YARN-1341v3.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1908) Distributed shell with custom script has permission error.

2014-04-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963532#comment-13963532
 ] 

Hudson commented on YARN-1908:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5471 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5471/])
YARN-1908. Fixed DistributedShell to not fail in secure clusters. Contributed 
by Vinod Kumar Vavilapalli and Jian He. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1585849)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java


 Distributed shell with custom script has permission error.
 --

 Key: YARN-1908
 URL: https://issues.apache.org/jira/browse/YARN-1908
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.4.0
Reporter: Tassapol Athiapinya
Assignee: Vinod Kumar Vavilapalli
 Fix For: 2.4.1

 Attachments: YARN-1908.1.patch, YARN-1908.2.patch, YARN-1908.3.patch, 
 YARN-1908.4.patch


 Create test1.sh having pwd.
 Run this command as user1:
 hadoop jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar 
 -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar 
 -shell_script test1.sh
 NM is run by yarn user. An exception is thrown because yarn user has no 
 permissions on custom script in hdfs path. The custom script is created with 
 distributed shell app.
 {code}
 Caused by: 
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
  Permission denied: user=yarn, access=WRITE, 
 inode=/user/user1/DistributedShell/70:user1:user1:drwxr-xr-x
   at 
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1342) Recover container tokens upon nodemanager restart

2014-04-08 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-1342:
-

Attachment: YARN-1342v2.patch

Updating patch after YARN-1757 was committed.

 Recover container tokens upon nodemanager restart
 -

 Key: YARN-1342
 URL: https://issues.apache.org/jira/browse/YARN-1342
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-1342.patch, YARN-1342v2.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1784) TestContainerAllocation assumes CapacityScheduler

2014-04-08 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963553#comment-13963553
 ] 

Karthik Kambatla commented on YARN-1784:


Instead of adding it to individual tests, can we add a setup method (@Before) 
so that future tests using the available conf don't run into this?
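
A sketch of such a setup method (the class and field names are illustrative):
{code}
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceScheduler;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler;
import org.junit.Before;

public class TestContainerAllocationSetupSketch {
  private YarnConfiguration conf;

  @Before
  public void setUp() {
    conf = new YarnConfiguration();
    // Pin the scheduler so every test runs against the CapacityScheduler,
    // regardless of the default configured elsewhere.
    conf.setClass(YarnConfiguration.RM_SCHEDULER,
        CapacityScheduler.class, ResourceScheduler.class);
  }
}
{code}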

 TestContainerAllocation assumes CapacityScheduler
 -

 Key: YARN-1784
 URL: https://issues.apache.org/jira/browse/YARN-1784
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Karthik Kambatla
Assignee: Robert Kanter
Priority: Minor
 Attachments: YARN-1784.patch


 TestContainerAllocation assumes CapacityScheduler



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1339) Recover DeletionService state upon nodemanager restart

2014-04-08 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-1339:
-

Attachment: YARN-1339v2.patch

Updating patch after YARN-1757 was committed.

 Recover DeletionService state upon nodemanager restart
 --

 Key: YARN-1339
 URL: https://issues.apache.org/jira/browse/YARN-1339
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-1339.patch, YARN-1339v2.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1906) TestRMRestart#testQueueMetricsOnRMRestart fails intermittently on trunk and branch2

2014-04-08 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963539#comment-13963539
 ] 

Mit Desai commented on YARN-1906:
-

I was going through the app state transitions and I have the following 
conclusions:
* The transition from NEW to SUBMITTED and from SUBMITTED to ACCEPTED is almost 
instantaneous.
* When assertQueueMetrics is called, the app state I found was always ACCEPTED. 
The next state that it can transition into is RUNNING/KILLING/FINAL_SAVING, 
which will not change until the scheduler picks up the app.

[~jeagles], so we will not be able to use the waitForState method here.
[~zjshen], when the app is submitted and we check the numbers in the 
assertQueueMetrics method, appSubmitted will already be 1, so waiting for it to 
increment would mean waiting forever. Moreover, in assertQueueMetrics we are 
verifying the same thing (i.e. that the number of submitted apps and the number 
of apps in the pending state is what we expect it to be).

I do not have another solution for the problem. Maybe we need to think of other 
ways.

Please provide your feedback if you have other ideas or think I am heading in 
the wrong direction.

 TestRMRestart#testQueueMetricsOnRMRestart fails intermittently on trunk and 
 branch2
 ---

 Key: YARN-1906
 URL: https://issues.apache.org/jira/browse/YARN-1906
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Mit Desai
Assignee: Mit Desai
 Fix For: 3.0.0, 2.5.0

 Attachments: YARN-1906.patch


 Here is the output of the format
 {noformat}
 testQueueMetricsOnRMRestart(org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart)
   Time elapsed: 9.757 sec  <<< FAILURE!
 java.lang.AssertionError: expected:<2> but was:<1>
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.assertQueueMetrics(TestRMRestart.java:1735)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testQueueMetricsOnRMRestart(TestRMRestart.java:1706)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1915) ClientToAMTokenMasterKey should be provided to AM at launch time

2014-04-08 Thread Hitesh Shah (JIRA)
Hitesh Shah created YARN-1915:
-

 Summary: ClientToAMTokenMasterKey should be provided to AM at 
launch time
 Key: YARN-1915
 URL: https://issues.apache.org/jira/browse/YARN-1915
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.2.0
Reporter: Hitesh Shah
Priority: Critical


Currently, the AM receives the key as part of registration. This introduces a 
race where a client can connect to the AM when the AM has not received the key. 

Current Flow:
1) AM needs to start the client listening service in order to get host:port and 
send it to the RM as part of registration
2) RM gets the port info in register() and transitions the app to RUNNING. 
Responds back with client secret to AM.
3) User asks RM for client token. Gets it and pings the AM. AM hasn't received 
client secret from RM and so RPC itself rejects the request.





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart

2014-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963581#comment-13963581
 ] 

Hadoop QA commented on YARN-1341:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12639280/YARN-1341v3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3534//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3534//console

This message is automatically generated.

 Recover NMTokens upon nodemanager restart
 -

 Key: YARN-1341
 URL: https://issues.apache.org/jira/browse/YARN-1341
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-1341.patch, YARN-1341v2.patch, YARN-1341v3.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1784) TestContainerAllocation assumes CapacityScheduler

2014-04-08 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-1784:


Attachment: YARN-1784.patch

New patch uses a setup method.

 TestContainerAllocation assumes CapacityScheduler
 -

 Key: YARN-1784
 URL: https://issues.apache.org/jira/browse/YARN-1784
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Karthik Kambatla
Assignee: Robert Kanter
Priority: Minor
 Attachments: YARN-1784.patch, YARN-1784.patch


 TestContainerAllocation assumes CapacityScheduler



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1342) Recover container tokens upon nodemanager restart

2014-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963604#comment-13963604
 ] 

Hadoop QA commented on YARN-1342:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12639283/YARN-1342v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3535//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3535//console

This message is automatically generated.

 Recover container tokens upon nodemanager restart
 -

 Key: YARN-1342
 URL: https://issues.apache.org/jira/browse/YARN-1342
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-1342.patch, YARN-1342v2.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1339) Recover DeletionService state upon nodemanager restart

2014-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963606#comment-13963606
 ] 

Hadoop QA commented on YARN-1339:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12639285/YARN-1339v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3536//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3536//console

This message is automatically generated.

 Recover DeletionService state upon nodemanager restart
 --

 Key: YARN-1339
 URL: https://issues.apache.org/jira/browse/YARN-1339
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-1339.patch, YARN-1339v2.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1784) TestContainerAllocation assumes CapacityScheduler

2014-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963650#comment-13963650
 ] 

Hadoop QA commented on YARN-1784:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12639292/YARN-1784.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3537//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3537//console

This message is automatically generated.

 TestContainerAllocation assumes CapacityScheduler
 -

 Key: YARN-1784
 URL: https://issues.apache.org/jira/browse/YARN-1784
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Karthik Kambatla
Assignee: Robert Kanter
Priority: Minor
 Attachments: YARN-1784.patch, YARN-1784.patch


 TestContainerAllocation assumes CapacityScheduler



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1916) Leveldb timeline store applies secondary filters incorrectly

2014-04-08 Thread Billie Rinaldi (JIRA)
Billie Rinaldi created YARN-1916:


 Summary: Leveldb timeline store applies secondary filters 
incorrectly
 Key: YARN-1916
 URL: https://issues.apache.org/jira/browse/YARN-1916
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Billie Rinaldi
Assignee: Billie Rinaldi


When applying a secondary filter (fieldname:fieldvalue) in a get entities 
query, LeveldbTimelineStore retrieves entities that do not have the specified 
fieldname, in addition to correctly retrieving entities that have the fieldname 
with the specified fieldvalue.  It should not return entities that do not have 
the fieldname.
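
A minimal sketch of the intended matching rule (not the LeveldbTimelineStore code; the names are illustrative):
{code}
import java.util.Map;

public class SecondaryFilterSketch {
  // An entity matches "fieldname:fieldvalue" only when it actually has the field
  // and the stored value equals the requested value.
  public static boolean matches(Map<String, Object> entityFields,
                                String fieldname, Object fieldvalue) {
    return entityFields.containsKey(fieldname)
        && fieldvalue.equals(entityFields.get(fieldname));
  }
}
{code}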



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1916) Leveldb timeline store applies secondary filters incorrectly

2014-04-08 Thread Billie Rinaldi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-1916:
-

Attachment: YARN-1916.1.patch

 Leveldb timeline store applies secondary filters incorrectly
 

 Key: YARN-1916
 URL: https://issues.apache.org/jira/browse/YARN-1916
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Billie Rinaldi
Assignee: Billie Rinaldi
 Attachments: YARN-1916.1.patch


 When applying a secondary filter (fieldname:fieldvalue) in a get entities 
 query, LeveldbTimelineStore retrieves entities that do not have the specified 
 fieldname, in addition to correctly retrieving entities that have the 
 fieldname with the specified fieldvalue.  It should not return entities that 
 do not have the fieldname.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1914) Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows

2014-04-08 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963681#comment-13963681
 ] 

Sangjin Lee commented on YARN-1914:
---

LGTM. Sorry for missing the Windows build.

 Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows
 

 Key: YARN-1914
 URL: https://issues.apache.org/jira/browse/YARN-1914
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: apache-yarn-1914.0.patch


 The TestFSDownload.testDownloadPublicWithStatCache test in hadoop-yarn-common 
 consistently fails on Windows environments.
 The root cause is that the test checks for execute permission for all users 
 on every ancestor of the target directory. On Windows, by default, the Everyone 
 group has no permissions on any directory in the install drive. It's 
 unreasonable to expect this test to pass, so we should skip it on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1914) Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows

2014-04-08 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963710#comment-13963710
 ] 

Bikas Saha commented on YARN-1914:
--

[~cnauroth] [~ivanmi] Was this specific issue not already handled? Directory 
traversal all the way to the top will not work on Windows, so there was special 
logic for Windows.

 Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows
 

 Key: YARN-1914
 URL: https://issues.apache.org/jira/browse/YARN-1914
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: apache-yarn-1914.0.patch


 The TestFSDownload.testDownloadPublicWithStatCache test in hadoop-yarn-common 
 consistently fails on Windows environments.
 The root cause is that the test checks for execute permission for all users 
 on every ancestor of the target directory. On Windows, by default, the Everyone 
 group has no permissions on any directory in the install drive. It's 
 unreasonable to expect this test to pass, so we should skip it on Windows.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1916) Leveldb timeline store applies secondary filters incorrectly

2014-04-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963712#comment-13963712
 ] 

Hadoop QA commented on YARN-1916:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12639313/YARN-1916.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3538//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3538//console

This message is automatically generated.

 Leveldb timeline store applies secondary filters incorrectly
 

 Key: YARN-1916
 URL: https://issues.apache.org/jira/browse/YARN-1916
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Billie Rinaldi
Assignee: Billie Rinaldi
 Attachments: YARN-1916.1.patch


 When applying a secondary filter (fieldname:fieldvalue) in a get entities 
 query, LeveldbTimelineStore retrieves entities that do not have the specified 
 fieldname, in addition to correctly retrieving entities that have the 
 fieldname with the specified fieldvalue.  It should not return entities that 
 do not have the fieldname.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1912) ResourceLocalizer started without any jvm memory control

2014-04-08 Thread stanley shi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963789#comment-13963789
 ] 

stanley shi commented on YARN-1912:
---

No, it's not the minimum. On one of my environments, which has 32GB of memory:
{code}
/opt/jdk1.7.0_15/bin/java -XX:+PrintFlagsFinal -version 2>&1 | grep MaxHeapSize
uintx MaxHeapSize  := 8415870976  {product}
{code}
And there is also this answer from Oracle:

{quote}Server JVM heap configuration ergonomics are now the same as the Client, 
except that the default maximum heap size for 32-bit JVMs is 1 gigabyte, 
corresponding to a physical memory size of 4 gigabytes, and for 64-bit JVMs is 
32 gigabytes, corresponding to a physical memory size of 128 gigabytes.
{quote}
http://www.oracle.com/technetwork/java/javase/6u18-142093.html

 ResourceLocalizer started without any jvm memory control
 

 Key: YARN-1912
 URL: https://issues.apache.org/jira/browse/YARN-1912
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.2.0
Reporter: stanley shi

 In LinuxContainerExecutor.java#startLocalizer, the command is built without any 
 -Xmx option, so the ResourceLocalizer is started with the JVM's default memory 
 settings.
 On server-class hardware the JVM will use 25% of the system memory as the max 
 heap size, which can cause memory issues in some cases.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1784) TestContainerAllocation assumes CapacityScheduler

2014-04-08 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963802#comment-13963802
 ] 

Karthik Kambatla commented on YARN-1784:


+1. Will commit this tonight.

 TestContainerAllocation assumes CapacityScheduler
 -

 Key: YARN-1784
 URL: https://issues.apache.org/jira/browse/YARN-1784
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Karthik Kambatla
Assignee: Robert Kanter
Priority: Minor
 Attachments: YARN-1784.patch, YARN-1784.patch


 TestContainerAllocation assumes CapacityScheduler



--
This message was sent by Atlassian JIRA
(v6.2#6252)