[jira] [Created] (YARN-1912) ResourceLocalizer started without any jvm memory control
stanley shi created YARN-1912: - Summary: ResourceLocalizer started without any jvm memory control Key: YARN-1912 URL: https://issues.apache.org/jira/browse/YARN-1912 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.2.0 Reporter: stanley shi In LinuxContainerExecutor.java#startLocalizer, no -Xmx option is specified in the command; this causes the ResourceLocalizer to be started with the default JVM memory settings. On server-class hardware, it will use 25% of the system memory as the max heap size, which can cause memory issues in some cases. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1878) Yarn standby RM taking long to transition to active
[ https://issues.apache.org/jira/browse/YARN-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated YARN-1878: - Issue Type: Sub-task (was: Bug) Parent: YARN-149 Yarn standby RM taking long to transition to active --- Key: YARN-1878 URL: https://issues.apache.org/jira/browse/YARN-1878 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Arpit Gupta Assignee: Xuan Gong Attachments: YARN-1878.1.patch In our HA tests we are noticing that sometimes it can take up to 10s for the standby RM to transition to active. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1912) ResourceLocalizer started without any jvm memory control
[ https://issues.apache.org/jira/browse/YARN-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963010#comment-13963010 ] Nathan Roberts commented on YARN-1912: -- Doesn't it default to MIN(25%_of_memory, 1GB)? May not be too bad on modern server-class machines, but probably best to explicitly call out a maximum. ResourceLocalizer started without any jvm memory control Key: YARN-1912 URL: https://issues.apache.org/jira/browse/YARN-1912 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.2.0 Reporter: stanley shi In LinuxContainerExecutor.java#startLocalizer, no -Xmx option is specified in the command; this causes the ResourceLocalizer to be started with the default JVM memory settings. On server-class hardware, it will use 25% of the system memory as the max heap size, which can cause memory issues in some cases. -- This message was sent by Atlassian JIRA (v6.2#6252)
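The ergonomics default Nathan describes can be observed directly. As an illustrative sketch (plain JVM behavior, not Hadoop code): when a spawned JVM like the ResourceLocalizer gets no -Xmx, its effective maximum heap is whatever `Runtime.maxMemory()` reports, typically around a quarter of physical memory on server-class machines.

```java
public class DefaultHeapCheck {
    // Reports the JVM's current max heap in megabytes. When no -Xmx is
    // passed (the situation described in this issue), this reflects the
    // ergonomics default rather than an explicit, controlled limit.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("max heap MB: " + maxHeapMb());
    }
}
```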
[jira] [Commented] (YARN-1906) TestRMRestart#testQueueMetricsOnRMRestart fails intermittently on trunk and branch2
[ https://issues.apache.org/jira/browse/YARN-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963114#comment-13963114 ] Mit Desai commented on YARN-1906: - Thanks for the feedback Jon. I will take a look today TestRMRestart#testQueueMetricsOnRMRestart fails intermittently on trunk and branch2 --- Key: YARN-1906 URL: https://issues.apache.org/jira/browse/YARN-1906 Project: Hadoop YARN Issue Type: Bug Reporter: Mit Desai Assignee: Mit Desai Fix For: 3.0.0, 2.5.0 Attachments: YARN-1906.patch Here is the output of the failure: {noformat} testQueueMetricsOnRMRestart(org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart) Time elapsed: 9.757 sec FAILURE! java.lang.AssertionError: expected:<2> but was:<1> at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.assertQueueMetrics(TestRMRestart.java:1735) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testQueueMetricsOnRMRestart(TestRMRestart.java:1706) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-1913) Cluster logjam when all resources are consumed by AM
bc Wong created YARN-1913: - Summary: Cluster logjam when all resources are consumed by AM Key: YARN-1913 URL: https://issues.apache.org/jira/browse/YARN-1913 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.3.0 Reporter: bc Wong It's possible to deadlock a cluster by submitting many applications at once, and have all cluster resources taken up by AMs. One solution is for the scheduler to limit resources taken up by AMs, as a percentage of total cluster resources, via a maxApplicationMasterShare config. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1913) Cluster logjam when all resources are consumed by AM
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963143#comment-13963143 ] Jason Lowe commented on YARN-1913: -- Which scheduler are you using? The CapacityScheduler already has a yarn.scheduler.capacity.maximum-am-resource-percent property, and there's a per-queue form of it as well. Cluster logjam when all resources are consumed by AM Key: YARN-1913 URL: https://issues.apache.org/jira/browse/YARN-1913 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.3.0 Reporter: bc Wong It's possible to deadlock a cluster by submitting many applications at once, and have all cluster resources taken up by AMs. One solution is for the scheduler to limit resources taken up by AMs, as a percentage of total cluster resources, via a maxApplicationMasterShare config. -- This message was sent by Atlassian JIRA (v6.2#6252)
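For reference, the CapacityScheduler property Jason mentions lives in capacity-scheduler.xml. A sketch of how it is set (the 0.1/0.2 values and the root.default queue path here are illustrative assumptions, not recommendations; the per-queue property name follows the yarn.scheduler.capacity.&lt;queue-path&gt;.* convention):

```xml
<!-- capacity-scheduler.xml: cap the fraction of cluster resources that
     can be consumed by ApplicationMasters, cluster-wide. -->
<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>0.1</value>
</property>

<!-- Per-queue override, shown here for the root.default queue. -->
<property>
  <name>yarn.scheduler.capacity.root.default.maximum-am-resource-percent</name>
  <value>0.2</value>
</property>
```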
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963164#comment-13963164 ] Sandy Ryza commented on YARN-1913: -- This is a Fair Scheduler issue. We need to add an equivalent property to the Fair Scheduler. With Fair Scheduler, cluster can logjam when all resources are consumed by AMs -- Key: YARN-1913 URL: https://issues.apache.org/jira/browse/YARN-1913 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.3.0 Reporter: bc Wong It's possible to deadlock a cluster by submitting many applications at once, and have all cluster resources taken up by AMs. One solution is for the scheduler to limit resources taken up by AMs, as a percentage of total cluster resources, via a maxApplicationMasterShare config. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1913: - Summary: With Fair Scheduler, cluster can logjam when all resources are consumed by AMs (was: Cluster logjam when all resources are consumed by AM) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs -- Key: YARN-1913 URL: https://issues.apache.org/jira/browse/YARN-1913 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.3.0 Reporter: bc Wong It's possible to deadlock a cluster by submitting many applications at once, and have all cluster resources taken up by AMs. One solution is for the scheduler to limit resources taken up by AMs, as a percentage of total cluster resources, via a maxApplicationMasterShare config. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1757) NM Recovery. Auxiliary service support.
[ https://issues.apache.org/jira/browse/YARN-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1757: --- Summary: NM Recovery. Auxiliary service support. (was: Auxiliary service support for nodemanager recovery) NM Recovery. Auxiliary service support. --- Key: YARN-1757 URL: https://issues.apache.org/jira/browse/YARN-1757 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1757-v2.patch, YARN-1757.patch, YARN-1757.patch There needs to be a mechanism for communicating to auxiliary services whether nodemanager recovery is enabled and where they should store their state. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1757) NM Recovery. Auxiliary service support.
[ https://issues.apache.org/jira/browse/YARN-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963218#comment-13963218 ] Hudson commented on YARN-1757: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5469 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5469/]) YARN-1757. NM Recovery. Auxiliary service support. (Jason Lowe via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1585783) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/AuxiliaryService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestAuxServices.java NM Recovery. Auxiliary service support. --- Key: YARN-1757 URL: https://issues.apache.org/jira/browse/YARN-1757 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 2.5.0 Attachments: YARN-1757-v2.patch, YARN-1757.patch, YARN-1757.patch There needs to be a mechanism for communicating to auxiliary services whether nodemanager recovery is enabled and where they should store their state. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1910) TestAMRMTokens fails on windows
[ https://issues.apache.org/jira/browse/YARN-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963259#comment-13963259 ] Varun Vasudev commented on YARN-1910: - Patch looks good. Just one suggestion - rename maxWaitingTime to maxWaitAttempts. TestAMRMTokens fails on windows --- Key: YARN-1910 URL: https://issues.apache.org/jira/browse/YARN-1910 Project: Hadoop YARN Issue Type: Bug Reporter: Xuan Gong Assignee: Xuan Gong Fix For: 2.4.0 Attachments: YARN-1910.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1537) TestLocalResourcesTrackerImpl.testLocalResourceCache often failed
[ https://issues.apache.org/jira/browse/YARN-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963268#comment-13963268 ] Varun Vasudev commented on YARN-1537: - +1 looks fine to me. TestLocalResourcesTrackerImpl.testLocalResourceCache often failed - Key: YARN-1537 URL: https://issues.apache.org/jira/browse/YARN-1537 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.2.0 Reporter: shenhong Assignee: Xuan Gong Attachments: YARN-1537.1.patch Here is the error log {code} Results : Failed tests: TestLocalResourcesTrackerImpl.testLocalResourceCache:351 Wanted but not invoked: eventHandler.handle( isA(org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerResourceLocalizedEvent) ); - at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl.testLocalResourceCache(TestLocalResourcesTrackerImpl.java:351) However, there were other interactions with this mock: - at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134) - at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions
[ https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963271#comment-13963271 ] Sandy Ryza commented on YARN-596: - bq. Sandy Ryza, so here the ''safe'' means usage.memory < fairshare.memory and usage.vcores < fairshare.vcores? Right, but with those <s as <=s. bq. But the fairshare.vcores for FSQueue (except root) is always 0. And fairscheduler only considers memory when do scheduling, so do we still need to consider vcores here? These are only the case when the FairSharePolicy is used. When the DominantResourceFairnessPolicy is used, the vcores fair share is greater than 0 and the Fair Scheduler considers both memory and vcores for scheduling. In fair scheduler, intra-application container priorities affect inter-application preemption decisions --- Key: YARN-596 URL: https://issues.apache.org/jira/browse/YARN-596 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch In the fair scheduler, containers are chosen for preemption in the following way: All containers for all apps that are in queues that are over their fair share are put in a list. The list is sorted in order of the priority that the container was requested at. This means that an application can shield itself from preemption by requesting its containers at higher priorities, which doesn't really make sense. Also, an application that is not over its fair share, but that is in a queue that is over its fair share, is just as likely to have containers preempted as an application that is over its fair share. -- This message was sent by Atlassian JIRA (v6.2#6252)
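For context on why vcores matter under DominantResourceFairnessPolicy: in dominant resource fairness, a share is the maximum, over resource types, of usage divided by cluster capacity. A minimal sketch of that computation (illustrative only, not the Fair Scheduler's actual code):

```java
public class DominantShare {
    // Dominant share per DRF: the larger of the memory share and the
    // vcore share. An entity is at or over its fair share when this
    // value reaches its fair-share fraction.
    static double dominantShare(long usedMem, long clusterMem,
                                long usedVcores, long clusterVcores) {
        double memShare = (double) usedMem / clusterMem;
        double cpuShare = (double) usedVcores / clusterVcores;
        return Math.max(memShare, cpuShare);
    }

    public static void main(String[] args) {
        // Using 512 of 1024 MB and 1 of 8 vcores: memory dominates.
        System.out.println(dominantShare(512, 1024, 1, 8)); // 0.5
    }
}
```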
[jira] [Created] (YARN-1914) Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows
Varun Vasudev created YARN-1914: --- Summary: Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows Key: YARN-1914 URL: https://issues.apache.org/jira/browse/YARN-1914 Project: Hadoop YARN Issue Type: Bug Reporter: Varun Vasudev The TestFSDownload.testDownloadPublicWithStatCache test in hadoop-yarn-common consistently fails on Windows environments. The root cause is that the test checks for execute permission for all users on every ancestor of the target directory. On Windows, by default, the Everyone group has no permissions on any directory on the install drive. It's unreasonable to expect this test to pass, and we should skip it on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
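The attached patch isn't shown inline, but the skip-on-Windows approach usually reduces to an OS guard at the top of the test. A hedged, self-contained sketch (the helper below is hypothetical; Hadoop tests would more likely combine org.apache.hadoop.util.Shell.WINDOWS with a JUnit assumption):

```java
public class OsGuard {
    // Hypothetical helper: detect Windows from the os.name property so a
    // test can bail out early instead of failing on permission checks.
    static boolean isWindows(String osName) {
        return osName.toLowerCase().startsWith("windows");
    }

    public static void main(String[] args) {
        if (isWindows(System.getProperty("os.name"))) {
            System.out.println("skipping test: Everyone group lacks execute "
                + "permission on ancestor directories");
            return;
        }
        // ... run the permission-sensitive test body here ...
    }
}
```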
[jira] [Updated] (YARN-1914) Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows
[ https://issues.apache.org/jira/browse/YARN-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-1914: Attachment: apache-yarn-1914.0.patch Patch skipping test on windows Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows Key: YARN-1914 URL: https://issues.apache.org/jira/browse/YARN-1914 Project: Hadoop YARN Issue Type: Bug Reporter: Varun Vasudev Attachments: apache-yarn-1914.0.patch The TestFSDownload.testDownloadPublicWithStatCache test in hadoop-yarn-common consistently fails on Windows environments. The root cause is that the test checks for execute permission for all users on every ancestor of the target directory. In windows, by default, group Everyone has no permissions on any directory in the install drive. It's unreasonable to expect this test to pass and we should skip it on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (YARN-1914) Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows
[ https://issues.apache.org/jira/browse/YARN-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev reassigned YARN-1914: --- Assignee: Varun Vasudev Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows Key: YARN-1914 URL: https://issues.apache.org/jira/browse/YARN-1914 Project: Hadoop YARN Issue Type: Bug Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-1914.0.patch The TestFSDownload.testDownloadPublicWithStatCache test in hadoop-yarn-common consistently fails on Windows environments. The root cause is that the test checks for execute permission for all users on every ancestor of the target directory. In windows, by default, group Everyone has no permissions on any directory in the install drive. It's unreasonable to expect this test to pass and we should skip it on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1908) Distributed shell with custom script has permission error.
[ https://issues.apache.org/jira/browse/YARN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963382#comment-13963382 ] Xuan Gong commented on YARN-1908: - +1 LGTM Distributed shell with custom script has permission error. -- Key: YARN-1908 URL: https://issues.apache.org/jira/browse/YARN-1908 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Affects Versions: 2.4.0 Reporter: Tassapol Athiapinya Assignee: Vinod Kumar Vavilapalli Attachments: YARN-1908.1.patch, YARN-1908.2.patch, YARN-1908.3.patch, YARN-1908.4.patch Create test1.sh having pwd. Run this command as user1: hadoop jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar -shell_script test1.sh NM is run by yarn user. An exception is thrown because yarn user has no permissions on custom script in hdfs path. The custom script is created with distributed shell app. {code} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=yarn, access=WRITE, inode=/user/user1/DistributedShell/70:user1:user1:drwxr-xr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1910) TestAMRMTokens fails on windows
[ https://issues.apache.org/jira/browse/YARN-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1910: Attachment: YARN-1910.2.patch TestAMRMTokens fails on windows --- Key: YARN-1910 URL: https://issues.apache.org/jira/browse/YARN-1910 Project: Hadoop YARN Issue Type: Bug Reporter: Xuan Gong Assignee: Xuan Gong Fix For: 2.4.0 Attachments: YARN-1910.1.patch, YARN-1910.2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1910) TestAMRMTokens fails on windows
[ https://issues.apache.org/jira/browse/YARN-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963396#comment-13963396 ] Varun Vasudev commented on YARN-1910: - +1 looks good to me. TestAMRMTokens fails on windows --- Key: YARN-1910 URL: https://issues.apache.org/jira/browse/YARN-1910 Project: Hadoop YARN Issue Type: Bug Reporter: Xuan Gong Assignee: Xuan Gong Fix For: 2.4.0 Attachments: YARN-1910.1.patch, YARN-1910.2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1914) Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows
[ https://issues.apache.org/jira/browse/YARN-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963399#comment-13963399 ] Hadoop QA commented on YARN-1914: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639253/apache-yarn-1914.0.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3531//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3531//console This message is automatically generated. Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows Key: YARN-1914 URL: https://issues.apache.org/jira/browse/YARN-1914 Project: Hadoop YARN Issue Type: Bug Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-1914.0.patch The TestFSDownload.testDownloadPublicWithStatCache test in hadoop-yarn-common consistently fails on Windows environments. The root cause is that the test checks for execute permission for all users on every ancestor of the target directory. 
On Windows, by default, the Everyone group has no permissions on any directory on the install drive. It's unreasonable to expect this test to pass, and we should skip it on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1908) Distributed shell with custom script has permission error.
[ https://issues.apache.org/jira/browse/YARN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963401#comment-13963401 ] Jian He commented on YARN-1908: --- Patch looks good to me, + 1 Distributed shell with custom script has permission error. -- Key: YARN-1908 URL: https://issues.apache.org/jira/browse/YARN-1908 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Affects Versions: 2.4.0 Reporter: Tassapol Athiapinya Assignee: Vinod Kumar Vavilapalli Attachments: YARN-1908.1.patch, YARN-1908.2.patch, YARN-1908.3.patch, YARN-1908.4.patch Create test1.sh having pwd. Run this command as user1: hadoop jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar -shell_script test1.sh NM is run by yarn user. An exception is thrown because yarn user has no permissions on custom script in hdfs path. The custom script is created with distributed shell app. {code} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=yarn, access=WRITE, inode=/user/user1/DistributedShell/70:user1:user1:drwxr-xr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1910) TestAMRMTokens fails on windows
[ https://issues.apache.org/jira/browse/YARN-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963457#comment-13963457 ] Hadoop QA commented on YARN-1910: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639256/YARN-1910.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3532//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3532//console This message is automatically generated. TestAMRMTokens fails on windows --- Key: YARN-1910 URL: https://issues.apache.org/jira/browse/YARN-1910 Project: Hadoop YARN Issue Type: Bug Reporter: Xuan Gong Assignee: Xuan Gong Fix For: 2.4.0 Attachments: YARN-1910.1.patch, YARN-1910.2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1784) TestContainerAllocation assumes CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated YARN-1784: Attachment: YARN-1784.patch The patch configures the tests to always use the CapacityScheduler TestContainerAllocation assumes CapacityScheduler - Key: YARN-1784 URL: https://issues.apache.org/jira/browse/YARN-1784 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: Karthik Kambatla Assignee: Robert Kanter Priority: Minor Attachments: YARN-1784.patch TestContainerAllocation assumes CapacityScheduler -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1907) TestRMApplicationHistoryWriter#testRMWritingMassiveHistory runs slow and intermittently fails
[ https://issues.apache.org/jira/browse/YARN-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963480#comment-13963480 ] Zhijie Shen commented on YARN-1907: --- The patch makes sense to me. I tried the test in eclipse as well. Sometimes it would fail after 200 rounds of node heartbeats, with the containers still not completely cleaned up. However, would it be better code practice to loop until all the containers are cleaned up (removing the 200-round bound) and set a suitable timeout for this test case? TestRMApplicationHistoryWriter#testRMWritingMassiveHistory runs slow and intermittently fails - Key: YARN-1907 URL: https://issues.apache.org/jira/browse/YARN-1907 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.5.0 Reporter: Mit Desai Assignee: Mit Desai Attachments: HDFS-6195.patch The test has 1 containers that it tries to clean up. The cleanup has a timeout of 2ms in which the test sometimes cannot complete the cleanup and fails with an assertion error. -- This message was sent by Atlassian JIRA (v6.2#6252)
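The pattern Zhijie suggests, polling until the condition holds under an overall timeout instead of a fixed number of rounds, is a common way to deflake such tests. A minimal generic sketch (not the actual test code; names are illustrative):

```java
import java.util.function.BooleanSupplier;

public class WaitFor {
    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition became true within timeoutMs.
    static boolean waitFor(BooleanSupplier condition, long timeoutMs,
                           long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        final long start = System.currentTimeMillis();
        // Condition becomes true after ~30ms, well inside the timeout,
        // so the loop exits early instead of running a fixed round count.
        boolean ok = waitFor(
            () -> System.currentTimeMillis() - start >= 30, 5000, 10);
        System.out.println(ok); // true
    }
}
```

In a test, the assertion then checks the return value, so a slow environment gets the full timeout while a fast one finishes as soon as cleanup completes.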
[jira] [Commented] (YARN-1906) TestRMRestart#testQueueMetricsOnRMRestart fails intermittently on trunk and branch2
[ https://issues.apache.org/jira/browse/YARN-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963491#comment-13963491 ] Zhijie Shen commented on YARN-1906: --- Maybe we can wait until, for example, appSubmitted has changed (increased or decreased by 1)? TestRMRestart#testQueueMetricsOnRMRestart fails intermittently on trunk and branch2 --- Key: YARN-1906 URL: https://issues.apache.org/jira/browse/YARN-1906 Project: Hadoop YARN Issue Type: Bug Reporter: Mit Desai Assignee: Mit Desai Fix For: 3.0.0, 2.5.0 Attachments: YARN-1906.patch Here is the output of the failure: {noformat} testQueueMetricsOnRMRestart(org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart) Time elapsed: 9.757 sec FAILURE! java.lang.AssertionError: expected:<2> but was:<1> at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.assertQueueMetrics(TestRMRestart.java:1735) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testQueueMetricsOnRMRestart(TestRMRestart.java:1706) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1784) TestContainerAllocation assumes CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963512#comment-13963512 ] Hadoop QA commented on YARN-1784: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639268/YARN-1784.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3533//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3533//console This message is automatically generated. TestContainerAllocation assumes CapacityScheduler - Key: YARN-1784 URL: https://issues.apache.org/jira/browse/YARN-1784 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: Karthik Kambatla Assignee: Robert Kanter Priority: Minor Attachments: YARN-1784.patch TestContainerAllocation assumes CapacityScheduler -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1908) Distributed shell with custom script has permission error.
[ https://issues.apache.org/jira/browse/YARN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963517#comment-13963517 ] Vinod Kumar Vavilapalli commented on YARN-1908: --- Tx for the reviews [~xgong] and [~jianhe]!. Checking this in.. Distributed shell with custom script has permission error. -- Key: YARN-1908 URL: https://issues.apache.org/jira/browse/YARN-1908 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Affects Versions: 2.4.0 Reporter: Tassapol Athiapinya Assignee: Vinod Kumar Vavilapalli Attachments: YARN-1908.1.patch, YARN-1908.2.patch, YARN-1908.3.patch, YARN-1908.4.patch Create test1.sh having pwd. Run this command as user1: hadoop jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar -shell_script test1.sh NM is run by yarn user. An exception is thrown because yarn user has no permissions on custom script in hdfs path. The custom script is created with distributed shell app. {code} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=yarn, access=WRITE, inode=/user/user1/DistributedShell/70:user1:user1:drwxr-xr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1341: - Attachment: YARN-1341v3.patch Updating patch after YARN-1757 was committed. Recover NMTokens upon nodemanager restart - Key: YARN-1341 URL: https://issues.apache.org/jira/browse/YARN-1341 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1341.patch, YARN-1341v2.patch, YARN-1341v3.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1908) Distributed shell with custom script has permission error.
[ https://issues.apache.org/jira/browse/YARN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963532#comment-13963532 ] Hudson commented on YARN-1908: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5471 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5471/]) YARN-1908. Fixed DistributedShell to not fail in secure clusters. Contributed by Vinod Kumar Vavilapalli and Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1585849) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java Distributed shell with custom script has permission error. -- Key: YARN-1908 URL: https://issues.apache.org/jira/browse/YARN-1908 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Affects Versions: 2.4.0 Reporter: Tassapol Athiapinya Assignee: Vinod Kumar Vavilapalli Fix For: 2.4.1 Attachments: YARN-1908.1.patch, YARN-1908.2.patch, YARN-1908.3.patch, YARN-1908.4.patch Create test1.sh having pwd. Run this command as user1: hadoop jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar -shell_script test1.sh NM is run by yarn user. An exception is thrown because yarn user has no permissions on custom script in hdfs path. The custom script is created with distributed shell app. {code} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=yarn, access=WRITE, inode=/user/user1/DistributedShell/70:user1:user1:drwxr-xr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1342) Recover container tokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1342: - Attachment: YARN-1342v2.patch Updating patch after YARN-1757 was committed. Recover container tokens upon nodemanager restart - Key: YARN-1342 URL: https://issues.apache.org/jira/browse/YARN-1342 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1342.patch, YARN-1342v2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1784) TestContainerAllocation assumes CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963553#comment-13963553 ] Karthik Kambatla commented on YARN-1784: Instead of adding it to individual tests, can we add a setup method (@Before) so that future tests using the available conf don't run into this? TestContainerAllocation assumes CapacityScheduler - Key: YARN-1784 URL: https://issues.apache.org/jira/browse/YARN-1784 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: Karthik Kambatla Assignee: Robert Kanter Priority: Minor Attachments: YARN-1784.patch TestContainerAllocation assumes CapacityScheduler -- This message was sent by Atlassian JIRA (v6.2#6252)
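The @Before idea can be sketched outside JUnit as follows. This is a minimal, hypothetical harness (class and key names are illustrative); the actual fix would put the reset in a JUnit @Before method of TestContainerAllocation, rebuilding the YarnConfiguration before each test so no test inherits a scheduler setting leaked by an earlier one.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the @Before pattern: rebuild the shared configuration
// before every test. In JUnit the setUp() body would carry the
// @Before annotation.
public class SetupSketch {
    static Map<String, String> conf; // stands in for YarnConfiguration

    // Equivalent of the proposed @Before setup method.
    static void setUp() {
        conf = new HashMap<>();
        // Hypothetical key/value; the real test would set
        // yarn.resourcemanager.scheduler.class to CapacityScheduler.
        conf.put("scheduler.class", "CapacityScheduler");
    }

    public static void main(String[] args) {
        setUp();
        conf.put("scheduler.class", "FifoScheduler"); // one test mutates conf
        setUp();                                      // next test gets a fresh conf
        System.out.println(conf.get("scheduler.class")); // prints CapacityScheduler
    }
}
```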
[jira] [Updated] (YARN-1339) Recover DeletionService state upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1339: - Attachment: YARN-1339v2.patch Updating patch after YARN-1757 was committed. Recover DeletionService state upon nodemanager restart -- Key: YARN-1339 URL: https://issues.apache.org/jira/browse/YARN-1339 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1339.patch, YARN-1339v2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1906) TestRMRestart#testQueueMetricsOnRMRestart fails intermittently on trunk and branch2
[ https://issues.apache.org/jira/browse/YARN-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963539#comment-13963539 ] Mit Desai commented on YARN-1906: - I was going through the app state transitions and I have the following conclusions: * The transition from NEW to SUBMITTED and from SUBMITTED to ACCEPTED is almost instantaneous. * When assertQueueMetrics is called, the app state I found was always ACCEPTED. The next state it can transition into is RUNNING/KILLING/FINAL_SAVING, which will not change until the scheduler picks up the app. [~jeagles], so we will not be able to use the waitForState method here. [~zjshen], when the app is submitted and we check the numbers in the assertQueueMetrics method, the appSubmitted count will already be 1, so waiting for it to increment would wait forever. Moreover, in assertQueueMetrics we are verifying the same thing (i.e. that the number of submitted apps and the number of apps in the pending state are what we expect). I do not have another solution for the problem; maybe we need to think of other approaches. Please provide your feedback if you have other ideas or think I am heading in the wrong direction. TestRMRestart#testQueueMetricsOnRMRestart fails intermittently on trunk and branch2 --- Key: YARN-1906 URL: https://issues.apache.org/jira/browse/YARN-1906 Project: Hadoop YARN Issue Type: Bug Reporter: Mit Desai Assignee: Mit Desai Fix For: 3.0.0, 2.5.0 Attachments: YARN-1906.patch Here is the output of the failure: {noformat} testQueueMetricsOnRMRestart(org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart) Time elapsed: 9.757 sec FAILURE! 
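A common workaround for this kind of timing-sensitive assertion is to poll the metric with a deadline instead of asserting its value at one arbitrary instant. The sketch below is illustrative, not the TestRMRestart code: the waitFor helper and the simulated appsSubmitted metric are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Poll a condition until it holds or a timeout expires, instead of
// asserting a queue-metric value at a single instant.
public class WaitForSketch {
    static boolean waitFor(BooleanSupplier cond, long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (cond.getAsBoolean()) return true;
            Thread.sleep(10); // poll interval
        }
        return cond.getAsBoolean(); // one last check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger appsSubmitted = new AtomicInteger(1); // simulated metric
        // Simulate the RM updating the metric a little later, once the
        // scheduler picks up the app.
        new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) { }
            appsSubmitted.set(2);
        }).start();
        System.out.println(waitFor(() -> appsSubmitted.get() == 2, 2000));
    }
}
```

Whether this fits here is exactly the open question in the comment above: the metric may already be at its final value when the assertion runs, in which case polling for an increment would indeed never return.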
java.lang.AssertionError: expected:<2> but was:<1> at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.assertQueueMetrics(TestRMRestart.java:1735) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testQueueMetricsOnRMRestart(TestRMRestart.java:1706) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-1915) ClientToAMTokenMasterKey should be provided to AM at launch time
Hitesh Shah created YARN-1915: - Summary: ClientToAMTokenMasterKey should be provided to AM at launch time Key: YARN-1915 URL: https://issues.apache.org/jira/browse/YARN-1915 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Hitesh Shah Priority: Critical Currently, the AM receives the key as part of registration. This introduces a race where a client can connect to the AM before the AM has received the key. Current flow:
1) The AM needs to start the client listening service in order to get host:port and send it to the RM as part of registration.
2) The RM gets the port info in register() and transitions the app to RUNNING. It responds back to the AM with the client secret.
3) The user asks the RM for a client token, gets it, and pings the AM. The AM hasn't received the client secret from the RM yet, so the RPC layer itself rejects the request.
-- This message was sent by Atlassian JIRA (v6.2#6252)
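The race can be seen in miniature below. This is an illustrative sketch, not the YARN RPC code: a stand-in AM handler rejects client calls until the secret arrives, and a client that retries with backoff eventually gets through. Delivering the key at launch time, as this issue proposes, would remove the race entirely rather than papering over it with retries.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrates the registration race: the AM only learns the client
// secret after registering with the RM, so an early client call fails.
public class KeyRaceSketch {
    // null until the RM's register() response delivers the secret
    static final AtomicReference<String> clientSecret = new AtomicReference<>();

    // Stand-in for the AM-side RPC handler.
    static String handleClientCall() {
        if (clientSecret.get() == null) {
            throw new IllegalStateException("client secret not yet received");
        }
        return "ok";
    }

    // Client-side mitigation: retry until the AM is ready.
    static String callWithRetry(int attempts, long backoffMs) throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            try {
                return handleClientCall();
            } catch (IllegalStateException e) {
                Thread.sleep(backoffMs); // AM may not have the key yet
            }
        }
        throw new IllegalStateException("gave up");
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate the RM delivering the secret shortly after the
        // client starts calling.
        new Thread(() -> {
            try { Thread.sleep(30); } catch (InterruptedException ignored) { }
            clientSecret.set("master-key");
        }).start();
        System.out.println(callWithRetry(50, 10));
    }
}
```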
[jira] [Commented] (YARN-1341) Recover NMTokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963581#comment-13963581 ] Hadoop QA commented on YARN-1341: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639280/YARN-1341v3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3534//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3534//console This message is automatically generated. Recover NMTokens upon nodemanager restart - Key: YARN-1341 URL: https://issues.apache.org/jira/browse/YARN-1341 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1341.patch, YARN-1341v2.patch, YARN-1341v3.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1784) TestContainerAllocation assumes CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated YARN-1784: Attachment: YARN-1784.patch New patch uses a setup method. TestContainerAllocation assumes CapacityScheduler - Key: YARN-1784 URL: https://issues.apache.org/jira/browse/YARN-1784 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: Karthik Kambatla Assignee: Robert Kanter Priority: Minor Attachments: YARN-1784.patch, YARN-1784.patch TestContainerAllocation assumes CapacityScheduler -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1342) Recover container tokens upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963604#comment-13963604 ] Hadoop QA commented on YARN-1342: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639283/YARN-1342v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3535//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3535//console This message is automatically generated. Recover container tokens upon nodemanager restart - Key: YARN-1342 URL: https://issues.apache.org/jira/browse/YARN-1342 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1342.patch, YARN-1342v2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1339) Recover DeletionService state upon nodemanager restart
[ https://issues.apache.org/jira/browse/YARN-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963606#comment-13963606 ] Hadoop QA commented on YARN-1339: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639285/YARN-1339v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3536//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3536//console This message is automatically generated. Recover DeletionService state upon nodemanager restart -- Key: YARN-1339 URL: https://issues.apache.org/jira/browse/YARN-1339 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.3.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-1339.patch, YARN-1339v2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1784) TestContainerAllocation assumes CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963650#comment-13963650 ] Hadoop QA commented on YARN-1784: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639292/YARN-1784.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3537//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3537//console This message is automatically generated. TestContainerAllocation assumes CapacityScheduler - Key: YARN-1784 URL: https://issues.apache.org/jira/browse/YARN-1784 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: Karthik Kambatla Assignee: Robert Kanter Priority: Minor Attachments: YARN-1784.patch, YARN-1784.patch TestContainerAllocation assumes CapacityScheduler -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-1916) Leveldb timeline store applies secondary filters incorrectly
Billie Rinaldi created YARN-1916: Summary: Leveldb timeline store applies secondary filters incorrectly Key: YARN-1916 URL: https://issues.apache.org/jira/browse/YARN-1916 Project: Hadoop YARN Issue Type: Bug Reporter: Billie Rinaldi Assignee: Billie Rinaldi When applying a secondary filter (fieldname:fieldvalue) in a get entities query, LeveldbTimelineStore retrieves entities that do not have the specified fieldname, in addition to correctly retrieving entities that have the fieldname with the specified fieldvalue. It should not return entities that do not have the fieldname. -- This message was sent by Atlassian JIRA (v6.2#6252)
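The intended semantics can be stated as a small predicate: an entity matches a secondary filter only if it actually contains the fieldname and the value is equal, so an entity lacking the field must not match. A sketch with plain maps follows (this is not the LeveldbTimelineStore code; entities are modeled as field-to-value maps for illustration):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Correct secondary-filter semantics: absence of the filtered field
// excludes the entity, which is exactly the case the store got wrong.
public class SecondaryFilterSketch {
    static boolean matches(Map<String, String> entity, String field, String value) {
        return entity.containsKey(field) && value.equals(entity.get(field));
    }

    static List<Map<String, String>> filter(List<Map<String, String>> entities,
                                            String field, String value) {
        return entities.stream()
                .filter(e -> matches(e, field, value))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, String>> entities = List.of(
                Map.of("user", "alice"),    // matching field and value
                Map.of("user", "bob"),      // wrong value
                Map.of("queue", "default")  // field absent: must be excluded
        );
        System.out.println(filter(entities, "user", "alice").size()); // prints 1
    }
}
```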
[jira] [Updated] (YARN-1916) Leveldb timeline store applies secondary filters incorrectly
[ https://issues.apache.org/jira/browse/YARN-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billie Rinaldi updated YARN-1916: - Attachment: YARN-1916.1.patch Leveldb timeline store applies secondary filters incorrectly Key: YARN-1916 URL: https://issues.apache.org/jira/browse/YARN-1916 Project: Hadoop YARN Issue Type: Bug Reporter: Billie Rinaldi Assignee: Billie Rinaldi Attachments: YARN-1916.1.patch When applying a secondary filter (fieldname:fieldvalue) in a get entities query, LeveldbTimelineStore retrieves entities that do not have the specified fieldname, in addition to correctly retrieving entities that have the fieldname with the specified fieldvalue. It should not return entities that do not have the fieldname. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1914) Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows
[ https://issues.apache.org/jira/browse/YARN-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963681#comment-13963681 ] Sangjin Lee commented on YARN-1914: --- LGTM. Sorry for missing the windows build. Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows Key: YARN-1914 URL: https://issues.apache.org/jira/browse/YARN-1914 Project: Hadoop YARN Issue Type: Bug Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-1914.0.patch The TestFSDownload.testDownloadPublicWithStatCache test in hadoop-yarn-common consistently fails on Windows environments. The root cause is that the test checks for execute permission for all users on every ancestor of the target directory. In windows, by default, group Everyone has no permissions on any directory in the install drive. It's unreasonable to expect this test to pass and we should skip it on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
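Skipping a test on one platform is usually done with an environment check in the test itself (in JUnit 4, typically Assume.assumeTrue(...), which marks the test as skipped rather than failed). A plain-Java sketch of the detection, with the JUnit wiring left as a comment since it is not reproduced here:

```java
// Detect Windows so a permission-traversal test can be skipped there.
// In a JUnit test this boolean would feed Assume.assumeTrue(!isWindows()).
public class WindowsSkipSketch {
    static boolean isWindows() {
        return System.getProperty("os.name").toLowerCase().startsWith("windows");
    }

    public static void main(String[] args) {
        if (isWindows()) {
            System.out.println("skipping: Everyone has no permissions on the install drive");
            return;
        }
        System.out.println("running permission-traversal checks");
    }
}
```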
[jira] [Commented] (YARN-1914) Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows
[ https://issues.apache.org/jira/browse/YARN-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963710#comment-13963710 ] Bikas Saha commented on YARN-1914: -- [~cnauroth] [~ivanmi] Wasn't this specific issue already handled? Directory traversal all the way to the top does not work on Windows, so there was special logic for Windows. Test TestFSDownload.testDownloadPublicWithStatCache fails on Windows Key: YARN-1914 URL: https://issues.apache.org/jira/browse/YARN-1914 Project: Hadoop YARN Issue Type: Bug Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: apache-yarn-1914.0.patch The TestFSDownload.testDownloadPublicWithStatCache test in hadoop-yarn-common consistently fails on Windows environments. The root cause is that the test checks for execute permission for all users on every ancestor of the target directory. On Windows, by default, the group Everyone has no permissions on any directory in the install drive. It's unreasonable to expect this test to pass and we should skip it on Windows. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1916) Leveldb timeline store applies secondary filters incorrectly
[ https://issues.apache.org/jira/browse/YARN-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963712#comment-13963712 ] Hadoop QA commented on YARN-1916: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12639313/YARN-1916.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3538//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3538//console This message is automatically generated. 
Leveldb timeline store applies secondary filters incorrectly Key: YARN-1916 URL: https://issues.apache.org/jira/browse/YARN-1916 Project: Hadoop YARN Issue Type: Bug Reporter: Billie Rinaldi Assignee: Billie Rinaldi Attachments: YARN-1916.1.patch When applying a secondary filter (fieldname:fieldvalue) in a get entities query, LeveldbTimelineStore retrieves entities that do not have the specified fieldname, in addition to correctly retrieving entities that have the fieldname with the specified fieldvalue. It should not return entities that do not have the fieldname. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1912) ResourceLocalizer started without any jvm memory control
[ https://issues.apache.org/jira/browse/YARN-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963789#comment-13963789 ] stanley shi commented on YARN-1912: --- No, it's not the minimum. On one of my environments, which has 32GB of memory: {code} /opt/jdk1.7.0_15/bin/java -XX:+PrintFlagsFinal -version 2>&1 | grep MaxHeapSize uintx MaxHeapSize := 8415870976 {product} {code} And an answer from Oracle: {quote}Server JVM heap configuration ergonomics are now the same as the Client, except that the default maximum heap size for 32-bit JVMs is 1 gigabyte, corresponding to a physical memory size of 4 gigabytes, and for 64-bit JVMs is 32 gigabytes, corresponding to a physical memory size of 128 gigabytes. {quote} http://www.oracle.com/technetwork/java/javase/6u18-142093.html ResourceLocalizer started without any jvm memory control Key: YARN-1912 URL: https://issues.apache.org/jira/browse/YARN-1912 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.2.0 Reporter: stanley shi In LinuxContainerExecutor.java#startLocalizer, no -Xmx setting is specified in the command, which causes the ResourceLocalizer to be started with the default memory settings. On server-class hardware, it will use 25% of system memory as the max heap size, which can cause memory issues in some cases. -- This message was sent by Atlassian JIRA (v6.2#6252)
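The ergonomics quoted above can also be checked from inside a running JVM: Runtime.maxMemory() reports the effective heap ceiling, which on a 64-bit server-class machine with no explicit -Xmx typically lands near a quarter of physical memory (8415870976 bytes, about 8GB, on the 32GB machine above). A small probe, for illustration only; the 25% figure is the JVM's heuristic, not something this code enforces:

```java
// Print the heap ceiling the launched JVM actually got. Without an
// explicit -Xmx this reflects the default ergonomics, which is why the
// localizer should be started with an explicit -Xmx.
public class MaxHeapProbe {
    public static void main(String[] args) {
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.printf("MaxHeapSize ~= %d bytes (%.1f MB)%n",
                maxBytes, maxBytes / (1024.0 * 1024.0));
    }
}
```

Running this once with and once without `-Xmx` on the localizer's command line would make the difference in the launched process directly visible.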
[jira] [Commented] (YARN-1784) TestContainerAllocation assumes CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13963802#comment-13963802 ] Karthik Kambatla commented on YARN-1784: +1. Will commit this tonight. TestContainerAllocation assumes CapacityScheduler - Key: YARN-1784 URL: https://issues.apache.org/jira/browse/YARN-1784 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.3.0 Reporter: Karthik Kambatla Assignee: Robert Kanter Priority: Minor Attachments: YARN-1784.patch, YARN-1784.patch TestContainerAllocation assumes CapacityScheduler -- This message was sent by Atlassian JIRA (v6.2#6252)