Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/479/

No changes

-1 overall

The following subsystems voted -1:
    docker

Powered by Apache Yetus 0.8.0   http://yetus.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1294/

[Oct 18, 2019 9:19:49 AM] (snemeth) YARN-9841. Capacity scheduler: add support for combined %user +
[Oct 18, 2019 3:25:02 PM] (weichiu) HADOOP-16152. Upgrade Eclipse Jetty version to 9.4.x. Contributed by
[Oct 18, 2019 8:26:20 PM] (weichiu) HADOOP-16579. Upgrade to Curator 4.2.0 and ZooKeeper 3.5.5 (#1656).
[Oct 18, 2019 11:10:32 PM] (eyang) YARN-9884. Make container-executor mount logic modular
[Oct 19, 2019 12:30:11 AM] (eyang) YARN-9875. Improve fair scheduler configuration store on HDFS.

-1 overall

The following subsystems voted -1:
    docker

Powered by Apache Yetus 0.8.0   http://yetus.apache.org
[jira] [Created] (YARN-9920) YarnAuthorizationProvider AccessRequest has Null RemoteAddress in case of FairScheduler
Prabhu Joseph created YARN-9920:
-----------------------------------

             Summary: YarnAuthorizationProvider AccessRequest has Null RemoteAddress in case of FairScheduler
                 Key: YARN-9920
                 URL: https://issues.apache.org/jira/browse/YARN-9920
             Project: Hadoop YARN
          Issue Type: Bug
    Affects Versions: 3.3.0
            Reporter: Prabhu Joseph
            Assignee: Prabhu Joseph

The YarnAuthorizationProvider AccessRequest has a null RemoteAddress in the case of FairScheduler. FSQueue#hasAccess uses Server.getRemoteAddress(), which is null when the call comes from RMWebServices or the EventDispatcher; it works correctly only when invoked by an IPC Server handler. FSQueue#hasAccess is called in three places, of which (2) and (3) return null:

1. IPC Server -> RMAppManager#createAndPopulateNewRMApp -> FSQueue#hasAccess -> Server.getRemoteAddress returns the correct remote IP.

2. IPC Server -> RMAppManager#createAndPopulateNewRMApp -> AppAddedSchedulerEvent EventDispatcher -> FairScheduler#addApplication -> FSQueue.hasAccess -> Server.getRemoteAddress returns null.

{code}
org.apache.hadoop.yarn.security.ConfiguredYarnAuthorizer.checkPermission(ConfiguredYarnAuthorizer.java:101)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.hasAccess(FSQueue.java:316)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:509)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1268)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:133)
        at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
{code}

3. RMWebServices -> QueueACLsManager#checkAccess -> FSQueue.hasAccess -> Server.getRemoteAddress returns null.
{code}
org.apache.hadoop.yarn.security.ConfiguredYarnAuthorizer.checkPermission(ConfiguredYarnAuthorizer.java:101)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.hasAccess(FSQueue.java:316)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.checkAccess(FairScheduler.java:1610)
        at org.apache.hadoop.yarn.server.resourcemanager.security.QueueACLsManager.checkAccess(QueueACLsManager.java:84)
        at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.hasAccess(RMWebServices.java:270)
        at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.getApps(RMWebServices.java:553)
{code}

Have verified with CapacityScheduler and it works fine there.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
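The failure mode can be sketched with a self-contained stand-in (hypothetical names: `CUR_CALL` mimics the IPC Server's per-handler thread-local; this is not the actual Hadoop code). Server.getRemoteAddress() reads thread-local call state, so any path that has left the IPC handler thread sees null. One possible direction, capturing the address while still on the handler thread, is shown as captureOrDefault:

```java
import java.net.InetAddress;

// Sketch of the null-RemoteAddress issue. CUR_CALL stands in for the IPC
// Server's per-handler call context; outside a handler thread it is unset,
// so the address lookup returns null.
public class RemoteAddressSketch {
    private static final ThreadLocal<InetAddress> CUR_CALL = new ThreadLocal<>();

    // Mirrors Server.getRemoteAddress(): null when no IPC call is in flight.
    public static String getRemoteAddress() {
        InetAddress addr = CUR_CALL.get();
        return addr == null ? null : addr.getHostAddress();
    }

    // Hypothetical fix direction: read the address while still on the IPC
    // handler thread and carry it with the event, falling back otherwise.
    public static String captureOrDefault(String fallback) {
        String ip = getRemoteAddress();
        return ip != null ? ip : fallback;
    }

    public static void main(String[] args) throws Exception {
        // EventDispatcher / RMWebServices path: no IPC context, lookup is null.
        System.out.println(getRemoteAddress());          // null
        System.out.println(captureOrDefault("UNKNOWN")); // UNKNOWN

        // IPC handler path: context is set, lookup succeeds.
        CUR_CALL.set(InetAddress.getByName("10.0.0.1"));
        System.out.println(captureOrDefault("UNKNOWN")); // 10.0.0.1
    }
}
```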
[jira] [Created] (YARN-9919) QueueMetrics for observe only preemption
Prashant Golash created YARN-9919:
-------------------------------------

             Summary: QueueMetrics for observe only preemption
                 Key: YARN-9919
                 URL: https://issues.apache.org/jira/browse/YARN-9919
             Project: Hadoop YARN
          Issue Type: Improvement
            Reporter: Prashant Golash

There should be a way to track how many containers would be preempted when preemption is enabled in "observe only" mode. In the case of lazy preemption it could be a bit involved, as the scheduler makes the decision.
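One possible shape for such a metric (illustrative names only, not YARN's QueueMetrics API): the preemption policy increments a per-queue counter at the point where it would otherwise have issued the kill.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: per-queue "would have preempted" counters for an
// observe-only preemption mode. Names are illustrative, not YARN's API.
public class ObserveOnlyPreemptionMetrics {
    private final Map<String, LongAdder> wouldPreempt = new ConcurrentHashMap<>();

    // Called where the policy would normally kill containers.
    public void recordWouldPreempt(String queue, int containers) {
        wouldPreempt.computeIfAbsent(queue, q -> new LongAdder()).add(containers);
    }

    public long getWouldPreempt(String queue) {
        LongAdder a = wouldPreempt.get(queue);
        return a == null ? 0L : a.sum();
    }

    public static void main(String[] args) {
        ObserveOnlyPreemptionMetrics m = new ObserveOnlyPreemptionMetrics();
        m.recordWouldPreempt("root.batch", 3);
        m.recordWouldPreempt("root.batch", 2);
        System.out.println(m.getWouldPreempt("root.batch")); // 5
        System.out.println(m.getWouldPreempt("root.adhoc")); // 0
    }
}
```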
[jira] [Created] (YARN-9918) AggregatedAllocatedContainers metrics not getting reported for MR in 2.6.x
Prashant Golash created YARN-9918:
-------------------------------------

             Summary: AggregatedAllocatedContainers metrics not getting reported for MR in 2.6.x
                 Key: YARN-9918
                 URL: https://issues.apache.org/jira/browse/YARN-9918
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Prashant Golash

One of our YARN clusters runs 2.6.x CDH. I have observed that aggregated allocated container metrics are not reported for MR jobs. Some queues carry a purely MR workload, yet those queues always show 0 for "aggregatedAllocatedContainers". Created this Jira to track the issue.
[jira] [Created] (YARN-9917) Rate limit RMWebServices API
Prashant Golash created YARN-9917:
-------------------------------------

             Summary: Rate limit RMWebServices API
                 Key: YARN-9917
                 URL: https://issues.apache.org/jira/browse/YARN-9917
             Project: Hadoop YARN
          Issue Type: Improvement
            Reporter: Prashant Golash

In our cluster, Spark clients call getApplications without specifying filters to solve some use cases. When the number of applications is large and the call volume is high, this causes GC pressure on the RM and can result in a failover. Ideally, we should be able to rate-limit this API. Any other ideas are also welcome.
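One way the limit could be sketched, assuming a plain token bucket rather than any particular library (this is not RMWebServices code; the class and its use are illustrative):

```java
// Hypothetical sketch of rate-limiting an expensive endpoint such as
// getApplications: a token bucket that sheds excess calls instead of
// queuing them, so a burst cannot pile work onto the RM.
public class ApiRateLimiter {
    private final long capacity;
    private final double refillPerNano;
    private double tokens;
    private long lastRefill;

    public ApiRateLimiter(long permitsPerSecond) {
        this.capacity = permitsPerSecond;
        this.refillPerNano = permitsPerSecond / 1e9;
        this.tokens = permitsPerSecond; // bucket starts full
        this.lastRefill = System.nanoTime();
    }

    // Returns true if the call may proceed; false means "reject the request"
    // (e.g. with HTTP 429) instead of doing the expensive scan.
    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ApiRateLimiter limiter = new ApiRateLimiter(2); // 2 calls/sec
        int allowed = 0;
        for (int i = 0; i < 10; i++) {
            if (limiter.tryAcquire()) allowed++;
        }
        // An immediate burst of 10 calls: only the initial bucket (2) passes.
        System.out.println(allowed);
    }
}
```

Guava's RateLimiter offers a ready-made equivalent if adding the dependency is acceptable.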
[jira] [Created] (YARN-9915) Fix FindBugs issue in QueueMetrics
Prabhu Joseph created YARN-9915:
-----------------------------------

             Summary: Fix FindBugs issue in QueueMetrics
                 Key: YARN-9915
                 URL: https://issues.apache.org/jira/browse/YARN-9915
             Project: Hadoop YARN
          Issue Type: Bug
    Affects Versions: 3.3.0
            Reporter: Prabhu Joseph
            Assignee: Prabhu Joseph

The following FindBugs issue appears in the trunk build:

{code}
org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.registerCustomResources() invokes inefficient new Long(long) constructor; use Long.valueOf(long) instead

Bug type DM_NUMBER_CTOR (click for details)
In class org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics
In method org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.registerCustomResources()
Called method new Long(long)
Should call Long.valueOf(long) instead
At QueueMetrics.java:[line 468]
{code}
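The fix FindBugs asks for is a one-line change. A minimal demonstration of the difference (illustrative class, not the QueueMetrics code): the boxing constructor always allocates, while the static factory reuses cached instances for small values.

```java
// DM_NUMBER_CTOR in miniature: new Long(long) always allocates a fresh
// object, whereas Long.valueOf(long) returns a cached instance for values
// in -128..127, avoiding the allocation entirely.
public class LongValueOfDemo {
    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        Long a = new Long(42L);     // flagged pattern: always a new object
        Long b = Long.valueOf(42L); // preferred: cached for small values
        Long c = Long.valueOf(42L);

        System.out.println(a.equals(b)); // true  (same value either way)
        System.out.println(b == c);      // true  (same cached instance)
    }
}
```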
[jira] [Created] (YARN-9916) Improving Async Dispatcher
Prashant Golash created YARN-9916:
-------------------------------------

             Summary: Improving Async Dispatcher
                 Key: YARN-9916
                 URL: https://issues.apache.org/jira/browse/YARN-9916
             Project: Hadoop YARN
          Issue Type: Improvement
            Reporter: Prashant Golash

Currently, the async dispatcher works in a single-threaded model. There is a separate queue for the scheduler handler, but not all handlers are non-blocking. In our cluster, this queue can sometimes grow to 16M events, which takes a long time to drain. We should think about improving it:

# Make the dispatcher multi-threaded so several threads pick events off the queue; this would require careful evaluation of event ordering.
# Or give all downstream handlers their own queue, similar to the scheduler queue (this also needs careful evaluation of out-of-order events).

Any other ideas are also welcome.
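The first option could be sketched as a key-partitioned dispatcher (illustrative names, not YARN's AsyncDispatcher): events with the same key, say an application id, hash to one single-threaded lane and keep their relative order, while events for different keys run in parallel.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical multi-threaded dispatcher: N single-threaded lanes, events
// partitioned by key so per-key ordering is preserved. Not YARN code.
public class PartitionedDispatcher {
    private final ExecutorService[] lanes;

    public PartitionedDispatcher(int nThreads) {
        lanes = new ExecutorService[nThreads];
        for (int i = 0; i < nThreads; i++) {
            lanes[i] = Executors.newSingleThreadExecutor(); // each lane is FIFO
        }
    }

    // Same key -> same lane -> same relative order for that key's events.
    public void dispatch(String key, Runnable handler) {
        int lane = Math.floorMod(key.hashCode(), lanes.length);
        lanes[lane].execute(handler);
    }

    public void shutdown() throws InterruptedException {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
            lane.awaitTermination(10, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        PartitionedDispatcher d = new PartitionedDispatcher(4);
        List<Integer> seen = new CopyOnWriteArrayList<>();
        for (int i = 0; i < 100; i++) {
            final int n = i;
            d.dispatch("app_0001", () -> seen.add(n)); // one key: strict order
        }
        d.shutdown();
        System.out.println(seen.get(0) + ".." + seen.get(99)); // 0..99 in order
    }
}
```

The open question the ticket raises remains real: events that span keys (e.g. a node event affecting many applications) would still need a global ordering story.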
[jira] [Created] (YARN-9914) Use separate configs for free disk space checking for full and not-full disks
Jim Brennan created YARN-9914:
---------------------------------

             Summary: Use separate configs for free disk space checking for full and not-full disks
                 Key: YARN-9914
                 URL: https://issues.apache.org/jira/browse/YARN-9914
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: yarn
            Reporter: Jim Brennan
            Assignee: Jim Brennan

[YARN-3943] added separate configurations for the nodemanager health check's disk-utilization full-disk check:

{{max-disk-utilization-per-disk-percentage}} - threshold at which a good disk is marked full
{{disk-utilization-watermark-low-per-disk-percentage}} - threshold at which a full disk is marked not full

On our clusters we do not use these configs. We instead use {{min-free-space-per-disk-mb}}, so we can specify the limit in MB rather than as a percentage of utilization. We have observed the same oscillation behavior described in [YARN-3943] with this parameter. I would like to add an optional config to specify a separate threshold for marking a full disk as not full:

{{min-free-space-per-disk-mb}} - threshold at which a good disk is marked full
{{disk-free-space-per-disk-high-watermark-mb}} - threshold at which a full disk is marked good

For example, we could set {{min-free-space-per-disk-mb = 5GB}}, which would cause a disk to be marked full when free space drops below 5GB, and {{disk-free-space-per-disk-high-watermark-mb = 10GB}} to keep the disk in the full state until free space goes above 10GB.
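The proposed two-threshold behavior is classic hysteresis. A minimal sketch (illustrative class, not NodeManager code, though the config names follow the proposal):

```java
// Hysteresis for disk health: mark full below the low threshold, mark good
// again only above the separate high watermark, so free space hovering near
// a single limit cannot make the disk oscillate between states.
public class DiskFullChecker {
    private final long minFreeMb;       // min-free-space-per-disk-mb
    private final long highWatermarkMb; // disk-free-space-per-disk-high-watermark-mb
    private boolean full = false;

    public DiskFullChecker(long minFreeMb, long highWatermarkMb) {
        this.minFreeMb = minFreeMb;
        this.highWatermarkMb = highWatermarkMb;
    }

    // Returns true while the disk should be treated as full.
    public boolean update(long freeMb) {
        if (full) {
            if (freeMb > highWatermarkMb) full = false; // recovered
        } else {
            if (freeMb < minFreeMb) full = true;        // went full
        }
        return full;
    }

    public static void main(String[] args) {
        DiskFullChecker c = new DiskFullChecker(5 * 1024, 10 * 1024); // 5GB / 10GB
        System.out.println(c.update(6 * 1024));  // false: above 5GB, still good
        System.out.println(c.update(4 * 1024));  // true:  dropped below 5GB
        System.out.println(c.update(8 * 1024));  // true:  8GB < 10GB watermark
        System.out.println(c.update(11 * 1024)); // false: above 10GB, good again
    }
}
```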
[jira] [Created] (YARN-9913) In YARN ui2 attempt container tab, The Container's ElapsedTime of running Application is incorrect when the browser and the yarn server are in different timezones.
jenny created YARN-9913:
---------------------------

             Summary: In YARN ui2 attempt container tab, The Container's ElapsedTime of running Application is incorrect when the browser and the yarn server are in different timezones.
                 Key: YARN-9913
                 URL: https://issues.apache.org/jira/browse/YARN-9913
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn-ui-v2
    Affects Versions: 3.2.1, 3.1.1
            Reporter: jenny

In the YARN ui2 attempt container tab, the container's ElapsedTime of a running application is incorrect when the browser and the YARN server are in different timezones. Please see the screenshots below:

Yarn UI2:
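The class of bug can be illustrated in a small sketch (Java here for consistency with the rest of this digest, though ui2 itself is JavaScript; the code is illustrative, not the actual UI logic): elapsed time computed as a difference of epoch timestamps is timezone-independent, while interpreting the server's wall-clock start time in the browser's local zone skews the result by the zone offset.

```java
import java.time.Duration;
import java.time.Instant;

// Elapsed time of a running container should be (now - startedTime) in epoch
// millis, where no timezone ever enters. Reinterpreting the start time in the
// browser's local zone shifts it by the UTC offset and corrupts the result.
public class ElapsedTimeDemo {
    // Correct: both values are UTC epoch millis.
    public static long elapsedMillis(long startedTimeMillis, long nowMillis) {
        return nowMillis - startedTimeMillis;
    }

    public static void main(String[] args) {
        long started = Instant.parse("2019-10-20T10:00:00Z").toEpochMilli();
        long now     = Instant.parse("2019-10-20T10:30:00Z").toEpochMilli();
        System.out.println(
            Duration.ofMillis(elapsedMillis(started, now)).toMinutes()); // 30

        // The buggy pattern: treating the UTC start as local time in a
        // browser at UTC+8 shifts it by the 8-hour offset.
        long offset = Duration.ofHours(8).toMillis();
        long skewed = elapsedMillis(started - offset, now);
        System.out.println(Duration.ofMillis(skewed).toHours()); // 8 hours of error
    }
}
```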