[jira] [Updated] (YARN-8794) QueuePlacementPolicy add more rules
[ https://issues.apache.org/jira/browse/YARN-8794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Zhang updated YARN-8794: -- Attachment: YARN-8794.002.patch > QueuePlacementPolicy add more rules > --- > > Key: YARN-8794 > URL: https://issues.apache.org/jira/browse/YARN-8794 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Shuai Zhang >Priority: Major > Attachments: YARN-8794.001.patch, YARN-8794.002.patch > > > Still need more useful rules: > # RejectNonLeafQueue > # RejectDefaultQueue > # RejectUsers > # RejectQueues > # DefaultByUser -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
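To make the proposal concrete, here is a rough standalone sketch of what one of the requested rules (RejectDefaultQueue) could look like. The interface and all names below are simplified assumptions for illustration; they are not the actual FairScheduler QueuePlacementRule API and are not taken from the attached patches.

{code:java}
/**
 * Standalone sketch of the proposed RejectDefaultQueue rule: refuse any
 * placement that would land an app in the default queue. The interface is
 * a simplified assumption, not the real FairScheduler rule API.
 */
interface PlacementRuleSketch {
  /** Return the queue to assign the app to, or null to reject the submission. */
  String assignQueue(String requestedQueue, String user);
}

class RejectDefaultQueueRule implements PlacementRuleSketch {
  @Override
  public String assignQueue(String requestedQueue, String user) {
    if (requestedQueue == null || "root.default".equals(requestedQueue)) {
      return null; // reject: apps must target an explicit, non-default queue
    }
    return requestedQueue;
  }

  public static void main(String[] args) {
    PlacementRuleSketch rule = new RejectDefaultQueueRule();
    System.out.println(rule.assignQueue("root.default", "alice")); // null (rejected)
    System.out.println(rule.assignQueue("root.etl", "alice"));     // root.etl
  }
}
{code}

The other proposed rules (RejectUsers, RejectQueues, DefaultByUser) would presumably follow the same shape, consulting a configured user or queue list before accepting or rejecting the placement.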
[jira] [Created] (YARN-8811) Support Container Storage Interface (CSI) in YARN
Weiwei Yang created YARN-8811: - Summary: Support Container Storage Interface (CSI) in YARN Key: YARN-8811 URL: https://issues.apache.org/jira/browse/YARN-8811 Project: Hadoop YARN Issue Type: New Feature Reporter: Weiwei Yang The Container Storage Interface (CSI) is a vendor neutral interface to bridge Container Orchestrators and Storage Providers. With the adoption of CSI in YARN, it will be easier to integrate 3rd party storage systems, and provide the ability to attach persistent volumes for stateful applications. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7086) Release all containers asynchronously
[ https://issues.apache.org/jira/browse/YARN-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623100#comment-16623100 ] Manikandan R commented on YARN-7086: [~jlowe] I did a simple performance test to understand the container release behaviour. I tried to release 10K containers in a single AM allocate call and measured the time taken (in secs) for all containers to be released with the three different flows below: 1. Existing code: no changes. 2. With patch (async release + multiple container list traversal): used .002.patch as is, with a batch size of 1K. 3. With patch (non-async release + multiple container list traversal): slightly modified .002.patch to call the new completeContainers(Map containersToBeReleased, RMContainerEventType event) directly rather than going through the events flow. ||Run||Existing code||With Patch (Async release + multiple container list traversal)||With Patch (Not Async release + multiple container list traversal)|| |1|6.8|4.6|8.6| |2|8.3|7.5|9.9| |3|6.8|7.2|8.2| |4|7.2|7.1|8.9| |5|7.2|4.6|10| |Average of 5 runs|7.26|6.2|9.12| Attaching a patch containing only the test case to explain the above flow. Can you please validate the approach? > Release all containers asynchronously > > > Key: YARN-7086 > URL: https://issues.apache.org/jira/browse/YARN-7086 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Arun Suresh >Assignee: Manikandan R >Priority: Major > Attachments: YARN-7086.001.patch, YARN-7086.002.patch, > YARN-7086.Perf-test-case.patch > > > We have noticed in production two situations that can cause deadlocks and > cause scheduling of new containers to come to a halt, especially with regard > to applications that have a lot of live containers: > # When these applications release these containers in bulk. > # When these applications terminate abruptly due to some failure, the > scheduler releases all its live containers in a loop. > To handle the issues mentioned above, we have a patch in production to make > sure ALL container releases happen asynchronously - and it has served us well. > Opening this JIRA to gather feedback on whether this is a good idea generally (cc > [~leftnoteasy], [~jlowe], [~curino], [~kasha], [~subru], [~roniburd]) > BTW, in YARN-6251, we already have an asyncReleaseContainer() in the > AbstractYarnScheduler and a corresponding scheduler event, which is currently > used specifically for the container-update code paths (where the scheduler > releases temp containers which it creates for the update) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
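For context on the flows compared above, here is a minimal sketch of the batched asynchronous release pattern being measured, assuming a plain dispatcher thread and a batch size of 1K. Class and method names are illustrative assumptions, not the actual .002.patch.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch only; not the actual YARN-7086 patch.
public class AsyncContainerReleaser {
  private static final int BATCH_SIZE = 1000;
  private final BlockingQueue<List<String>> releaseQueue =
      new LinkedBlockingQueue<>();

  // Called from the allocate path: enqueue fixed-size batches and return
  // immediately instead of completing each container synchronously.
  public void releaseAsync(List<String> containerIds) {
    for (int i = 0; i < containerIds.size(); i += BATCH_SIZE) {
      int end = Math.min(i + BATCH_SIZE, containerIds.size());
      releaseQueue.offer(new ArrayList<>(containerIds.subList(i, end)));
    }
  }

  // Dispatcher thread drains batches and completes them off the allocate path.
  public void start() {
    Thread dispatcher = new Thread(() -> {
      try {
        while (true) {
          completeContainers(releaseQueue.take());
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "container-release-dispatcher");
    dispatcher.setDaemon(true);
    dispatcher.start();
  }

  // In the real scheduler this would transition each RMContainer to RELEASED.
  private void completeContainers(List<String> batch) {
    System.out.println("completed batch of " + batch.size() + " containers");
  }

  public static void main(String[] args) throws InterruptedException {
    AsyncContainerReleaser releaser = new AsyncContainerReleaser();
    releaser.start();
    List<String> ids = new ArrayList<>();
    for (int i = 0; i < 2500; i++) {
      ids.add("container_" + i);
    }
    releaser.releaseAsync(ids); // returns immediately; 3 batches drain behind it
    Thread.sleep(500);          // give the daemon dispatcher time to drain
  }
}
{code}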
[jira] [Updated] (YARN-7086) Release all containers asynchronously
[ https://issues.apache.org/jira/browse/YARN-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikandan R updated YARN-7086: --- Attachment: YARN-7086.Perf-test-case.patch > Release all containers asynchronously > > > Key: YARN-7086 > URL: https://issues.apache.org/jira/browse/YARN-7086 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Arun Suresh >Assignee: Manikandan R >Priority: Major > Attachments: YARN-7086.001.patch, YARN-7086.002.patch, > YARN-7086.Perf-test-case.patch > > > We have noticed in production two situations that can cause deadlocks and > cause scheduling of new containers to come to a halt, especially with regard > to applications that have a lot of live containers: > # When these applications release these containers in bulk. > # When these applications terminate abruptly due to some failure, the > scheduler releases all its live containers in a loop. > To handle the issues mentioned above, we have a patch in production to make > sure ALL container releases happen asynchronously - and it has served us well. > Opening this JIRA to gather feedback on whether this is a good idea generally (cc > [~leftnoteasy], [~jlowe], [~curino], [~kasha], [~subru], [~roniburd]) > BTW, in YARN-6251, we already have an asyncReleaseContainer() in the > AbstractYarnScheduler and a corresponding scheduler event, which is currently > used specifically for the container-update code paths (where the scheduler > releases temp containers which it creates for the update) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues
[ https://issues.apache.org/jira/browse/YARN-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623035#comment-16623035 ] Tao Yang edited comment on YARN-8804 at 9/21/18 4:17 AM: - Thanks [~jlowe], [~leftnoteasy] for your review and reply. For the volatile keyword, it is a mistake I made when copying from the headroom field in ResourceLimits; I should have removed it afterwards. The resourceLimits in the scheduling process is thread-safe because it isn't shared by multiple scheduling threads. Every scheduling thread creates a ResourceLimits instance at the beginning of the scheduling process, in CapacityScheduler#allocateOrReserveNewContainers or CapacityScheduler#allocateContainerOnSingleNode, and then passes it on. {quote} I think it would be cleaner if a queue could return an assignment result that not only indicated the allocation was skipped due to queue limits but also how much needs to be reserved as a result of that skipped assignment. {quote} Now we can get the reserved resource from {{childLimits.getHeadroom()}} for the leaf queue, then add it into the blockedHeadroom of the leaf/parent queue, so that later queues can get correct net limits through {{limit - blockedHeadroom}}. I think it's enough to solve this problem. Thoughts? {quote} The result would be less overhead for the normal scheduler loop, as we would only be adjusting when necessary rather than every time. {quote} Thanks for pointing this out. I will improve the calculation to avoid doing it every time by adding ResourceLimits#getNetLimit, which will do the calculation only when necessary rather than every time. {quote} From my analysis of YARN-8513, the scheduler tries to allocate containers to a queue when it will go beyond max capacity (used + allocating > max). But the resource committer will reject such a proposal. {quote} YARN-8513 is not the same problem as this issue, per the earlier comments from [~jlowe]; it seems similar to YARN-8771, whose problem is caused by a wrong calculation of needUnreservedResource in RegularContainerAllocator#assignContainer when the cluster has an empty resource type. But I am not sure they are the same problem. was (Author: tao yang): Thanks [~jlowe], [~leftnoteasy] for your review and reply. For the volatile keyword, it is a mistake I made when copying from the headroom field in ResourceLimits; I should have removed it afterwards. The resourceLimits in the scheduling process is thread-safe because it isn't shared by multiple scheduling threads. Every scheduling thread creates a ResourceLimits instance at the beginning of the scheduling process, in CapacityScheduler#allocateOrReserveNewContainers or CapacityScheduler#allocateContainerOnSingleNode, and then passes it on. {quote} I think it would be cleaner if a queue could return an assignment result that not only indicated the allocation was skipped due to queue limits but also how much needs to be reserved as a result of that skipped assignment. {quote} Now we can get the reserved resource from {{childLimits.getHeadroom()}} for the leaf queue, then add it into the blockedHeadroom of the leaf/parent queue, so that later queues can get correct net limits through {{limit - blockedHeadroom}}. I think it's enough to solve this problem. Thoughts? {quote} The result would be less overhead for the normal scheduler loop, as we would only be adjusting when necessary rather than every time. {quote} Thanks for pointing this out. I will improve the calculation to avoid doing it every time by adding ResourceLimits#getNetLimit, which will do the calculation only when necessary rather than every time.
{quote} From my analysis of YARN-8513, the scheduler tries to allocate containers to a queue when it will go beyond max capacity (used + allocating > max). But the resource committer will reject such a proposal. {quote} YARN-8513 is not the same problem as this issue, per the earlier comments from [~jlowe]; it seems similar to YARN-8771, which may be caused by a wrong calculation of needUnreservedResource in RegularContainerAllocator#assignContainer when the cluster has an empty resource type. But I am not sure they are the same problem. > resourceLimits may be wrongly calculated when leaf-queue is blocked in > cluster with 3+ level queues > --- > > Key: YARN-8804 > URL: https://issues.apache.org/jira/browse/YARN-8804 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8804.001.patch, YARN-8804.002.patch > > > This problem is due to YARN-4280: the parent queue will deduct a child queue's > headroom when the child queue has reached its resource limit and the skipped type > is QUEUE_LIMIT. The resource limits of the deepest parent queue will be correctly > calculated, but for a non-deepest parent queue, its headroom may be much more > than the sum of the reached-limit child queues' headroom, so that the resource > limit of the non-deepest parent may be much less than its true value and block > the allocation for later queues.
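As an aside for readers following the getNetLimit discussion above: here is a rough sketch of the net-limit idea, with the field and method names taken from the comment but all implementation details assumed rather than copied from the YARN-8804 patch.

{code:java}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

// Sketch only; the real logic would live in ResourceLimits in the patch.
public class ResourceLimitsSketch {
  private final Resource limit;
  // Headroom of child queues skipped due to QUEUE_LIMIT; later sibling
  // queues must not be allowed to consume it.
  private Resource blockedHeadroom = Resources.none();

  public ResourceLimitsSketch(Resource limit) {
    this.limit = limit;
  }

  public void addBlockedHeadroom(Resource headroom) {
    if (blockedHeadroom.equals(Resources.none())) {
      blockedHeadroom = Resources.clone(headroom); // never mutate the none() singleton
    } else {
      Resources.addTo(blockedHeadroom, headroom);
    }
  }

  // Compute limit - blockedHeadroom lazily, only when something was blocked,
  // so the common scheduler path pays no extra cost.
  public Resource getNetLimit() {
    if (!blockedHeadroom.equals(Resources.none())) {
      return Resources.subtract(limit, blockedHeadroom);
    }
    return limit;
  }

  public static void main(String[] args) {
    ResourceLimitsSketch limits =
        new ResourceLimitsSketch(Resource.newInstance(20 * 1024, 20));
    // A blocked leaf queue held back <2GB, 2core> of headroom:
    limits.addBlockedHeadroom(Resource.newInstance(2 * 1024, 2));
    System.out.println(limits.getNetLimit()); // <memory:18432, vCores:18>
  }
}
{code}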
[jira] [Commented] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623040#comment-16623040 ] Tao Yang commented on YARN-8513: Hi, [~cyfdecyf]. Does your cluster have an empty resource type? This problem seems similar to YARN-8771, whose problem is caused by a wrong calculation of needUnreservedResource in RegularContainerAllocator#assignContainer when the cluster has an empty resource type. > CapacityScheduler infinite loop when queue is near fully utilized > - > > Key: YARN-8513 > URL: https://issues.apache.org/jira/browse/YARN-8513 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, yarn >Affects Versions: 3.1.0, 2.9.1 > Environment: Ubuntu 14.04.5 and 16.04.4 > YARN is configured with one label and 5 queues. >Reporter: Chen Yufei >Priority: Major > Attachments: jstack-1.log, jstack-2.log, jstack-3.log, jstack-4.log, > jstack-5.log, top-during-lock.log, top-when-normal.log, yarn3-jstack1.log, > yarn3-jstack2.log, yarn3-jstack3.log, yarn3-jstack4.log, yarn3-jstack5.log, > yarn3-resourcemanager.log, yarn3-top > > > ResourceManager sometimes does not respond to any request when a queue is nearly > fully utilized. Sending SIGTERM won't stop the RM; only SIGKILL can. After an RM > restart, it can recover running jobs and start accepting new ones. > > It seems like the CapacityScheduler is in an infinite loop printing out the > following log messages (more than 25,000 lines in a second): > > {{2018-07-10 17:16:29,227 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: > assignedContainer queue=root usedCapacity=0.99816763 > absoluteUsedCapacity=0.99816763 used= > cluster=}} > {{2018-07-10 17:16:29,227 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: > Failed to accept allocation proposal}} > {{2018-07-10 17:16:29,227 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator: > assignedContainer application attempt=appattempt_1530619767030_1652_01 > container=null > queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@14420943 > clusterResource= type=NODE_LOCAL > requestedPartition=}} > > I have encountered this problem several times after upgrading to YARN 2.9.1, while > the same configuration works fine under version 2.7.3. > > YARN-4477 is an infinite loop bug in the FairScheduler; not sure if this is a > similar problem. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues
[ https://issues.apache.org/jira/browse/YARN-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623037#comment-16623037 ] Tao Yang commented on YARN-8804: Attached v2 patch to remove the volatile keyword and improve the calculation. > resourceLimits may be wrongly calculated when leaf-queue is blocked in > cluster with 3+ level queues > --- > > Key: YARN-8804 > URL: https://issues.apache.org/jira/browse/YARN-8804 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8804.001.patch, YARN-8804.002.patch > > > This problem is due to YARN-4280: the parent queue will deduct a child queue's > headroom when the child queue has reached its resource limit and the skipped type > is QUEUE_LIMIT. The resource limits of the deepest parent queue will be correctly > calculated, but for a non-deepest parent queue, its headroom may be much more > than the sum of the reached-limit child queues' headroom, so that the resource > limit of the non-deepest parent may be much less than its true value and block > the allocation for later queues. > To reproduce this problem with a UT: > (1) The cluster has two nodes whose node resources are both <10GB, 10core>, and > 3-level queues as below; among them the max-capacity of "c1" is 10 and the others are > all 100, so that the max-capacity of queue "c1" is <2GB, 2core> > {noformat} > Root > / | \ > a b c > 10 20 70 > | \ > c1 c2 > 10(max=10) 90 > {noformat} > (2) Submit app1 to queue "c1" and launch am1 (resource=<1GB, 1 core>) on nm1 > (3) Submit app2 to queue "b" and launch am2 (resource=<1GB, 1 core>) on nm1 > (4) app1 and app2 both ask for one <2GB, 1core> container. > (5) nm1 does 1 heartbeat > Now queue "c" has a lower capacity percentage than queue "b", so the allocation > sequence will be "a" -> "c" -> "b". > Queue "c1" has reached its queue limit, so the requests of app1 should be > pending; > the headroom of queue "c1" is <1GB, 1core> (= max-capacity - used), > the headroom of queue "c" is <18GB, 18core> (= max-capacity - used); > after allocation for queue "c", the resource limit of queue "b" will be wrongly > calculated as <2GB, 2core>, > and the headroom of queue "b" will be <1GB, 1core> (= resource-limit - used), > so the scheduler won't allocate a container for app2 on nm1 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues
[ https://issues.apache.org/jira/browse/YARN-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8804: --- Attachment: YARN-8804.002.patch > resourceLimits may be wrongly calculated when leaf-queue is blocked in > cluster with 3+ level queues > --- > > Key: YARN-8804 > URL: https://issues.apache.org/jira/browse/YARN-8804 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8804.001.patch, YARN-8804.002.patch > > > This problem is due to YARN-4280: the parent queue will deduct a child queue's > headroom when the child queue has reached its resource limit and the skipped type > is QUEUE_LIMIT. The resource limits of the deepest parent queue will be correctly > calculated, but for a non-deepest parent queue, its headroom may be much more > than the sum of the reached-limit child queues' headroom, so that the resource > limit of the non-deepest parent may be much less than its true value and block > the allocation for later queues. > To reproduce this problem with a UT: > (1) The cluster has two nodes whose node resources are both <10GB, 10core>, and > 3-level queues as below; among them the max-capacity of "c1" is 10 and the others are > all 100, so that the max-capacity of queue "c1" is <2GB, 2core> > {noformat} > Root > / | \ > a b c > 10 20 70 > | \ > c1 c2 > 10(max=10) 90 > {noformat} > (2) Submit app1 to queue "c1" and launch am1 (resource=<1GB, 1 core>) on nm1 > (3) Submit app2 to queue "b" and launch am2 (resource=<1GB, 1 core>) on nm1 > (4) app1 and app2 both ask for one <2GB, 1core> container. > (5) nm1 does 1 heartbeat > Now queue "c" has a lower capacity percentage than queue "b", so the allocation > sequence will be "a" -> "c" -> "b". > Queue "c1" has reached its queue limit, so the requests of app1 should be > pending; > the headroom of queue "c1" is <1GB, 1core> (= max-capacity - used), > the headroom of queue "c" is <18GB, 18core> (= max-capacity - used); > after allocation for queue "c", the resource limit of queue "b" will be wrongly > calculated as <2GB, 2core>, > and the headroom of queue "b" will be <1GB, 1core> (= resource-limit - used), > so the scheduler won't allocate a container for app2 on nm1 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues
[ https://issues.apache.org/jira/browse/YARN-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623035#comment-16623035 ] Tao Yang edited comment on YARN-8804 at 9/21/18 4:05 AM: - Thanks [~jlowe], [~leftnoteasy] for your review and reply. For the volatile keyword, it is a mistake I made when copying from the headroom field in ResourceLimits; I should have removed it afterwards. The resourceLimits in the scheduling process is thread-safe because it isn't shared by multiple scheduling threads. Every scheduling thread creates a ResourceLimits instance at the beginning of the scheduling process, in CapacityScheduler#allocateOrReserveNewContainers or CapacityScheduler#allocateContainerOnSingleNode, and then passes it on. {quote} I think it would be cleaner if a queue could return an assignment result that not only indicated the allocation was skipped due to queue limits but also how much needs to be reserved as a result of that skipped assignment. {quote} Now we can get the reserved resource from {{childLimits.getHeadroom()}} for the leaf queue, then add it into the blockedHeadroom of the leaf/parent queue, so that later queues can get correct net limits through {{limit - blockedHeadroom}}. I think it's enough to solve this problem. Thoughts? {quote} The result would be less overhead for the normal scheduler loop, as we would only be adjusting when necessary rather than every time. {quote} Thanks for pointing this out. I will improve the calculation to avoid doing it every time by adding ResourceLimits#getNetLimit, which will do the calculation only when necessary rather than every time. {quote} From my analysis of YARN-8513, the scheduler tries to allocate containers to a queue when it will go beyond max capacity (used + allocating > max). But the resource committer will reject such a proposal. {quote} YARN-8513 is not the same problem as this issue, per the earlier comments from [~jlowe]; it seems similar to YARN-8771, which may be caused by a wrong calculation of needUnreservedResource in RegularContainerAllocator#assignContainer when the cluster has an empty resource type. But I am not sure they are the same problem. was (Author: tao yang): Thanks [~jlowe], [~leftnoteasy] for your review and reply. For the volatile keyword, it is a mistake I made when copying from the headroom field in ResourceLimits; I should have removed it afterwards. The resourceLimits in the scheduling process is thread-safe because it isn't shared by multiple scheduling threads. Every scheduling thread creates a ResourceLimits instance at the beginning of the scheduling process, in CapacityScheduler#allocateOrReserveNewContainers or CapacityScheduler#allocateContainerOnSingleNode, and then passes it on. {quote} I think it would be cleaner if a queue could return an assignment result that not only indicated the allocation was skipped due to queue limits but also how much needs to be reserved as a result of that skipped assignment. {quote} Now we can get the reserved resource from {{childLimits.getHeadroom()}} for the leaf queue, then add it into the blockedHeadroom of the leaf/parent queue, so that later queues can get correct net limits through {{limit - blockedHeadroom}}. {quote} The result would be less overhead for the normal scheduler loop, as we would only be adjusting when necessary rather than every time. {quote} Thanks for pointing this out. I will improve the calculation to avoid doing it every time by adding ResourceLimits#getNetLimit, which will do the calculation only when necessary rather than every time.
{quote} From my analysis of YARN-8513, the scheduler tries to allocate containers to a queue when it will go beyond max capacity (used + allocating > max). But the resource committer will reject such a proposal. {quote} YARN-8513 is not the same problem as this issue, per the earlier comments from [~jlowe]; it seems similar to YARN-8771, which may be caused by a wrong calculation of needUnreservedResource in RegularContainerAllocator#assignContainer when the cluster has an empty resource type. But I am not sure they are the same problem. > resourceLimits may be wrongly calculated when leaf-queue is blocked in > cluster with 3+ level queues > --- > > Key: YARN-8804 > URL: https://issues.apache.org/jira/browse/YARN-8804 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8804.001.patch > > > This problem is due to YARN-4280: the parent queue will deduct a child queue's > headroom when the child queue has reached its resource limit and the skipped type > is QUEUE_LIMIT. The resource limits of the deepest parent queue will be correctly > calculated, but for a non-deepest parent queue, its headroom may be much more > than the sum of the reached-limit child queues' headroom, so that the resource > limit of the non-deepest parent may be much less than its true value and block > the allocation for later queues.
[jira] [Commented] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues
[ https://issues.apache.org/jira/browse/YARN-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623035#comment-16623035 ] Tao Yang commented on YARN-8804: Thanks [~jlowe], [~leftnoteasy] for your review and reply. For the volatile keyword, it is a mistake I made when copying from the headroom field in ResourceLimits; I should have removed it afterwards. The resourceLimits in the scheduling process is thread-safe because it isn't shared by multiple scheduling threads. Every scheduling thread creates a ResourceLimits instance at the beginning of the scheduling process, in CapacityScheduler#allocateOrReserveNewContainers or CapacityScheduler#allocateContainerOnSingleNode, and then passes it on. {quote} I think it would be cleaner if a queue could return an assignment result that not only indicated the allocation was skipped due to queue limits but also how much needs to be reserved as a result of that skipped assignment. {quote} Now we can get the reserved resource from {{childLimits.getHeadroom()}} for the leaf queue, then add it into the blockedHeadroom of the leaf/parent queue, so that later queues can get correct net limits through {{limit - blockedHeadroom}}. {quote} The result would be less overhead for the normal scheduler loop, as we would only be adjusting when necessary rather than every time. {quote} Thanks for pointing this out. I will improve the calculation to avoid doing it every time by adding ResourceLimits#getNetLimit, which will do the calculation only when necessary rather than every time. {quote} From my analysis of YARN-8513, the scheduler tries to allocate containers to a queue when it will go beyond max capacity (used + allocating > max). But the resource committer will reject such a proposal. {quote} YARN-8513 is not the same problem as this issue, per the earlier comments from [~jlowe]; it seems similar to YARN-8771, which may be caused by a wrong calculation of needUnreservedResource in RegularContainerAllocator#assignContainer when the cluster has an empty resource type. But I am not sure they are the same problem. > resourceLimits may be wrongly calculated when leaf-queue is blocked in > cluster with 3+ level queues > --- > > Key: YARN-8804 > URL: https://issues.apache.org/jira/browse/YARN-8804 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8804.001.patch > > > This problem is due to YARN-4280: the parent queue will deduct a child queue's > headroom when the child queue has reached its resource limit and the skipped type > is QUEUE_LIMIT. The resource limits of the deepest parent queue will be correctly > calculated, but for a non-deepest parent queue, its headroom may be much more > than the sum of the reached-limit child queues' headroom, so that the resource > limit of the non-deepest parent may be much less than its true value and block > the allocation for later queues. > To reproduce this problem with a UT: > (1) The cluster has two nodes whose node resources are both <10GB, 10core>, and > 3-level queues as below; among them the max-capacity of "c1" is 10 and the others are > all 100, so that the max-capacity of queue "c1" is <2GB, 2core> > {noformat} > Root > / | \ > a b c > 10 20 70 > | \ > c1 c2 > 10(max=10) 90 > {noformat} > (2) Submit app1 to queue "c1" and launch am1 (resource=<1GB, 1 core>) on nm1 > (3) Submit app2 to queue "b" and launch am2 (resource=<1GB, 1 core>) on nm1 > (4) app1 and app2 both ask for one <2GB, 1core> container.
> (5) nm1 does 1 heartbeat > Now queue "c" has a lower capacity percentage than queue "b", so the allocation > sequence will be "a" -> "c" -> "b". > Queue "c1" has reached its queue limit, so the requests of app1 should be > pending; > the headroom of queue "c1" is <1GB, 1core> (= max-capacity - used), > the headroom of queue "c" is <18GB, 18core> (= max-capacity - used); > after allocation for queue "c", the resource limit of queue "b" will be wrongly > calculated as <2GB, 2core>, > and the headroom of queue "b" will be <1GB, 1core> (= resource-limit - used), > so the scheduler won't allocate a container for app2 on nm1 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
[ https://issues.apache.org/jira/browse/YARN-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622940#comment-16622940 ] Hadoop QA commented on YARN-8809: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} YARN-1011 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 31m 3s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 23s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} YARN-1011 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 35s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 67m 56s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}131m 26s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:c6870a1 | | JIRA Issue | YARN-8809 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940692/YARN-8809-YARN-1011.01.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 5661790892f3 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | YARN-1011 / 36ec27e | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21922/testReport/ | | Max. process+thread count | 1006 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21922/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Fair
[jira] [Commented] (YARN-8808) Use aggregate container utilization instead of node utilization to determine resources available for oversubscription
[ https://issues.apache.org/jira/browse/YARN-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622924#comment-16622924 ] Hadoop QA commented on YARN-8808: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} YARN-1011 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 17s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 59s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 8s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} YARN-1011 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 48s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 71m 22s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}129m 42s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:c6870a1 | | JIRA Issue | YARN-8808 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940684/YARN-8808-YARN-1011.00.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux fead03a97655 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | YARN-1011 / 36ec27e | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/21920/artifact/out/whitespace-eol.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21920/testReport/ | | Max. process+thread count | 1001 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output
[jira] [Commented] (YARN-8807) FairScheduler crashes RM with oversubscription turned on if an application is killed.
[ https://issues.apache.org/jira/browse/YARN-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622900#comment-16622900 ] Hadoop QA commented on YARN-8807: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 37s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} YARN-1011 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 10s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 8s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} YARN-1011 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 19s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 25s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}143m 48s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:c6870a1 | | JIRA Issue | YARN-8807 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940672/YARN-8807-YARN-1011.00.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 2a50a3f8f61e 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | YARN-1011 / 36ec27e | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/21916/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21916/testReport/ | | Max. process+thread count | 1048 (vs. ulimit of 1) | | modules | C:
[jira] [Commented] (YARN-8658) [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-8658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622887#comment-16622887 ] Hadoop QA commented on YARN-8658: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 41s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 12s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 28s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 12s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 32s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 35s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 2 new + 2 unchanged - 0 fixed = 4 total (was 2) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 30s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 4s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 46m 5s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:a716388 | | JIRA Issue | YARN-8658 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940681/YARN-8658-branch-2.11.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux d1c45453c89e 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-2 / 3a6ad9c | | maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) | | Default Java | 1.7.0_181 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/21921/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21921/testReport/ | | Max. process+thread count | 139 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common
[jira] [Commented] (YARN-7599) [GPG] ApplicationCleaner in Global Policy Generator
[ https://issues.apache.org/jira/browse/YARN-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622862#comment-16622862 ] Hadoop QA commented on YARN-7599: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} YARN-7402 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 24s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 29s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 20s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 30s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 46s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 49s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 27s{color} | {color:green} YARN-7402 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 32s{color} | {color:green} YARN-7402 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 18s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 21s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 49s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 24s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 31s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 45s{color} | {color:green} hadoop-yarn-server-globalpolicygenerator in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}104m 24s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-7599 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940673/YARN-7599-YARN-7402.v7.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
[jira] [Updated] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
[ https://issues.apache.org/jira/browse/YARN-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8809: - Attachment: YARN-8809-YARN-1011.01.patch > Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers > are released. > --- > > Key: YARN-8809 > URL: https://issues.apache.org/jira/browse/YARN-8809 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8809-YARN-1011.00.patch, > YARN-8809-YARN-1011.01.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command
[ https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622850#comment-16622850 ] Eric Yang commented on YARN-8777: - [~ebadger] I am confused by your statement. We have been using a comma-delimited launch command in the .cmd file for ENTRYPOINT in Hadoop 3.1.1. YARN-8805 seems to be a request to make sure the user-supplied space-delimited command is converted to the comma-delimited format for exec. The requested change is in the Java-side serialization. There are no changes to the C code or the .cmd format, to prevent breaking backward compatibility, if I understand the problem correctly. The format is what we agreed upon without rewriting the .cmd serialization technique. What is the concern with launch_command deserialization delimited by commas on the C side? Should I change launch_command to a single parameter, like I suggested, as a single binary, to prevent an inconsistent length of argv between container-executor and docker exec? > Container Executor C binary change to execute interactive docker command > > > Key: YARN-8777 > URL: https://issues.apache.org/jira/browse/YARN-8777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Eric Yang >Priority: Major > Labels: Docker > Attachments: YARN-8777.001.patch, YARN-8777.002.patch, > YARN-8777.003.patch > > > Since Container Executor provides container execution using the native > container-executor binary, we also need to make changes to accept the new > “dockerExec” method to invoke the corresponding native function to execute the > docker exec command against the running container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
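For illustration, here is a minimal sketch of the Java-side conversion being discussed: turning a space-delimited user command into the comma-delimited launch_command value written to the .cmd file. The method and the .cmd key name below are assumptions for illustration, not the actual serializer.

{code:java}
import java.util.Arrays;
import java.util.stream.Collectors;

public class LaunchCommandSerializer {
  // Hypothetical sketch: split the user command on whitespace and join the
  // tokens with commas, the delimiter the C side splits on to rebuild argv.
  static String toCmdValue(String userCommand) {
    return Arrays.stream(userCommand.trim().split("\\s+"))
        .collect(Collectors.joining(","));
  }

  public static void main(String[] args) {
    // e.g. "bash -c ls" becomes "bash,-c,ls" in the .cmd file
    System.out.println("launch-command=" + toCmdValue("bash -c ls"));
  }
}
{code}

A real converter would also need to handle quoted arguments that contain spaces, which naive whitespace splitting breaks; pinning that behavior down is presumably part of what YARN-8805 is about.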
[jira] [Commented] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
[ https://issues.apache.org/jira/browse/YARN-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622845#comment-16622845 ] Haibo Chen commented on YARN-8809: -- I see. Will make the change and upload a new patch shortly. > Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers > are released. > --- > > Key: YARN-8809 > URL: https://issues.apache.org/jira/browse/YARN-8809 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8809-YARN-1011.00.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8810) Yarn Service: discrepancy between hashcode and equals of ConfigFile
[ https://issues.apache.org/jira/browse/YARN-8810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8810: Priority: Minor (was: Major) > Yarn Service: discrepancy between hashcode and equals of ConfigFile > --- > > Key: YARN-8810 > URL: https://issues.apache.org/jira/browse/YARN-8810 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Minor > > The {{ConfigFile}} class {{equals}} method doesn't check the equality of > {{properties}}. The {{hashCode}} does include the {{properties}} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8810) Yarn Service: discrepancy between hashcode and equals of ConfigFile
Chandni Singh created YARN-8810: --- Summary: Yarn Service: discrepancy between hashcode and equals of ConfigFile Key: YARN-8810 URL: https://issues.apache.org/jira/browse/YARN-8810 Project: Hadoop YARN Issue Type: Bug Reporter: Chandni Singh Assignee: Chandni Singh The {{ConfigFile}} class {{equals}} method doesn't check the equality of {{properties}}. The {{hashCode}} does include the {{properties}} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
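The discrepancy described above is the classic {{equals}}/{{hashCode}} mismatch: if {{equals}} ignores a field that {{hashCode}} includes, two instances can compare equal yet hash differently, breaking {{HashMap}}/{{HashSet}} behavior. A minimal sketch of the bug pattern and its fix, using assumed field names rather than the real {{ConfigFile}} source:

{code:java}
import java.util.Map;
import java.util.Objects;

// Sketch only; "srcFile" and "properties" are assumed fields. The fix
// is to compare in equals() every field that hashCode() uses.
class ConfigFileSketch {
  String srcFile;
  Map<String, String> properties;

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof ConfigFileSketch)) {
      return false;
    }
    ConfigFileSketch that = (ConfigFileSketch) o;
    // The previously missing piece: comparing properties as well.
    return Objects.equals(srcFile, that.srcFile)
        && Objects.equals(properties, that.properties);
  }

  @Override
  public int hashCode() {
    return Objects.hash(srcFile, properties);
  }
}
{code}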
[jira] [Commented] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
[ https://issues.apache.org/jira/browse/YARN-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622822#comment-16622822 ] Arun Suresh commented on YARN-8809: --- I believe {{TestOpportunisticContainerAllocatorAMService}} does exercise the containerComplete code paths in the CapacityScheduler.. So please go ahead and make the change in the CapScheduler as well.. > Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers > are released. > --- > > Key: YARN-8809 > URL: https://issues.apache.org/jira/browse/YARN-8809 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8809-YARN-1011.00.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
[ https://issues.apache.org/jira/browse/YARN-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622822#comment-16622822 ] Arun Suresh edited comment on YARN-8809 at 9/20/18 10:42 PM: - I believe {{TestOpportunisticContainerAllocatorAMService}} does exercise the containerComplete code paths in the CapacityScheduler.. So please go ahead and make the change in the CapScheduler as well.. was (Author: asuresh): I believe {{TestOpportunisticContainerAllocatorAMService}} does exercise the of containerComplete code paths in the CapacityScheduler.. So please go ahead and make the change in the CapScheduler as well.. > Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers > are released. > --- > > Key: YARN-8809 > URL: https://issues.apache.org/jira/browse/YARN-8809 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8809-YARN-1011.00.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8658) [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-8658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622813#comment-16622813 ] Hadoop QA commented on YARN-8658: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 42s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 50s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 20s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 7s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 32s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 2 new + 1 unchanged - 0 fixed = 3 total (was 1) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 30s{color} | {color:red} hadoop-yarn-server-common in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 59s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 44m 35s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.metrics.TestAMRMClientRelayerMetrics | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:a716388 | | JIRA Issue | YARN-8658 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940677/YARN-8658-branch-2.10.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux b398543fefb4 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-2 / 3a6ad9c | | maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) | | Default Java | 1.7.0_181 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/21918/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/21918/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-common.txt | | Test Results |
[jira] [Updated] (YARN-8808) Use aggregate container utilization instead of node utilization to determine resources available for oversubscription
[ https://issues.apache.org/jira/browse/YARN-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8808: - Attachment: YARN-8808-YARN-1011.00.patch > Use aggregate container utilization instead of node utilization to determine > resources available for oversubscription > - > > Key: YARN-8808 > URL: https://issues.apache.org/jira/browse/YARN-8808 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8808-YARN-1011.00.patch > > > Resource oversubscription should be bound to the amount of the resources that > can be allocated to containers, hence the allocation threshold should be with > respect to aggregate container utilization. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
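A rough sketch of the idea in the description, under my own simplifying assumption that the allocation threshold is a flat fraction of node capacity: the headroom offered for oversubscription is bounded by what running containers actually use, not by whole-node utilization (which also counts daemons and other non-container processes).

{code:java}
// Illustrative only; the names and formula shape are assumptions,
// not the YARN-1011 branch code.
public class OversubscriptionSketch {

  // Memory that could be offered to OPPORTUNISTIC containers on a node.
  static long availableForOversubscription(long capacityMb,
      long aggregateContainerUtilizationMb, double threshold) {
    long allocatable = (long) (capacityMb * threshold);
    return Math.max(0, allocatable - aggregateContainerUtilizationMb);
  }

  public static void main(String[] args) {
    // e.g. a 64GB node, containers using 20GB, 0.8 threshold -> ~31GB.
    System.out.println(
        availableForOversubscription(64 * 1024, 20 * 1024, 0.8));
  }
}
{code}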
[jira] [Commented] (YARN-8658) [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-8658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622811#comment-16622811 ] Hadoop QA commented on YARN-8658: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 19m 41s{color} | {color:red} Docker failed to build yetus/hadoop:a716388. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-8658 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940681/YARN-8658-branch-2.11.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21919/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor > -- > > Key: YARN-8658 > URL: https://issues.apache.org/jira/browse/YARN-8658 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Young Chen >Priority: Major > Fix For: 3.2.0 > > Attachments: YARN-8658-branch-2.09.patch, > YARN-8658-branch-2.10.patch, YARN-8658-branch-2.11.patch, YARN-8658.01.patch, > YARN-8658.02.patch, YARN-8658.03.patch, YARN-8658.04.patch, > YARN-8658.05.patch, YARN-8658.06.patch, YARN-8658.07.patch, > YARN-8658.08.patch, YARN-8658.09.patch > > > AMRMClientRelayer (YARN-7900) is introduced for stateful > FederationInterceptor (YARN-7899), to keep track of all pending requests sent > to every subcluster YarnRM. We need to add metrics for AMRMClientRelayer to > show the state of things in FederationInterceptor. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8665) Yarn Service Upgrade: Support cancelling upgrade
[ https://issues.apache.org/jira/browse/YARN-8665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622803#comment-16622803 ] Chandni Singh commented on YARN-8665: - [~eyang] thanks for the review. About point 2: {quote} The upgrades directory in HDFS is not removed after cancel. Not sure if this can cause future problems. {quote} I am deleting the dir for the cancelled version. Here is the code in {{ServiceManager}} which gets executed during finalization: {code} if (cancelUpgrade) { fs.deleteClusterUpgradeDir(getName(), cancelledVersion); } else { fs.deleteClusterUpgradeDir(getName(), upgradeVersion); } {code} The parent directory {{upgrade}} is not deleted, but that doesn't create any issue. We create version directories under it, so if {{upgrade}} doesn't exist, it gets created; otherwise just the subdir is created. I will address the rest of the comments in the new patch. > Yarn Service Upgrade: Support cancelling upgrade > - > > Key: YARN-8665 > URL: https://issues.apache.org/jira/browse/YARN-8665 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8665.001.patch, YARN-8665.002.patch, > YARN-8665.003.patch > > > When a service is upgraded without auto-finalization or express upgrade, then > the upgrade can be cancelled. This provides the user ability to test upgrade > of a single instance and if that doesn't go well, they get a chance to cancel > it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
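On the leftover parent directory: Hadoop's {{FileSystem.mkdirs}} creates any missing parents and is a no-op for parents that already exist, which is why creating the next version subdirectory works whether or not {{upgrade}} is still there. A small sketch (the paths are illustrative, modeled on those mentioned in this thread):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Why the leftover "upgrade" parent is harmless: mkdirs() recreates it
// on demand when the next version subdirectory is created.
public class UpgradeDirSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path versionDir = new Path(".yarn/services/abc/upgrade/7");
    fs.mkdirs(versionDir); // also creates "upgrade" if it is gone
  }
}
{code}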
[jira] [Commented] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
[ https://issues.apache.org/jira/browse/YARN-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622797#comment-16622797 ] Haibo Chen commented on YARN-8809: -- Totally agreed. This goes back to my intention to keep the behavior change in FS only though. In theory, if we were to include the change you are suggesting, there should be a CS unit test too that covers this behavior change. How about we do that in another jira, which can be self-contained more or less? > Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers > are released. > --- > > Key: YARN-8809 > URL: https://issues.apache.org/jira/browse/YARN-8809 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8809-YARN-1011.00.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8658) [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-8658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622788#comment-16622788 ] Young Chen commented on YARN-8658: -- Fixed unit test failure due to bug introduced by bad merge. > [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor > -- > > Key: YARN-8658 > URL: https://issues.apache.org/jira/browse/YARN-8658 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Young Chen >Priority: Major > Fix For: 3.2.0 > > Attachments: YARN-8658-branch-2.09.patch, > YARN-8658-branch-2.10.patch, YARN-8658-branch-2.11.patch, YARN-8658.01.patch, > YARN-8658.02.patch, YARN-8658.03.patch, YARN-8658.04.patch, > YARN-8658.05.patch, YARN-8658.06.patch, YARN-8658.07.patch, > YARN-8658.08.patch, YARN-8658.09.patch > > > AMRMClientRelayer (YARN-7900) is introduced for stateful > FederationInterceptor (YARN-7899), to keep track of all pending requests sent > to every subcluster YarnRM. We need to add metrics for AMRMClientRelayer to > show the state of things in FederationInterceptor. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8658) [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-8658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Young Chen updated YARN-8658: - Attachment: YARN-8658-branch-2.11.patch > [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor > -- > > Key: YARN-8658 > URL: https://issues.apache.org/jira/browse/YARN-8658 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Young Chen >Priority: Major > Fix For: 3.2.0 > > Attachments: YARN-8658-branch-2.09.patch, > YARN-8658-branch-2.10.patch, YARN-8658-branch-2.11.patch, YARN-8658.01.patch, > YARN-8658.02.patch, YARN-8658.03.patch, YARN-8658.04.patch, > YARN-8658.05.patch, YARN-8658.06.patch, YARN-8658.07.patch, > YARN-8658.08.patch, YARN-8658.09.patch > > > AMRMClientRelayer (YARN-7900) is introduced for stateful > FederationInterceptor (YARN-7899), to keep track of all pending requests sent > to every subcluster YarnRM. We need to add metrics for AMRMClientRelayer to > show the state of things in FederationInterceptor. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
[ https://issues.apache.org/jira/browse/YARN-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622784#comment-16622784 ] Arun Suresh commented on YARN-8809: --- Got it... So, in essence, FS has a unified code path for completing O and G containers, but it is different in the CS. Maybe it would be more intuitive if we moved the current implementation of the {{completeOpportunisticContainerInternal}} method from the AbstractYarnScheduler into the CS and, in the AbstractYarnScheduler, had the default implementation of {{completeOpportunisticContainerInternal}} just call {{completeGuaranteedContainerInternal}}? > Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers > are released. > --- > > Key: YARN-8809 > URL: https://issues.apache.org/jira/browse/YARN-8809 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8809-YARN-1011.00.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
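A rough Java sketch of that proposal, with simplified, assumed signatures (the real methods presumably take an RMContainer and an event type):

{code:java}
// Sketch of the refactor suggested above; RMContainer is stood in for
// by Object to keep the sketch self-contained.
abstract class AbstractYarnSchedulerSketch {

  // Default: complete an OPPORTUNISTIC container exactly like a
  // GUARANTEED one, the unified path FairScheduler already uses.
  protected void completeOpportunisticContainerInternal(Object rmContainer) {
    completeGuaranteedContainerInternal(rmContainer);
  }

  protected abstract void completeGuaranteedContainerInternal(
      Object rmContainer);
}

class CapacitySchedulerSketch extends AbstractYarnSchedulerSketch {

  @Override
  protected void completeOpportunisticContainerInternal(Object rmContainer) {
    // The current AbstractYarnScheduler implementation would move here,
    // keeping CS behavior unchanged until it adopts the unified path.
  }

  @Override
  protected void completeGuaranteedContainerInternal(Object rmContainer) {
    // Release resources, update queue metrics, etc.
  }
}
{code}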
[jira] [Commented] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues
[ https://issues.apache.org/jira/browse/YARN-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622771#comment-16622771 ] Jason Lowe commented on YARN-8804: -- From the looks of YARN-8513 this appears to be a separate issue. Looks like YARN-8513 describes an infinite loop where an allocation keeps getting rejected, whereas this describes a situation where an allocation never occurs and therefore cannot be rejected by the committer. > resourceLimits may be wrongly calculated when leaf-queue is blocked in > cluster with 3+ level queues > --- > > Key: YARN-8804 > URL: https://issues.apache.org/jira/browse/YARN-8804 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8804.001.patch > > > This problem is due to YARN-4280: a parent queue will deduct a child queue's > headroom when the child queue has reached its resource limit and the skipped type > is QUEUE_LIMIT. The resource limit of the deepest parent queue will be correctly > calculated, but for a non-deepest parent queue, its headroom may be much more > than the sum of the reached-limit child queues' headroom, so the resource > limit of a non-deepest parent may be much less than its true value and block > the allocation for later queues. > To reproduce this problem with a UT: > (1) The cluster has two nodes whose node resources are both <10GB, 10core> and > 3-level queues as below; among them the max-capacity of "c1" is 10 and the others are > all 100, so that the max-capacity of queue "c1" is <2GB, 2core> > {noformat} > Root > / | \ > a b c > 10 20 70 > | \ > c1 c2 > 10(max=10) 90 > {noformat} > (2) Submit app1 to queue "c1" and launch am1(resource=<1GB, 1 core>) on nm1 > (3) Submit app2 to queue "b" and launch am2(resource=<1GB, 1 core>) on nm1 > (4) app1 and app2 each ask for one <2GB, 1core> container. > (5) nm1 does 1 heartbeat > Now queue "c" has a lower capacity percentage than queue "b", so the allocation > sequence will be "a" -> "c" -> "b"; > queue "c1" has reached its queue limit, so requests of app1 should be > pending; > headroom of queue "c1" is <1GB, 1core> (=max-capacity - used); > headroom of queue "c" is <18GB, 18core> (=max-capacity - used); > after allocation for queue "c", the resource limit of queue "b" will be wrongly > calculated as <2GB, 2core>, > and the headroom of queue "b" will be <1GB, 1core> (=resource-limit - used), > so the scheduler won't allocate one container for app2 on nm1 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
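Each parenthetical in the description above uses the same relation, headroom = max-capacity - used, computed per resource dimension. A tiny sketch replaying the queue "c1" numbers from the repro:

{code:java}
// headroom = max-capacity - used, per resource dimension.
public class HeadroomSketch {

  static long headroom(long maxCapacity, long used) {
    return Math.max(0, maxCapacity - used);
  }

  public static void main(String[] args) {
    // Queue "c1" in GB: max-capacity 2, used 1 (am1) -> headroom 1.
    System.out.println(headroom(2, 1)); // 1
  }
}
{code}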
[jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers
[ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622767#comment-16622767 ] Haibo Chen commented on YARN-1011: -- I'm +1 on moving this. > [Umbrella] Schedule containers based on utilization of currently allocated > containers > - > > Key: YARN-1011 > URL: https://issues.apache.org/jira/browse/YARN-1011 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Arun C Murthy >Assignee: Karthik Kambatla >Priority: Major > Attachments: patch-for-yarn-1011.patch, yarn-1011-design-v0.pdf, > yarn-1011-design-v1.pdf, yarn-1011-design-v2.pdf, yarn-1011-design-v3.pdf > > > Currently RM allocates containers and assumes resources allocated are > utilized. > RM can, and should, get to a point where it measures utilization of allocated > containers and, if appropriate, allocate more (speculative?) containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8665) Yarn Service Upgrade: Support cancelling upgrade
[ https://issues.apache.org/jira/browse/YARN-8665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622765#comment-16622765 ] Eric Yang commented on YARN-8665: - [~csingh] Thank you for the patch. Patch 3 works correctly in my tests. A few nitpicks: # The new version of the components directory in HDFS is not removed (i.e., .yarn/services/abc/components/6). # The upgrades directory in HDFS is not removed after cancel. Not sure if this can cause future problems. # ServiceTestUtils.java: commented-out code can be removed. # The hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/api/records/Container.java changes are not necessary. Other than those minor issues, the patch looks good to me. > Yarn Service Upgrade: Support cancelling upgrade > - > > Key: YARN-8665 > URL: https://issues.apache.org/jira/browse/YARN-8665 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8665.001.patch, YARN-8665.002.patch, > YARN-8665.003.patch > > > When a service is upgraded without auto-finalization or express upgrade, then > the upgrade can be cancelled. This provides the user ability to test upgrade > of a single instance and if that doesn't go well, they get a chance to cancel > it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
[ https://issues.apache.org/jira/browse/YARN-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622762#comment-16622762 ] Haibo Chen commented on YARN-8809: -- The failed unit tests are all unrelated; see YARN-7387 and YARN-8433. > Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers > are released. > --- > > Key: YARN-8809 > URL: https://issues.apache.org/jira/browse/YARN-8809 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8809-YARN-1011.00.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8658) [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-8658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622752#comment-16622752 ] Young Chen edited comment on YARN-8658 at 9/20/18 9:50 PM: --- Fixed indentation issues from merging conflicts. was (Author: youchen): Fixed indentation issues from merging conflicts. yarn-common unit test failure is unrelated. > [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor > -- > > Key: YARN-8658 > URL: https://issues.apache.org/jira/browse/YARN-8658 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Young Chen >Priority: Major > Fix For: 3.2.0 > > Attachments: YARN-8658-branch-2.09.patch, > YARN-8658-branch-2.10.patch, YARN-8658.01.patch, YARN-8658.02.patch, > YARN-8658.03.patch, YARN-8658.04.patch, YARN-8658.05.patch, > YARN-8658.06.patch, YARN-8658.07.patch, YARN-8658.08.patch, YARN-8658.09.patch > > > AMRMClientRelayer (YARN-7900) is introduced for stateful > FederationInterceptor (YARN-7899), to keep track of all pending requests sent > to every subcluster YarnRM. We need to add metrics for AMRMClientRelayer to > show the state of things in FederationInterceptor. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8665) Yarn Service Upgrade: Support cancelling upgrade
[ https://issues.apache.org/jira/browse/YARN-8665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622756#comment-16622756 ] Hadoop QA commented on YARN-8665: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 41s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 40s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 18s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 30s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 3 new + 458 unchanged - 4 fixed = 461 total (was 462) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 18s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 33s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 23s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m 49s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 25m 24s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 23s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 13s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}154m 22s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestNMProxy | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce
[jira] [Commented] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
[ https://issues.apache.org/jira/browse/YARN-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622754#comment-16622754 ] Haibo Chen commented on YARN-8809: -- [~asuresh] The core change is the overriding of completedOpportunisticContainerInternal() in FairScheduler. The behavior in AbstractYarnScheduler is not changed at all. But in FairScheduler, handling the completion of an opportunistic container is now the same as that of a guaranteed container, which updates the queue metrics correctly. The change would in theory apply to the capacity scheduler as well, but I can see the locking in CS is different from FS, hence I have left that change to folks who are more familiar with CS. > Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers > are released. > --- > > Key: YARN-8809 > URL: https://issues.apache.org/jira/browse/YARN-8809 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8809-YARN-1011.00.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8658) [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-8658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622752#comment-16622752 ] Young Chen commented on YARN-8658: -- Fixed indentation issues from merging conflicts. yarn-common unit test failure is unrelated. > [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor > -- > > Key: YARN-8658 > URL: https://issues.apache.org/jira/browse/YARN-8658 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Young Chen >Priority: Major > Fix For: 3.2.0 > > Attachments: YARN-8658-branch-2.09.patch, > YARN-8658-branch-2.10.patch, YARN-8658.01.patch, YARN-8658.02.patch, > YARN-8658.03.patch, YARN-8658.04.patch, YARN-8658.05.patch, > YARN-8658.06.patch, YARN-8658.07.patch, YARN-8658.08.patch, YARN-8658.09.patch > > > AMRMClientRelayer (YARN-7900) is introduced for stateful > FederationInterceptor (YARN-7899), to keep track of all pending requests sent > to every subcluster YarnRM. We need to add metrics for AMRMClientRelayer to > show the state of things in FederationInterceptor. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8658) [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-8658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Young Chen updated YARN-8658: - Attachment: YARN-8658-branch-2.10.patch > [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor > -- > > Key: YARN-8658 > URL: https://issues.apache.org/jira/browse/YARN-8658 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Young Chen >Priority: Major > Fix For: 3.2.0 > > Attachments: YARN-8658-branch-2.09.patch, > YARN-8658-branch-2.10.patch, YARN-8658.01.patch, YARN-8658.02.patch, > YARN-8658.03.patch, YARN-8658.04.patch, YARN-8658.05.patch, > YARN-8658.06.patch, YARN-8658.07.patch, YARN-8658.08.patch, YARN-8658.09.patch > > > AMRMClientRelayer (YARN-7900) is introduced for stateful > FederationInterceptor (YARN-7899), to keep track of all pending requests sent > to every subcluster YarnRM. We need to add metrics for AMRMClientRelayer to > show the state of things in FederationInterceptor. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7599) [GPG] ApplicationCleaner in Global Policy Generator
[ https://issues.apache.org/jira/browse/YARN-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-7599: --- Attachment: YARN-7599-YARN-7402.v7.patch > [GPG] ApplicationCleaner in Global Policy Generator > --- > > Key: YARN-7599 > URL: https://issues.apache.org/jira/browse/YARN-7599 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Minor > Labels: federation, gpg > Attachments: YARN-7599-YARN-7402.v1.patch, > YARN-7599-YARN-7402.v2.patch, YARN-7599-YARN-7402.v3.patch, > YARN-7599-YARN-7402.v4.patch, YARN-7599-YARN-7402.v5.patch, > YARN-7599-YARN-7402.v6.patch, YARN-7599-YARN-7402.v7.patch > > > In Federation, we need a cleanup service for StateStore as well as Yarn > Registry. For the former, we need to remove old application records. For the > latter, failed and killed applications might leave records in the Yarn > Registry (see YARN-6128). We plan to do both cleanup tasks in > ApplicationCleaner in GPG. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8807) FairScheduler crashes RM with oversubscription turned on if an application is killed.
[ https://issues.apache.org/jira/browse/YARN-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8807: - Attachment: YARN-8807-YARN-1011.00.patch > FairScheduler crashes RM with oversubscription turned on if an application is > killed. > - > > Key: YARN-8807 > URL: https://issues.apache.org/jira/browse/YARN-8807 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, resourcemanager >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8807-YARN-1011.00.patch > > > When an application that has opportunistic containers allocated is > killed, its containers are not released immediately. > Fair scheduler would therefore continue to try to promote such orphaned > containers, which results in an NPE. > {code:java} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptToAssignReservedResourcesOrPromoteOpportunisticContainers(FairScheduler.java:1158) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1129) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:1001) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1275) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testKillingApplicationWithOpportunisticContainersAssigned(TestFairScheduler.java:4019){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8807) FairScheduler crashes RM with oversubscription turned on if an application is killed.
[ https://issues.apache.org/jira/browse/YARN-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8807: - Description: When an application that has opportunistic containers allocated is killed, its containers are not released immediately. Fair scheduler would therefore continue to try to promote such orphaned containers, which results in an NPE. {code:java} java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptToAssignReservedResourcesOrPromoteOpportunisticContainers(FairScheduler.java:1158) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1129) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:1001) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1275) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testKillingApplicationWithOpportunisticContainersAssigned(TestFairScheduler.java:4019){code} > FairScheduler crashes RM with oversubscription turned on if an application is > killed. > - > > Key: YARN-8807 > URL: https://issues.apache.org/jira/browse/YARN-8807 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, resourcemanager >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > > When an application that has opportunistic containers allocated is > killed, its containers are not released immediately. > Fair scheduler would therefore continue to try to promote such orphaned > containers, which results in an NPE. > {code:java} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptToAssignReservedResourcesOrPromoteOpportunisticContainers(FairScheduler.java:1158) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1129) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:1001) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1275) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testKillingApplicationWithOpportunisticContainersAssigned(TestFairScheduler.java:4019){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
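The stack trace above points at the promotion path dereferencing state of an application that no longer exists. Purely as an illustration (the names below are assumptions, not the actual patch), the usual guard is to look up the scheduler-side application for each candidate container and skip promotion when it is already gone:

{code:java}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative null-guard for the NPE described above; all names are
// stand-ins, not the YARN-1011 branch code.
class PromotionGuardSketch {

  static class RMContainerStub {
    final String appAttemptId;
    RMContainerStub(String appAttemptId) { this.appAttemptId = appAttemptId; }
  }

  // Applications the scheduler still tracks, keyed by attempt id.
  final Map<String, Object> liveApps = new ConcurrentHashMap<>();

  void attemptPromotion(List<RMContainerStub> candidates) {
    for (RMContainerStub container : candidates) {
      Object app = liveApps.get(container.appAttemptId);
      if (app == null) {
        // Orphaned container of a killed application: skip it instead
        // of dereferencing a missing application (the NPE above).
        continue;
      }
      // ... attempt OPPORTUNISTIC -> GUARANTEED promotion here ...
    }
  }
}
{code}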
[jira] [Assigned] (YARN-8785) Error Message "Invalid docker rw mount" not helpful
[ https://issues.apache.org/jira/browse/YARN-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Prewo reassigned YARN-8785: - Assignee: Simon Prewo > Error Message "Invalid docker rw mount" not helpful > --- > > Key: YARN-8785 > URL: https://issues.apache.org/jira/browse/YARN-8785 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.9.1, 3.1.1 >Reporter: Simon Prewo >Assignee: Simon Prewo >Priority: Major > Labels: Docker > Original Estimate: 2h > Remaining Estimate: 2h > > A user receives the error message _Invalid docker rw mount_ when a container > tries to mount a directory which is not configured in the property > *docker.allowed.rw-mounts*. > {code:java} > Invalid docker rw mount > '/usr/local/hadoop/logs/userlogs/application_1536476159258_0004/container_1536476159258_0004_02_01:/usr/local/hadoop/logs/userlogs/application_1536476159258_0004/container_1536476159258_0004_02_01', > > realpath=/usr/local/hadoop/logs/userlogs/application_1536476159258_0004/container_1536476159258_0004_02_01{code} > The error message makes the user think "It is not possible due to a docker > issue". My suggestion would be to use a message like *Configuration of > the container executor does not allow mounting directory.*. > hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c > CURRENT: > {code:java} > permitted_rw = check_mount_permitted((const char **) permitted_rw_mounts, > mount_src); > permitted_ro = check_mount_permitted((const char **) permitted_ro_mounts, > mount_src); > if (permitted_ro == -1 || permitted_rw == -1) { > fprintf(ERRORFILE, "Invalid docker mount '%s', realpath=%s\n", > values[i], mount_src); > ... > {code} > NEW: > {code:java} > permitted_rw = check_mount_permitted((const char **) permitted_rw_mounts, > mount_src); > permitted_ro = check_mount_permitted((const char **) permitted_ro_mounts, > mount_src); > if (permitted_ro == -1 || permitted_rw == -1) { > fprintf(ERRORFILE, "Configuration of the container executor does not > allow mounting directory '%s', realpath=%s\n", values[i], mount_src); > ... > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8789) Add BoundedQueue to AsyncDispatcher
[ https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622719#comment-16622719 ] Hadoop QA commented on YARN-8789: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 28s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 34s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 6s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m 54s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 47s{color} | {color:orange} root: The patch generated 10 new + 814 unchanged - 11 fixed = 824 total (was 825) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 28s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 33s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 5m 0s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}155m 26s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 8s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 11m 9s{color} | {color:red} hadoop-mapreduce-client-app in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}277m 25s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.event.TestAsyncDispatcher | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueManagementDynamicEditPolicy | | | hadoop.yarn.server.resourcemanager.TestOpportunisticContainerAllocatorAMService | | | hadoop.mapreduce.v2.app.TestKill | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-8789 | | JIRA Patch URL |
[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command
[ https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622690#comment-16622690 ] Eric Badger commented on YARN-8777: --- Thanks for the updated patch, [~eyang]! {noformat} + launch_command = get_configuration_values_delimiter("launch-command", DOCKER_COMMAND_FILE_SECTION, _config, + ","); {noformat} I think if we split on "," then we will get into the same situation as YARN-8805. > Container Executor C binary change to execute interactive docker command > > > Key: YARN-8777 > URL: https://issues.apache.org/jira/browse/YARN-8777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Eric Yang >Priority: Major > Labels: Docker > Attachments: YARN-8777.001.patch, YARN-8777.002.patch, > YARN-8777.003.patch > > > Since Container Executor provides Container execution using the native > container-executor binary, we also need to make changes to accept new > “dockerExec” method to invoke the corresponding native function to execute > docker exec command to the running container. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
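One plausible reading of that concern (my illustration, not a restatement of YARN-8805): if the stored launch command is naively split on ',', any argument that itself contains a comma is broken into extra argv entries, so container-executor and docker exec can disagree on argv length.

{code:java}
import java.util.Arrays;

// Demonstration of the comma-splitting pitfall; the input is illustrative.
public class CommaSplitPitfall {
  public static void main(String[] args) {
    String stored = "bash,-c,echo a,b";   // the user meant: echo "a,b"
    String[] argv = stored.split(",");
    // Prints [bash, -c, echo a, b]: four entries instead of the
    // intended three, because the argument's own comma was split too.
    System.out.println(Arrays.toString(argv));
  }
}
{code}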
[jira] [Commented] (YARN-8658) [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-8658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622653#comment-16622653 ] Hadoop QA commented on YARN-8658: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 26m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 43s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 30s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 23s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 22s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 34s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 13 new + 1 unchanged - 0 fixed = 14 total (was 1) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 36s{color} | {color:red} hadoop-yarn-server-common in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 35s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 72m 28s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.metrics.TestAMRMClientRelayerMetrics | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:a716388 | | JIRA Issue | YARN-8658 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940649/YARN-8658-branch-2.09.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux f44884ef351c 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-2 / 3a6ad9c | | maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) | | Default Java | 1.7.0_181 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/21914/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/21914/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-common.txt | | Test Results
[jira] [Commented] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
[ https://issues.apache.org/jira/browse/YARN-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622648#comment-16622648 ] Hadoop QA commented on YARN-8809: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} YARN-1011 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 33m 10s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 34s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 26s{color} | {color:green} YARN-1011 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} YARN-1011 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 15s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 7s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}147m 48s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer | | | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | | | hadoop.yarn.server.resourcemanager.TestRM | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:c6870a1 | | JIRA Issue | YARN-8809 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940632/YARN-8809-YARN-1011.00.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux a42db4adcc6b 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | YARN-1011 / 36ec27e | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/21912/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21912/testReport/ | | Max.
[jira] [Commented] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622605#comment-16622605 ] Hadoop QA commented on YARN-8468: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 15 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 8s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 43s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 31s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 17s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 19 new + 880 unchanged - 22 fixed = 899 total (was 902) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 46s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 73m 37s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 22s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}143m 16s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-8468 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940629/YARN-8468.016.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs
[jira] [Commented] (YARN-8805) Automatically convert the launch command to the exec form when using entrypoint support
[ https://issues.apache.org/jira/browse/YARN-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622555#comment-16622555 ] Zian Chen commented on YARN-8805: - Thanks [~shaneku...@gmail.com], I'll work on the patch > Automatically convert the launch command to the exec form when using > entrypoint support > --- > > Key: YARN-8805 > URL: https://issues.apache.org/jira/browse/YARN-8805 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Shane Kumpf >Assignee: Zian Chen >Priority: Major > Labels: Docker > > When {{YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE}} is true, and a > launch command is provided, it is expected that the launch command is > provided by the user in exec form. > For example: > {code:java} > "/usr/bin/sleep 6000"{code} > must be changed to: > {code}"/usr/bin/sleep,6000"{code} > If this is not done, the container will never start and will be in a Created > state. We should automatically do this conversion vs making the user > understand this nuance of using the entrypoint support. Docs should be > updated to reflect this change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
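For illustration, a minimal sketch of the conversion described above, assuming a hypothetical helper class (this is not the actual YARN patch): it splits a shell-form launch command into arguments, keeps double-quoted arguments together (an assumption about what a real implementation would need to handle), and joins the arguments with commas to produce the exec form.
{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the described conversion; not the actual YARN patch.
public final class ExecFormConverter {

  // Splits on whitespace outside double quotes, then joins with commas.
  public static String toExecForm(String launchCommand) {
    List<String> parts = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean inQuotes = false;
    for (char c : launchCommand.trim().toCharArray()) {
      if (c == '"') {
        inQuotes = !inQuotes;            // toggle quoted region, drop the quote
      } else if (Character.isWhitespace(c) && !inQuotes) {
        if (current.length() > 0) {      // end of an argument
          parts.add(current.toString());
          current.setLength(0);
        }
      } else {
        current.append(c);
      }
    }
    if (current.length() > 0) {
      parts.add(current.toString());
    }
    return String.join(",", parts);
  }

  public static void main(String[] args) {
    // prints: /usr/bin/sleep,6000
    System.out.println(toExecForm("/usr/bin/sleep 6000"));
  }
}
{code}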
[jira] [Assigned] (YARN-8805) Automatically convert the launch command to the exec form when using entrypoint support
[ https://issues.apache.org/jira/browse/YARN-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen reassigned YARN-8805: --- Assignee: Zian Chen > Automatically convert the launch command to the exec form when using > entrypoint support > --- > > Key: YARN-8805 > URL: https://issues.apache.org/jira/browse/YARN-8805 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Shane Kumpf >Assignee: Zian Chen >Priority: Major > Labels: Docker > > When {{YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE}} is true, and a > launch command is provided, it is expected that the launch command is > provided by the user in exec form. > For example: > {code:java} > "/usr/bin/sleep 6000"{code} > must be changed to: > {code}"/usr/bin/sleep,6000"{code} > If this is not done, the container will never start and will be in a Created > state. We should automatically do this conversion vs making the user > understand this nuance of using the entrypoint support. Docs should be > updated to reflect this change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8665) Yarn Service Upgrade: Support cancelling upgrade
[ https://issues.apache.org/jira/browse/YARN-8665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8665: Attachment: YARN-8665.003.patch > Yarn Service Upgrade: Support cancelling upgrade > - > > Key: YARN-8665 > URL: https://issues.apache.org/jira/browse/YARN-8665 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8665.001.patch, YARN-8665.002.patch, > YARN-8665.003.patch > > > When a service is upgraded without auto-finalization or express upgrade, the > upgrade can be cancelled. This gives the user the ability to test the upgrade > of a single instance; if that doesn't go well, they get a chance to cancel > it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Reopened] (YARN-8658) [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-8658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola reopened YARN-8658: Re-opening the Jira for Branch-2 patch. > [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor > -- > > Key: YARN-8658 > URL: https://issues.apache.org/jira/browse/YARN-8658 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Young Chen >Priority: Major > Fix For: 3.2.0 > > Attachments: YARN-8658-branch-2.09.patch, YARN-8658.01.patch, > YARN-8658.02.patch, YARN-8658.03.patch, YARN-8658.04.patch, > YARN-8658.05.patch, YARN-8658.06.patch, YARN-8658.07.patch, > YARN-8658.08.patch, YARN-8658.09.patch > > > AMRMClientRelayer (YARN-7900) is introduced for stateful > FederationInterceptor (YARN-7899), to keep track of all pending requests sent > to every subcluster YarnRM. We need to add metrics for AMRMClientRelayer to > show the state of things in FederationInterceptor. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8658) [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor
[ https://issues.apache.org/jira/browse/YARN-8658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Young Chen updated YARN-8658: - Attachment: YARN-8658-branch-2.09.patch > [AMRMProxy] Metrics for AMRMClientRelayer inside FederationInterceptor > -- > > Key: YARN-8658 > URL: https://issues.apache.org/jira/browse/YARN-8658 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Botong Huang >Assignee: Young Chen >Priority: Major > Fix For: 3.2.0 > > Attachments: YARN-8658-branch-2.09.patch, YARN-8658.01.patch, > YARN-8658.02.patch, YARN-8658.03.patch, YARN-8658.04.patch, > YARN-8658.05.patch, YARN-8658.06.patch, YARN-8658.07.patch, > YARN-8658.08.patch, YARN-8658.09.patch > > > AMRMClientRelayer (YARN-7900) is introduced for stateful > FederationInterceptor (YARN-7899), to keep track of all pending requests sent > to every subcluster YarnRM. We need to add metrics for AMRMClientRelayer to > show the state of things in FederationInterceptor. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers
[ https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622518#comment-16622518 ] Hadoop QA commented on YARN-5215: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} YARN-5215 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-5215 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835650/YARN-5215.002.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21913/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Scheduling containers based on external load in the servers > --- > > Key: YARN-5215 > URL: https://issues.apache.org/jira/browse/YARN-5215 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Labels: oct16-hard > Attachments: YARN-5215.000.patch, YARN-5215.001.patch, > YARN-5215.002.patch > > > Currently YARN runs containers in the servers assuming that they own all the > resources. The proposal is to use the utilization information in the node and > the containers to estimate how much is consumed by external processes and > schedule based on this estimation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
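As a rough illustration of the estimation idea in this proposal, the sketch below subtracts the aggregate container utilization from the node-level utilization to approximate the external load, using YARN's ResourceUtilization record. The helper name and the clamp-at-zero behavior are assumptions for the sketch, not part of any patch.
{code:java}
import org.apache.hadoop.yarn.api.records.ResourceUtilization;

// Hypothetical helper; the class name and clamping behavior are assumptions.
public final class ExternalLoadEstimator {

  public static ResourceUtilization estimateExternalLoad(
      ResourceUtilization node, ResourceUtilization containers) {
    // Sampling skew can make the container aggregate momentarily exceed the
    // node-level reading, so negative differences are clamped to zero.
    int pmem = Math.max(0, node.getPhysicalMemory() - containers.getPhysicalMemory());
    int vmem = Math.max(0, node.getVirtualMemory() - containers.getVirtualMemory());
    float cpu = Math.max(0f, node.getCPU() - containers.getCPU());
    return ResourceUtilization.newInstance(pmem, vmem, cpu);
  }
}
{code}
The scheduler could then treat (allocated - external) rather than allocated capacity as the budget for new containers.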
[jira] [Commented] (YARN-8805) Automatically convert the launch command to the exec form when using entrypoint support
[ https://issues.apache.org/jira/browse/YARN-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622503#comment-16622503 ] Shane Kumpf commented on YARN-8805: --- Thanks, [~Zian Chen]. Please feel free to take this. > Automatically convert the launch command to the exec form when using > entrypoint support > --- > > Key: YARN-8805 > URL: https://issues.apache.org/jira/browse/YARN-8805 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Shane Kumpf >Priority: Major > Labels: Docker > > When {{YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE}} is true, and a > launch command is provided, it is expected that the launch command is > provided by the user in exec form. > For example: > {code:java} > "/usr/bin/sleep 6000"{code} > must be changed to: > {code}"/usr/bin/sleep,6000"{code} > If this is not done, the container will never start and will be in a Created > state. We should automatically do this conversion vs making the user > understand this nuance of using the entrypoint support. Docs should be > updated to reflect this change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8805) Automatically convert the launch command to the exec form when using entrypoint support
[ https://issues.apache.org/jira/browse/YARN-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622499#comment-16622499 ] Zian Chen commented on YARN-8805: - Yes, I just checked the latest released doc, [https://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/yarn-service/Examples.html], and the format needs to be fixed. I also agree with [~shaneku...@gmail.com] that we should do the conversion automatically when YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE is set to true. Would you like to provide a patch for this, [~shaneku...@gmail.com], or I can help. > Automatically convert the launch command to the exec form when using > entrypoint support > --- > > Key: YARN-8805 > URL: https://issues.apache.org/jira/browse/YARN-8805 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Shane Kumpf >Priority: Major > Labels: Docker > > When {{YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE}} is true, and a > launch command is provided, it is expected that the launch command is > provided by the user in exec form. > For example: > {code:java} > "/usr/bin/sleep 6000"{code} > must be changed to: > {code}"/usr/bin/sleep,6000"{code} > If this is not done, the container will never start and will be in a Created > state. We should automatically do this conversion vs making the user > understand this nuance of using the entrypoint support. Docs should be > updated to reflect this change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8785) Error Message "Invalid docker rw mount" not helpful
[ https://issues.apache.org/jira/browse/YARN-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622495#comment-16622495 ] Zian Chen commented on YARN-8785: - Hi [~simonprewo], would you like to work on this Jira and provide a patch? Or I can help with it. > Error Message "Invalid docker rw mount" not helpful > --- > > Key: YARN-8785 > URL: https://issues.apache.org/jira/browse/YARN-8785 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.9.1, 3.1.1 >Reporter: Simon Prewo >Priority: Major > Labels: Docker > Original Estimate: 2h > Remaining Estimate: 2h > > A user receives the error message _Invalid docker rw mount_ when a container > tries to mount a directory which is not configured in the property > *docker.allowed.rw-mounts*. > {code:java} > Invalid docker rw mount > '/usr/local/hadoop/logs/userlogs/application_1536476159258_0004/container_1536476159258_0004_02_01:/usr/local/hadoop/logs/userlogs/application_1536476159258_0004/container_1536476159258_0004_02_01', > realpath=/usr/local/hadoop/logs/userlogs/application_1536476159258_0004/container_1536476159258_0004_02_01{code} > The error message makes the user think "It is not possible due to a docker > issue". My suggestion would be to use a message like *Configuration of > the container executor does not allow mounting directory.* instead. > hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c > CURRENT: > {code:java} > permitted_rw = check_mount_permitted((const char **) permitted_rw_mounts, > mount_src); > permitted_ro = check_mount_permitted((const char **) permitted_ro_mounts, > mount_src); > if (permitted_ro == -1 || permitted_rw == -1) { > fprintf(ERRORFILE, "Invalid docker mount '%s', realpath=%s\n", > values[i], mount_src); > ... > {code} > NEW: > {code:java} > permitted_rw = check_mount_permitted((const char **) permitted_rw_mounts, > mount_src); > permitted_ro = check_mount_permitted((const char **) permitted_ro_mounts, > mount_src); > if (permitted_ro == -1 || permitted_rw == -1) { > fprintf(ERRORFILE, "Configuration of the container executor does not > allow mounting directory '%s', realpath=%s\n", values[i], mount_src); > ... > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues
[ https://issues.apache.org/jira/browse/YARN-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622460#comment-16622460 ] Wangda Tan commented on YARN-8804: -- Thanks [~Tao Yang] for the nice analysis. I'm not sure if YARN-8513 is caused by this or a similar issue. From my analysis of YARN-8513, the scheduler tries to allocate containers to a queue even when it would go beyond max capacity (used + allocating > max), but the resource committer will reject such a proposal. > resourceLimits may be wrongly calculated when leaf-queue is blocked in > cluster with 3+ level queues > --- > > Key: YARN-8804 > URL: https://issues.apache.org/jira/browse/YARN-8804 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8804.001.patch > > > This problem is due to YARN-4280: a parent queue deducts a child queue's > headroom when the child queue has reached its resource limit and the skipped > type is QUEUE_LIMIT. The resource limits of the deepest parent queue are > correctly calculated, but for a non-deepest parent queue, its headroom may be > much more than the sum of its reached-limit child queues' headroom, so the > resource limit of a non-deepest parent may be much less than its true value > and block the allocation for later queues. > To reproduce this problem with UT: > (1) Cluster has two nodes whose resources are both <10GB, 10core>, and > 3-level queues as below; max-capacity of "c1" is 10 and all others are 100, > so the max-capacity of queue "c1" is <2GB, 2core>
> {noformat}
>           Root
>          /  |  \
>         a   b    c
>        10  20   70
>                |  \
>               c1   c2
>       10(max=10)   90
> {noformat}
> (2) Submit app1 to queue "c1" and launch am1 (resource=<1GB, 1core>) on nm1 > (3) Submit app2 to queue "b" and launch am2 (resource=<1GB, 1core>) on nm1 > (4) app1 and app2 both ask for one <2GB, 1core> container. > (5) nm1 does 1 heartbeat > Now queue "c" has a lower capacity percentage than queue "b", so the > allocation sequence will be "a" -> "c" -> "b". > Queue "c1" has reached its queue limit, so the requests of app1 will be > pending; > headroom of queue "c1" is <1GB, 1core> (=max-capacity - used), > headroom of queue "c" is <18GB, 18core> (=max-capacity - used). > After allocation for queue "c", the resource limit of queue "b" will be > wrongly calculated as <2GB, 2core>, > and the headroom of queue "b" will be <1GB, 1core> (=resource-limit - used), > so the scheduler won't allocate a container for app2 on nm1. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
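To make the arithmetic in the reproduction concrete, the snippet below just restates the reported numbers (memory dimension only, in GB; an illustration of the reported calculation, not CapacityScheduler code):
{code:java}
public final class Yarn8804Illustration {
  public static void main(String[] args) {
    // Values in GB from the reproduction steps above (memory dimension only).
    long clusterMax = 20;        // two <10GB, 10core> nodes
    long headroomC1 = 2 - 1;     // c1: max-capacity 2GB, am1 uses 1GB -> 1GB
    long headroomC  = 20 - 2;    // c: max-capacity 20GB, 2GB used -> 18GB

    // Deducting the reached-limit leaf's headroom (deepest level) is correct:
    long limitForBExpected = clusterMax - headroomC1;   // 19GB
    // Deducting the non-deepest parent's own headroom is the reported bug:
    long limitForBActual = clusterMax - headroomC;      // 2GB

    System.out.println(limitForBExpected + "GB vs " + limitForBActual + "GB");
    // Queue "b" then sees headroom 2GB - 1GB (am2) = 1GB, so app2's
    // <2GB, 1core> request cannot be satisfied on nm1.
  }
}
{code}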
[jira] [Commented] (YARN-7225) Add queue and partition info to RM audit log
[ https://issues.apache.org/jira/browse/YARN-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622458#comment-16622458 ] Eric Payne commented on YARN-7225: -- Thanks [~jhung] for raising this issue. This came up for us recently when trying to debug a tricky resource allocation problem in the Capacity Scheduler. I wanted to track which containers were being assigned to which queues during a specific time period, but in the RM audit log, there is nowhere that links app, app attempt, or container events to the queue. I would like to add the queue name to at least the following audited events: {noformat} "AM Allocated Container"; "AM Released Container"; "Submit Application Request"; {noformat} There are many other possible events into which we could include the queue name, but I think the above 3 are the only necessary ones. > Add queue and partition info to RM audit log > > > Key: YARN-7225 > URL: https://issues.apache.org/jira/browse/YARN-7225 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Jonathan Hung >Priority: Major > > Right now RM audit log has fields such as user, ip, resource, etc. Having > queue and partition is useful for resource tracking. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
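As an illustration of what such an entry might look like once queue and partition are recorded alongside the existing fields, here is a mock-up with placeholder values; the layout, field names, and IDs are assumptions for the sketch, not the actual RMAuditLogger output or API.
{code:java}
// Mock-up of an audit entry with the proposed fields; everything here is a
// placeholder, not the actual RMAuditLogger format.
public final class AuditLineExample {
  public static void main(String[] args) {
    String auditLine = String.join("\t",
        "USER=alice",
        "OPERATION=AM Allocated Container",
        "TARGET=SchedulerApp",
        "RESULT=SUCCESS",
        "APPID=application_0000000000000_0001",
        "CONTAINERID=container_0000000000000_0001_01_000002",
        "QUEUE=root.adhoc",        // proposed addition
        "PARTITION=labelX");       // proposed addition
    System.out.println(auditLine);
  }
}
{code}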
[jira] [Comment Edited] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers
[ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622450#comment-16622450 ] Arun Suresh edited comment on YARN-1011 at 9/20/18 6:11 PM: Planning on spending more cycles on this now. Looking at the SubTasks, it looks like some of them are already committed to trunk - mostly the ones pertaining to ResourceUtilization plumbing and NM CGroups based improvements.. Wondering if it is ok to move those into another umbrella JIRA ? was (Author: asuresh): Planning on spending more cycles on this now. Looking at the SubTasks, it looks like some of them are already committed - mostly the ones pertaining to ResourceUtilization plumbing and NM CGroups based improvements.. Wondering if it is ok to move those into another umbrella JIRA ? > [Umbrella] Schedule containers based on utilization of currently allocated > containers > - > > Key: YARN-1011 > URL: https://issues.apache.org/jira/browse/YARN-1011 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Arun C Murthy >Assignee: Karthik Kambatla >Priority: Major > Attachments: patch-for-yarn-1011.patch, yarn-1011-design-v0.pdf, > yarn-1011-design-v1.pdf, yarn-1011-design-v2.pdf, yarn-1011-design-v3.pdf > > > Currently RM allocates containers and assumes resources allocated are > utilized. > RM can, and should, get to a point where it measures utilization of allocated > containers and, if appropriate, allocate more (speculative?) containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers
[ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622450#comment-16622450 ] Arun Suresh commented on YARN-1011: --- Planning on spending more cycles on this now. Looking at the SubTasks, it looks like some of them are already committed - mostly the ones pertaining to ResourceUtilization plumbing and NM CGroups based improvements.. Wondering if it is ok to move those into another umbrella JIRA ? > [Umbrella] Schedule containers based on utilization of currently allocated > containers > - > > Key: YARN-1011 > URL: https://issues.apache.org/jira/browse/YARN-1011 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Arun C Murthy >Assignee: Karthik Kambatla >Priority: Major > Attachments: patch-for-yarn-1011.patch, yarn-1011-design-v0.pdf, > yarn-1011-design-v1.pdf, yarn-1011-design-v2.pdf, yarn-1011-design-v3.pdf > > > Currently RM allocates containers and assumes resources allocated are > utilized. > RM can, and should, get to a point where it measures utilization of allocated > containers and, if appropriate, allocate more (speculative?) containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
[ https://issues.apache.org/jira/browse/YARN-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622445#comment-16622445 ] Arun Suresh commented on YARN-8809: --- Thanks for the patch, [~haibochen]. From the patch, it is not obvious what the change is that allows the FS to correctly account for completed containers. Are you just extending the testcase to verify? > Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers > are released. > --- > > Key: YARN-8809 > URL: https://issues.apache.org/jira/browse/YARN-8809 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8809-YARN-1011.00.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
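For context, the accounting in question amounts to something like the sketch below; the method names follow the existing QueueMetrics style but are assumptions for the sketch, not the patch itself. On completion, the owning queue's metrics must be decremented for an OPPORTUNISTIC container the same way as for a GUARANTEED one.
{code:java}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue;

// Sketch only; assumes a QueueMetrics#releaseResources(user, containers, res)
// style method, which is not necessarily what the patch uses.
public final class OpportunisticReleaseSketch {
  static void containerCompleted(FSQueue queue, String user, Resource released) {
    // Decrement the queue's running-container count and aggregate resources,
    // regardless of whether the container was GUARANTEED or OPPORTUNISTIC.
    queue.getMetrics().releaseResources(user, 1, released);
  }
}
{code}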
[jira] [Commented] (YARN-8801) java doc comments in docker-util.h is confusing
[ https://issues.apache.org/jira/browse/YARN-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622438#comment-16622438 ] Zian Chen commented on YARN-8801: - Thank you, [~eyang]. > java doc comments in docker-util.h is confusing > --- > > Key: YARN-8801 > URL: https://issues.apache.org/jira/browse/YARN-8801 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Minor > Labels: Docker > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-8801.001.patch > > > {code:java} > /** > + * Get the Docker exec command line string. The function will verify that > the params file is meant for the exec command. > + * @param command_file File containing the params for the Docker start > command > + * @param conf Configuration struct containing the container-executor.cfg > details > + * @param out Buffer to fill with the exec command > + * @param outlen Size of the output buffer > + * @return Return code with 0 indicating success and non-zero codes > indicating error > + */ > +int get_docker_exec_command(const char* command_file, const struct > configuration* conf, args *args);{code} > The method param list has out and outlen, which don't match the signature, > and the description for param args is missing. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622433#comment-16622433 ] Antal Bálint Steinbach commented on YARN-8468: -- Hi [~leftnoteasy], sure, I updated the title to "Enable the use of queue based maximum container allocation limit and implement it in FairScheduler" and added the implementation steps to the description: Enforce the use of the queue based maximum allocation limit if it is available; if not, use the general scheduler-level setting * Use it during validation and normalization of requests in scheduler.allocate, app submit and resource request > Enable the use of queue based maximum container allocation limit and > implement it in FairScheduler > -- > > Key: YARN-8468 > URL: https://issues.apache.org/jira/browse/YARN-8468 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Critical > Attachments: YARN-8468.000.patch, YARN-8468.001.patch, > YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, > YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, > YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, > YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, > YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch > > > When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" > to limit the overall size of a container. This applies globally to all > containers, cannot be limited by queue, and is not scheduler dependent. > The goal of this ticket is to allow this value to be set on a per queue basis. > The use case: User has two pools, one for ad hoc jobs and one for enterprise > apps. User wants to limit ad hoc jobs to small containers but allow > enterprise apps to request as many resources as needed. Setting > yarn.scheduler.maximum-allocation-mb sets a default value for maximum > container size for all queues, and the maximum resources per queue can be set > with the “maxContainerResources” queue config value. > Suggested solution: > All the infrastructure is already in the code. We need to do the following: > * add the setting to the queue properties for all queue types (parent and > leaf), this will cover dynamically created queues. > * if we set it on the root we override the scheduler setting and we should > not allow that. > * make sure that the queue resource cap can not be larger than the scheduler > max resource cap in the config. > * implement getMaximumResourceCapability(String queueName) in the > FairScheduler > * implement getMaximumResourceCapability(String queueName) in both > FSParentQueue and FSLeafQueue > * expose the setting in the queue information in the RM web UI. > * expose the setting in the metrics etc for the queue. > * Enforce the use of the queue based maximum allocation limit if it is > available, if not use the general scheduler level setting > ** Use it during validation and normalization of requests in > scheduler.allocate, app submit and resource request -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
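A minimal sketch of the lookup described in the ticket, under stated assumptions: only getMaximumResourceCapability(String queueName) is named in the description, so the per-queue storage below (a plain map) is a hypothetical stand-in for however the queue-level limit ends up being held. The point is the fallback: prefer the queue-level limit when configured, otherwise use the scheduler-wide maximum.
{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.Resource;

// Sketch only; "maxAllocationPerQueue" is a hypothetical stand-in for the
// real per-queue configuration plumbing.
public class PerQueueMaxAllocationSketch {
  private final Map<String, Resource> maxAllocationPerQueue = new HashMap<>();
  private Resource schedulerMaxAllocation; // yarn.scheduler.maximum-allocation-*

  public Resource getMaximumResourceCapability(String queueName) {
    Resource queueMax = maxAllocationPerQueue.get(queueName);
    // Per-queue limit wins when configured; otherwise the scheduler default.
    return queueMax != null ? queueMax : schedulerMaxAllocation;
  }
}
{code}
Request validation and normalization would then call the queue-aware overload instead of the global one, so an ad hoc queue can enforce a smaller cap than the cluster-wide setting.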
[jira] [Updated] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Bálint Steinbach updated YARN-8468: - Description: When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit the overall size of a container. This applies globally to all containers and cannot be limited by queue or and is not scheduler dependent. The goal of this ticket is to allow this value to be set on a per queue basis. The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many resources as needed. Setting yarn.scheduler.maximum-allocation-mb sets a default value for maximum container size for all queues and setting maximum resources per queue with “maxContainerResources” queue config value. Suggested solution: All the infrastructure is already in the code. We need to do the following: * add the setting to the queue properties for all queue types (parent and leaf), this will cover dynamically created queues. * if we set it on the root we override the scheduler setting and we should not allow that. * make sure that queue resource cap can not be larger than scheduler max resource cap in the config. * implement getMaximumResourceCapability(String queueName) in the FairScheduler * implement getMaximumResourceCapability(String queueName) in both FSParentQueue and FSLeafQueue as follows * expose the setting in the queue information in the RM web UI. * expose the setting in the metrics etc for the queue. * Enforce the use of queue based maximum allocation limit if it is available, if not use the general scheduler level setting ** Use it during validation and normalization of requests in scheduler.allocate, app submit and resource request was: When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit the overall size of a container. This applies globally to all containers and cannot be limited by queue or and is not scheduler dependent. The goal of this ticket is to allow this value to be set on a per queue basis. The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many resources as needed. Setting yarn.scheduler.maximum-allocation-mb sets a default value for maximum container size for all queues and setting maximum resources per queue with “maxContainerResources” queue config value. Suggested solution: All the infrastructure is already in the code. We need to do the following: * add the setting to the queue properties for all queue types (parent and leaf), this will cover dynamically created queues. * if we set it on the root we override the scheduler setting and we should not allow that. * make sure that queue resource cap can not be larger than scheduler max resource cap in the config. * implement getMaximumResourceCapability(String queueName) in the FairScheduler * implement getMaximumResourceCapability(String queueName) in both FSParentQueue and FSLeafQueue as follows * expose the setting in the queue information in the RM web UI. * expose the setting in the metrics etc for the queue. 
* Enforce the use of queue based maximum allocation limit if available, if not use the general scheduler level setting ** Use it during validation and normalization of requests in scheduler.allocate, app submit and resource request > Enable the use of queue based maximum container allocation limit and > implement it in FairScheduler > -- > > Key: YARN-8468 > URL: https://issues.apache.org/jira/browse/YARN-8468 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Critical > Attachments: YARN-8468.000.patch, YARN-8468.001.patch, > YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, > YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, > YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, > YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, > YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch > > > When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" > to limit the overall size of a container. This applies globally to all > containers and cannot be limited by queue or and is not scheduler dependent. > The goal of this ticket is to allow this value to be set on a per queue basis. > The use case: User has two pools, one for ad hoc jobs and one for enterprise > apps. User wants to limit
[jira] [Updated] (YARN-8468) Enable the use of queue based maximum allocation limit and implement it in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Bálint Steinbach updated YARN-8468: - Summary: Enable the use of queue based maximum allocation limit and implement it in FairScheduler (was: Enable queue based maximum allocation limit and implement it in FairScheduler) > Enable the use of queue based maximum allocation limit and implement it in > FairScheduler > > > Key: YARN-8468 > URL: https://issues.apache.org/jira/browse/YARN-8468 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Critical > Attachments: YARN-8468.000.patch, YARN-8468.001.patch, > YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, > YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, > YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, > YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, > YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch > > > When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" > to limit the overall size of a container. This applies globally to all > containers and cannot be limited by queue or and is not scheduler dependent. > The goal of this ticket is to allow this value to be set on a per queue basis. > The use case: User has two pools, one for ad hoc jobs and one for enterprise > apps. User wants to limit ad hoc jobs to small containers but allow > enterprise apps to request as many resources as needed. Setting > yarn.scheduler.maximum-allocation-mb sets a default value for maximum > container size for all queues and setting maximum resources per queue with > “maxContainerResources” queue config value. > Suggested solution: > All the infrastructure is already in the code. We need to do the following: > * add the setting to the queue properties for all queue types (parent and > leaf), this will cover dynamically created queues. > * if we set it on the root we override the scheduler setting and we should > not allow that. > * make sure that queue resource cap can not be larger than scheduler max > resource cap in the config. > * implement getMaximumResourceCapability(String queueName) in the > FairScheduler > * implement getMaximumResourceCapability(String queueName) in both > FSParentQueue and FSLeafQueue as follows > * expose the setting in the queue information in the RM web UI. > * expose the setting in the metrics etc for the queue. > * Enforce the use of queue based maximum allocation limit if available, if > not use the general scheduler level setting > ** Use it during validation and normalization of requests in > scheduler.allocate, app submit and resource request -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8468) Enable the use of queue based maximum container allocation limit and implement it in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Bálint Steinbach updated YARN-8468: - Summary: Enable the use of queue based maximum container allocation limit and implement it in FairScheduler (was: Enable the use of queue based maximum allocation limit and implement it in FairScheduler) > Enable the use of queue based maximum container allocation limit and > implement it in FairScheduler > -- > > Key: YARN-8468 > URL: https://issues.apache.org/jira/browse/YARN-8468 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Critical > Attachments: YARN-8468.000.patch, YARN-8468.001.patch, > YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, > YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, > YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, > YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, > YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch > > > When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" > to limit the overall size of a container. This applies globally to all > containers and cannot be limited by queue or and is not scheduler dependent. > The goal of this ticket is to allow this value to be set on a per queue basis. > The use case: User has two pools, one for ad hoc jobs and one for enterprise > apps. User wants to limit ad hoc jobs to small containers but allow > enterprise apps to request as many resources as needed. Setting > yarn.scheduler.maximum-allocation-mb sets a default value for maximum > container size for all queues and setting maximum resources per queue with > “maxContainerResources” queue config value. > Suggested solution: > All the infrastructure is already in the code. We need to do the following: > * add the setting to the queue properties for all queue types (parent and > leaf), this will cover dynamically created queues. > * if we set it on the root we override the scheduler setting and we should > not allow that. > * make sure that queue resource cap can not be larger than scheduler max > resource cap in the config. > * implement getMaximumResourceCapability(String queueName) in the > FairScheduler > * implement getMaximumResourceCapability(String queueName) in both > FSParentQueue and FSLeafQueue as follows > * expose the setting in the queue information in the RM web UI. > * expose the setting in the metrics etc for the queue. > * Enforce the use of queue based maximum allocation limit if available, if > not use the general scheduler level setting > ** Use it during validation and normalization of requests in > scheduler.allocate, app submit and resource request -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8808) Use aggregate container utilization instead of node utilization to determine resources available for oversubscription
[ https://issues.apache.org/jira/browse/YARN-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8808: - Attachment: (was: YARN-8808-YARN-1011.00.patch) > Use aggregate container utilization instead of node utilization to determine > resources available for oversubscription > - > > Key: YARN-8808 > URL: https://issues.apache.org/jira/browse/YARN-8808 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > > Resource oversubscription should be bound to the amount of the resources that > can be allocated to containers, hence the allocation threshold should be with > respect to aggregate container utilization. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8468) Enable queue based maximum allocation limit and implement it in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Bálint Steinbach updated YARN-8468: - Description: When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit the overall size of a container. This applies globally to all containers and cannot be limited by queue or and is not scheduler dependent. The goal of this ticket is to allow this value to be set on a per queue basis. The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many resources as needed. Setting yarn.scheduler.maximum-allocation-mb sets a default value for maximum container size for all queues and setting maximum resources per queue with “maxContainerResources” queue config value. Suggested solution: All the infrastructure is already in the code. We need to do the following: * add the setting to the queue properties for all queue types (parent and leaf), this will cover dynamically created queues. * if we set it on the root we override the scheduler setting and we should not allow that. * make sure that queue resource cap can not be larger than scheduler max resource cap in the config. * implement getMaximumResourceCapability(String queueName) in the FairScheduler * implement getMaximumResourceCapability(String queueName) in both FSParentQueue and FSLeafQueue as follows * expose the setting in the queue information in the RM web UI. * expose the setting in the metrics etc for the queue. * Enforce the use of queue based maximum allocation limit if available, if not use the general scheduler level setting ** Use it during validation and normalization of requests in scheduler.allocate, app submit and resource request was: When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit the overall size of a container. This applies globally to all containers and cannot be limited by queue or and is not scheduler dependent. The goal of this ticket is to allow this value to be set on a per queue basis. The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many resources as needed. Setting yarn.scheduler.maximum-allocation-mb sets a default value for maximum container size for all queues and setting maximum resources per queue with “maxContainerResources” queue config value. Suggested solution: All the infrastructure is already in the code. We need to do the following: * add the setting to the queue properties for all queue types (parent and leaf), this will cover dynamically created queues. * if we set it on the root we override the scheduler setting and we should not allow that. * make sure that queue resource cap can not be larger than scheduler max resource cap in the config. * implement getMaximumResourceCapability(String queueName) in the FairScheduler * implement getMaximumResourceCapability() in both FSParentQueue and FSLeafQueue as follows * expose the setting in the queue information in the RM web UI. * expose the setting in the metrics etc for the queue. 
* Enforce in general the use of queue based maximum allocation limit if available, if not use the general scheduler level setting ** Use it during validation and normalization of requests in scheduler.allocate, app submit and resource request > Enable queue based maximum allocation limit and implement it in FairScheduler > - > > Key: YARN-8468 > URL: https://issues.apache.org/jira/browse/YARN-8468 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Critical > Attachments: YARN-8468.000.patch, YARN-8468.001.patch, > YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, > YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, > YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, > YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, > YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch > > > When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" > to limit the overall size of a container. This applies globally to all > containers and cannot be limited by queue or and is not scheduler dependent. > The goal of this ticket is to allow this value to be set on a per queue basis. > The use case: User has two pools, one for ad hoc jobs and one for enterprise > apps. User wants to limit ad hoc jobs to small containers but allow > enterprise
[jira] [Updated] (YARN-8808) Use aggregate container utilization instead of node utilization to determine resources available for oversubscription
[ https://issues.apache.org/jira/browse/YARN-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8808: - Attachment: YARN-8808-YARN-1011.00.patch > Use aggregate container utilization instead of node utilization to determine > resources available for oversubscription > - > > Key: YARN-8808 > URL: https://issues.apache.org/jira/browse/YARN-8808 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8808-YARN-1011.00.patch > > > Resource oversubscription should be bound to the amount of the resources that > can be allocated to containers, hence the allocation threshold should be with > respect to aggregate container utilization. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8468) Enable queue based maximum allocation limit and implement it in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Bálint Steinbach updated YARN-8468: - Description: When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit the overall size of a container. This applies globally to all containers and cannot be limited by queue or and is not scheduler dependent. The goal of this ticket is to allow this value to be set on a per queue basis. The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many resources as needed. Setting yarn.scheduler.maximum-allocation-mb sets a default value for maximum container size for all queues and setting maximum resources per queue with “maxContainerResources” queue config value. Suggested solution: All the infrastructure is already in the code. We need to do the following: * add the setting to the queue properties for all queue types (parent and leaf), this will cover dynamically created queues. * if we set it on the root we override the scheduler setting and we should not allow that. * make sure that queue resource cap can not be larger than scheduler max resource cap in the config. * implement getMaximumResourceCapability(String queueName) in the FairScheduler * implement getMaximumResourceCapability() in both FSParentQueue and FSLeafQueue as follows * expose the setting in the queue information in the RM web UI. * expose the setting in the metrics etc for the queue. * Enforce in general the use of queue based maximum allocation limit if available, if not use the general scheduler level setting ** Use it during validation and normalization of requests in scheduler.allocate, app submit and resource request was: When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit the overall size of a container. This applies globally to all containers and cannot be limited by queue or and is not scheduler dependent. The goal of this ticket is to allow this value to be set on a per queue basis. The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many resources as needed. Setting yarn.scheduler.maximum-allocation-mb sets a default value for maximum container size for all queues and setting maximum resources per queue with “maxContainerResources” queue config value. Suggested solution: All the infrastructure is already in the code. We need to do the following: * add the setting to the queue properties for all queue types (parent and leaf), this will cover dynamically created queues. * if we set it on the root we override the scheduler setting and we should not allow that. * make sure that queue resource cap can not be larger than scheduler max resource cap in the config. * implement getMaximumResourceCapability(String queueName) in the FairScheduler * implement getMaximumResourceCapability() in both FSParentQueue and FSLeafQueue as follows * expose the setting in the queue information in the RM web UI. * expose the setting in the metrics etc for the queue. 
* Enforce in general the use of queue based maximum allocation limit if available, if not use the general scheduler level setting ** Use it during validation and normaloization of > Enable queue based maximum allocation limit and implement it in FairScheduler > - > > Key: YARN-8468 > URL: https://issues.apache.org/jira/browse/YARN-8468 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Critical > Attachments: YARN-8468.000.patch, YARN-8468.001.patch, > YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, > YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, > YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, > YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, > YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch > > > When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" > to limit the overall size of a container. This applies globally to all > containers and cannot be limited by queue or and is not scheduler dependent. > The goal of this ticket is to allow this value to be set on a per queue basis. > The use case: User has two pools, one for ad hoc jobs and one for enterprise > apps. User wants to limit ad hoc jobs to small containers but allow > enterprise apps to request as many resources as needed. Setting >
[jira] [Commented] (YARN-8801) java doc comments in docker-util.h is confusing
[ https://issues.apache.org/jira/browse/YARN-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622383#comment-16622383 ] Hudson commented on YARN-8801: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15031 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15031/]) YARN-8801. Fixed header comments for docker utility functions. (eyang: rev aa4bd493c309f09f8f2ea7449aa33c8b641fb8d2) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.h > java doc comments in docker-util.h is confusing > --- > > Key: YARN-8801 > URL: https://issues.apache.org/jira/browse/YARN-8801 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Minor > Labels: Docker > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-8801.001.patch > > > {code:java} > /** > + * Get the Docker exec command line string. The function will verify that > the params file is meant for the exec command. > + * @param command_file File containing the params for the Docker start > command > + * @param conf Configuration struct containing the container-executor.cfg > details > + * @param out Buffer to fill with the exec command > + * @param outlen Size of the output buffer > + * @return Return code with 0 indicating success and non-zero codes > indicating error > + */ > +int get_docker_exec_command(const char* command_file, const struct > configuration* conf, args *args);{code} > The method param list has out and outlen, which don't match the signature, > and the description for param args is missing. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8468) Enable queue based maximum allocation limit and implement it in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Bálint Steinbach updated YARN-8468: - Description: When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit the overall size of a container. This applies globally to all containers, cannot be limited by queue, and is not scheduler dependent. The goal of this ticket is to allow this value to be set on a per queue basis. The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many resources as needed. Setting yarn.scheduler.maximum-allocation-mb sets a default value for maximum container size for all queues; maximum resources per queue are set with the “maxContainerResources” queue config value. Suggested solution: All the infrastructure is already in the code. We need to do the following: * add the setting to the queue properties for all queue types (parent and leaf), this will cover dynamically created queues. * if we set it on the root we override the scheduler setting and we should not allow that. * make sure that the queue resource cap cannot be larger than the scheduler max resource cap in the config. * implement getMaximumResourceCapability(String queueName) in the FairScheduler * implement getMaximumResourceCapability() in both FSParentQueue and FSLeafQueue * expose the setting in the queue information in the RM web UI. * expose the setting in the metrics etc. for the queue. * Enforce in general the use of the queue based maximum allocation limit if available; if not, use the general scheduler level setting ** Use it during validation and normalization of was: When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit the overall size of a container. This applies globally to all containers, cannot be limited by queue, and is not scheduler dependent. The goal of this ticket is to allow this value to be set on a per queue basis. The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many resources as needed. Setting yarn.scheduler.maximum-allocation-mb sets a default value for maximum container size for all queues; maximum resources per queue are set with the “maxContainerResources” queue config value. Suggested solution: All the infrastructure is already in the code. We need to do the following: * add the setting to the queue properties for all queue types (parent and leaf), this will cover dynamically created queues. * if we set it on the root we override the scheduler setting and we should not allow that. * make sure that the queue resource cap cannot be larger than the scheduler max resource cap in the config. * implement getMaximumResourceCapability(String queueName) in the FairScheduler * implement getMaximumResourceCapability() in both FSParentQueue and FSLeafQueue * expose the setting in the queue information in the RM web UI. * expose the setting in the metrics etc. for the queue.
* Enforce the usage of queue based maximum allocation limit > Enable queue based maximum allocation limit and implement it in FairScheduler > - > > Key: YARN-8468 > URL: https://issues.apache.org/jira/browse/YARN-8468 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Critical > Attachments: YARN-8468.000.patch, YARN-8468.001.patch, > YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, > YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, > YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, > YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, > YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch > > > When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" > to limit the overall size of a container. This applies globally to all > containers, cannot be limited by queue, and is not scheduler dependent. > The goal of this ticket is to allow this value to be set on a per queue basis. > The use case: User has two pools, one for ad hoc jobs and one for enterprise > apps. User wants to limit ad hoc jobs to small containers but allow > enterprise apps to request as many resources as needed. Setting > yarn.scheduler.maximum-allocation-mb sets a default value for maximum > container size for all queues; maximum resources per queue are set with the > “maxContainerResources” queue config value. >
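The enforcement bullet above ("use it during validation and normalization") amounts to clamping each request against the effective per-queue cap at submission and allocation time. A hedged sketch of that check, with hypothetical names and memory simplified to MB:

{code:java}
/**
 * Sketch of request normalization against a per-queue cap. Illustrative
 * only; the real scheduler works on Resource objects, not raw MB values.
 */
public class RequestNormalizer {
  /**
   * Clamp a requested size to the effective maximum for the queue.
   * @throws IllegalArgumentException if oversized requests must be rejected
   */
  public static long normalizeMb(long requestedMb, long queueMaxMb,
                                 boolean rejectOversized) {
    if (requestedMb > queueMaxMb) {
      if (rejectOversized) {
        throw new IllegalArgumentException(
            "requested " + requestedMb + " MB exceeds queue max " + queueMaxMb);
      }
      return queueMaxMb; // normalize down to the cap
    }
    return requestedMb;
  }
}
{code}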
[jira] [Updated] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
[ https://issues.apache.org/jira/browse/YARN-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8809: - Attachment: YARN-8809-YARN-1011.00.patch > Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers > are released. > --- > > Key: YARN-8809 > URL: https://issues.apache.org/jira/browse/YARN-8809 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > Attachments: YARN-8809-YARN-1011.00.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8808) Use aggregate container utilization instead of node utilization to determine resources available for oversubscription
[ https://issues.apache.org/jira/browse/YARN-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-8808: - Description: Resource oversubscription should be bound to the amount of the resources that can be allocated to containers, hence the allocation threshold should be with respect to aggregate container utilization. > Use aggregate container utilization instead of node utilization to determine > resources available for oversubscription > - > > Key: YARN-8808 > URL: https://issues.apache.org/jira/browse/YARN-8808 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-1011 >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > > Resource oversubscription should be bound to the amount of the resources that > can be allocated to containers, hence the allocation threshold should be with > respect to aggregate container utilization. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
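The distinction matters because whole-node utilization includes daemons and other processes outside YARN's control, while the oversubscription budget should come only from resources that containers were promised but are not using. A minimal sketch of that computation; the names and threshold semantics are assumptions, not the YARN-1011 branch API:

{code:java}
/**
 * Sketch: resources available for oversubscription, bounded by aggregate
 * container utilization rather than whole-node utilization.
 */
public class OversubscriptionHeadroom {
  /**
   * @param capacityMb          memory the node advertises to YARN
   * @param containerUsedMb     aggregate memory actually used by containers
   * @param overAllocationRatio fraction of the unused allocation YARN may
   *                            hand out again as OPPORTUNISTIC capacity
   */
  public static long headroomMb(long capacityMb, long containerUsedMb,
                                double overAllocationRatio) {
    long unused = Math.max(0, capacityMb - containerUsedMb);
    return (long) (unused * overAllocationRatio);
  }

  public static void main(String[] args) {
    // 8 GB node; containers are using only 2 GB of it. With a 0.8
    // threshold, up to 4915 MB could be offered for oversubscription.
    System.out.println(headroomMb(8192, 2048, 0.8));
  }
}
{code}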
[jira] [Updated] (YARN-8468) Enable queue based maximum allocation limit and implement it in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Bálint Steinbach updated YARN-8468: - Description: When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit the overall size of a container. This applies globally to all containers, cannot be limited by queue, and is not scheduler dependent. The goal of this ticket is to allow this value to be set on a per queue basis. The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many resources as needed. Setting yarn.scheduler.maximum-allocation-mb sets a default value for maximum container size for all queues; maximum resources per queue are set with the “maxContainerResources” queue config value. Suggested solution: All the infrastructure is already in the code. We need to do the following: * add the setting to the queue properties for all queue types (parent and leaf), this will cover dynamically created queues. * if we set it on the root we override the scheduler setting and we should not allow that. * make sure that the queue resource cap cannot be larger than the scheduler max resource cap in the config. * implement getMaximumResourceCapability(String queueName) in the FairScheduler * implement getMaximumResourceCapability() in both FSParentQueue and FSLeafQueue * expose the setting in the queue information in the RM web UI. * expose the setting in the metrics etc. for the queue. * Enforce the usage of queue based maximum allocation limit was: When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit the overall size of a container. This applies globally to all containers, cannot be limited by queue, and is not scheduler dependent. The goal of this ticket is to allow this value to be set on a per queue basis. The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many resources as needed. Setting yarn.scheduler.maximum-allocation-mb sets a default value for maximum container size for all queues; maximum resources per queue are set with the “maxContainerResources” queue config value. Suggested solution: All the infrastructure is already in the code. We need to do the following: * add the setting to the queue properties for all queue types (parent and leaf), this will cover dynamically created queues. * if we set it on the root we override the scheduler setting and we should not allow that. * make sure that the queue resource cap cannot be larger than the scheduler max resource cap in the config. * implement getMaximumResourceCapability(String queueName) in the FairScheduler * implement getMaximumResourceCapability() in both FSParentQueue and FSLeafQueue * expose the setting in the queue information in the RM web UI. * expose the setting in the metrics etc. for the queue. * write JUnit tests. * update the scheduler documentation.
> Enable queue based maximum allocation limit and implement it in FairScheduler > - > > Key: YARN-8468 > URL: https://issues.apache.org/jira/browse/YARN-8468 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Critical > Attachments: YARN-8468.000.patch, YARN-8468.001.patch, > YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, > YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, > YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, > YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, > YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch > > > When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" > to limit the overall size of a container. This applies globally to all > containers, cannot be limited by queue, and is not scheduler dependent. > The goal of this ticket is to allow this value to be set on a per queue basis. > The use case: User has two pools, one for ad hoc jobs and one for enterprise > apps. User wants to limit ad hoc jobs to small containers but allow > enterprise apps to request as many resources as needed. Setting > yarn.scheduler.maximum-allocation-mb sets a default value for maximum > container size for all queues; maximum resources per queue are set with the > “maxContainerResources” queue config value. > Suggested solution: > All the infrastructure is already in the code. We need to do the following: > * add the setting
[jira] [Updated] (YARN-8468) Enable queue based maximum allocation limit and implement it in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Bálint Steinbach updated YARN-8468: - Description: When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit the overall size of a container. This applies globally to all containers, cannot be limited by queue, and is not scheduler dependent. The goal of this ticket is to allow this value to be set on a per queue basis. The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many resources as needed. Setting yarn.scheduler.maximum-allocation-mb sets a default value for maximum container size for all queues; maximum resources per queue are set with the “maxContainerResources” queue config value. Suggested solution: All the infrastructure is already in the code. We need to do the following: * add the setting to the queue properties for all queue types (parent and leaf), this will cover dynamically created queues. * if we set it on the root we override the scheduler setting and we should not allow that. * make sure that the queue resource cap cannot be larger than the scheduler max resource cap in the config. * implement getMaximumResourceCapability(String queueName) in the FairScheduler * implement getMaximumResourceCapability() in both FSParentQueue and FSLeafQueue * expose the setting in the queue information in the RM web UI. * expose the setting in the metrics etc. for the queue. * write JUnit tests. * update the scheduler documentation. was: When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" to limit the overall size of a container. This applies globally to all containers, cannot be limited by queue, and is not scheduler dependent. The goal of this ticket is to allow this value to be set on a per queue basis. The use case: User has two pools, one for ad hoc jobs and one for enterprise apps. User wants to limit ad hoc jobs to small containers but allow enterprise apps to request as many resources as needed. Setting yarn.scheduler.maximum-allocation-mb sets a default value for maximum container size for all queues; maximum resources per queue are set with the “maxContainerResources” queue config value. Suggested solution: All the infrastructure is already in the code. We need to do the following: * add the setting to the queue properties for all queue types (parent and leaf), this will cover dynamically created queues. * if we set it on the root we override the scheduler setting and we should not allow that. * make sure that the queue resource cap cannot be larger than the scheduler max resource cap in the config. * implement getMaximumResourceCapability(String queueName) in the FairScheduler * implement getMaximumResourceCapability() in both FSParentQueue and FSLeafQueue * expose the setting in the queue information in the RM web UI. * expose the setting in the metrics etc. for the queue. * write JUnit tests. * update the scheduler documentation.
> Enable queue based maximum allocation limit and implement it in FairScheduler > - > > Key: YARN-8468 > URL: https://issues.apache.org/jira/browse/YARN-8468 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Critical > Attachments: YARN-8468.000.patch, YARN-8468.001.patch, > YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, > YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, > YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, > YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, > YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch > > > When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" > to limit the overall size of a container. This applies globally to all > containers, cannot be limited by queue, and is not scheduler dependent. > The goal of this ticket is to allow this value to be set on a per queue basis. > The use case: User has two pools, one for ad hoc jobs and one for enterprise > apps. User wants to limit ad hoc jobs to small containers but allow > enterprise apps to request as many resources as needed. Setting > yarn.scheduler.maximum-allocation-mb sets a default value for maximum > container size for all queues; maximum resources per queue are set with the > “maxContainerResources” queue config value. > > Suggested solution: > > All the infrastructure is already in the code. We need to do the following: >
[jira] [Updated] (YARN-8468) Enable queue based maximum allocation limit and implement it in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Bálint Steinbach updated YARN-8468: - Summary: Enable queue based maximum allocation limit and implement it in FairScheduler (was: Limit container sizes per queue in FairScheduler) > Enable queue based maximum allocation limit and implement it in FairScheduler > - > > Key: YARN-8468 > URL: https://issues.apache.org/jira/browse/YARN-8468 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Critical > Attachments: YARN-8468.000.patch, YARN-8468.001.patch, > YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, > YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, > YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, > YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, > YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch > > > When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" > to limit the overall size of a container. This applies globally to all > containers, cannot be limited by queue, and is not scheduler dependent. > > The goal of this ticket is to allow this value to be set on a per queue basis. > > The use case: User has two pools, one for ad hoc jobs and one for enterprise > apps. User wants to limit ad hoc jobs to small containers but allow > enterprise apps to request as many resources as needed. Setting > yarn.scheduler.maximum-allocation-mb sets a default value for maximum > container size for all queues; maximum resources per queue are set with the > “maxContainerResources” queue config value. > > Suggested solution: > > All the infrastructure is already in the code. We need to do the following: > * add the setting to the queue properties for all queue types (parent and > leaf), this will cover dynamically created queues. > * if we set it on the root we override the scheduler setting and we should > not allow that. > * make sure that the queue resource cap cannot be larger than the scheduler max > resource cap in the config. > * implement getMaximumResourceCapability(String queueName) in the > FairScheduler > * implement getMaximumResourceCapability() in both FSParentQueue and > FSLeafQueue > * expose the setting in the queue information in the RM web UI. > * expose the setting in the metrics etc. for the queue. > * write JUnit tests. > * update the scheduler documentation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8468) Limit container sizes per queue in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Bálint Steinbach updated YARN-8468: - Attachment: YARN-8468.016.patch > Limit container sizes per queue in FairScheduler > > > Key: YARN-8468 > URL: https://issues.apache.org/jira/browse/YARN-8468 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Critical > Attachments: YARN-8468.000.patch, YARN-8468.001.patch, > YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, > YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, > YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, > YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, > YARN-8468.014.patch, YARN-8468.015.patch, YARN-8468.016.patch > > > When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" > to limit the overall size of a container. This applies globally to all > containers, cannot be limited by queue, and is not scheduler dependent. > > The goal of this ticket is to allow this value to be set on a per queue basis. > > The use case: User has two pools, one for ad hoc jobs and one for enterprise > apps. User wants to limit ad hoc jobs to small containers but allow > enterprise apps to request as many resources as needed. Setting > yarn.scheduler.maximum-allocation-mb sets a default value for maximum > container size for all queues; maximum resources per queue are set with the > “maxContainerResources” queue config value. > > Suggested solution: > > All the infrastructure is already in the code. We need to do the following: > * add the setting to the queue properties for all queue types (parent and > leaf), this will cover dynamically created queues. > * if we set it on the root we override the scheduler setting and we should > not allow that. > * make sure that the queue resource cap cannot be larger than the scheduler max > resource cap in the config. > * implement getMaximumResourceCapability(String queueName) in the > FairScheduler > * implement getMaximumResourceCapability() in both FSParentQueue and > FSLeafQueue > * expose the setting in the queue information in the RM web UI. > * expose the setting in the metrics etc. for the queue. > * write JUnit tests. > * update the scheduler documentation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher
[ https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated YARN-8789: -- Description: I recently came across a scenario where an MR ApplicationMaster was failing with an OOM exception. It had many thousands of Mappers and thousands of Reducers. It was noted in the logging that the event-queue of {{AsyncDispatcher}} had a very large number of items in it and was seemingly never decreasing. I started looking at the code and thought it could use some clean up, simplification, and the ability to specify a bounded queue so that any incoming events are throttled until they can be processed. This will protect the ApplicationMaster from a flood of events. Logging Message: Size of event-queue is xxx was: I recently came across a scenario where an MR ApplicationMaster was failing with an OOM exception. It had many thousands of Mappers and thousands of Reducers. It was noted in the logging that the event-queue of {{AsyncDispatcher}} had a very large number of items in it and was seemingly never decreasing. I started looking at the code and thought it could use some clean up, simplification, and the ability to specify a bounded queue so that any incoming events are throttled until they can be processed. This will protect the ApplicationMaster from a flood of events. > Add BoundedQueue to AsyncDispatcher > --- > > Key: YARN-8789 > URL: https://issues.apache.org/jira/browse/YARN-8789 > Project: Hadoop YARN > Issue Type: Improvement > Components: applications >Affects Versions: 3.2.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Major > Attachments: YARN-8789.1.patch, YARN-8789.2.patch, YARN-8789.3.patch > > > I recently came across a scenario where an MR ApplicationMaster was failing > with an OOM exception. It had many thousands of Mappers and thousands of > Reducers. It was noted in the logging that the event-queue of > {{AsyncDispatcher}} had a very large number of items in it and was seemingly > never decreasing. > I started looking at the code and thought it could use some clean up, > simplification, and the ability to specify a bounded queue so that any > incoming events are throttled until they can be processed. This will protect > the ApplicationMaster from a flood of events. > Logging Message: > Size of event-queue is xxx -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
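The throttling described here is simple back-pressure: with a bounded queue, a producer that dispatches events faster than the single dispatcher thread can drain them will block instead of growing the heap without bound. A self-contained sketch of the mechanism (not the actual AsyncDispatcher patch; the class name is hypothetical):

{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Sketch: a bounded event queue so that producers block (are throttled)
 * when the consumer falls behind, protecting the AM from an event flood.
 */
public class BoundedDispatcher<E> {
  private final BlockingQueue<E> eventQueue;

  public BoundedDispatcher(int capacity) {
    // a capacity <= 0 could instead select an unbounded LinkedBlockingQueue
    this.eventQueue = new ArrayBlockingQueue<>(capacity);
  }

  /** Producer side: blocks when the queue is full, applying back-pressure. */
  public void dispatch(E event) throws InterruptedException {
    eventQueue.put(event);
  }

  /** Consumer side: the single dispatcher thread drains events here. */
  public E take() throws InterruptedException {
    return eventQueue.take();
  }
}
{code}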
[jira] [Commented] (YARN-8468) Limit container sizes per queue in FairScheduler
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622355#comment-16622355 ] Wangda Tan commented on YARN-8468: -- Thanks [~bsteinbach], I think YARN-8720 only covers the enforcement path for the allocate call; there are other code paths, like app submission and scheduling requests, that are not covered by YARN-8720. Let's not worry about patch size but make the patch correct. [~bsteinbach], could you update the title/description to outline the changes you have made? > Limit container sizes per queue in FairScheduler > > > Key: YARN-8468 > URL: https://issues.apache.org/jira/browse/YARN-8468 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Critical > Attachments: YARN-8468.000.patch, YARN-8468.001.patch, > YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, > YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, > YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, > YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, > YARN-8468.014.patch, YARN-8468.015.patch > > > When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" > to limit the overall size of a container. This applies globally to all > containers, cannot be limited by queue, and is not scheduler dependent. > > The goal of this ticket is to allow this value to be set on a per queue basis. > > The use case: User has two pools, one for ad hoc jobs and one for enterprise > apps. User wants to limit ad hoc jobs to small containers but allow > enterprise apps to request as many resources as needed. Setting > yarn.scheduler.maximum-allocation-mb sets a default value for maximum > container size for all queues; maximum resources per queue are set with the > “maxContainerResources” queue config value. > > Suggested solution: > > All the infrastructure is already in the code. We need to do the following: > * add the setting to the queue properties for all queue types (parent and > leaf), this will cover dynamically created queues. > * if we set it on the root we override the scheduler setting and we should > not allow that. > * make sure that the queue resource cap cannot be larger than the scheduler max > resource cap in the config. > * implement getMaximumResourceCapability(String queueName) in the > FairScheduler > * implement getMaximumResourceCapability() in both FSParentQueue and > FSLeafQueue > * expose the setting in the queue information in the RM web UI. > * expose the setting in the metrics etc. for the queue. > * write JUnit tests. > * update the scheduler documentation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
Haibo Chen created YARN-8809: Summary: Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released. Key: YARN-8809 URL: https://issues.apache.org/jira/browse/YARN-8809 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Affects Versions: YARN-1011 Reporter: Haibo Chen Assignee: Haibo Chen -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8808) Use aggregate container utilization instead of node utilization to determine resources available for oversubscription
Haibo Chen created YARN-8808: Summary: Use aggregate container utilization instead of node utilization to determine resources available for oversubscription Key: YARN-8808 URL: https://issues.apache.org/jira/browse/YARN-8808 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: YARN-1011 Reporter: Haibo Chen Assignee: Haibo Chen -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8807) FairScheduler crashes RM with oversubscription turned on if an application is killed.
Haibo Chen created YARN-8807: Summary: FairScheduler crashes RM with oversubscription turned on if an application is killed. Key: YARN-8807 URL: https://issues.apache.org/jira/browse/YARN-8807 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler, resourcemanager Affects Versions: YARN-1011 Reporter: Haibo Chen Assignee: Haibo Chen -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8774) Memory leak when CapacityScheduler allocates from reserved container with non-default label
[ https://issues.apache.org/jira/browse/YARN-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622323#comment-16622323 ] Eric Payne commented on YARN-8774: -- [~Tao Yang], The patch and UT look good to me. However, I have one request. The bug does not exist in 2.8, but it does exist in 2.9, and I think we should backport this all the way back to 2.9. Although the patch does apply to 2.9, it does not build: {noformat} [ERROR] /home/ericp/hadoop/source/YARN-8774/branch-2.8/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/allocator/RegularContainerAllocator.java:[744,9] cannot find symbol [ERROR] symbol: class AppPlacementAllocator {noformat} So, before I commit this, can you please provide a branch-2.9 patch? > Memory leak when CapacityScheduler allocates from reserved container with > non-default label > --- > > Key: YARN-8774 > URL: https://issues.apache.org/jira/browse/YARN-8774 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0, 2.8.5 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8774.001.patch > > > The cause is that the RMContainerImpl instance of a reserved container loses its > node label expression: when the scheduler reserves containers for non-default > node-label requests, the instance is wrongly added into > LeafQueue#ignorePartitionExclusivityRMContainers and never removed. > To reproduce this memory leak: > (1) create reserved container > RegularContainerAllocator#doAllocation: create RMContainerImpl instanceA > (nodeLabelExpression="") > LeafQueue#allocateResource: RMContainerImpl instanceA is put into > LeafQueue#ignorePartitionExclusivityRMContainers > (2) allocate from reserved container > RegularContainerAllocator#doAllocation: create RMContainerImpl instanceB > (nodeLabelExpression="test-label") > (3) From now on, RMContainerImpl instanceA will be left in memory (kept in > LeafQueue#ignorePartitionExclusivityRMContainers) until the RM is restarted -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
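The leak pattern above is easy to see in miniature: an entry filed under the wrong node-label key is invisible to a removal that looks under the real label. The following illustrates the bug mechanism only, reduced to a plain map; it is not the YARN-8774 fix:

{code:java}
import java.util.*;

/** Toy model of label-keyed container bookkeeping going stale. */
public class LabelBookkeeping {
  private final Map<String, Set<String>> byLabel = new HashMap<>();

  void add(String label, String containerId) {
    byLabel.computeIfAbsent(label, k -> new HashSet<>()).add(containerId);
  }

  void remove(String label, String containerId) {
    Set<String> s = byLabel.get(label);
    if (s != null) {
      s.remove(containerId);
      if (s.isEmpty()) {
        byLabel.remove(label);
      }
    }
  }

  public static void main(String[] args) {
    LabelBookkeeping b = new LabelBookkeeping();
    b.add("", "container_1");              // reserved under the lost label ""
    b.remove("test-label", "container_1"); // removal uses the real label
    System.out.println(b.byLabel);         // {=[container_1]} -- leaked entry
  }
}
{code}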
[jira] [Updated] (YARN-8789) Add BoundedQueue to AsyncDispatcher
[ https://issues.apache.org/jira/browse/YARN-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated YARN-8789: -- Attachment: YARN-8789.3.patch > Add BoundedQueue to AsyncDispatcher > --- > > Key: YARN-8789 > URL: https://issues.apache.org/jira/browse/YARN-8789 > Project: Hadoop YARN > Issue Type: Improvement > Components: applications >Affects Versions: 3.2.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Major > Attachments: YARN-8789.1.patch, YARN-8789.2.patch, YARN-8789.3.patch > > > I recently came across a scenario where an MR ApplicationMaster was failing > with an OOM exception. It had many thousands of Mappers and thousands of > Reducers. It was noted in the logging that the event-queue of > {{AsyncDispatcher}} had a very large number of items in it and was seemingly > never decreasing. > I started looking at the code and thought it could use some clean up, > simplification, and the ability to specify a bounded queue so that any > incoming events are throttled until they can be processed. This will protect > the ApplicationMaster from a flood of events. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues
[ https://issues.apache.org/jira/browse/YARN-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622281#comment-16622281 ] Jason Lowe commented on YARN-8804: -- Thanks for the report and patch! Nice analysis. A naked volatile here is not going to work when there are multiple threads. It only works if we are swapping in precomputed objects, as is done with the headroom, and not when we're trying to modify them in-place as is done in this patch. Otherwise we can have this classic multithreaded race where we lose information: # blockedHeadroom is null # Thread 1 and Thread 2 both call addBlockedHeadroom at the same time # Thread 1 and Thread 2 both see that blockedHeadroom is null and both decide to create a new, empty instance # Thread 2 races ahead of Thread 1 at this point, storing the new, updated object with its resource information # Thread 1 finally gets around to storing the zero instance to blockedHeadroom, and now we just lost the computation from Thread 2 Since this is trying to compute on an existing object, it should use an AtomicReference where a new Resource object is stored with compareAndSet each time it is updated. This allows threads to detect when they collide with another thread, looping around and trying again when a collision is detected. We could avoid all of the blocked tracking business if the allocation result included the amount that needs to be blocked. The crux of the problem lies in the computation of that amount when the queue returning the queue-skipped assignment result wasn't a leaf queue. I think it would be cleaner if a queue could return an assignment result that not only indicated the allocation was skipped due to queue limits but also how much needs to be reserved as a result of that skipped assignment. Then that amount can be referenced directly as the limit is adjusted up the queue hierarchy rather than constantly subtracting the blocked limit from the parent limit. The result would be less overhead for the normal scheduler loop, as we would only be adjusting when necessary rather than every time. > resourceLimits may be wrongly calculated when leaf-queue is blocked in > cluster with 3+ level queues > --- > > Key: YARN-8804 > URL: https://issues.apache.org/jira/browse/YARN-8804 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Critical > Attachments: YARN-8804.001.patch > > > This problem is due to YARN-4280: a parent queue deducts a child queue's > headroom when the child queue has reached its resource limit and the skipped type > is QUEUE_LIMIT. The resource limits of the deepest parent queue will be correctly > calculated, but for a non-deepest parent queue, its headroom may be much more > than the sum of the reached-limit child queues' headroom, so the resource > limit of a non-deepest parent may be much less than its true value and block > the allocation for later queues. > To reproduce this problem with UT: > (1) Cluster has two nodes, each with node resource <10GB, 10core>, and > 3-level queues as below; among them the max-capacity of "c1" is 10 and the others are > all 100, so the max-capacity of queue "c1" is <2GB, 2core> > {noformat} > Root > / | \ > a b c >10 20 70 > | \ > c1 c2 > 10(max=10) 90 > {noformat} > (2) Submit app1 to queue "c1" and launch am1(resource=<1GB, 1 core>) on nm1 > (3) Submit app2 to queue "b" and launch am2(resource=<1GB, 1 core>) on nm1 > (4) app1 and app2 each ask for one <2GB, 1core> container.
> (5) nm1 does 1 heartbeat > Now queue "c" has a lower capacity percentage than queue "b", so the allocation > sequence will be "a" -> "c" -> "b"; > queue "c1" has reached its queue limit, so requests of app1 will be > pending; > headroom of queue "c1" is <1GB, 1core> (=max-capacity - used); > headroom of queue "c" is <18GB, 18core> (=max-capacity - used); > after allocation for queue "c", the resource limit of queue "b" will be wrongly > calculated as <2GB, 2core>, > and the headroom of queue "b" will be <1GB, 1core> (=resource-limit - used), > so the scheduler won't allocate one container for app2 on nm1 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
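Jason's race walkthrough maps directly onto the standard compareAndSet retry loop: compute a new value from a snapshot and publish it only if no other thread got there first. A minimal sketch of that pattern, with Resource simplified to a memory count; this is an illustration of the technique he describes, not the eventual YARN-8804 patch:

{code:java}
import java.util.concurrent.atomic.AtomicReference;

/**
 * Accumulate into an immutable snapshot behind an AtomicReference,
 * retrying on collision, instead of mutating a shared object in place
 * behind a naked volatile.
 */
public class BlockedHeadroom {
  private final AtomicReference<Long> blockedMb = new AtomicReference<>(0L);

  public void addBlockedMb(long mb) {
    while (true) {
      Long current = blockedMb.get();
      Long updated = current + mb;          // compute a brand-new value
      if (blockedMb.compareAndSet(current, updated)) {
        return;                             // no other thread intervened
      }
      // collision with another thread: loop and retry on the fresh value
    }
  }

  public long get() {
    return blockedMb.get();
  }
}
{code}

The same shape works with real Resource objects: read the current reference, build a new Resource from it, and compareAndSet the new object in, looping on failure.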
[jira] [Commented] (YARN-8806) Enable local staging directory and clean it up when submarine job is submitted
[ https://issues.apache.org/jira/browse/YARN-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622280#comment-16622280 ] Hadoop QA commented on YARN-8806: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 3s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 30s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine in trunk has 2 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 11s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine: The patch generated 8 new + 27 unchanged - 0 fixed = 35 total (was 27) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 5s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 30s{color} | {color:red} hadoop-yarn-submarine in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 46m 22s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.submarine.client.cli.yarnservice.TestYarnServiceRunJobCli | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-8806 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940612/YARN-8806.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 2d1d1676ca0a 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 429a07e | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-YARN-Build/21909/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-submarine-warnings.html | | checkstyle |
[jira] [Commented] (YARN-8805) Automatically convert the launch command to the exec form when using entrypoint support
[ https://issues.apache.org/jira/browse/YARN-8805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622208#comment-16622208 ] Eric Yang commented on YARN-8805: - This has been documented in https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/Examples.md . The formatting seems a little off on Apache website. > Automatically convert the launch command to the exec form when using > entrypoint support > --- > > Key: YARN-8805 > URL: https://issues.apache.org/jira/browse/YARN-8805 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Shane Kumpf >Priority: Major > Labels: Docker > > When {{YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE}} is true, and a > launch command is provided, it is expected that the launch command is > provided by the user in exec form. > For example: > {code:java} > "/usr/bin/sleep 6000"{code} > must be changed to: > {code}"/usr/bin/sleep,6000"{code} > If this is not done, the container will never start and will be in a Created > state. We should automatically do this conversion vs making the user > understand this nuance of using the entrypoint support. Docs should be > updated to reflect this change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
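The conversion the description asks for is mechanical for simple commands: split the shell-form string on whitespace and join the pieces with commas. A naive sketch follows; real handling would need shell quoting rules, which is presumably why the nuance trips users up today:

{code:java}
/** Sketch of shell-form to exec-form launch command conversion. */
public class ExecForm {
  public static String toExecForm(String launchCommand) {
    // split on runs of whitespace; does NOT handle quoted arguments
    return String.join(",", launchCommand.trim().split("\\s+"));
  }

  public static void main(String[] args) {
    System.out.println(toExecForm("/usr/bin/sleep 6000")); // /usr/bin/sleep,6000
  }
}
{code}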
[jira] [Updated] (YARN-8806) Enable local staging directory and clean it up when submarine job is submitted
[ https://issues.apache.org/jira/browse/YARN-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zac Zhou updated YARN-8806: --- Attachment: YARN-8806.002.patch > Enable local staging directory and clean it up when submarine job is submitted > -- > > Key: YARN-8806 > URL: https://issues.apache.org/jira/browse/YARN-8806 > Project: Hadoop YARN > Issue Type: Sub-task > Environment: In the /tmp dir, there are launch scripts which are not > cleaned up as follows: > -rw-r--r-- 1 hadoop netease 1100 Sep 18 10:46 > PRIMARY_WORKER-launch-script8635233314077649086.sh > -rw-r--r-- 1 hadoop netease 1100 Sep 18 10:46 > WORKER-launch-script129488020578466938.sh > -rw-r--r-- 1 hadoop netease 1028 Sep 18 10:46 > PS-launch-script471092031021738136.sh >Reporter: Zac Zhou >Assignee: Zac Zhou >Priority: Major > Attachments: YARN-8806.001.patch, YARN-8806.002.patch > > > YarnServiceJobSubmitter.generateCommandLaunchScript creates container launch > scripts in the local filesystem. Container launch scripts would be uploaded > to hdfs staging dir, but would not be deleted after the job is submitted -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
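A cleanup along the lines the description suggests: write launch scripts into a dedicated local staging directory and remove them once the HDFS upload has happened, success or not. This is a sketch under assumptions; uploadToHdfs is a placeholder, not the real YarnServiceJobSubmitter API:

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: stage a launch script locally, upload it, then clean up. */
public class LocalStaging {
  static void stageAndUpload(String scriptBody) throws IOException {
    Path dir = Files.createTempDirectory("submarine-staging");
    Path script = Files.createTempFile(dir, "launch-script", ".sh");
    try {
      Files.write(script, scriptBody.getBytes());
      uploadToHdfs(script); // placeholder for the real staging-dir upload
    } finally {
      Files.deleteIfExists(script); // clean up regardless of outcome
      Files.deleteIfExists(dir);
    }
  }

  private static void uploadToHdfs(Path p) { /* omitted in this sketch */ }
}
{code}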
[jira] [Commented] (YARN-8801) java doc comments in docker-util.h is confusing
[ https://issues.apache.org/jira/browse/YARN-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622193#comment-16622193 ] Eric Yang commented on YARN-8801: - +1 > java doc comments in docker-util.h is confusing > --- > > Key: YARN-8801 > URL: https://issues.apache.org/jira/browse/YARN-8801 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Minor > Labels: Docker > Attachments: YARN-8801.001.patch > > > {code:java} > /** > + * Get the Docker exec command line string. The function will verify that > the params file is meant for the exec command. > + * @param command_file File containing the params for the Docker start > command > + * @param conf Configuration struct containing the container-executor.cfg > details > + * @param out Buffer to fill with the exec command > + * @param outlen Size of the output buffer > + * @return Return code with 0 indicating success and non-zero codes > indicating error > + */ > +int get_docker_exec_command(const char* command_file, const struct > configuration* conf, args *args);{code} > The method param list documents out and outlen, which don't match the > signature, and the description for the args param is missing. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8725) Submarine job staging directory has a lot of useless PRIMARY_WORKER-launch-script-***.sh scripts when submitting a job multiple times
[ https://issues.apache.org/jira/browse/YARN-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622159#comment-16622159 ] Zac Zhou edited comment on YARN-8725 at 9/20/18 2:51 PM: - [~leftnoteasy], [~tangzhankun]. Thanks a lot for your efforts to move this sub task forward. I just found there were files in the local file system which were not cleaned up as well. I just opened a new ticket YARN-8806 and submitted a patch. It would be nice if you could look into it as well ~ was (Author: yuan_zac): [~leftnoteasy], [~tangzhankun]. Thanks a lot for your efforts to move this sub task forward. I just found there were files in the local file system which were not cleaned up as well. I just opened a new ticket [YARN-8806|https://issues.apache.org/jira/browse/YARN-8806] and submit a patch. It would be nice if you could look into it as well ~ > Submarine job staging directory has a lot of useless > PRIMARY_WORKER-launch-script-***.sh scripts when submitting a job multiple > times > -- > > Key: YARN-8725 > URL: https://issues.apache.org/jira/browse/YARN-8725 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zac Zhou >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8725-trunk.001.patch > > > Submarine jobs upload core-site.xml, hdfs-site.xml, job.info and > PRIMARY_WORKER-launch-script.sh to staging dir. > The core-site.xml, hdfs-site.xml and job.info would be overwritten if a job > is submitted multiple times. > But PRIMARY_WORKER-launch-script.sh would not be overwritten, as it has > random numbers in its name. > The files in the staging dir are as follows: > {code:java} > -rw-r- 2 hadoop hdfs 580 2018-08-17 10:11 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script6954941665090337726.sh > -rw-r- 2 hadoop hdfs 580 2018-08-17 10:02 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script7037369696166769734.sh > -rw-r- 2 hadoop hdfs 580 2018-08-17 10:06 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script8047707294763488040.sh > -rw-r- 2 hadoop hdfs 15225 2018-08-17 18:46 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script8122565781159446375.sh > -rw-r- 2 hadoop hdfs 580 2018-08-16 20:48 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script8598604480700049845.sh > -rw-r- 2 hadoop hdfs 580 2018-08-17 14:53 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script971703616848859353.sh > -rw-r- 2 hadoop hdfs 580 2018-08-17 10:16 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script990214235580089093.sh > -rw-r- 2 hadoop hdfs 8815 2018-08-27 15:54 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/core-site.xml > -rw-r- 2 hadoop hdfs 11583 2018-08-27 15:54 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/hdfs-site.xml > -rw-rw-rw- 2 hadoop hdfs 846 2018-08-22 10:56 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/job.info > {code} > > We should stop the staging dir from growing or have a way to clean it up -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8725) Submarine job staging directory has a lot of useless PRIMARY_WORKER-launch-script-***.sh scripts when submitting a job multiple times
[ https://issues.apache.org/jira/browse/YARN-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622159#comment-16622159 ] Zac Zhou commented on YARN-8725: [~leftnoteasy], [~tangzhankun]. Thanks a lot for your efforts to move this sub task forward. I just found there were files in the local file system which were not cleaned up as well. I just opened a new ticket [YARN-8806|https://issues.apache.org/jira/browse/YARN-8806] and submit a patch. It would be nice if you could look into it as well ~ > Submarine job staging directory has a lot of useless > PRIMARY_WORKER-launch-script-***.sh scripts when submitting a job multiple > times > -- > > Key: YARN-8725 > URL: https://issues.apache.org/jira/browse/YARN-8725 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zac Zhou >Assignee: Zhankun Tang >Priority: Major > Attachments: YARN-8725-trunk.001.patch > > > Submarine jobs upload core-site.xml, hdfs-site.xml, job.info and > PRIMARY_WORKER-launch-script.sh to staging dir. > The core-site.xml, hdfs-site.xml and job.info would be overwritten if a job > is submitted multiple times. > But PRIMARY_WORKER-launch-script.sh would not be overwritten, as it has > random numbers in its name. > The files in the staging dir are as follows: > {code:java} > -rw-r- 2 hadoop hdfs 580 2018-08-17 10:11 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script6954941665090337726.sh > -rw-r- 2 hadoop hdfs 580 2018-08-17 10:02 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script7037369696166769734.sh > -rw-r- 2 hadoop hdfs 580 2018-08-17 10:06 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script8047707294763488040.sh > -rw-r- 2 hadoop hdfs 15225 2018-08-17 18:46 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script8122565781159446375.sh > -rw-r- 2 hadoop hdfs 580 2018-08-16 20:48 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script8598604480700049845.sh > -rw-r- 2 hadoop hdfs 580 2018-08-17 14:53 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script971703616848859353.sh > -rw-r- 2 hadoop hdfs 580 2018-08-17 10:16 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script990214235580089093.sh > -rw-r- 2 hadoop hdfs 8815 2018-08-27 15:54 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/core-site.xml > -rw-r- 2 hadoop hdfs 11583 2018-08-27 15:54 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/hdfs-site.xml > -rw-rw-rw- 2 hadoop hdfs 846 2018-08-22 10:56 > hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/job.info > {code} > > We should stop the staging dir from growing or have a way to clean it up -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8806) Enable local staging directory and clean it up when submarine job is submitted
[ https://issues.apache.org/jira/browse/YARN-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622156#comment-16622156 ] Hadoop QA commented on YARN-8806: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} YARN-8806 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-8806 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940604/YARN-8806.001.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21908/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Enable local staging directory and clean it up when submarine job is submitted > -- > > Key: YARN-8806 > URL: https://issues.apache.org/jira/browse/YARN-8806 > Project: Hadoop YARN > Issue Type: Sub-task > Environment: In the /tmp dir, there are launch scripts which are not > cleaned up as follows: > -rw-r--r-- 1 hadoop netease 1100 Sep 18 10:46 > PRIMARY_WORKER-launch-script8635233314077649086.sh > -rw-r--r-- 1 hadoop netease 1100 Sep 18 10:46 > WORKER-launch-script129488020578466938.sh > -rw-r--r-- 1 hadoop netease 1028 Sep 18 10:46 > PS-launch-script471092031021738136.sh >Reporter: Zac Zhou >Assignee: Zac Zhou >Priority: Major > Attachments: YARN-8806.001.patch > > > YarnServiceJobSubmitter.generateCommandLaunchScript creates container launch > scripts in the local filesystem. Container launch scripts would be uploaded > to hdfs staging dir, but would not be deleted after the job is submitted -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8806) Enable local staging directory and clean it up when submarine job is submitted
Zac Zhou created YARN-8806: -- Summary: Enable local staging directory and clean it up when submarine job is submitted Key: YARN-8806 URL: https://issues.apache.org/jira/browse/YARN-8806 Project: Hadoop YARN Issue Type: Sub-task Environment: In the /tmp dir, there are launch scripts which are not cleaned up as follows: -rw-r--r-- 1 hadoop netease 1100 Sep 18 10:46 PRIMARY_WORKER-launch-script8635233314077649086.sh -rw-r--r-- 1 hadoop netease 1100 Sep 18 10:46 WORKER-launch-script129488020578466938.sh -rw-r--r-- 1 hadoop netease 1028 Sep 18 10:46 PS-launch-script471092031021738136.sh Reporter: Zac Zhou Assignee: Zac Zhou YarnServiceJobSubmitter.generateCommandLaunchScript creates container launch scripts in the local filesystem. Container launch scripts would be uploaded to hdfs staging dir, but would not be deleted after the job is submitted -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8800) Updated documentation of Submarine with latest examples.
[ https://issues.apache.org/jira/browse/YARN-8800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622110#comment-16622110 ] Sunil Govindan commented on YARN-8800: -- Hi [~leftnoteasy], could you please help with this ticket, as it is marked for the 3.2 release? > Updated documentation of Submarine with latest examples. > > > Key: YARN-8800 > URL: https://issues.apache.org/jira/browse/YARN-8800 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Critical > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8769) [Submarine] Allow user to specify customized quicklink(s) when submit Submarine job
[ https://issues.apache.org/jira/browse/YARN-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622108#comment-16622108 ] Sunil Govindan commented on YARN-8769: -- +1. Committing shortly. > [Submarine] Allow user to specify customized quicklink(s) when submit > Submarine job > --- > > Key: YARN-8769 > URL: https://issues.apache.org/jira/browse/YARN-8769 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-8769.001.patch, YARN-8769.002.patch, > YARN-8769.003.patch, YARN-8769.004.patch > > > This will be helpful when a user submits a job and some links need to be shown > on the YARN UI2 (service page). For example, a user can specify a quick link to > the Zeppelin notebook UI when a Zeppelin notebook is launched. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4858) start-yarn and stop-yarn scripts to support timeline and sharedcachemanager
[ https://issues.apache.org/jira/browse/YARN-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621948#comment-16621948 ] Hadoop QA commented on YARN-4858: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 26m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 42s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 4s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 55s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} shellcheck {color} | {color:red} 0m 0s{color} | {color:red} The patch generated 2 new + 19 unchanged - 0 fixed = 21 total (was 19) {color} | | {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 10s{color} | {color:green} There were no new shelldocs issues. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 27s{color} | {color:green} hadoop-yarn in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 47s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 61m 55s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:a716388 | | JIRA Issue | YARN-4858 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12795001/YARN-4858-branch-2.001.patch | | Optional Tests | dupname asflicense mvnsite unit shellcheck shelldocs | | uname | Linux 9c382396e72c 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-2 / 0dd6861 | | maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) | | shellcheck | v0.4.7 | | shellcheck | https://builds.apache.org/job/PreCommit-YARN-Build/21906/artifact/out/diff-patch-shellcheck.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21906/testReport/ | | Max. process+thread count | 75 (vs. 
ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn U: hadoop-yarn-project/hadoop-yarn | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21906/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > start-yarn and stop-yarn scripts to support timeline and sharedcachemanager > --- > > Key: YARN-4858 > URL: https://issues.apache.org/jira/browse/YARN-4858 > Project: Hadoop YARN > Issue Type: Improvement > Components: scripts >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Labels: oct16-easy > Attachments: YARN-4858-001.patch, YARN-4858-branch-2.001.patch > > > The start-yarn and stop-yarn scripts don't have any (even commented-out) > support for the timeline server and sharedcachemanager. > Proposed: > * bash and cmd start-yarn scripts have commented-out start actions > * stop-yarn scripts stop the servers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8805) Automatically convert the launch command to the exec form when using entrypoint support
Shane Kumpf created YARN-8805: - Summary: Automatically convert the launch command to the exec form when using entrypoint support Key: YARN-8805 URL: https://issues.apache.org/jira/browse/YARN-8805 Project: Hadoop YARN Issue Type: Sub-task Reporter: Shane Kumpf When {{YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE}} is true and a launch command is provided, the user is expected to supply the launch command in exec form. For example: {code:java} "/usr/bin/sleep 6000"{code} must be changed to: {code}"/usr/bin/sleep,6000"{code} If this is not done, the container will never start and will remain in the Created state. We should perform this conversion automatically rather than requiring the user to understand this nuance of the entrypoint support. The docs should be updated to reflect this change. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
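A conversion along these lines could be as simple as tokenizing on whitespace and re-joining with commas. The sketch below is illustrative only, not the eventual YARN-8805 implementation; note that naive whitespace splitting breaks quoted arguments containing spaces, so a real implementation would need shell-style tokenization:

{code:java}
/**
 * Converts a space-delimited launch command into the comma-delimited
 * exec form expected when the Docker run override is disabled.
 * Example: "/usr/bin/sleep 6000" becomes "/usr/bin/sleep,6000".
 */
public static String toExecForm(String launchCommand) {
  if (launchCommand == null || launchCommand.trim().isEmpty()) {
    return launchCommand; // nothing to convert
  }
  return String.join(",", launchCommand.trim().split("\\s+"));
}
{code}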
[jira] [Commented] (YARN-7834) [GQ] Rebalance queue configuration for load-balancing and locality affinities
[ https://issues.apache.org/jira/browse/YARN-7834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621935#comment-16621935 ] Hadoop QA commented on YARN-7834: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 40s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 24s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 25s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 25s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 33s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 25 new + 0 unchanged - 0 fixed = 25 total (was 0) {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 27s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 4m 9s{color} | {color:red} patch has errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 26s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 27s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 5 new + 4 unchanged - 0 fixed = 9 total (was 4) {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 27s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 23s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 43m 8s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-7834 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910823/YARN-7834.v1.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle | | uname | Linux 18e1bafc0b0e 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 7ad27e9 | | maven | version: Apache Maven 3.3.9 | | Default Java |