[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525713#comment-16525713 ]

Sunil Govindan commented on YARN-8379:
--------------------------------------

Thanks [~Zian Chen]. Latest patch looks good to me. +1

> Add an option to allow Capacity Scheduler preemption to balance satisfied queues
> --------------------------------------------------------------------------------
>
> Key: YARN-8379
> URL: https://issues.apache.org/jira/browse/YARN-8379
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Wangda Tan
> Assignee: Zian Chen
> Priority: Major
> Attachments: YARN-8379.001.patch, YARN-8379.002.patch, YARN-8379.003.patch, YARN-8379.004.patch, YARN-8379.005.patch, YARN-8379.006.patch, ericpayne.confs.tgz
>
> The existing capacity scheduler only supports preemption for an underutilized
> queue to reach its guaranteed resource. In addition to that, there's a
> requirement to get better balance between queues when all of them reach
> guaranteed resource but with different fairness resource.
> An example is 3 queues with capacity queue_a = 30%, queue_b = 30%, queue_c
> = 40%. At time T, queue_a is using 30% and queue_b is using 70%. Existing
> scheduler preemption won't happen. But this is unfair to queue_a, since
> queue_a has the same guaranteed resources as queue_b.
> Before YARN-5864, the capacity scheduler did additional preemption to balance
> queues. We changed the logic since it could preempt too many containers
> between queues when all queues are satisfied.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
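To make the queue_a / queue_b / queue_c example in the issue description concrete, the following is a minimal sketch of one way a balanced target could be computed. It assumes that capacity nobody is using gets shared among queues with unmet demand in proportion to their guaranteed capacity. That rule, the class name `QueueBalanceSketch`, the method `idealShares`, and the demand figures (queue_a wanting 60%, queue_c idle) are all assumptions made for illustration; the description does not state them, and this is not the Capacity Scheduler's actual algorithm.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is not the Capacity Scheduler's code.
class QueueBalanceSketch {
  private static final double EPS = 1e-9;

  /**
   * Returns a target share (percent of the cluster) per queue.
   * Every queue first receives min(guaranteed, demand). Capacity that is
   * still unassigned is then offered, round by round, to queues whose demand
   * is not yet met, weighted by their guaranteed capacity.
   * Both maps must contain the same queue names.
   */
  static Map<String, Double> idealShares(Map<String, Double> guaranteed,
                                         Map<String, Double> demand) {
    Map<String, Double> target = new LinkedHashMap<>();
    double unassigned = 100.0;
    for (String q : guaranteed.keySet()) {
      double share = Math.min(guaranteed.get(q), demand.get(q));
      target.put(q, share);
      unassigned -= share;
    }
    while (unassigned > EPS) {
      // Sum the guaranteed capacity of queues that still want more.
      double weight = 0.0;
      for (String q : guaranteed.keySet()) {
        if (demand.get(q) - target.get(q) > EPS) {
          weight += guaranteed.get(q);
        }
      }
      if (weight == 0.0) {
        break; // no queue with a guarantee wants more
      }
      double handedOut = 0.0;
      for (String q : guaranteed.keySet()) {
        double want = demand.get(q) - target.get(q);
        if (want > EPS) {
          double offer = unassigned * guaranteed.get(q) / weight;
          double grant = Math.min(offer, want);
          target.put(q, target.get(q) + grant);
          handedOut += grant;
        }
      }
      unassigned -= handedOut;
    }
    return target;
  }

  public static void main(String[] args) {
    Map<String, Double> guaranteed = new LinkedHashMap<>();
    guaranteed.put("queue_a", 30.0);
    guaranteed.put("queue_b", 30.0);
    guaranteed.put("queue_c", 40.0);

    Map<String, Double> demand = new LinkedHashMap<>();
    demand.put("queue_a", 60.0); // assumed: 30% used plus 30% pending
    demand.put("queue_b", 70.0); // from the description: using 70%
    demand.put("queue_c", 0.0);  // assumed idle

    System.out.println(idealShares(guaranteed, demand));
  }
}
```

Under these assumed demands the sketch gives queue_a and queue_b a target of 50% each, so balancing would move 20 percentage points from queue_b toward queue_a even though both queues already hold their guaranteed 30%.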
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525676#comment-16525676 ]

Zian Chen commented on YARN-8379:
---------------------------------

Quickly checked the failed UT, TestAMRestart.testPreemptedAMRestartOnRMRestart[FAIR]: it passes in my local environment and does not hit the timeout failure.
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525636#comment-16525636 ]

genericqa commented on YARN-8379:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 44s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 31m 23s | trunk passed |
| +1 | compile | 0m 54s | trunk passed |
| +1 | checkstyle | 0m 16s | trunk passed |
| +1 | mvnsite | 0m 55s | trunk passed |
| +1 | shadedclient | 13m 25s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 28s | trunk passed |
| +1 | javadoc | 0m 31s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 50s | the patch passed |
| +1 | compile | 0m 50s | the patch passed |
| +1 | javac | 0m 50s | the patch passed |
| +1 | checkstyle | 0m 12s | the patch passed |
| +1 | mvnsite | 0m 52s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 16s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 41s | the patch passed |
| +1 | javadoc | 0m 31s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 69m 27s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 137m 44s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8379 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12929459/YARN-8379.006.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux a1d04630eec8 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / fbaff36 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/21131/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21131/testReport/ |
| Max. process+thread count | 853 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/h
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525476#comment-16525476 ]

Zian Chen commented on YARN-8379:
---------------------------------

[~leftnoteasy] [~sunilg], thanks for the comments. I just updated the patch to fix curCandidates and refactored all selectors to use addToPreemptMap, which now updates both toPreempt and curCandidates. I also fixed other minor issues. Could you help review it? Thanks!
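The refactoring described in the comment above routes every selector through addToPreemptMap so that toPreempt and curCandidates cannot drift apart. A minimal sketch of that idea follows. The patch itself is not quoted in this thread, so the signature shown here, the class name `PreemptMapSketch`, and the use of plain strings for application and container IDs are assumptions for illustration, not the actual CapacitySchedulerPreemptionUtils code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; the real helper works on YARN types, not strings.
class PreemptMapSketch {

  /**
   * Records a container in both maps in one place, so callers cannot
   * update one map and forget the other. Duplicate checks are assumed
   * to have happened before this method is called.
   */
  static void addToPreemptMap(Map<String, Set<String>> toPreempt,
                              Map<String, Set<String>> curCandidates,
                              String appId, String containerId) {
    toPreempt.computeIfAbsent(appId, k -> new HashSet<>()).add(containerId);
    curCandidates.computeIfAbsent(appId, k -> new HashSet<>()).add(containerId);
  }

  public static void main(String[] args) {
    Map<String, Set<String>> toPreempt = new HashMap<>();
    Map<String, Set<String>> curCandidates = new HashMap<>();
    addToPreemptMap(toPreempt, curCandidates, "app_1", "container_1");
    addToPreemptMap(toPreempt, curCandidates, "app_1", "container_2");
    // Both maps hold the same two containers for app_1.
    System.out.println(toPreempt.equals(curCandidates));
  }
}
```

Keeping the double update inside one helper means a new selector only needs to call the helper, which is the maintenance benefit raised in the review.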
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525328#comment-16525328 ]

Wangda Tan commented on YARN-8379:
----------------------------------

bq. we could definitely make a method inside PreemptionCandidatesSelector, and call it explicitly to reset curCandidates per round, but this way it makes the code even harder to read. Any better suggestions here?

Can we simply create a new curCandidates map inside {{selectCandidates}} for each selector?

bq. This test case was intend to demonstrate selected candidates will be actually killed after custom timeout was reached. This part of code is the intention.

What I can see from the UT is that queue1 gets all the containers (39G) and queue2 asks for a 4G container. After the wait, the 4G of containers is preempted from queue1. I think our purpose is: both queue1 and queue2 are overutilized, we need to balance resources from queue1 to queue2, and containers from queue1 are preempted only after X secs. Correct? The test could be written along the lines of {{testPreemptionToBalanceUsedPlusPendingLessThanGuaranteed}}.
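The suggestion above, creating the curCandidates map inside selectCandidates, can be sketched as follows. The point is that a selector instance reused across several editSchedule rounds returns only what it picked in the current call, because the map is a local variable rather than a field. The class `FreshCandidatesSketch`, the nested `SketchSelector`, its parameters, and the string IDs are hypothetical stand-ins; the real method takes YARN types and different arguments.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not the real PreemptionCandidatesSelector.
class FreshCandidatesSketch {

  static class SketchSelector {
    /**
     * Adds the wanted containers to toPreempt and returns only the ones
     * this call actually added. curCandidates is created per call, so
     * nothing from an earlier round can leak into the result.
     */
    Map<String, Set<String>> selectCandidates(Map<String, Set<String>> toPreempt,
                                              String appId, List<String> wanted) {
      Map<String, Set<String>> curCandidates = new HashMap<>();
      for (String containerId : wanted) {
        // Set.add returns false if an earlier selector already picked it.
        boolean added = toPreempt
            .computeIfAbsent(appId, k -> new HashSet<>())
            .add(containerId);
        if (added) {
          curCandidates.computeIfAbsent(appId, k -> new HashSet<>())
              .add(containerId);
        }
      }
      return curCandidates;
    }
  }

  public static void main(String[] args) {
    SketchSelector selector = new SketchSelector(); // same instance for both rounds

    Map<String, Set<String>> round1 = new HashMap<>();
    System.out.println(selector.selectCandidates(round1, "app_1",
        Arrays.asList("c1", "c2")));

    Map<String, Set<String>> round2 = new HashMap<>();
    System.out.println(selector.selectCandidates(round2, "app_1",
        Arrays.asList("c3")));
  }
}
```

In the second round the returned map contains only c3, which is the behaviour the failing surgical-preemption test described elsewhere in this thread depends on.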
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525285#comment-16525285 ]

Sunil Govindan commented on YARN-8379:
--------------------------------------

Hi [~Zian Chen]

A few more comments:
# In {{preemptOrkillSelectedContainerAfterWait}}, could we avoid computing toPreemptCount? Instead we can use toPreempt.size or something similar. Ideally the number of containers in toPreempt and toPreemptPerSelector should be the same.
# {quote} we should give a clean curCandidates HashMap every time we call editSchedule, otherwise like this UT, we call editschedule multiple times but the selector remain the same instance {quote}
To me, this is more likely a UT bug. As per the semantics, we do not need to pass curCandidates and can rather consider the return value alone.
# In line with the above comments, could we rename updateCurCandidates to updatePerSelectorCandidates, and curCandidates to something similar?
# Maybe a cleaner solution is to handle curCandidates map updates inside the CapacitySchedulerPreemptionUtils#addToPreemptMap method. All duplicate checks are done before this method is called, so we can just add the container to curCandidates inside this method. That avoids a lot of external handling, which could cause more bugs later for new selectors etc.
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524608#comment-16524608 ]

Zian Chen commented on YARN-8379:
---------------------------------

Hi [~leftnoteasy], thanks for the comments.
# This test case was intended to demonstrate that selected candidates are actually killed after the custom timeout is reached. This part of the code shows the intention:
{code:java}
editPolicy.editSchedule();
Assert.assertEquals(4, editPolicy.getToPreemptContainers().size());

// check live containers immediately, nothing has happened yet
Assert.assertEquals(39, schedulerApp1.getLiveContainers().size());
Thread.sleep(20*1000);

// Call editSchedule again: selected containers are killed
editPolicy.editSchedule();
waitNumberOfLiveContainersFromApp(schedulerApp1, 35);
{code}
# I totally understand your suggestion here. Actually, I had defined a private member variable inside interface PreemptionCandidatesSelector so that we don't need to pass curCandidates for every selector instance. However, TestCapacitySchedulerSurgicalPreemption#testPriorityPreemptionFromHighestPriorityQueueAndOldestContainer() then failed, which helped me realize that curCandidates is a "reinitialize per round" thing rather than a "per selector" thing. We should give a clean curCandidates HashMap every time we call editSchedule. Otherwise, as in this UT, we call editSchedule multiple times while the selector remains the same instance, so curCandidates still contains the previously selected containers the second time we call editSchedule. We could definitely add a method to PreemptionCandidatesSelector and call it explicitly to reset curCandidates per round, but that makes the code even harder to read. Any better suggestions here?
# Is your suggestion to not update toPreempt on the fly but only update curCandidates, and then update toPreempt after each round, right after selectCandidates finishes? If that is the correct understanding, I think that from a time complexity point of view the two strategies should be the same, since we always need to go through each temporarily selected candidate and add it to toPreempt with a duplicate check. It does help to maintain the status of selected containers in one place (curCandidates) and avoid inconsistency. We can discuss this part further.

I'll address the asflicense issue with the next patch. Thanks!
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524583#comment-16524583 ]

Wangda Tan commented on YARN-8379:
----------------------------------

[~Zian Chen],

Thanks for updating the patch. A few comments:

1) testPreemptionToBalanceWithCustomTimeout is better moved to a separate class (maybe something like TestCapacitySchedulerQueueBalancePreemption). The test does not look like it is testing this feature; could you check it? I might have misunderstood what you did here.

2) For the interface of {{selectCandidates}}, I think we can avoid passing curCandidates, correct? According to the semantics of curCandidates, it should be the candidates selected *within the selector*.

An additional comment:
- Now all selectors need to update two maps, curCandidates and selectedCandidates. This causes confusion, and developers could forget to update both of them in some cases. Instead of doing this, I think we should refactor this part of the code to simplify the logic. This can be done in a separate JIRA. [~Zian Chen], could you create a JIRA for this?
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524550#comment-16524550 ]

genericqa commented on YARN-8379:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 24m 37s | trunk passed |
| +1 | compile | 0m 41s | trunk passed |
| +1 | checkstyle | 0m 10s | trunk passed |
| +1 | mvnsite | 0m 38s | trunk passed |
| +1 | shadedclient | 10m 23s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 6s | trunk passed |
| +1 | javadoc | 0m 23s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 45s | the patch passed |
| +1 | compile | 0m 41s | the patch passed |
| +1 | javac | 0m 41s | the patch passed |
| +1 | checkstyle | 0m 8s | the patch passed |
| +1 | mvnsite | 0m 37s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 45s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 10s | the patch passed |
| +1 | javadoc | 0m 24s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 68m 59s | hadoop-yarn-server-resourcemanager in the patch passed. |
| -1 | asflicense | 0m 19s | The patch generated 1 ASF License warnings. |
| | | 122m 20s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8379 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12929299/YARN-8379.005.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 5753e1ea0879 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 62d83ca |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21124/testReport/ |
| asflicense | https://builds.apache.org/job/PreCommit-YARN-Build/21124/artifact/out/patch-asflicense-problems.txt |
| Max. process+thread count | 883 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21124/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://y
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524393#comment-16524393 ]

Zian Chen commented on YARN-8379:
---------------------------------

[~sunilg], thanks for the comments. Here are the updates in the latest patch.
# I fixed the variable naming as you suggested.
# For the failed UT, I also tuned the queue config to avoid the cap issue.
# I added a data structure named "toPreemptPerSelector". The reason we need this along with curCandidates is that in ProportionalCapacityPreemptionPolicy#preemptOrkillSelectedContainerAfterWait, when we check each container to see whether it can be killed, we need to get the custom timeout interval based on the selector, so we need a mapping between each list of candidates and its selector. I also added comments to help people understand the logic.
# I added a surgical UT to make sure the custom timeout works as expected.

Could you help review the latest patch? Thanks!
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524263#comment-16524263 ]

Sunil Govindan commented on YARN-8379:
--------------------------------------

Thanks [~eepayne]. I will have the bandwidth to check this, so I'll help with it.

A few comments on this patch:
# I think it's better to rename PREEMPTION_TO_BALANCE_QUEUES_AFTER_SATISFIED to PREEMPTION_TO_BALANCE_QUEUES_BEYOND_GUARANTEED, and to rename similar variables accordingly.
# Likewise, rename PREEMPTION_TO_BALANCE_QUEUES_AFTER_SATISFIED_MAX_WAIT_BEFORE_KILL to MAX_WAIT_BEFORE_KILL_FOR_QUEUE_BALANCE_PREEMPTION, along with other similar variables.
# Could we move this one as the list selector?
{code:java}
if (isPreemptionToBalanceRequired) {
  PreemptionCandidatesSelector selector = new FifoCandidatesSelector(this,
      false, true);
  selector.setMaximumKillWaitTime(maximumKillWaitTimeForPreemptionToQueueBalance);
  candidatesSelectionPolicies.add(selector);
}
{code}
# I have some doubt about the code below. The new call is:
{code:java}
curCandidates = selector.selectCandidates(toPreempt, curCandidates, clusterResources, totalPreemptionAllowed);
{code}
whereas earlier it was:
{code:java}
toPreempt = selector.selectCandidates(toPreempt, clusterResources, totalPreemptionAllowed);
{code}
Earlier, the selector knew what the selectedCandidates from toPreempt were. Now I don't think we will get this.
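The snippet reviewed above gives the queue-balancing selector its own kill wait time through setMaximumKillWaitTime. A minimal sketch of how such a per-selector wait could be honoured when deciding which containers to kill is shown below. Only the idea of a per-selector maximum wait and the name toPreemptPerSelector come from this thread; the class `PerSelectorKillWaitSketch`, the method `containersDueForKill`, the string container IDs, and the 15 s / 300 s values are assumptions for illustration, not the policy's actual code or defaults.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not ProportionalCapacityPreemptionPolicy itself.
class PerSelectorKillWaitSketch {

  static class SketchSelector {
    final long maxKillWaitMs;

    SketchSelector(long maxKillWaitMs) {
      this.maxKillWaitMs = maxKillWaitMs;
    }
  }

  /**
   * Returns the containers whose wait has expired at nowMs. A container seen
   * for the first time is only recorded; it becomes due once the wait time
   * of the selector that picked it has passed.
   */
  static List<String> containersDueForKill(
      Map<SketchSelector, Set<String>> toPreemptPerSelector,
      Map<String, Long> firstSelectedAtMs, long nowMs) {
    List<String> due = new ArrayList<>();
    for (Map.Entry<SketchSelector, Set<String>> entry
        : toPreemptPerSelector.entrySet()) {
      long wait = entry.getKey().maxKillWaitMs;
      for (String containerId : entry.getValue()) {
        // putIfAbsent returns null when the container was not tracked yet.
        Long first = firstSelectedAtMs.putIfAbsent(containerId, nowMs);
        if (first != null && first + wait <= nowMs) {
          due.add(containerId);
        }
      }
    }
    return due;
  }

  public static void main(String[] args) {
    SketchSelector regular = new SketchSelector(15_000L);    // assumed value
    SketchSelector balancing = new SketchSelector(300_000L); // assumed value

    Map<SketchSelector, Set<String>> perSelector = new LinkedHashMap<>();
    perSelector.put(regular, Collections.singleton("c1"));
    perSelector.put(balancing, Collections.singleton("c2"));

    Map<String, Long> firstSeen = new HashMap<>();
    System.out.println(containersDueForKill(perSelector, firstSeen, 0L));
    System.out.println(containersDueForKill(perSelector, firstSeen, 20_000L));
  }
}
```

With these assumed values, nothing is due at time 0, and at 20 seconds only the container chosen by the regular selector is due; the one chosen for queue balancing keeps running until its longer wait expires.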
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524091#comment-16524091 ] Zian Chen commented on YARN-8379: - [~eepayne], thank you for the notice. Let me ask Wangda and Sunil to review it.
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524088#comment-16524088 ] Zian Chen commented on YARN-8379: - I quickly checked the failed UT. It is a rounding issue that causes a one-container difference in the candidates selected for preemption. I tested it locally and it works as expected.
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523691#comment-16523691 ] Eric Payne commented on YARN-8379: -- [~Zian Chen], I probably won't be able to get to this for a couple of weeks.
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523086#comment-16523086 ] genericqa commented on YARN-8379: -
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 22s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 30m 37s | trunk passed |
| +1 | compile | 0m 51s | trunk passed |
| +1 | checkstyle | 0m 20s | trunk passed |
| +1 | mvnsite | 0m 58s | trunk passed |
| +1 | shadedclient | 12m 58s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 12s | trunk passed |
| +1 | javadoc | 0m 28s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 44s | the patch passed |
| +1 | compile | 0m 38s | the patch passed |
| +1 | javac | 0m 38s | the patch passed |
| +1 | checkstyle | 0m 9s | the patch passed |
| +1 | mvnsite | 0m 42s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 27s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 21s | the patch passed |
| +1 | javadoc | 0m 27s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 69m 12s | hadoop-yarn-server-resourcemanager in the patch failed. |
| -1 | asflicense | 0m 36s | The patch generated 1 ASF License warnings. |
| | | 133m 54s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.monitor.capacity.TestProportionalCapacityPreemptionPolicyPreemptToBalance |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8379 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12929105/YARN-8379.004.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ae8d5c30f508 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7a3c6e9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/21101/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21101/testReport/ |
| asflicense | https://builds.apache.org/job/PreCommit-YARN-Build/21101/artifact/out/patch-asflicense-problems.txt |
| Max. process+thread count | 939 (vs. ulimit of 100
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522977#comment-16522977 ] Zian Chen commented on YARN-8379: -
[~eepayne], regarding your earlier suggestion that the balancing be done all at once instead of adding FifoCandidatesSelector twice, let me explain in two parts.

1. In TempQueuePerPartition#offer, when we calculate the ideal assignment, the first time we calculate accepted as follows:
{code:java}
// accepted = min{avail,
//                max - assigned,
//                current + pending - assigned,
//                # Make sure a queue will not get more than max of its
//                # used/guaranteed, this is to make sure preemption won't
//                # happen if all active queues are beyond their guaranteed
//                # This is for leaf queue only.
//                max(guaranteed, used) - assigned}
{code}
The second time, we calculate accepted without the max(guaranteed, used) check. As far as I can see, these two steps should be done sequentially rather than in one shot.

2. The other reason is that we added an option to set a configurable timeout for containers selected by preempt-to-balance (selected by fifo2). This lets users set a custom timeout for those containers and kill them faster or slower as needed, which in turn controls how quickly the balancing proceeds. This timeout should affect only containers selected for balancing, not containers preempted so that an underutilized queue can reach its guaranteed resource. So we need to separate these two processes.

All other comments should already be addressed by the latest patch. Thanks!
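The two-step calculation of accepted described in this comment can be sketched with plain numbers. This is only an illustration under stated assumptions: it uses `long` values in place of YARN `Resource` objects, the names `OfferSketch`, `accepted`, and `capAtMaxOfGuaranteedAndUsed` are hypothetical, and clamping the result at zero is a simplification added for the sketch.

```java
final class OfferSketch {
  /**
   * Hypothetical scalar version of the "accepted" formula quoted in the comment.
   * With capAtMaxOfGuaranteedAndUsed = true (first pass), a queue is never
   * offered more than max(guaranteed, used) - assigned.
   * With capAtMaxOfGuaranteedAndUsed = false (balance pass), that cap is dropped.
   */
  static long accepted(long avail, long max, long current, long pending,
      long guaranteed, long used, long assigned,
      boolean capAtMaxOfGuaranteedAndUsed) {
    long result = Math.min(avail, max - assigned);
    result = Math.min(result, current + pending - assigned);
    if (capAtMaxOfGuaranteedAndUsed) {
      result = Math.min(result, Math.max(guaranteed, used) - assigned);
    }
    // Simplification of this sketch: never report a negative amount.
    return Math.max(result, 0);
  }

  public static void main(String[] args) {
    // A queue guaranteed 30, using 30, with 20 pending and 30 already assigned.
    long firstPass = accepted(40, 100, 30, 20, 30, 30, 30, true);
    long balancePass = accepted(40, 100, 30, 20, 30, 30, 30, false);
    System.out.println("first pass accepts " + firstPass);
    System.out.println("balance pass accepts " + balancePass);
  }
}
```

In this example the first pass accepts nothing because the queue is already at max(guaranteed, used), while the pass without the cap accepts the 20 units of pending demand, which is the difference between the two steps that the comment argues should run sequentially.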
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522930#comment-16522930 ] Zian Chen commented on YARN-8379: - Fixed all the failed cases and re-uploaded the patch. [~eepayne], could you please help review the newest patch and share your comments? Thanks! [~leftnoteasy], [~sunilg], could you also share your thoughts on the latest patch, please?
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16514383#comment-16514383 ] Zian Chen commented on YARN-8379: - Hi [~eepayne], thank you for the latest comments. I'll think them through and fix the failed UTs first.
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511694#comment-16511694 ] Eric Payne commented on YARN-8379: --
[~Zian Chen], I attached the confs I used to create my pseudo cluster. I was using patch 003.
{quote}3. The reason we add FifoCandidatesSelector to candidatesSelectionPolicies twice is that we want to make conservative preemption when we do the balance.
{quote}
I don't see why this is necessary. In 2.8 (and earlier 3.x releases prior to YARN-5864), the balancing was done all at once inside the {{FifoCandidatesSelector}} by properly adjusting the ideal assigned values per queue and the values of offered resources to each queue. Why can't we adjust these values to either 1) keep the same behavior or 2) balance queues, depending on the setting of the new property ({{fairness-balance-queue-after-satisfied.enabled}})?
{quote}4. The reason for this code is explained in item 3.
{quote}
My question here is why {{selectedCandidates}} is always returned after the first time through the for loop. If this was the intention, a for loop is not necessary. It looks like the intention was to return only if containers exist in {{selectedCandidates}} (the for loop) AND {{if (!containers.isEmpty())}}. Did you want the return to be inside of the {{if (!containers.isEmpty())}}?
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511522#comment-16511522 ] Zian Chen commented on YARN-8379: -
Thank you [~eepayne] for your suggestions. I totally agree with the concern that naming this feature with "fair" or "fairness" is misleading. Regarding your concerns, please see my questions and comments below.

1. Which patch did you use for the cross-queue preemption test? Patch 003? Could you share your queue settings (queue hierarchy, each queue's guaranteed, used, and pending resources, and queue priority) so that I can reproduce the case and see what the issue is?
2. I will address the failed UTs, findbugs warnings, etc.
3. The reason we add {{FifoCandidatesSelector}} to candidatesSelectionPolicies twice is that we want preemption to be conservative when we do the balancing. The balance preemption (FifoCandidatesSelector with allowQueuesBalanceAfterAllQueuesSatisfied set to true) will only happen when all queues meet or exceed their guarantees; if any queue is underutilized, the balancing will not be triggered. We implemented it this way because we don't want balancing and preemption to satisfy a queue's guarantee to conflict with each other and preempt containers back and forth.
4. The reason for this code is explained in item 3.
{code:java|title=FifoCandidatesSelector#selectCandidates}
for (Set containers : selectedCandidates.values()) {
  if (!containers.isEmpty()) {
    if (LOG.isDebugEnabled()) {
      LOG.debug(...);
    }
  }
  return selectedCandidates;
}
{code}
5. Agreed on the log statement.
6. Agreed to remove "Fair" or "Fairness".
7. Agreed.

[~leftnoteasy], [~cheersyang], any concerns or suggestions on the latest patch? Thanks!
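The trigger condition described in item 3 can be written as a small predicate. This is a sketch under assumptions, not the patch's logic: `BalanceTriggerSketch` and `QueueState` are hypothetical names, values are percentages of the cluster, and treating a queue as "underutilized" only when it is below its guarantee and still has pending demand is an interpretation made for this example.

```java
import java.util.Arrays;
import java.util.List;

final class BalanceTriggerSketch {
  /** Hypothetical per-queue snapshot; all values are percentages of the cluster. */
  static final class QueueState {
    final String name;
    final int guaranteed;
    final int used;
    final int pending;

    QueueState(String name, int guaranteed, int used, int pending) {
      this.name = name;
      this.guaranteed = guaranteed;
      this.used = used;
      this.pending = pending;
    }
  }

  /**
   * Assumption of this sketch: a queue is underutilized when it is below its
   * guarantee and still has pending demand.
   */
  static boolean isUnderutilized(QueueState q) {
    return q.used < q.guaranteed && q.pending > 0;
  }

  /** Balance preemption may run only when no queue is underutilized. */
  static boolean balanceMayRun(List<QueueState> queues) {
    for (QueueState q : queues) {
      if (isUnderutilized(q)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Capacities and usage from the issue description; pending values are made up.
    List<QueueState> queues = Arrays.asList(
        new QueueState("queue_a", 30, 30, 20),
        new QueueState("queue_b", 30, 70, 0),
        new QueueState("queue_c", 40, 0, 0));
    System.out.println(balanceMayRun(queues));
  }
}
```

Keeping this check separate from the guarantee-restoring pass reflects the stated goal that the two kinds of preemption should not work against each other.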
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511469#comment-16511469 ] Eric Payne commented on YARN-8379: --
[~Zian Chen], thank you for the work on this issue and for the new patch. I am still working through the patch, but I have the following concerns so far.
- Cross-queue preemption does not work when this patch is applied. My test environment simulates a 6-node pseudo YARN cluster. I use the same queue configs with and without this patch. With this patch, no cross-queue preemption happens at all.
- Please address the failed unit tests, failed findbugs, and failed shadedclient warnings. I think they are related to this patch.
- {{ProportionalCapacityPreemptionPolicy#updateConfigIfNeeded}} This code adds {{FifoCandidatesSelector}} to candidatesSelectionPolicies twice, which will cause it to be called twice since candidatesSelectionPolicies is an {{ArrayList}}. If I understand correctly, the reason for this is so that the first time {{FifoCandidatesSelector#selectCandidates}} is called, it will preempt up to the requesting queue's guarantee, and the second time it will not preempt until the requesting queue is above its guarantee AND the {{allowQueuesBalanceAfterAllQueuesSatisfied}} variable is set. Why can't {{FifoCandidatesSelector}} just be added once and do all the processing it needs to based on whether or not {{allowQueuesBalanceAfterAllQueuesSatisfied}} is set?
- {{FifoCandidatesSelector#selectCandidates}} If the skip logic is necessary (depending on the answer to my first question), I think the return needs to be moved up above the previous curly brace (}). The way it is now, it returns whether containers is empty or not.
{code:title=FifoCandidatesSelector#selectCandidates}
for (Set containers : selectedCandidates.values()) {
  if (!containers.isEmpty()) {
    if (LOG.isDebugEnabled()) {
      LOG.debug(...);
    }
  }
  return selectedCandidates;
}
{code}
- {{FifoCandidatesSelector#selectCandidates}} For the debug log statement, I would not use the word "Fairness" because the word "Fair" has a lot of different meanings when it comes to schedulers and policies. To make it more grammatically correct (and to remove the confusion surrounding "fairness"), I would say, "The preemption-to-balance feature is on. Some containers were chosen for preemption by previous selectors. Skipping container selection for FifoCandidatesSelector".
- General. For the same reason as above, I think we can just remove the word "Fair" or "Fairness" from all method and variable names and the meaning would remain.
- {{AbstractPreemptableResourceCalculator#getIdealPctOfGuaranteed}}
bq. Should we allow queues continue grow after queues satisfied?
This could be misinterpreted to mean that the capacity scheduler doesn't allow a queue to grow over its capacity guarantee. It may be better to modify this to make it clear that we are talking about preemption: "Should resources be preempted from an over-served queue when the requesting queues are all at or over their guarantees?"
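The return placement raised in the review can be illustrated in isolation. This is a minimal sketch, not the patch's code: `SkipCheckSketch` and `anyContainerSelected` are hypothetical names, and the map uses `String` keys and values instead of the real application and container types. It shows a scan that returns only from inside the non-empty check, so an empty first entry does not end the loop early.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class SkipCheckSketch {
  /**
   * Returns true only if some earlier selector picked at least one container.
   * The return sits inside the non-empty check, so every entry is examined
   * until a non-empty set is found.
   */
  static <K, V> boolean anyContainerSelected(Map<K, Set<V>> selectedCandidates) {
    for (Set<V> containers : selectedCandidates.values()) {
      if (!containers.isEmpty()) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Map<String, Set<String>> selected = new LinkedHashMap<>();
    selected.put("app_1", Collections.<String>emptySet());
    selected.put("app_2", Collections.singleton("container_7"));
    System.out.println(anyContainerSelected(selected)); // prints true
  }
}
```

A caller could then skip the balance selection when this returns true and continue otherwise, which appears to be the behavior the skip logic was meant to have.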
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510996#comment-16510996 ] genericqa commented on YARN-8379: -
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 24m 42s | trunk passed |
| +1 | compile | 0m 40s | trunk passed |
| +1 | checkstyle | 0m 13s | trunk passed |
| +1 | mvnsite | 0m 43s | trunk passed |
| -1 | shadedclient | 2m 34s | branch has errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 11s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 0m 22s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 38s | the patch passed |
| +1 | compile | 0m 33s | the patch passed |
| +1 | javac | 0m 33s | the patch passed |
| +1 | checkstyle | 0m 10s | the patch passed |
| +1 | mvnsite | 0m 39s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | shadedclient | 2m 20s | patch has errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 11s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) |
| +1 | javadoc | 0m 23s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 69m 8s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 106m 9s | |

|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| | Null passed for non-null parameter of CapacitySchedulerPreemptionUtils.deductPreemptableResourcesBasedSelectedCandidates(CapacitySchedulerPreemptionContext, Map) in org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.FifoCandidatesSelector.selectCandidates(Map, Resource, Resource) Method invoked at FifoCandidatesSelector.java:of CapacitySchedulerPreemptionUtils.deductPreemptableResourcesBasedSelectedCandidates(CapacitySchedulerPreemptionContext, Map) in org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.FifoCandidatesSelector.selectCandidates(Map, Resource, Resource) Method invoked at FifoCandidatesSelector.java:[line 91] |
| Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerSurgicalPreemption |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerAutoCreatedQueuePreemption |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.Tes
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510896#comment-16510896 ]

Zian Chen commented on YARN-8379:
-

Updated patch 003 for stage 2. It adds a configurable timeout for the fairness balance selector, along with UTs for it. [~eepayne], [~sunilg], could you help review patch 003, please? This patch should complete the full feature as required in the description. Thanks!

> Add an option to allow Capacity Scheduler preemption to balance satisfied
> queues
>
> Key: YARN-8379
> URL: https://issues.apache.org/jira/browse/YARN-8379
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Wangda Tan
> Assignee: Zian Chen
> Priority: Major
> Attachments: YARN-8379.001.patch, YARN-8379.002.patch
>
> Existing capacity scheduler only supports preemption for an underutilized
> queue to reach its guaranteed resource. In addition to that, there’s an
> requirement to get better balance between queues when all of them reach
> guaranteed resource but with different fairness resource.
> An example is, 3 queues with capacity, queue_a = 30%, queue_b = 30%, queue_c
> = 40%. At time T. queue_a is using 30%, queue_b is using 70%. Existing
> scheduler preemption won't happen. But this is unfair to queue_a since
> queue_a has the same guaranteed resources.
> Before YARN-5864, capacity scheduler do additional preemption to balance
> queues. We changed the logic since it could preempt too many containers
> between queues when all queues are satisfied.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510225#comment-16510225 ] Zian Chen commented on YARN-8379: - Sure, [~eepayne] , no problem. Really appreciate your help for reviewing the patch! > Add an option to allow Capacity Scheduler preemption to balance satisfied > queues > > > Key: YARN-8379 > URL: https://issues.apache.org/jira/browse/YARN-8379 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8379.001.patch, YARN-8379.002.patch > > > Existing capacity scheduler only supports preemption for an underutilized > queue to reach its guaranteed resource. In addition to that, there’s an > requirement to get better balance between queues when all of them reach > guaranteed resource but with different fairness resource. > An example is, 3 queues with capacity, queue_a = 30%, queue_b = 30%, queue_c > = 40%. At time T. queue_a is using 30%, queue_b is using 70%. Existing > scheduler preemption won't happen. But this is unfair to queue_a since > queue_a has the same guaranteed resources. > Before YARN-5864, capacity scheduler do additional preemption to balance > queues. We changed the logic since it could preempt too many containers > between queues when all queues are satisfied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509992#comment-16509992 ] Eric Payne commented on YARN-8379: -- [~Zian Chen], I will review the patch but it may take a couple of days. > Add an option to allow Capacity Scheduler preemption to balance satisfied > queues > > > Key: YARN-8379 > URL: https://issues.apache.org/jira/browse/YARN-8379 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8379.001.patch, YARN-8379.002.patch > > > Existing capacity scheduler only supports preemption for an underutilized > queue to reach its guaranteed resource. In addition to that, there’s an > requirement to get better balance between queues when all of them reach > guaranteed resource but with different fairness resource. > An example is, 3 queues with capacity, queue_a = 30%, queue_b = 30%, queue_c > = 40%. At time T. queue_a is using 30%, queue_b is using 70%. Existing > scheduler preemption won't happen. But this is unfair to queue_a since > queue_a has the same guaranteed resources. > Before YARN-5864, capacity scheduler do additional preemption to balance > queues. We changed the logic since it could preempt too many containers > between queues when all queues are satisfied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508526#comment-16508526 ] Zian Chen commented on YARN-8379: - Hi [~eepayne] , could you help review my first stage patch and share your comments and thoughts on this? I'm working on the second stage of the patch and the work will depend on the first stage. Thanks! > Add an option to allow Capacity Scheduler preemption to balance satisfied > queues > > > Key: YARN-8379 > URL: https://issues.apache.org/jira/browse/YARN-8379 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8379.001.patch, YARN-8379.002.patch > > > Existing capacity scheduler only supports preemption for an underutilized > queue to reach its guaranteed resource. In addition to that, there’s an > requirement to get better balance between queues when all of them reach > guaranteed resource but with different fairness resource. > An example is, 3 queues with capacity, queue_a = 30%, queue_b = 30%, queue_c > = 40%. At time T. queue_a is using 30%, queue_b is using 70%. Existing > scheduler preemption won't happen. But this is unfair to queue_a since > queue_a has the same guaranteed resources. > Before YARN-5864, capacity scheduler do additional preemption to balance > queues. We changed the logic since it could preempt too many containers > between queues when all queues are satisfied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508524#comment-16508524 ]

Zian Chen commented on YARN-8379:
-

Thanks, [Eric Payne|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=eepayne], for the explanation. Let me share my understanding of Vinod's question as well.

Since this change involves both allocation and preemption, we should provide crisp semantics for how it combines with our other features: queue priority, application priority, user limit, inter-queue preemption, intra-queue preemption, etc. Let's talk about queue priority in detail.

For resource allocation:
# For two queues with the same priority
** The queue with the lower relative used capacity goes first. This does not conflict with our change. For example, take two queues A and B, where used / guaranteed is 20 / 50 = 0.4 for A and 30 / 50 = 0.6 for B. Allocation fills the queue with the lower ratio (queue A) until its used / guaranteed ratio is equal to or larger than B's, then fills B, and continues this process to balance the utilization of A and B until there are no more available resources to allocate.
# For two queues with different priorities
** Our change respects this, since priority should come first from the allocation point of view. Fairness should be a concept within a priority, not across different queue priorities. This is today's behavior.

For resource preemption:
* We need to release the constraint that prevents a queue from accepting more available resources once it is beyond its guaranteed resources. Fairness is achieved by calculating the ideal assigned capacity based on proportional assignment.

Relation to other scheduler features:
||Feature||Relationship||
|Queue capacity / max capacity|This feature still honors the queue's configured capacity and max capacity as usual.|
|Application priority|Since application priority is an intra-queue property, this feature doesn't impact application priorities.|
|User limit|Same as above.|
|Inter-queue proportional capacity preemption|After YARN-5864, queue priority is already handled while calculating the ideal assigned capacity at each queue level. The only change we need to make is to release the constraint that a queue's accepted capacity should not go beyond max(guaranteed, used) - assigned. This constraint leaves queues imbalanced and prevents further preemption when all queues are beyond their guarantees.|
|Queue preemption disable|Fairness balancing should not violate this constraint if preemption is disabled for the current queue.|
|Intra-queue preemption|No impact; queue fairness only applies across queues.|

After my change, will we encounter a case where preemption tries to preempt from a queue while allocation tries to give available resources to the same queue? No, because the preemption calculation always has a delay (the wait-before-kill timeout). If allocation happens during this timeout, preemption recalculates the ideal assignment before it actually kills containers.

Also, container selection in preemption and the allocation process share the same goal: to favor the relatively less utilized queue (in both the under-utilized case and the beyond-guaranteed case) so that the queues end up balanced. With the current logic, allocation balances the queues when it chooses which queue receives resources next, but preemption stops contributing to that balance once all queues are beyond their guarantees. After my change, preemption releases the constraint and continues, so that the same balance is also achieved on the preemption side.
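To make the constraint on accepted capacity concrete, here is a minimal single-resource sketch. It is not the code from the patch: the class name, the method signature, and the use of plain `long` values instead of YARN `Resource` objects are all assumptions made for illustration. It only shows how dropping the `max(guaranteed, used) - assigned` cap lets a satisfied queue keep accepting resources during ideal-assignment calculation.

```java
// Simplified, single-resource sketch. All names here are hypothetical;
// the real scheduler works on multi-dimensional Resource objects.
class OfferSketch {
    /**
     * Returns how much of `avail` a queue accepts in one round of the
     * ideal-assignment calculation.
     */
    static long offer(long avail, long guaranteed, long used, long pending,
                      long maxCapacity, long idealAssigned,
                      boolean balanceAfterSatisfied) {
        // Never exceed what is available, the queue's max capacity,
        // or the queue's total demand (used + pending).
        long accepted = Math.min(avail,
            Math.min(maxCapacity - idealAssigned, used + pending - idealAssigned));
        if (!balanceAfterSatisfied) {
            // The constraint discussed above: stop at max(guaranteed, used).
            accepted = Math.min(accepted,
                Math.max(guaranteed, used) - idealAssigned);
        }
        return Math.max(accepted, 0L);
    }

    public static void main(String[] args) {
        // A queue with guaranteed 30, using 30, 40 pending, already assigned
        // its 30, while 20 is still unassigned in the cluster.
        System.out.println(offer(20, 30, 30, 40, 100, 30, false)); // prints 0
        System.out.println(offer(20, 30, 30, 40, 100, 30, true));  // prints 20
    }
}
```

With the flag off, the queue accepts nothing beyond its guarantee, so no extra preemption is computed on its behalf. With the flag on, it accepts the remaining 20, which is what allows further balancing.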
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16504630#comment-16504630 ] Eric Payne commented on YARN-8379: -- {quote} Why is this a preemption only concept? To avoid unnecessary thrash between the allocation doing one thing and the preemption doing another, we should also have a corresponding queue ordering-policy, right? {quote} This is only related to preemption because the capacity scheduler already balances if resources become available. However, currently, if preemption is enabled on all queues, preemption will stop freeing resources once all pending queues are over their queue capacity. The example in this JIRA's description outlines the current behavior. In that example, if resources free up naturally from queue_b, the capacity scheduler will assign them to queue_a. However, the preemption monitor will not preempt them because queue_a is at its 30% queue capacity. In 2.8 and prior releases, the preemption monitor does balance. As pointed out above, the balancing behavior was changed as part of YARN-5864 > Add an option to allow Capacity Scheduler preemption to balance satisfied > queues > > > Key: YARN-8379 > URL: https://issues.apache.org/jira/browse/YARN-8379 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8379.001.patch, YARN-8379.002.patch > > > Existing capacity scheduler only supports preemption for an underutilized > queue to reach its guaranteed resource. In addition to that, there’s an > requirement to get better balance between queues when all of them reach > guaranteed resource but with different fairness resource. > An example is, 3 queues with capacity, queue_a = 30%, queue_b = 30%, queue_c > = 40%. At time T. queue_a is using 30%, queue_b is using 70%. Existing > scheduler preemption won't happen. But this is unfair to queue_a since > queue_a has the same guaranteed resources. 
> Before YARN-5864, capacity scheduler do additional preemption to balance > queues. We changed the logic since it could preempt too many containers > between queues when all queues are satisfied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
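The example from the description can be worked through numerically. The sketch below is a simplified proportional split, not the preemption monitor's actual algorithm, and it makes assumptions the thread does not state: queue_c is idle, queue_a has 40% of the cluster in pending requests, and resources are single-dimensional percentages. The class and method names are hypothetical.

```java
// Simplified proportional ("water-filling") split. Hypothetical names;
// this is an illustration, not the preemption monitor's implementation.
class BalanceExampleSketch {
    /**
     * Splits `total` among queues in proportion to `guaranteed`,
     * never giving a queue more than its demand (used + pending).
     */
    static double[] idealAssignment(double total, double[] guaranteed,
                                    double[] demand) {
        int n = guaranteed.length;
        double[] ideal = new double[n];
        boolean[] done = new boolean[n];
        double remaining = total;
        while (remaining > 1e-9) {
            double weight = 0;
            for (int i = 0; i < n; i++) {
                if (!done[i]) weight += guaranteed[i];
            }
            if (weight == 0) break; // every queue is satisfied
            double handedOut = 0;
            for (int i = 0; i < n; i++) {
                if (done[i]) continue;
                double share = remaining * guaranteed[i] / weight;
                double accepted = Math.min(share, demand[i] - ideal[i]);
                ideal[i] += accepted;
                handedOut += accepted;
                if (demand[i] - ideal[i] <= 1e-9) done[i] = true;
            }
            remaining -= handedOut;
            if (handedOut <= 1e-9) break; // nothing more can be placed
        }
        return ideal;
    }

    public static void main(String[] args) {
        double[] guaranteed = {30, 30, 40};   // queue_a, queue_b, queue_c
        double[] used = {30, 70, 0};          // queue_c assumed idle
        double[] pending = {40, 0, 0};        // assumed demand for queue_a
        double[] demand = new double[3];
        for (int i = 0; i < 3; i++) demand[i] = used[i] + pending[i];
        double[] ideal = idealAssignment(100, guaranteed, demand);
        // Amount that balancing would reclaim from queue_b.
        System.out.println(used[1] - ideal[1]);
    }
}
```

Under these assumptions the balanced ideal assignment is 50% for queue_a and 50% for queue_b, so 20% would be reclaimed from queue_b. Without balancing, queue_a's ideal assignment stays at its 30% capacity and nothing is preempted, which is the behavior described above.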
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16504207#comment-16504207 ] Vinod Kumar Vavilapalli commented on YARN-8379: --- Why is this a preemption only concept? To avoid unnecessary thrash between the allocation doing one thing and the preemption doing another, we should also have a corresponding queue ordering-policy, right? > Add an option to allow Capacity Scheduler preemption to balance satisfied > queues > > > Key: YARN-8379 > URL: https://issues.apache.org/jira/browse/YARN-8379 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8379.001.patch, YARN-8379.002.patch > > > Existing capacity scheduler only supports preemption for an underutilized > queue to reach its guaranteed resource. In addition to that, there’s an > requirement to get better balance between queues when all of them reach > guaranteed resource but with different fairness resource. > An example is, 3 queues with capacity, queue_a = 30%, queue_b = 30%, queue_c > = 40%. At time T. queue_a is using 30%, queue_b is using 70%. Existing > scheduler preemption won't happen. But this is unfair to queue_a since > queue_a has the same guaranteed resources. > Before YARN-5864, capacity scheduler do additional preemption to balance > queues. We changed the logic since it could preempt too many containers > between queues when all queues are satisfied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16504195#comment-16504195 ]

genericqa commented on YARN-8379:
-

| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 19s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 8s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 39s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 34 new + 526 unchanged - 0 fixed = 560 total (was 526) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 2s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 15s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 66m 52s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}127m 9s{color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| | Null passed for non-null parameter of CapacitySchedulerPreemptionUtils.deductPreemptableResourcesBasedSelectedCandidates(CapacitySchedulerPreemptionContext, Map) in org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.FifoCandidatesSelector.selectCandidates(Map, Resource, Resource) Method invoked at FifoCandidatesSelector.java:[line 91] |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16504115#comment-16504115 ] Zian Chen commented on YARN-8379: - Hi [~eepayne] , updated the patch based on latest trunk. Could you please help review it? Thanks > Add an option to allow Capacity Scheduler preemption to balance satisfied > queues > > > Key: YARN-8379 > URL: https://issues.apache.org/jira/browse/YARN-8379 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8379.001.patch, YARN-8379.002.patch > > > Existing capacity scheduler only supports preemption for an underutilized > queue to reach its guaranteed resource. In addition to that, there’s an > requirement to get better balance between queues when all of them reach > guaranteed resource but with different fairness resource. > An example is, 3 queues with capacity, queue_a = 30%, queue_b = 30%, queue_c > = 40%. At time T. queue_a is using 30%, queue_b is using 70%. Existing > scheduler preemption won't happen. But this is unfair to queue_a since > queue_a has the same guaranteed resources. > Before YARN-5864, capacity scheduler do additional preemption to balance > queues. We changed the logic since it could preempt too many containers > between queues when all queues are satisfied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503635#comment-16503635 ] Zian Chen commented on YARN-8379: - Sure, Eric, I see the conflict while applying the patch in latest trunk. Let me update it. > Add an option to allow Capacity Scheduler preemption to balance satisfied > queues > > > Key: YARN-8379 > URL: https://issues.apache.org/jira/browse/YARN-8379 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8379.001.patch > > > Existing capacity scheduler only supports preemption for an underutilized > queue to reach its guaranteed resource. In addition to that, there’s an > requirement to get better balance between queues when all of them reach > guaranteed resource but with different fairness resource. > An example is, 3 queues with capacity, queue_a = 30%, queue_b = 30%, queue_c > = 40%. At time T. queue_a is using 30%, queue_b is using 70%. Existing > scheduler preemption won't happen. But this is unfair to queue_a since > queue_a has the same guaranteed resources. > Before YARN-5864, capacity scheduler do additional preemption to balance > queues. We changed the logic since it could preempt too many containers > between queues when all queues are satisfied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503282#comment-16503282 ] Eric Payne commented on YARN-8379: -- Thanks [~Zian Chen] for working on this and providing an initial patch. The patch does not apply cleanly, so can you please provide an update? > Add an option to allow Capacity Scheduler preemption to balance satisfied > queues > > > Key: YARN-8379 > URL: https://issues.apache.org/jira/browse/YARN-8379 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Zian Chen >Priority: Major > Attachments: YARN-8379.001.patch > > > Existing capacity scheduler only supports preemption for an underutilized > queue to reach its guaranteed resource. In addition to that, there’s an > requirement to get better balance between queues when all of them reach > guaranteed resource but with different fairness resource. > An example is, 3 queues with capacity, queue_a = 30%, queue_b = 30%, queue_c > = 40%. At time T. queue_a is using 30%, queue_b is using 70%. Existing > scheduler preemption won't happen. But this is unfair to queue_a since > queue_a has the same guaranteed resources. > Before YARN-5864, capacity scheduler do additional preemption to balance > queues. We changed the logic since it could preempt too many containers > between queues when all queues are satisfied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16502751#comment-16502751 ]

Zian Chen commented on YARN-8379:
-

Thanks, [~leftnoteasy], for bringing up this feature. I worked on this Jira and finished an initial patch for the first stage of this feature, which enables fairness balancing after all queues are beyond their guarantees.

What I did is:
# Release the constraint in TempQueuePerPartition#offer so that a queue can continue to grow its ideal assigned resource after its guaranteed resource is satisfied.
# Make fairness balance preemption configurable through capacity-scheduler.xml.
# Add several test cases in PCPP to make sure this logic works for DRF and DefaultResourceCalculator.

What I plan to do in the second stage is:
# Add a timeout to candidateSelector so that the fairness balance selector can have its own, different timeout.
# Figure out the list of containers each candidateSelector selected, since we need this information to set a different timeout for the containers selected by the fairness balance selector. Currently, in each step of candidate selection, the selector always makes changes to the same "toPreempt" set of RMContainer, and only the aggregated result is stored. Besides this information, we also need the individual results from each selector.

[~eepayne], could you help review the patch for the first stage and see if there is any potential issue with changing the TempQueuePerPartition part? Also, could you share how you accomplished the fairness feature in 2.8, as you mentioned? Thanks!
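The second-stage idea of keeping each selector's individual result, so that containers can carry a selector-specific timeout, might look roughly like the sketch below. Everything in it is hypothetical: the `Selection` holder, the selector names, the timeout values, and the rule that the first selector to pick a container decides its timeout are assumptions made for illustration, not the design in the patch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: track what each selector picked instead of only
// one aggregated "toPreempt" set.
class PerSelectorTimeoutSketch {
    /** What one candidate selector picked, and how long to wait before killing. */
    static final class Selection {
        final String selectorName;
        final long maxWaitBeforeKillMs;
        final List<String> containerIds;

        Selection(String selectorName, long maxWaitBeforeKillMs,
                  List<String> containerIds) {
            this.selectorName = selectorName;
            this.maxWaitBeforeKillMs = maxWaitBeforeKillMs;
            this.containerIds = containerIds;
        }
    }

    /**
     * Maps each container to the timeout of the first selector that
     * picked it (an assumed tie-breaking rule).
     */
    static Map<String, Long> timeoutPerContainer(List<Selection> selectionsInOrder) {
        Map<String, Long> timeouts = new LinkedHashMap<>();
        for (Selection s : selectionsInOrder) {
            for (String id : s.containerIds) {
                timeouts.putIfAbsent(id, s.maxWaitBeforeKillMs);
            }
        }
        return timeouts;
    }
}
```

For instance, if a regular selector with a short timeout picks containers c1 and c2, and a balance selector with a longer timeout picks c2 and c3, then c1 and c2 keep the short timeout and only c3 gets the longer one.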
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496715#comment-16496715 ] Eric Payne commented on YARN-8379: -- Thanks [~leftnoteasy] for bringing this up. Yes, our use case would benefit from this feature. We are currently running 2.8, which does the balancing, so this would help us in moving to 3.x in the future. > Add an option to allow Capacity Scheduler preemption to balance satisfied > queues > > > Key: YARN-8379 > URL: https://issues.apache.org/jira/browse/YARN-8379 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Major > > Existing capacity scheduler only supports preemption for an underutilized > queue to reach its guaranteed resource. In addition to that, there’s an > requirement to get better balance between queues when all of them reach > guaranteed resource but with different fairness resource. > An example is, 3 queues with capacity, queue_a = 30%, queue_b = 30%, queue_c > = 40%. At time T. queue_a is using 30%, queue_b is using 70%. Existing > scheduler preemption won't happen. But this is unfair to queue_b since > queue_b has the same guaranteed resources. > Before YARN-5864, capacity scheduler do additional preemption to balance > queues. We changed the logic since it could preempt too many containers > between queues when all queues are satisfied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8379) Add an option to allow Capacity Scheduler preemption to balance satisfied queues
[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16495817#comment-16495817 ]

Wangda Tan commented on YARN-8379:
--

To achieve a better resource balance between queues, we propose to make the additional preemption between queues configurable. An admin can then set a different kill-before-wait timeout to control the pace of the additional queue balance preemption.

cc: [~jlowe], [~eepayne], [~sunilg] for suggestions.

Thanks [~clayb]/[~Zian Chen] for the offline suggestions and feedback.

> Add an option to allow Capacity Scheduler preemption to balance satisfied
> queues
>
> Key: YARN-8379
> URL: https://issues.apache.org/jira/browse/YARN-8379
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Priority: Major
>
> Existing capacity scheduler only supports preemption for an underutilized
> queue to reach its guaranteed resource. In addition to that, there’s an
> requirement to get better balance between queues when all of them reach
> guaranteed resource but with different fairness resource.
> An example is, 3 queues with capacity, queue_a = 30%, queue_b = 30%, queue_c
> = 40%. At time T. queue_a is using 30%, queue_b is using 70%. Existing
> scheduler preemption won't happen. But this is unfair to queue_b since
> queue_b has the same guaranteed resources.
> Before YARN-5864, capacity scheduler do additional preemption to balance
> queues. We changed the logic since it could preempt too many containers
> between queues when all queues are satisfied.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org