[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201411142002.txt

[~leftnoteasy], Thank you for all of your help. Uploading new patch.

bq. Instead of multiply you should use multiplyAndNormalizeUp here.
Using {{multiplyAndNormalizeUp}} helps. However, for the use case in {{testHierarchicalLarge}}, the rounding is still different with the new algorithm (7 and 5 instead of 9 and 4).

bq. Actually I think we should consider minimum_allocation in preemption policy, we can address that in a separated JIRA.
Would you please create a new JIRA and elaborate on this further?

{quote}
bq. {{testDisablePreemptionOverCapPlusPending}} Since the result is not changed before/after we set preemption queue, I think it is unnecessary, I would suggest to take it out.
{quote}
I removed this test.

> Disable preemption at Queue level
> ---------------------------------
>
>                 Key: YARN-2056
>                 URL: https://issues.apache.org/jira/browse/YARN-2056
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.4.0
>            Reporter: Mayank Bansal
>            Assignee: Eric Payne
>         Attachments: YARN-2056.201408202039.txt, YARN-2056.201408260128.txt,
> YARN-2056.201408310117.txt, YARN-2056.201409022208.txt,
> YARN-2056.201409181916.txt, YARN-2056.201409210049.txt,
> YARN-2056.201409232329.txt, YARN-2056.201409242210.txt,
> YARN-2056.201410132225.txt, YARN-2056.201410141330.txt,
> YARN-2056.201410232244.txt, YARN-2056.201410311746.txt,
> YARN-2056.201411041635.txt, YARN-2056.201411072153.txt,
> YARN-2056.201411122305.txt, YARN-2056.201411132215.txt,
> YARN-2056.201411142002.txt
>
>
> We need to be able to disable preemption at individual queue level

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201411132215.txt

Thank you very much [~leftnoteasy]. Here is the patch with my changes.

{quote}
1) Typo of comment in cloneQueue:
2. The logic should be correct, but I think it might be simpler to say: untouchableExtra = max(extra - childrenPreemptable, 0) and as same as the code.
{quote}
Comment changed.

{quote}
3)
bq. public double getIdealPctOfGuaranteed(TempQueue q)
The method doesn't need to be public anymore
{quote}
Changed to {{private}}.

{quote}
4) Is it possible that there's only one queue in getMostUnderservedQueues? If so, you should check if q2 is null
{quote}
Deep down in {{getIdealPctOfGuaranteed}}, it eventually does the null check, but I added a null check in {{getMostUnderservedQueues}} as well. Better safe than sorry :-)

{quote}
1) testDisablePreemptionOverCapPlusPending
Should disable queueB instead of queueA? Currently, the test will preempt from appB no matter if preemption is disabled for queueA or not
{quote}
The point of that test was to indicate that preemption levelization will still happen even if the thing asking for resources is the one that is untouchable. If you think this test is unnecessary, I will take it out.

{quote}
2) changes for {{testHierarchicalLarge}}:
I'm a little concerned about this change, even if we consider rounding error, appA should be taken about 9-10 resources, 9->6 seems some potential bug caused issue, could you double check if it works as expected? (Without affecting the normal preemption logic).
3) As above for {{testSkipAMContainer}}
I suggest you can take some investigation about why some original numbers need to be changed, if it is just a rounding problem, that should be fine, but we should avoid behavior changes.
{quote}
Yes, these changes are definitely due to rounding. There are 2 things that cause the rounding problem:
# The new algorithm in {{computeFixpointAllocation}} is different, so you would expect there to be differences in rounding.
#* One oddity about this algorithm is that it still uses {{normalizedGuarantee}} to calculate {{wQavail}}.
#* If you have A and B, and B is less served than A, when it gets down to the last one or two resources, it will try to multiply 1 or 2 by {{normalizedGuarantee}} and will offer B 0, so the last 1 or 2 go to A.
#* This shows up noticeably in the unit tests, where the total number of resources is smaller.
#* The algorithm could do a couple of things to clean this up.
#* It could always round the {{wQavail}}.
#* It could also calculate how many resources it would take to get B up to the level of the next thing on the underserved queue, and if there are unassigned resources, just offer them to B.
# The other thing that is causing rounding is that when {{cloneQueues}} gets {{root.getAbsoluteUsedCapacity()}}, it often comes out as something like 0.524761581421 instead of 0.525, and so when that is used to calculate {{current}}, it thinks a queue's current is really one less than what it should be.
#* Something that could be done here is to round to 3 decimal places, for example. 3 decimal places is what shows up in the scheduler UI metrics.
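The two rounding effects described above can be reproduced in isolation. The following is only a minimal sketch under stated assumptions, not code from the patch: the class and method names are hypothetical, plain {{int}} and {{double}} values stand in for YARN {{Resource}} objects, and the total of 1000 units is an assumed figure chosen for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical illustration only; none of these names exist in the patch.
final class RoundingSketch {

    // Truncating offer: a small remainder times a fractional guarantee becomes 0.
    static int truncatedOffer(int unassigned, double normalizedGuarantee) {
        return (int) (unassigned * normalizedGuarantee);
    }

    // Rounding the offer up instead, so the underserved queue is offered at least 1.
    static int roundedUpOffer(int unassigned, double normalizedGuarantee) {
        return (int) Math.ceil(unassigned * normalizedGuarantee);
    }

    // Round a capacity fraction to 3 decimal places (the precision the thread
    // says is shown in the scheduler UI metrics).
    static double roundToThreePlaces(double fraction) {
        return BigDecimal.valueOf(fraction).setScale(3, RoundingMode.HALF_UP).doubleValue();
    }

    // Units in use when the raw fraction is truncated to a whole number.
    static long truncatedUnits(double fraction, long total) {
        return (long) (fraction * total);
    }

    // Units in use when the fraction is first rounded to 3 places; the final
    // conversion also rounds so floating-point noise is not reintroduced.
    static long roundedUnits(double fraction, long total) {
        return Math.round(roundToThreePlaces(fraction) * total);
    }

    public static void main(String[] args) {
        System.out.println(truncatedOffer(2, 0.4));   // 2 * 0.4 = 0.8, truncated
        System.out.println(roundedUpOffer(2, 0.4));   // 0.8 rounded up
        System.out.println(truncatedUnits(0.524761581421, 1000));
        System.out.println(roundedUnits(0.524761581421, 1000));
    }
}
```

Note that rounding every offer up can hand out more than is actually left when several queues are offered resources in the same pass, so a caller would still need to cap each offer at the remaining unassigned amount.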
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201411122305.txt

Thanks [~leftnoteasy]. Here is the updated patch with your suggested changes.
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201411072153.txt

[~leftnoteasy], This patch has the enhancements I described above. Thanks.
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201411041635.txt
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201410311746.txt

[~leftnoteasy], sorry for putting another patch up, but there was a slight bug in the previous patch.
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201410232244.txt

Thanks very much [~leftnoteasy]. I have attached a patch which uses PriorityQueue instead of an internal queue class.

Please note that since the algorithm for building up needy queues is different, the rounding is also different, so some of the tests' expected values needed to change. I stepped through several of the tests and they seem to be working as I expect.
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201410141330.txt

I'm sorry. The previous patch was bad. This one compiles cleanly.
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201410132225.txt

[~leftnoteasy], Thanks for all of your help.

After looking through your suggested algorithm and thinking about it some more, I think it is important to have a model where the most underserved queues are given first chance at the unassigned resources. I think the algorithm should build up each queue, reassessing its needs on every pass. Based on this, I have rewritten the patch with the following algorithm. Please let me know what you think.

{code}
- Prior to assigning the unused resources, process each queue as follows:
  - If current > guaranteed, idealAssigned = guaranteed + untouchable extra
    Else idealAssigned = current;
  - Subtract idealAssigned resources from unassigned.
  - If the queue has all of its needs met (that is, if idealAssigned >= current + pending), remove the queue from consideration.
- Sort queues from most under-guaranteed to most over-guaranteed. Call this queue orderedByNeed
- While there are unsatisfied queues and some unassigned resources exist
  - calculate normalized guaranteed (as today) for all remaining queues at this hierarchical level
  - Pull off the underserved queue(s) from orderedByNeed
  - For each underserved queue (or set of queues if multiple are equally underserved), offer its share of the unassigned resources based on its normalized guarantee.
  - After the offer, if the queue is not satisfied, place it back in the ordered list of queues (orderedByNeed), recalculating its place in the order of most under-guaranteed to most over-guaranteed. In this way, the most underserved queue(s) are always handled first.
{code}
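The steps above can be sketched in Java roughly as follows. This is a simplified illustration under stated assumptions, not the patch itself: {{Q}}, {{computeIdealAssigned}} and {{pctOfGuaranteed}} are hypothetical names, resources are plain {{double}} scalars rather than YARN {{Resource}} objects, only a single level of the queue hierarchy is handled, and every queue is assumed to have a guarantee greater than zero. The input numbers in {{main}} are the A/B/C case discussed elsewhere in this thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for the policy's per-queue bookkeeping; not Hadoop's TempQueue.
final class IdealAssignmentSketch {
    static final double EPS = 1e-9;

    static final class Q {
        final String name;
        final double guaranteed, current, pending, untouchableExtra;
        double idealAssigned;

        Q(String name, double guaranteed, double current, double pending, double untouchableExtra) {
            if (guaranteed <= 0) {
                throw new IllegalArgumentException("sketch assumes guaranteed > 0");
            }
            this.name = name;
            this.guaranteed = guaranteed;
            this.current = current;
            this.pending = pending;
            this.untouchableExtra = untouchableExtra;
        }

        // Lower value means more under-guaranteed.
        double pctOfGuaranteed() { return idealAssigned / guaranteed; }

        boolean satisfied() { return idealAssigned >= current + pending - EPS; }
    }

    static void computeIdealAssigned(List<Q> queues, double totalCapacity) {
        double unassigned = totalCapacity;
        Comparator<Q> byNeed = Comparator.comparingDouble(Q::pctOfGuaranteed);
        PriorityQueue<Q> orderedByNeed = new PriorityQueue<>(byNeed);

        // Initial pass: seed idealAssigned and drop queues whose needs are already met.
        for (Q q : queues) {
            q.idealAssigned = q.current > q.guaranteed ? q.guaranteed + q.untouchableExtra : q.current;
            unassigned -= q.idealAssigned;
            if (!q.satisfied()) {
                orderedByNeed.add(q);
            }
        }

        while (!orderedByNeed.isEmpty() && unassigned > EPS) {
            // Normalize guarantees over all queues still under consideration.
            double sumGuaranteed = 0;
            for (Q q : orderedByNeed) {
                sumGuaranteed += q.guaranteed;
            }
            // Pull off the most underserved queue and any that are equally underserved.
            List<Q> batch = new ArrayList<>();
            Q first = orderedByNeed.poll();
            batch.add(first);
            while (!orderedByNeed.isEmpty()
                    && orderedByNeed.peek().pctOfGuaranteed() - first.pctOfGuaranteed() < EPS) {
                batch.add(orderedByNeed.poll());
            }
            double availableThisRound = unassigned;
            for (Q q : batch) {
                double offer = availableThisRound * q.guaranteed / sumGuaranteed;
                double need = q.current + q.pending - q.idealAssigned;
                double accepted = Math.min(Math.min(offer, unassigned), need);
                q.idealAssigned += accepted;
                unassigned -= accepted;
                if (!q.satisfied()) {
                    orderedByNeed.add(q); // re-inserted at its new place in the order
                }
            }
        }
    }

    public static void main(String[] args) {
        List<Q> queues = new ArrayList<>();
        queues.add(new Q("A", 30, 40, 20, 0));
        queues.add(new Q("B", 30, 50, 0, 0));
        queues.add(new Q("C", 30, 0, 0, 0));
        computeIdealAssigned(queues, 90);
        for (Q q : queues) {
            System.out.println(q.name + ".idealAssigned = " + q.idealAssigned);
        }
    }
}
```

Because a queue is only modified after it has been polled and is re-added afterwards, the {{PriorityQueue}} never holds an element whose ordering key has changed underneath it.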
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201409242210.txt

[~leftnoteasy], I'm sorry for the churn on patches, and thanks again for helping me on this.

The current patch maintains the existing behavior as before, and addresses the concern you raised in comment:
https://issues.apache.org/jira/browse/YARN-2056?focusedCommentId=14142404&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14142404

That is, it addresses the case where
{code}
root has A,B,C, total capacity = 90
A.guaranteed = 30, A.pending = 20, A.current = 40
B.guaranteed = 30, B.pending = 0, B.current = 50
C.guaranteed = 30, C.pending = 0, C.current = 0
{code}
It will levelize the over-capacity queues to be A.idealAssigned = 45, B.idealAssigned = 45

This patch implements the algorithm I described in comment:
https://issues.apache.org/jira/browse/YARN-2056?focusedCommentId=14145650&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14145650
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201409232329.txt

[~leftnoteasy], This patch now works in all cases, except that if preemption is disabled for a particular queue, it behaves differently than it did previously. That is:
{code}
root has A,B,C, total capacity = 90
A.guaranteed = 30, A.pending = 20, A.current = 40
B.guaranteed = 30, B.pending = 0, B.current = 50
C.guaranteed = 30, C.pending = 0, C.current = 0
{code}
In the above case, if all queues are preemptable, this patch works the same as it did before. That is, A and B will both end up with 45.

However, with this patch
{code}
IF (A is not preemptable) AND
   (A is already over capacity) AND
   (all resources are used) AND
   (A is asking for more resources) {
  A will remain at 40 and B will remain at 50
}
{code}
I believe that there is a way to make this patch maintain the old behavior, even if A is not preemptable, but it would require something like the following algorithm:
{code}
In ProportionalCapacityPreemptionPolicy#computeFixpointAllocation:
FOR each queue {
  IF queue has untouchableExtra {
    queue.idealAssigned = queue.guaranteed + queue.untouchableExtra
    unassigned -= queue.idealAssigned
    remove queue from qAlloc
    add queue to list of queues that were removed
  }
}
Assign the remaining unassigned resources, computing idealAssigned for the
remaining queues, with the following modification:
IF (queues at this level go over their capacity) AND
   (they are over by the same percentage as the queue(s) that were removed) {
  put the removed queue(s) back into qAlloc and continue to compute idealAssigned
}
{code}
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201409210049.txt

Hi [~leftnoteasy]. Thank you for spending the time to look at this patch and provide helpful suggestions.

{quote}
IMHO, the right place to put reserving resource logic for un-preemptable queue is not {{resetCapacity}}, it should in {{computeFixpointAllocation}}.
...
Does this make sense to you?
{quote}
Yes, that makes sense, and I think it is a simpler algorithm. I updated the patch, so please have a look.

I have made a conscious decision to only allow disabling preemption at the leaf queue level. This is because there may be a use case where you want to disable preemption at the parent level, and have other queue hierarchies leave it alone, but then allow preemption between children of the disabled parent. So, rather than solve that problem with this fix, I only allow leaf queues to disable preemption.

Even if a leaf queue could inherit its parent's disable preemption value, there will likely be cases where part of the parent queue's over-capacity resources are untouchable and part of them are preemptable. So, I adjusted your suggested algorithm somewhat.
- I collected untouchableExtra instead of preemptableExtra at the TempQueue level.
- In {{computeFixpointAllocation}}, I looped through each queue, and if one has any untouchableExtra, then the queue's {{idealAssigned = guaranteed + untouchableExtra}}
- In {{TempQueue#offer}}, one of the calculations is {{current + pending - idealAssigned}}. I had to take into consideration that if the queue is over capacity, some of the excess may be untouchable and some may be preemptable. If some of it is preemptable, then {{current}} could be greater than {{idealAssigned}}, and {{TempQueue#offer}} would end up assigning more to that queue than it should.
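To make the untouchable/preemptable split concrete, here is a small sketch. It is an illustration under assumptions rather than the patch's code: the class and method names are hypothetical, resources are plain {{int}} units, and the split uses the formula {{untouchableExtra = max(extra - childrenPreemptable, 0)}} quoted from the review elsewhere in this thread.

```java
// Hypothetical helper; integer units stand in for YARN Resource objects.
final class UntouchableExtraSketch {

    // extra is whatever the queue holds above its guarantee; the part of it
    // that children can give up through preemption stays preemptable.
    static int untouchableExtra(int current, int guaranteed, int childrenPreemptable) {
        int extra = Math.max(current - guaranteed, 0);
        return Math.max(extra - childrenPreemptable, 0);
    }

    // Starting value of idealAssigned for a queue, before any unassigned
    // resources are offered: over-capacity queues keep only their guarantee
    // plus the untouchable part of the excess.
    static int initialIdealAssigned(int current, int guaranteed, int untouchableExtra) {
        return current > guaranteed ? guaranteed + untouchableExtra : current;
    }

    public static void main(String[] args) {
        // Leaf queue with preemption disabled: nothing is preemptable.
        int leaf = untouchableExtra(40, 30, 0);
        System.out.println("leaf untouchableExtra = " + leaf);
        System.out.println("leaf idealAssigned starts at " + initialIdealAssigned(40, 30, leaf));

        // Parent whose children can give up 12 of its 20 extra units.
        int parent = untouchableExtra(50, 30, 12);
        System.out.println("parent untouchableExtra = " + parent);
    }
}
```

In the parent case only part of the excess is protected, which is the mixed situation the message above describes.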
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201409181916.txt

[~leftnoteasy] and [~jlowe], Thank you both for your reviews of this jira.

The previous patch did not handle all cases in hierarchical queues where the queue is over-capacity and some or all of its children are non-preemptable. This patch has been rewritten and should address all use cases.
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201409022208.txt

Thanks very much, [~leftnoteasy]. I have added the test for disabling preemption of hierarchical queues.
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201408310117.txt

{quote}
1) I prefer to make per-queue disable preemption option follow the same config options in existing capacity-scheduler (same queue-path-prefix, etc.).
{quote}
Do you mean that the prefix should be {{yarn.scheduler.capacity}} instead of {{yarn.resourcemanager.monitor.capacity.preemption}}? I have done this in this patch.

{quote}
2) {{mockNested}} {{when(q.getQueuePath())}} should consider hierarchy of queue as well
{quote}
I have added this.

{quote}
3) It's better to add tests for hierarchy of queues when preemption is disabled
{quote}
I have not done this yet, but I wanted to get the rest of this patch up for you.

{quote}
4) In {{testPerQueueDisablePreemption}}, I think number of preemptions after enable queue-b's preemption is not very clear to me:
{code}
+// With no PREEMPTION_DISABLED set for queueB, get resources from both
+// queueB and queueC (times() assertion is cumulative).
+verify(mDisp, times(5)).handle(argThat(new IsPreemptionRequestFor(appB)));
+verify(mDisp, times(16)).handle(argThat(new IsPreemptionRequestFor(appC)));
{code}
In the 2nd preemption, more resource reclaimed from appC than appB, I think it should get resource from app more, could you please take a look at what happened? I'm just afraid because we changed ideal resource calculation in 1st preemption, is it possible to affect 2nd preemption
{quote}
The problem with the mock of {{EventHandler}} (which is the {{mDisp}} variable) is that it counts the total number of events sent to the queue ({{queueC}}, for example), so for the second assertion that was acting upon {{queueC}}, I had to account for the total number of events, not just those for the second call to {{policy.editSchedule()}}. So, what I did for this patch is reset the test between the first and second calls to {{policy.editSchedule()}}.
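The counting behavior described above can be illustrated without a mocking library. The sketch below is a plain-Java stand-in, not Mockito and not the actual test: {{RecordingHandler}} and {{fakeEditSchedule}} are hypothetical names, and the event counts (3 and 2) are made-up values chosen only to show the difference between a cumulative count and a count taken after a reset.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for a mocked event handler; not the Mockito API.
final class CumulativeCountSketch {

    static final class RecordingHandler {
        private final List<String> events = new ArrayList<>();

        void handle(String appId) { events.add(appId); }

        // Counts every event recorded since creation or the last reset.
        long countFor(String appId) {
            return events.stream().filter(appId::equals).count();
        }

        void reset() { events.clear(); }
    }

    // Hypothetical stand-in for one policy pass that emits n preemption events.
    static void fakeEditSchedule(RecordingHandler handler, String appId, int n) {
        for (int i = 0; i < n; i++) {
            handler.handle(appId);
        }
    }

    public static void main(String[] args) {
        RecordingHandler cumulative = new RecordingHandler();
        fakeEditSchedule(cumulative, "appC", 3);
        fakeEditSchedule(cumulative, "appC", 2);
        // Without a reset, the second assertion has to expect the running total.
        System.out.println("cumulative count: " + cumulative.countFor("appC"));

        RecordingHandler withReset = new RecordingHandler();
        fakeEditSchedule(withReset, "appC", 3);
        withReset.reset();
        fakeEditSchedule(withReset, "appC", 2);
        // After a reset, only the events from the second pass are counted.
        System.out.println("count after reset: " + withReset.countFor("appC"));
    }
}
```

Resetting between passes lets each assertion state only what that pass produced, which is easier to read than a running total.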
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201408260128.txt

[~leftnoteasy], I think I have what we need with this patch.

In this patch:
- {{cloneQueues}} includes the queue in the TempQueue list even if it is over capacity and has the disablePreempt flag set, but it marks the queue as disablePreempt = true
- {{computeIdealResourceDistribution}} then subtracts the {{disablePreempt}} queue's used resources from the total guaranteed resources and removes the queue from calculation during ideal resource distribution.
- Then, whenever preemption is being considered after that, the {{disablePreempt}} flag is checked before determining if a queue's resources should be preempted.

The result is that the pending allocations are taken from the queues that do not have the {{disablePreempt}} flag set (if they are also over-capacity).
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-2056:
-----------------------------
    Attachment: YARN-2056.201408202039.txt

This patch keeps the {{yarn.resourcemanager.monitor.capacity.preemption.max_ignored_over_capacity}} property as a global parameter, and then adds a per-queue property in this format: {{yarn.resourcemanager.monitor.capacity.preemption..max_ignored_over_capacity}}

The preemption code makes two sets of passes through the queues. The first time through, it calculates the ideal resource allocation per queue based on normalized guaranteed capacity, and the second time through, it selects which queue's resources to preempt, taking into consideration the {{max_ignored_over_capacity}}.

In this patch, the per-queue {{...max_ignored_over_capacity}} is taken into consideration in the first pass to help determine which queues have resources available for preempting. This is necessary because without it, queues that could fulfill the need would otherwise be removed from the list of available resources.

Then, for the second pass, the global {{...max_ignored_over_capacity}} setting is used, as before, to determine which resources out of the remaining available resources to use.

This patch still requires an RM restart if the queue properties have changed.
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-2056: -- Fix Version/s: (was: 2.1.0-beta) > Disable preemption at Queue level > - > > Key: YARN-2056 > URL: https://issues.apache.org/jira/browse/YARN-2056 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Mayank Bansal > > We need to be able to disable preemption at individual queue level -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2056) Disable preemption at Queue level
[ https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-2056: Description: We need to be able to disable preemption at individual queue level (was: If Queue A does not have enough capacity to run AM, then AM will borrow capacity from queue B to run AM in that case AM will be killed if queue B will reclaim its capacity and again AM will be launched and killed again, in that case job will be failed.) > Disable preemption at Queue level > - > > Key: YARN-2056 > URL: https://issues.apache.org/jira/browse/YARN-2056 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.4.0 >Reporter: Mayank Bansal > Fix For: 2.1.0-beta > > > We need to be able to disable preemption at individual queue level -- This message was sent by Atlassian JIRA (v6.2#6252)