[jira] [Comment Edited] (YARN-8967) Change FairScheduler to use PlacementRule interface

Wilfred Spiegelenburg (JIRA) Mon, 18 Mar 2019 07:16:17 -0700


    [ 
https://issues.apache.org/jira/browse/YARN-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795049#comment-16795049
 ]


Wilfred Spiegelenburg edited comment on YARN-8967 at 3/18/19 2:15 PM:
----------------------------------------------------------------------

Thank you for the review [~yufeigu]

AllocationFileLoaderService file:
1) yes it did clean up nicely
2) The class is marked as {{@Unstable}} that should cover the change. Leaving 
the old constructors in could allow you to create a new 
{{AllocationFileLoaderService}} without a scheduler reference. That would cause 
a NPE on scheduler init and every single time the reload thread would run, 
leaving the RM in a failed state. I don't think it would be wise to leave them 
in.  _Based on all this I do think I need to file a follow up jira to fix the 
Hive SHIM that uses the policy at the moment and move that to the new code in a 
backward compatible way._
3) fixed that
4) fixed that
5) The difference between recovery and normal is just two if statements: in the 
first we ignore and empty context on recovery and the second one is to not 
generate an event on recovery. Moving the code out would not help. The checks 
are on opposite sides of the method and simple.
6) We could still have an empty queue that was why I left it. I just noticed 
that that case would be caught by the {{getLeafQueue}} so we should be OK with 
removing.
7) fixed that, it should have been removed

QueuePlacementPolicy file:
1) I have chosen to use the utility class solution and clean up a bit more. 
Keeping the QueuePlacementPolicy around in the allocation does not really help 
as the rules are really only relevant in the QueuePlacementManager in the new 
setup. There is no logic beside the rule list which is not 1:1 with the config 
that we could keep around.
2) fixed the reference (I used javadoc as there was nothing for other comments, 
now it is just a plain comment)
3) removed the comment and code
4) fixed
5) the tests look really similar but they are not. They test slight variations: 
the first two checks make sure the specified rule and create user rule trigger 
correctly. The last two make sure that the specified rule triggers but not the 
user rule and that the default rule does the catch it correctly.
6) fixed that, I left it at first with a view on possible extension later with 
other bits. I now moved the parent create code out and left the loop for 
elements which clears things up.
7) added a RuleMap class based on the suggestion
8) I think it is better to file a follow up jira as the same has happened in 
all new rule classes. We must have overlooked them in the previous jira when we 
did the cleanup. I checked and the exception is logged in the client service so 
it can be done.




was (Author: wilfreds):
Thank you for the review [~yufeigu]

1) yes it did clean up nicely
2) The class is marked as {{@ Unstable}} that should cover the change. Leaving 
the old constructors in could allow you to create a new 
{{AllocationFileLoaderService}} without a scheduler reference. That would cause 
a NPE on scheduler init and every single time the reload thread would run, 
leaving the RM in a failed state. I don't think it would be wise to leave them 
in. 
Based on all this I do think I need to file a follow up jira to fix the Hive 
SHIM that uses the policy at the moment and move that to the new code in a 
backward compatible way.
3) fixed that
4) fixed that
5) The difference between recovery and normal is just two if statements: in the 
first we ignore and empty context on recovery and the second one is to not 
generate an event on recovery. Moving the code out would not help. The checks 
are on opposite sides of the method and simple.
6) We could still have an empty queue that was why I left it. I just noticed 
that that case would be caught by the {{getLeafQueue}} so we should be OK with 
removing.
7) fixed that, it should have been removed

1) I have chosen to use the utility class solution and clean up a bit more. 
Keeping the QueuePlacementPolicy around in the allocation does not really help 
as the rules are really only relevant in the QueuePlacementManager in the new 
setup. There is no logic beside the rule list which is not 1:1 with the config 
that we could keep around.
2) fixed the reference (I used javadoc as there was nothing for other comments, 
now it is just a plain comment)
3) removed the comment and code
4) fixed
5) the tests look really similar but they are not. They test slight variations: 
the first two checks make sure the specified rule and create user rule trigger 
correctly. The last two make sure that the specified rule triggers but not the 
user rule and that the default rule does the catch it correctly.
6) fixed that, I left it at first with a view on possible extension later with 
other bits. I now moved the parent create code out and left the loop for 
elements which clears things up.
7) added a RuleMap class based on the suggestion
8) I think it is better to file a follow up jira as the same has happened in 
all new rule classes. We must have overlooked them in the previous jira when we 
did the cleanup. I checked and the exception is logged in the client service so 
it can be done.



> Change FairScheduler to use PlacementRule interface
> ---------------------------------------------------
>
>                 Key: YARN-8967
>                 URL: https://issues.apache.org/jira/browse/YARN-8967
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler, fairscheduler
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>            Priority: Major
>         Attachments: YARN-8967.001.patch, YARN-8967.002.patch, 
> YARN-8967.003.patch, YARN-8967.004.patch, YARN-8967.005.patch, 
> YARN-8967.006.patch, YARN-8967.007.patch, YARN-8967.008.patch
>
>
> The PlacementRule interface was introduced to be used by all schedulers as 
> per YARN-3635. The CapacityScheduler is using it but the FairScheduler is not 
> and is using its own rule definition.
> YARN-8948 cleans up the implementation and removes the CS references which 
> should allow this change to go through.
> This would be the first step in using one placement rule engine for both 
> schedulers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (YARN-8967) Change FairScheduler to use PlacementRule interface

Reply via email to