[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393440#comment-14393440 ] Vinod Kumar Vavilapalli commented on YARN-3318: --- Filed YARN-3441 and YARN-3442 for parent queues and for limits. > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, > YARN-3318.36.patch, YARN-3318.39.patch > > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393388#comment-14393388 ] Vinod Kumar Vavilapalli commented on YARN-3318: --- bq. I think it should be fine to make policy interfaces define as well as CapacityScheduler changes together with this patch (only for FifoOrderingPolicy), it's good to see how interfaces and policies work in CS, is it easy or not, etc. = We can still do this with patches on two JIRAs - one for the framework, one for CS, one for FS etc. The Fifo one can be here for demonstration, no problem with that. Why is it so hard to focus one thing in one JIRA? > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, > YARN-3318.36.patch, YARN-3318.39.patch > > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393374#comment-14393374 ] Wangda Tan commented on YARN-3318: -- [~cwelch], I took a look at your latest patch as well as [~vinodkv]'s suggestions, comments: *1. I prefer what Vinod suggested, split "SchedulerProcess" to be "QueueSchedulable" and "AppSchedulable" to avoid notes in FairScheduler interface "Schedulable" like:* {code} /** Start time for jobs in FIFO queues; meaningless for QueueSchedulables.*/ {code} They can both inherit {{Schedulable}}. With this patch, we can limit to AppSchedulable and Schedulable definition. Also, regarding to schedulable comparator, not all "Schedulable" fit for all comparator, it's meaningless to do "FIFO" scheduling in parent queue level. I think: {code} Schedulable contains ResourceUsage (class), and name In addition, AppSchedulable contains compareSubmissionOrderTo(AppSchedulable) and Priority {code} *2. About inherit relationships between interfaces/classes, now it's not very clear to me, I spent some time got what they're doing. My suggestion is:* {code} FairOrderingPolicy/FifoOrderingPolicy > OrderingPolicy (implements) FairOrderingPolicy and FifoOrderingPolicy could inherit from AbstractOrderingPolicy with commmon implementations FairOrderingPolicy/FifoOrderingPolicy > FairSchedulableComparator/FifoSchedulableComparator (uses) It's no need to invent "SchedulerComparator" interface, use existing Java Comparator interface should be simple and enough. {code} *3. Regarding relationship between OrderingPolicy and comparator:* I understand the method of SchedulerComparator is to reduce unnecessary re-sort Schedulables being added/modified in OrderingPolicy, but actually we can 1) Do this in OrderingPolicy itself. For example, with my above suggestion, FifoOrderingPolicy will simply ignore container changed notifications. 2) Comparator doesn't know about global info, only OrderingPolicy knows how combination of Comparator actors, I don't want containerAllocate/Release coupled in Comparator interface. And we don't need a separated "CompoundComparator", this can be put in AbstractOrderingPolicy. *4. Regarding configuration (CapacitySchedulerConfiguration):* I think we don't need ORDERING_POLICY_CLASS, two operations for very similar purpose can confuse user. I suggest only leave ordering-policy, and it name can be: "fifo", "fair" regardless of its internal "comparator" implementaiton. And in the future we can add "priority-fifo", "priority-fair". (note the "-" in name doesn't means "AND" only, it could be collaborate of the two instead of simply combination). If user specify a name not in white-list-shortname given by us, we will try to load class with the name. *5. Regarding longer term plan, LimitPolicy:* This part seems not well discussed, to limit scope of this JIRA, so I think its implementation and definition should happen in separated ticket. For longer plan, considering YARN-2986 as well, we may configure queue like following: {code} fair true 50 .. .. {code} Changes of this patch in CapacitySchedulerConfiguration seems reasonable, as Craig mentioned, simply mark it to be unstable or experimental should be enough. Longer term is to define and stablize YARN-2986 to make a real uniformed scheduler. *6. Regarding scope of this JIRA* I think it should be fine to make policy interfaces define as well as CapacityScheduler changes together with this patch (only for FifoOrderingPolicy), it's good to see how interfaces and policies work in CS, is it easy or not, etc. = And following I suggest to move to a separated ticket: 1) UI (Web and CLI) 2) REST 3) PB related changes Along with patch getting changes, you don't have to maintain above changes together with the patch. Please feel free to let me know your thoughts. > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, > YARN-3318.36.patch, YARN-3318.39.patch > >
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393347#comment-14393347 ] Vinod Kumar Vavilapalli commented on YARN-3318: --- bq. I think it is useful to split off CS changes into their own JIRA. We can strictly focus on the policy framework here. You missed this, let's please do this. bq. well, I'd actually talked Wangda Tan into SchedulerProcess So, we can chew on this a bit more & see where we go SchedulerProcess is definitely misleading. It seems to point to a process that is doing scheduling. What you need is a Schedulable / SchedulableEntity / Consumer etc. You could also say SchedulableProcess, but Process is way too overloaded. bq. The goal is to make this available quickly but iteratively, keeping the changes small but making them available for use and feedback. (..) We should grow it organically, gradually, iteratively, think of it is a facet of the policy framework hooked up and available but with more to follow I agree to this, but we are not in a position to support the APIs, CLI, config names in a supportable manner yet. They may or may not change depending on how parent queue policies, limit policies evolve. For that reason alone, I am saying that (1) Don't make the configurations public yet, or put a warning saying that they are unstable and (2) don't expose them in CLI , REST APIs yet. It's okay to put in the web UI, web UI scraping is not a contract. bq. You add/remove applications to/from LeafQueue's policy but addition/removal of containers is an event... bq. This has been factored differently along Wangda Tan's suggestion, it should now be consistent It's a bit better now. Although we are hard-coding Containers. Can revisit this later. Other comments - SchedulerApplicationAttempt.getDemand() should be private. - SchedulerProcess -- updateCaches() -> updateState() / updateSchedulingState() as that is what it is doing? -- getCachedConsumption() / getCachedDemand(): simply getCurrent*() ? - SchedulerComparator -- We aren't comparing Schedulers. Given the current name, it should have been SchedulerProcessComparator, but SchedulerProcess itself should be renamed as mentioned before. -- What is the need for reorderOnContainerAllocate () / reorderOnContainerRelease()? - Move all the comparator related classed into their own package. - SchedulerComparatorPolicy -- This is really a ComparatorBasedOrderingPolicy. Do we really see non-comparator based ordering-policy. We are unnecessarily adding two abstractions - adding policies and comparators. -- Use className.getName() instead of hardcoded strings like "org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FifoComparator" > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, > YARN-3318.36.patch, YARN-3318.39.patch > > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392345#comment-14392345 ] Hadoop QA commented on YARN-3318: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12708923/YARN-3318.39.patch against trunk revision 867d5d2. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1149 javac compiler warnings (more than the trunk's current 1148 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.api.impl.TestAMRMClient Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7198//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7198//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7198//console This message is automatically generated. > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, > YARN-3318.36.patch, YARN-3318.39.patch > > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392211#comment-14392211 ] Craig Welch commented on YARN-3318: --- [~leftnoteasy] SchedulerProcessEvents replaced with containerAllocated and containerReleased Serial and SerialEpoch replaced with compareInputOrderTo(), which is the option 2 for addressing it which we settled on offline Added addSchedulerProcess/removeSchedulerProcess/addAllSchedulerProcesses Changed configuration so that yarn.scheduler.capacity.root.default.ordering-policy=fair will setup the fair configuration, "fifo" will setup fifo, "fair+fifo" will setup compound fair + fifo, etc. It is possible to setup a custom ordering policy class using a different configuration, but the base one will handle the "friendly" setup. [~vinodkv] bq. It is not entirely clear how the ordering and limits work together - as one policy with multiple facets or multiple policy types This should be modeled as different types of policies, so that they can each focus on their particular purpose and avoid a repetition of the intermingling which has made it difficult to mix, match, and share capabilities. Having multiple policy types is essential to make it easy to combine them as needed. bq. let's split the patch that exposes this to the client side / web UI and in the API records into its own JIRA...premature to support this as a publicly supportable configuration... The goal is to make this available quickly but iteratively, keeping the changes small but making them available for use and feedback. Clearly we can mark things "unstable", communicate that they are not fully mature/subject to change/should be used gently, but we will need to make it possible to activate the feature and use it in order to accomplish the use and feedback. We should grow it organically, gradually, iteratively, think of it is a facet of the policy framework hooked up and available but with more to follow bq. ...SchedulableEntity better... well, I'd actually talked [~leftnoteasy] into SchedulerProcess :-) So, we can chew on this a bit more & see where we go bq. You add/remove applications to/from LeafQueue's policy but addition/removal of containers is an event... This has been factored differently along [~leftnoteasy]'s suggestion, it should now be consistent bq. The notion of a comparator doesn't make sense to an admin. It is simply a policy... Have modeled "policy configuration" differently, so "comparator" is out of sight (see above). bq. Depending on how ordering and limits come together, they may become properties of a policy I expect them to be distinct, this is specifically an "ordering-policy", limits will be other types of "limit-policy"(ies) patch with these changes to follow in a few... > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, YARN-3318.36.patch > > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389749#comment-14389749 ] Vinod Kumar Vavilapalli commented on YARN-3318: --- Top level comments - It is not entirely clear how the ordering and limits work together - as one policy with multiple facets or multiple policy types. So we need to think more about what we expose right now -- let's split the patch that exposes this to the client side / web UI and in the API records into its own JIRA. We should do it a little later after the framework stabilizes. -- Capacity Scheduler Configuration also need more thought in terms of naming and layout. I think it's premature to support this as a publicly supportable configuration. But at the same time, to enable experimentation, we should simply not put in the public side of the code. - For future's sake and for discussion with FairScheduler folks, I think it is useful to split off CS changes into their own JIRA. We can strictly focus on the policy framework here. Quick comments on the policy stuff itself - I also like SchedulableEntity better. If not that, simply Schedulable works too - You add/remove applications to/from LeafQueue's policy but addition/removal of containers is an event. I think we should be consistent with these two. May be invent some sort of a notion of parent/child SchedulableEntities and make this generic? - The notion of a comparator doesn't make sense to an admin. It is simply a policy. Depending on how ordering and limits come together, they may become properties of a policy? > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, YARN-3318.36.patch > > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389583#comment-14389583 ] Craig Welch commented on YARN-3318: --- BTW, can't just do lexical sort on the string version of application id - one problem with using the lexical compare on appid, the format for the "id" component is a min of 4 digits, which means that going from to 1 will result in incorrect lexical sort wrt to actual order > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, YARN-3318.36.patch > > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389563#comment-14389563 ] Craig Welch commented on YARN-3318: --- The remaining javac error doesn't appear to be related to my changes, which is confusing. On the next patch will have a change to try and address it anyway. TestRM passes on my box, I assume it's a transient issue. > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, YARN-3318.36.patch > > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389315#comment-14389315 ] Hadoop QA commented on YARN-3318: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12708491/YARN-3318.36.patch against trunk revision 79f7f2a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1149 javac compiler warnings (more than the trunk's current 1148 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestRM Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7175//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7175//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7175//console This message is automatically generated. > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, YARN-3318.36.patch > > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389166#comment-14389166 ] Wangda Tan commented on YARN-3318: -- bq. Well, I'm certainly open to other name options, but I do prefer SchedulerProcess to SchedulableEntity ... I don't like generic "Entity" as well, I cannot get a better name now, so just keep it as-is now and see if anybody has better suggestions, renaming should be simple :) bq. Not all event handling must be asynchronous... *My goal was precisely to avoid needing to have implementer add a new method implementation every time a new event comes into play which may not be of interest to it* This is one thing I don't agree. CS implemnts EventHandler is that CS will be called by other modules in asynchronized way, it handle such events one by one, but dispatcher in RM will queuing events for CS. I think SchedulerProcess will likely to be used in a sycnhronized way, making EventHandler can make less changes for interfaces but also reduces readability. I think this is why all scheduler internal classes choose to use setters instead of event handler bq. SerialEpoch.. I read FairShareComparator and also FifoComparator again, how about keep an ID() only, it's a string and should contains everything we need to compare "nature" ordering of two SchedulerProcess, instead of policies in FairScheduler confusing different fields to compare nature order. We will keep a note that, we will use lexicographical order of id() to compare. submitTime should be an important property that need to be kept in SchedulerProcess even if we don't use it for now. bq. I looked at switching to using the pending value of the attemptResourceUsage in the SchedulerApplicationAttempt but I don't see that it is being updated anywhere, am I missing it? Oh I should mention that, such fields are set in YARN-3361, beyond requirements of fair policy, these pending values are used when doing normal scheduling as well as preemption. bq. I think there needs to be a "getCopyPending" or the like, for usecases where you get the value and then start to use it. Good suggestion, I think we should do that to avoid resource being modified while using by another reader. I think we can make internal resources in ResourceUsage to be "copy-on-write", to make sure resources returned by getters is just a snapshot. bq. I think the current factoring is correct. First, you'll notice that the return type of getSchedulerProcess() is a Collection The thing I'm worrying about is, this assumes caller can modify the collection. It is very possible returned collection is just a copy or immutable collection to avoid race condition according to different SchedulerProcess impl. So I think it's better to make callers can only use limited interfaces to get/set internal structure. And also, there could have some operations need to do when add/remove object in the internal data structure, caller shouldn't silently modify it. bq. ...Simplify configuration... I suggest you can take a look at YARN-2986, which is very related to this discussion. We don't need to manually set comparator for most cases, instead, it comes with policy. I don't want to expose "comparator" to end user since it's too specific to internal logics. Instead, user need just specify queue's policy="fair" and set some "policy-options" like application-priority-enabled=true, etc. "policy=fair" assumes use comparator=fair+fifo, when application-priority-enabled, it assumes "priority comparator" added to "CompoundComparator" -- they're all internal logics. The configuration in my mind is (it's hierarchy config, also we can make it to be plain config) {code} true true {code} > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch, YARN-3318.36.patch > > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388361#comment-14388361 ] Hadoop QA commented on YARN-3318: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12708385/YARN-3318.35.patch against trunk revision b5a22e9. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1151 javac compiler warnings (more than the trunk's current 1148 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.cli.TestYarnCLI org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7168//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7168//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7168//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7168//console This message is automatically generated. > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch, YARN-3318.34.patch, YARN-3318.35.patch > > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388275#comment-14388275 ] Craig Welch commented on YARN-3318: --- Hi Wangda, I have changed the patch a bit in the background without updating it on the jira. The changes are not major but I think they render some of the comments obsolete. I've uploaded the up-to-date patch just now, before doing so I took a pass through your comments- below I'll respond to each in turn- bq. 1) SchedulerProcess bq. 1.1. ...name seems not very straightforward to me... Well, I'm certainly open to other name options, but I do prefer SchedulerProcess to SchedulableEntity - it's common to refer the items which a scheduler will schedule as "Processes", which is what these are in this case, which is why I chose this name. "Entity" is really very generic and empty of meaning. I do wish to avoid confusion with Scheduleable (and wasn't enamored of that name either...), I expect that as integration progresses there will be a period where Schedulable will be an extension of SchedulerProcess with the (remaining) FairScheduler specific bits (which will, I think, ultimately be incorporated in some way into SchedulerProcess, but that's down the line/should be addressed in further iteration). In any case, not in favor of adding "Entity", I think when you consider the terminology as explained above SchedulerProcess works, try it on and see, and feel free to give other options... bq. 1.2. ...SchedulerProcessEvent...asynchronized Not all event handling must be asynchronous. I believe the details regarding this were spelled out reasonably well in the interface definition - if you take a peek at how these events are handled in the capacity scheduler configuration you will see that they are synchronously/safely within protection against mutation of the schedulerprocess collection. My goal was precisely to avoid needing to have implementer add a new method implementation every time a new event comes into play which may not be of interest to it, this makes maintenance of implementations easier - they can manage those which they understand and have appropriate default logic otherwise. I think this is a classic case for an enumerated set of events to be handled by the interface & so I think it should be modeled as it is as opposed to adding a new method for each new event type to the interface itself... bq. 1.3 ...SerialEpoch Yeah, I don't like the names much either, I've gone through several versions and come to the conclusion that it's not the choice of names that's the problem. This is an attempt to "hide" the application id's while also exposing them, which is compound in nature, and is made stranger by the fact that this is totally irrelevant for other potential future implementors (such as queues). I want to factor it differnently not just change the names, these are the courses I'm considering: 1. Have SchedulerProcess implement "Comparable" and provide a "natural ordering", which is fifo for apps. This seems to "privilege" fifo but, as a matter of fact, it's the fallback for fair so I'm not sure that's really an inappropriate thing to do - it seems like it is the "natural ordering" for apps. Other things can give their own natural ordering (queues - the hierarchy...), so it should extend reasonably well without the current awkwardness. This would remove all of "getSerialEpoch", "getSerial", and "getId" in favor of just implementing "compareTo" from comparable. The downsides I see are the "privilage" and that if an implementor of SchedulerProcess implemented "comparable" in an unworkable fashion it would be an issue, not the case for what we are presently looking at supporting afaik 2. Have an explicit "compareCreationOrder(SchedulerProcess other)" method which returns 0 + - like compareTo. This is much like 1, but removes the "privilege" and the possible Comparable collision... this also does away getSerialEpoch, getSerial, and getId in favor of the comparison method. What do you think of these options? Preference? BTW, FS has an actual "startTime" for fsappattempts, but looking through it I don't like that approach - it doesn't appear to do the right thing in some cases (like rm failover or recovery), it still can be ambiguous for some cases (simultanious start w/in tsmillis granularity) where there's a fallback to appid, so it doesn't look to really add anything as you still have to be able to fall back to the app id for those cases so you can't get away from the issue, and it adds a bit of complexity to boot. bq. 1.4 ...currentConsumption is not enough to make choice, demand(pending-resource)/used and priority/weight are basic fields of a Schedulable, do you think so... Of those, only demand is required for the initial step of supporting application level fairness when sizeBasedWeight
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388178#comment-14388178 ] Hadoop QA commented on YARN-3318: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12708335/YARN-3318.34.patch against trunk revision 85dc3c1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 8 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1149 javac compiler warnings (more than the trunk's current 1146 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.client.cli.TestYarnCLI org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7163//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7163//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7163//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7163//console This message is automatically generated. > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch, YARN-3318.34.patch > > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387544#comment-14387544 ] Wangda Tan commented on YARN-3318: -- Hi Craig, Thanks for working on the patch, I just took a look, some overall comments for interface definitions: *1) {{SchedulerProcess}}* 1.1. It's name seems not very straightforward to me, many processes can run for an application. How about call it SchedulableEntity (to not confuse with fair scheduler's Schedulable)? 1.2. I found {{SchedulerProcessEvent}} defined in SchedulerProcess, is there any specific reason to make SchedulerProcess to be asynchronized? As far as I can imagine, fair/capacity(fifo)/priority, all of them need synchronized call (or synchronized calls are enough). The reason is, if you don't want to write a brand-new scheduler, now all schedulers assume scheduler event need handle under scheduler lock. 1.3. SerialEpoch and Serial not straight forward to me also, how about call them clusterTimeStamp and submissionId? 1.4. currentConsumption is not enough to make choice, demand(pending-resource)/used and priority/weight are basic fields of a Schedulable, do you think so? I suggest you can take a look at ResourceUsage class, now it becomes field of CSQueue/LeafQueue.User/SchedulerApplicationAttempt already, it should have enough resource-related fields to make scheduling decisions. And it considered node-label as well. *2) {{OrderingPolicy}}* 2.1 addSchedulerProcess/removeSchedulerProcess/addAllSchedulerProcesses should become interface of Orderingpolicy. Now doing getSchedulerProcess().add(..) assumes internal implementation is treeSet. It's better not depend on internal Collection<> behavior. 2.2 {{handleSchedulerProcessEvent}} makes the OrderingPolicy mixed synchronized/asynchronized writer interface, I suggest to make it to be, containerAllocated/Preempted(SchedulerProcess, RMContainer) 2.3 Rename it to be {{SchedulerProcessOrderingPolicy}}? *3) {{SchedulerComparatorPolicy}}* I understand this class is to make different comparators can share same policy implementation, and admins can combine different comparators via configuration. But it seems a little hard to use for me. For example, now admin need configure following options to enable fair schedling per queue: {code} policy=SchedulerComparatorPolicy comparator=CompoundComparator, and its comparators are {fair and fifo} {code} Instead, I think we need hide internal implementation of policy/comparator, we can predefine policy=fair means using CompoundComparator (fair+fifo) and policy = capacity means using fifo. We can still leave admin flexibility to configure their custom policy/comparator, but we need reduce configuration complexity for most common usecases. Will include review for tests once we have consent on interfaces. > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3318.13.patch, YARN-3318.14.patch, > YARN-3318.17.patch > > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353965#comment-14353965 ] Craig Welch commented on YARN-3318: --- The proposed initial implementation of the framework to support FIFO SchedulerApplicationAttempt ordering for the CapacityScheduler: A SchedulerComparatorPolicy which implements OrderingPolicy above. This implementation will take care of the common logic required for cases where the policy can be effectively implemented as a comparator (which is expected to be the case for several potential policies, including FIFO). A SchedulerComparator which is used by the SchedulerComparatorPolicy above. This is an extension of the java Comparator interface with additional logic required by the SchedulerComparatorPolicy, initially a method to accept SchedulerProcessEvents and indicate whether the require re-ordering of the associated SchedulerProcess. > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch >Assignee: Craig Welch > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3318) Create Initial OrderingPolicy Framework, integrate with CapacityScheduler LeafQueue supporting present behavior
[ https://issues.apache.org/jira/browse/YARN-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353953#comment-14353953 ] Craig Welch commented on YARN-3318: --- Proposed elements of the framework: A SchedulerProcess interface which generalizes processes to be managed by the OrderingPolicy (initially, potentially in the future by other Policies as well) Initial implementer will be the SchedulerApplicaitonAttempt. An OrderingPolicy interface which exposes a collection of scheduler processes which will be ordered by the policy for container assignment and preemption. The ordering policy will provide one Iterator which presents processes in the policy specific order for container assignment and another Iterator which presents them in the proper order for preemption. It will also accept SchedulerProcessEvents which may indicate a need to re-order the associated SchedulerProcess (for example, after container completion, preemption, assignment, etc) > Create Initial OrderingPolicy Framework, integrate with CapacityScheduler > LeafQueue supporting present behavior > --- > > Key: YARN-3318 > URL: https://issues.apache.org/jira/browse/YARN-3318 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Craig Welch > > Create the initial framework required for using OrderingPolicies with > SchedulerApplicaitonAttempts and integrate with the CapacityScheduler. This > will include an implementation which is compatible with current FIFO behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)