Github user jerrypeng commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2400#discussion_r148923121

    --- Diff: docs/Resource_Aware_Scheduler_overview.md ---
    @@ -243,58 +243,81 @@ http://dl.acm.org/citation.cfm?id=2814808
     <div id='Specifying-Topology-Prioritization-Strategy'/>
     ### Specifying Topology Prioritization Strategy
    -The order of scheduling is a pluggable interface in which a user could define a strategy that prioritizes topologies. For a user to define his or her own prioritization strategy, he or she needs to implement the ISchedulingPriorityStrategy interface. A user can set the scheduling priority strategy by setting the *Config.RESOURCE_AWARE_SCHEDULER_PRIORITY_STRATEGY* to point to the class that implements the strategy. For instance:
    +The order of scheduling and eviction is determined by a pluggable interface in which the cluster owner can define how topologies should be scheduled. For the owner to define his or her own prioritization strategy, she or he needs to implement the ISchedulingPriorityStrategy interface. A user can set the scheduling priority strategy by setting the `DaemonConfig.RESOURCE_AWARE_SCHEDULER_PRIORITY_STRATEGY` to point to the class that implements the strategy. For instance:
     ```
     resource.aware.scheduler.priority.strategy: "org.apache.storm.scheduler.resource.strategies.priority.DefaultSchedulingPriorityStrategy"
     ```
    -A default strategy will be provided. The following explains how the default scheduling priority strategy works.
    +
    +Topologies are scheduled starting at the beginning of the list returned by this plugin. If there are not enough resources to schedule the topology others are evicted starting at the end of the list. Eviction stops when there are no lower priority topologies left to evict.
     **DefaultSchedulingPriorityStrategy**
    -The order of scheduling should be based on the distance between a user's current resource allocation and his or her guaranteed allocation. We should prioritize the users who are the furthest away from their resource guarantee. The difficulty of this problem is that a user may have multiple resource guarantees, and another user can have another set of resource guarantees, so how can we compare them in a fair manner? Let's use the average percentage of resource guarantees satisfied as a method of comparison.
    +In the past the order of scheduling was based on the distance between a user's current resource allocation and his or her guaranteed allocation.
    +
    +We currently use a slightly different approach. We simulate scheduling the highest priority topology for each user and score the topology for each of the resources using the formula
    +
    +```
    +(Requested + Assigned - Guaranteed)/Available
    +```
    +
    +Where
    +
    + * `Requested` is the resource requested by this topology (or a approximation of it for complex requests like shared memory)
    + * `Assigned` is the resources already assigned by the simulation.
    + * `Guaranteed` is the resource guarantee for this user
    + * `Available` is the amount of that resource currently available in the cluster.
    -For example:
    +This gives a score that is negative for guaranteed requests and a score that is positive for requests that are not within the guarantee.
    -|User|Resource Guarantee|Resource Allocated|
    -|----|------------------|------------------|
    -|A|<10 CPU, 50GB>|<2 CPU, 40 GB>|
    -|B|< 20 CPU, 25GB>|<15 CPU, 10 GB>|
    +To combine different resources the maximum of all the indavidual resource scores is used. This guarantees that if a user would go over a guarantee for a single resource it would not be offset by being under guarantee on any other resources.
    --- End diff --

    "indavidual" is misspelled
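
For readers following the thread, the scoring rule quoted in the diff can be sketched roughly as below. This is only an illustration of the formula `(Requested + Assigned - Guaranteed)/Available` and the "maximum over resources" combination as the doc text describes them. The class and method names (`GuaranteeScoreSketch`, `resourceScore`, `combinedScore`) are hypothetical and not taken from Storm's code, and the sketch assumes each resource has a positive `Available` amount.

```java
// Illustrative sketch only: names are hypothetical, not Storm's actual API.
final class GuaranteeScoreSketch {
    private GuaranteeScoreSketch() {}

    // Score for a single resource: (Requested + Assigned - Guaranteed) / Available.
    // Negative when the request fits inside the user's guarantee, positive when it exceeds it.
    // Assumes available > 0.
    static double resourceScore(double requested, double assigned,
                                double guaranteed, double available) {
        return (requested + assigned - guaranteed) / available;
    }

    // Combined score: the maximum of the per-resource scores, so that exceeding the
    // guarantee on one resource is not offset by being under it on another.
    // The arrays are index-aligned, e.g. index 0 = CPU, index 1 = memory.
    static double combinedScore(double[] requested, double[] assigned,
                                double[] guaranteed, double[] available) {
        double worst = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < requested.length; i++) {
            double score = resourceScore(requested[i], assigned[i], guaranteed[i], available[i]);
            worst = Math.max(worst, score);
        }
        return worst;
    }
}
```

With made-up numbers, a user who requests 100 CPU with 200 already assigned against a guarantee of 400 (1000 available) scores negative on CPU, while requesting 2048 MB with 4096 MB assigned against a 4096 MB guarantee (8192 MB available) scores positive on memory. The combined score takes the positive memory value, so the request counts as outside the guarantee.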
---