adding to documentation

Project: http://git-wip-us.apache.org/repos/asf/storm/repo
Commit: http://git-wip-us.apache.org/repos/asf/storm/commit/0d616ba5
Tree: http://git-wip-us.apache.org/repos/asf/storm/tree/0d616ba5
Diff: http://git-wip-us.apache.org/repos/asf/storm/diff/0d616ba5

Branch: refs/heads/master
Commit: 0d616ba539a3e72b7c26fe74a7e8690f7b60a6e8
Parents: caf525e
Author: Boyang Jerry Peng <jerryp...@yahoo-inc.com>
Authored: Sun Oct 9 22:44:32 2016 -0500
Committer: Boyang Jerry Peng <jerryp...@yahoo-inc.com>
Committed: Mon Oct 10 14:19:38 2016 -0500

----------------------------------------------------------------------
 docs/Resource_Aware_Scheduler_overview.md       | 253 +++++++++++++++----
 .../ras_new_strategy_network_cdf_random.png     | Bin 0 -> 157859 bytes
 ...tegy_network_metric_cdf_yahoo_topologies.png | Bin 0 -> 152034 bytes
 ...rategy_network_metric_improvement_random.png | Bin 0 -> 68976 bytes
 .../ras_new_strategy_network_metric_random.png  | Bin 0 -> 79004 bytes
 ...strategy_network_metric_yahoo_topologies.png | Bin 0 -> 75195 bytes
 docs/images/ras_new_strategy_runtime_random.png | Bin 0 -> 76534 bytes
 docs/images/ras_new_strategy_runtime_yahoo.png  | Bin 0 -> 71727 bytes
 8 files changed, 201 insertions(+), 52 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/storm/blob/0d616ba5/docs/Resource_Aware_Scheduler_overview.md
----------------------------------------------------------------------
diff --git a/docs/Resource_Aware_Scheduler_overview.md 
b/docs/Resource_Aware_Scheduler_overview.md
index 4dc26f5..e3e2b56 100644
--- a/docs/Resource_Aware_Scheduler_overview.md
+++ b/docs/Resource_Aware_Scheduler_overview.md
@@ -3,23 +3,45 @@ title: Resource Aware Scheduler
 layout: documentation
 documentation: true
 ---
-# Introduction
 
-The purpose of this document is to provide a description of the Resource Aware 
Scheduler for the Storm distributed real-time computation system.  This 
document will provide you with both a high level description of the resource 
aware scheduler in Storm
+# Introduction
 
+The purpose of this document is to describe the Resource Aware Scheduler for the Storm distributed real-time computation system.  This document provides a high level description of the resource aware scheduler in Storm.  Some of the benefits of using a resource aware scheduler on top of Storm are outlined in the following presentation at Hadoop Summit 2016:
+
+http://www.slideshare.net/HadoopSummit/resource-aware-scheduling-in-apache-storm
+
+# Table of Contents
+1. [Using Resource Aware Scheduler](#Using-Resource-Aware-Scheduler)
+2. [API Overview](#API-Overview)
+    1. [Setting Memory Requirement](#Setting-Memory-Requirement)
+    2. [Setting CPU Requirement](#Setting-CPU-Requirement)
+    3. [Limiting the Heap Size per Worker (JVM) 
Process](#Limiting-the-Heap-Size-per-Worker-(JVM)Process)
+    4. [Setting Available Resources on 
Node](#Setting-Available-Resources-on-Node)
+    5. [Other Configurations](#Other-Configurations)
+3. [Topology Priorities and Per User 
Resource](#Topology-Priorities-and-Per-User-Resource)
+    1. [Setup](#Setup)
+    2. [Specifying Topology Priority](#Specifying-Topology-Priority)
+    3. [Specifying Scheduling Strategy](#Specifying-Scheduling-Strategy)
+    4. [Specifying Topology Prioritization 
Strategy](#Specifying-Topology-Prioritization-Strategy)
+    5. [Specifying Eviction Strategy](#Specifying-Eviction-Strategy)
+4. [Profiling Resource Usage](#Profiling-Resource-Usage)
+5. [Enhancements on original 
DefaultResourceAwareStrategy](#Enhancements-on-original-DefaultResourceAwareStrategy)
+
+<div id='Using-Resource-Aware-Scheduler'/>
 ## Using Resource Aware Scheduler
 
 The user can switch to using the Resource Aware Scheduler by setting the 
following in *conf/storm.yaml*
 
     storm.scheduler: 
"org.apache.storm.scheduler.resource.ResourceAwareScheduler"
-
-
+    
+<div id='API-Overview'/>
 ## API Overview
 
 For use with Trident, please see the [Trident RAS API](Trident-RAS-API.html)
 
 For a Storm Topology, the user can now specify the amount of resources a 
topology component (i.e. Spout or Bolt) is required to run a single instance of 
the component.  The user can specify the resource requirement for a topology 
component by using the following API calls.
 
+<div id='Setting-Memory-Requirement'/>
 ### Setting Memory Requirement
 
 API to set component memory requirement:
@@ -48,9 +70,9 @@ Example of Usage:
 
 The entire memory requested for this topology is 16.5 GB. That is from 10 
spouts with 1GB on heap memory and 0.5 GB off heap memory each and 3 bolts with 
0.5 GB on heap memory each.
 
+<div id='Setting-CPU-Requirement'/>
 ### Setting CPU Requirement
 
-
 API to set component CPU requirement:
 
     public T setCPULoad(Double amount)
@@ -71,9 +93,9 @@ Example of Usage:
     builder.setBolt("exclaim2", new HeavyBolt(), 1)
                     .shuffleGrouping("exclaim1").setCPULoad(450.0);
 
+<div id='Limiting-the-Heap-Size-per-Worker-(JVM)Process'/>
 ###    Limiting the Heap Size per Worker (JVM) Process
 
-
     public void setTopologyWorkerMaxHeapSize(Number size)
 
 Parameters:
@@ -85,7 +107,8 @@ Example of Usage:
 
     Config conf = new Config();
     conf.setTopologyWorkerMaxHeapSize(512.0);
-
+    
+<div id='Setting-Available-Resources-on-Node'/>
 ### Setting Available Resources on Node
 
 A storm administrator can specify node resource availability by modifying the 
*conf/storm.yaml* file located in the storm home directory of that node.
@@ -106,7 +129,7 @@ Example of Usage:
     supervisor.memory.capacity.mb: 20480.0
     supervisor.cpu.capacity: 100.0
 
-
+<div id='Other-Configurations'/>
 ### Other Configurations
 
 The user can set some default configurations for the Resource Aware Scheduler 
in *conf/storm.yaml*:
@@ -123,11 +146,13 @@ The user can set some default configurations for the 
Resource Aware Scheduler in
     //default value for the max heap size for a worker  
     topology.worker.max.heap.size.mb: 768.0
 
-# Topology Priorities and Per User Resource 
+<div id='Topology-Priorities-and-Per-User-Resource'/>
+## Topology Priorities and Per User Resource 
 
 The Resource Aware Scheduler, or RAS, also has multitenant capabilities since many Storm users typically share a Storm cluster.  The Resource Aware Scheduler can allocate resources on a per user basis.  Each user can be guaranteed a certain amount of resources to run his or her topologies, and the Resource Aware Scheduler will meet those guarantees when possible.  When the Storm cluster has extra free resources, the Resource Aware Scheduler will be able to allocate additional resources to users in a fair manner. The importance of topologies can also vary.  Topologies can be used for actual production or just experimentation, thus the Resource Aware Scheduler will take into account the importance of a topology when determining the order in which to schedule topologies or when to evict topologies.
 
-## Setup
+<div id='Setup'/>
+### Setup
 
 The resource guarantees of a user can be specified in *conf/user-resource-pools.yaml*.  Specify the resource guarantees of a user in the following format:
 
@@ -151,7 +176,7 @@ An example of what *user-resource-pools.yaml* can look like:
 
 Please note that the specified amount of Guaranteed CPU and Memory can be either an integer or a double.
 
-## API Overview
+<div id='Specifying-Topology-Priority'/>
 ### Specifying Topology Priority
 Topology priorities can range from 0-29.  The topology priorities will be partitioned into several priority levels, each of which may contain a range of priorities. 
 For example, we can create a priority level mapping:
@@ -169,7 +194,8 @@ Parameters:
 
 Please note that the 0-29 range is not a hard limit.  Thus, a user can set a priority number that is higher than 29.  However, the property that a higher priority number means lower importance still holds.
 
-### Specifying Scheduling Strategy:
+<div id='Specifying-Scheduling-Strategy'/>
+### Specifying Scheduling Strategy
 
 A user can specify on a per topology basis what scheduling strategy to use.  
Users can implement the IStrategy interface and define new strategies to 
schedule specific topologies.  This pluggable interface was created since we 
realize different topologies might have different scheduling needs.  A user can 
set the topology strategy within the topology definition by using the API:
 
@@ -184,13 +210,144 @@ Example Usage:
 
 A default scheduling strategy is provided.  The DefaultResourceAwareStrategy is implemented based on the scheduling algorithm in the original paper describing resource aware scheduling in Storm:
 
-http://web.engr.illinois.edu/~bpeng/files/r-storm.pdf
+Peng, Boyang, Mohammad Hosseini, Zhihao Hong, Reza Farivar, and Roy Campbell. 
"R-storm: Resource-aware scheduling in storm." In Proceedings of the 16th 
Annual Middleware Conference, pp. 149-161. ACM, 2015.
+
+http://dl.acm.org/citation.cfm?id=2814808
+
+**Please Note: Enhancements have been made on top of the original scheduling strategy as described in the paper.  Please see the section "Enhancements on original DefaultResourceAwareStrategy".**
+
+<div id='Specifying-Topology-Prioritization-Strategy'/>
+### Specifying Topology Prioritization Strategy
+
+The order of scheduling is governed by a pluggable interface with which a user can define a strategy that prioritizes topologies.  For a user to define his or her own prioritization strategy, he or she needs to implement the ISchedulingPriorityStrategy interface.  A user can set the scheduling priority strategy by setting *Config.RESOURCE_AWARE_SCHEDULER_PRIORITY_STRATEGY* to point to the class that implements the strategy. For instance:
+
+    resource.aware.scheduler.priority.strategy: 
"org.apache.storm.scheduler.resource.strategies.priority.DefaultSchedulingPriorityStrategy"
+    
+A default strategy will be provided.  The following explains how the default 
scheduling priority strategy works.
+
+**DefaultSchedulingPriorityStrategy**
+
+The order of scheduling should be based on the distance between a user’s 
current resource allocation and his or her guaranteed allocation.  We should 
prioritize the users who are the furthest away from their resource guarantee. 
The difficulty of this problem is that a user may have multiple resource 
guarantees, and another user can have another set of resource guarantees, so 
how can we compare them in a fair manner?  Let's use the average percentage of 
resource guarantees satisfied as a method of comparison.
+
+For example:
+
+|User|Resource Guarantee|Resource Allocated|
+|----|------------------|------------------|
+|A|<10 CPU, 50GB>|<2 CPU, 40 GB>|
+|B|< 20 CPU, 25GB>|<15 CPU, 10 GB>|
+
+User A’s average percentage satisfied of resource guarantee: 
+
+(2/10+40/50)/2  = 0.5
+
+User B’s average percentage satisfied of resource guarantee: 
+
+(15/20+10/25)/2  = 0.575
+
+Thus, in this example User A has a smaller average percentage of his or her 
resource guarantee satisfied than User B.  Thus, User A should get priority to 
be allocated more resource, i.e., schedule a topology submitted by User A.
+
+When scheduling, RAS sorts users by the average percentage of their resource guarantees satisfied and schedules topologies from users based on that ordering, starting from the users with the lowest average percentage satisfied of resource guarantee.  When a user's resource guarantee is completely satisfied, the user's average percentage satisfied of resource guarantee will be greater than or equal to 1.
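+
+The comparison above can be sketched in a few lines of Java (a simplified illustration using java.util collections; the method name is hypothetical and this is not the actual DefaultSchedulingPriorityStrategy code):
+
+    // Sketch: average fraction of a user's resource guarantees that is satisfied.
+    // Both maps key resource names (e.g. "cpu", "memory") to amounts.
+    static double avgPercentageSatisfied(Map<String, Double> guaranteed,
+                                         Map<String, Double> allocated) {
+        double sum = 0.0;
+        for (Map.Entry<String, Double> guarantee : guaranteed.entrySet()) {
+            double alloc = allocated.getOrDefault(guarantee.getKey(), 0.0);
+            sum += alloc / guarantee.getValue();
+        }
+        return sum / guaranteed.size();
+    }
+
+    // User A: (2/10 + 40/50) / 2 = 0.5
+    // User B: (15/20 + 10/25) / 2 = 0.575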
+
+<div id='Specifying-Eviction-Strategy'/>
+### Specifying Eviction Strategy
+The eviction strategy is used when there are not enough free resources in the 
cluster to schedule new topologies. If the cluster is full, we need a mechanism 
to evict topologies so that user resource guarantees can be met and additional 
resource can be shared fairly among users. The strategy for evicting topologies 
is also a pluggable interface in which the user can implement his or her own 
topology eviction strategy.  For a user to implement his or her own eviction 
strategy, he or she needs to implement the IEvictionStrategy Interface and set 
*Config.RESOURCE_AWARE_SCHEDULER_EVICTION_STRATEGY* to point to the implemented 
strategy class. For instance:
+
+    resource.aware.scheduler.eviction.strategy: 
"org.apache.storm.scheduler.resource.strategies.eviction.DefaultEvictionStrategy"
+
+A default eviction strategy is provided.  The following explains how the default topology eviction strategy works.
+
+**DefaultEvictionStrategy**
+
+To determine if topology eviction should occur we should take into account the 
priority of the topology that we are trying to schedule and whether the 
resource guarantees for the owner of the topology have been met.  
+
+We should never evict a topology from a user that does not have his or her resource guarantees satisfied.  The following flow chart describes the logic for the eviction process.
+
+![Default eviction strategy flow chart](images/resource_aware_scheduler_default_eviction_strategy.png)
+
+<div id='Profiling-Resource-Usage'/>
+## Profiling Resource Usage
+
+Figuring out resource usage for your topology:
+ 
+To get an idea of how much memory/CPU your topology is actually using you can 
add the following to your topology launch code.
+ 
+    //Log all storm metrics
+    conf.registerMetricsConsumer(org.apache.storm.metric.LoggingMetricsConsumer.class);
+ 
+    //Add in per worker CPU measurement
+    Map<String, String> workerMetrics = new HashMap<String, String>();
+    workerMetrics.put("CPU", "org.apache.storm.metrics.sigar.CPUMetric");
+    conf.put(Config.TOPOLOGY_WORKER_METRICS, workerMetrics);
+ 
+The CPU metrics will require you to add
+ 
+    <dependency>
+        <groupId>org.apache.storm</groupId>
+        <artifactId>storm-metrics</artifactId>
+        <version>1.0.0</version>
+    </dependency>
+ 
+as a topology dependency (1.0.0 or higher).
+ 
+You can then go to your topology on the UI, turn on the system metrics, and find the log that the LoggingMetricsConsumer is writing to.  It will output results in the log like the following:
+ 
+    1454526100 node1.nodes.com:6707 -1:__system CPU {user-ms=74480, 
sys-ms=10780}
+    1454526100 node1.nodes.com:6707 -1:__system memory/nonHeap     
{unusedBytes=2077536, virtualFreeBytes=-64621729, initBytes=2555904, 
committedBytes=66699264, maxBytes=-1, usedBytes=64621728}
+    1454526100 node1.nodes.com:6707 -1:__system memory/heap  
{unusedBytes=573861408, virtualFreeBytes=694644256, initBytes=805306368, 
committedBytes=657719296, maxBytes=778502144, usedBytes=83857888}
+
+The metrics with -1:__system are generally metrics for the entire worker.  In 
the example above that worker is running on node1.nodes.com:6707.  These 
metrics are collected every 60 seconds.  For the CPU you can see that over the 
60 seconds this worker used  74480 + 10780 = 85260 ms of CPU time.  This is 
equivalent to 85260/60000 or about 1.5 cores.
+ 
+The Memory usage is similar but look at the usedBytes.  offHeap is 64621728 or 
about 62MB, and onHeap is 83857888 or about 80MB, but you should know what you 
set your heap to in each of your workers already.  How do you divide this up 
per bolt/spout?  That is a bit harder and may require some trial and error from 
your end.
+
+<div id='Enhancements-on-original-DefaultResourceAwareStrategy'/>
+## Enhancements on original DefaultResourceAwareStrategy
+
+The default resource aware scheduling strategy as described in the paper above 
has two main scheduling phases:
+
+1. Task Selection - Calculate the order in which the tasks/executors of a topology should be scheduled.
+2. Node Selection - Given a task/executor, find a node to schedule the 
task/executor on.
+
+Enhancements have been made to both scheduling phases.
+
+### Task Selection Enhancements 
+
+Instead of using a breadth first traversal of the topology graph to create an ordering of components and their executors, a new heuristic is used that orders components by the number of in and out edges (potential connections) of the component.  This has been found to be a more effective way to colocate executors that communicate with each other and to reduce network latency.
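+
+As a rough sketch (the graph representation, method name, and descending order below are assumptions for illustration, not the actual strategy code), the ordering heuristic amounts to sorting components by their total degree:
+
+    // Sketch: order components by number of in + out edges, most connected first,
+    // so that heavily communicating components are considered for placement early.
+    // Both maps are assumed to contain an entry for every component in the topology.
+    static List<String> orderComponentsByConnections(final Map<String, Set<String>> inEdges,
+                                                     final Map<String, Set<String>> outEdges) {
+        List<String> components = new ArrayList<String>(inEdges.keySet());
+        Collections.sort(components, new Comparator<String>() {
+            public int compare(String a, String b) {
+                int degreeA = inEdges.get(a).size() + outEdges.get(a).size();
+                int degreeB = inEdges.get(b).size() + outEdges.get(b).size();
+                return Integer.compare(degreeB, degreeA); // descending by degree
+            }
+        });
+        return components;
+    }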
+
+
+### Node Selection Enhancements
+
+Node selection comes down to first selecting which rack (server rack) to use and then which node on that rack to choose. The gist of the strategy in choosing a rack and node is to find the rack that has the "most" resources available and, within that rack, the node with the "most" free resources.  The assumption we are making for this strategy is that the node or rack with the most free resources will have the highest probability of allowing us to colocate the largest number of executors on that node or rack, reducing network communication latency.
+
+Racks and nodes will be sorted from best choice to worst choice.  When placing an executor, the strategy will iterate through all racks and nodes, starting from best to worst, before giving up.  Racks and nodes will be sorted in the following manner:
 
-#### * Enhancements on original DefaultResourceAwareStrategy *
+1. How many executors are already scheduled on the rack or node  
+ -- This is done so that executors to be scheduled are placed close to executors that are already scheduled and running.  If a topology has partially crashed and a subset of the topology's executors need to be rescheduled, we want to reschedule these executors as close (network wise) as possible to the executors that are healthy and running. 
 
-Originally the getBestClustering algorithm for RAS finds the "Best" 
cluster/rack based on which rack has the "most available" resources by finding 
the rack with the biggest sum of available memory + available across all nodes 
in the rack. This method is not very accurate since memory and cpu usage aree 
values on a different scale and the values are not normailized. This method is 
also not effective since it does not consider the number of slots available and 
it fails to identifying racks that are not schedulable due to the exhaustion of 
one of the resources either memory, cpu, or slots. Also the current method does 
not consider failures of workers. When executors of a topology gets unassigned 
and needs to be scheduled again, the current logic in getBestClustering may be 
inadequate since it will likely return a cluster that is different from where 
the majority of executors from the topology is originally scheduling in.
+2. Subordinate resource availability, or the amount of "effective" resources on the rack or node  
+ -- Please refer to the section on Subordinate Resource Availability below
 
-The new strategy/algorithm to find the "best" cluster, I dub subordinate 
resource availability ordering (inspired by Dominant Resource Fairness), sorts 
racks by the subordinate (not dominant) resource availability.
+3. Average of all the resource availabilities  
+ -- This is simply the average of the percentages available (available resources on the node or rack divided by the available resources on the rack or cluster, respectively).  This criterion is only used when the "effective resources" of two objects (racks or nodes) are the same. Then we consider the average of all the percentages of resources as a metric for sorting. For example:
+
+        Avail Resources:
+        node 1: CPU = 50 Memory = 1024 Slots = 20
+        node 2: CPU = 50 Memory = 8192 Slots = 40
+        node 3: CPU = 1000 Memory = 0 Slots = 0
+
+        Effective resources for nodes:
+        node 1 = 50 / (50+50+1000) = 0.045 (CPU bound)
+        node 2 = 50 / (50+50+1000) = 0.045 (CPU bound)
+        node 3 = 0 (memory and slots are 0)
+
+Node 1 and node 2 have the same effective resources, but node 2 clearly has more resources (memory and slots) than node 1, and we would want to pick node 2 first since there is a higher probability we will be able to schedule more executors on it. This is what the averaging in 3) accomplishes.
+
+Thus the sorting follows this progression: compare based on 1) and, if equal, compare based on 2); if still equal, compare based on 3); and if still equal, break ties by arbitrarily assigning an ordering based on comparing the ids of the node or rack.
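+
+A sketch of this tiered comparison is shown below (the ObjectResources class and its fields are hypothetical stand-ins for whatever the strategy tracks per rack or node; this is not the actual comparator used by the scheduler):
+
+    // Hypothetical holder for the per-rack/per-node values used in the sort.
+    static class ObjectResources {
+        String id;
+        int executorsScheduled;      // criterion 1: executors of this topology already here
+        double effectiveResource;    // criterion 2: subordinate resource availability
+        double avgPercentAvailable;  // criterion 3: average percentage of resources available
+    }
+
+    static int compareRackOrNode(ObjectResources a, ObjectResources b) {
+        // 1) prefer the rack/node that already hosts more of this topology's executors
+        if (a.executorsScheduled != b.executorsScheduled) {
+            return Integer.compare(b.executorsScheduled, a.executorsScheduled);
+        }
+        // 2) then prefer the larger subordinate ("effective") resource availability
+        if (a.effectiveResource != b.effectiveResource) {
+            return Double.compare(b.effectiveResource, a.effectiveResource);
+        }
+        // 3) then prefer the larger average percentage of resources available
+        if (a.avgPercentAvailable != b.avgPercentAvailable) {
+            return Double.compare(b.avgPercentAvailable, a.avgPercentAvailable);
+        }
+        // 4) finally break ties deterministically by id
+        return a.id.compareTo(b.id);
+    }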
+
+**Subordinate Resource Availability**
+
+Originally, the getBestClustering algorithm for RAS found the "best" rack based on which rack had the "most available" resources, by finding the rack with the biggest sum of available memory plus available CPU across all nodes in the rack. This method is not very accurate since memory and CPU usage are values on different scales and the values are not normalized. This method is also not effective since it does not consider the number of slots available, and it fails to identify racks that are not schedulable due to the exhaustion of one of the resources (memory, CPU, or slots). The previous method also does not consider failures of workers. When executors of a topology get unassigned and need to be scheduled again, the current logic in getBestClustering may be inadequate since it will likely return a cluster that is different from the one where the majority of the topology's executors were originally scheduled.
+
+The new strategy/algorithm to find the "best" rack or node, dubbed subordinate resource availability ordering (inspired by Dominant Resource Fairness), sorts racks and nodes by the subordinate (not dominant) resource availability.
 
 For example given 4 racks with the following resource availabilities
 
@@ -231,56 +388,48 @@ Then we order the racks by the effective resource.
 Thus for our example:
 
     Sorted rack: [rack-0, rack-1, rack-4, rack-3, rack-2]
-
-Also to deal with the presence of failures, if a topology is partially 
scheduled, we find the rack with the most scheduled executors for the topology 
and we try to schedule on that rack first.
-
-Thus we first sort by the number of executors already scheduled on the rack 
and then by the subordinate resource availability.
-
-Original Jira for this enhancement: 
[STORM-1766](https://issues.apache.org/jira/browse/STORM-1766)
-
-### Specifying Topology Prioritization Strategy
-
-The order of scheduling is a pluggable interface in which a user could define 
a strategy that prioritizes topologies.  For a user to define his or her own 
prioritization strategy, he or she needs to implement the 
ISchedulingPriorityStrategy interface.  A user can set the scheduling priority 
strategy by setting the *Config.RESOURCE_AWARE_SCHEDULER_PRIORITY_STRATEGY* to 
point to the class that implements the strategy. For instance:
-
-    resource.aware.scheduler.priority.strategy: 
"org.apache.storm.scheduler.resource.strategies.priority.DefaultSchedulingPriorityStrategy"
     
-A default strategy will be provided.  The following explains how the default 
scheduling priority strategy works.
-
-**DefaultSchedulingPriorityStrategy**
-
-The order of scheduling should be based on the distance between a user’s 
current resource allocation and his or her guaranteed allocation.  We should 
prioritize the users who are the furthest away from their resource guarantee. 
The difficulty of this problem is that a user may have multiple resource 
guarantees, and another user can have another set of resource guarantees, so 
how can we compare them in a fair manner?  Let's use the average percentage of 
resource guarantees satisfied as a method of comparison.
+This metric is used in sorting both nodes and racks.  When sorting racks, we consider the resources available on the rack and in the whole cluster (containing all racks).  When sorting nodes, we consider the resources available on the node and the resources available in the rack (the sum of all resources available across all nodes in the rack).
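+
+As a sketch (method and parameter names here are illustrative, not the actual scheduler code), the subordinate ("effective") resource of a node or rack can be computed from the availabilities in its enclosing scope as follows:
+
+    // Sketch: subordinate resource availability of a rack (relative to the cluster)
+    // or of a node (relative to its rack).  Each availability is normalized by the
+    // corresponding total in the enclosing scope, and the minimum -- the most
+    // constrained, i.e. subordinate, resource -- is the effective resource.
+    static double effectiveResource(double availCpu, double availMem, double availSlots,
+                                    double totalCpu, double totalMem, double totalSlots) {
+        double cpuFraction   = totalCpu   == 0 ? 0.0 : availCpu   / totalCpu;
+        double memFraction   = totalMem   == 0 ? 0.0 : availMem   / totalMem;
+        double slotsFraction = totalSlots == 0 ? 0.0 : availSlots / totalSlots;
+        return Math.min(cpuFraction, Math.min(memFraction, slotsFraction));
+    }
+
+    // From the earlier example: node 1 -> min(50/1100, ...) = 0.045 (CPU bound),
+    // node 3 -> 0 because its memory and slot availability are 0.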
 
-For example:
+Original Jira for this enhancement: 
[STORM-1766](https://issues.apache.org/jira/browse/STORM-1766)
 
-|User|Resource Guarantee|Resource Allocated|
-|----|------------------|------------------|
-|A|<10 CPU, 50GB>|<2 CPU, 40 GB>|
-|B|< 20 CPU, 25GB>|<15 CPU, 10 GB>|
+### Improvements in Scheduling
+This section provides some experimental results on the performance benefits of the enhancements on top of the original scheduling strategy.  The experiments are based on simulations run with:
 
-User A’s average percentage satisfied of resource guarantee: 
+https://github.com/jerrypeng/storm-scheduler-test-framework
 
-(2/10+40/50)/2  = 0.5
+Random topologies and clusters are used in the simulation as well as a 
comprehensive dataset consisting of all real topologies running in all the 
storm clusters at Yahoo.
 
-User B’s average percentage satisfied of resource guarantee: 
+The graphs below provide a comparison of how well the various strategies schedule topologies to minimize network latency.  A network metric is calculated for each scheduling of a topology by each scheduling strategy.  The network metric is calculated based on how many connections each executor in a topology has to make to another executor residing in the same worker (JVM process), in a different worker on the same host, on a different host, or on a different rack.  The assumptions we are making are the following:
 
-(15/20+10/25)/2  = 0.575
+1. Intra-worker communication is the fastest
+2. Inter-worker communication is fast
+3. Inter-node communication is slower
+4. Inter-rack communication is the slowest
 
-Thus, in this example User A has a smaller average percentage of his or her 
resource guarantee satisfied than User B.  Thus, User A should get priority to 
be allocated more resource, i.e., schedule a topology submitted by User A.
+For this network metric, the larger the number, the more potential network latency the topology will have for this scheduling.  Two types of experiments are performed.  One set of experiments is performed with randomly generated topologies and randomly generated clusters.  The other set of experiments is performed with a dataset containing all of the running topologies at Yahoo and semi-randomly generated clusters based on the size of the topology.  Both sets of experiments are run for millions of iterations until the results converge.
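+
+For illustration only, such a metric can be sketched as a weighted count of connections by locality.  The weights below are assumptions chosen purely to reflect the ordering of the four assumptions above (intra-worker cheapest, inter-rack most expensive); they are not the values used in these experiments:
+
+    // Illustrative sketch of a locality-weighted network metric for one scheduling.
+    // The connection counts are totals over all executor pairs; the weights are
+    // assumed, not the actual weights used to produce the graphs below.
+    static double networkMetric(int intraWorker, int interWorkerSameHost,
+                                int interHostSameRack, int interRack) {
+        return 0.0 * intraWorker
+             + 1.0 * interWorkerSameHost
+             + 2.0 * interHostSameRack
+             + 4.0 * interRack;
+    }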
 
-When scheduling, RAS sorts users by the average percentage satisfied of 
resource guarantee and schedule topologies from users based on that ordering 
starting from the users with the lowest average percentage satisfied of 
resource guarantee.  When a user’s resource guarantee is completely 
satisfied, the user’s average percentage satisfied of resource guarantee will 
be greater than or equal to 1.
+For the experiments involving randomly generated topologies, an optimal strategy is implemented that exhaustively finds the optimal solution if a solution exists.  The topologies and clusters used in this experiment are relatively small so that the optimal strategy can traverse the solution space and find an optimal solution in a reasonable amount of time.  This strategy is not run with the Yahoo topologies since those topologies are large and would take an unreasonable amount of time to run, because the solution space is W^N (ordering doesn't matter within a worker) where W is the number of workers and N is the number of executors. The NextGenStrategy represents the scheduling strategy with these enhancements.  The DefaultResourceAwareStrategy represents the original scheduling strategy.  The RoundRobinStrategy represents a naive strategy that simply schedules executors in a round robin fashion while respecting the resource constraints.  The graph below presents averages of the network metric.  A CDF graph is also presented further down.
 
-### Specifying Eviction Strategy
-The eviction strategy is used when there are not enough free resources in the 
cluster to schedule new topologies. If the cluster is full, we need a mechanism 
to evict topologies so that user resource guarantees can be met and additional 
resource can be shared fairly among users. The strategy for evicting topologies 
is also a pluggable interface in which the user can implement his or her own 
topology eviction strategy.  For a user to implement his or her own eviction 
strategy, he or she needs to implement the IEvictionStrategy Interface and set 
*Config.RESOURCE_AWARE_SCHEDULER_EVICTION_STRATEGY* to point to the implemented 
strategy class. For instance:
+| Random Topologies | Yahoo topologies |
+|-------------------|------------------|
+|![](images/ras_new_strategy_network_metric_random.png)| 
![](images/ras_new_strategy_network_metric_yahoo_topologies.png)|
 
-    resource.aware.scheduler.eviction.strategy: 
"org.apache.storm.scheduler.resource.strategies.eviction.DefaultEvictionStrategy"
+The next graph displays how close the schedulings from the respective scheduling strategies are to the schedulings of the optimal strategy.  As explained earlier, this comparison is only done for the randomly generated topologies and clusters.
 
-A default eviction strategy is provided.  The following explains how the 
default topology eviction strategy works
+| Random Topologies |
+|-------------------|
+|![](images/ras_new_strategy_network_metric_improvement_random.png)|
 
-**DefaultEvictionStrategy**
+The graph below is a CDF of the network metric:
 
+| Random Topologies | Yahoo topologies |
+|-------------------|------------------|
+|![](images/ras_new_strategy_network_cdf_random.png)| 
![](images/ras_new_strategy_network_metric_cdf_yahoo_topologies.png)|
 
-To determine if topology eviction should occur we should take into account the 
priority of the topology that we are trying to schedule and whether the 
resource guarantees for the owner of the topology have been met.  
+Below is a comparison of how long the strategies take to run:
 
-We should never evict a topology from a user that does not have his or her 
resource guarantees satisfied.  The following flow chart should describe the 
logic for the eviction process.
+| Random Topologies | Yahoo topologies |
+|-------------------|------------------|
+|![](images/ras_new_strategy_runtime_random.png)| 
![](images/ras_new_strategy_runtime_yahoo.png)|
 
-![Viewing metrics with 
VisualVM](images/resource_aware_scheduler_default_eviction_strategy.png)

http://git-wip-us.apache.org/repos/asf/storm/blob/0d616ba5/docs/images/ras_new_strategy_network_cdf_random.png
----------------------------------------------------------------------
diff --git a/docs/images/ras_new_strategy_network_cdf_random.png 
b/docs/images/ras_new_strategy_network_cdf_random.png
new file mode 100644
index 0000000..8b47f36
Binary files /dev/null and 
b/docs/images/ras_new_strategy_network_cdf_random.png differ

http://git-wip-us.apache.org/repos/asf/storm/blob/0d616ba5/docs/images/ras_new_strategy_network_metric_cdf_yahoo_topologies.png
----------------------------------------------------------------------
diff --git 
a/docs/images/ras_new_strategy_network_metric_cdf_yahoo_topologies.png 
b/docs/images/ras_new_strategy_network_metric_cdf_yahoo_topologies.png
new file mode 100644
index 0000000..6e5a04d
Binary files /dev/null and 
b/docs/images/ras_new_strategy_network_metric_cdf_yahoo_topologies.png differ

http://git-wip-us.apache.org/repos/asf/storm/blob/0d616ba5/docs/images/ras_new_strategy_network_metric_improvement_random.png
----------------------------------------------------------------------
diff --git a/docs/images/ras_new_strategy_network_metric_improvement_random.png 
b/docs/images/ras_new_strategy_network_metric_improvement_random.png
new file mode 100644
index 0000000..3fd4029
Binary files /dev/null and 
b/docs/images/ras_new_strategy_network_metric_improvement_random.png differ

http://git-wip-us.apache.org/repos/asf/storm/blob/0d616ba5/docs/images/ras_new_strategy_network_metric_random.png
----------------------------------------------------------------------
diff --git a/docs/images/ras_new_strategy_network_metric_random.png 
b/docs/images/ras_new_strategy_network_metric_random.png
new file mode 100644
index 0000000..11d95d9
Binary files /dev/null and 
b/docs/images/ras_new_strategy_network_metric_random.png differ

http://git-wip-us.apache.org/repos/asf/storm/blob/0d616ba5/docs/images/ras_new_strategy_network_metric_yahoo_topologies.png
----------------------------------------------------------------------
diff --git a/docs/images/ras_new_strategy_network_metric_yahoo_topologies.png 
b/docs/images/ras_new_strategy_network_metric_yahoo_topologies.png
new file mode 100644
index 0000000..60ed7c1
Binary files /dev/null and 
b/docs/images/ras_new_strategy_network_metric_yahoo_topologies.png differ

http://git-wip-us.apache.org/repos/asf/storm/blob/0d616ba5/docs/images/ras_new_strategy_runtime_random.png
----------------------------------------------------------------------
diff --git a/docs/images/ras_new_strategy_runtime_random.png 
b/docs/images/ras_new_strategy_runtime_random.png
new file mode 100644
index 0000000..0ba1c73
Binary files /dev/null and b/docs/images/ras_new_strategy_runtime_random.png 
differ

http://git-wip-us.apache.org/repos/asf/storm/blob/0d616ba5/docs/images/ras_new_strategy_runtime_yahoo.png
----------------------------------------------------------------------
diff --git a/docs/images/ras_new_strategy_runtime_yahoo.png 
b/docs/images/ras_new_strategy_runtime_yahoo.png
new file mode 100644
index 0000000..0a51089
Binary files /dev/null and b/docs/images/ras_new_strategy_runtime_yahoo.png 
differ
