[
https://issues.apache.org/jira/browse/STORM-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047730#comment-15047730
]
ASF GitHub Bot commented on STORM-898:
--------------------------------------
Github user jerrypeng commented on a diff in the pull request:
https://github.com/apache/storm/pull/921#discussion_r47034848
--- Diff: docs/documentation/Resource_Aware_Scheduler_overview.md ---
@@ -0,0 +1,222 @@
+# Introduction
+
+The purpose of this document is to provide a description of the Resource Aware Scheduler for the Storm distributed real-time computation system. This document provides both a high level description of the Resource Aware Scheduler in Storm and a guide to using it.
+
+## Using Resource Aware Scheduler
+
+The user can switch to using the Resource Aware Scheduler by setting the following in *conf/storm.yaml*:
+
+    storm.scheduler: "backtype.storm.scheduler.resource.ResourceAwareScheduler"
+
+
+## API Overview
+
+For a Storm topology, the user can now specify the amount of resources a single instance of a topology component (i.e. Spout or Bolt) requires to run. The user can specify the resource requirement for a topology component by using the following API calls.
+
+### Setting Memory Requirement
+
+API to set component memory requirement:
+
+ public T setMemoryLoad(Number onHeap, Number offHeap)
+
+Parameters:
+* Number onHeap – The amount of on heap memory an instance of this component will consume in megabytes
+* Number offHeap – The amount of off heap memory an instance of this component will consume in megabytes
+
+The user also has the option to specify only the on heap memory requirement if the component does not need off heap memory.
+
+ public T setMemoryLoad(Number onHeap)
+
+Parameters:
+* Number onHeap – The amount of on heap memory an instance of this
component will consume
+
+If no value is provided for offHeap, 0.0 will be used. If no value is
provided for onHeap, or if the API is never called for a component, the default
value will be used.
+
+Example of Usage:
+
+ SpoutDeclarer s1 = builder.setSpout("word", new TestWordSpout(), 10);
+ s1.setMemoryLoad(1024.0, 512.0);
+ builder.setBolt("exclaim1", new ExclamationBolt(), 3)
+ .shuffleGrouping("word").setMemoryLoad(512.0);
+
+The entire memory requested for this topology is 16.5 GB. That is, 10 spouts with 1 GB on heap memory and 0.5 GB off heap memory each, plus 3 bolts with 0.5 GB on heap memory each.
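+
+For reference, the arithmetic behind that total:
+
+    10 spouts * (1024 MB on heap + 512 MB off heap) = 15360 MB
+     3 bolts  *   512 MB on heap                    =  1536 MB
+    total                                           = 16896 MB ≈ 16.5 GB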
+
+### Setting CPU Requirement
+
+
+API to set component CPU requirement:
+
+ public T setCPULoad(Double amount)
+
+Parameters:
+* Double amount – The amount of CPU an instance of this component will consume.
+
+Currently, the amount of CPU resources a component requires, or that a node has available, is represented by a point system. CPU usage is a difficult concept to define. Different CPU architectures perform differently depending on the task at hand, and they are so complex that expressing all of that in a single precise portable number is impossible. Instead we take a convention-over-configuration approach and are primarily concerned with rough levels of CPU usage, while still providing the possibility to specify amounts at a finer granularity.
+
+By convention a CPU core typically will get 100 points. If you feel that your processors are more or less powerful you can adjust this accordingly. Heavy tasks that are CPU bound will get 100 points, as they can consume an entire core. Medium tasks should get 50, light tasks 25, and tiny tasks 10. In some cases a task spawns other threads to help with processing; such tasks may need to go above 100 points to express the amount of CPU they are using. If these conventions are followed, then in the common case of a single-threaded task, the reported Capacity * 100 should be the number of CPU points that the task needs.
+
+Example of Usage:
+
+ SpoutDeclarer s1 = builder.setSpout("word", new TestWordSpout(), 10);
+ s1.setCPULoad(15.0);
+ builder.setBolt("exclaim1", new ExclamationBolt(), 3)
+ .shuffleGrouping("word").setCPULoad(10.0);
+
+### Limiting the Heap Size per Worker (JVM) Process
+
+
+ public void setTopologyWorkerMaxHeapSize(Number size)
+
+Parameters:
+* Number size – The memory limit a worker process will be allocated in
megabytes
+
+The user can limit the amount of memory the Resource Aware Scheduler allocates to a single worker on a per topology basis by using the above API. This API is in place so that users can spread executors across multiple workers. However, spreading executors across multiple workers may increase communication latency, since executors will no longer be able to use the Disruptor Queue for intra-process communication.
+
+Example of Usage:
+
+ Config conf = new Config();
+ conf.setTopologyWorkerMaxHeapSize(512.0);
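+
+As an illustration (the numbers here are made up): with the 512 MB limit above, a topology whose executors together request 5120 MB of on heap memory cannot be packed into fewer than ceil(5120 / 512) = 10 workers, so the scheduler will spread its executors across at least that many workers.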
+
+### Setting Available Resources on Node
+
+A storm administrator can specify node resource availability by modifying
the *conf/storm.yaml* file located in the storm home directory of that node.
+
+A storm administrator can specify how much available memory a node has in megabytes by adding the following to *storm.yaml*:
+
+ supervisor.memory.capacity.mb: [amount<Double>]
+
+A storm administrator can also specify how much CPU a node has available by adding the following to *storm.yaml*:
+
+ supervisor.cpu.capacity: [amount<Double>]
+
+
+Note that the amount the administrator can specify for available CPU is represented using the point system discussed earlier.
+
+Example of Usage:
+
+ supervisor.memory.capacity.mb: 20480.0
+ supervisor.cpu.capacity: 100.0
+
+
+### Other Configurations
+
+The user can set some default configurations for the Resource Aware
Scheduler in *conf/storm.yaml*:
+
+    # default value if on heap memory requirement is not specified for a component
+    topology.component.resources.onheap.memory.mb: 128.0
+
+    # default value if off heap memory requirement is not specified for a component
+    topology.component.resources.offheap.memory.mb: 0.0
+
+    # default value if CPU requirement is not specified for a component
+    topology.component.cpu.pcore.percent: 10.0
+
+    # default value for the max heap size for a worker
+    topology.worker.max.heap.size.mb: 768.0
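+
+These defaults can typically be overridden on a per topology basis as well. A minimal sketch, assuming these topology.* keys are also honored when set on the topology's own Config (an assumption, not confirmed by this document):
+
+    Config conf = new Config();
+    // Assumption: per-topology override of the scheduler defaults above;
+    // string keys are used rather than presuming particular Config constants.
+    conf.put("topology.component.resources.onheap.memory.mb", 256.0);
+    conf.put("topology.component.cpu.pcore.percent", 20.0);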
+
+# Topology Priorities and Per User Resources
+
+The next step for the Resource Aware Scheduler, or RAS, is to give it multitenant capabilities, since many Storm users typically share a Storm cluster. The Resource Aware Scheduler needs to be able to allocate resources on a per user basis. Each user can be guaranteed a certain amount of resources to run his or her topologies, and the Resource Aware Scheduler should meet those guarantees when possible. When the Storm cluster has extra free resources, the Resource Aware Scheduler needs to be able to allocate the additional resources to users in a fair manner. The importance of topologies can also vary: topologies can be used for actual production or just for experimentation, so the Resource Aware Scheduler should take the importance of a topology into account when determining the order in which to schedule topologies or when to evict topologies.
+
+## Setup
+
+The resource guarantees of a user can be specified in *conf/user-resource-pools.yaml*. Specify the resource guarantees of a user in the following format:
+
+    resource.aware.scheduler.user.pools:
+        [UserId]:
+            cpu: [Amount of Guaranteed CPU Resources]
+            memory: [Amount of Guaranteed Memory Resources]
+
+An example of what *user-resource-pools.yaml* can look like:
+
+    resource.aware.scheduler.user.pools:
+        jerry:
+            cpu: 1000
+            memory: 8192.0
+        derek:
+            cpu: 10000.0
+            memory: 32768
+        bobby:
+            cpu: 5000.0
+            memory: 16384.0
+
+Please note that the specified amounts of guaranteed CPU and memory can be either integers or doubles.
+
+## API Overview
+### Specifying topology priority
+Topology priorities can range from 0-30. The topology priorities will be partitioned into several priority levels, each of which may contain a range of priorities.
+For example we can create a priority level mapping:
+
+ PRODUCTION => 0 – 9
+ STAGING => 10 – 19
+ DEV => 20 – 29
--- End diff --
should be 0 - 29. Will fix
> Add priorities and per user resource guarantees to Resource Aware Scheduler
> ---------------------------------------------------------------------------
>
> Key: STORM-898
> URL: https://issues.apache.org/jira/browse/STORM-898
> Project: Apache Storm
> Issue Type: New Feature
> Components: storm-core
> Reporter: Robert Joseph Evans
> Assignee: Boyang Jerry Peng
> Attachments: Resource Aware Scheduler for Storm.pdf
>
>
> In a multi-tenant environment we would like to be able to give individual
> users a guarantee of how much CPU/Memory/Network they will be able to use in
> a cluster. We would also like to know which topologies a user feels are the
> most important to keep running if there are not enough resources to run all
> of their topologies.
> Each user should be able to specify if their topology is production, staging,
> or development. Within each of those categories a user should be able to give
> a topology a priority, 0 to 10 with 10 being the highest priority (or
> something like this).
> If there are not enough resources on a cluster to run a topology, assume this
> topology is running and using resources, and find the user that is most over
> their guaranteed resources. Shoot the lowest priority topology for that user,
> and repeat until this topology is able to run, or this topology would be the
> one shot. Ideally we don't actually shoot anything until we know that we
> would have made enough room.
> If the cluster is over-subscribed, everyone is under their guarantee, and
> this topology would not put the user over their guarantee, shoot the lowest
> priority topology in this user's resource pool until there is enough room to
> run the topology, or this topology is the one that would be shot. We might
> also want to think about what to do if we are going to shoot a production
> topology in an oversubscribed case; perhaps we can shoot a non-production
> topology instead, even if the other user is not over their guarantee.
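
A rough, self-contained sketch of the eviction loop described above (all names here are hypothetical; this is not the actual scheduler implementation):

    import java.util.*;

    class EvictionSketch {
        static class Topo {
            String owner; int priority; double resources; // lower number = higher priority
            Topo(String o, int p, double r) { owner = o; priority = p; resources = r; }
        }

        /** Plan which topologies to evict so that "needed" resources free up.
         *  Returns an empty list if the incoming topology itself would be the
         *  one shot, or if no user is over their guarantee; nothing is actually
         *  evicted until the whole plan is known to make enough room. */
        static List<Topo> planEvictions(List<Topo> running, Map<String, Double> guarantees,
                                        Topo incoming, double free, double needed) {
            List<Topo> victims = new ArrayList<>();
            List<Topo> pool = new ArrayList<>(running);
            pool.add(incoming); // assume the incoming topology is running and using resources
            while (free < needed) {
                // find the user most over their guaranteed resources
                Map<String, Double> usage = new HashMap<>();
                for (Topo t : pool) usage.merge(t.owner, t.resources, Double::sum);
                String worstUser = null; double worstOver = 0.0;
                for (Map.Entry<String, Double> e : usage.entrySet()) {
                    double over = e.getValue() - guarantees.getOrDefault(e.getKey(), 0.0);
                    if (over > worstOver) { worstOver = over; worstUser = e.getKey(); }
                }
                if (worstUser == null) return Collections.emptyList(); // everyone under guarantee
                // shoot that user's lowest priority topology (highest priority number)
                Topo victim = null;
                for (Topo t : pool)
                    if (t.owner.equals(worstUser) && (victim == null || t.priority > victim.priority))
                        victim = t;
                if (victim == incoming) return Collections.emptyList(); // we would be the one shot
                victims.add(victim);
                pool.remove(victim);
                free += victim.resources; // simulated removal only
            }
            return victims;
        }
    }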