[ 
https://issues.apache.org/jira/browse/STORM-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047726#comment-15047726
 ] 

ASF GitHub Bot commented on STORM-898:
--------------------------------------

Github user jerrypeng commented on a diff in the pull request:

    https://github.com/apache/storm/pull/921#discussion_r47034771
  
    --- Diff: docs/documentation/Resource_Aware_Scheduler_overview.md ---
    @@ -0,0 +1,222 @@
    +# Introduction
    +
    +The purpose of this document is to provide a description of the Resource 
Aware Scheduler for the Storm distributed real-time computation system.  This 
document provides a high-level description of the Resource Aware Scheduler in 
Storm.
    +
    +## Using Resource Aware Scheduler
    +
    +The user can switch to using the Resource Aware Scheduler by setting the 
following in *conf/storm.yaml*
    +
    +    storm.scheduler: "backtype.storm.scheduler.resource.ResourceAwareScheduler"
    +
    +
    +## API Overview
    +
    +For a Storm Topology, the user can now specify the amount of resources a 
topology component (i.e. Spout or Bolt) requires to run a single instance of 
the component.  The user can specify the resource requirement for a topology 
component by using the following API calls.
    +
    +### Setting Memory Requirement
    +
    +API to set component memory requirement:
    +
    +    public T setMemoryLoad(Number onHeap, Number offHeap)
    +
    +Parameters:
    +* Number onHeap – The amount of on heap memory an instance of this 
component will consume in megabytes
    +* Number offHeap – The amount of off heap memory an instance of this 
component will consume in megabytes
    +
    +The user also has the option to specify only the on heap memory 
requirement if the component does not have an off heap memory need.
    +
    +    public T setMemoryLoad(Number onHeap)
    +
    +Parameters:
    +* Number onHeap – The amount of on heap memory an instance of this 
component will consume in megabytes
    +
    +If no value is provided for offHeap, 0.0 will be used. If no value is 
provided for onHeap, or if the API is never called for a component, the default 
value will be used.
    +
    +Example of Usage:
    +
    +    SpoutDeclarer s1 = builder.setSpout("word", new TestWordSpout(), 10);
    +    s1.setMemoryLoad(1024.0, 512.0);
    +    builder.setBolt("exclaim1", new ExclamationBolt(), 3)
    +                .shuffleGrouping("word").setMemoryLoad(512.0);
    +
    +The total memory requested for this topology is 16.5 GB. This comes from 
10 spouts with 1 GB on heap memory and 0.5 GB off heap memory each, and 3 bolts 
with 0.5 GB on heap memory each.
    +
    +### Setting CPU Requirement
    +
    +
    +API to set component CPU requirement:
    +
    +    public T setCPULoad(Double amount)
    +
    +Parameters:
    +* Number amount – The amount of CPU an instance of this component will 
consume.
    +
    +Currently, the amount of CPU resources that a component requires, or that 
is available on a node, is represented by a point system. CPU usage is a 
difficult concept to define. Different CPU architectures perform differently 
depending on the task at hand. They are so complex that expressing all of that 
in a single precise portable number is impossible. Instead we take a convention 
over configuration approach and are primarily concerned with a rough level of 
CPU usage, while still providing the possibility to specify finer-grained 
amounts.
    +
    +By convention a CPU core typically will get 100 points. If you feel that 
your processors are more or less powerful you can adjust this accordingly. 
Heavy tasks that are CPU bound will get 100 points, as they can consume an 
entire core. Medium tasks should get 50, light tasks 25, and tiny tasks 10. In 
some cases you have a task that spawns other threads to help with processing. 
These tasks may need to go above 100 points to express the amount of CPU they 
are using. If these conventions are followed, then in the common case of a 
single-threaded task, the reported Capacity * 100 should be the number of CPU 
points that the task needs.
    +
    +Example of Usage:
    +
    +    SpoutDeclarer s1 = builder.setSpout("word", new TestWordSpout(), 10);
    +    s1.setCPULoad(15.0);
    +    builder.setBolt("exclaim1", new ExclamationBolt(), 3)
    +                .shuffleGrouping("word").setCPULoad(10.0);
    +
    +### Limiting the Heap Size per Worker (JVM) Process
    +
    +
    +    public void setTopologyWorkerMaxHeapSize(Number size)
    +
    +Parameters:
    +* Number size – The memory limit a worker process will be allocated in 
megabytes
    +
    +The user can limit the amount of memory resources that the resource aware 
scheduler allocates to a single worker on a per-topology basis by using the 
above API.  This API is in place so that users can spread executors across 
multiple workers.  However, spreading executors across multiple workers may 
increase the communication latency, since executors in different workers will 
not be able to use the Disruptor Queue for intra-process communication.
    +
    +Example of Usage:
    +
    +    Config conf = new Config();
    +    conf.setTopologyWorkerMaxHeapSize(512.0);
    +
    +### Setting Available Resources on Node
    +
    +A storm administrator can specify node resource availability by modifying 
the *conf/storm.yaml* file located in the storm home directory of that node.
    +
    +A storm administrator can specify how much available memory a node has, in 
megabytes, by adding the following to *storm.yaml*
    +
    +    supervisor.memory.capacity.mb: [amount<Double>]
    +
    +A storm administrator can also specify how much CPU a node has available 
by adding the following to *storm.yaml*
    +
    +    supervisor.cpu.capacity: [amount<Double>]
    +
    +
    +Note that the amount the user can specify for the available CPU is 
represented using the point system discussed earlier.
    +
    +Example of Usage:
    +
    +    supervisor.memory.capacity.mb: 20480.0
    +    supervisor.cpu.capacity: 100.0
    +
    +
    +2.5.       Other Configurations
    --- End diff --
    
    will fix
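
The arithmetic behind the usage examples quoted in the diff (10 spouts at 
1024 MB on heap plus 512 MB off heap and 15 CPU points each, 3 bolts at 512 MB 
on heap and 10 CPU points each) can be checked with a small standalone sketch. 
The `ResourceTotals` class, its `Component` type and the `minNodes` helper are 
hypothetical names made up for illustration; they are not part of the Storm 
API, and `minNodes` is only a naive lower bound that ignores how the scheduler 
actually packs executors onto nodes.

```java
import java.util.Arrays;
import java.util.List;

public class ResourceTotals {
    // Hypothetical holder for one component's per-instance request (not a Storm class).
    public static final class Component {
        final String name;
        final int parallelism;
        final double onHeapMb;
        final double offHeapMb;
        final double cpuPoints;

        public Component(String name, int parallelism, double onHeapMb,
                         double offHeapMb, double cpuPoints) {
            this.name = name;
            this.parallelism = parallelism;
            this.onHeapMb = onHeapMb;
            this.offHeapMb = offHeapMb;
            this.cpuPoints = cpuPoints;
        }
    }

    // Sum of (on heap + off heap) over every instance of every component.
    public static double totalMemoryMb(List<Component> components) {
        double total = 0.0;
        for (Component c : components) {
            total += c.parallelism * (c.onHeapMb + c.offHeapMb);
        }
        return total;
    }

    // Sum of CPU points over every instance of every component.
    public static double totalCpuPoints(List<Component> components) {
        double total = 0.0;
        for (Component c : components) {
            total += c.parallelism * c.cpuPoints;
        }
        return total;
    }

    // Naive lower bound on identical nodes needed; assumes perfect packing.
    public static int minNodes(List<Component> components,
                               double nodeMemoryMb, double nodeCpuPoints) {
        int byMemory = (int) Math.ceil(totalMemoryMb(components) / nodeMemoryMb);
        int byCpu = (int) Math.ceil(totalCpuPoints(components) / nodeCpuPoints);
        return Math.max(byMemory, byCpu);
    }

    public static void main(String[] args) {
        List<Component> topology = Arrays.asList(
            new Component("word", 10, 1024.0, 512.0, 15.0),
            new Component("exclaim1", 3, 512.0, 0.0, 10.0));

        System.out.println("Total memory: " + totalMemoryMb(topology) / 1024.0 + " GB");
        System.out.println("Total CPU: " + totalCpuPoints(topology) + " points");
        // Node values taken from the storm.yaml example in the diff.
        System.out.println("Nodes needed (lower bound): "
            + minNodes(topology, 20480.0, 100.0));
    }
}
```

With the node settings from the quoted `storm.yaml` example (20480.0 MB and 
100.0 CPU points), the memory request of 16.5 GB fits on one node, but the 180 
requested CPU points do not, so under these assumptions the topology would 
need at least two such nodes.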


> Add priorities and per user resource guarantees to Resource Aware Scheduler
> ---------------------------------------------------------------------------
>
>                 Key: STORM-898
>                 URL: https://issues.apache.org/jira/browse/STORM-898
>             Project: Apache Storm
>          Issue Type: New Feature
>          Components: storm-core
>            Reporter: Robert Joseph Evans
>            Assignee: Boyang Jerry Peng
>         Attachments: Resource Aware Scheduler for Storm.pdf
>
>
> In a multi-tenant environment we would like to be able to give individual 
> users a guarantee of how much CPU/Memory/Network they will be able to use in 
> a cluster.  We would also like to know which topologies a user feels are the 
> most important to keep running if there are not enough resources to run all 
> of their topologies.
> Each user should be able to specify if their topology is production, staging, 
> or development. Within each of those categories a user should be able to give 
> a topology a priority, 0 to 10 with 10 being the highest priority (or 
> something like this).
> If there are not enough resources on a cluster to run a topology, assume this 
> topology is running and using resources, and find the user that is most over 
> their guaranteed resources.  Shoot the lowest priority topology for that 
> user, and repeat until this topology is able to run, or this topology would 
> be the one shot.  Ideally we don't actually shoot anything until we know that 
> we would have made enough room.
> If the cluster is over-subscribed and everyone is under their guarantee, and 
> this topology would not put the user over their guarantee, shoot the lowest 
> priority topology in this worker's resource pool until there is enough room 
> to run the topology or this topology is the one that would be shot.  We might 
> also want to think about what to do if we are going to shoot a production 
> topology in an oversubscribed case, and perhaps we can shoot a non-production 
> topology instead even if the other user is not over their guarantee.
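
The first eviction rule in the description above (treat the new topology as 
running, repeatedly find the user most over their guarantee, and shoot that 
user's lowest priority topology) could be sketched roughly as follows. This is 
only an illustration under simplifying assumptions, not the scheduler's actual 
implementation: it tracks a single resource (memory in MB), ignores the 
production/staging/development categories, breaks ties arbitrarily, and all 
names (`EvictionPlanner`, `Topology`, `plan`) are hypothetical. The plan is 
computed in full before anything is shot, matching the "don't actually shoot 
anything until we know" remark.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class EvictionPlanner {
    // Hypothetical topology descriptor; priority 10 is the highest, 0 the lowest.
    public static final class Topology {
        public final String name;
        public final String user;
        public final int priority;
        public final double memoryMb;

        public Topology(String name, String user, int priority, double memoryMb) {
            this.name = name;
            this.user = user;
            this.priority = priority;
            this.memoryMb = memoryMb;
        }
    }

    // Returns the topologies to shoot so that `incoming` fits, or empty if
    // `incoming` itself would be the one shot (in which case nothing is shot).
    public static Optional<List<Topology>> plan(double capacityMb,
                                                Map<String, Double> guaranteeMb,
                                                List<Topology> running,
                                                Topology incoming) {
        // Assume the incoming topology is already running and using resources.
        List<Topology> remaining = new ArrayList<>(running);
        remaining.add(incoming);
        List<Topology> victims = new ArrayList<>();
        double used = 0.0;
        for (Topology t : remaining) {
            used += t.memoryMb;
        }

        while (used > capacityMb) {
            // Per-user usage among the topologies that are still "running".
            Map<String, Double> usage = new HashMap<>();
            for (Topology t : remaining) {
                usage.merge(t.user, t.memoryMb, Double::sum);
            }
            // Find the user that is most over their guarantee.
            String worstUser = null;
            double worstOver = Double.NEGATIVE_INFINITY;
            for (Map.Entry<String, Double> e : usage.entrySet()) {
                double over = e.getValue() - guaranteeMb.getOrDefault(e.getKey(), 0.0);
                if (over > worstOver) {
                    worstOver = over;
                    worstUser = e.getKey();
                }
            }
            // Pick that user's lowest priority topology.
            Topology victim = null;
            for (Topology t : remaining) {
                if (t.user.equals(worstUser)
                        && (victim == null || t.priority < victim.priority)) {
                    victim = t;
                }
            }
            if (victim == incoming) {
                return Optional.empty(); // incoming would be shot: give up
            }
            victims.add(victim);
            remaining.remove(victim);
            used -= victim.memoryMb;
        }
        return Optional.of(victims);
    }

    public static void main(String[] args) {
        Map<String, Double> guarantees = new HashMap<>();
        guarantees.put("alice", 400.0);
        guarantees.put("bob", 400.0);
        List<Topology> running = new ArrayList<>();
        running.add(new Topology("a-low", "alice", 2, 400.0));
        running.add(new Topology("a-high", "alice", 8, 300.0));
        running.add(new Topology("b1", "bob", 5, 300.0));
        Topology incoming = new Topology("b2", "bob", 7, 200.0);

        Optional<List<Topology>> result = plan(1000.0, guarantees, running, incoming);
        if (result.isPresent()) {
            for (Topology t : result.get()) {
                System.out.println("shoot " + t.name);
            }
        } else {
            System.out.println("incoming topology cannot be scheduled");
        }
    }
}
```

In the `main` example, alice is 300 MB over her guarantee and bob only 100 MB 
over his once `b2` is counted, so the sketch selects alice's lowest priority 
topology, `a-low`, which frees enough room for `b2`.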



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
