[ 
https://issues.apache.org/jira/browse/STORM-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15063608#comment-15063608
 ] 

ASF GitHub Bot commented on STORM-898:
--------------------------------------

Github user jerrypeng commented on a diff in the pull request:

    https://github.com/apache/storm/pull/921#discussion_r47999139
  
    --- Diff: docs/documentation/Resource_Aware_Scheduler_overview.md ---
    @@ -0,0 +1,227 @@
    +# Introduction
    +
    +The purpose of this document is to provide a description of the Resource 
Aware Scheduler for the Storm distributed real-time computation system.  This 
document provides a high-level description of the Resource Aware Scheduler in 
Storm.
    +
    +## Using Resource Aware Scheduler
    +
    +The user can switch to using the Resource Aware Scheduler by setting the 
following in *conf/storm.yaml*
    +
    +    storm.scheduler: 
"backtype.storm.scheduler.resource.ResourceAwareScheduler"
    +
    +
    +## API Overview
    +
    +For a Storm Topology, the user can now specify the amount of resources a 
topology component (i.e. Spout or Bolt) requires to run a single instance of 
the component.  The user can specify the resource requirement for a topology 
component by using the following API calls.
    +
    +### Setting Memory Requirement
    +
    +API to set component memory requirement:
    +
    +    public T setMemoryLoad(Number onHeap, Number offHeap)
    +
    +Parameters:
    +* Number onHeap – The amount of on heap memory an instance of this 
component will consume in megabytes
    +* Number offHeap – The amount of off heap memory an instance of this 
component will consume in megabytes
    +
    +The user also has the option to specify only the on heap memory requirement 
if the component does not have an off heap memory need.
    +
    +    public T setMemoryLoad(Number onHeap)
    +
    +Parameters:
    +* Number onHeap – The amount of on heap memory an instance of this 
component will consume in megabytes
    +
    +If no value is provided for offHeap, 0.0 will be used. If no value is 
provided for onHeap, or if the API is never called for a component, the default 
value will be used.
    +
    +Example of Usage:
    +
    +    SpoutDeclarer s1 = builder.setSpout("word", new TestWordSpout(), 10);
    +    s1.setMemoryLoad(1024.0, 512.0);
    +    builder.setBolt("exclaim1", new ExclamationBolt(), 3)
    +                .shuffleGrouping("word").setMemoryLoad(512.0);
    +
    +The entire memory requested for this topology is 16.5 GB. That is from 10 
spouts with 1GB on heap memory and 0.5 GB off heap memory each and 3 bolts with 
0.5 GB on heap memory each.
    +
    +### Setting CPU Requirement
    +
    +
    +API to set component CPU requirement:
    +
    +    public T setCPULoad(Double amount)
    +
    +Parameters:
    +* Number amount – The amount of CPU an instance of this component will 
consume.
    --- End diff --
    
    Well, currently in the implementation we are assuming each executor only has 
one task.  We perhaps need to consider in the future how to handle executors 
with multiple tasks, since it's not clear what the resource usage is when 
multiple tasks are executed serially in an executor.
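
    To make the one-task-per-executor assumption concrete, here is a minimal 
sketch of the arithmetic behind the 16.5 GB figure in the memory example 
quoted in the diff.  The class and method names are hypothetical and not part 
of Storm; the sketch only assumes that each executor counts as exactly one 
component instance, so the request is parallelism times (onHeap + offHeap).

```java
public class TopologyMemoryEstimate {
    /**
     * Memory requested by one component, in megabytes.
     * Assumption (hypothetical helper, not a Storm API): one task per
     * executor, so the number of executors equals the number of instances.
     */
    static double componentMemoryMb(int executors, double onHeapMb, double offHeapMb) {
        return executors * (onHeapMb + offHeapMb);
    }

    public static void main(String[] args) {
        // "word" spout: 10 executors, 1024 MB on heap + 512 MB off heap each
        double spoutsMb = componentMemoryMb(10, 1024.0, 512.0);
        // "exclaim1" bolt: 3 executors, 512 MB on heap, off heap left at 0.0
        double boltsMb = componentMemoryMb(3, 512.0, 0.0);
        double totalMb = spoutsMb + boltsMb;
        // prints "16896.0 MB = 16.5 GB"
        System.out.println(totalMb + " MB = " + (totalMb / 1024.0) + " GB");
    }
}
```

    If an executor ran several tasks, the multiplier would no longer be 
obvious, which is the open question raised above.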


> Add priorities and per user resource guarantees to Resource Aware Scheduler
> ---------------------------------------------------------------------------
>
>                 Key: STORM-898
>                 URL: https://issues.apache.org/jira/browse/STORM-898
>             Project: Apache Storm
>          Issue Type: New Feature
>          Components: storm-core
>            Reporter: Robert Joseph Evans
>            Assignee: Boyang Jerry Peng
>         Attachments: Resource Aware Scheduler for Storm.pdf
>
>
> In a multi-tenant environment we would like to be able to give individual 
> users a guarantee of how much CPU/Memory/Network they will be able to use in 
> a cluster.  We would also like to know which topologies a user feels are the 
> most important to keep running if there are not enough resources to run all 
> of their topologies.
> Each user should be able to specify if their topology is production, staging, 
> or development. Within each of those categories a user should be able to give 
> a topology a priority, 0 to 10 with 10 being the highest priority (or 
> something like this).
> If there are not enough resources on a cluster to run a topology, assume this 
> topology is running and using resources, and find the user that is most over 
> their guaranteed resources.  Shoot the lowest priority topology for that user, 
> and repeat until this topology is able to run, or this topology would be the 
> one shot.  Ideally we don't actually shoot anything until we know that we 
> would have made enough room.
> If the cluster is over-subscribed, everyone is under their guarantee, and 
> this topology would not put the user over their guarantee, shoot the lowest 
> priority topology in this worker's resource pool until there is enough room to 
> run the topology or this topology is the one that would be shot.  We might 
> also want to think about what to do if we are going to shoot a production 
> topology in an oversubscribed case, and perhaps we can shoot a non-production 
> topology instead even if the other user is not over their guarantee.
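
The first eviction rule in the description above could be sketched roughly as 
follows.  This is an illustration under simplifying assumptions, not the 
scheduler's implementation: all class, field, and method names are 
hypothetical, only a single resource (memory in MB) is tracked, the 
production/staging/development categories are ignored, and the 
over-subscribed case where everyone is under their guarantee is not covered. 
The plan is computed first and nothing is evicted unless the new topology 
would fit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class EvictionSketch {
    /** Hypothetical topology record: owner, priority (higher = more important), memory in MB. */
    static final class Topo {
        final String name;
        final String user;
        final int priority;
        final double memMb;

        Topo(String name, String user, int priority, double memMb) {
            this.name = name;
            this.user = user;
            this.priority = priority;
            this.memMb = memMb;
        }
    }

    /**
     * Returns the topologies to evict so that `incoming` fits, or
     * Optional.empty() if `incoming` itself would be the one evicted.
     */
    static Optional<List<Topo>> planEvictions(double capacityMb, Map<String, Double> guaranteeMb,
                                              List<Topo> running, Topo incoming) {
        List<Topo> active = new ArrayList<>(running);
        active.add(incoming); // assume the new topology is already running
        List<Topo> victims = new ArrayList<>();

        while (totalMem(active) > capacityMb) {
            // Sum current usage per user.
            Map<String, Double> used = new HashMap<>();
            for (Topo t : active) {
                used.merge(t.user, t.memMb, Double::sum);
            }
            // Find the user who is most over their guarantee (missing guarantee = 0).
            String worstUser = null;
            double worstOver = Double.NEGATIVE_INFINITY;
            for (Map.Entry<String, Double> e : used.entrySet()) {
                double over = e.getValue() - guaranteeMb.getOrDefault(e.getKey(), 0.0);
                if (over > worstOver) {
                    worstOver = over;
                    worstUser = e.getKey();
                }
            }
            // Pick that user's lowest priority topology.
            Topo lowest = null;
            for (Topo t : active) {
                if (t.user.equals(worstUser) && (lowest == null || t.priority < lowest.priority)) {
                    lowest = t;
                }
            }
            if (lowest == incoming) {
                return Optional.empty(); // the new topology would be the one shot
            }
            active.remove(lowest);
            victims.add(lowest);
        }
        return Optional.of(victims);
    }

    static double totalMem(List<Topo> topos) {
        double sum = 0.0;
        for (Topo t : topos) {
            sum += t.memMb;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Double> guarantees = new HashMap<>();
        guarantees.put("alice", 50.0);
        guarantees.put("bob", 50.0);
        List<Topo> running = new ArrayList<>();
        running.add(new Topo("a1", "alice", 5, 40.0));
        running.add(new Topo("a2", "alice", 1, 40.0));
        running.add(new Topo("b1", "bob", 5, 20.0));
        Topo incoming = new Topo("b2", "bob", 7, 30.0);

        Optional<List<Topo>> plan = planEvictions(100.0, guarantees, running, incoming);
        if (plan.isPresent()) {
            for (Topo t : plan.get()) {
                System.out.println("Evict: " + t.name);
            }
        } else {
            System.out.println("Cannot schedule " + incoming.name);
        }
    }
}
```

Returning the eviction plan instead of evicting inside the loop reflects the 
wish in the description not to shoot anything until enough room is known to 
exist.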



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
