[ 
https://issues.apache.org/jira/browse/STORM-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641679#comment-14641679
 ] 

Jon Weygandt commented on STORM-594:
------------------------------------

I have yet to work with a use case like that, although one of my colleagues 
pointed this out. I am really interested in how Nimbus actually converts 
bolts/parallelism/workers into a physical executable layout. It has always 
fascinated me how one takes the given input and creates something that 
is stable and manageable. I have yet to look at the code for this, but if it is 
very predictable and does what one wants, the "singleton bolts" will need to be 
tagged in a way the AutoScaleManager recognizes, and the scaling algorithm 
will use that tagging. If the topology allocation is not what one wants, but it 
is somewhat "modular", perhaps the AutoScaleManager could have a 
computePhysicalTopology method, so that part of the layout is under the control 
of the auto-scale system.
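
To make the idea concrete, here is a minimal sketch of how such tagging and a 
computePhysicalTopology hook could fit together. Everything in it is 
hypothetical: Storm has no AutoScaleManager type, and the interface shape, the 
SingletonAwareScaleManager class, and the "at least one executor per worker" 
rule for untagged bolts are assumptions made only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical interface: not part of Storm's API.
interface AutoScaleManager {
    // Maps component id -> executor count for a given number of workers.
    Map<String, Integer> computePhysicalTopology(
            Map<String, Integer> requestedParallelism, int workers);
}

// Assumed policy: tagged "singleton bolts" are pinned to one executor,
// every other component gets at least one executor per worker.
class SingletonAwareScaleManager implements AutoScaleManager {
    private final Set<String> singletonBolts;

    SingletonAwareScaleManager(Set<String> singletonBolts) {
        this.singletonBolts = singletonBolts;
    }

    @Override
    public Map<String, Integer> computePhysicalTopology(
            Map<String, Integer> requestedParallelism, int workers) {
        Map<String, Integer> layout = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : requestedParallelism.entrySet()) {
            boolean singleton = singletonBolts.contains(e.getKey());
            int executors = singleton ? 1 : Math.max(e.getValue(), workers);
            layout.put(e.getKey(), executors);
        }
        return layout;
    }
}
```

With "accumulator" tagged as a singleton and three workers, a request of 
{accumulator=4, email-out=2} would come back as {accumulator=1, email-out=3} 
under this assumed policy.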

We did, however, go into more detail on which bolts qualify as "singleton 
bolts". One that does: an accumulator for some distributed operation that is 
accumulating partial results into a final result. If you partitioned it, you 
would not get the results you want, so these must be "singleton bolts". One 
that does not: an email-out bolt. Yes, it is trivial, and the system will work 
fine with only one. But logically the system will work fine with one per worker 
as well. Some might think that's wasteful, but for email, not really. You have 
one more thread per worker, but each does only X/workers of the work. In fact 
this may be more efficient, as you won't have to do (de)serialization and IPC 
to the singleton. Now if it is a database output bolt and connections to the DB 
are limited, there is a tradeoff. 
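
The arithmetic behind that tradeoff can be sketched as follows. The class and 
method names are hypothetical, and capping the executor count at the number of 
available DB connections is just one assumed way to resolve the tradeoff.

```java
// Hypothetical helper: illustrates the X/workers arithmetic only.
final class OutputBoltSizing {
    private OutputBoltSizing() {}

    // Share of the total work X that each executor handles when the
    // work is spread evenly: X / executors.
    static double loadPerExecutor(double totalLoad, int executors) {
        if (executors < 1) {
            throw new IllegalArgumentException("executors must be >= 1");
        }
        return totalLoad / executors;
    }

    // One executor per worker, unless the DB connection limit is lower
    // (assumed policy for a connection-limited output bolt).
    static int executorsFor(int workers, int maxDbConnections) {
        if (workers < 1 || maxDbConnections < 1) {
            throw new IllegalArgumentException("arguments must be >= 1");
        }
        return Math.min(workers, maxDbConnections);
    }
}
```

For an email-out bolt the connection limit is effectively unbounded, so 
executorsFor simply returns the worker count; for a DB bolt with 3 connections 
and 8 workers it returns 3, and some tuples still cross worker boundaries.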

We will support both scenarios, so the Storm developer will have a choice.



> Auto-Scaling Resources in a Topology
> ------------------------------------
>
>                 Key: STORM-594
>                 URL: https://issues.apache.org/jira/browse/STORM-594
>             Project: Apache Storm
>          Issue Type: New Feature
>            Reporter: HARSHA BALASUBRAMANIAN
>            Assignee: Pooyan Jamshidi
>            Priority: Minor
>         Attachments: Algorithm for Auto-Scaling.pdf, Project Plan and 
> Scope.pdf
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> A useful feature missing in Storm topologies is the ability to auto-scale 
> resources, based on a pre-configured metric. The feature proposed here aims 
> to build such an auto-scaling mechanism using a feedback system. A brief 
> overview of the feature is provided here. The finer details of the required 
> components and the scaling algorithm (uses a Feedback System) are provided in 
> the PDFs attached.
> Brief Overview:
> Topologies may get created with or (ideally) without parallelism hints and 
> tasks in their bolts and spouts before they are submitted. If auto-scaling is 
> set in the topology (using a Boolean flag), the topology will also get 
> submitted to the auto-scale module.
> The auto-scale module will read a pre-configured metric (threshold/min) from 
> a configuration file. Using this value, the topology's resources will be 
> modified till the threshold is reached. At each stage in the auto-scale 
> module's execution, feedback from the previous execution will be used to tune 
> the resources.
> The systems that need to be in place to achieve this are:
> 1. Metrics which provide the current threshold (number of acks per minute) 
> for a topology's spouts and bolts.
> 2. Access to Storm's CLI tool which can change a topology's resources at 
> runtime.
> 3. A new Java or Clojure module which runs within the Nimbus daemon or in 
> parallel to it. This will be the auto-scale module.
> Limitations: (This is not an exhaustive list. More will be added as the 
> design matures. Also, some of the points here may get resolved)
> To test the feature there will be a number of limitations in the first 
> release. As the feature matures, it will be allowed to scale more.
> 1. The auto-scale module will be limited to a few topologies (maybe 4 or 5 at 
> maximum)
> 2. New bolts will not be added to scale a topology. This feature will be 
> limited to increasing the resources within the existing topology.
> 3. Topology resources will not be decreased when the topology is running 
> with more than the required number (except for a few cases)
> 4. This feature will work only for long-running topologies where the input 
> threshold can become equal to or greater than the required threshold
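
The feedback loop described in the overview above could be sketched roughly as 
below. This is not the algorithm from the attached PDFs, whose details are not 
reproduced here. The FeedbackScaler class, the step-doubling rule, and the 
maxExecutors cap are all assumptions. The sketch only honors what the 
description states: scale up until the configured acks-per-minute threshold is 
reached, use the previous measurement as feedback, and never scale down.

```java
// Hypothetical sketch of a feedback-driven scale-up loop.
class FeedbackScaler {
    private final double targetAcksPerMin; // threshold read from configuration
    private final int maxExecutors;        // assumed upper bound
    private int step = 1;                  // how many executors to add next
    private double lastAcksPerMin = -1;    // previous measurement, -1 = none yet

    FeedbackScaler(double targetAcksPerMin, int maxExecutors) {
        this.targetAcksPerMin = targetAcksPerMin;
        this.maxExecutors = maxExecutors;
    }

    // Returns the parallelism to apply for the next round.
    int nextParallelism(int current, double measuredAcksPerMin) {
        if (measuredAcksPerMin >= targetAcksPerMin) {
            return current; // threshold reached; resources are never decreased
        }
        if (lastAcksPerMin >= 0) {
            // Feedback from the previous round: if the last change helped,
            // take a bigger step; if it did not, fall back to a step of 1.
            step = measuredAcksPerMin > lastAcksPerMin ? step * 2 : 1;
        }
        lastAcksPerMin = measuredAcksPerMin;
        return Math.min(maxExecutors, current + step);
    }
}
```

A caller would invoke nextParallelism once per measurement interval and apply 
the returned value through whatever mechanism changes resources at runtime, for 
example the storm rebalance command.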



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
