Hello everybody,
I believe that the number of workers is only relevant if you use Storm's
default scheduler. You can also define your own scheduler by
implementing the IScheduler interface and assigning the different
executors to the available slots as you see fit.
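Something along these lines - just a rough, untested sketch against the
0.9.x scheduler API, with a made-up class name and a deliberately naive
slot choice:

    import java.util.Collection;
    import java.util.List;
    import java.util.Map;

    import backtype.storm.scheduler.Cluster;
    import backtype.storm.scheduler.ExecutorDetails;
    import backtype.storm.scheduler.IScheduler;
    import backtype.storm.scheduler.Topologies;
    import backtype.storm.scheduler.TopologyDetails;
    import backtype.storm.scheduler.WorkerSlot;

    public class MyScheduler implements IScheduler {

        public void prepare(Map conf) {
            // no state needed for this sketch
        }

        public void schedule(Topologies topologies, Cluster cluster) {
            for (TopologyDetails topology : topologies.getTopologies()) {
                if (!cluster.needsScheduling(topology)) {
                    continue;
                }
                List<WorkerSlot> freeSlots = cluster.getAvailableSlots();
                Collection<ExecutorDetails> executors =
                    cluster.getUnassignedExecutors(topology);
                if (freeSlots.isEmpty() || executors.isEmpty()) {
                    continue;
                }
                // Naive placement: put all unassigned executors of this
                // topology on the first free slot. A real scheduler would
                // spread them over slots/supervisors however it sees fit.
                cluster.assign(freeSlots.get(0), topology.getId(), executors);
            }
        }
    }

You then point nimbus at it via the storm.scheduler setting in storm.yaml
(e.g. storm.scheduler: "org.example.MyScheduler").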
Ahmed
On 11/05/2014 06:20 PM, Tyson Norris wrote:
I agree the behavior is awkward.
The setNumWorkers config appears to behave as an upper limit on the
number of workers that will be utilized (e.g. setNumWorkers(200) will
not fail to deploy when workers < 200), so it can also affect the ORDER
in which topologies must be deployed when workers < the sum of
setNumWorkers across all topologies.
For example, I originally used setNumWorkers(200) - artificially high - so
that we could scale our worker pool up to 200 without deploying new
code. However, when the worker pool is 10 and this topology gets
deployed first, no other topologies can be deployed, since all the
workers are assigned to this single topology.
Now, while this is awkward IMHO (I would rather have some additional
knobs, like “assign a percentage of workers” or “assign a topology
priority”), we can get around it by using rebalance and setting the
number of workers to reflect the current state of the cluster. Since
you MUST rebalance anyway to increase/decrease the number of threads
(which would typically go hand in hand with changing the number of
workers, I think), it's not really extra work.
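For reference, the rebalance is just the stock CLI; it looks roughly like
this (topology and component names are placeholders):

    storm rebalance our-topology -w 30 -n 10 -e our-bolt=20

i.e. wait 30 seconds, then move the topology to 10 workers and give
our-bolt 20 executors.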
At some point it would be nice to have some pluggable logic that
controls the number of executors dynamically, so that when a worker is
added, the number of executors can be altered programmatically,
instead of requiring manual intervention.
Thanks
Tyson
On Nov 5, 2014, at 9:05 AM, Dan DeCapria, CivicScience
<[email protected]> wrote:
Hi Nathan,
Sounds like I need to just bite the bullet and manually define the
number of workers for each topology, taking into account all topologies
that will be running concurrently. The secondary process you mentioned
is interesting - using Thrift to query worker utilization and then
auto-balance all topologies at runtime - I'll have to look into that
further.
Thanks for your help,
-Dan
On Wed, Nov 5, 2014 at 11:50 AM, Nathan Leung <[email protected]> wrote:
It doesn't make sense to automatically balance worker load
because you can get yourself into strange situations (e.g. 3
workers and each topology requests 4, or 4 workers and > 4
topologies, etc.). It would be nice if the UI or the logs gave a
better indication that there were not enough workers to go around,
though. You could write something that reads cluster information
from Nimbus over Thrift and rebalances all topologies as
necessary, but I think it would be better to just make sure that
you have enough workers available (or even better, more than
enough workers) to satisfy the needs of your applications.
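If you did want to go that route, the gist would be something like the
following sketch (assuming the 0.9.x Java client classes; untested):

    import java.util.Map;

    import backtype.storm.generated.ClusterSummary;
    import backtype.storm.generated.Nimbus;
    import backtype.storm.generated.SupervisorSummary;
    import backtype.storm.utils.NimbusClient;
    import backtype.storm.utils.Utils;

    public class SlotMonitor {
        public static void main(String[] args) throws Exception {
            Map conf = Utils.readStormConfig();
            Nimbus.Client nimbus = NimbusClient.getConfiguredClient(conf).getClient();

            ClusterSummary summary = nimbus.getClusterInfo();
            int total = 0;
            int used = 0;
            for (SupervisorSummary s : summary.get_supervisors()) {
                total += s.get_num_workers();
                used += s.get_num_used_workers();
            }
            System.out.println("worker slots used/total: " + used + "/" + total);

            // From here you could call nimbus.rebalance(name, options) per
            // topology when used == total and a new topology is starved.
        }
    }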
On Tue, Nov 4, 2014 at 4:45 PM, Dan DeCapria, CivicScience
<[email protected]> wrote:
Use Case:
I have a production Storm cluster running with six workers.
Currently Topology A is active and consuming all six workers
via conf.setNumWorkers(6). Launching Topology B with six
workers (again via conf.setNumWorkers(6)) reports the topology
as active, but there are currently no available workers on the
cluster for Topology B to use (Topology A has already claimed
them all), and hence Topology B is doing nothing. I believe this
is because Storm requires a priori that the sum of all
topologies' requested workers be <= the cluster's worker capacity.
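For context, each topology is submitted in the usual way, roughly like
this (the spout and topology names here are just placeholders):

    import backtype.storm.Config;
    import backtype.storm.StormSubmitter;
    import backtype.storm.testing.TestWordSpout;
    import backtype.storm.topology.TopologyBuilder;

    public class SubmitTopologyA {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("words", new TestWordSpout(), 6);  // placeholder spout

            Config conf = new Config();
            conf.setNumWorkers(6);  // asks for all six slots in this cluster

            StormSubmitter.submitTopology("topology-A", conf, builder.createTopology());
        }
    }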
I am wondering why the worker allocations are not normalized
across topologies when capacity is exceeded, auto-adjusting as
new topologies come and go. Meaning, from the use case: since
both topologies requested the same count of six workers, and
given the finite capacity of the cluster at six actual workers,
the /implemented/ normalized proportion of cluster resources for
each topology would be a 50% split - Topology A gets three actual
workers and Topology B gets three actual workers as well.
How would I go about implementing a dynamic re-allocation of
topology workers based on the proportion of expected workers
given the cluster's finite worker capacity? An idea would be to
allocate workers as requested for all topologies until the
cluster capacity is reached, at which point each topology's
actual worker count becomes a normalized proportional allocation
over the cluster.
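To make the idea concrete, the normalization I have in mind is just a
proportional division of the slot pool; a rough sketch of the arithmetic
(names made up):

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class ProportionalSlots {
        // Given each topology's requested worker count and the cluster's
        // actual slot capacity, compute a normalized allocation.
        static Map<String, Integer> normalize(Map<String, Integer> requested, int capacity) {
            int totalRequested = 0;
            for (int r : requested.values()) {
                totalRequested += r;
            }
            Map<String, Integer> allocation = new LinkedHashMap<String, Integer>();
            if (totalRequested <= capacity) {
                allocation.putAll(requested);  // enough slots: everyone gets what they asked for
                return allocation;
            }
            for (Map.Entry<String, Integer> e : requested.entrySet()) {
                int share = Math.max(1, (e.getValue() * capacity) / totalRequested);
                allocation.put(e.getKey(), share);
            }
            // Rounding can leave a slot or two unused or oversubscribed;
            // a real implementation would redistribute the remainder.
            return allocation;
        }

        public static void main(String[] args) {
            Map<String, Integer> requested = new LinkedHashMap<String, Integer>();
            requested.put("topology-A", 6);
            requested.put("topology-B", 6);
            // six physical slots, two topologies asking for six each -> {topology-A=3, topology-B=3}
            System.out.println(normalize(requested, 6));
        }
    }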
Many thanks,
-Dan