Re: Question on Parallelism

Matthias J. Sax Tue, 09 Jun 2015 01:03:19 -0700

One comment: The suggestion to use a single worker to avoid overhead is
basically right. Its only drawback is coarse-grained fault tolerance:
if the worker JVM goes down because of one badly behaving spout/bolt,
all other spouts/bolts in it die, too. Also keep in mind that a worker
only processes spouts/bolts of a single topology (enforced to isolate
topologies from each other for fault-tolerance reasons). Thus, you need
at least one worker (per supervisor) for each topology that runs in
parallel.
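
A minimal sketch of that last point, in plain Java with no Storm
dependency. The class and method names are hypothetical, and it assumes
every topology should have exactly one worker on every supervisor; the
resulting per-topology count is the kind of value you would pass to
Config.setNumWorkers.

```java
final class WorkerSlots {
    /** Worker JVMs needed cluster-wide if each topology runs one worker on each supervisor. */
    static int minWorkers(int supervisors, int topologies) {
        if (supervisors < 0 || topologies < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        return supervisors * topologies;
    }

    /** True if a supervisor with the given number of slots can host one worker per topology. */
    static boolean fits(int slotsPerSupervisor, int topologies) {
        return topologies <= slotsPerSupervisor;
    }

    public static void main(String[] args) {
        // Hypothetical cluster: 3 supervisors, 2 topologies, 4 slots per supervisor.
        System.out.println(minWorkers(3, 2)); // 6 worker JVMs in total
        System.out.println(fits(4, 2));       // true: 2 workers fit into 4 slots
    }
}
```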


-Matthias


On 06/09/2015 02:22 AM, Javier Gonzalez wrote:
> In that case, I would increase the number of bolts and/or spouts. If
> your use case permits*, I'd say you can safely increase those numbers.
> The machine you describe should be able to support about 15 times as
> much. Study your current performance to see where you need more power:
> is your spout running away with it and your bolts lagging behind? Add
> more bolts. Are your bolts idle because you can't feed them enough? More
> spouts. Everything running cool? Add more of everything :)
> 
> * that is, if for some reason you are not restricted to only 4 spouts
> and/or only 13 bolts
> 
> Regards,
> Javier
> 
> On Mon, Jun 8, 2015 at 8:03 PM, Seungtack Baek <[email protected]> wrote:
> 
>     What would be best to do if you have more than the number of cores?
> 
>     For example, we have 4 spouts and 13 bolts and our machine has 32
>     CPUs with 8 cores each.
> 
> 
>     *Seungtack Baek | Precocity, LLC*
> 
>     Tel/Direct: (972) 378-1030 | Mobile: (214) 477-5715
> 
>     [email protected] | www.precocityllc.com
> 
> 
>     This is the end of this message.
> 
>     --
> 
> 
>     On Mon, Jun 8, 2015 at 6:26 PM, Javier Gonzalez <[email protected]> wrote:
> 
>         I would say, configure it so that your total parallelism matches
>         the number of cores available (i.e. if you have a topology with
>         X spouts, Y boltAs and Z boltBs, make it so that X+Y+Z = cores
>         available). And use one worker per machine; inter-JVM
>         communication is expensive. When you have more bolts and
>         spouts than available cores, you lose time switching the
>         available CPUs between them. In an ideal world, your topology
>         will be able to map components to cores in a 1-1
>         fashion without switching.
> 
>         Regards,
>         JG
> 
>         On Mon, Jun 8, 2015 at 6:56 PM, Seungtack Baek <[email protected]> wrote:
> 
>             I was reading "How many Workers should I use?"
>             (<https://storm.apache.org/documentation/FAQ.html#how-many-workers-should-i-use?>),
>             which suggests using a parallelism hint equal to
>             the total number of cores in the cluster. I just want to
>             clarify that this parallelism is for the bolt only,
>             without counting acker and spout tasks, right?
> 
>             Also, even if the number of bolts (not tasks) increases,
>             are we still encouraged to keep the parallelism = total
>             cores in cluster?
> 
>             Thanks,
>             Baek
> 
> 
> 
> 
> 
>         -- 
>         Javier González Nicolini
> 
> 
> 
> 
> 
> -- 
> Javier González Nicolini

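The "about 15 times as much" estimate in the quoted thread can be
reproduced with a small sketch. It is plain Java with hypothetical names,
and it assumes that "4 spouts and 13 bolts" are the current executor
counts and that 32 CPUs with 8 cores each means 256 usable cores. The
scaled values are what you might then try as parallelism hints in
setSpout/setBolt before measuring where the topology actually lags.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ParallelismPlan {
    /**
     * Multiplies every hint by the largest whole factor that keeps the
     * total at or below totalCores. Never scales below the original hints.
     */
    static Map<String, Integer> scaleToCores(Map<String, Integer> hints, int totalCores) {
        int current = 0;
        for (int h : hints.values()) {
            current += h;
        }
        if (current <= 0) {
            throw new IllegalArgumentException("need at least one executor");
        }
        int factor = Math.max(1, totalCores / current); // integer division: 256 / 17 = 15
        Map<String, Integer> scaled = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : hints.entrySet()) {
            scaled.put(e.getKey(), e.getValue() * factor);
        }
        return scaled;
    }

    public static void main(String[] args) {
        Map<String, Integer> hints = new LinkedHashMap<>();
        hints.put("spout", 4);
        hints.put("bolt", 13);
        int cores = 32 * 8; // assumption: 256 cores on the machine
        System.out.println(scaleToCores(hints, cores)); // {spout=60, bolt=195}
    }
}
```

The result uses 255 of the 256 cores, which leaves little headroom for
acker executors, so treat it as a starting point rather than a target.
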