I would say it depends on what you are trying to do and how your hardware is configured. If you have a lot of memory, you might want more workers so that each JVM has a smaller heap and GC does not take as long (this can be a problem if you have 32GB or more RAM in your VM, depending on how your application behaves and how your GC is tuned). If you have your topology split across more workers, you have to do more serialization, since tuples passed between workers cannot stay inside one JVM, but if a worker fails, you will lose less of the topology.
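As a rough sketch of that trade-off, assuming a node with more memory than you would want in a single heap, the supervisor config might look something like this. The port count and the 4 GB heap are made-up numbers for illustration, not a recommendation; tune them for your own application.

```yaml
# storm.yaml on each supervisor node (illustrative values only)
# Two slots means Storm will run up to two worker JVMs on this machine.
supervisor.slots.ports:
    - 6700
    - 6701
# Heap given to each worker JVM. The 4g figure is an assumption for
# illustration; smaller heaps per worker tend to mean shorter GC pauses.
worker.childopts: "-Xmx4g"
```

With a single port listed instead, everything scheduled on the node runs in one JVM, which avoids serialization between tasks on that node at the cost of one larger heap.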
If you have smaller host machines, then fewer workers make sense. One worker per node will not affect how many tasks or executors you have. They will just be distributed amongst fewer workers.

On Fri, Feb 6, 2015 at 3:20 PM, Sa Li <[email protected]> wrote:

> Thank you very much, Luke, are you saying set one worker each node, like
>
> nimbus.host: "nimbus"
> supervisor.slots.ports:
> - 6700
> # - 6701
> # - 6702
> # - 6703
>
> comment out 6701-6703, just leave one worker up? Now my question, only one
> worker per node won't effect parallelism?
>
> thanks
>
> AL
>
> On Fri, Feb 6, 2015 at 11:18 AM, Luke Rohde <[email protected]> wrote:
>
>> You're probably better of using just one worker per node, unless you have
>> a specific reason that you want to have more JVM instances. Keeping
>> processing within a single JVM on a node allows tasks running on the same
>> node to avoid serialization.
>>
>> On Fri Feb 06 2015 at 1:48:42 PM Sa Li <[email protected]> wrote:
>>
>>> Hi, all
>>>
>>> My storm Dev cluster has 3 nodes, and I config to run 4 workers on each
>>> node by default,
>>>
>>> supervisor.slots.ports: For each worker machine, you configure how many
>>> workers run on that machine with this config. Each worker uses a single
>>> port for receiving messages, and this setting defines which ports are open
>>> for use. If you define five ports here, then Storm will allocate up to five
>>> workers to run on this machine. If you define three ports, Storm will only
>>> run up to three. By default, this setting is configured to run 4 workers on
>>> the ports 6700, 6701, 6702, and 6703.
>>>
>>> I think I can allocate more workers for each node, what is the maximum
>>> number of worker for each node without impact the performance?
>>>
>>>
>>> thanks
>>>
>>> AL
>>>
>>
>
