Well, that's true about GC only if you're setting a large heap, which you should only do if you need to keep a lot of persistent state (i.e. state that's not being GCed or flushed to a db). I'd be willing to bet that this covers most use cases. But yeah, point taken that there are exceptions.
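
On the parallelism question further down the thread, here is a rough sketch of why going from four workers per node to one leaves the executor count alone. This is not Storm's actual scheduler; it just assumes an even round-robin split, and the helper name `spread_executors` and the figure of 24 executors are made up for illustration.

```python
def spread_executors(num_executors, num_workers):
    """Deal executors out to workers round-robin.

    Hypothetical helper: a simplified stand-in for scheduling,
    not how Storm assigns executors internally.
    """
    workers = [[] for _ in range(num_workers)]
    for executor_id in range(num_executors):
        workers[executor_id % num_workers].append(executor_id)
    return workers


NODES = 3        # cluster size from the original question
EXECUTORS = 24   # illustrative number, not taken from the thread

# Four slots per node (ports 6700-6703) versus one slot per node (6700 only).
four_per_node = spread_executors(EXECUTORS, NODES * 4)
one_per_node = spread_executors(EXECUTORS, NODES * 1)

for label, layout in (("4 workers/node", four_per_node),
                      ("1 worker/node", one_per_node)):
    total = sum(len(w) for w in layout)
    print(label, "->", len(layout), "workers,", total, "executors in total")
```

Either way the total stays at 24; only the number of JVMs the executors are packed into changes.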

On Fri Feb 06 2015 at 3:26:46 PM Nathan Leung <[email protected]> wrote:

> I would say it depends on what you are trying to do and how your hardware
> is configured. If you have a lot of memory, you might want more workers so
> that GC does not take as long (this can be a problem if you have 32GB or
> more RAM in your VM, depending on how your application behaves and how your
> GC is tuned). If you have your topology split across more workers, you
> have to do more serialization, but if a worker fails, you will lose less of
> the topology.
>
> If you have smaller host machines, then fewer workers makes sense.
>
> One worker per node will not affect how many tasks or executors you
> have. They will just be distributed amongst fewer workers.
>
> On Fri, Feb 6, 2015 at 3:20 PM, Sa Li <[email protected]> wrote:
>
>> Thank you very much, Luke. Are you saying to set one worker on each node, like
>>
>> nimbus.host: "nimbus"
>> supervisor.slots.ports:
>>     - 6700
>> #    - 6701
>> #    - 6702
>> #    - 6703
>>
>> comment out 6701-6703 and just leave one worker up? Now my question: won't
>> only one worker per node affect parallelism?
>>
>> thanks
>>
>> AL
>>
>> On Fri, Feb 6, 2015 at 11:18 AM, Luke Rohde <[email protected]> wrote:
>>
>>> You're probably better off using just one worker per node, unless you
>>> have a specific reason that you want to have more JVM instances. Keeping
>>> processing within a single JVM on a node allows tasks running on the same
>>> node to avoid serialization.
>>>
>>> On Fri Feb 06 2015 at 1:48:42 PM Sa Li <[email protected]> wrote:
>>>
>>>> Hi, all
>>>>
>>>> My Storm dev cluster has 3 nodes, and I configured it to run 4 workers on
>>>> each node by default,
>>>>
>>>> supervisor.slots.ports: For each worker machine, you configure how
>>>> many workers run on that machine with this config. Each worker uses a
>>>> single port for receiving messages, and this setting defines which ports
>>>> are open for use. If you define five ports here, then Storm will allocate
>>>> up to five workers to run on this machine. If you define three ports, Storm
>>>> will only run up to three. By default, this setting is configured to run 4
>>>> workers on the ports 6700, 6701, 6702, and 6703.
>>>>
>>>> I think I can allocate more workers for each node. What is the maximum
>>>> number of workers for each node without impacting performance?
>>>>
>>>>
>>>> thanks
>>>>
>>>> AL
>>>>
>>>
>>
>
