On Wed, Jul 9, 2014 at 11:34 AM, Preetam Rao <[email protected]> wrote:
> Hi
>
> Appreciate any pointers on the following, which are causing us problems in
> production.
>
> 1. Is there a way we can restrict multiple instances of a given spout to be
> allocated on "different hosts"? Our spouts start an embedded Jetty server
> that listens on a well-known port (here, https). Thus parallelism on the same
> host does not help. We observe that most often, even when parallelism is set
> to 5, all get allocated on the same host, rendering the parallelism wasted,
> which in turn is causing load issues. From the discussions I have read, my
> thinking is that it is not possible. But it is a critical issue for us right
> now, so pointers help.

You could probably do this by restricting each host to one worker and setting the parallelism and the number of tasks lower than the number of workers.

> 2. Is there a way we can control how many components get allocated per
> worker (or enforce a rule to never allocate more than one component per
> worker)? Irrespective of a high worker count setup, occasionally both a spout
> and a bolt get allocated on the same worker (that is, same host &
> storm port). This is causing load & GC issues since the input rate is quite
> high.

I recently came across an article on writing a pluggable scheduler. See if this helps:
http://xumingming.sinaapp.com/885/twitter-storm-how-to-develop-a-pluggable-scheduler/

> 3. Ours is a multi-tenant setup. Along the same lines as item 2 above, how
> can we prevent components from different topologies from running on the same
> worker? Because that simply means any topology can by chance break the
> worker (say, with a memory leak) on which my topology's components are running.

Perhaps you can just restrict each host to one worker process?

> Thanks in advance for any pointers/suggestions.
> Preetam
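
For the one-worker-per-host suggestion, here is a minimal sketch of what the supervisor side might look like. It assumes each supervisor reads its slots from `storm.yaml`; the port 6700 is only the usual first default, and I have not tested this against your cluster.

```yaml
# storm.yaml on each supervisor host (a sketch, not a tested production config).
# Listing a single port gives this supervisor a single worker slot,
# so at most one worker process can run on the host at a time.
supervisor.slots.ports:
    - 6700
```

The supervisor has to be restarted to pick up the change. With five such hosts, a topology submitted with five workers and a spout parallelism of five would be expected to end up with one spout executor per host, although the default scheduler does not promise that placement. Note also that a host with a single slot can serve only one topology's worker at a time, so the cluster's total capacity drops accordingly.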
