The output collector is not thread safe, so odd things can happen if it is called from multiple threads. In some cases, namely when there are more tasks than executor threads (which is not that common), you need to follow (1). Storm itself does not follow this convention for the shell spout/bolt, though, and that has resulted in some issues. In general, (2) or (3) should be fine as long as you have the same number of executor threads as you have tasks.
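As a rough illustration of option (1), here is a minimal sketch in which background threads never touch the collector: they hand results to a queue, and only the owning thread drains the queue and emits. `UnsafeCollector` and `HandoffDemo` are hypothetical stand-ins written for this example, not Storm classes. In a real topology the draining loop would presumably live in the method Storm calls on the executor thread (for example `nextTuple()` or `execute()`), and the collector would be Storm's own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a collector that is not thread safe.
// It is NOT Storm's OutputCollector; it only mimics the constraint
// by rejecting calls made from any thread other than its creator.
final class UnsafeCollector {
    private final List<String> emitted = new ArrayList<>();
    private final Thread owner = Thread.currentThread();

    void emit(String value) {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("emit called off the owning thread");
        }
        emitted.add(value);
    }

    List<String> emitted() {
        return emitted;
    }
}

final class HandoffDemo {
    // Starts `workers` background threads that each produce `perWorker`
    // values. Only the calling thread (standing in for the executor
    // thread) ever calls emit().
    static List<String> run(int workers, int perWorker) {
        UnsafeCollector collector = new UnsafeCollector();
        BlockingQueue<String> pending = new LinkedBlockingQueue<>();

        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            final int id = w;
            Thread t = new Thread(() -> {
                for (int i = 0; i < perWorker; i++) {
                    // Workers only enqueue; they never see the collector.
                    pending.add("worker-" + id + "-" + i);
                }
            });
            threads.add(t);
            t.start();
        }

        int expected = workers * perWorker;
        int seen = 0;
        try {
            while (seen < expected) {
                // Drain on the owning thread and emit from here only.
                String value = pending.poll(100, TimeUnit.MILLISECONDS);
                if (value != null) {
                    collector.emit(value);
                    seen++;
                }
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
        return collector.emitted();
    }
}
```

Calling `HandoffDemo.run(4, 50)` returns all 200 values without any worker thread having called `emit`. Option (3) would instead keep the calls on the worker threads and wrap each one in a block synchronized on a shared lock.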

Bobby

On 5/14/14, 8:39 AM, "Srinath C" <[email protected]> wrote:

>Hi,
>    Can someone explain the internals behind "NPE from deep inside
>storm<https://github.com/nathanmarz/storm/wiki/Troubleshooting#nullpointerexception-from-deep-inside-storm>"
>as stated in troubleshooting document? (like is there a use of thread
>locals, etc)
>    Does it imply (which of the following)?:
>    (1) only storm's executor thread should use the output collector
>    (2) output collector should be invoked from one thread throughout
>its
>lifetime - either from storm's executor thread or any thread that that
>topology spawns
>    (3) any number of threads can use it as long as the access is
>synchronized (maybe like synchronized(outputCollector) {...})
>
>    The code is in clojure and hence I'm not able to follow what could
>enforce such a restriction.
>
>Thanks,
>Srinath.
