Reading the documentation on spouts (https://storm.apache.org/documentation/Concepts.html#spouts), we took this sentence -- "It is imperative that nextTuple does not block for any spout implementation, because Storm calls all the spout methods on the same thread." -- to mean that spinning up a separate thread inside our custom spouts was a good idea, which sounds similar to what you're doing. So far we haven't had any issues with this approach.
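For what it's worth, here is a minimal plain-Java sketch of the pattern we use (the Storm-specific pieces -- extending BaseRichSpout, the SpoutOutputCollector, declareOutputFields, etc. -- are omitted so the example stands alone; the class and method names below are just illustrative, not Storm API):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Sketch of the buffered-spout pattern: a background thread fills a
 * bounded queue, and the spout thread drains it without ever blocking.
 */
public class BufferedReaderSpout {
    // Bounded queue: put() blocks the reader thread when full, giving
    // natural back-pressure instead of unbounded memory growth.
    private final BlockingQueue<String> buffer = new ArrayBlockingQueue<>(1024);
    private Thread reader;

    // In a real Storm spout this logic would live in open(): spawn the
    // reader thread once, when the spout is initialized.
    public void open(Iterable<String> source) {
        reader = new Thread(() -> {
            try {
                for (String line : source) {
                    buffer.put(line); // blocks only this thread when the buffer is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.setDaemon(true); // don't keep the JVM alive for the reader
        reader.start();
    }

    // In a real spout this would be nextTuple(): poll and return
    // immediately -- never block the spout thread.
    public String nextTuple() {
        return buffer.poll(); // null right away if nothing is buffered yet
    }
}
```

The key points are that only the reader thread ever blocks (on put() when the queue is full and on file IO), while nextTuple() uses the non-blocking poll(); when it returns null, the spout simply emits nothing for that call and Storm will call it again.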
On Fri, Sep 18, 2015 at 9:51 AM, Nick R. Katsipoulakis <[email protected]> wrote:
> Hello all,
>
> I have spouts that read input from files and send the data inside my
> topology. In order to achieve higher input rates, I do some buffering of
> data, by having them read by a thread, spawned after the spout is initiated
> (in the open() function). The data are stored in an ArrayBlockingQueue of
> fixed size.
>
> Unfortunately, it seems that the thread is starving and does not execute
> as it would in a stand-alone JVM.
>
> First of all, is my approach (spawning a thread in my spout) considered a
> good practice for Storm? If not, how else do you suggest I could overcome
> the IO delay from reading the data directly from the file.
>
> Thank you,
> Nick
