Tuning a topology that contains bolts with unpredictable execute latency is extremely difficult. I've had to slow down the entire topology by tuning topology.max.spout.pending and increasing topology.message.timeout.secs; otherwise tuples queue up and time out.
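
For what it's worth, here is a rough back-of-the-envelope sketch of how those two settings interact with a slow bolt. It is plain Python arithmetic, not Storm API, and the function names and example numbers are made up for illustration. It assumes every pending tuple ends up queued at the slow bolt, that each tuple costs exactly one fetch, and that tuples spread evenly across executors; real topologies will deviate from all three.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding float rounding."""
    return -(-a // b)


def executors_needed(tuples_per_sec: int, latency_ms: int) -> int:
    """Little's law: concurrent work = arrival rate x time per item.

    Returns how many slow-bolt executors must be busy at once to keep up
    with the incoming rate (hypothetical helper, not a Storm call).
    """
    return ceil_div(tuples_per_sec * latency_ms, 1000)


def worst_case_wait_ms(pending_per_spout: int, spout_tasks: int,
                       executors: int, latency_ms: int) -> int:
    """Upper bound on how long a tuple can sit before the slow bolt finishes it.

    The pending limit applies per spout task, so total in-flight tuples are
    pending_per_spout * spout_tasks. They drain in "rounds" of one tuple
    per executor, each round taking latency_ms.
    """
    in_flight = pending_per_spout * spout_tasks
    rounds = ceil_div(in_flight, executors)
    return rounds * latency_ms


if __name__ == "__main__":
    # 10,000 tuples/sec through a 100 ms HTTP bolt
    print(executors_needed(10_000, 100))           # 1000
    # 1000 pending tuples, 1 spout task, 64 fetch executors at 500 ms each
    print(worst_case_wait_ms(1000, 1, 64, 500))    # 8000
```

If the worst-case wait comes out close to or above the message timeout (30 seconds by default), tuples will start failing and replaying, so either the pending limit has to come down, the timeout has to go up, or the slow bolt needs more executors.
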
On Tue, Mar 10, 2015 at 8:53 AM Martin Illecker <[email protected]> wrote:

> I would be interested in a solution for high latency bolts as well.
>
> Maybe a custom scheduler, which prioritizes high latency bolts might help?
> (e.g., allowing a worker to exclusively run high latency bolts)
>
> Does anyone have a working solution for a high-throughput topology (x0000
> tuples / sec) including a HTTPClient bolt (latency around 100ms)?
>
>
> 2015-03-08 20:35 GMT+01:00 Frank Jania <[email protected]>:
>
>> I've been running storm successfully now for a while with a fairly simple
>> topology of this form:
>>
>> spout with a stream of tweets --> bolt to check tweet user against cache
>> --> bolts to do some persistence based on tweet content.
>>
>> So far that's been humming along quite well with execute latencies in low
>> single digit or sub millisecond. Other than setting the parallelism for
>> various bolts, I've been able to run it the default topology config pretty
>> well.
>>
>> Now I'm trying a topology of the form:
>>
>> spout with a stream of tweets --> bolt to extract the urls in the tweet
>> --> bolt to fetch the url and get the page's title.
>>
>> For this topology the "fetch" portion can have a much longer latency, I'm
>> seeing execute latencies in the 300-500ms range to accommodate the fetch of
>> any of these arbitrary urls. I've implemented caching to avoid fetching
>> urls I already have titles for and using socket/connection timeouts to keep
>> fetches from hanging for too long, but even still, this is going to be a
>> bottleneck.
>>
>> I've set the parallelism for the fetch bolt fairly high already, but are
>> there any best practices for configuring a topology like this where at
>> least one bolt is going to take much more time to process than the rest?
>>
