Hi Martin, Use the "topology.max.spout.pending" config. It will solve both of your problems. Storm does not have back pressure, but this acts as a throttling valve on the spout. Make sure the value is up to twice the maximum throughput you expect out of your system.
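For illustration, a minimal sketch of setting that cap when submitting a topology (assuming the pre-1.0 backtype.storm API that was current at the time of this thread; MySpout, MyBolt, and the numbers used are hypothetical placeholders, not values from the original discussion):

    import backtype.storm.Config;
    import backtype.storm.StormSubmitter;
    import backtype.storm.topology.TopologyBuilder;

    public class ThrottledTopology {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("events", new MySpout(), 1);              // hypothetical spout
            builder.setBolt("process", new MyBolt(), 4)                // hypothetical bolt
                   .shuffleGrouping("events");

            Config conf = new Config();
            // Cap on un-acked tuples in flight per spout task; Storm stops calling
            // nextTuple() once this many tuples are pending, so slow or still-preparing
            // bolts cannot flood the heap with buffered tuples.
            conf.setMaxSpoutPending(5000); // illustrative value, tune to your throughput

            StormSubmitter.submitTopology("throttled-topology", conf, builder.createTopology());
        }
    }

With this in place the spout is throttled to a bounded number of outstanding tuples, which addresses both the startup flood and the dead-worker case described below.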
On Tue, Aug 18, 2015 at 8:14 AM Pasquini, Reuben <[email protected]> wrote:

> Hi Martin,
>
> Initialize your bolts in the ‘prepare’ method – the topology will not go
> live until all the bolts have completed their preparation.
>
> I believe the topology will automatically throttle calls to a spout’s
> ‘nextTuple’ method if the rest of the topology does not ‘ACK’ the tuples –
> be sure that your bolts are not blindly ‘acking’ tuples that have not
> completed processing in the bolt.
>
> *From:* Martin Burian [mailto:[email protected]]
> *Sent:* Tuesday, August 18, 2015 5:04 AM
> *To:* [email protected]
> *Subject:* Topology execution synchronization
>
> Good noon to everyone,
>
> I have encountered problems with synchronization of component execution.
>
> When the topology starts, the spout starts emitting tuples before the
> other components are prepared (the bolts recover state from Redis, which
> takes a considerable amount of time). The emitted tuples that cannot be
> processed are buffered in memory, fill the heap, and cause permanent GC
> and a slow, painful death for the worker process. Is it meant to be like
> this?
>
> In the second case, a worker running part of the topology (not the spout)
> dies. The spout keeps working even though it knows the other worker is
> dead and drops messages to it instead of waiting for it to come back up.
>
> We have solved the first one by making the spout wait until all the bolts
> come up, but the second problem would require a noticeable amount of work.
> Are these behaviors intended? Can they be changed?
>
> Thanks in advance for replies,
>
> Martin B
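For reference, a minimal sketch of the pattern Reuben describes: do the slow Redis recovery in ‘prepare’ and only ack a tuple after it has actually been processed. This assumes the pre-1.0 backtype.storm API, and RedisStateStore is a hypothetical helper standing in for whatever recovery code the bolt uses:

    import java.util.Map;

    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Tuple;
    import backtype.storm.tuple.Values;

    public class StateRecoveringBolt extends BaseRichBolt {
        private OutputCollector collector;
        private Map<String, Long> state;

        @Override
        public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            // Long-running recovery belongs here; combined with max.spout.pending,
            // the spout's un-acked tuples stay bounded while this runs.
            this.state = RedisStateStore.load(context.getThisComponentId()); // hypothetical helper
        }

        @Override
        public void execute(Tuple input) {
            try {
                String key = input.getStringByField("key");
                long count = state.merge(key, 1L, Long::sum);
                // Anchor the emitted tuple to the input so a downstream failure
                // is replayed from the spout.
                collector.emit(input, new Values(key, count));
                collector.ack(input);   // ack only after processing succeeded
            } catch (Exception e) {
                collector.fail(input);  // let the spout replay the tuple
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("key", "count"));
        }
    }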
