Hi,

Storm version: 0.9.2

I'm running a topology where, based on an event, I sync one database to
another. After a Kafka spout (with the db info in the message),
- the first bolt emits a tuple for each table in the db,
- the second bolt reads from the given table and emits batches of rows
for that table,
- the third bolt writes the data to the target database,
- the fourth bolt (fields grouping on the third) emits a success/failure
tuple per table,
- the last bolt (fields grouping on the fourth) collects all the
table-level results and emits the final message.
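For concreteness, here's roughly how the topology is wired (the bolt class
names are placeholders, not my real classes, and the groupings are from my
description above):

```java
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.tuple.Fields;

// Sketch only: SyncKafkaSpout, TableSplitterBolt, etc. stand in for my
// actual spout/bolt implementations.
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("kafka-spout", new SyncKafkaSpout());
builder.setBolt("table-splitter", new TableSplitterBolt())       // one tuple per table
       .shuffleGrouping("kafka-spout");
builder.setBolt("row-reader", new RowReaderBolt())               // batches of rows per table
       .shuffleGrouping("table-splitter");
builder.setBolt("row-writer", new RowWriterBolt())               // writes to target db
       .shuffleGrouping("row-reader");
builder.setBolt("table-status", new TableStatusBolt())           // success/failure per table
       .fieldsGrouping("row-writer", new Fields("table"));
builder.setBolt("final-aggregator", new FinalAggregatorBolt())   // final message
       .fieldsGrouping("table-status", new Fields("db"));
```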

This topology runs without any issue for small databases. But when the db
gets slightly larger, the topology seems to get stuck after processing
some tuples and never proceeds beyond that point.

I saw a similar discussion here [1], but there the problem seems to be
caused by too many pending spout messages. In my case, it's related to the
large number of tuples coming out of the bolts. As you can imagine, the
fan-out from the second bolt can be extremely high. For example, in one
case I was sending as many as 1000 tuples from the second bolt to the
third, and from there to the fourth.
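To illustrate the fan-out (the numbers here are hypothetical; the real
counts depend on the database):

```java
public class FanOutEstimate {
    public static void main(String[] args) {
        // Bolt 1 emits one tuple per table; bolt 2 emits several row
        // batches per table. Every batch then flows through bolts 3 and 4.
        int tables = 50;           // hypothetical table count
        int batchesPerTable = 20;  // hypothetical batches per table
        int inFlight = tables * batchesPerTable;
        System.out.println(inFlight); // 1000 tuples heading downstream at once
    }
}
```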

I'm just wondering why this is getting stuck. Are there any internal
buffer sizes in play here? And how can I fix this, ideally without
changing the topology design?
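For reference, these are the settings from backtype.storm.Config that I
suspect might be involved; I'm not sure which of them (if any) are the
right knobs, and the values below are just examples:

```java
// Candidate settings only -- I don't know yet which ones matter here.
Config conf = new Config();
// Only throttles acked/reliable spouts:
conf.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 100);
// Internal queue sizes (must be powers of two):
conf.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
conf.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);
conf.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 1024);
```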

Really appreciate your input here.

Thanks,
Eran Withana
