Hi,

Storm version: 0.9.2

I'm running a topology where, based on an event, I try to sync one database to another. After a Kafka spout (with the database info in the message):
- the first bolt emits a tuple for each table in the database,
- the second bolt reads from the given table and emits batches of rows for that table,
- the third bolt writes the data to the target database,
- the fourth bolt (fields grouping with the third bolt) emits success/failure per table,
- the last bolt (fields grouping with the fourth bolt) collects all the table-level results and emits the final message.

This topology runs without any issue for small databases. But when the database gets slightly larger, the topology seems to get stuck after processing some tuples and never proceeds beyond that. I saw a similar discussion here[1], but there it seems to happen because of too many pending spout messages. In my case it's related to the large number of tuples emitted by the bolts: as you can imagine, the fan-out from the second bolt can be extremely high. For example, in one case I was sending as many as 1000 tuples from the second bolt to the third, and from there to the fourth.

I'm just wondering why this is getting stuck. Are there any buffer sizes in play here? How can I fix this, ideally without changing the topology design?

Really appreciate your input here.

Thanks,
Eran Withana
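In case it helps narrow things down, these are the storm.yaml settings I suspect might be involved (key names as I understand them for the 0.9.x line; the values below are only illustrative, not what I'm actually running):

```yaml
# Upper bound on un-acked tuples in flight from the spout
# (unset by default, so the spout can keep emitting freely)
topology.max.spout.pending: 1000

# Per-executor queue sizes (documented as needing to be powers of 2)
topology.executor.receive.buffer.size: 1024
topology.executor.send.buffer.size: 1024

# Worker-level queues
topology.receiver.buffer.size: 8
topology.transfer.buffer.size: 1024
```

If the high fan-out from the second bolt can fill one of these queues, is that a plausible cause of the stall, and which of them would you tune first?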
