Hello,

Problem:
I have a couple of busy streams (a few billion tuples per day) that are getting an unfair distribution of work. I'm using the Trident API on Storm 0.8.0. One of the streams (spouts) doesn't get to run.

Details:
I've been using Storm for almost 2 years, and this topology worked fine for that whole time until recently, when our load increased. I have 2 streams reading from queues, and the topology does nothing more than a group-by and an aggregation.

The main issue is that at first, if I reset the topology, both streams make progress, but as time goes on, only one of the streams gets to run and pull data. Both streams are busy enough that there will always be data available for Storm to pull from either stream (my worker count is 1), and that's OK. The issue is that after some time only one of them runs, and the other one has no acked tuples.

I've made sure that both streams are available by commenting out one source at a time and checking that each works on its own. The issue only appears when I do a Topology.merge(stream 1 ... stream n).

Things I've done to mitigate:
I've increased the acker count, increased the worker count, and increased the parallelism hint. I don't have a lot of computational resources for this one topology.

Our load is seasonal: high during the day and low at night. As long as everything is aggregated by the end of the day, I'm fine with some delay during the high-traffic period.

The question/help:
Has anyone experienced similar issues with Trident? Maybe I should do the acking myself through the lower-level API.

Any suggestion is much appreciated.

Thanks,
Alex
