Hello all, I'm working on a topology which takes a URL from a Kestrel queue and performs a series of varied steps, each of which produces a set of results. I'm now attempting to select the "best" result from them by making the topology coordinated.
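
To make the setup concrete, here is a stripped-down sketch of the aggregation step in plain Java, with no Storm classes involved. It is not the real bolt's code: the class name BestResultAggregator, the ScoredResult type, and the assumption that "best" means the highest numeric score are all placeholders for illustration. The two methods only mirror the roles of execute() and the finishedId() callback.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for the aggregating bolt (hypothetical names throughout).
class BestResultAggregator {

    // One candidate result plus an assumed numeric score used to rank it.
    static final class ScoredResult {
        final String payload;
        final double score;

        ScoredResult(String payload, double score) {
            this.payload = payload;
            this.score = score;
        }
    }

    // Results buffered per request id until that id is reported finished.
    private final Map<Object, List<ScoredResult>> pending = new HashMap<>();

    // Plays the role of execute(): store each incoming result under its id.
    void execute(Object id, String payload, double score) {
        pending.computeIfAbsent(id, k -> new ArrayList<>())
               .add(new ScoredResult(payload, score));
    }

    // Plays the role of finishedId(): pick the highest-scoring result for the
    // id and discard the buffer. Returns null if nothing was buffered.
    String finishedId(Object id) {
        List<ScoredResult> results = pending.remove(id);
        if (results == null) {
            return null;
        }
        ScoredResult best = null;
        for (ScoredResult r : results) {
            if (best == null || r.score > best.score) {
                best = r;
            }
        }
        return best.payload;
    }
}
```

In the real bolt the value returned here is what gets emitted downstream, and that emit is the one I can't anchor.
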
I have a bolt which receives all the results for a given URL: execute() adds these results to a data structure, and the finishedId() callback selects one of them and emits it to the rest of the steps in the topology. All of this works great.

My problem is that when I do this I'm unable to provide the sending tuple as an anchor to the emit in finishedId(), which means the rest of the topology stops being reliable: if anything after this aggregation step fails, Storm won't automagically re-submit the URL. I've tried storing the tuple in my data structure in the aggregating bolt, but unless I ack the tuples in execute(), my finishedId() callback is never run. I've also tried anchoring off a tuple that has already been acked, and that causes an NPE in Storm.

At this point, I'm considering adding the selected result to a new Kestrel queue and creating a second topology to handle the rest of the work, but that seems like a bit of a hack. Any advice would be greatly appreciated.

Ciao,
--
Angelo Genovese
