Hello all,

I'm working on a topology which takes a URL from a Kestrel queue and
performs a series of varied steps, each of which produces a set of results.
I'm now attempting to select the "best" result from them by making the
topology coordinated.

I have a bolt which receives all the results for a given URL: execute()
adds these results to a data structure, and the finishedId() callback
selects one of them and emits it on to the rest of the steps in the
topology. All of this works great.
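
To make the pattern concrete, here is a minimal sketch of that aggregation
step with the Storm types stripped out so it runs on its own. It is not the
actual bolt: the class name, the Result type, and the "highest score wins"
rule are all assumptions made for illustration, and plain method arguments
stand in for the tuples that execute() would receive.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Storm-free model of the aggregating bolt (all names are hypothetical).
// execute() buffers results per request id; finishedId() picks one result
// and drops the buffered state for that id.
class BestResultAggregator {

    // Hypothetical result type; "best" is assumed to mean highest score.
    static final class Result {
        final String payload;
        final double score;

        Result(String payload, double score) {
            this.payload = payload;
            this.score = score;
        }
    }

    private final Map<Object, List<Result>> pending = new HashMap<>();

    // Stands in for execute(Tuple): remember one result for the given id.
    void execute(Object id, Result result) {
        pending.computeIfAbsent(id, k -> new ArrayList<>()).add(result);
    }

    // Stands in for finishedId(Object id): return the selected result,
    // or null if nothing was buffered for that id. Note that only the id
    // is available here, not any of the original input tuples.
    Result finishedId(Object id) {
        List<Result> results = pending.remove(id);
        if (results == null) {
            return null;
        }
        Result best = null;
        for (Result r : results) {
            if (best == null || r.score > best.score) {
                best = r;
            }
        }
        return best;
    }

    // Number of ids that still have buffered results.
    int pendingIds() {
        return pending.size();
    }
}
```

In the real bolt the value returned from finishedId() would be emitted
through the collector instead, which is where the anchoring question comes
in.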

My problem is that when I do this I'm unable to provide the sending tuple
as an anchor to the emit in finishedId(), meaning that the rest of the
topology stops being reliable: if anything after this aggregation step
fails, Storm won't automagically re-submit the URL.

I've tried storing the tuple in my data structure in the aggregating bolt,
but unless I ack the tuples in execute() my finishedId() callback is never
run. I've also tried anchoring off of a tuple that has already been acked,
and that causes an NPE in Storm.

At this point, I'm considering adding the selected result to a new Kestrel
queue and creating a second topology to handle the rest of the work, but
that seems like a bit of a hack. Any advice would be greatly appreciated.


Ciao,

-- 
Angelo Genovese
