Hi, I hope this is the correct mailing list.

I have asked a question on StackOverflow
(http://stackoverflow.com/questions/31503017/apache-storm-track-tuples-by-unique-id-from-source-spout-to-final-bolt)
but feel it may be better answered here. Any help or suggestions would be
appreciated, even if it is just confirmation that I will need to send the ID
along as an output of each Spout and Bolt. Thanks.

I want a method of uniquely identifying tuples throughout a whole Storm 
topology, so that each tuple can be tracked from Spout to the final Bolt.

As I understand it, a unique message ID can be passed along with an emit from a
spout, for example:

// UUID.randomUUID() returns a UUID, so convert it to a String
String msgID = UUID.randomUUID().toString();
// emits a line from user tasks with msg id
outputCollector.emit(new Values(task), msgID);

This ID is somehow returned to the Spout when the tuple is acked. (Can this be
simulated earlier, to get back the passed ID at any point?) However, calling
getMessageId() on a tuple, for example:

inputTuple.getMessageId()

This returns a new message ID generated by the Tuple, not the one passed in at
the Spout. Reference:
https://groups.google.com/forum/#!topic/storm-user/xBEqMDa-RZs

Questions

1) Is there a way to get the tuple.getMessageId() value when the collector
emits the Tuple?

2) Alternatively, can the messageId passed in at the spout be retrieved from
the tuple at any spout or bolt in the topology?

End solution: I want to be able to set an ID on a tuple when it is emitted, and
then be able to identify that tuple again at any point in the Storm topology.

Or will the unique messageId that my system tracks have to be passed as a
field/value on each output of each spout and bolt?
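
To make that last option concrete, here is a rough, Storm-free sketch of what I
mean by passing the ID along as a value on every output. It models a tuple as a
plain list of values. The class name TrackedTupleSketch, the helper methods
spoutEmit and boltEmit, and the "ID is always the first value" convention are
all made up for illustration and are not part of Storm. In a real topology I
assume this would correspond to declaring an extra field in
declareOutputFields and reading it back with something like
tuple.getStringByField, but I have not confirmed that this is the recommended
approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

// Storm-free sketch: a "tuple" is modelled as a plain list of values whose
// first element is the tracking ID (hypothetical convention, not a Storm API).
class TrackedTupleSketch {

    // Position of the tracking ID in every emitted value list (assumption).
    static final int TRACKING_ID_INDEX = 0;

    // What a spout would emit: the tracking ID followed by the payload.
    static List<Object> spoutEmit(String trackingId, Object payload) {
        return new ArrayList<>(Arrays.asList(trackingId, payload));
    }

    // What a bolt would emit: read the ID from the input values and copy it
    // onto the output values, next to the bolt's own result.
    static List<Object> boltEmit(List<Object> input, Object newPayload) {
        String trackingId = (String) input.get(TRACKING_ID_INDEX);
        return new ArrayList<>(Arrays.asList(trackingId, newPayload));
    }

    public static void main(String[] args) {
        String trackingId = UUID.randomUUID().toString();
        List<Object> fromSpout = spoutEmit(trackingId, "some task");
        List<Object> fromBolt1 = boltEmit(fromSpout, "some task, parsed");
        List<Object> fromBolt2 = boltEmit(fromBolt1, "some task, stored");
        // Check that the ID set at the spout is still visible at the last stage.
        System.out.println(trackingId.equals(fromBolt2.get(TRACKING_ID_INDEX)));
    }
}
```

The drawback I see is that every spout and bolt has to know about the extra
value and copy it forward, which is exactly what I was hoping to avoid.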

Thanks

