[ 
https://issues.apache.org/jira/browse/STORM-158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rick Kellogg updated STORM-158:
-------------------------------
    Component/s: storm-core

> Live tracing of tuples through the flow
> ---------------------------------------
>
>                 Key: STORM-158
>                 URL: https://issues.apache.org/jira/browse/STORM-158
>             Project: Apache Storm
>          Issue Type: New Feature
>          Components: storm-core
>            Reporter: James Xu
>            Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/531
> Storm should let you bless a record as a "tracer bullet", to be specially 
> reported on as it progresses through the flow. It's important that this be 
> completely transparent -- that is, I can unintrusively switch tracing on in 
> the flow, and that tracer bullets are real, live records (not 
> specially-crafted packets). The intent is that a small fraction of records be 
> tracer bullets. If you harm performance by passing too many through, that's 
> your fault for passing too many through.
> @maphysics is working on code to implement this --- we're finding it very 
> useful for debugging flows -- and so we'd like to see if this is 
> functionality you'd pull into the mainline or storm-contrib. (If this already 
> exists, please advise.)
> Pragmatically, what we're working on is a TracerHook that...
> * In 'prepare', captures some helpful information about the flow.
> * The hook points (emit, boltExecute, both Ack/Fails) look for a field called 
> '_trace'; if absent or null, they return immediately. Otherwise, the field is 
> a HashMap indicating that the tuple should be traced. (We're using a HashMap 
> to allow whoever designated the tuple for tracing to inject extra metadata 
> for the trace report)
> * If it's a tracer bullet, each hook point simply writes a verbose,
> helpfully-formatted biography of the record to the log (the execute hook is
> more verbose than the others), including:
>   * the component_id, sources & targets, etc. of the bolt/spout
>   * the hook point it hit ('emit', 'ack', etc.)
>   * the list of the output tuple's values in regular order
> * ...and records minimal provenance:
>   * The execute hook saves off (into a private variable on the hook) the
>     _trace field of the input tuple (a question about this is below).
>   * Calls to the emit hook take the saved _trace info and duplicate it into
>     the output tuple.
> Some Questions
> Should these be metrics or hooks? Right now, we're using the hook 
> functionality, not the metrics, because...
> * It wasn't clear how to inspect the tuple from the metric
> * The lifecycle of the metrics matches the bolt's, not the record's -- we'd 
> prefer as-rapid-as-reasonable reporting, tied to the tuple's progress
> * Beyond that, using metrics would basically require more spelunking to
> figure out. We'll follow up on the mailing list with questions on this.
> So it seems like a hook is the right thing, although cycling a trace trail
> back to nimbus has a lot of appeal.
> Are the hook points dependably executed in-order? That is, if the execute
> hook point for bolt A on tuple Q is invoked, can we depend on the calls to
> emit and then to ack/fail (until execute is called again) being a direct
> consequence of processing tuple Q? (The code seems to say yes, but can we
> treat that as part of the contract?)
> How do we transparently carry the trace info all the way through the topology
> -- yet not annotate every single bolt/spout with trace_info as a field?
> @maphysics is following up with a separate issue on this. We need to decorate 
> each tuple generated from processing a tracer bullet with a _trace field of 
> its own -- but without modifying the topology or its bolts.
> /cc @maphysics @kornypoet
> As mentioned, we'd eventually like to dispatch tracings to nimbus (or 
> somewhere central). Instead of metrics, another approach would send them to 
> an implicit 'tracings' stream, similar to the 'failure stream' mentioned in 
> #13. Has there been any progress on implicit failure streams?
> see also #146
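
For illustration only, the TracerHook behaviour described in the quoted issue
can be sketched roughly as follows. This is a minimal, self-contained Java
sketch, not the code @maphysics is working on, and it does not use Storm's
hook API. The class name TracerHookSketch, the use of a plain Map as a
stand-in for a tuple, and the in-memory log are all assumptions made for the
example. In particular, returning a modified copy of the tuple from emit is an
assumption; how to carry _trace forward transparently in Storm is the open
question raised in the issue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain Maps stand in for Storm tuples, and none of
// these names come from Storm itself.
class TracerHookSketch {
    static final String TRACE_FIELD = "_trace";

    private final String componentId;
    private final List<String> log = new ArrayList<>();
    // Trace metadata of the tuple most recently passed to boltExecute,
    // or null if that tuple was not a tracer bullet.
    private Map<String, Object> savedTrace;

    TracerHookSketch(String componentId) {
        this.componentId = componentId;
    }

    // Returns the trace metadata if the tuple is a tracer bullet, else null.
    @SuppressWarnings("unchecked")
    static Map<String, Object> traceOf(Map<String, Object> tuple) {
        Object trace = tuple.get(TRACE_FIELD);
        return (trace instanceof Map) ? (Map<String, Object>) trace : null;
    }

    // Execute hook point: remember the input's trace info, report if traced.
    void boltExecute(Map<String, Object> input) {
        savedTrace = traceOf(input);
        if (savedTrace != null) {
            report("execute", input);
        }
    }

    // Emit hook point: copy the saved trace info into the output tuple.
    // Untraced work returns the output unchanged and logs nothing.
    Map<String, Object> emit(Map<String, Object> output) {
        if (savedTrace == null) {
            return output;
        }
        Map<String, Object> traced = new HashMap<>(output);
        traced.put(TRACE_FIELD, new HashMap<>(savedTrace));
        report("emit", traced);
        return traced;
    }

    // Ack and fail hook points: report only for tracer bullets.
    void ack(Map<String, Object> input) {
        if (traceOf(input) != null) {
            report("ack", input);
        }
    }

    void fail(Map<String, Object> input) {
        if (traceOf(input) != null) {
            report("fail", input);
        }
    }

    // Stand-in for writing the "biography" of the record to a real log.
    private void report(String hookPoint, Map<String, Object> tuple) {
        log.add(componentId + " " + hookPoint + " " + tuple);
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        TracerHookSketch hook = new TracerHookSketch("boltA");
        Map<String, Object> meta = new HashMap<>();
        meta.put("reason", "debug");
        Map<String, Object> bullet = new HashMap<>();
        bullet.put("word", "storm");
        bullet.put(TRACE_FIELD, meta);

        hook.boltExecute(bullet);
        Map<String, Object> out = new HashMap<>();
        out.put("count", 1);
        hook.emit(out);
        hook.ack(bullet);
        hook.log().forEach(System.out::println);
    }
}
```

The sketch relies on the ordering assumption asked about in the issue: every
emit between two execute calls is attributed to the most recently executed
tuple. If that ordering is not part of the contract, the saved-trace approach
would need to be revisited.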



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
