James Xu created STORM-158:
------------------------------
Summary: Live tracing of tuples through the flow
Key: STORM-158
URL: https://issues.apache.org/jira/browse/STORM-158
Project: Apache Storm (Incubating)
Issue Type: New Feature
Reporter: James Xu
Priority: Minor
https://github.com/nathanmarz/storm/issues/531
Storm should let you bless a record as a "tracer bullet", to be specially
reported on as it progresses through the flow. It's important that this be
completely transparent -- that is, I can unobtrusively switch tracing on in the
flow, and that tracer bullets are real, live records (not specially-crafted
packets). The intent is that a small fraction of records be tracer bullets. If
you harm performance by passing too many through, that's your fault for passing
too many through.
@maphysics is working on code to implement this -- we're finding it very
useful for debugging flows -- and so we'd like to see if this is functionality
you'd pull into the mainline or storm-contrib. (If this already exists, please
advise.)
Pragmatically, what we're working on is a TracerHook that...
* In 'prepare', captures some helpful information about the flow.
* The hook points (emit, boltExecute, both Ack/Fails) look for a field called
'_trace'; if absent or null, they return immediately. Otherwise, the field is a
HashMap indicating that the tuple should be traced. (We're using a HashMap to
allow whoever designated the tuple for tracing to inject extra metadata for the
trace report)
* If it's a tracer bullet, each hook point simply writes a verbose,
helpfully-formatted biography of the record to the log (the execute hook is
more verbose than the others):
  * the component_id, sources & targets, etc. of the bolt/spout
  * the hook point it hit ('emit', 'ack', etc.)
  * list of the output tuple's values in regular order
* ...and records minimal provenance:
  * The execute hook saves off (into a private variable on the hook) the _trace
  field of the input tuple (a question about this is below)
  * calls to the emit hook take the saved _trace info and duplicate it into the
  output tuple
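To make the hook behaviour above concrete, here is a minimal sketch in plain Java. It is not the actual TracerHook code and does not use Storm's classes: SketchTuple, TracerHookSketch, and the in-memory list standing in for the log are all hypothetical names invented for illustration. A real hook would implement Storm's ITaskHook and read the tuple from the info objects Storm passes in. Only the bolt-side ack/fail points are modelled here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for a tuple: field name -> value. NOT Storm's Tuple class.
final class SketchTuple {
    final Map<String, Object> fields = new HashMap<>();

    SketchTuple with(String name, Object value) {
        fields.put(name, value);
        return this;
    }
}

// Hypothetical hook; method names echo the hook points listed above.
final class TracerHookSketch {
    static final String TRACE_FIELD = "_trace";

    private final String componentId;
    private final List<String> lines = new ArrayList<>(); // stands in for a real logger
    private Map<String, Object> savedTrace;               // provenance from the last execute

    TracerHookSketch(String componentId) {
        this.componentId = componentId;
    }

    // Returns the trace metadata, or null when the tuple is not a tracer bullet.
    // Assumption: a non-Map value is treated the same as an absent field.
    @SuppressWarnings("unchecked")
    private static Map<String, Object> traceOf(SketchTuple tuple) {
        Object value = tuple.fields.get(TRACE_FIELD);
        return (value instanceof Map) ? (Map<String, Object>) value : null;
    }

    void boltExecute(SketchTuple input) {
        savedTrace = traceOf(input); // an untraced input clears the saved trace
        if (savedTrace != null) {
            report("execute", input);
        }
    }

    void emit(SketchTuple output) {
        if (savedTrace == null) {
            return; // current input is not a tracer bullet
        }
        // Duplicate the saved metadata so the output tuple is traced downstream.
        output.fields.put(TRACE_FIELD, new HashMap<>(savedTrace));
        report("emit", output);
    }

    void boltAck(SketchTuple input) {
        if (traceOf(input) != null) {
            report("ack", input);
        }
    }

    void boltFail(SketchTuple input) {
        if (traceOf(input) != null) {
            report("fail", input);
        }
    }

    // Appends one report line: component id, hook point, and the tuple's fields.
    private void report(String hookPoint, SketchTuple tuple) {
        lines.add(componentId + " " + hookPoint + " " + tuple.fields);
    }

    List<String> logLines() {
        return lines;
    }
}
```

In this sketch an untraced input costs one map lookup per hook point and produces no report, while a traced input produces one line per hook point it hits and passes a copy of its metadata to every tuple emitted before the next execute.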
Some Questions
Should these be metrics or hooks? Right now, we're using the hook
functionality, not the metrics, because...
* It wasn't clear how to inspect the tuple from the metric
* The lifecycle of the metrics matches the bolt's, not the record's -- we'd
prefer as-rapid-as-reasonable reporting, tied to the tuple's progress
* More basically, using metrics would require more spelunking to figure out.
We'll follow up on the mailing list with questions on this.
So it seems like a hook is the right thing, although cycling a trace trail back
to nimbus has a lot of appeal.
Are the hook points dependably executed in order? That is, if the execute hook
point for bolt A on tuple Q is invoked, can we depend on it that (until execute
is called again) the calls to emit and then to ack/fail are a direct consequence
of processing tuple Q? (The code seems to say yes, but can we treat that as part
of the contract?)
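The reason this ordering matters can be shown with a small model. This is only an illustration of the assumption, not a statement about what Storm guarantees: LastInputTracker and OrderingDemo are hypothetical names, and the "deferred" case simply imagines a bolt that emits output for tuple Q after execute has already been called for an untraced tuple R.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a hook that remembers only the most recent input's trace.
final class LastInputTracker {
    private Map<String, Object> savedTrace; // null means the current input is untraced

    void onExecute(Map<String, Object> inputTrace) {
        savedTrace = inputTrace;
    }

    // Returns the trace that would be copied onto a tuple emitted right now.
    Map<String, Object> traceForEmit() {
        return savedTrace == null ? null : new HashMap<>(savedTrace);
    }
}

final class OrderingDemo {
    public static void main(String[] args) {
        LastInputTracker hook = new LastInputTracker();
        Map<String, Object> traceQ = new HashMap<>();
        traceQ.put("id", "Q");

        // In order: execute(Q), then emit. The output inherits Q's trace.
        hook.onExecute(traceQ);
        System.out.println("in order: " + hook.traceForEmit()); // prints "in order: {id=Q}"

        // Deferred: execute(Q), execute(R, untraced), then the emit caused by Q.
        hook.onExecute(traceQ);
        hook.onExecute(null);
        System.out.println("deferred: " + hook.traceForEmit()); // prints "deferred: null"
    }
}
```

With a single saved variable, the deferred emit loses Q's trace. The reverse order would wrongly attach a trace to an unrelated output. That is why we'd like the ordering to be part of the contract and not just an observation about the current code.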
How do we transparently carry the trace info all the way through the topology
-- yet not annotate every single bolt/spout with trace_info as a field?
@maphysics is following up with a separate issue on this. We need to decorate
each tuple generated from processing a tracer bullet with a _trace field of its
own -- but without modifying the topology or its bolts.
/cc @maphysics @kornypoet
As mentioned, we'd eventually like to dispatch tracings to nimbus (or somewhere
central). Instead of metrics, another approach would send them to an implicit
'tracings' stream, similar to the 'failure stream' mentioned in #13. Has there
been any progress on implicit failure streams?
see also #146