geoffreyclaude commented on issue #9415:
URL: https://github.com/apache/datafusion/issues/9415#issuecomment-2634103915

   Hi,
   
   Using internal metrics to generate tracing spans on query completion is a 
great solution. However it prevents tracing downstream data sources, which are 
often more complex than an on-disk Parquet read.
   
   I've been experimenting with tracing a full execution plan by injecting a 
custom "TracingExec" node before every execution node in the plan. Its 
`execute` method starts a new span with the child's name (using Open 
Telemetry's `otel.name` field), and streams the child result decorated with the 
`tracing` crate's `in_current_span()` through it. It also reports all child 
metrics as fields on the span, which is super nice when exploring the generated 
flame graph.
   
   The huge advantage of this method is that it avoids needing to modify every 
execution node's `execute` method, allows propagating the trace context 
downstream, and could work as an independent DataFusion "contrib" crate once 
the code is tidied up.
   
   However, there is still a major blocking point in the current DataFusion 
code: spawning tasks in new threads. Whenever a task is spawned in a new thread 
(eg, inside a `RepartitionExec`, or inside the `ParquetSink`'s `write_all` 
method), the context is lost (because the tracing context is thread-local 
only.) I've worked around this best I could by manually tracking the spans 
inside the "TracingExec", but it's a pretty wonky workaround. It also fails 
when the source nodes of the plan spawn new tasks, as in the `ParquetSink`, 
which prevent the underlying `object_store` from linking its spans to the main 
trace.
   
   Even without spans, it would be useful to be able to link logs generated 
deep in the plan (by a custom `object_store` for instance) to a given trace, 
which would also need access to the trace context.
   
   I'd like to propose a very simple change first, before going full `tracing` 
everywhere: wrapping all task spawn points with `.in_current_span()`. This 
ensures that when a task moves to a different thread, it carries along its 
current tracing context. It could be protected behind a `tracing` feature, so 
users who don't want tracing don't need to import the crate. I believe there 
are roughly 20 places in the DataFusion repo that would need this change, so 
fairly small change overall. I'd be happy to submit a PR if we can agree on 
this first step!
   
   For instance, in `physical-plan/src/stream.rs` (untested code!):
   ```rust
       #[cfg(feature = "tracing")]
       use tracing_futures::Instrument;
       /// Spawn task that will be aborted if this builder (or the stream
       /// built from it) are dropped
       pub fn spawn<F>(&mut self, task: F)
       where
           F: Future<Output = Result<()>>,
           F: Send + 'static,
       {
           #[cfg(feature = "tracing")]
           let task = task.in_current_span();
           self.join_set.spawn(task);
       }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to