geoffreyclaude commented on issue #9415:
URL: https://github.com/apache/datafusion/issues/9415#issuecomment-2634103915
Hi,
Using internal metrics to generate tracing spans on query completion is a
great solution. However it prevents tracing downstream data sources, which are
often more complex than an on-disk Parquet read.
I've been experimenting with tracing a full execution plan by injecting a
custom "TracingExec" node before every execution node in the plan. Its
`execute` method starts a new span with the child's name (using Open
Telemetry's `otel.name` field), and streams the child result decorated with the
`tracing` crate's `in_current_span()` through it. It also reports all child
metrics as fields on the span, which is super nice when exploring the generated
flame graph.
The huge advantage of this method is that it avoids needing to modify every
execution node's `execute` method, allows propagating the trace context
downstream, and could work as an independent DataFusion "contrib" crate once
the code is tidied up.
However, there is still a major blocking point in the current DataFusion
code: spawning tasks in new threads. Whenever a task is spawned in a new thread
(eg, inside a `RepartitionExec`, or inside the `ParquetSink`'s `write_all`
method), the context is lost (because the tracing context is thread-local
only.) I've worked around this best I could by manually tracking the spans
inside the "TracingExec", but it's a pretty wonky workaround. It also fails
when the source nodes of the plan spawn new tasks, as in the `ParquetSink`,
which prevent the underlying `object_store` from linking its spans to the main
trace.
Even without spans, it would be useful to be able to link logs generated
deep in the plan (by a custom `object_store` for instance) to a given trace,
which would also need access to the trace context.
I'd like to propose a very simple change first, before going full `tracing`
everywhere: wrapping all task spawn points with `.in_current_span()`. This
ensures that when a task moves to a different thread, it carries along its
current tracing context. It could be protected behind a `tracing` feature, so
users who don't want tracing don't need to import the crate. I believe there
are roughly 20 places in the DataFusion repo that would need this change, so
fairly small change overall. I'd be happy to submit a PR if we can agree on
this first step!
For instance, in `physical-plan/src/stream.rs` (untested code!):
```rust
#[cfg(feature = "tracing")]
use tracing_futures::Instrument;
/// Spawn task that will be aborted if this builder (or the stream
/// built from it) are dropped
pub fn spawn<F>(&mut self, task: F)
where
F: Future<Output = Result<()>>,
F: Send + 'static,
{
#[cfg(feature = "tracing")]
let task = task.in_current_span();
self.join_set.spawn(task);
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]