geoffreyclaude opened a new pull request, #16469: URL: https://github.com/apache/datafusion/pull/16469
## Which issue does this PR close?

This PR addresses the extensibility limitation that causes recursive queries to fail when used with the `datafusion-tracing` crate, as reported in [datafusion-contrib/datafusion-tracing#5](https://github.com/datafusion-contrib/datafusion-tracing/issues/5).

While there isn't a specific DataFusion issue filed, this change enables external crates like `datafusion-tracing` to properly support recursive queries by implementing work table injection in their wrapper nodes.

## Rationale for this change

Currently, recursive query execution in `RecursiveQueryExec` uses downcasting to find `WorkTableExec` nodes and inject work tables. This approach fails when execution plans are wrapped by custom execution nodes, as is done by the `datafusion-tracing` crate, causing recursive queries to fail with "Unexpected empty work table" errors.

The fundamental issue is that `RecursiveQueryExec` assumes direct access to concrete `WorkTableExec` types, but wrapper nodes (like those used for tracing) break this assumption. This limits DataFusion's extensibility for recursive queries and prevents external crates from properly supporting this feature.

## What changes are included in this PR?

- **Add `with_work_table` as a public trait method** on `ExecutionPlan` with a default implementation returning `None`
- **Update `WorkTableExec`** to implement the trait method, creating a new instance with the provided work table
- **Modify `assign_work_table` logic** in `RecursiveQueryExec` to use the trait method instead of downcasting, enabling support for wrapper nodes
- **Add comprehensive documentation** explaining the purpose and usage of `with_work_table` in both the trait and implementation
- **Publicly re-export `WorkTable`** so external implementors can access it without knowing internal module structure
- **Include cross-referenced documentation links** for better discoverability

## Are these changes tested?
The changes are covered by existing tests:

- All existing recursive query tests continue to pass, ensuring backward compatibility
- The `WorkTableExec` implementation is tested through existing recursive query execution paths
- The trait method follows the same logic as the previous direct implementation

This change enables functionality that was previously broken (recursive queries with instrumentation), so it fixes existing test scenarios in external crates rather than requiring new DataFusion-specific tests.

Compiling the `datafusion-tracing` crate against this branch locally enables full instrumentation of recursive queries, with one "span group" per recursion, as demonstrated by the Jaeger screenshot below:

[Figure: Jaeger screenshot of an instrumented recursive query]

## Are there any user-facing changes?

**API Addition (Non-breaking):**

- Adds `with_work_table` method to the `ExecutionPlan` trait with a default implementation
- Publicly exports `WorkTable` from the `datafusion-physical-plan` crate

**Benefits for Users:**

- External crates can now implement `with_work_table` in their custom execution plans to support recursive queries
- Instrumentation and wrapper nodes can properly participate in recursive query execution
- Enables recursive queries to work with tracing, monitoring, and other execution plan decorators

**No Breaking Changes:**

- All existing code continues to work unchanged
- The new trait method has a default implementation, so existing `ExecutionPlan` implementations don't need updates
- Internal behavior for `WorkTableExec` remains identical

This change primarily enables extensibility that was previously impossible, resolving compatibility issues between recursive queries and external crates.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
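For readers who want to see the delegation idea from the PR description in isolation, here is a minimal, self-contained sketch. It is a toy model, not DataFusion code: the `ExecutionPlan` trait below is a heavily reduced stand-in, the `with_work_table` signature is an assumption based on the description above, and `WorkTable`, `FilterExec`, `WrapperExec`, `describe`, and `demo` are hypothetical names introduced only for illustration.

```rust
use std::sync::Arc;

/// Toy stand-in for the shared work table (assumption: not the real `WorkTable`).
struct WorkTable {
    label: String,
}

/// Reduced model of the plan trait; only what this sketch needs.
trait ExecutionPlan {
    /// Renders this node and its subtree as text (hypothetical helper).
    fn describe(&self) -> String;
    fn children(&self) -> Vec<Arc<dyn ExecutionPlan>>;
    fn with_new_children(&self, children: Vec<Arc<dyn ExecutionPlan>>) -> Arc<dyn ExecutionPlan>;
    /// Default: this node does not consume a work table.
    fn with_work_table(&self, _table: Arc<WorkTable>) -> Option<Arc<dyn ExecutionPlan>> {
        None
    }
}

/// Leaf node that reads from the work table.
struct WorkTableExec {
    table: Option<Arc<WorkTable>>,
}

impl ExecutionPlan for WorkTableExec {
    fn describe(&self) -> String {
        match &self.table {
            Some(t) => format!("WorkTableExec[{}]", t.label),
            None => "WorkTableExec[empty]".to_string(),
        }
    }
    fn children(&self) -> Vec<Arc<dyn ExecutionPlan>> {
        vec![]
    }
    fn with_new_children(&self, _children: Vec<Arc<dyn ExecutionPlan>>) -> Arc<dyn ExecutionPlan> {
        Arc::new(WorkTableExec { table: self.table.clone() })
    }
    /// Returns a fresh node bound to the provided work table.
    fn with_work_table(&self, table: Arc<WorkTable>) -> Option<Arc<dyn ExecutionPlan>> {
        Some(Arc::new(WorkTableExec { table: Some(table) }))
    }
}

/// Ordinary single-child node that relies on the default `with_work_table`.
struct FilterExec {
    input: Arc<dyn ExecutionPlan>,
}

impl ExecutionPlan for FilterExec {
    fn describe(&self) -> String {
        format!("Filter({})", self.input.describe())
    }
    fn children(&self) -> Vec<Arc<dyn ExecutionPlan>> {
        vec![Arc::clone(&self.input)]
    }
    fn with_new_children(&self, mut children: Vec<Arc<dyn ExecutionPlan>>) -> Arc<dyn ExecutionPlan> {
        Arc::new(FilterExec { input: children.remove(0) })
    }
}

/// Hypothetical decorator (for example a tracing wrapper) hiding a concrete node.
struct WrapperExec {
    inner: Arc<dyn ExecutionPlan>,
}

impl ExecutionPlan for WrapperExec {
    fn describe(&self) -> String {
        format!("Wrapper({})", self.inner.describe())
    }
    fn children(&self) -> Vec<Arc<dyn ExecutionPlan>> {
        self.inner.children()
    }
    fn with_new_children(&self, children: Vec<Arc<dyn ExecutionPlan>>) -> Arc<dyn ExecutionPlan> {
        Arc::new(WrapperExec { inner: self.inner.with_new_children(children) })
    }
    /// Delegates to the wrapped node and re-wraps the result, so no downcast is needed.
    fn with_work_table(&self, table: Arc<WorkTable>) -> Option<Arc<dyn ExecutionPlan>> {
        self.inner
            .with_work_table(table)
            .map(|inner| Arc::new(WrapperExec { inner }) as Arc<dyn ExecutionPlan>)
    }
}

/// Walks the plan and injects the work table via the trait method.
fn assign_work_table(plan: Arc<dyn ExecutionPlan>, table: &Arc<WorkTable>) -> Arc<dyn ExecutionPlan> {
    if let Some(updated) = plan.with_work_table(Arc::clone(table)) {
        return updated;
    }
    let children = plan.children();
    if children.is_empty() {
        return plan;
    }
    let new_children: Vec<_> = children
        .into_iter()
        .map(|child| assign_work_table(child, table))
        .collect();
    plan.with_new_children(new_children)
}

/// Builds Filter -> Wrapper -> WorkTableExec and injects a work table.
fn demo() -> String {
    let plan: Arc<dyn ExecutionPlan> = Arc::new(FilterExec {
        input: Arc::new(WrapperExec {
            inner: Arc::new(WorkTableExec { table: None }),
        }),
    });
    let table = Arc::new(WorkTable { label: "iteration-1".to_string() });
    assign_work_table(plan, &table).describe()
}
```

In this model, a lookup that downcasts to `WorkTableExec` would stop at `WrapperExec` and leave the work table empty, whereas the trait method lets the wrapper forward the call. Calling `demo()` yields `Filter(Wrapper(WorkTableExec[iteration-1]))`, showing that the wrapper is preserved around the updated leaf.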
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org