alamb commented on issue #19215: URL: https://github.com/apache/datafusion/issues/19215#issuecomment-3985120997
Thanks @nuno-faria Form my perspective, having an "auto explain" feature is super valuable. However, writing to `stdout` / `stderr` as proposed in https://github.com/apache/datafusion/pull/19316 doesn't make sense to put in the core datafusion library because: 1. DataFusion can be embedded, and runs on many different environments where `stdout` may be being used for something else (e.g. logging). Instead, it seems to me that auto explain belongs in the client applications themselves (e.g. `datafusion-cli`) That being said, I think it is a valuable feature for many client applications, so including help / support in the core or perhaps as an example makes sense. Specifically I think it is important to separate the logic of where the output will go. Perhaps some sort of Observer pattern: ```rust trait PlanObserver { fn planned(id: ..., plan: &Arc<dyn ExecutionPlan>); fn plan_executed(id: ..., plan: &Arc<dyn ExecutionPlan>); } ``` let observer = ... // some sort of trait object that can receive explain plans let ctx = SessionContext::new(....) .with_plan_observer(Arc::new(explain_output)); // And now the callback in `explain_output` is invoked whenever the plan is run ``` Then, `datafusion-cli` could implement the observer if/when necessary This design would also make it easy to capture post execution metrics as well, for example -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
