alamb commented on issue #3033: URL: https://github.com/apache/arrow-datafusion/issues/3033#issuecomment-1206303912
> A single session could have multiple statements run concurrently. And a statement_id is not enough to differ metrics either.

That is a good point. I still think `statement_id` is likely necessary. It sounds like it would also be prudent to add an `execution_plan_id` or `operator_id` to `ExecutionPlan` or `LogicalPlan` as well, to properly identify the execution? I can't remember if Ballista sends `dyn ExecutionPlans` or `LogicalPlans` to the executors.

> One solution is to leverage the array index. For example after the Executor finish a task, traverse the plan tree and collect all the MetricsSets into an Array of MetricsSet.

Yeah, as you noted above, one potential issue with this approach is that it will only work if the exact same `ExecutionPlan` is created on each node.
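
To make the trade-off concrete, here is a rough sketch of the two ways of identifying metrics discussed above: by position in a pre-order traversal (the array-index idea) versus by an explicit per-operator id. Everything in it is hypothetical: `PlanNode`, `Metrics`, `operator_id`, and the helper functions are stand-ins invented for illustration, not DataFusion's or Ballista's actual `ExecutionPlan` or `MetricsSet` types. It assumes the executor ends up with one extra operator that the scheduler's copy of the plan does not have.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a per-operator metrics set.
#[derive(Debug, Clone, PartialEq)]
struct Metrics {
    output_rows: usize,
}

// Hypothetical stand-in for a physical plan node.
#[derive(Debug)]
struct PlanNode {
    operator_id: u32, // assumed stable id assigned when the plan is created
    name: &'static str,
    metrics: Metrics,
    children: Vec<PlanNode>,
}

fn node(operator_id: u32, name: &'static str, output_rows: usize, children: Vec<PlanNode>) -> PlanNode {
    PlanNode { operator_id, name, metrics: Metrics { output_rows }, children }
}

/// Array-index approach: pre-order traversal, the position in `out` is the
/// only thing that identifies which operator a metrics entry belongs to.
fn collect_by_index(plan: &PlanNode, out: &mut Vec<Metrics>) {
    out.push(plan.metrics.clone());
    for child in &plan.children {
        collect_by_index(child, out);
    }
}

/// Id approach: each entry is keyed by the operator's own id, so the shape
/// of the tree no longer matters when matching entries up.
fn collect_by_id(plan: &PlanNode, out: &mut HashMap<u32, Metrics>) {
    out.insert(plan.operator_id, plan.metrics.clone());
    for child in &plan.children {
        collect_by_id(child, out);
    }
}

/// Operator names in the same pre-order used by `collect_by_index`.
fn preorder_names(plan: &PlanNode, out: &mut Vec<&'static str>) {
    out.push(plan.name);
    for child in &plan.children {
        preorder_names(child, out);
    }
}

/// Plan as the scheduler sees it: Projection -> Filter -> Scan (not yet run).
fn scheduler_plan() -> PlanNode {
    node(1, "Projection", 0, vec![node(2, "Filter", 0, vec![node(3, "Scan", 0, vec![])])])
}

/// Plan as built on the executor, with one extra operator (id 4) inserted.
fn executor_plan() -> PlanNode {
    node(1, "Projection", 40, vec![
        node(4, "CoalesceBatches", 40, vec![
            node(2, "Filter", 40, vec![node(3, "Scan", 100, vec![])]),
        ]),
    ])
}
```

With these two plans, index 2 means `Scan` in the scheduler's traversal but `Filter` in the executor's, so the scheduler would attribute the filter's row count to the scan. Looking up id 3 in the map returned by `collect_by_id` still finds the scan's metrics. That is the sense in which the index approach only works when the exact same plan is created on each node.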
