[GitHub] [arrow-datafusion] alamb commented on issue #3033: Need to add an unique id to MetricsSet or ExecutionPlanMetricsSet

GitBox Fri, 05 Aug 2022 03:40:09 -0700


alamb commented on issue #3033:
URL: https://github.com/apache/arrow-datafusion/issues/3033#issuecomment-1206303912


   > A single session could have multiple statements running concurrently, and a 
statement_id is not enough to distinguish metrics either. 
   
   That is a good point. I still think `statement_id` is likely necessary. It 
sounds like it would also be prudent to add an `execution_plan_id` or 
`operator_id` to `ExecutionPlan` or `LogicalPlan` to properly identify the 
execution? I can't remember whether Ballista sends `dyn ExecutionPlan`s or 
`LogicalPlan`s to the executors. 
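
To make the idea of combining the two ids concrete, here is a minimal sketch, assuming a composite key of a statement id plus an operator id. `MetricsKey`, `SimpleMetrics`, and `record` are hypothetical names for illustration only; none of them are existing DataFusion or Ballista APIs, and `SimpleMetrics` is only a stand-in for a real `MetricsSet`.

```rust
use std::collections::HashMap;

/// Hypothetical composite key: neither field exists in DataFusion in this form.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct MetricsKey {
    statement_id: u64,
    operator_id: u32,
}

/// Stand-in for a real `MetricsSet`: only tracks an output row count.
#[derive(Debug, Default, Clone, PartialEq)]
struct SimpleMetrics {
    output_rows: u64,
}

/// Add the rows reported by one executor task to the entry for that operator.
/// Reports with the same (statement_id, operator_id) are summed together;
/// the same operator_id under a different statement stays separate.
fn record(
    registry: &mut HashMap<MetricsKey, SimpleMetrics>,
    key: MetricsKey,
    output_rows: u64,
) {
    registry.entry(key).or_default().output_rows += output_rows;
}
```

With a key like this, two concurrent statements in one session can both report metrics for operator 0 without the values being mixed.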
   
   > One solution is to leverage the array index. For example, after the 
Executor finishes a task, traverse the plan tree and collect all the MetricsSets 
into an Array of MetricsSet.
   
   Yeah, as you noted above, one potential issue with this approach is that it 
will only work if the exact same `ExecutionPlan` is created on each node. 
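
The array-index approach and its limitation can be sketched roughly as follows. This is an illustration under stated assumptions, not how DataFusion or Ballista implement it: `PlanNode` is a hypothetical, simplified stand-in for an `ExecutionPlan` tree, a single `output_rows` counter stands in for a full `MetricsSet`, and `collect_metrics` and `merge_by_index` are made-up helper names.

```rust
/// Hypothetical, simplified plan node; the real `ExecutionPlan` is a trait object.
#[derive(Debug, Clone)]
struct PlanNode {
    name: String,
    output_rows: u64, // stand-in for a full MetricsSet
    children: Vec<PlanNode>,
}

/// Pre-order traversal: an operator's position in `out` acts as its implicit id.
fn collect_metrics(node: &PlanNode, out: &mut Vec<(String, u64)>) {
    out.push((node.name.clone(), node.output_rows));
    for child in &node.children {
        collect_metrics(child, out);
    }
}

/// Merge two metric arrays position by position.
/// Returns `None` when the lengths or operator names differ, since the index
/// then no longer refers to the same operator on both sides. The name check
/// is only a partial guard; it cannot detect every difference in plan shape.
fn merge_by_index(
    a: &[(String, u64)],
    b: &[(String, u64)],
) -> Option<Vec<(String, u64)>> {
    if a.len() != b.len() || a.iter().zip(b).any(|(x, y)| x.0 != y.0) {
        return None;
    }
    Some(
        a.iter()
            .zip(b)
            .map(|(x, y)| (x.0.clone(), x.1 + y.1))
            .collect(),
    )
}
```

If two executors build differently shaped plans for the same stage, the indices stop lining up and the merge has to be rejected, which is why an explicit operator id may be more robust.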


