Dandandan commented on a change in pull request #1029:
URL: https://github.com/apache/arrow-datafusion/pull/1029#discussion_r712359785

##########
File path: datafusion/src/physical_plan/mod.rs
##########
@@ -308,8 +310,38 @@ pub fn visit_execution_plan<V: ExecutionPlanVisitor>(
 /// Execute the [ExecutionPlan] and collect the results in memory
 pub async fn collect(plan: Arc<dyn ExecutionPlan>) -> Result<Vec<RecordBatch>> {
-    let stream = execute_stream(plan).await?;
-    common::collect(stream).await
+    let stream = execute_stream(plan.clone()).await?;
+    let any_plan = plan.as_any().downcast_ref::<UnionExec>();

Review comment:
Any change to how `UnionExec` executes should be made in `UnionExec` itself rather than here in `collect`. I suggest implementing it using the plan we have. If a more efficient implementation is possible, I think the best way would be to put it in a new node, i.e. `UnionDistinctExec` and
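
To illustrate the second suggestion (a dedicated node instead of a downcast inside `collect`), here is a minimal, self-contained sketch. It does not use DataFusion's real API: the `Plan` trait, the `ValuesExec` leaf, the synchronous `execute` signature, and the `Vec<i64>` "batch" type are all simplified assumptions made for illustration. `UnionDistinctExec` is the name proposed in the comment, and its body here is hypothetical.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical stand-in for DataFusion's `ExecutionPlan`: a "batch" is just
/// a Vec<i64> and execution is synchronous, to keep the sketch self-contained.
trait Plan {
    fn execute(&self) -> Vec<i64>;
}

/// Hypothetical leaf node that returns a fixed set of rows.
struct ValuesExec {
    rows: Vec<i64>,
}

impl Plan for ValuesExec {
    fn execute(&self) -> Vec<i64> {
        self.rows.clone()
    }
}

/// UNION ALL: concatenates the output of every input, keeping duplicates.
struct UnionExec {
    inputs: Vec<Arc<dyn Plan>>,
}

impl Plan for UnionExec {
    fn execute(&self) -> Vec<i64> {
        self.inputs.iter().flat_map(|p| p.execute()).collect()
    }
}

/// Hypothetical separate node for UNION DISTINCT: the deduplication logic
/// lives in the node, not in the generic driver.
struct UnionDistinctExec {
    inputs: Vec<Arc<dyn Plan>>,
}

impl Plan for UnionDistinctExec {
    fn execute(&self) -> Vec<i64> {
        let mut seen = HashSet::new();
        self.inputs
            .iter()
            .flat_map(|p| p.execute())
            // `insert` returns false for a value already seen, so only the
            // first occurrence of each row is kept, in input order.
            .filter(|row| seen.insert(*row))
            .collect()
    }
}

/// Generic driver: no downcasting, it only calls the node's own `execute`.
fn collect(plan: Arc<dyn Plan>) -> Vec<i64> {
    plan.execute()
}
```

With this shape, `collect` stays identical for every node type, and whichever layer builds the plan picks `UnionExec` or `UnionDistinctExec`. The first suggestion in the comment, building the behaviour from the plan that already exists, would need no new node at all and is not shown here.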