Andy Grove created ARROW-11058:
----------------------------------

             Summary: [Rust] [DataFusion] Implement "coalesce batches" operator
                 Key: ARROW-11058
                 URL: https://issues.apache.org/jira/browse/ARROW-11058
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Rust - DataFusion
            Reporter: Andy Grove
            Assignee: Andy Grove
             Fix For: 3.0.0


When a plan contains a FilterExec, the filter can produce many small batches, 
and we therefore lose the efficiency of vectorized operations.

We should implement a new CoalesceBatchExec and wrap every FilterExec with one 
of these, so that small batches are recombined into larger batches and the 
operators that consume the filter's output run more efficiently.
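
As a rough illustration of the buffering logic such an operator might use, here
is a minimal sketch. It is not the DataFusion implementation: `Batch` is a
hypothetical stand-in (a single column of integers) for an Arrow record batch,
and `Coalescer`, `coalesce`, and `target_batch_size` are names assumed purely
for this example.

```rust
/// Hypothetical stand-in for an Arrow record batch: one column of values.
type Batch = Vec<i64>;

/// Buffers small input batches and emits one combined batch once at least
/// `target_batch_size` rows have accumulated. (Illustrative names only.)
struct Coalescer {
    target_batch_size: usize,
    buffered: Vec<Batch>,
    buffered_rows: usize,
}

impl Coalescer {
    fn new(target_batch_size: usize) -> Self {
        Coalescer {
            target_batch_size,
            buffered: Vec::new(),
            buffered_rows: 0,
        }
    }

    /// Accepts one input batch. Returns a combined batch when enough rows
    /// are buffered, otherwise `None`. Empty batches are dropped.
    fn push(&mut self, batch: Batch) -> Option<Batch> {
        if batch.is_empty() {
            return None;
        }
        self.buffered_rows += batch.len();
        self.buffered.push(batch);
        if self.buffered_rows >= self.target_batch_size {
            self.flush()
        } else {
            None
        }
    }

    /// Emits whatever is still buffered, for use at end of input.
    fn flush(&mut self) -> Option<Batch> {
        if self.buffered.is_empty() {
            return None;
        }
        let mut combined = Vec::with_capacity(self.buffered_rows);
        for batch in self.buffered.drain(..) {
            combined.extend(batch);
        }
        self.buffered_rows = 0;
        Some(combined)
    }
}

/// Runs a whole input stream through the coalescer.
fn coalesce(input: Vec<Batch>, target_batch_size: usize) -> Vec<Batch> {
    let mut coalescer = Coalescer::new(target_batch_size);
    let mut output = Vec::new();
    for batch in input {
        if let Some(combined) = coalescer.push(batch) {
            output.push(combined);
        }
    }
    if let Some(rest) = coalescer.flush() {
        output.push(rest);
    }
    output
}
```

In this sketch a combined batch may exceed the target size, because input
batches are never split; whether a real operator should split oversized output
is a design choice this issue does not specify.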


