Weston Pace created ARROW-15238:
-----------------------------------
Summary: [C++] Create "engine" module for the query engine
Key: ARROW-15238
URL: https://issues.apache.org/jira/browse/ARROW-15238
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Weston Pace
Circular dependencies are popping up in the query engine because the compute
module sits at a very low level of the dependency graph. For example, it would
be nice if the default registry included the scan node and the dataset write
node, but both live in the datasets module, which itself depends on compute.
We will also want to add spillover support at some point, and that will rely
on parquet/dataset operations.
We should create a dedicated engine module that contains the query plans, the
nodes, etc. This module would not contain the kernels or other low-level
compute primitives. That would allow a dependency chain such as:
engine -> datasets (for scanning) -> parquet -> compute (for calculating
statistics)
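To make the proposed layering concrete, here is a minimal sketch of how a default registry in a top-level engine module could include scan and write nodes without compute ever depending on datasets. Every name in it (the `sketch` namespaces, `FactoryRegistry`, `MakeDefaultRegistry`, `DescribeScan`, `DescribeWrite`) is hypothetical and exists only for illustration; it is not Arrow's actual API or the eventual design.

```cpp
#include <algorithm>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the proposed layering; not Arrow's real API.
namespace sketch {

// Lowest layer: kernels only, no knowledge of files, datasets, or plans.
namespace compute {
// Precondition: values must be non-empty.
inline std::pair<int, int> MinMax(const std::vector<int>& values) {
  auto bounds = std::minmax_element(values.begin(), values.end());
  return {*bounds.first, *bounds.second};
}
}  // namespace compute

// Depends on compute (for calculating statistics).
namespace parquet {
struct ColumnStats {
  int min;
  int max;
};
inline ColumnStats ComputeStats(const std::vector<int>& column) {
  auto bounds = compute::MinMax(column);
  return {bounds.first, bounds.second};
}
}  // namespace parquet

// Depends on parquet (and, transitively, on compute).
namespace dataset {
inline std::string DescribeScan(const std::vector<int>& column) {
  parquet::ColumnStats stats = parquet::ComputeStats(column);
  return "scan[min=" + std::to_string(stats.min) +
         ",max=" + std::to_string(stats.max) + "]";
}
inline std::string DescribeWrite(const std::vector<int>& column) {
  return "write[rows=" + std::to_string(column.size()) + "]";
}
}  // namespace dataset

// Top layer: owns the plan nodes and the registry, so it may depend on
// every layer beneath it without creating a cycle.
namespace engine {
using NodeFactory = std::function<std::string(const std::vector<int>&)>;

class FactoryRegistry {
 public:
  void Add(const std::string& name, NodeFactory factory) {
    factories_[name] = std::move(factory);
  }
  bool Has(const std::string& name) const { return factories_.count(name) > 0; }
  // Throws std::out_of_range if no factory is registered under `name`.
  std::string Make(const std::string& name, const std::vector<int>& input) const {
    return factories_.at(name)(input);
  }

 private:
  std::map<std::string, NodeFactory> factories_;
};

// The default registry can include dataset-backed nodes because it is
// built in the engine layer rather than in compute.
inline FactoryRegistry MakeDefaultRegistry() {
  FactoryRegistry registry;
  registry.Add("scan", dataset::DescribeScan);
  registry.Add("write", dataset::DescribeWrite);
  return registry;
}
}  // namespace engine

}  // namespace sketch
```

In this toy version the include arrows only ever point downward (engine to dataset to parquet to compute), which is the property the proposal is after. Whether the registry type itself belongs in engine or stays in compute is left open here.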
The base ExecPlan itself could go in either compute or engine, whichever
causes the least friction.