[ 
https://issues.apache.org/jira/browse/ARROW-15238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-15238:
-----------------------------------
    Labels: pull-request-available query-engine  (was: query-engine)

> [C++] Create "engine" module for the query engine
> -------------------------------------------------
>
>                 Key: ARROW-15238
>                 URL: https://issues.apache.org/jira/browse/ARROW-15238
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Weston Pace
>            Assignee: Weston Pace
>            Priority: Major
>              Labels: pull-request-available, query-engine
>             Fix For: 8.0.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Circular dependencies are appearing in the query engine because the compute
> module is very low level.  For example, it would be nice if the default
> registry included the scan node and the dataset write node.  We will also
> want to add spillover support at some point, and that will rely on
> parquet/dataset operations.
> We should create a dedicated engine module that contains the query plans,
> the nodes, etc.  This module would not contain the kernels or other low-level
> compute primitives.  The dependency chain could then look something like:
> engine -> datasets (for scanning) -> parquet -> compute (for calculating
> statistics)
> The base ExecPlan itself could go in either compute or engine, whichever
> causes the least friction.
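
The layering proposed above can be checked mechanically. The following is a minimal sketch rather than anything from the Arrow build: the module names are taken from the description, while the PROPOSED and CURRENT graphs and the build_order helper are illustrative assumptions. CURRENT models the case where compute would need datasets in order to register the scan node by default.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical module graphs: each key depends on the modules in its value set.
# Proposed layering: engine -> datasets -> parquet -> compute.
PROPOSED = {
    "engine": {"datasets"},
    "datasets": {"parquet"},
    "parquet": {"compute"},
    "compute": set(),
}

# Assumed "before" picture: the plan nodes live in compute, so a default
# registry that includes the scan node makes compute depend on datasets.
CURRENT = {
    "datasets": {"parquet"},
    "parquet": {"compute"},
    "compute": {"datasets"},
}


def build_order(deps):
    """Return modules with dependencies first, or None if there is a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


if __name__ == "__main__":
    print("proposed:", build_order(PROPOSED))
    print("current: ", build_order(CURRENT))
```

With a separate engine module on top, a valid build order exists; in the assumed current layout the helper returns None because compute, datasets, and parquet form a cycle.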



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
