Weston Pace created ARROW-17023:
-----------------------------------
Summary: [C++] Add initial Acero design documents
Key: ARROW-17023
URL: https://issues.apache.org/jira/browse/ARROW-17023
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Weston Pace
Assignee: Weston Pace
As Acero grows in complexity, it will become difficult for new developers to
contribute meaningfully. In addition, Acero should be open for extension by
third-party developers who wish to add new exec nodes. These third-party
developers will need to know the details of how Acero schedules work and how it
operates, and will appreciate advice on efficient development. At a minimum,
this first pass should explain:
* Threading / scheduling model for Acero (note: there are proposals to enhance
the current model)
* Discussion of batch sizes and cache sizes and the morsel / batch model
* General discussion / advice for writing operators in a column-major way
* Design of the current nodes, in particular more detail on how expression
evaluation works and how the hash-join node operates