Ben Kietzman created ARROW-10322:
------------------------------------
Summary: [C++][Dataset] Minimize Expression to a wrapper around
compute::Function
Key: ARROW-10322
URL: https://issues.apache.org/jira/browse/ARROW-10322
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Affects Versions: 1.0.1
Reporter: Ben Kietzman
Assignee: Ben Kietzman
Fix For: 3.0.0
The Expression class hierarchy was originally intended to provide generic,
structured representations of compute functionality. As a generic
representation it has been superseded by compute::{Function, Kernel, ...},
which encapsulate validation and execution. In light of this, Expression can
be drastically simplified and improved by composition with these classes. Each
responsibility that can be deferred to them means less boilerplate when
exposing a new compute function for use in datasets. Ideally, any compute
function will be immediately available for use in a filter or projection.
{code}
struct Expression {
  using Literal = std::shared_ptr<Scalar>;

  struct Projection {
    std::vector<std::string> names;
    std::vector<Expression> values;
  };

  struct Call {
    std::shared_ptr<ScalarFunction> function;
    std::shared_ptr<FunctionOptions> options;
    std::vector<Expression> arguments;
  };

  util::variant<Literal, FieldRef, Projection, Call> value;
};
{code}
A simple discriminated union as above should be sufficient to represent
arbitrary filters and projections: any expression which results in type
{{bool}} is a valid filter, and any expression which is a {{Projection}} may be
used to map one record batch to another.
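As a rough illustration of the two rules above, the following standalone C++17
sketch models the union with std::variant. It is not Arrow code: Datum, Row,
the string function names ("add", "greater"), and the helpers Evaluate, Matches
and Project are hypothetical stand-ins for Scalar, one row of a record batch,
ScalarFunction lookup, and kernel execution.

```cpp
// Standalone sketch, not Arrow code. Datum, Row, the string-keyed function
// names, Evaluate, Matches and Project are hypothetical stand-ins.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

using Datum = std::variant<bool, std::int64_t>;  // stands in for shared_ptr<Scalar>
using Row = std::map<std::string, Datum>;        // stands in for one batch row

struct Expression;  // forward declaration: alternatives hold nested expressions

struct FieldRef {
  std::string name;
};
struct Projection {
  std::vector<std::string> names;
  std::vector<Expression> values;
};
struct Call {
  std::string function;  // stands in for shared_ptr<ScalarFunction>
  std::vector<Expression> arguments;
};
struct Expression {
  std::variant<Datum, FieldRef, Projection, Call> value;
};

// Evaluates a scalar (non-Projection) expression against one row.
Datum Evaluate(const Expression& expr, const Row& row) {
  if (auto* lit = std::get_if<Datum>(&expr.value)) return *lit;
  if (auto* ref = std::get_if<FieldRef>(&expr.value)) return row.at(ref->name);
  if (auto* call = std::get_if<Call>(&expr.value)) {
    assert(call->arguments.size() == 2);  // the sketch only has binary functions
    Datum lhs = Evaluate(call->arguments[0], row);
    Datum rhs = Evaluate(call->arguments[1], row);
    if (call->function == "add") {
      return std::get<std::int64_t>(lhs) + std::get<std::int64_t>(rhs);
    }
    if (call->function == "greater") {
      return std::get<std::int64_t>(lhs) > std::get<std::int64_t>(rhs);
    }
    throw std::invalid_argument("unknown function: " + call->function);
  }
  throw std::invalid_argument("a Projection yields a row, not a single value");
}

// A filter is any expression whose result is bool; std::get throws otherwise.
bool Matches(const Expression& filter, const Row& row) {
  return std::get<bool>(Evaluate(filter, row));
}

// A projection maps one row to another: one output column per (name, value).
Row Project(const Expression& expr, const Row& row) {
  const auto& projection = std::get<Projection>(expr.value);
  assert(projection.names.size() == projection.values.size());
  Row out;
  for (std::size_t i = 0; i < projection.names.size(); ++i) {
    out[projection.names[i]] = Evaluate(projection.values[i], row);
  }
  return out;
}
```

In this model, a filter built as greater(add(x, 1), 3) matches a row where x is
5 and rejects a row where x is 1, and a Projection with the single column
x_plus_one maps the first row to a one-column row. Supporting another function
only touches the lookup inside Evaluate, not the Expression type itself.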
Expression simplification (currently implemented in {{Expression::Assume}}) is
an optimization used, for example, in predicate pushdown, and therefore need
not exhaustively cover the full space of available compute functions.
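The point about partial coverage can be sketched as follows. This is a
hypothetical standalone model and not the logic of {{Expression::Assume}}: the
guarantee is reduced to a map from field names to known integer values
(Guarantee), and only the function name "greater" is folded. Any other call is
returned with its arguments simplified but otherwise untouched, which remains
correct and is merely less optimized.

```cpp
// Standalone sketch, not Arrow code. Guarantee, Simplify and the function
// names "greater" and "is_prime" are hypothetical.
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <variant>
#include <vector>

struct Expression;  // forward declaration: Call holds nested expressions

struct FieldRef {
  std::string name;
};
struct Call {
  std::string function;
  std::vector<Expression> arguments;
};
struct Expression {
  std::variant<bool, std::int64_t, FieldRef, Call> value;
};

// Field values known to hold for every row, e.g. from a partition key.
using Guarantee = std::map<std::string, std::int64_t>;

Expression Simplify(const Expression& expr, const Guarantee& known) {
  // A field with a guaranteed value becomes a literal.
  if (auto* ref = std::get_if<FieldRef>(&expr.value)) {
    auto it = known.find(ref->name);
    return it == known.end() ? expr : Expression{it->second};
  }
  auto* call = std::get_if<Call>(&expr.value);
  if (call == nullptr) return expr;  // literals are already minimal

  Call simplified{call->function, {}};
  for (const auto& argument : call->arguments) {
    simplified.arguments.push_back(Simplify(argument, known));
  }
  // Constant folding for the one function this sketch understands.
  if (simplified.function == "greater" && simplified.arguments.size() == 2) {
    auto* lhs = std::get_if<std::int64_t>(&simplified.arguments[0].value);
    auto* rhs = std::get_if<std::int64_t>(&simplified.arguments[1].value);
    if (lhs != nullptr && rhs != nullptr) return Expression{*lhs > *rhs};
  }
  // Unknown functions pass through: no optimization, but still correct.
  return Expression{std::move(simplified)};
}

// Usage: with x known to be 5, "x > 3" folds to a literal, while the
// unrecognized call is_prime(y) survives as a Call.
void SimplifyExample() {
  Guarantee known;
  known["x"] = 5;
  Expression filter{Call{"greater", {Expression{FieldRef{"x"}},
                                     Expression{std::int64_t{3}}}}};
  Expression opaque{Call{"is_prime", {Expression{FieldRef{"y"}}}}};
  assert(std::holds_alternative<bool>(Simplify(filter, known).value));
  assert(std::holds_alternative<Call>(Simplify(opaque, known).value));
}
```

A filter that folds to a literal can be decided without reading any rows, which
is where the pushdown benefit comes from in this model.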