westonpace commented on issue #34365:
URL: https://github.com/apache/arrow/issues/34365#issuecomment-1446685564
For most nodes, whether an expression input can be "anything" or "only a
direct reference" has been described to me as a "logical" vs. "physical" thing.
You can always convert to a plan where the input to a cast is a direct
reference by introducing a project node. In other words:
```
project:
  exprs:
    - add(2, cast(tolower("x"), int32()))
  names:
    - "out"
  input:
    src
```
can become
```
project:
  exprs:
    - add(2, cast("x_lower", int32()))
  names:
    - "out"
  input:
    project:
      exprs:
        - tolower("x")
      names:
        - "x_lower"
      input:
        src
```
There are other cases where Substrait's existing nodes are "too logical" for
Acero and we are a little restrictive, for example in the join node and the
aggregate node. We call out this caveat here:
https://arrow.apache.org/docs/dev/cpp/streaming_execution.html#expressions-general
That being said, `cast` is just a function call in Acero and we do have the
ability for functions to take other functions as input. So it does seem like
this is one place where we don't have to be quite so restrictive.
In the linked issue you mentioned:
> The Acero cast function looks like it can either take an Array or an
> individual object of class Datum (so a scalar, array, etc).
This is true for the C++ cast function. However, this is not true for
"expressions". In other words, a `compute::call` is constructed as follows:
```
ARROW_EXPORT
Expression call(std::string function, std::vector<Expression> arguments,
                std::shared_ptr<FunctionOptions> options = NULLPTR);
```
So it can receive any `compute::Expression` as an argument. This
discrepancy is handled during "expression execution"
(`compute::ExecuteScalarExpression`). During expression evaluation we traverse
the AST and convert each argument into an array by executing its
sub-expression. Finally, these input arrays are passed to the actual function
call.
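
To make that evaluation order concrete, here is a toy sketch of the idea. It does not use Arrow at all: `Column`, `Batch`, `Expr`, `Kernels` and `Evaluate` are hypothetical stand-ins invented for illustration, and the real `compute::ExecuteScalarExpression` is considerably more involved. The only point is that a call node first evaluates every argument (which may itself be a call) into a column, and only then invokes the function on those columns.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins (assumptions for illustration, not Arrow types).
using Column = std::vector<int>;              // plays the role of an array
using Batch = std::map<std::string, Column>;  // plays the role of an input batch
using Kernel = std::function<Column(const std::vector<Column>&)>;

// A tiny expression tree: field reference, literal, or function call.
struct Expr {
  enum class Kind { kFieldRef, kLiteral, kCall };
  Kind kind;
  std::string name;        // field name or function name
  int value;               // used only by literals
  std::vector<Expr> args;  // used only by calls; arguments are expressions
};

Expr field_ref(std::string name) {
  return Expr{Expr::Kind::kFieldRef, std::move(name), 0, {}};
}
Expr literal(int value) { return Expr{Expr::Kind::kLiteral, "", value, {}}; }
Expr call(std::string function, std::vector<Expr> arguments) {
  return Expr{Expr::Kind::kCall, std::move(function), 0, std::move(arguments)};
}

// Toy function registry: every function only ever sees materialized columns.
const std::map<std::string, Kernel>& Kernels() {
  static const std::map<std::string, Kernel> kernels = {
      {"add",
       [](const std::vector<Column>& in) {
         Column out(in[0].size());
         for (std::size_t i = 0; i < out.size(); ++i) out[i] = in[0][i] + in[1][i];
         return out;
       }},
      {"negate",
       [](const std::vector<Column>& in) {
         Column out(in[0].size());
         for (std::size_t i = 0; i < out.size(); ++i) out[i] = -in[0][i];
         return out;
       }},
  };
  return kernels;
}

// Walk the tree: evaluate each argument to a column, then call the function.
Column Evaluate(const Expr& expr, const Batch& batch, std::size_t num_rows) {
  switch (expr.kind) {
    case Expr::Kind::kFieldRef:
      return batch.at(expr.name);
    case Expr::Kind::kLiteral:
      return Column(num_rows, expr.value);  // broadcast the literal to num_rows
    case Expr::Kind::kCall: {
      std::vector<Column> inputs;
      for (const Expr& arg : expr.args) {
        inputs.push_back(Evaluate(arg, batch, num_rows));
      }
      return Kernels().at(expr.name)(inputs);
    }
  }
  return {};
}
```

In this toy model, `Evaluate(call("add", {literal(2), call("negate", {field_ref("x")})}), batch, num_rows)` evaluates the inner `negate` call first and hands its result to `add`, so the outer function never needs to know that its argument was not a direct reference.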
So...I'm not sure why this isn't working. What is the error?