zanmato1984 commented on code in PR #42106:
URL: https://github.com/apache/arrow/pull/42106#discussion_r1637644833
##########
cpp/src/arrow/compute/expression.h:
##########
@@ -118,6 +124,12 @@ class ARROW_EXPORT Expression {
// XXX someday
// NullGeneralization::type nullable() const;
+ /// Whether the entire expression (including all its subexpressions) is
Review Comment:
> I think this is only something you will discover during evaluation.
I don't think it necessarily has to wait for evaluation. It can be determined
during binding.
> The same function can have kernels that are aware of selection and some
that are not.
Binding resolves the function to a concrete kernel, so selection awareness is
known once the kernel is resolved. Also note that I made the selection-aware
flag a property of the kernel rather than of the function.
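
A minimal sketch of that idea follows. The types `Kernel`, `Function`, and `BoundCall` are hypothetical simplifications, not Arrow's actual classes, and it assumes an expression counts as selection-aware only when its resolved kernel and all of its subexpressions are:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified types; not Arrow's real classes.
struct Kernel {
  std::string input_type;  // the single input type this kernel accepts
  bool selection_aware;    // the flag lives on the kernel, not the function
};

struct Function {
  std::vector<Kernel> kernels;

  // Kernel resolution: return the kernel matching the input type, or nullptr.
  const Kernel* Dispatch(const std::string& type) const {
    for (const Kernel& k : kernels) {
      if (k.input_type == type) return &k;
    }
    return nullptr;
  }
};

// A call after binding: its kernel is already resolved.
struct BoundCall {
  const Kernel* kernel;
  std::vector<BoundCall> args;

  // Known right after binding; no evaluation is needed.
  bool IsSelectionAware() const {
    if (kernel == nullptr || !kernel->selection_aware) return false;
    for (const BoundCall& arg : args) {
      if (!arg.IsSelectionAware()) return false;
    }
    return true;
  }
};
```

The same `Function` can hold one kernel that is selection-aware and another that is not. Which one applies is fixed as soon as `Dispatch` has picked a kernel for the bound input type.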
> Another important aspect is the type-checking of the special forms. You
need to unify [1] the output type of all the branches, so you can pre-allocate
the output and introduce the appropriate casts.
(I don't think this is related to selection awareness. In this proposal,
special forms and an expression's selection awareness are largely separate;
the only connection is that a special form may use a selection vector to
mask the evaluation of its subexpressions. So I'm assuming this is a general
comment about special forms.)
Before responding to your comment about special forms, I want to raise a
related point, since it is part of the reason the proposal looks the way it
does.
In Arrow compute there are "functions", which users can invoke directly via
`CallFunction`, and there are expressions, one concrete kind of which is a
"call" that in turn uses a "function" when evaluated. Take "if_else" as an
example. An "if_else" function already exists, which makes perfect sense
because users do need two-way branching on a triple of {condition vector,
true branch vector, false branch vector}. In an expression, "if_else", as a
regular function, is represented as a call. This is fine in terms of
lexical/grammatical aspects, type checking, kernel resolution, etc., except
that "if_else" needs argument evaluation rules different from those of a call
(which eagerly evaluates all its arguments). That is why we need an "if_else"
(or "cond") special form.
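
To make the difference concrete, here is a minimal, self-contained sketch that does not use Arrow at all. `RowExpr`, `EagerIfElse`, and `MaskedIfElse` are hypothetical names, and each subexpression is modelled as a callable that computes one row:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a subexpression: computes the value of one row.
using RowExpr = std::function<int(std::size_t)>;

// Regular call: every argument is evaluated on every row, then the kernel runs.
std::vector<int> EagerIfElse(const std::vector<bool>& cond,
                             const RowExpr& then_expr,
                             const RowExpr& else_expr) {
  const std::size_t n = cond.size();
  std::vector<int> then_vals(n), else_vals(n), out(n);
  for (std::size_t i = 0; i < n; ++i) then_vals[i] = then_expr(i);
  for (std::size_t i = 0; i < n; ++i) else_vals[i] = else_expr(i);
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = cond[i] ? then_vals[i] : else_vals[i];
  }
  return out;
}

// Special form: each branch is evaluated only on the rows the condition picks.
std::vector<int> MaskedIfElse(const std::vector<bool>& cond,
                              const RowExpr& then_expr,
                              const RowExpr& else_expr) {
  std::vector<int> out(cond.size());
  for (std::size_t i = 0; i < cond.size(); ++i) {
    out[i] = cond[i] ? then_expr(i) : else_expr(i);
  }
  return out;
}
```

With a condition like `b != 0` and a true branch like `a / b`, the eager version still runs the division on rows where `b` is zero, while the masked version never touches them. Both produce the same output whenever the branches have no side effects or errors.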
So my question is: can we assume that every future special form in Arrow has
a function "companion"? For common special forms like "and", "or", "if_else",
and "case_when", one already exists. Other Lisp special forms, like "quote",
"let", and "defun", have no corresponding "function", but those are
fundamental building blocks of a Turing-complete language (variable
definition, assignment, and so on), which I believe Arrow won't need. So my
conclusion so far is: yes, we can assume it.
Then the next question is: given that every special form has a function
companion, can we use that companion for type checking and kernel resolution
(the function needs to do those anyway)? My feeling is yes; at least I can't
think of a case where a special form has different type-checking/resolution
rules than its function companion.
So to summarize, if a special form:
1. Always has a function companion;
2. Is type-checked/resolved the same as its function companion;
3. Evaluates the same as its function companion, i.e. the final kernel
invocation once all its arguments have been evaluated;
4. Differs from a regular call only in how its arguments are evaluated.
Then we can think of a special form as a call with special argument
evaluation rules.
Of course, if there is a counterexample to any of the above assumptions, then
my conclusion would be wrong and I should take a different approach. Thanks.
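
The four assumptions above can be sketched as follows. This is an illustration under stated assumptions, not Arrow's API: `Column`, `Selection`, `SubExpr`, `IfElseKernel`, `EvalAsCall`, and `EvalAsSpecialForm` are all hypothetical names, and columns are plain integer vectors with nonzero meaning true:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Column = std::vector<int>;
using Selection = std::vector<std::size_t>;  // indices of the rows to evaluate

// Hypothetical subexpression: writes results into `out` at the selected rows only.
using SubExpr = std::function<void(const Selection&, Column* out)>;

// The "function companion": a plain kernel over already-evaluated arguments.
Column IfElseKernel(const Column& cond, const Column& then_vals,
                    const Column& else_vals) {
  assert(cond.size() == then_vals.size() && cond.size() == else_vals.size());
  Column out(cond.size());
  for (std::size_t i = 0; i < cond.size(); ++i) {
    out[i] = cond[i] != 0 ? then_vals[i] : else_vals[i];
  }
  return out;
}

Selection AllRows(std::size_t n) {
  Selection rows(n);
  for (std::size_t i = 0; i < n; ++i) rows[i] = i;
  return rows;
}

// Regular call: every argument is evaluated on every row.
Column EvalAsCall(std::size_t n, const SubExpr& cond, const SubExpr& then_expr,
                  const SubExpr& else_expr) {
  Column c(n), t(n), e(n);
  const Selection all = AllRows(n);
  cond(all, &c);
  then_expr(all, &t);
  else_expr(all, &e);
  return IfElseKernel(c, t, e);
}

// Special form: the same kernel, but each branch is evaluated under a
// selection derived from the condition. Unselected rows keep the default 0
// and are never read by the kernel for that branch.
Column EvalAsSpecialForm(std::size_t n, const SubExpr& cond,
                         const SubExpr& then_expr, const SubExpr& else_expr) {
  Column c(n), t(n), e(n);
  cond(AllRows(n), &c);
  Selection then_rows, else_rows;
  for (std::size_t i = 0; i < n; ++i) {
    (c[i] != 0 ? then_rows : else_rows).push_back(i);
  }
  then_expr(then_rows, &t);
  else_expr(else_rows, &e);
  return IfElseKernel(c, t, e);
}
```

Both entry points end in the same `IfElseKernel` invocation, so type checking and kernel resolution could be shared. Only the argument evaluation step differs.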