felipecrv commented on code in PR #42106:
URL: https://github.com/apache/arrow/pull/42106#discussion_r1637376822


##########
cpp/src/arrow/compute/expression.h:
##########
@@ -48,6 +48,9 @@ class ARROW_EXPORT Expression {
     std::string function_name;
     std::vector<Expression> arguments;
     std::shared_ptr<FunctionOptions> options;
+    // Whether this call is a special form (e.g. if-else). If true, the `special_form`
+    // field will be resolved in binding.
+    bool is_special_form = false;

Review Comment:
   A special-form is not a call. A special-form is a special form of expression.
   
   ```diff
   -  using Impl = std::variant<Datum, Parameter, Call>;
   +  using Impl = std::variant<Datum, Parameter, Call, Special>;
   ```
   
   This is fundamental to the approach. Special forms don't have names; they are virtual sub-classes of `Special`.
   
   We can start with `CondSpecial`, which can model both if-else and case-when depending on the number of conditions and branches used to construct it.
   
   `selection_vector_aware` is not a member variable; it's something resolved at evaluation time by traversing the expression.
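
   A minimal sketch of that shape, using simplified stand-in types rather than Arrow's real classes. `SpecialForm`, `num_children()`, `has_else()` and `is_if_else()` are hypothetical names for illustration, and the "N conditions, N or N+1 branches" convention is an assumption:

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Simplified stand-ins, NOT Arrow's real types.
struct Datum { int value = 0; };
struct Parameter { std::string name; };

class Expression;

struct Call {
  std::string function_name;
  std::vector<Expression> arguments;
};

// Hypothetical polymorphic base: concrete special forms are sub-classes,
// so no function name or boolean flag is needed to identify them.
class SpecialForm {
 public:
  virtual ~SpecialForm() = default;
  virtual std::size_t num_children() const = 0;
};

// The fourth variant alternative: a handle to some special form.
struct Special {
  std::shared_ptr<SpecialForm> form;
};

class Expression {
 public:
  using Impl = std::variant<Datum, Parameter, Call, Special>;
  explicit Expression(Impl impl) : impl_(std::make_shared<Impl>(std::move(impl))) {}

  // Return nullptr when the expression holds a different alternative.
  const Call* call() const { return std::get_if<Call>(impl_.get()); }
  const Special* special() const { return std::get_if<Special>(impl_.get()); }

 private:
  std::shared_ptr<Impl> impl_;
};

// One class for both if-else (1 condition, 2 branches) and case-when
// (N conditions, N or N+1 branches). This convention is an assumption.
class CondSpecial final : public SpecialForm {
 public:
  CondSpecial(std::vector<Expression> conditions, std::vector<Expression> branches)
      : conditions_(std::move(conditions)), branches_(std::move(branches)) {}

  std::size_t num_children() const override {
    return conditions_.size() + branches_.size();
  }
  bool has_else() const { return branches_.size() == conditions_.size() + 1; }
  bool is_if_else() const { return conditions_.size() == 1 && branches_.size() == 2; }

 private:
  std::vector<Expression> conditions_;
  std::vector<Expression> branches_;
};
```

   With this layout, `Expression::call()` returns `nullptr` for a conditional, so code that handles calls never has to check an `is_special_form` flag.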



##########
cpp/src/arrow/compute/expression.h:
##########
@@ -159,14 +171,15 @@ Expression field_ref(FieldRef ref);
 
 ARROW_EXPORT
 Expression call(std::string function, std::vector<Expression> arguments,
-                std::shared_ptr<FunctionOptions> options = NULLPTR);
+                std::shared_ptr<FunctionOptions> options = NULLPTR,
+                bool is_special_form = false);
 
 template <typename Options, typename = typename std::enable_if<
                                 std::is_base_of<FunctionOptions, Options>::value>::type>
-Expression call(std::string function, std::vector<Expression> arguments,
-                Options options) {
+Expression call(std::string function, std::vector<Expression> arguments, Options options,
+                bool is_special_form = false) {

Review Comment:
   Special forms are not "called", they are evaluated. They are more general expressions than calls. `if (true) f(x)` is equivalent to a call to `f` with `x` that you discover while trying to evaluate the conditional.
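
   To illustrate the difference, here is a toy scalar evaluator (not Arrow's API; `Node`, `Literal`, `CallNode` and `IfNode` are hypothetical names). A call evaluates all of its arguments eagerly and then invokes the function, while the conditional evaluates its condition first and then only the branch that was selected:

```cpp
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Toy scalar expression tree, for illustration only.
struct Node {
  virtual ~Node() = default;
  virtual int Evaluate() const = 0;
};
using NodePtr = std::shared_ptr<Node>;

struct Literal : Node {
  int value;
  explicit Literal(int v) : value(v) {}
  int Evaluate() const override { return value; }
};

using Fn = std::function<int(const std::vector<int>&)>;

// A call: every argument is evaluated eagerly, then the function is invoked.
struct CallNode : Node {
  Fn fn;
  std::vector<NodePtr> args;
  CallNode(Fn f, std::vector<NodePtr> a) : fn(std::move(f)), args(std::move(a)) {}
  int Evaluate() const override {
    std::vector<int> values;
    values.reserve(args.size());
    for (const auto& arg : args) values.push_back(arg->Evaluate());
    return fn(values);
  }
};

// A conditional: evaluated, not called. The untaken branch is never touched.
struct IfNode : Node {
  NodePtr cond, then_branch, else_branch;
  IfNode(NodePtr c, NodePtr t, NodePtr e)
      : cond(std::move(c)), then_branch(std::move(t)), else_branch(std::move(e)) {}
  int Evaluate() const override {
    return cond->Evaluate() != 0 ? then_branch->Evaluate() : else_branch->Evaluate();
  }
};
```

   Evaluating `IfNode(true, f(x), g(x))` here yields the same result as the call `f(x)`, and `g` is never invoked.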



##########
cpp/src/arrow/compute/expression.h:
##########
@@ -118,6 +124,12 @@ class ARROW_EXPORT Expression {
   // XXX someday
   // NullGeneralization::type nullable() const;
 
+  /// Whether the entire expression (including all its subexpressions) is

Review Comment:
   I think this is something you will only discover during evaluation: the same function can have some kernels that are aware of selection and some that are not.
   
   Another important aspect is the type-checking of the special forms. You need to unify [1] the output types of all the branches, so you can pre-allocate the output and introduce the appropriate casts.
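
   A rough sketch of what unifying branch types could look like, on a made-up type lattice. `ToyType`, `Unify`, `UnifyBranches` and the promotion order int32 < int64 < float64 are assumptions for illustration and do not reflect Arrow's actual casting rules:

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Made-up type lattice: numeric types promote upward, strings unify only with strings.
enum class ToyType { kInt32 = 0, kInt64 = 1, kFloat64 = 2, kString = 3 };

inline bool IsNumeric(ToyType t) { return t != ToyType::kString; }

// Common type of two branch types, or std::nullopt if none exists.
inline std::optional<ToyType> Unify(ToyType a, ToyType b) {
  if (a == b) return a;
  if (IsNumeric(a) && IsNumeric(b)) {
    return static_cast<int>(a) < static_cast<int>(b) ? b : a;
  }
  return std::nullopt;
}

// Fold Unify over all branches to get the single output type of the conditional.
inline std::optional<ToyType> UnifyBranches(const std::vector<ToyType>& branch_types) {
  if (branch_types.empty()) return std::nullopt;
  ToyType result = branch_types[0];
  for (std::size_t i = 1; i < branch_types.size(); ++i) {
    std::optional<ToyType> unified = Unify(result, branch_types[i]);
    if (!unified) return std::nullopt;
    result = *unified;
  }
  return result;
}

// Indices of the branches whose type differs from the unified type;
// these are the branches that would need a cast inserted.
inline std::vector<std::size_t> BranchesNeedingCast(
    const std::vector<ToyType>& branch_types, ToyType unified) {
  std::vector<std::size_t> indices;
  for (std::size_t i = 0; i < branch_types.size(); ++i) {
    if (branch_types[i] != unified) indices.push_back(i);
  }
  return indices;
}
```

   Once the unified type is known, the output can be allocated a single time with that type, and only the branches returned by `BranchesNeedingCast` need a cast.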



##########
cpp/src/arrow/compute/special_form.h:
##########
@@ -0,0 +1,62 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// NOTE: API is EXPERIMENTAL and will change without going through a
+// deprecation cycle.
+
+#pragma once
+
+#include "arrow/compute/expression.h"
+#include "arrow/util/visibility.h"
+
+#include <vector>
+
+namespace arrow {
+namespace compute {
+
+/// The concept "special form" is borrowed from Lisp
+/// (https://courses.cs.northwestern.edu/325/readings/special-forms.html). Velox also uses
+/// the same term. A special form behaves like a function call except that it has special
+/// evaluation rules, mostly for arguments.
+/// For example, the `if_else(cond, expr1, expr2)` special form first evaluates the
+/// argument `cond` and obtains a boolean array:
+///   [true, false, true, false]
+/// then the argument `expr1` should ONLY be evaluated for row:
+///   [0, 2]
+/// and the argument `expr2` should ONLY be evaluated for row:
+///   [1, 3]
+/// Consider, if `expr1`/`expr2` has some observable side-effects (e.g., division by zero
+/// error) on row [1, 3]/[0, 2], these side-effects would be undesirably observed if
+/// evaluated using a regular function call, which always evaluates all its arguments
+/// eagerly.
+/// Other special forms include `case_when`, `and`, and `or`, etc.
+/// In a vectorized execution engine, a special form normally takes advantage of
+/// "selection vector" to mask rows of arguments to be evaluated.
+class ARROW_EXPORT SpecialForm {
+ public:
+  /// A poor man's factory method to create a special form by name.
+  /// TODO: More formal factory, a registry maybe?
+  static Result<std::unique_ptr<SpecialForm>> Make(const std::string& name);

Review Comment:
   The number of special-forms is very small. They should be sub-classes of the `SpecialForm` class. They will contain type-checking logic and other things that are very different from calls.
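
   As a sketch of what per-form type-checking could look like (everything here is hypothetical: `DemoType`, `TypeCheck`, `IfElseForm` and `AndForm` are illustrative names, and `std::optional` stands in for a proper `Result`):

```cpp
#include <optional>
#include <vector>

// Made-up type tags, for illustration only.
enum class DemoType { kBool, kInt64, kString };

class SpecialForm {
 public:
  virtual ~SpecialForm() = default;
  // Returns the output type, or std::nullopt if the arguments are ill-typed.
  virtual std::optional<DemoType> TypeCheck(const std::vector<DemoType>& args) const = 0;
};

// if_else(cond, a, b): cond must be boolean and both branches must agree.
class IfElseForm final : public SpecialForm {
 public:
  std::optional<DemoType> TypeCheck(const std::vector<DemoType>& args) const override {
    if (args.size() != 3 || args[0] != DemoType::kBool) return std::nullopt;
    // A fuller version would unify the branch types and insert casts here.
    if (args[1] != args[2]) return std::nullopt;
    return args[1];
  }
};

// and(a, b): both operands must be boolean; the result is boolean.
class AndForm final : public SpecialForm {
 public:
  std::optional<DemoType> TypeCheck(const std::vector<DemoType>& args) const override {
    if (args.size() != 2) return std::nullopt;
    if (args[0] != DemoType::kBool || args[1] != DemoType::kBool) return std::nullopt;
    return DemoType::kBool;
  }
};
```

   Because the set of forms is small and closed, each rule lives in its own sub-class, and no name-based factory or registry lookup is needed.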



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

