suneet-s commented on a change in pull request #10401:
URL: https://github.com/apache/druid/pull/10401#discussion_r492499110



##########
File path: core/src/main/java/org/apache/druid/math/expr/ApplyFunction.java
##########
@@ -49,6 +51,16 @@
    */
   String name();
 
+  default boolean canVectorize(Expr.InputBindingTypes inputTypes, Expr lambda, 
List<Expr> args)
+  {
+    return false;
+  }
+
+  default <T> ExprVectorProcessor<T> 
asVectorProcessor(Expr.VectorInputBindingTypes inputTypes, Expr lambda, 
List<Expr> args)

Review comment:
       javadocs for these functions please.

##########
File path: core/src/main/java/org/apache/druid/math/expr/ApplyFunction.java
##########
@@ -49,6 +51,16 @@
    */
   String name();
 
+  default boolean canVectorize(Expr.InputBindingTypes inputTypes, Expr lambda, 
List<Expr> args)
+  {
+    return false;
+  }
+
+  default <T> VectorExprProcessor<T> 
asVectorProcessor(Expr.VectorInputBindingTypes inputTypes, Expr lambda, 
List<Expr> args)

Review comment:
       javadocs for both of these functions please

##########
File path: sql/src/test/java/org/apache/druid/sql/calcite/CalciteQueryTest.java
##########
@@ -5912,9 +5912,6 @@ public void testExpressionFilteringAndGrouping() throws 
Exception
   @Test
   public void testExpressionFilteringAndGroupingUsingCastToLong() throws 
Exception
   {
-    // Cannot vectorize due to virtual columns.

Review comment:
       🎉 so much more vectorization!

##########
File path: core/src/main/java/org/apache/druid/math/expr/Expr.java
##########
@@ -148,6 +171,47 @@ default ExprType getOutputType(InputBindingTypes 
inputTypes)
   {
     @Nullable
     ExprType getType(String name);
+
+    default boolean areNumeric(List<Expr> args)
+    {
+      boolean numeric = args.size() > 0;

Review comment:
       Why is an empty list non numeric?
   
   Maybe add javadocs to clarify this behavior.

##########
File path: 
sql/src/main/java/org/apache/druid/sql/calcite/rel/VirtualColumnRegistry.java
##########
@@ -136,9 +137,11 @@ public RowSignature getFullRowSignature()
     final RowSignature.Builder builder =
         RowSignature.builder().addAll(baseRowSignature);
 
+    ColumnInspector baseRowsInspector = builder.build().asColumnInspector();

Review comment:
       Does this mean a virtual column can't reference another virtual column? 
since the ColumnInspector is built with just he `baseRowSignature`?

##########
File path: 
core/src/main/java/org/apache/druid/math/expr/vector/UnivariateFunctionVectorProcessor.java
##########
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.math.expr.vector;
+
+import org.apache.druid.math.expr.Expr;
+
+/**
+ * common machinery for processing single input operators and functions, which 
should always treat null input as null
+ * output, and are backed by a primitive value instead of an object value (and 
need to use the null vector instead of
+ * checking the vector itself for nulls)
+ */
+public abstract class UnivariateFunctionVectorProcessor<TInput, TOutput> 
implements ExprVectorProcessor<TOutput>
+{
+  final ExprVectorProcessor<TInput> processor;
+  final int maxVectorSize;
+  final boolean[] outNulls;
+  final TOutput outValues;
+
+  public UnivariateFunctionVectorProcessor(
+      ExprVectorProcessor<TInput> processor,
+      int maxVectorSize,
+      TOutput outValues
+  )
+  {
+    this.processor = processor;
+    this.maxVectorSize = maxVectorSize;
+    this.outNulls = new boolean[maxVectorSize];

Review comment:
       should outNulls be null if we're not in sql compatible mode?

##########
File path: core/src/test/java/org/apache/druid/math/expr/FunctionTest.java
##########
@@ -468,7 +468,7 @@ public void testGreatest()
   {
     // Same types
     assertExpr("greatest(y, 0)", 2L);
-    assertExpr("greatest(34.0, z, 5.0, 767.0", 767.0);
+    assertExpr("greatest(34.0, z, 5.0, 767.0)", 767.0);

Review comment:
       hmm interesting that this used to pass 🤔 

##########
File path: 
core/src/main/java/org/apache/druid/math/expr/vector/BivariateFunctionVectorProcessor.java
##########
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.math.expr.vector;
+
+import org.apache.druid.math.expr.Expr;
+
+/**
+ * common machinery for processing two input operators and functions, which 
should always treat null inputs as null
+ * output, and are backed by a primitive values instead of an object values 
(and need to use the null vectors instead of
+ * checking the vector themselves for nulls)
+ */
+public abstract class BivariateFunctionVectorProcessor<TLeftInput, 
TRightInput, TOutput>
+    implements ExprVectorProcessor<TOutput>
+{
+  final ExprVectorProcessor<TLeftInput> left;
+  final ExprVectorProcessor<TRightInput> right;
+  final int maxVectorSize;
+  final boolean[] outNulls;
+  final TOutput outValues;
+
+  protected BivariateFunctionVectorProcessor(
+      ExprVectorProcessor<TLeftInput> left,
+      ExprVectorProcessor<TRightInput> right,
+      int maxVectorSize,
+      TOutput outValues
+  )
+  {
+    this.left = left;
+    this.right = right;
+    this.maxVectorSize = maxVectorSize;
+    this.outNulls = new boolean[maxVectorSize];

Review comment:
       Similar question as the `UnivariateFunctionProcessor`
   
   I think with it written like this, `ExprEvalDoubleVector` and 
`ExprEvalLongVector` can't take advantage of the fact that there are no nulls 
in default mode. I don't know if there are any functions that produce nulls 
though... maybe that's why we need to do it this way?

##########
File path: 
core/src/main/java/org/apache/druid/math/expr/vector/ExprEvalStringVector.java
##########
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.math.expr.vector;
+
+import org.apache.druid.common.config.NullHandling;
+import org.apache.druid.java.util.common.IAE;
+import org.apache.druid.math.expr.ExprEval;
+import org.apache.druid.math.expr.ExprType;
+
+import javax.annotation.Nullable;
+
+public final class ExprEvalStringVector extends ExprEvalVector<String[]>

Review comment:
       Should we add a comment that this vector isn't thread safe? - because of 
how `computeNumbers` is written Are any of them thread safe?

##########
File path: 
processing/src/main/java/org/apache/druid/query/aggregation/SimpleDoubleAggregatorFactory.java
##########
@@ -235,6 +244,19 @@ public String getExpression()
     return expression;
   }
 
+  @Override
+  public boolean canVectorize(ColumnInspector columnInspector)
+  {
+    if (fieldName != null) {
+      final ColumnCapabilities capabilities = 
columnInspector.getColumnCapabilities(fieldName);
+      return capabilities == null || 
ValueType.isNumeric(capabilities.getType());
+    }
+    if (expression != null) {
+      return fieldExpression.get().canVectorize(columnInspector);
+    }
+    return false;
+  }
+

Review comment:
       If I'm reading this correctly, this change isn't really needed in this 
PR, but it's some consolidation that was done across the various aggregator 
factories




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to