gianm commented on code in PR #13887:
URL: https://github.com/apache/druid/pull/13887#discussion_r1134597082


##########
docs/querying/sql-aggregations.md:
##########
@@ -139,6 +139,17 @@ Load the [DataSketches 
extension](../development/extensions-core/datasketches-ex
 |`DS_QUANTILES_SKETCH(expr, [k])`|Creates a [Quantiles 
sketch](../development/extensions-core/datasketches-quantiles.md) on the values 
of `expr`, which can be a regular column or a column containing quantiles 
sketches. The `k` parameter is described in the Quantiles sketch 
documentation.<br/><br/>See the [known 
issue](sql-translation.md#approximations) with this function.|`'0'` (STRING)|
 
 
+### Tuple sketch functions
+
+Load the [DataSketches 
extension](../development/extensions-core/datasketches-extension.md) to use the 
following functions.
+
+|Function|Notes|Default|
+|--------|-----|-------|
+|`ARRAY_OF_DOUBLES_SKETCH(expr, [nominalEntries])`|Creates a [Tuple 
sketch](../development/extensions-core/datasketches-tuple.md) on the values of 
`expr` which is a column containing Tuple sketches. The `nominalEntries` 
override parameter is optional and described in the Tuple sketch documentation.
+|`ARRAY_OF_DOUBLES_SKETCH(dimensionColumnExpr, metricColumnExpr, ..., 
[nominalEntries])`|Creates a [Tuple 
sketch](../development/extensions-core/datasketches-tuple.md) on the dimension 
value of `dimensionColumnExpr` and the metric values contained in one or more 
`metricColumnExpr` columns. If the last value of the array is a numeric 
literal, Druid assumes that the value is an override parameter for [nominal 
entries](../development/extensions-core/datasketches-tuple.md).
+|`ARRAY_OF_DOUBLES_SKETCH_METRICS_SUM_ESTIMATE(expr)`|Computes approximate 
sums of the values contained within a [Tuple 
sketch](../development/extensions-core/datasketches-tuple.md#estimated-metrics-values-for-each-column-of-arrayofdoublessketch)
 column.

Review Comment:
   Since `ARRAY_OF_DOUBLES_SKETCH_METRICS_SUM_ESTIMATE` is a regular function 
(as opposed to an aggregation function) it should go in 
`querying/sql-scalar.md`.



##########
docs/querying/sql-functions.md:
##########
@@ -136,6 +136,24 @@ Returns an array of all values of the specified expression.
 
 Concatenates array inputs into a single array.
 
+## ARRAY_OF_DOUBLES_SKETCH
+
+`ARRAY_OF_DOUBLES_SKETCH(expr, [nominalEntries])`
+
+`ARRAY_OF_DOUBLES_SKETCH(dimensionColumnExpr, metricColumnExpr..., 
[nominalEntries])`
+
+**Function type:** [Aggregation](sql-aggregations.md)
+
+Creates a Tuple sketch.
+
+## ARRAY_OF_DOUBLES_SKETCH_METRICS_SUM_ESTIMATE

Review Comment:
   This one needs to go in `.spelling` too.



##########
extensions-core/datasketches/src/main/java/org/apache/druid/query/aggregation/datasketches/tuple/ArrayOfDoublesSketchMergeAggregator.java:
##########
@@ -58,12 +58,20 @@ public ArrayOfDoublesSketchMergeAggregator(
   @Override
   public void aggregate()
   {
-    final ArrayOfDoublesSketch update = selector.getObject();
+    final Object update = selector.getObject();
     if (update == null) {
       return;
     }
+    final ArrayOfDoublesSketch sketch;
+    if (update instanceof ArrayOfDoublesSketch) {

Review Comment:
   What was this change needed for? (Asking since I'm wondering if we're 
missing a call to `factory.deserialize` somewhere; it's doing something 
similar.)



##########
docs/querying/sql-functions.md:
##########
@@ -136,6 +136,24 @@ Returns an array of all values of the specified expression.
 
 Concatenates array inputs into a single array.
 
+## ARRAY_OF_DOUBLES_SKETCH

Review Comment:
   This needs to go in the `.spelling` file. Search for `ARRAY_CONCAT_AGG`, it 
would be right after that one.



##########
extensions-core/datasketches/src/main/java/org/apache/druid/query/aggregation/datasketches/tuple/sql/ArrayOfDoublesSketchSetBaseOperatorConversion.java:
##########
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.datasketches.tuple.sql;
+
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.sql.SqlFunction;
+import org.apache.calcite.sql.SqlFunctionCategory;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.type.OperandTypes;
+import org.apache.calcite.sql.type.SqlOperandCountRanges;
+import org.apache.calcite.sql.type.SqlTypeFamily;
+import org.apache.druid.java.util.common.StringUtils;
+import org.apache.druid.query.aggregation.PostAggregator;
+import 
org.apache.druid.query.aggregation.datasketches.tuple.ArrayOfDoublesSketchSetOpPostAggregator;
+import org.apache.druid.segment.column.RowSignature;
+import org.apache.druid.sql.calcite.expression.DruidExpression;
+import org.apache.druid.sql.calcite.expression.OperatorConversions;
+import org.apache.druid.sql.calcite.expression.PostAggregatorVisitor;
+import org.apache.druid.sql.calcite.expression.SqlOperatorConversion;
+import org.apache.druid.sql.calcite.planner.PlannerContext;
+
+import javax.annotation.Nullable;
+import java.util.ArrayList;
+import java.util.List;
+
+public abstract class ArrayOfDoublesSketchSetBaseOperatorConversion implements 
SqlOperatorConversion

Review Comment:
   Looks like this family of functions isn't documented in this patch; is there 
a reason for that? If so, it'd be good to write it down in a class-level 
javadoc, so other developers are aware.



##########
extensions-core/datasketches/src/main/java/org/apache/druid/query/aggregation/datasketches/tuple/sql/ArrayOfDoublesSketchSetBaseOperatorConversion.java:
##########
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation.datasketches.tuple.sql;
+
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexLiteral;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.sql.SqlFunction;
+import org.apache.calcite.sql.SqlFunctionCategory;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.type.OperandTypes;
+import org.apache.calcite.sql.type.SqlOperandCountRanges;
+import org.apache.calcite.sql.type.SqlTypeFamily;
+import org.apache.druid.java.util.common.StringUtils;
+import org.apache.druid.query.aggregation.PostAggregator;
+import 
org.apache.druid.query.aggregation.datasketches.tuple.ArrayOfDoublesSketchSetOpPostAggregator;
+import org.apache.druid.segment.column.RowSignature;
+import org.apache.druid.sql.calcite.expression.DruidExpression;
+import org.apache.druid.sql.calcite.expression.OperatorConversions;
+import org.apache.druid.sql.calcite.expression.PostAggregatorVisitor;
+import org.apache.druid.sql.calcite.expression.SqlOperatorConversion;
+import org.apache.druid.sql.calcite.planner.PlannerContext;
+
+import javax.annotation.Nullable;
+import java.util.ArrayList;
+import java.util.List;
+
+public abstract class ArrayOfDoublesSketchSetBaseOperatorConversion implements 
SqlOperatorConversion
+{
+  public ArrayOfDoublesSketchSetBaseOperatorConversion()
+  {
+  }
+
+  @Override
+  public SqlOperator calciteOperator()
+  {
+    return makeSqlFunction();
+  }
+
+  @Nullable
+  @Override
+  public DruidExpression toDruidExpression(
+      PlannerContext plannerContext,
+      RowSignature rowSignature,
+      RexNode rexNode
+  )
+  {
+    plannerContext.setPlanningError("%s can only be used on aggregates. " +
+        "It cannot be used directly on a column or on a scalar expression.", 
getFunctionName());
+    return null;
+  }
+
+  @Nullable
+  @Override
+  public PostAggregator toPostAggregator(
+      PlannerContext plannerContext,
+      RowSignature rowSignature,
+      RexNode rexNode,
+      PostAggregatorVisitor postAggregatorVisitor
+  )
+  {
+    final List<RexNode> operands = ((RexCall) rexNode).getOperands();
+    final List<PostAggregator> inputPostAggs = new ArrayList<>();
+    Integer nominalEntries = null;
+    Integer numberOfvalues = null;
+
+    for (int i = 0; i < operands.size(); i++) {
+      RexNode operand = operands.get(i);
+      if (i == 0 && operand.isA(SqlKind.LITERAL) && 
SqlTypeFamily.INTEGER.contains(operand.getType())) {
+        nominalEntries = RexLiteral.intValue(operand);

Review Comment:
   These requirements could be better-plugged into the OperandTypeChecker, and 
if so, users would get better messages when using the function incorrectly. I'm 
willing to leave this for future work, though, if you aren't up to doing it in 
this patch.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to