gianm commented on code in PR #16322:
URL: https://github.com/apache/druid/pull/16322#discussion_r1595006563


##########
sql/src/test/java/org/apache/druid/sql/calcite/CalciteNestedDataQueryTest.java:
##########
@@ -5422,6 +5428,8 @@ public void testJoinOnNestedColumnThrows()
   @Test
   public void testScanStringNotNullCast()
   {
+    // Scan results between native and the MSQ aren't in the same order.
+    msqIncompatible();

Review Comment:
   We can deal with this by having different expected results for different 
engines; that's better than skipping the test. The join tests do that with a 
method `sortIfSortBased`. I don't think the exact same method will work here, 
but something similar should.



##########
sql/src/test/java/org/apache/druid/sql/calcite/CalciteNestedDataQueryTest.java:
##########
@@ -2478,6 +2480,8 @@ public void 
testGroupByRootSingleTypeArrayStringElementFiltered()
   @Test
   public void testGroupByRootSingleTypeArrayDoubleElement()
   {
+    // Test framework doesn't register the RELAX_NULLS when running as part of 
MSQ tests
+    msqIncompatible();

Review Comment:
   Better to fix this issue with the test framework rather than skip the test.



##########
processing/src/test/java/org/apache/druid/frame/write/FrameWriterTestData.java:
##########
@@ -270,8 +268,9 @@ public class FrameWriterTestData
       HyperUniquesAggregatorFactory.TYPE,
       Arrays.asList(
           null,
-          makeHllCollector(null),
-          makeHllCollector("foo")
+          ByteRowKeyComparatorTest.makeHllCollector(1),

Review Comment:
   Please add some nested test data too. This is going to be a common case, so 
it would be good to test it explicitly. Everything in the same dataset should 
be the same type, so I'd suggest renaming this to `TEST_COMPLEX_HLL` and adding 
another `TEST_COMPLEX_NESTED`. It should include a variety of shapes of data, 
including 'naked' arrays, strings, numbers, etc.



##########
processing/src/main/java/org/apache/druid/frame/field/ComplexFieldReader.java:
##########
@@ -92,7 +92,28 @@ public boolean isNull(Memory memory, long position)
   @Override
   public boolean isComparable()
   {
-    return false;
+    return true;

Review Comment:
   Not technically true: the javadoc for FieldReader says "Comparable fields 
can be compared as unsigned bytes." But we can remove this method from the 
interface anyway, since it always returns true now, and no longer serves a 
purpose.



##########
processing/src/test/java/org/apache/druid/frame/key/ByteRowKeyComparatorTest.java:
##########
@@ -57,17 +82,21 @@ public class ByteRowKeyComparatorTest extends 
InitializedNullHandlingTest
       OBJECTS4,
       OBJECTS5,
       OBJECTS6,
-      OBJECTS7
+      OBJECTS7,
+      OBJECTS8,
+      OBJECTS9
   );
 
   @Test
-  public void test_compare_AAAA() // AAAA = all ascending

Review Comment:
   Please keep the old tests that don't use complex types. They're still 
valuable, since we can go down different code paths in situations where there 
are no complex types at all.



##########
processing/src/main/java/org/apache/druid/frame/key/ByteRowKeyComparator.java:
##########
@@ -68,83 +90,80 @@ public static int computeFirstFieldPosition(final int 
fieldCount)
     return Ints.checkedCast((long) fieldCount * Integer.BYTES);
   }
 
-  /**
-   * Given a list of sort columns, compute an array of the number of ascending 
fields in a run, followed by number of
-   * descending fields in a run, followed by ascending, etc. For example: ASC, 
ASC, DESC, ASC would return [2, 1, 1]
-   * and DESC, DESC, ASC would return [0, 2, 1].
-   *
-   * Public so {@link FrameComparisonWidgetImpl} can use it.
-   */
-  public static int[] computeAscDescRunLengths(final List<KeyColumn> 
keyColumns)
-  {
-    final IntList ascDescRunLengths = new IntArrayList(4);
-
-    KeyOrder order = KeyOrder.ASCENDING;
-    int runLength = 0;
-
-    for (final KeyColumn column : keyColumns) {
-      if (column.order() == KeyOrder.NONE) {
-        throw new IAE("Key must be sortable");
-      }
-
-      if (column.order() != order) {
-        ascDescRunLengths.add(runLength);
-        runLength = 0;
-
-        // Invert "order".
-        order = order == KeyOrder.ASCENDING ? KeyOrder.DESCENDING : 
KeyOrder.ASCENDING;
-      }
-
-      runLength++;
-    }
-
-    if (runLength > 0) {
-      ascDescRunLengths.add(runLength);
-    }
-
-    return ascDescRunLengths.toIntArray();
-  }
-
   @Override
   @SuppressWarnings("SubtractionInCompareTo")
   public int compare(final byte[] keyArray1, final byte[] keyArray2)
   {
-    // Similar logic to FrameComparaisonWidgetImpl, but implementation is 
different enough that we need our own.
+    // Similar logic to FrameComparisonWidgetImpl, but implementation is 
different enough that we need our own.
     // Major difference is Frame v. Frame instead of byte[] v. byte[].
 
-    int comparableBytesStartPosition1 = firstFieldPosition;
-    int comparableBytesStartPosition2 = firstFieldPosition;
+    int currentRunStartPosition1 = firstFieldPosition;
+    int currentRunStartPosition2 = firstFieldPosition;
 
-    boolean ascending = true;
-    int field = 0;
+    // Number of fields compared till now, which is equivalent to the index of 
the field to compare next
+    int fieldsComparedTillNow = 0;
 
-    for (int numFields : ascDescRunLengths) {
-      if (numFields > 0) {
-        final int nextField = field + numFields;
-        final int comparableBytesEndPosition1 = 
RowKeyReader.fieldEndPosition(keyArray1, nextField - 1);
-        final int comparableBytesEndPosition2 = 
RowKeyReader.fieldEndPosition(keyArray2, nextField - 1);
+    for (RowKeyComparisonRunLengths.RunLengthEntry runLengthEntry : 
rowKeyComparisonRunLengths.getRunLengthEntries()) {
+
+      if (runLengthEntry.getRunLength() <= 0) {
+        // Defensive check
+        continue;
+      }
+
+      // Index of the next field that will get considered. Excludes the last 
field of the current run length that is being
+      // compared in this iteration
+      final int nextField = fieldsComparedTillNow + 
runLengthEntry.getRunLength();
+      final int currentRunEndPosition1 = 
RowKeyReader.fieldEndPosition(keyArray1, nextField - 1);
+      final int currentRunEndPosition2 = 
RowKeyReader.fieldEndPosition(keyArray2, nextField - 1);
+
+      final int cmp;
+
+      if (!runLengthEntry.isByteComparable()) {
+        // Only complex types are not byte comparable. Nested arrays aren't 
supported in MSQ
+        assert runLengthEntry.getRunLength() == 1;
+        // 'fieldsComparedTillNow' is the index of the current keyColumn in 
the keyColumns list. Sanity check that its
+        // a known complex type
+        ColumnType columnType = 
rowSignature.getColumnType(keyColumns.get(fieldsComparedTillNow).columnName())

Review Comment:
   Similar comment to `FrameComparisonWidgetImpl`: we should avoid this map 
lookup for `getColumnType(String)` and also avoid the lookup by string in 
`serdeMap`. That's expensive stuff to be doing for each comparison. Instead, 
how about a `serdes` field that is a pre-built `ComplexMetricSerde[]` (built in 
the constructor), indexed by position in `keyColumns`. Each entry in `serdes` 
is either a serde (for complex types) or `null` otherwise. The entire `serdes` 
should be null if there are no complex types.
   
   I don't think `rowSignature` is used for anything else, so when we make this 
change we can also get rid of the `rowSignature` field.
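
   A self-contained sketch of this pattern, using simplified stand-in types 
(the names `PrecomputedSerdes` and `serdeFor` are illustrative, not Druid's 
actual classes): resolve each key column's serde once in the constructor, 
indexed by position, so the hot comparison path never does a string lookup.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Illustrative sketch: per-position serde lookup precomputed in the constructor.
class PrecomputedSerdes
{
  // serdes[i] is null when field i is byte-comparable; the whole array is
  // null when no field is complex, so the hot path can skip the check entirely.
  private final Comparator<byte[]>[] serdes;

  @SuppressWarnings("unchecked")
  PrecomputedSerdes(List<String> keyColumnTypes, Map<String, Comparator<byte[]>> serdeMap)
  {
    Comparator<byte[]>[] built = (Comparator<byte[]>[]) new Comparator[keyColumnTypes.size()];
    boolean anyComplex = false;
    for (int i = 0; i < keyColumnTypes.size(); i++) {
      built[i] = serdeMap.get(keyColumnTypes.get(i));
      anyComplex |= built[i] != null;
    }
    this.serdes = anyComplex ? built : null;
  }

  Comparator<byte[]> serdeFor(int fieldIndex)
  {
    return serdes == null ? null : serdes[fieldIndex];
  }
}
```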



##########
sql/src/test/java/org/apache/druid/sql/calcite/CalciteNestedDataQueryTest.java:
##########
@@ -1086,6 +1086,8 @@ public void testGroupByRootSingleTypeStringMixed2Sparse()
   @Test
   public void 
testGroupByRootSingleTypeStringMixed2SparseJsonValueNonExistentPath()
   {
+    // Fails while planning

Review Comment:
   Why does it fail?



##########
sql/src/test/java/org/apache/druid/sql/calcite/CalciteNestedDataQueryTest.java:
##########
@@ -6660,6 +6674,8 @@ public void testCoalesceOnNestedColumns()
   @Test
   public void testCoalesceOnNestedColumnsLater()
   {
+    // MSQ doesn't work well with unnest

Review Comment:
   What doesn't work well about this test?



##########
sql/src/test/java/org/apache/druid/sql/calcite/CalciteNestedDataQueryTest.java:
##########
@@ -6622,6 +6634,8 @@ public void testFilterJsonIsNull()
   @Test
   public void testCoalesceOnNestedColumns()
   {
+    // Unnest on MSQ returns incorrect results

Review Comment:
   What's incorrect & why?



##########
sql/src/test/java/org/apache/druid/sql/calcite/CalciteNestedDataQueryTest.java:
##########
@@ -5882,6 +5890,8 @@ public void 
testGroupByRootSingleTypeArrayStringNullsFilteredAsMvd()
   @Test
   public void testGroupByAndFilterVariant()
   {
+    // Returns incorrect results with MSQ

Review Comment:
   What's incorrect? We should fix correctness bugs with grouping on nested 
columns that we discover as part of shipping the feature.



##########
sql/src/test/java/org/apache/druid/sql/calcite/CalciteNestedDataQueryTest.java:
##########
@@ -7072,6 +7090,8 @@ public void testUnnestJsonQueryArraysJsonValueSum()
   @Test
   public void testJsonValueNestedEmptyArray()
   {
+    // Returns incorrect results with MSQ
+    msqIncompatible();

Review Comment:
   Why? Can we fix it?



##########
processing/src/main/java/org/apache/druid/frame/read/FrameReaderUtils.java:
##########
@@ -216,6 +219,60 @@ public static int compareByteArraysUnsigned(
     return Integer.compare(length1, length2);
   }
 
+  public static int compareComplexTypes(
+      final byte[] array1,
+      final int position1,
+      final byte[] array2,
+      final int position2,
+      final ColumnType columnType,
+      final ComplexMetricSerde complexMetricSerde
+  )
+  {
+    return compareComplexTypes(
+        Memory.wrap(array1),

Review Comment:
   The purpose of having these different overloads of the comparison methods is 
to avoid needing to wrap `byte[]` in `Memory` just to read one value. Wrapping 
in `Memory` is a somewhat expensive operation. It'd be better to add a 
`readFieldFromMemory` method that operates on byte arrays, and call that 
directly.
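
   A rough sketch of the shape such a byte[]-based read path could take. The 
method name, the `Deserializer` interface, and the little-endian length-prefix 
layout are all illustrative assumptions, not Druid's actual field format; the 
point is only that the field is read straight from the array without a 
`Memory.wrap()` call.

```java
// Hypothetical byte[]-based read path: read a little-endian int length prefix,
// then hand the payload slice to a deserializer, with no Memory/ByteBuffer view.
final class ByteArrayFieldReads
{
  interface Deserializer<T>
  {
    T deserialize(byte[] bytes, int offset, int length);
  }

  static <T> T readFieldFromByteArray(byte[] array, int position, Deserializer<T> deserializer)
  {
    // Decode the 4-byte little-endian length prefix directly from the array
    int length = (array[position] & 0xFF)
                 | (array[position + 1] & 0xFF) << 8
                 | (array[position + 2] & 0xFF) << 16
                 | (array[position + 3] & 0xFF) << 24;
    return deserializer.deserialize(array, position + Integer.BYTES, length);
  }
}
```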



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/querykit/DataSourcePlan.java:
##########
@@ -721,7 +721,6 @@ private static DataSourcePlan forSortMergeJoin(
     final List<KeyColumn> leftPartitionKey = partitionKeys.get(0);
     leftBuilder.shuffleSpec(new HashShuffleSpec(new 
ClusterBy(leftPartitionKey, 0), maxWorkerCount));
     
leftBuilder.signature(QueryKitUtils.sortableSignature(leftBuilder.getSignature(),
 leftPartitionKey));
-

Review Comment:
   seems like no reason to remove this line.



##########
processing/src/main/java/org/apache/druid/frame/key/FrameComparisonWidgetImpl.java:
##########
@@ -178,41 +198,69 @@ public int compare(int row, RowKey key)
     final long rowPosition = getRowPositionInDataRegion(row);
 
     long comparableBytesStartPositionInRow = firstFieldPosition;
-    int keyComparableBytesStartPosition = Integer.BYTES * keyFieldCount;
+    int comparableBytesStartPositionInKey = Integer.BYTES * keyFieldCount;
 
-    boolean ascending = true;
-    int field = 0;
+    // Number of fields compared till now, which is equivalent to the index of 
the field to compare next
+    int fieldsComparedTillNow = 0;
 
-    for (int numFields : ascDescRunLengths) {
-      if (numFields > 0) {
-        final int nextField = field + numFields;
-        final long comparableBytesEndPositionInRow = 
getFieldEndPositionInRow(rowPosition, nextField - 1);
-        final int keyComparableBytesEndPosition = 
RowKeyReader.fieldEndPosition(keyArray, nextField - 1);
+    for (RowKeyComparisonRunLengths.RunLengthEntry runLengthEntry : 
rowKeyComparisonRunLengths.getRunLengthEntries()) {
 
-        final long comparableBytesLength = comparableBytesEndPositionInRow - 
comparableBytesStartPositionInRow;
-        final int keyComparableBytesLength = keyComparableBytesEndPosition - 
keyComparableBytesStartPosition;
+      if (runLengthEntry.getRunLength() <= 0) {
+        // Defensive check
+        continue;
+      }
 
-        int cmp = FrameReaderUtils.compareMemoryToByteArrayUnsigned(
+      // Index of the next field that will get considered. Excludes the 
current field that we are comparing right now
+      final int nextField = fieldsComparedTillNow + 
runLengthEntry.getRunLength();
+      final int comparableBytesEndPositionInKey = 
RowKeyReader.fieldEndPosition(keyArray, nextField - 1);
+      final int comparableBytesEndPositionInRow = 
getFieldEndPositionInRow(rowPosition, nextField - 1);
+
+      final int cmp;
+
+      if (!runLengthEntry.isByteComparable()) {
+        // Only complex types are not byte comparable. Nested arrays aren't 
supported in MSQ
+        assert runLengthEntry.getRunLength() == 1;
+        // 'fieldsComparedTillNow' is the index of the current keyColumn in 
the keyColumns list. Sanity check that its
+        // a known complex type
+        ColumnType columnType = 
signature.getColumnType(keyColumns.get(fieldsComparedTillNow).columnName())

Review Comment:
   We should avoid this map lookup for `getColumnType(String)` and also avoid 
the lookup by string in `serdeMap`. That's expensive stuff to be doing for each 
comparison. Instead, how about a `serdes` field that is a pre-built 
`ComplexMetricSerde[]` (built in the constructor), indexed by position in 
`keyColumns`. Each entry in `serdes` is either a serde (for complex types) or 
`null` otherwise. The entire `serdes` should be null if there are no complex 
types.



##########
processing/src/main/java/org/apache/druid/segment/column/TypeStrategy.java:
##########
@@ -179,20 +179,38 @@ default T fromBytes(byte[] value)
    * true or false, depending on whether the semantics and implementation of 
the type naturally leads to groupability
    * or not. For example, it makes sense for JSON columns to be groupable, 
however there is little sense in grouping
    * sketches (before finalizing).
-   *
-   * If a type is groupable, it MUST implement the {@link #hashCode} and 
{@link #equals} correctly
+   * <p>
+   * If a type is groupable, following statements MUST hold:
+   * <p>
+   * a. {@link #equals(Object, Object)} must be implemented. It should return 
true if and only if two objects are equal
+   *    and can be grouped together.
+   * <p>
+   * b. {@link #hashCode(Object)} must be implemented, and must be consistent 
with equals. It should return a hashCode
+   *    for the given object. For two objects that are equal, it must return 
the same hash value. For two objects that are
+   *    not equal, it can return the same hash value or not (collision)
+   * <p>
+   * c. {@link #compare(Object, Object)} must be consistent with equals. Apart 
from abiding by the definition of
+   *    {@link Comparator#compare}, it must not return 0 for two objects that 
are not equals, and converse must also hold,
+   *    i.e. if the value returned by compare is not zero, then the arguments 
must not be equal. It must return 0 for two

Review Comment:
   I think the last sentence is redundant.



##########
processing/src/main/java/org/apache/druid/segment/column/TypeStrategy.java:
##########
@@ -179,20 +179,38 @@ default T fromBytes(byte[] value)
    * true or false, depending on whether the semantics and implementation of 
the type naturally leads to groupability
    * or not. For example, it makes sense for JSON columns to be groupable, 
however there is little sense in grouping
    * sketches (before finalizing).
-   *
-   * If a type is groupable, it MUST implement the {@link #hashCode} and 
{@link #equals} correctly
+   * <p>
+   * If a type is groupable, following statements MUST hold:
+   * <p>
+   * a. {@link #equals(Object, Object)} must be implemented. It should return 
true if and only if two objects are equal
+   *    and can be grouped together.
+   * <p>
+   * b. {@link #hashCode(Object)} must be implemented, and must be consistent 
with equals. It should return a hashCode
+   *    for the given object. For two objects that are equal, it must return 
the same hash value. For two objects that are
+   *    not equal, it can return the same hash value or not (collision)

Review Comment:
   Should note here that even though collisions are technically ok from a 
correctness PoV, they will lead to bad performance in the grouping engine, so 
they should be minimized.



##########
processing/src/test/java/org/apache/druid/frame/key/ByteRowKeyComparatorTest.java:
##########
@@ -35,20 +41,39 @@
 
 public class ByteRowKeyComparatorTest extends InitializedNullHandlingTest
 {
+
+  static {
+    ComplexMetrics.registerSerde(HyperUniquesSerde.TYPE_NAME, new 
HyperUniquesSerde());
+  }
+
   static final RowSignature SIGNATURE =
       RowSignature.builder()
                   .add("1", ColumnType.LONG)
                   .add("2", ColumnType.STRING)
                   .add("3", ColumnType.LONG)
                   .add("4", ColumnType.DOUBLE)
+                  .add("5", HyperUniquesAggregatorFactory.TYPE)

Review Comment:
   Include a test where the complex type is at the start, and somewhere in the 
middle, to ensure we have the edge cases handled.



##########
processing/src/test/java/org/apache/druid/frame/key/RowKeyComparisonRunLengthsTest.java:
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.frame.key;
+
+import com.google.common.collect.ImmutableList;
+import org.apache.druid.error.DruidException;
+import org.apache.druid.segment.column.ColumnType;
+import org.apache.druid.segment.column.RowSignature;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.util.Collections;
+import java.util.List;
+
+public class RowKeyComparisonRunLengthsTest

Review Comment:
   In a situation like this, I like to have one test that does a whole space of 
things. Here I suggest a test case that checks all possible length-3 keys where 
we vary each key element's type between string and complex, and direction 
between asc and desc. So that's 4 possibilities for each key element (string 
asc, string desc, complex asc, complex desc) and therefore 64 keys we're 
testing. It will run fast since it's only 64 cases, and it gets good coverage 
of different possibilities for runs and orderings.
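
   A sketch of the suggested exhaustive generator, with simplified stand-in 
types (`Type`, `Dir`, `KeyPart` are illustrative, not Druid's `KeyColumn` / 
`KeyOrder`): every length-3 key where each element varies over {STRING, 
COMPLEX} x {ASC, DESC}, yielding the 64 cases described above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative exhaustive enumeration of all length-3 keys over
// (string|complex) x (asc|desc) per element: 4^3 = 64 cases.
final class KeySpaceEnumeration
{
  enum Type { STRING, COMPLEX }
  enum Dir { ASC, DESC }

  record KeyPart(Type type, Dir dir) {}

  static List<List<KeyPart>> allLength3Keys()
  {
    List<List<KeyPart>> keys = new ArrayList<>();
    for (Type t1 : Type.values()) {
      for (Dir d1 : Dir.values()) {
        for (Type t2 : Type.values()) {
          for (Dir d2 : Dir.values()) {
            for (Type t3 : Type.values()) {
              for (Dir d3 : Dir.values()) {
                keys.add(List.of(new KeyPart(t1, d1), new KeyPart(t2, d2), new KeyPart(t3, d3)));
              }
            }
          }
        }
      }
    }
    return keys;
  }
}
```

   Each generated key would then be fed through the run-lengths computation and 
its expected result asserted.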



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

