[GitHub] [pinot] mqliang commented on a change in pull request #8433: Adjust comparison of numDocsScanned

GitBox Tue, 29 Mar 2022 18:30:48 -0700


mqliang commented on a change in pull request #8433:
URL: https://github.com/apache/pinot/pull/8433#discussion_r838036344




##########
File path: 
pinot-common/src/main/java/org/apache/pinot/common/utils/SqlResultComparator.java
##########
@@ -126,29 +126,32 @@ public static boolean areEqual(JsonNode actual, JsonNode 
expected, String query)
      */
     if (expected.has(FIELD_IS_SUPERSET) && 
expected.get(FIELD_IS_SUPERSET).asBoolean(false)) {
       return areElementsSubset(actualElementsSerialized, 
expectedElementsSerialized);
-    } else {
-      if (!areLengthsEqual(actual, expected)) {
-        return false;
-      }
-      /*
-       * Pinot server do some early termination optimization (process in 
parallel and early return if get enough
-       * documents to fulfill the LIMIT and OFFSET requirement) for queries:
-       * - selection without order by
-       * - selection with order by (by sorting the segments on min-max value)
-       * - DISTINCT queries.
-       * numDocsScanned is non-deterministic for those queries so 
numDocsScanned comparison should be skipped.
-       *
-       * NOTE: DISTINCT queries are modeled as non-selection queries during 
query processing, but all DISTINCT
-       * queries are selection queries during Calcite parsing (DISTINCT 
queries are selection queries with
-       * selectNode.getModifierNode(SqlSelectKeyword.DISTINCT) != null).
-       */
-      if (!isSelectionQuery(query) && !areNumDocsScannedEqual(actual, 
expected)) {
-        return false;
-      }
-      return isOrderByQuery(query) ? areOrderByQueryElementsEqual(actualRows, 
expectedRows, actualElementsSerialized,
-          expectedElementsSerialized, query)
-          : areNonOrderByQueryElementsEqual(actualElementsSerialized, 
expectedElementsSerialized);
     }
+    if (!areLengthsEqual(actual, expected)) {
+      return false;
+    }
+    boolean areResultsEqual = isOrderByQuery(query) ? 
areOrderByQueryElementsEqual(actualRows, expectedRows,
+        actualElementsSerialized, expectedElementsSerialized, query)
+        : areNonOrderByQueryElementsEqual(actualElementsSerialized, 
expectedElementsSerialized);
+    /*
+     * Pinot servers implement early termination optimization (process in 
parallel and early return if we get enough
+     * documents to fulfill the LIMIT and OFFSET requirement) for queries for 
the following cases:
+     * - selection without order by
+     * - selection with order by (by sorting the segments on min-max value)
+     * - DISTINCT queries.
+     * numDocsScanned is non-deterministic for those queries so numDocsScanned 
comparison should be skipped.
+     *
+     * In other cases, we accept if numDocsScanned is better than the expected 
one, since that is indicative of
+     * performance improvement.
+     *
+     * NOTE: DISTINCT queries are modeled as non-selection queries during 
query processing, but all DISTINCT
+     * queries are selection queries during Calcite parsing (DISTINCT queries 
are selection queries with
+     * selectNode.getModifierNode(SqlSelectKeyword.DISTINCT) != null).
+     */
+    if (areResultsEqual && !isSelectionQuery(query) && 
!isNumDocsScannedBetter(actual, expected)) {
+      return false;
+    }
+    return areResultsEqual;

Review comment:
       👌




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [pinot] mqliang commented on a change in pull request #8433: Adjust comparison of numDocsScanned

Reply via email to