spapin commented on code in PR #18587:
URL: https://github.com/apache/pinot/pull/18587#discussion_r3498585715


##########
pinot-core/src/main/java/org/apache/pinot/core/query/optimizer/filter/IdenticalPredicateFilterOptimizer.java:
##########
@@ -43,62 +61,45 @@ boolean canBeOptimized(Expression filterExpression, 
@Nullable Schema schema) {
   @Override
   Expression optimizeChild(Expression filterExpression, @Nullable Schema 
schema) {
     Function function = filterExpression.getFunctionCall();
-    FilterKind kind = FilterKind.valueOf(function.getOperator());
-    switch (kind) {
-      case EQUALS:
-        if (hasIdenticalLhsAndRhs(function.getOperands())) {
-          return TRUE;
-        }
-        break;
-      case NOT_EQUALS:
-        if (hasIdenticalLhsAndRhs(function.getOperands())) {
-          return FALSE;
-        }
-        break;
-      default:
-        break;
+    if (FilterKind.valueOf(function.getOperator()) == FilterKind.EQUALS) {
+      Optional<Boolean> folded = 
foldIdenticalComparisonWithTrueLiteral(function.getOperands());
+      if (folded.isPresent()) {
+        return getExpressionFromBoolean(folded.get());
+      }
     }
     return filterExpression;
   }
 
   /**
-   * Pinot queries of the WHERE 1 != 1 AND "col1" = "col2" variety are 
rewritten as
-   * 1-1 != 0 AND "col1"-"col2" = 0. Therefore, we check specifically for the 
case where
-   * the operand is set up in this fashion.
-   *
-   * We return false specifically after every check to ensure we're only 
continuing when
-   * the input looks as expected. Otherwise, it's easy to for one of the 
operand functions
-   * to return null and fail the query.
-   *
-   * TODO: The rewrite is already happening in 
PredicateComparisonRewriter.updateFunctionExpression(),
-   * so we might just compare the lhs and rhs there.
+   * Folds a predicate of the form 'comparison(a, a) = true' — a comparison 
whose two operands are the <em>same</em>
+   * expression — to the constant value it must always have. Returns an empty 
{@link Optional} when the predicate is
+   * not such a comparison: the operands differ, the right-hand side is not 
the literal {@code true}, or the function
+   * is a non-comparison like 'startsWith(a, a) = true' whose value cannot be 
determined here.
    */
-  private boolean hasIdenticalLhsAndRhs(List<Expression> operands) {
-    boolean hasTwoChildren = operands.size() == 2;
-    Expression firstChild = operands.get(0);
-    if (firstChild.getFunctionCall() == null || !hasTwoChildren) {
-      return false;
+  private Optional<Boolean> 
foldIdenticalComparisonWithTrueLiteral(List<Expression> operands) {

Review Comment:
   Good point. I think if this optimization becomes necessary, we can first ask 
users to rewrite `a = a` or re-add the optimization and be specific this is for 
performance, not correctness.  So I'd leave as is.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to