FrankChen021 commented on issue #17891:
URL: https://github.com/apache/druid/issues/17891#issuecomment-2788399921

   The problem is clear now.
   
   For the `IN` operator in SQL, the planner first converts it to `BoundFilter`,
then tries to convert that into `SelectorDimFilter` (step 1), and finally tries
to optimize the selector dim filters into an `InDimFilter` (step 2).
   
   At step 1, 
   
   ```java
         final StringComparator comparator = RowSignatures.getNaturalStringComparator(
             rowSignature,
             SimpleExtraction.of(bound.getDimension(), bound.getExtractionFn())
         );
   
         if (bound.hasUpperBound()
             && bound.hasLowerBound()
             && bound.getUpper().equals(bound.getLower())
             && !bound.isUpperStrict()
             && !bound.isLowerStrict()
             && bound.getOrdering().equals(comparator)) {
           return new SelectorDimFilter(
               bound.getDimension(),
               bound.getUpper(),
               bound.getExtractionFn()
           );
         } else {
           return filter;
         }
   ```
   
   The check `bound.getOrdering().equals(comparator)` is false in this case on
Druid 27, because `bound.getOrdering()` is the numeric comparator while the
comparator computed from the `rowSignature` is the string comparator. As a
result, the `BoundFilter` is not converted to a `SelectorDimFilter`, which
means step 2 does not run.
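
   To make the failure mode concrete, here is a minimal, self-contained sketch
of the two steps. It is not Druid code: `BoundToInSketch`, `Ordering`, `Bound`,
`convertsToSelector` and `mergeToIn` are hypothetical stand-ins, and the model
assumes that step 2 only ever sees bounds that step 1 turned into selectors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BoundToInSketch {
    // Hypothetical stand-in for the comparator kinds; not a Druid class.
    enum Ordering { STRING, NUMERIC }

    // Simplified model of a non-strict bound filter: lower <= dimension <= upper.
    static final class Bound {
        final String dimension;
        final String lower;
        final String upper;
        final Ordering ordering;

        Bound(String dimension, String lower, String upper, Ordering ordering) {
            this.dimension = dimension;
            this.lower = lower;
            this.upper = upper;
            this.ordering = ordering;
        }
    }

    // Step 1 (simplified): a bound becomes a selector only if lower == upper
    // AND its ordering equals the column's natural ordering.
    static boolean convertsToSelector(Bound bound, Ordering naturalOrdering) {
        return bound.lower != null
            && bound.lower.equals(bound.upper)
            && bound.ordering == naturalOrdering;
    }

    // Step 2 (simplified): collect the values of all bounds that became
    // selectors. An empty result means there was nothing to merge into IN.
    static List<String> mergeToIn(List<Bound> bounds, Ordering naturalOrdering) {
        List<String> values = new ArrayList<>();
        for (Bound bound : bounds) {
            if (convertsToSelector(bound, naturalOrdering)) {
                values.add(bound.upper);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // The planner built numeric bounds for `id IN (1, 2)`.
        List<Bound> bounds = Arrays.asList(
            new Bound("id", "1", "1", Ordering.NUMERIC),
            new Bound("id", "2", "2", Ordering.NUMERIC)
        );
        // Column type resolved as STRING: orderings differ, nothing is merged.
        System.out.println(mergeToIn(bounds, Ordering.STRING));  // prints []
        // Column type resolved as LONG: orderings match, values are merged.
        System.out.println(mergeToIn(bounds, Ordering.NUMERIC)); // prints [1, 2]
    }
}
```

   In this model the only difference between the two calls is the ordering
derived from the column type, which mirrors why the resolved column type
decides whether an `InDimFilter` is produced.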
   
   After setting `druid.sql.planner.metadataColumnTypeMergePolicy` to
`latestInterval`, the column type of `id` is computed as LONG in this case, so
steps 1 and 2 above both work and generate an `InDimFilter` as expected.
   
   BTW, SQL that contains lots of items in the `IN` operator takes seconds
to generate a native query plan, while the native query itself executes in
under a second. I'm not sure whether
https://github.com/apache/druid/pull/16039 addresses this performance problem;
I will check it later.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

