cheezman34 opened a new issue #12366:
URL: https://github.com/apache/druid/issues/12366


   ### Affected Version
   
   21.1 (didn't see anything related to this in the notes for 22.1 when I 
scanned through them)
   
   ### Description
   
   When I run EXPLAIN PLAN for this SQL:
   
   ```
   SELECT "user"
   FROM my_table
   WHERE "user" > 'y' and "user" != '-'
     AND ('2022-02-22T00:00:00Z' <= "__time"
          AND "__time" < TIME_PARSE('2022-02-23T00:00:00Z'))
   GROUP BY 1
   ```
   I see this in the explain results:
   ```
     "intervals": {
       "type": "intervals",
       "intervals": [
         "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
       ]
     },
     "filter": {
       "type": "not",
       "field": {
         "type": "or",
         "fields": [
           {
             "type": "not",
             "field": {
               "type": "bound",
               "dimension": "__time",
               "lower": "1645488000000",
               "upper": "1645574400000",
               "lowerStrict": false,
               "upperStrict": true,
               "extractionFn": null,
               "ordering": {
                 "type": "numeric"
               }
             }
           },
           {
             "type": "bound",
             "dimension": "user",
             "lower": null,
             "upper": "y",
             "lowerStrict": false,
             "upperStrict": false,
             "extractionFn": null,
             "ordering": {
               "type": "lexicographic"
             }
           }
         ]
       }
     },
   ```
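As a sanity check on the plan above, the `lower` and `upper` values of the `__time` bound are epoch milliseconds. The following is a minimal Python sketch (not part of Druid; the helper name `millis_to_iso` is made up for illustration) that converts them back to UTC timestamps, to confirm that the bound filter describes the same day as the SQL predicate:

```python
from datetime import datetime, timezone

def millis_to_iso(millis: str) -> str:
    """Convert an epoch-millisecond string (as shown in the plan) to an ISO-8601 UTC timestamp."""
    return datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc).isoformat()

# Values copied from the "bound" filter on __time in the EXPLAIN output.
lower = millis_to_iso("1645488000000")
upper = millis_to_iso("1645574400000")
print(lower, upper)
```

Both values land on midnight UTC of 2022-02-22 and 2022-02-23, so the time range itself is preserved; it just ends up in the filter rather than in the intervals.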
   
   Presumably Druid applies some heuristics to try to optimize filters, 
but in this particular case it rewrites the WHERE clause as a negated OR with a 
nested negation around the __time bound. I would expect intervals 
to look like:
   ```
       "intervals": [
         "2022-02-22T00:00:00.000Z/2022-02-23T00:00:00.000Z"
       ]
   ```
   but instead, it covers all of time, and the __time bound shows up as a 
secondary filter under "filter".
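For reference, the filter in the plan has the shape `not(or(not(time_bound), user <= 'y'))`, which by De Morgan's laws is logically the same as `time_bound and not(user <= 'y')`. The sketch below is an illustrative Python helper (the name `push_not` is made up here, and this is not Druid's actual optimizer code) that pushes negations down to the leaves of a filter tree shaped like the EXPLAIN output:

```python
def push_not(f, negate=False):
    """Push negations toward the leaves using De Morgan's laws.

    `f` is a dict shaped like the filter JSON in the plan:
    "not" has a "field", "and"/"or" have "fields", anything else is a leaf.
    """
    t = f["type"]
    if t == "not":
        # A "not" just flips the pending negation (double negation cancels).
        return push_not(f["field"], not negate)
    if t in ("and", "or"):
        # Under negation, "and" becomes "or" and vice versa.
        flipped = {"and": "or", "or": "and"}[t] if negate else t
        return {"type": flipped, "fields": [push_not(c, negate) for c in f["fields"]]}
    # Leaf filter (e.g. "bound"): wrap in "not" only if a negation is pending.
    return {"type": "not", "field": f} if negate else f

# Trimmed copy of the filter from the EXPLAIN output.
plan_filter = {
    "type": "not",
    "field": {
        "type": "or",
        "fields": [
            {"type": "not", "field": {
                "type": "bound", "dimension": "__time",
                "lower": "1645488000000", "upper": "1645574400000",
                "lowerStrict": False, "upperStrict": True}},
            {"type": "bound", "dimension": "user",
             "lower": None, "upper": "y",
             "lowerStrict": False, "upperStrict": False},
        ],
    },
}

simplified = push_not(plan_filter)
print(simplified)
```

In the simplified form the `__time` bound is a direct child of a top-level `and`, which is the shape from which I would expect it to be moved into the intervals.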
   
   Workarounds for anyone seeing similar issues:
   1) Optimize your filters yourself before you send them to Druid. The first 
filter, `"user" > 'y'`, makes the second filter unnecessary. In my case, the SQL 
was generated programmatically, which is why the redundant filter showed up.
   2) For some reason, if "user" is bounded on both ends ( `"user" > 'y' AND 
"user" <= 'z'` ) then EXPLAIN PLAN shows the expected, narrower intervals.
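Workaround 1 could be sketched roughly as follows for programmatically generated SQL. This is a hypothetical Python helper (`drop_redundant_not_equals` is not a Druid API), and it assumes plain string comparison by code point, which matches lexicographic ordering for ASCII values like the ones in this issue:

```python
def drop_redundant_not_equals(lower, lower_strict, not_equals):
    """Return only the `!=` values that the lower bound does not already exclude.

    lower:        lower bound on the column, e.g. 'y' for `"user" > 'y'`
    lower_strict: True for `>`, False for `>=`
    not_equals:   values from `!=` conditions on the same column
    """
    kept = []
    for value in not_equals:
        # The bound already excludes anything below it, and the bound value
        # itself when the comparison is strict.
        excluded = value < lower or (lower_strict and value == lower)
        if not excluded:
            kept.append(value)
    return kept

# `"user" > 'y' AND "user" != '-'`: '-' sorts before 'y', so the != is redundant.
remaining = drop_redundant_not_equals("y", True, ["-"])
print(remaining)
```

With the redundant `!=` dropped before the SQL is generated, the query reduces to a single bound on "user" plus the time range.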
   
   Anyway, not a critical bug, but probably worth fixing at some point.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


