renatocron opened a new issue, #12620:
URL: https://github.com/apache/druid/issues/12620

   When executing the following query (using the wikipedia sample data):
   
   ```SQL
   SELECT 
       MAX("added") as last_added, 
       case when cityName IN ('Abbotsford','Aglayan' ) then '-1' else 0 end as 
MyFavCitiesFirst 
   FROM wikipedia 
   WHERE 
       __time >= CURRENT_TIMESTAMP - INTERVAL '7' YEAR 
       AND 
       ( 
          cityName IN ('Abbotsford', 'Aglayan') 
       ) 
   GROUP BY 
       cityName
   ORDER BY MyFavCitiesFirst, last_added DESC 
   LIMIT 1 
   ```
   Calcites try to otimize somethings and fails:
   ```
   WARN [query-id] org.apache.druid.sql.http.SqlResource - Failed to handle 
query: SqlQuery{query='SELECT...
   ', resultFormat=array, header=true, context={timeout=13033, 
sqlOuterLimit=101}, parameters=[]}
   java.util.NoSuchElementException: null 
   at java.util.Optional.orElseThrow(Optional.java:290) ~[?:1.8.0_275] 
   at 
org.apache.calcite.sql2rel.RelStructuredTypeFlattener.getNewInputFieldByNewOrdinal(RelStructuredTypeFlattener.java:311)
 ~[calcite-core-1.21.0.jar:1.21.0] 
   at 
org.apache.calcite.sql2rel.RelStructuredTypeFlattener.getNewFieldForOldInput(RelStructuredTypeFlattener.java:301)
 ~[calcite-core-1.21.0.jar:1.21.0] 
   at 
org.apache.calcite.sql2rel.RelStructuredTypeFlattener.getNewFieldForOldInput(RelStructuredTypeFlattener.java:322)
 ~[calcite-core-1.21.0.jar:1.21.0] 
   at 
org.apache.calcite.sql2rel.RelStructuredTypeFlattener$RewriteRexShuttle.visitInputRef(RelStructuredTypeFlattener.java:809)
 ~[calcite-core-1.21.0.jar:1.21.0]
   at 
org.apache.calcite.sql2rel.RelStructuredTypeFlattener$RewriteRexShuttle.visitInputRef(RelStructuredTypeFlattener.java:806)
 ~[calcite-core-1.21.0.jar:1.21.0]
   at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112) 
~[calcite-core-1.21.0.jar:1.21.0] 
   at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:149) 
~[calcite-core-1.21.0.jar:1.21.0] 
   ```
   Full log at 
https://gist.github.com/renatocron/3ef274ec83627b258c4b376b7f9c1cb6
   
   If you change the query to add some item to the first or the second list, 
the query runs successfully:
   ```SQL
   SELECT 
       MAX("added") as last_added, 
       case when cityName IN ('Abbotsford','Aglayan', '') then '-1' else 0 end 
as MyFavCitiesFirst 
   FROM wikipedia 
   WHERE 
       __time >= CURRENT_TIMESTAMP - INTERVAL '7' YEAR 
       AND 
       ( 
          cityName IN ('Abbotsford', 'Aglayan') 
       ) 
   
   GROUP BY 
       cityName
   ORDER BY MyFavCitiesFirst, last_added DESC 
   LIMIT 1 
   ```
   Should return:
   
   |last_added|MyFavCitiesFirst|
   |----------|----------------|
   |10        |-1              |
   
   
   ### Affected Version
   
   Only tested over 0.22.1 (apache/druid:0.22.1 image, 
druid.extensions.loadList=["druid-kafka-indexing-service", 
"druid-s3-extensions", "postgresql-metadata-storage", "druid-histogram", 
"druid-datasketches", "druid-avro-extensions", "druid-bloom-filter"])
   
   anyone is welcome to test other versions and post the results here, as the 
wikipedia sample is easy to get up!
   
   You may ask, why would I run such query? Well, this is the reduced version 
for easy testing that trigger the bug, the original query was already an 
attempt to solve another that I will open after that one!
   
   ```SQL
   SELECT 
       MAX("added") as last_added, 
       case when cityName IN ('Abbotsford','Aglayan', '') then '-1' else 0 end 
as MyFavCitiesFirst 
   FROM wikipedia 
   WHERE 
       __time >= CURRENT_TIMESTAMP - INTERVAL '7' YEAR 
       AND 
       ( 
        (
          1 = 1 OR -- only to feature-check this condition for only some runs 
          cityName IN ('Abbotsford', 'Aglayan', '') 
          AND commentLength > 10 -- for example, views on pages..
          ) OR (
              cityName IN ('OTHER', 'list','of','generic','cities')    
              AND isRobot = '0' -- some other conditions .. 
          )
       ) 
   GROUP BY 
       cityName
   ORDER BY MyFavCitiesFirst, last_added DESC 
   LIMIT 1 
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to