sascha-coenen opened a new issue #9412: Regression: SQL query with multiple 
similar CASE clauses cannot be translated to correct native json format
URL: https://github.com/apache/druid/issues/9412
 
 
   ### Affected Version
   v0.17.0
   This is a serious regression bug for us. Queries like the following used to 
work in Druid 0.16.0 and are broken in 0.17.0
   We used Druid 0.17.0 with SQL compatible null handling DISABLED.
   
   ### Description
   
   The following query produces a rule error in the SQL engine:
   ```
   SELECT 
        CASE cityName WHEN NULL THEN FALSE ELSE TRUE END AS col_a,
        CASE countryName WHEN NULL THEN FALSE ELSE TRUE END AS col_b
   FROM wikipedia
   GROUP BY 1, 2
   ```
   
   the error is:
   
   ```
   Unknown exception / Error while applying rule DruidQueryRule(AGGREGATE), 
args [rel#592:LogicalAggregate.NONE.[](input=RelSubset#591,group={0, 1}), 
rel#599:DruidQueryRel.NONE.[](query={"queryType":"scan","dataSource":{"type":"table","name":"wikipedia"},"intervals":{"type":"intervals","intervals":["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"]},"virtualColumns":[{"type":"expression","name":"v0","expression":"1","outputType":"LONG"}],"resultFormat":"compactedList","batchSize":20480,"limit":9223372036854775807,"order":"none","filter":null,"columns":["v0"],"legacy":false,"context":{"forceLimitPushDown":"false","sqlOuterLimit":100,"sqlQueryId":"f83cf8c0-1856-4dc9-aeb6-fc487b92750f","useApproximateTopN":"false","vectorize":"false"},"descending":false,"granularity":{"type":"all"}},signature={v0:LONG,
 v0:LONG})] / java.lang.RuntimeException
   ```
   
   Following is a comparison between what a similar query got translated to in 
Druid 0.16.0 vs now
   
   in 0.16.0:
   ```
   {
     "queryType": "groupBy",
   ...
     "virtualColumns": [
       {
         "type": "expression",
         "name": "v0",
         "expression": "case_searched((\"integrationType\" == null),0,1)",
         "outputType": "LONG"
       },
       {
         "type": "expression",
         "name": "v1",
         "expression": "case_searched((\"integrationVersion\" == null),0,1)",
         "outputType": "LONG"
       }
     ],
     "dimensions": [
       {
         "type": "default",
         "dimension": "v0",
         "outputName": "v0",
         "outputType": "LONG"
       },
       {
         "type": "default",
         "dimension": "v1",
         "outputName": "v1",
         "outputType": "LONG"
       }
     ],
   ...
   }
   ```
   
   
   In v 0.17.0, there seem to be several things that compound:
   a) the query is rewritten into a scan query although it augh to be a groupby 
query
   b) the main issue seems to be that instead of recognizing the two CASE 
statements as being different and generating different logical names "v0" and 
"v1" for them, the two statements are being given the same name "v0" which is 
then consecutively leading to an illegal query syntax.
   c) instead of faithful CASE expressions, a virtual column expression gets 
generated that is just a constant number "1" 
   
   ```
   {
      "queryType":"scan",
   ...
      "virtualColumns":[
         {
            "type":"expression",
            "name":"v0",
            "expression":"1",
            "outputType":"LONG"
         }
      ],
      "columns":[
         "v0"
      ],
   ...
   }
   ```
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to