sascha-coenen opened a new issue #9412: Regression: SQL query with multiple similar CASE clauses cannot be translated to correct native json format URL: https://github.com/apache/druid/issues/9412 ### Affected Version v0.17.0 This is a serious regression bug for us. Queries like the following used to work in Druid 0.16.0 and are broken in 0.17.0 We used Druid 0.17.0 with SQL compatible null handling DISABLED. ### Description The following query produces a rule error in the SQL engine: ``` SELECT CASE cityName WHEN NULL THEN FALSE ELSE TRUE END AS col_a, CASE countryName WHEN NULL THEN FALSE ELSE TRUE END AS col_b FROM wikipedia GROUP BY 1, 2 ``` the error is: ``` Unknown exception / Error while applying rule DruidQueryRule(AGGREGATE), args [rel#592:LogicalAggregate.NONE.[](input=RelSubset#591,group={0, 1}), rel#599:DruidQueryRel.NONE.[](query={"queryType":"scan","dataSource":{"type":"table","name":"wikipedia"},"intervals":{"type":"intervals","intervals":["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"]},"virtualColumns":[{"type":"expression","name":"v0","expression":"1","outputType":"LONG"}],"resultFormat":"compactedList","batchSize":20480,"limit":9223372036854775807,"order":"none","filter":null,"columns":["v0"],"legacy":false,"context":{"forceLimitPushDown":"false","sqlOuterLimit":100,"sqlQueryId":"f83cf8c0-1856-4dc9-aeb6-fc487b92750f","useApproximateTopN":"false","vectorize":"false"},"descending":false,"granularity":{"type":"all"}},signature={v0:LONG, v0:LONG})] / java.lang.RuntimeException ``` Following is a comparison between what a similar query got translated to in Druid 0.16.0 vs now in 0.16.0: ``` { "queryType": "groupBy", ... "virtualColumns": [ { "type": "expression", "name": "v0", "expression": "case_searched((\"integrationType\" == null),0,1)", "outputType": "LONG" }, { "type": "expression", "name": "v1", "expression": "case_searched((\"integrationVersion\" == null),0,1)", "outputType": "LONG" } ], "dimensions": [ { "type": "default", "dimension": "v0", "outputName": "v0", "outputType": "LONG" }, { "type": "default", "dimension": "v1", "outputName": "v1", "outputType": "LONG" } ], ... } ``` In v 0.17.0, there seem to be several things that compound: a) the query is rewritten into a scan query although it augh to be a groupby query b) the main issue seems to be that instead of recognizing the two CASE statements as being different and generating different logical names "v0" and "v1" for them, the two statements are being given the same name "v0" which is then consecutively leading to an illegal query syntax. c) instead of faithful CASE expressions, a virtual column expression gets generated that is just a constant number "1" ``` { "queryType":"scan", ... "virtualColumns":[ { "type":"expression", "name":"v0", "expression":"1", "outputType":"LONG" } ], "columns":[ "v0" ], ... } ```
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
