juhoautio-rovio opened a new issue, #13775:
URL: https://github.com/apache/druid/issues/13775

   ### Affected Version
   
   25.0.0
   
   ### Description
   
   Grouping by a constant empty string doesn't make much sense, but we 
happened to have a query like that, and it worked with Druid version 
0.17.0.
   
   Now, with Druid 25.0.0 this query fails:
   ```sql
   SELECT
       '' as my_group,
       sum(double_col_1) + sum(double_col_2) as sum_double_cols
   FROM druid.my_datasource
   WHERE
       CAST(__time AS DATE) BETWEEN (DATE '2018-05-01') AND (DATE '2018-09-07')
   GROUP BY 1;
   ```
   
   The error is:
   ```
   [00000][-1] Error -1 (00000) : Error while executing SQL "SELECT
   '' as my_group,
   sum(double_col_1) + sum(double_col_2) as sum_double_cols
   FROM druid.my_datasource
   WHERE
   CAST(__time AS DATE) BETWEEN (DATE '2018-05-01') AND (DATE '2018-09-07')
   GROUP BY 1": Remote driver error: QueryInterruptedException: 
java.lang.AssertionError: Cannot add expression of different type to set:
   set type is RecordType(CHAR(0) NOT NULL my_group, DOUBLE NOT NULL 
sum_double_cols) NOT NULL
   expression type is RecordType(CHAR(0) my_group, DOUBLE NOT NULL 
sum_double_cols) NOT NULL
   set is 
rel#30743:LogicalProject.NONE.[](input=HepRelVertex#30742,my_group=$0,sum_double_cols=+($1,
 $2))
   expression is LogicalProject(my_group=[null:CHAR(0)], sum_double_cols=[+($1, 
$2)])
   LogicalAggregate(group=[{0}], agg#0=[SUM($1)], agg#1=[SUM($2)])
   LogicalProject(my_group=[''], double_col_1=[$2], double_col_2=[$1])
   LogicalFilter(condition=[AND(>=(CAST($0):DATE NOT NULL, 2018-05-01), 
<=(CAST($0):DATE NOT NULL, 2018-09-07))])
   LogicalProject(__time=[$0], double_col_2=[$23], double_col_1=[$31])
   LogicalTableScan(table=[[druid, my_datasource]])
   -> RuntimeException: java.lang.AssertionError: Cannot add expression of 
different type to set:
   set type is RecordType(CHAR(0) NOT NULL my_group, DOUBLE NOT NULL 
sum_double_cols) NOT NULL
   expression type is RecordType(CHAR(0) my_group, DOUBLE NOT NULL 
sum_double_cols) NOT NULL
   set is 
rel#30743:LogicalProject.NONE.[](input=HepRelVertex#30742,my_group=$0,sum_double_cols=+($1,
 $2))
   expression is LogicalProject(my_group=[null:CHAR(0)], sum_double_cols=[+($1, 
$2)])
   LogicalAggregate(group=[{0}], agg#0=[SUM($1)], agg#1=[SUM($2)])
   LogicalProject(my_group=[''], double_col_1=[$2], double_col_2=[$1])
   LogicalFilter(condition=[AND(>=(CAST($0):DATE NOT NULL, 2018-05-01), 
<=(CAST($0):DATE NOT NULL, 2018-09-07))])
   LogicalProject(__time=[$0], double_col_2=[$23], double_col_1=[$31])
   LogicalTableScan(table=[[druid, my_datasource]])
   -> AssertionError: Cannot add expression of different type to set:
   set type is RecordType(CHAR(0) NOT NULL my_group, DOUBLE NOT NULL 
sum_double_cols) NOT NULL
   expression type is RecordType(CHAR(0) my_group, DOUBLE NOT NULL 
sum_double_cols) NOT NULL
   set is 
rel#30743:LogicalProject.NONE.[](input=HepRelVertex#30742,my_group=$0,sum_double_cols=+($1,
 $2))
   expression is LogicalProject(my_group=[null:CHAR(0)], sum_double_cols=[+($1, 
$2)])
   LogicalAggregate(group=[{0}], agg#0=[SUM($1)], agg#1=[SUM($2)])
   LogicalProject(my_group=[''], double_col_1=[$2], double_col_2=[$1])
   LogicalFilter(condition=[AND(>=(CAST($0):DATE NOT NULL, 2018-05-01), 
<=(CAST($0):DATE NOT NULL, 2018-09-07))])
   LogicalProject(__time=[$0], double_col_2=[$23], double_col_1=[$31])
   LogicalTableScan(table=[[druid, my_datasource]])
   
   ```
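Reading the two record types in the error side by side, the only difference appears to be the nullability of `my_group`: the existing set has `CHAR(0) NOT NULL`, while the rewritten expression (where the literal shows up as `null:CHAR(0)`) has a nullable `CHAR(0)`. The following is a minimal Python sketch that just diffs the two type strings copied from the message above. The helper names `parse_record_type` and `diff_fields` are made up for illustration, and the parsing only handles the simple shapes seen in this error.

```python
import re

# Both strings are copied verbatim from the error message above.
SET_TYPE = "RecordType(CHAR(0) NOT NULL my_group, DOUBLE NOT NULL sum_double_cols) NOT NULL"
EXPR_TYPE = "RecordType(CHAR(0) my_group, DOUBLE NOT NULL sum_double_cols) NOT NULL"


def parse_record_type(text):
    """Return {field_name: (type_name, nullable)} for a simple RecordType string.

    Hypothetical helper: it assumes fields are separated by ", " and that
    no type contains a comma, which holds for the two strings above.
    """
    inner = re.match(r"RecordType\((.*)\)(?: NOT NULL)?$", text).group(1)
    fields = {}
    for part in inner.split(", "):
        tokens = part.split(" ")
        name = tokens[-1]
        not_null = tokens[-3:-1] == ["NOT", "NULL"]
        type_tokens = tokens[:-3] if not_null else tokens[:-1]
        fields[name] = (" ".join(type_tokens), not not_null)
    return fields


def diff_fields(left, right):
    """Return the fields whose (type_name, nullable) pair differs."""
    return {
        name: (left.get(name), right.get(name))
        for name in left
        if left.get(name) != right.get(name)
    }


mismatch = diff_fields(parse_record_type(SET_TYPE), parse_record_type(EXPR_TYPE))
print(mismatch)
```

Only `my_group` is reported, with the same type name on both sides and a different nullability flag; `sum_double_cols` is identical in both record types.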
   
   Replacing the `my_group` literal with a single-space string makes the query 
pass:
   ```sql
   SELECT
       ' ' as my_group,
       sum(double_col_1) + sum(double_col_2) as sum_double_cols
   FROM druid.my_datasource
   WHERE
       CAST(__time AS DATE) BETWEEN (DATE '2018-05-01') AND (DATE '2018-09-07')
   GROUP BY 1;
   ```
   
   Also, the query passes if I don't apply any further operation to the 
result of `sum`. For example, this modification, still with an empty string 
for `my_group`, succeeds:
   ```sql
   SELECT
       '' as my_group,
       sum(double_col_1 + double_col_2) as sum_double_cols
   FROM druid.my_datasource
   WHERE
       CAST(__time AS DATE) BETWEEN (DATE '2018-05-01') AND (DATE '2018-09-07')
   GROUP BY 1;
   ```
   
   Other information:
   - **Cluster size:** reproduced the issue on both a smaller and a bigger cluster
   - **Configurations in use:** I didn't change any relevant configuration when 
switching from 0.17.0 to 25.0.0.
   - **Steps to reproduce the problem:** I'm not sure how to best provide a 
full example with table creation & inserting some dummy data into it – please 
advise if possible?
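
In the meantime, here is a rough Python sketch that sends the three query variants above to Druid's SQL HTTP endpoint (`/druid/v2/sql`), so the failing and passing cases can be compared in one run. Several things here are assumptions: the router address `http://localhost:8888`, the names `VARIANTS`, `build_request` and `run_all`, and a datasource `my_datasource` that already has `double_col_1` and `double_col_2`. The original error came through the JDBC driver, so the HTTP endpoint may report the failure differently.

```python
import json
import urllib.error
import urllib.request

# Assumption: a Druid router reachable at this address.
DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"

# Same query as in the report; the trailing semicolon is left out here
# (an assumption about what the HTTP endpoint accepts).
QUERY_TEMPLATE = """SELECT
    {group_literal} as my_group,
    {agg_expr} as sum_double_cols
FROM druid.my_datasource
WHERE
    CAST(__time AS DATE) BETWEEN (DATE '2018-05-01') AND (DATE '2018-09-07')
GROUP BY 1"""

# (group literal, aggregate expression) for each variant from the report.
VARIANTS = {
    # Reported as failing on 25.0.0.
    "empty_string_sum_of_sums": ("''", "sum(double_col_1) + sum(double_col_2)"),
    # Reported as passing.
    "single_space_sum_of_sums": ("' '", "sum(double_col_1) + sum(double_col_2)"),
    # Reported as passing.
    "empty_string_single_sum": ("''", "sum(double_col_1 + double_col_2)"),
}


def build_request(name):
    """Build the POST request for one named query variant."""
    group_literal, agg_expr = VARIANTS[name]
    query = QUERY_TEMPLATE.format(group_literal=group_literal, agg_expr=agg_expr)
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        DRUID_SQL_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_all():
    """Send every variant and print whether the server accepted it."""
    for name in VARIANTS:
        try:
            with urllib.request.urlopen(build_request(name)) as resp:
                print(name, "OK", resp.read()[:200])
        except urllib.error.HTTPError as err:
            print(name, "FAILED", err.code, err.read()[:200])
```

Calling `run_all()` against a running cluster should show which of the three variants the server rejects; nothing is sent until that function is called.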


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

