abhishekrb19 opened a new issue #10282:
URL: https://github.com/apache/druid/issues/10282
A SQL sub-query with an expression specified in the inner query's HAVING
sub-clause returns no data. Easy to reproduce, please see below for more
details.
### Affected Version
0.18.1
### Description
A minimal example to reproduce the issue easily -- let's consider a single
dimension "a" in a data source "ds", the problem can be reproduced with the
following query:
**Query-1**: Sub-query:
```
SELECT COUNT(*)
FROM (
SELECT a
FROM ds
GROUP BY a
HAVING SUM(a) + 1 > 0
)
```
returns no data.
**Query-2**: The inner sub-query in itself returns results as expected:
```
SELECT a
FROM ds
GROUP BY a
HAVING SUM(a) + 1 > 0
```
returns the following results:
```
a2
a3
a6
a7
```
The expression specified in the `HAVING` clause in the sub-query appears to
be the offending one. Note that without the expression, the query returns the
correct results with 4 rows.
Further, looking into the query plan for Query-1, we have:
```
DruidOuterQueryRel(query=[{"queryType":"timeseries","dataSource":{"type":"table","name":"__subquery__"},"intervals":{"type":"intervals","intervals":["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"]},"descending":false,"virtualColumns":[{"type":"expression","name":"v0","expression":"(\"$f1\"
+
1)","outputType":"DOUBLE"}],"filter":{"type":"bound","dimension":"v0","lower":"0","upper":null,"lowerStrict":true,"upperStrict":false,"extractionFn":null,"ordering":{"type":"numeric"}},"granularity":{"type":"all"},"aggregations":[{"type":"count","name":"a0"}],"postAggregations":[],"limit":2147483647,"context":{"populateCache":false,"skipEmptyBuckets":false,"sqlOuterLimit":100,"sqlQueryId":"ae0500b4-4dab-40c3-8f2d-6d17a2e3e82b","useCache":false}}],
signature=[{a0:LONG}])
DruidQueryRel(query=[{"queryType":"groupBy","dataSource":{"type":"table","name":"ds"},"intervals":{"type":"intervals","intervals":["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"]},"virtualColumns
":[],"filter":null,"granularity":{"type":"all"},"dimensions":[{"type":"default","dimension":"a","outputName":"d0","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"a0","fieldName":null,"expression":"CAST(\"a\",
'DOUBLE')"}],"postAggregations":[],"having":null,"limitSpec":{"type":"NoopLimitSpec"},"context":{"populateCache":false,"skipEmptyBuckets":false,"sqlOuterLimit":100,"sqlQueryId":"ae0500b4-4dab-40c3-8f2d-6d17a2e3e82b","useCache":false},"descending":false}],
signature=[{d0:STRING, a0:DOUBLE}])
```
In the query plan above, the _outer_ query (`DruidOuterQueryRel`) has the
virtual column with the expression specified which looks incorrect, while the
`having` in the _inner_ query (`DruidQueryRel`) is `null`. I'd think the
expression must be in the _inner_ query, with the `having` clause set.
/cc: @gianm @clintropolis @jihoonson. Thanks!
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]