abhishekrb19 opened a new issue #10282:
URL: https://github.com/apache/druid/issues/10282


   A SQL sub-query with an expression specified in the inner query's HAVING 
sub-clause returns no data. Easy to reproduce, please see below for more 
details.
   
   ### Affected Version
   
   0.18.1
   ### Description
   
   A minimal example to reproduce the issue easily -- let's consider a single 
dimension "a" in a data source "ds", the problem can be reproduced with the 
following query:
   
   **Query-1**: Sub-query:
   ```
   SELECT COUNT(*)
   FROM (
       SELECT a
       FROM ds
       GROUP BY a
       HAVING SUM(a) + 1 > 0
    )
   ```
   returns no data.
   
   **Query-2**: The inner sub-query in itself returns results as expected:
   ```
       SELECT a
       FROM ds
       GROUP BY a
       HAVING SUM(a) + 1 > 0
   ```
   
   returns the following results:
   ```
   a2
   a3
   a6
   a7
   ```
   The expression specified in the `HAVING` clause in the sub-query appears to 
be the offending one. Note that without the expression, the query returns the 
correct results with 4 rows.
   
   Further, looking into the query plan for Query-1, we have:
   ```
   
DruidOuterQueryRel(query=[{"queryType":"timeseries","dataSource":{"type":"table","name":"__subquery__"},"intervals":{"type":"intervals","intervals":["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"]},"descending":false,"virtualColumns":[{"type":"expression","name":"v0","expression":"(\"$f1\"
 + 
1)","outputType":"DOUBLE"}],"filter":{"type":"bound","dimension":"v0","lower":"0","upper":null,"lowerStrict":true,"upperStrict":false,"extractionFn":null,"ordering":{"type":"numeric"}},"granularity":{"type":"all"},"aggregations":[{"type":"count","name":"a0"}],"postAggregations":[],"limit":2147483647,"context":{"populateCache":false,"skipEmptyBuckets":false,"sqlOuterLimit":100,"sqlQueryId":"ae0500b4-4dab-40c3-8f2d-6d17a2e3e82b","useCache":false}}],
 signature=[{a0:LONG}]) 
DruidQueryRel(query=[{"queryType":"groupBy","dataSource":{"type":"table","name":"ds"},"intervals":{"type":"intervals","intervals":["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"]},"virtualColumns
 
":[],"filter":null,"granularity":{"type":"all"},"dimensions":[{"type":"default","dimension":"a","outputName":"d0","outputType":"STRING"}],"aggregations":[{"type":"doubleSum","name":"a0","fieldName":null,"expression":"CAST(\"a\",
 
'DOUBLE')"}],"postAggregations":[],"having":null,"limitSpec":{"type":"NoopLimitSpec"},"context":{"populateCache":false,"skipEmptyBuckets":false,"sqlOuterLimit":100,"sqlQueryId":"ae0500b4-4dab-40c3-8f2d-6d17a2e3e82b","useCache":false},"descending":false}],
 signature=[{d0:STRING, a0:DOUBLE}])
   ```
   
   In the query plan above, the _outer_ query (`DruidOuterQueryRel`) has the 
virtual column with the expression specified which looks incorrect, while the 
`having` in the _inner_ query (`DruidQueryRel`) is `null`.  I'd think the 
expression must be in the _inner_ query, with the `having` clause set.
   
   /cc: @gianm @clintropolis @jihoonson. Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to