adarshsanjeev opened a new issue, #13950:
URL: https://github.com/apache/druid/issues/13950

   While resolving some test failures, I noticed a small discrepancy in how 
`APPROX_COUNT_DISTINCT_BUILTIN` behaves between the native engine and MSQ. It 
only occurs if `druid.generic.useDefaultValueForNull` is false and there are 
null values present in the segment queried.
   
   For a datasource `foo` which contains the following values in `dim2`:
   
   ```
   "a"
   null
   ""
   "a"
   "abc"
   null
   ```
   Running the query 
`SELECT dim2, APPROX_COUNT_DISTINCT_BUILTIN(dim2) FROM druid.foo GROUP BY 1` 
gives the following results:
   Native:
   ```
   null, 0L
   "", 1L
   "a", 1L
   "abc", 1L
   ```
   MSQ:
   ```
   null, 0L
   "", 0L
   "a", 1L
   "abc", 1L
   ```
   MSQ seems to treat the empty string the same way as null and does not count 
it, while native counts it, which seems to be the correct behaviour. A change 
might be needed to bring MSQ in line with native.
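
The two result sets above can be reproduced with a small model in plain Python. This is only a sketch of the observed outputs, not Druid code: it uses an exact distinct count as a stand-in for the approximate aggregator, and the function name and the `empty_as_null` flag are hypothetical. It assumes the difference is that the aggregator input is nulled for `""` while the grouping key is left alone, which matches the reported rows but is not a confirmed cause.

```python
from collections import defaultdict


def count_distinct_by_group(values, empty_as_null=False):
    """Group by value and count distinct non-null inputs per group.

    Exact counting stands in for the approximate aggregator.
    empty_as_null is a hypothetical switch: when True, "" is treated
    as null by the aggregator only, not by the grouping key.
    """
    groups = defaultdict(set)
    for v in values:
        # The grouping key keeps "" and None as separate groups.
        bucket = groups[v]
        # The aggregator skips nulls (and "" when empty_as_null is set).
        agg_input = None if (empty_as_null and v == "") else v
        if agg_input is not None:
            bucket.add(agg_input)
    return {key: len(seen) for key, seen in groups.items()}


dim2 = ["a", None, "", "a", "abc", None]

# Models the native result: "" is a real value and is counted.
native_like = count_distinct_by_group(dim2)

# Models the MSQ result: "" is grouped separately but not counted.
msq_like = count_distinct_by_group(dim2, empty_as_null=True)
```

With `empty_as_null=False` the `""` group has a count of 1, as in the native output; with `empty_as_null=True` it drops to 0, as in the MSQ output, while the `null` group stays at 0 in both.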


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

