adarshsanjeev opened a new issue, #13950: URL: https://github.com/apache/druid/issues/13950
While resolving some test failures, I noticed a small discrepancy in how `APPROX_COUNT_DISTINCT_BUILTIN` works. It only occurs if `druid.generic.useDefaultValueForNull` is false and there are null values present in the segment queried.

For a datasource `foo` which contains the following values in `dim2`:

```
"a"
null
""
"a"
"abc"
null
```

On running the query `SELECT dim2, APPROX_COUNT_DISTINCT_BUILTIN(dim2) FROM druid.foo GROUP BY 1`, we get the following results:

Native:

```
null, 0L
"", 1L
"a", 1L
"abc", 1L
```

MSQ:

```
null, 0L
"", 0L
"a", 1L
"abc", 1L
```

MSQ seems to ignore the empty string in the same way as null, while native seems to have the correct behaviour. A change might need to be made to bring MSQ in line.
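To make the two result sets easier to compare, here is a minimal Python sketch of the semantics described in this issue. It is not Druid code: it uses exact sets rather than the approximate sketch behind `APPROX_COUNT_DISTINCT_BUILTIN`, and the `empty_is_null` flag is a hypothetical switch that only models the reported symptom, not the actual MSQ code path.

```python
from collections import defaultdict

# Values of dim2 from the issue; None stands in for SQL NULL.
DIM2 = ["a", None, "", "a", "abc", None]


def count_distinct_by_group(values, empty_is_null):
    """Exact stand-in for
    SELECT dim2, APPROX_COUNT_DISTINCT_BUILTIN(dim2) ... GROUP BY 1.

    empty_is_null is a hypothetical flag: when True, the aggregator
    skips "" as if it were NULL (the behaviour reported for MSQ).
    The grouping key itself is left untouched in both modes.
    """
    groups = defaultdict(set)
    for v in values:
        seen = groups[v]  # create the group even if nothing is counted
        skipped = v is None or (empty_is_null and v == "")
        if not skipped:
            seen.add(v)
    return {key: len(seen) for key, seen in groups.items()}


native_like = count_distinct_by_group(DIM2, empty_is_null=False)
msq_like = count_distinct_by_group(DIM2, empty_is_null=True)

for label, result in (("Native", native_like), ("MSQ", msq_like)):
    print(label, result)
```

With the flag off, the `""` group counts one distinct value, as in the native results; with it on, the `""` group drops to zero, as in the MSQ results.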
