benkrug opened a new issue #11197: URL: https://github.com/apache/druid/issues/11197
Some druid SQL queries including GROUP BY and COUNT return incorrect results. I'm including two examples. ### Affected Version Seen in 0.20.0 and 0.21.0. The second example below apparently doesn't reproduce in 0.18.0. I haven't tested the first on 0.18.0. ### Description Particular GROUP BY queries are returning 0 counts when they should return non-zero numbers. Providing two examples, one with wikipedia, and one with a custom data set (attached). With wikipedia, we can see the following, to zero in on the issue: This query returns various non-zero numbers for the counts: ``` select comment, count(*) FROM wikipedia WHERE channel = '#ar.wikipedia' GROUP BY comment ``` This query will return 1's, which makes sense: ``` select comment, count(distinct(channel)) FROM wikipedia WHERE channel = '#ar.wikipedia' GROUP BY comment ``` So, we have data for the channel field (which makes sense, since we filter for a channel), but the following query returns 0's for all the counts, which is not only incorrect, but logically impossible given the query: ``` select comment, count(channel) FROM wikipedia WHERE channel = '#ar.wikipedia' GROUP BY comment ``` The second example adds a twist, I'm not sure of the reason. With the given dataset (data file and spec attached), the following query also incorrectly returns all 0's: ``` SELECT ctnsappname, count(ind_id) FROM testData WHERE ind_id = 3006 AND __time >= '2000-01-01' AND __time < '3000-01-01' GROUP BY ctnsappname ``` Note that all the __time values are from 2021-01-31, so all data is included in the time filters. However (the twist), if you remove the time filters, you seem to get correct results. I didn't test with 0.18.0, but Jie Zang reported that for this query, results were correct in 0.18.0. He also noted that in 0.18.0, the explain uses a virtual column, but in 0.21.0, the virtual column entry (for v0) is empty. I saw the same in 0.20.0. [ingestdata.txt](https://github.com/apache/druid/files/6422782/ingestdata.txt) [ingestspec.txt](https://github.com/apache/druid/files/6422783/ingestspec.txt) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
