edgar2020 commented on code in PR #16953: URL: https://github.com/apache/druid/pull/16953#discussion_r1757720689
##########
docs/tutorials/tutorial-sketches-theta.md:
##########
@@ -209,36 +137,23 @@ Let's first see what the data looks like in Druid. Run the following SQL stateme
 SELECT * FROM ts_tutorial
 ```
-
+
 The Theta sketch column `theta_uid` appears as a Base64-encoded string; behind it is a bitmap.
-The following query to compute the distinct counts of user IDs uses `APPROX_COUNT_DISTINCT_DS_THETA` and groups by the other dimensions:
-```sql
-SELECT __time,
-       "show",
-       "episode",
-       APPROX_COUNT_DISTINCT_DS_THETA(theta_uid) AS users
-FROM ts_tutorial
-GROUP BY 1, 2, 3
-```
-
-
-
-In the preceding query, `APPROX_COUNT_DISTINCT_DS_THETA` is equivalent to calling `DS_THETA` and `THETA_SKETCH_ESIMATE` as follows:
+The following query uses `THETA_SKETCH_ESTIMATE` to compute the distinct counts of user IDs and groups by the other dimensions:
 ```sql
-SELECT __time,
-       "show",
-       "episode",
-       THETA_SKETCH_ESTIMATE(DS_THETA(theta_uid)) AS users
-FROM ts_tutorial
-GROUP BY 1, 2, 3
+SELECT
+  __time,
+  "show",
+  "episode",
+  THETA_SKETCH_ESTIMATE(theta_uid) AS users
+FROM ts_tutorial
+GROUP BY 1, 2, 3, 4

Review Comment:
   You are correct. The GROUP BY appears to be unnecessary: the query runs fine without it and returns the same output as it does with it.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
