Re: [PR] 202307-notebook Count Distinct (inc HLL + Theta) (druid)

via GitHub Mon, 10 Jul 2023 01:08:04 -0700


petermarshallio commented on code in PR #14523:
URL: https://github.com/apache/druid/pull/14523#discussion_r1257879545



##########
examples/quickstart/jupyter-notebooks/notebooks/03-query/03-approxCountDistinct.ipynb:
##########
@@ -0,0 +1,470 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "557e06e8-9b35-4b34-8322-8a8ede6de709",
+   "metadata": {},
+   "source": [
+    "# Counting distinct values\n",
+    "\n",
+    "__It's extremely common for analysts to want to count unique occurences 
of some dimension value in data. With the Druid database's history of large 
volumes of data comes an advanced computer science technique to speed up this 
calculation through approximation. In this tutorial, work through some examples 
and see the effect of turning it on and off, and of making it even faster by 
pre-generating the objects that Druid uses to execute the query.__\n",

Review Comment:
   Force of habit from writing blog intros. I've removed the bold formatting.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


