[incubator-druid] branch master updated: Remove SQL experimental banner and other doc adjustments. (#7591)

fjy Mon, 06 May 2019 12:32:47 -0700

This is an automated email from the ASF dual-hosted git repository.

fjy pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-druid.git



The following commit(s) were added to refs/heads/master by this push:
     new 727b65c  Remove SQL experimental banner and other doc adjustments. (#7591)
727b65c is described below

commit 727b65c7e5ce536daf2fff58372228d80fd61078
Author: Gian Merlino <[email protected]>
AuthorDate: Mon May 6 12:31:51 2019 -0700

    Remove SQL experimental banner and other doc adjustments. (#7591)
    
    * Remove SQL experimental banner and other doc adjustments.
    
    Also,
    
    - Adjust the ToC and other docs a bit so SQL and native queries are
      presented on more equal footing.
    - De-emphasize querying historicals and peons directly in the
      native query docs. This is a really niche thing and may have been
      confusing to include prominently in the very first paragraph.
    - Remove DataSketches and Kafka indexing service from the experimental
      features ToC. They are not experimental any longer and were there in
      error.
    
    * More notes.
    
    * Slight tweak.
    
    * Remove extra extra word.
    
    * Remove RT node from ToC.
---
 docs/content/development/experimental.md | 19 +++++-----
 docs/content/development/router.md       |  5 +++
 docs/content/querying/aggregations.md    | 17 ++++-----
 docs/content/querying/lookups.md         | 11 ++++++
 docs/content/querying/querying.md        | 33 +++++++++-------
 docs/content/querying/select-query.md    | 16 ++++----
 docs/content/querying/sql.md             | 12 +++---
 docs/content/toc.md                      | 64 +++++++++++++++-----------------
 8 files changed, 99 insertions(+), 78 deletions(-)

diff --git a/docs/content/development/experimental.md b/docs/content/development/experimental.md
index adf4e24..eb3c051 100644
--- a/docs/content/development/experimental.md
+++ b/docs/content/development/experimental.md
@@ -24,16 +24,15 @@ title: "Experimental Features"
 
 # Experimental Features
 
-Experimental features are features we have developed but have not fully tested in a production environment. If you choose to try them out, there will likely be edge cases that we have not covered. We would love feedback on any of these features, whether they are bug reports, suggestions for improvement, or letting us know they work as intended.
+Features often start out in "experimental" status that indicates they are still evolving.
+This can mean any of the following things:
 
-<div class="note caution">
-APIs for experimental features may change in backwards incompatible ways.
-</div>
+1. The feature's API may change even in minor releases or patch releases.
+2. The feature may have known "missing" pieces that will be added later.
+3. The feature may or may not have received full battle-testing in production environments.
 
-To enable experimental features, include their artifacts in the configuration runtime.properties file, e.g.,
+All experimental features are optional.
 
-```
-druid.extensions.loadList=["druid-histogram"]
-```
-
-The configuration files for all the Apache Druid (incubating) processes need to be updated with this.
+Note that not all of these points apply to every experimental feature. Some have been battle-tested in terms of
+implementation, but are still marked experimental due to an evolving API. Please check the documentation for each
+feature for full details.
diff --git a/docs/content/development/router.md b/docs/content/development/router.md
index 3c8f3b7..11508ac 100644
--- a/docs/content/development/router.md
+++ b/docs/content/development/router.md
@@ -24,6 +24,11 @@ title: "Router Process"
 
 # Router Process
 
+<div class="note info">
+The Router is an optional and <a href="../development/experimental.html">experimental</a> feature due to the fact that its recommended place in the Druid cluster architecture is still evolving.
+However, it has been battle-tested in production, and it hosts the powerful [Druid Console](../operations/management-uis.html#druid-console), so you should feel safe deploying it.
+</div>
+
 The Apache Druid (incubating) Router process can be used to route queries to different Broker processes. By default, the broker routes queries based on how [Rules](../operations/rule-configuration.html) are set up. For example, if 1 month of recent data is loaded into a `hot` cluster, queries that fall within the recent month can be routed to a dedicated set of brokers. Queries outside this range are routed to another set of brokers. This set up provides query isolation such that queries [...]
 
 For query routing purposes, you should only ever need the Router process if you have a Druid cluster well into the terabyte range. 
diff --git a/docs/content/querying/aggregations.md b/docs/content/querying/aggregations.md
index 23b333f..a204720 100644
--- a/docs/content/querying/aggregations.md
+++ b/docs/content/querying/aggregations.md
@@ -279,21 +279,19 @@ The [DataSketches HLL Sketch](../development/extensions-core/datasketches-hll.ht
 
 Compared to the Theta sketch, the HLL sketch does not support set operations and has slightly slower update and merge speed, but requires significantly less space.
 
-#### Cardinality/HyperUnique (Deprecated)
+#### Cardinality, hyperUnique
 
-<div class="note caution">
-The Cardinality and HyperUnique aggregators are deprecated.
+<div class="note info">
 For new use cases, we recommend evaluating <a href="../development/extensions-core/datasketches-theta.html">DataSketches Theta Sketch</a> or <a href="../development/extensions-core/datasketches-hll.html">DataSketches HLL Sketch</a> instead.
-For existing users, we recommend evaluating the newer DataSketches aggregators and migrating if possible.
+The DataSketches aggregators are generally able to offer more flexibility and better accuracy than the classic Druid `cardinality` and `hyperUnique` aggregators.
 </div>
 
 The [Cardinality and HyperUnique](../querying/hll-old.html) aggregators are older aggregator implementations available by default in Druid that also provide distinct count estimates using the HyperLogLog algorithm. The newer DataSketches Theta and HLL extension-provided aggregators described above have superior accuracy and performance and are recommended instead. 
 
-The DataSketches team has published a [comparison study](https://datasketches.github.io/docs/HLL/HllSketchVsDruidHyperLogLogCollector.html) between Druid's original HLL algorithm and the DataSketches HLL algorithm. Based on the demonstrated advantages of the DataSketches implementation, we have deprecated Druid's original HLL aggregator.
-
-Please note that `hyperUnique` aggregators are not mutually compatible with Datasketches HLL or Theta sketches. 
+The DataSketches team has published a [comparison study](https://datasketches.github.io/docs/HLL/HllSketchVsDruidHyperLogLogCollector.html) between Druid's original HLL algorithm and the DataSketches HLL algorithm. Based on the demonstrated advantages of the DataSketches implementation, we are recommending using them in preference to Druid's original HLL-based aggregators.
+However, to ensure backwards compatibility, we will continue to support the classic aggregators.
 
-Although deprecated, we will continue to support the older Cardinality/HyperUnique aggregators for backwards compatibility. 
+Please note that `hyperUnique` aggregators are not mutually compatible with Datasketches HLL or Theta sketches.
 
 ##### Multi-column handling
 
@@ -326,10 +324,11 @@ The fixed buckets histogram can perform well when the distribution of the input
 
 We do not recommend the fixed buckets histogram for general use, as its usefulness is extremely data dependent. However, it is made available for users that have already identified use cases where a fixed buckets histogram is suitable.
 
-#### Approximate Histogram (Deprecated)
+#### Approximate Histogram (deprecated)
 
 <div class="note caution">
 The Approximate Histogram aggregator is deprecated.
+There are a number of other quantile estimation algorithms that offer better performance, accuracy, and memory footprint.
 We recommend using <a href="../development/extensions-core/datasketches-quantiles.html">DataSketches Quantiles</a> instead.
 </div>
 
diff --git a/docs/content/querying/lookups.md b/docs/content/querying/lookups.md
index 68f3287..a072317 100644
--- a/docs/content/querying/lookups.md
+++ b/docs/content/querying/lookups.md
@@ -55,6 +55,17 @@ Other lookup types are available as extensions, including:
 - Globally cached lookups from local files, remote URIs, or JDBC through [lookups-cached-global](../development/extensions-core/lookups-cached-global.html).
 - Globally cached lookups from a Kafka topic through [kafka-extraction-namespace](../development/extensions-core/kafka-extraction-namespace.html).
 
+Query Syntax
+------------
+
+In [Druid SQL](sql.html), lookups can be queried using the `LOOKUP` function, for example:
+
+```
+SELECT LOOKUP(column_name, 'lookup-name'), COUNT(*) FROM datasource GROUP BY 1
+```
+
+In native queries, lookups can be queried with [dimension specs or extraction functions](dimensionspecs.html).
+
 Query Execution
 ---------------
 When executing an aggregation query involving lookups, Druid can decide to apply lookups either while scanning and
diff --git a/docs/content/querying/querying.md b/docs/content/querying/querying.md
index 3470e24..5b6e30e 100644
--- a/docs/content/querying/querying.md
+++ b/docs/content/querying/querying.md
@@ -1,6 +1,6 @@
 ---
 layout: doc_page
-title: "Querying"
+title: "Native queries"
 ---
 
 <!--
@@ -22,26 +22,28 @@ title: "Querying"
   ~ under the License.
   -->
 
-# Querying
+# Native queries
 
-Apache Druid (incubating) queries are made using an HTTP REST style request to queryable processes ([Broker](../design/broker.html),
-[Historical](../design/historical.html). [Peons](../design/peons.html)) that are running stream ingestion tasks can also accept queries. The
-query is expressed in JSON and each of these process types expose the same
-REST query interface. For normal Druid operations, queries should be issued to the Broker processes. Queries can be posted
-to the queryable processes like this -
+<div class="note info">
+Apache Druid (incubating) supports two query languages: [Druid SQL](sql.html) and native queries, which SQL queries
+are planned into, and which end users can also issue directly. This document describes the native query language.
+</div>
 
- ```bash
- curl -X POST '<queryable_host>:<port>/druid/v2/?pretty' -H 'Content-Type:application/json' -H 'Accept:application/json' -d @<query_json_file>
- ```
+Native queries in Druid are JSON objects and are typically issued to the Broker or Router processes. Queries can be
+posted like this:
+
+```bash
+curl -X POST '<queryable_host>:<port>/druid/v2/?pretty' -H 'Content-Type:application/json' -H 'Accept:application/json' -d @<query_json_file>
+```
  
 Druid's native query language is JSON over HTTP, although many members of the community have contributed different 
 [client libraries](../development/libraries.html) in other languages to query Druid. 
 
 The Content-Type/Accept Headers can also take 'application/x-jackson-smile'.
 
- ```bash
- curl -X POST '<queryable_host>:<port>/druid/v2/?pretty' -H 'Content-Type:application/json' -H 'Accept:application/x-jackson-smile' -d @<query_json_file>
- ```
+```bash
+curl -X POST '<queryable_host>:<port>/druid/v2/?pretty' -H 'Content-Type:application/json' -H 'Accept:application/x-jackson-smile' -d @<query_json_file>
+```
 
 Note: If Accept header is not provided, it defaults to value of 'Content-Type' header.
 
@@ -49,6 +51,11 @@ Druid's native query is relatively low level, mapping closely to how computation
 are designed to be lightweight and complete very quickly. This means that for more complex analysis, or to build 
 more complex visualizations, multiple Druid queries may be required.
 
+Even though queries are typically made to Brokers or Routers, they can also be accepted by
+[Historical](../design/historical.html) processes and by [Peons (task JVMs)](../design/peons.html)) that are running
+stream ingestion tasks. This may be valuable if you want to query results for specific segments that are served by
+specific processes.
+
 ## Available Queries
 
 Druid has numerous query types for various use cases. Queries are composed of various JSON properties and Druid has different types of queries for different use cases. The documentation for the various query types describe all the JSON properties that can be set.
diff --git a/docs/content/querying/select-query.md b/docs/content/querying/select-query.md
index e0b7f2e..4c7ba20 100644
--- a/docs/content/querying/select-query.md
+++ b/docs/content/querying/select-query.md
@@ -24,7 +24,15 @@ title: "Select Queries"
 
 # Select Queries
 
-Select queries return raw Apache Druid (incubating) rows and support pagination.
+<div class="note caution">
+We encourage you to use the [Scan query](../querying/scan-query.html) type rather than Select whenever possible.
+In situations involving larger numbers of segments, the Select query can have very high memory and performance overhead.
+The Scan query does not have this issue.
+The major difference between the two is that the Scan query does not support pagination.
+However, the Scan query type is able to return a virtually unlimited number of results even without pagination, making it unnecessary in many cases.
+</div>
+
+Select queries return raw Druid rows and support pagination.
 
 ```json
  {
@@ -41,12 +49,6 @@ Select queries return raw Apache Druid (incubating) rows and support pagination.
  }
 ```
 
-<div class="note info">
-Consider using the [Scan query](../querying/scan-query.html) instead of the Select query if you don't need pagination. 
-The Scan query returns results without pagination but is significantly more efficient in terms of both processing time
-and memory requirements. It is also capable of returning a virtually unlimited number of results.
-</div>
-
 There are several main parts to a select query:
 
 |property|description|required?|
diff --git a/docs/content/querying/sql.md b/docs/content/querying/sql.md
index 4871594..032b101 100644
--- a/docs/content/querying/sql.md
+++ b/docs/content/querying/sql.md
@@ -31,12 +31,12 @@ title: "SQL"
 
 # SQL
 
-<div class="note caution">
-Built-in SQL is an <a href="../development/experimental.html">experimental</a> feature. The API described here is
-subject to change.
+<div class="note info">
+Apache Druid (incubating) supports two query languages: Druid SQL and [native queries](querying.html), which SQL queries
+are planned into, and which end users can also issue directly. This document describes the SQL language.
 </div>
 
-Apache Druid (incubating) SQL is a built-in SQL layer and an alternative to Druid's native JSON-based query language, and is powered by a
+Druid SQL is a built-in SQL layer and an alternative to Druid's native JSON-based query language, and is powered by a
 parser and planner based on [Apache Calcite](https://calcite.apache.org/). Druid SQL translates SQL into native Druid
 queries on the query Broker (the first process you query), which are then passed down to data processes as native Druid
 queries. Other than the (slight) overhead of translating SQL on the Broker, there isn't an additional performance
@@ -125,7 +125,7 @@ Only the COUNT aggregation can accept DISTINCT.
 |`MIN(expr)`|Takes the minimum of numbers.|
 |`MAX(expr)`|Takes the maximum of numbers.|
 |`AVG(expr)`|Averages numbers.|
-|`APPROX_COUNT_DISTINCT(expr)`|Counts distinct values of expr, which can be a regular column or a hyperUnique column. This is always approximate, regardless of the value of "useApproximateCountDistinct". See also `COUNT(DISTINCT expr)`.|
+|`APPROX_COUNT_DISTINCT(expr)`|Counts distinct values of expr, which can be a regular column or a hyperUnique column. This is always approximate, regardless of the value of "useApproximateCountDistinct". This uses Druid's builtin "cardinality" or "hyperUnique" aggregators. See also `COUNT(DISTINCT expr)`.|
 |`APPROX_COUNT_DISTINCT_DS_HLL(expr, [lgK, tgtHllType])`|Counts distinct values of expr, which can be a regular column or an [HLL sketch](../development/extensions-core/datasketches-hll.html) column. The `lgK` and `tgtHllType` parameters are described in the HLL sketch documentation. This is always approximate, regardless of the value of "useApproximateCountDistinct". See also `COUNT(DISTINCT expr)`. The [DataSketches extension](../development/extensions-core/datasketches-extension.html) [...]
 |`APPROX_COUNT_DISTINCT_DS_THETA(expr, [size])`|Counts distinct values of expr, which can be a regular column or a [Theta sketch](../development/extensions-core/datasketches-theta.html) column. The `size` parameter is described in the Theta sketch documentation. This is always approximate, regardless of the value of "useApproximateCountDistinct". See also `COUNT(DISTINCT expr)`. The [DataSketches extension](../development/extensions-core/datasketches-extension.html) must be loaded to use [...]
 |`APPROX_QUANTILE(expr, probability, [resolution])`|Computes approximate quantiles on numeric or [approxHistogram](../development/extensions-core/approximate-histograms.html#approximate-histogram-aggregator) exprs. The "probability" should be between 0 and 1 (exclusive). The "resolution" is the number of centroids to use for the computation. Higher resolutions will give more precise results but also have higher overhead. If not provided, the default resolution is 50. The [approximate his [...]
@@ -133,6 +133,8 @@ Only the COUNT aggregation can accept DISTINCT.
 |`APPROX_QUANTILE_FIXED_BUCKETS(expr, probability, numBuckets, lowerLimit, upperLimit, [outlierHandlingMode])`|Computes approximate quantiles on numeric or [fixed buckets histogram](../development/extensions-core/approximate-histograms.html#fixed-buckets-histogram) exprs. The "probability" should be between 0 and 1 (exclusive). The `numBuckets`, `lowerLimit`, `upperLimit`, and `outlierHandlingMode` parameters are described in the fixed buckets histogram documentation. The [approximate hi [...]
 |`BLOOM_FILTER(expr, numEntries)`|Computes a bloom filter from values produced by `expr`, with `numEntries` maximum number of distinct values before false positve rate increases. See [bloom filter extension](../development/extensions-core/bloom-filter.html) documentation for additional details.|
 
+For advice on choosing approximate aggregation functions, check out our [approximate aggregations documentation](aggregations.html#approx).
+
 ### Numeric functions
 
 Numeric functions will return 64 bit integers or 64 bit floats, depending on their inputs.
diff --git a/docs/content/toc.md b/docs/content/toc.md
index 0acf7d0..76ef297 100644
--- a/docs/content/toc.md
+++ b/docs/content/toc.md
@@ -70,32 +70,34 @@ layout: toc
   * [Misc. Tasks](/docs/VERSION/ingestion/misc-tasks.html)
 
 ## Querying
-  * [Overview](/docs/VERSION/querying/querying.html)
-  * [Timeseries](/docs/VERSION/querying/timeseriesquery.html)
-  * [TopN](/docs/VERSION/querying/topnquery.html)
-  * [GroupBy](/docs/VERSION/querying/groupbyquery.html)
-  * [Time Boundary](/docs/VERSION/querying/timeboundaryquery.html)
-  * [Segment Metadata](/docs/VERSION/querying/segmentmetadataquery.html)
-  * [DataSource Metadata](/docs/VERSION/querying/datasourcemetadataquery.html)
-  * [Search](/docs/VERSION/querying/searchquery.html)
-  * [Select](/docs/VERSION/querying/select-query.html)
-  * [Scan](/docs/VERSION/querying/scan-query.html)
-  * Components
-    * [Datasources](/docs/VERSION/querying/datasource.html)
-    * [Filters](/docs/VERSION/querying/filters.html)
-    * [Aggregations](/docs/VERSION/querying/aggregations.html)
-    * [Post Aggregations](/docs/VERSION/querying/post-aggregations.html)
-    * [Granularities](/docs/VERSION/querying/granularities.html)
-    * [DimensionSpecs](/docs/VERSION/querying/dimensionspecs.html)
-    * [Context](/docs/VERSION/querying/query-context.html)
-  * [Multi-value dimensions](/docs/VERSION/querying/multi-value-dimensions.html)
-  * [SQL](/docs/VERSION/querying/sql.html)
-  * [Lookups](/docs/VERSION/querying/lookups.html)
-  * [Joins](/docs/VERSION/querying/joins.html)
-  * [Multitenancy](/docs/VERSION/querying/multitenancy.html)
-  * [Caching](/docs/VERSION/querying/caching.html)
-  * [Sorting Orders](/docs/VERSION/querying/sorting-orders.html)
-  * [Virtual Columns](/docs/VERSION/querying/virtual-columns.html)
+  * [Druid SQL](/docs/VERSION/querying/sql.html)
+  * [Native queries](/docs/VERSION/querying/querying.html)
+    * [Timeseries](/docs/VERSION/querying/timeseriesquery.html)
+    * [TopN](/docs/VERSION/querying/topnquery.html)
+    * [GroupBy](/docs/VERSION/querying/groupbyquery.html)
+    * [Time Boundary](/docs/VERSION/querying/timeboundaryquery.html)
+    * [Segment Metadata](/docs/VERSION/querying/segmentmetadataquery.html)
+    * [DataSource Metadata](/docs/VERSION/querying/datasourcemetadataquery.html)
+    * [Search](/docs/VERSION/querying/searchquery.html)
+    * [Scan](/docs/VERSION/querying/scan-query.html)
+    * [Select](/docs/VERSION/querying/select-query.html)
+    * Components
+      * [Datasources](/docs/VERSION/querying/datasource.html)
+      * [Filters](/docs/VERSION/querying/filters.html)
+      * [Aggregations](/docs/VERSION/querying/aggregations.html)
+      * [Post Aggregations](/docs/VERSION/querying/post-aggregations.html)
+      * [Granularities](/docs/VERSION/querying/granularities.html)
+      * [DimensionSpecs](/docs/VERSION/querying/dimensionspecs.html)
+      * [Sorting Orders](/docs/VERSION/querying/sorting-orders.html)
+      * [Virtual Columns](/docs/VERSION/querying/virtual-columns.html)
+      * [Context](/docs/VERSION/querying/query-context.html)
+  * Concepts
+    * [Multi-value dimensions](/docs/VERSION/querying/multi-value-dimensions.html)
+    * [Lookups](/docs/VERSION/querying/lookups.html)
+    * [Joins](/docs/VERSION/querying/joins.html)
+    * [Multitenancy](/docs/VERSION/querying/multitenancy.html)
+    * [Caching](/docs/VERSION/querying/caching.html)
+    * [Geographic Queries](/docs/VERSION/development/geo.html) (experimental)
 
 ## Design
   * [Overview](/docs/VERSION/design/index.html)
@@ -108,7 +110,7 @@ layout: toc
     * [Historical](/docs/VERSION/design/historical.html)
     * [MiddleManager](/docs/VERSION/design/middlemanager.html)
       * [Peons](/docs/VERSION/design/peons.html)
-    * [Realtime (Deprecated)](/docs/VERSION/design/realtime.html)
+    * [Router](/docs/VERSION/development/router.html) (optional; experimental)
   * Dependencies
     * [Deep Storage](/docs/VERSION/dependencies/deep-storage.html)
     * [Metadata Storage](/docs/VERSION/dependencies/metadata-storage.html)
@@ -161,13 +163,7 @@ layout: toc
   * [Build From Source](/docs/VERSION/development/build.html)
   * [Versioning](/docs/VERSION/development/versioning.html)
   * [Integration](/docs/VERSION/development/integrating-druid-with-other-technologies.html)
-  * Experimental Features
-    * [Overview](/docs/VERSION/development/experimental.html)
-    * [Approximate Histograms and Quantiles](/docs/VERSION/development/extensions-core/approximate-histograms.html)
-    * [Datasketches](/docs/VERSION/development/extensions-core/datasketches-extension.html)
-    * [Geographic Queries](/docs/VERSION/development/geo.html)
-    * [Router](/docs/VERSION/development/router.html)
-    * [Kafka Indexing Service](/docs/VERSION/development/extensions-core/kafka-ingestion.html)
+  * [Experimental Features](/docs/VERSION/development/experimental.html)
 
 ## Misc
   * [Druid Expressions Language](/docs/VERSION/misc/math-expr.html)


