This is an automated email from the ASF dual-hosted git repository.
jihoonson pushed a commit to branch 0.18.1
in repository https://gitbox.apache.org/repos/asf/druid.git
The following commit(s) were added to refs/heads/0.18.1 by this push:
new 5f740ed Datasource doc structure adjustments. (#9716) (#9777)
5f740ed is described below
commit 5f740ed054dc4036e751ffda7b152d5c184ca88d
Author: Jihoon Son <[email protected]>
AuthorDate: Mon Apr 27 19:44:02 2020 -0700
Datasource doc structure adjustments. (#9716) (#9777)
- Reorder both the datasource and query-execution page orderings to
table, lookup, union, inline, query, join. (Roughly increasing order
of conceptual "fanciness".)
- Add more crosslinks from datasource page to query-execution page:
one per datasource type.
Co-authored-by: Gian Merlino <[email protected]>
---
docs/querying/datasource.md | 130 ++++++++++++++++++++-------------------
docs/querying/query-execution.md | 28 +++++----
2 files changed, 85 insertions(+), 73 deletions(-)
diff --git a/docs/querying/datasource.md b/docs/querying/datasource.md
index 653757b..c522f5f 100644
--- a/docs/querying/datasource.md
+++ b/docs/querying/datasource.md
@@ -107,6 +107,70 @@ To see a list of all lookup datasources, use the SQL query
> `LOOKUP` function can defer evaluation until after an aggregation phase. This means that the `LOOKUP` function is
> usually faster than joining to a lookup datasource.
+Refer to the [Query execution](query-execution.md#table) page for more details on how queries are executed when you
+use table datasources.
+
+### `union`
+
+<!--DOCUSAURUS_CODE_TABS-->
+<!--Native-->
+```json
+{
+ "queryType": "scan",
+ "dataSource": {
+ "type": "union",
+ "dataSources": ["<tableDataSourceName1>", "<tableDataSourceName2>", "<tableDataSourceName3>"]
+ },
+ "columns": ["column1", "column2"],
+ "intervals": ["0000/3000"]
+}
+```
+<!--END_DOCUSAURUS_CODE_TABS-->
+
+Union datasources allow you to treat two or more table datasources as a single datasource. The datasources being unioned
+do not need to have identical schemas. If they do not fully match up, then columns that exist in one table but not
+another will be treated as if they contained all null values in the tables where they do not exist.
+
+Union datasources are not available in Druid SQL.
+
+Refer to the [Query execution](query-execution.md#union) page for more details on how queries are executed when you
+use union datasources.
+
+### `inline`
+
+<!--DOCUSAURUS_CODE_TABS-->
+<!--Native-->
+```json
+{
+ "queryType": "scan",
+ "dataSource": {
+ "type": "inline",
+ "columnNames": ["country", "city"],
+ "rows": [
+ ["United States", "San Francisco"],
+ ["Canada", "Calgary"]
+ ]
+ },
+ "columns": ["country", "city"],
+ "intervals": ["0000/3000"]
+}
+```
+<!--END_DOCUSAURUS_CODE_TABS-->
+
+Inline datasources allow you to query a small amount of data that is embedded in the query itself. They are useful when
+you want to write a query on a small amount of data without loading it first. They are also useful as inputs into a
+[join](#join). Druid also uses them internally to handle subqueries that need to be inlined on the Broker. See the
+[`query` datasource](#query) documentation for more details.
+
+There are two fields in an inline datasource: an array of `columnNames` and an array of `rows`. Each row is an array
+that must be exactly as long as the list of `columnNames`. The first element in each row corresponds to the first
+column in `columnNames`, and so on.
+
+Inline datasources are not available in Druid SQL.
+
+Refer to the [Query execution](query-execution.md#inline) page for more details on how queries are executed when you
+use inline datasources.
+
### `query`
<!--DOCUSAURUS_CODE_TABS-->
@@ -157,8 +221,8 @@ Query datasources allow you to issue subqueries. In native queries, they can app
> Performance tip: In most cases, subquery results are fully buffered in memory on the Broker and then further
> processing occurs on the Broker itself. This means that subqueries with large result sets can cause performance
-> bottlenecks or run into memory usage limits on the Broker. See the [Query execution](query-execution.md) documentation
-> for more details on how subqueries are executed and what limits will apply.
+> bottlenecks or run into memory usage limits on the Broker. See the [Query execution](query-execution.md#query)
+> page for more details on how subqueries are executed and what limits will apply.
### `join`
@@ -210,8 +274,8 @@ other than the leftmost "base" table must fit in memory. It also means that the
feature is intended mainly to allow joining regular Druid tables with [lookup](#lookup), [inline](#inline), and
[query](#query) datasources.
-For information about how Druid executes queries involving joins, refer to the
-[Query execution](query-execution.html#join) page.
+Refer to the [Query execution](query-execution.md#join) page for more details on how queries are executed when you
+use join datasources.
#### Joins in SQL
@@ -279,61 +343,3 @@ future versions:
always be correct.
- Performance-related optimizations as mentioned in the [previous section](#join-performance).
- Join algorithms other than broadcast hash-joins.
-
-### `union`
-
-<!--DOCUSAURUS_CODE_TABS-->
-<!--Native-->
-```json
-{
- "queryType": "scan",
- "dataSource": {
- "type": "union",
- "dataSources": ["<tableDataSourceName1>", "<tableDataSourceName2>", "<tableDataSourceName3>"]
- },
- "columns": ["column1", "column2"],
- "intervals": ["0000/3000"]
-}
-```
-<!--END_DOCUSAURUS_CODE_TABS-->
-
-Union datasources allow you to treat two or more table datasources as a single datasource. The datasources being unioned
-do not need to have identical schemas. If they do not fully match up, then columns that exist in one table but not
-another will be treated as if they contained all null values in the tables where they do not exist.
-
-Union datasources are not available in Druid SQL.
-
-Refer to the [Query execution](query-execution.md#union) documentation for more details on how union datasources
-are executed.
-
-### `inline`
-
-<!--DOCUSAURUS_CODE_TABS-->
-<!--Native-->
-```json
-{
- "queryType": "scan",
- "dataSource": {
- "type": "inline",
- "columnNames": ["country", "city"],
- "rows": [
- ["United States", "San Francisco"],
- ["Canada", "Calgary"]
- ]
- },
- "columns": ["country", "city"],
- "intervals": ["0000/3000"]
-}
-```
-<!--END_DOCUSAURUS_CODE_TABS-->
-
-Inline datasources allow you to query a small amount of data that is embedded in the query itself. They are useful when
-you want to write a query on a small amount of data without loading it first. They are also useful as inputs into a
-[join](#join). Druid also uses them internally to handle subqueries that need to be inlined on the Broker. See the
-[`query` datasource](#query) documentation for more details.
-
-There are two fields in an inline datasource: an array of `columnNames` and an array of `rows`. Each row is an array
-that must be exactly as long as the list of `columnNames`. The first element in each row corresponds to the first
-column in `columnNames`, and so on.
-
-Inline datasources are not available in Druid SQL.
diff --git a/docs/querying/query-execution.md b/docs/querying/query-execution.md
index 2b87429..811a6ef 100644
--- a/docs/querying/query-execution.md
+++ b/docs/querying/query-execution.md
@@ -62,6 +62,23 @@ Queries that operate directly on [lookup datasources](datasource.md#lookup) (wit
that received the query, using its local copy of the lookup. All registered lookup tables are preloaded in-memory on the
Broker. The query runs single-threaded.
+Execution of queries that use lookups as right-hand inputs to a join are executed in a way that depends on their
+"base" (bottom-leftmost) datasource, as described in the [join](#join) section below.
+
+### `union`
+
+Queries that operate directly on [union datasources](datasource.md#union) are split up on the Broker into a separate
+query for each table that is part of the union. Each of these queries runs separately, and the Broker merges their
+results together.
+
+### `inline`
+
+Queries that operate directly on [inline datasources](datasource.md#inline) are executed on the Broker that received the
+query. The query runs single-threaded.
+
+Execution of queries that use inline datasources as right-hand inputs to a join are executed in a way that depends on
+their "base" (bottom-leftmost) datasource, as described in the [join](#join) section below.
+
### `query`
[Query datasources](datasource.md#query) are subqueries. Each subquery is executed as if it was its own query and
@@ -99,14 +116,3 @@ lookups do not require new hash tables to be built (because they are preloaded),
5. Query execution proceeds again using the same structure that the base datasource would use on its own, with one
addition: while processing the base datasource, Druid servers will use the hash tables built from the other join inputs
to produce the join result row-by-row, and query engines will operate on the joined rows rather than the base rows.
-
-### `union`
-
-Queries that operate directly on [union datasources](datasource.md#union) are split up on the Broker into a separate
-query for each table that is part of the union. Each of these queries runs separately, and the Broker merges their
-results together.
-
-### `inline`
-
-Queries that operate directly on [inline datasources](datasource.md#inline) are executed on the Broker that received the
-query. The query runs single-threaded.
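
Illustrative note, not part of the commit: the two rules the relocated sections state (each inline row must be exactly as long as `columnNames`, and union columns missing from a table read as null) can be sketched in Python. The helper names `inline_scan_query` and `union_view` are hypothetical and are not Druid APIs. `union_view` only models the documented semantics on in-memory rows and says nothing about how the Broker actually merges results.

```python
import json


def inline_scan_query(column_names, rows):
    """Build a native scan query over an inline datasource (hypothetical helper)."""
    for i, row in enumerate(rows):
        # Each row must be exactly as long as the list of columnNames.
        if len(row) != len(column_names):
            raise ValueError(
                f"row {i} has {len(row)} values, expected {len(column_names)}"
            )
    return {
        "queryType": "scan",
        "dataSource": {
            "type": "inline",
            "columnNames": list(column_names),
            "rows": [list(r) for r in rows],
        },
        "columns": list(column_names),
        "intervals": ["0000/3000"],
    }


def union_view(tables):
    """Model union semantics on in-memory rows (illustration only, not Druid code)."""
    columns = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)
    # A column that a table lacks reads as null (None) for that table's rows.
    return [{c: row.get(c) for c in columns} for table in tables for row in table]


if __name__ == "__main__":
    query = inline_scan_query(
        ["country", "city"],
        [["United States", "San Francisco"], ["Canada", "Calgary"]],
    )
    print(json.dumps(query, indent=2))
    print(union_view([[{"a": 1}], [{"a": 2, "b": "x"}]]))
```

The printed JSON has the same shape as the `inline` example in the diff above, so it could be sent as a native query body, assuming a running cluster.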