This is an automated email from the ASF dual-hosted git repository.
techdocsmith pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git
The following commit(s) were added to refs/heads/master by this push:
new 02ad62a08c Docs: update description of query priority default value (#13191)
02ad62a08c is described below
commit 02ad62a08cd245da62f2625dced17ae2a55224cc
Author: Victoria Lim <[email protected]>
AuthorDate: Fri Oct 14 14:28:04 2022 -0700
Docs: update description of query priority default value (#13191)
* update description of default for query priority
* update order
* update terms
* standardize to query context parameters
---
docs/configuration/index.md | 6 +--
docs/misc/math-expr.md | 4 +-
docs/querying/query-context.md | 94 +++++++++++++++++++++---------------------
3 files changed, 52 insertions(+), 52 deletions(-)
diff --git a/docs/configuration/index.md b/docs/configuration/index.md
index 14192fe0d6..cb826ef348 100644
--- a/docs/configuration/index.md
+++ b/docs/configuration/index.md
@@ -2045,9 +2045,9 @@ This section describes configurations that control
behavior of Druid's query typ
### Overriding default query context values
-Any [Query Context General
Parameter](../querying/query-context.md#general-parameters) default value can be
-overridden by setting runtime property in the format of
`druid.query.default.context.{query_context_key}`.
-`druid.query.default.context.{query_context_key}` runtime property prefix
applies to all current and future
+Any [query context general
parameter](../querying/query-context.md#general-parameters) default value can be
+overridden by setting the runtime property in the format of
`druid.query.default.context.{query_context_key}`.
+The `druid.query.default.context.{query_context_key}` runtime property prefix
applies to all current and future
query context keys, the same as how query context parameter passed with the
query works. Note that the runtime property
value can be overridden if value for the same key is explicitly specify in the
query contexts.
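
Reader's note: the override order described in this hunk, together with the `priority` default documented later in this commit, can be sketched roughly as follows. This is a minimal illustration, not Druid's implementation. The function name `effective_context` and the plain-dict representation of runtime properties are assumptions made for the example; only the `druid.query.default.context.` prefix and the fallback of `0` for `priority` come from the docs text.

```python
# Hypothetical sketch of the documented precedence; not Druid source code.
PREFIX = "druid.query.default.context."


def effective_context(runtime_properties, query_context):
    """Merge runtime-property defaults with the context sent with a query."""
    # Keep only properties under the documented prefix that are set and not null,
    # and strip the prefix to recover the query context key.
    defaults = {
        key[len(PREFIX):]: value
        for key, value in runtime_properties.items()
        if key.startswith(PREFIX) and value is not None
    }
    merged = dict(defaults)
    # A value passed explicitly in the query context overrides the runtime property.
    merged.update(query_context)
    # Documented fallback: priority is 0 when set in neither place.
    merged.setdefault("priority", 0)
    return merged


props = {
    "druid.query.default.context.priority": 10,
    "druid.query.default.context.timeout": 30000,
    "druid.server.http.maxQueryTimeout": 60000,  # not a context default; ignored
}
print(effective_context(props, {"priority": 50}))
```

With these sample values, the query's own `priority` of 50 wins over the runtime property, while `timeout` is inherited from the runtime property because the query does not set it.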
diff --git a/docs/misc/math-expr.md b/docs/misc/math-expr.md
index 27bddb37d0..810bdd1df3 100644
--- a/docs/misc/math-expr.md
+++ b/docs/misc/math-expr.md
@@ -289,10 +289,10 @@ For the IPv4 address functions, the `address` argument
accepts either an IPv4 do
| human_readable_decimal_format(value[, precision]) | Format a number in
human-readable SI format. `precision` must be in the range of [0,3] (default:
2). For example:<li>human_readable_decimal_format(1000000) returns `1.00
M`</li><li>human_readable_decimal_format(1000000, 3) returns `1.000 M`</li> |
-## Vectorization Support
+## Vectorization support
A number of expressions support ['vectorized' query
engines](../querying/query-context.md#vectorization-parameters)
-supported features:
+Supported features:
* constants and identifiers are supported for any column type
* `cast` is supported for numeric and string types
* math operators: `+`,`-`,`*`,`/`,`%`,`^` are supported for numeric types
diff --git a/docs/querying/query-context.md b/docs/querying/query-context.md
index 5a4b2fc641..0c086c09d3 100644
--- a/docs/querying/query-context.md
+++ b/docs/querying/query-context.md
@@ -26,9 +26,9 @@ sidebar_label: "Query context"
The query context is used for various query configuration parameters. Query
context parameters can be specified in
the following ways:
-- For [Druid SQL](sql-api.md), context parameters are provided either as a
JSON object named `context` to the
+- For [Druid SQL](sql-api.md), context parameters are provided either in a
JSON object named `context` to the
HTTP POST API, or as properties to the JDBC connection.
-- For [native queries](querying.md), context parameters are provided as a JSON
object named `context`.
+- For [native queries](querying.md), context parameters are provided in a JSON
object named `context`.
Note that setting query context will override both the default value and the
runtime properties value in the format of
`druid.query.default.context.{property_key}` (if set).
@@ -37,34 +37,34 @@ Note that setting query context will override both the
default value and the run
Unless otherwise noted, the following parameters apply to all query types.
-|property |default | description
|
-|-----------------|----------------------------------------|----------------------|
-|timeout | `druid.server.http.defaultQueryTimeout`| Query timeout in
millis, beyond which unfinished queries will be cancelled. 0 timeout means `no
timeout` (up to the server-side maximum query timeout,
`druid.server.http.maxQueryTimeout`). To set the default timeout and maximum
timeout, see [Broker configuration](../configuration/index.md#broker) |
-|priority | `0` | Query Priority.
Queries with higher priority get precedence for computational resources.|
-|lane | `null` | Query lane, used
to control usage limits on classes of queries. See [Broker
configuration](../configuration/index.md#broker) for more details.|
-|queryId | auto-generated | Unique identifier
given to this query. If a query ID is set or known, this can be used to cancel
the query |
-|brokerService | `null` | Broker service to
which this query should be routed. This parameter is honored only by a broker
selector strategy of type *manual*. See [Router
strategies](../design/router.md#router-strategies) for more details.|
-|useCache | `true` | Flag indicating
whether to leverage the query cache for this query. When set to false, it
disables reading from the query cache for this query. When set to true, Apache
Druid uses `druid.broker.cache.useCache` or `druid.historical.cache.useCache`
to determine whether or not to read from the query cache |
-|populateCache | `true` | Flag indicating
whether to save the results of the query to the query cache. Primarily used for
debugging. When set to false, it disables saving the results of this query to
the query cache. When set to true, Druid uses
`druid.broker.cache.populateCache` or `druid.historical.cache.populateCache` to
determine whether or not to save the results of this query to the query cache |
-|useResultLevelCache | `true` | Flag indicating
whether to leverage the result level cache for this query. When set to false,
it disables reading from the query cache for this query. When set to true,
Druid uses `druid.broker.cache.useResultLevelCache` to determine whether or not
to read from the result-level query cache |
-|populateResultLevelCache | `true` | Flag indicating
whether to save the results of the query to the result level cache. Primarily
used for debugging. When set to false, it disables saving the results of this
query to the query cache. When set to true, Druid uses
`druid.broker.cache.populateResultLevelCache` to determine whether or not to
save the results of this query to the result-level query cache |
-|bySegment | `false` | Native queries
only. Return "by segment" results. Primarily used for debugging, setting it to
`true` returns results associated with the data segment they came from |
-|finalize | `true` | Flag indicating
whether to "finalize" aggregation results. Primarily used for debugging. For
instance, the `hyperUnique` aggregator will return the full HyperLogLog sketch
instead of the estimated cardinality when this flag is set to `false` |
-|maxScatterGatherBytes| `druid.server.http.maxScatterGatherBytes` | Maximum
number of bytes gathered from data processes such as Historicals and realtime
processes to execute a query. This parameter can be used to further reduce
`maxScatterGatherBytes` limit at query time. See [Broker
configuration](../configuration/index.md#broker) for more details.|
-|maxQueuedBytes | `druid.broker.http.maxQueuedBytes` | Maximum
number of bytes queued per query before exerting backpressure on the channel to
the data server. Similar to `maxScatterGatherBytes`, except unlike that
configuration, this one will trigger backpressure rather than query failure.
Zero means disabled.|
-|serializeDateTimeAsLong| `false` | If true, DateTime is serialized as
long in the result returned by Broker and the data transportation between
Broker and compute process|
-|serializeDateTimeAsLongInner| `false` | If true, DateTime is serialized as
long in the data transportation between Broker and compute process|
-|enableParallelMerge|`true`|Enable parallel result merging on the Broker. Note
that `druid.processing.merge.useParallelMergePool` must be enabled for this
setting to be set to `true`. See [Broker
configuration](../configuration/index.md#broker) for more details.|
-|parallelMergeParallelism|`druid.processing.merge.pool.parallelism`|Maximum
number of parallel threads to use for parallel result merging on the Broker.
See [Broker configuration](../configuration/index.md#broker) for more details.|
-|parallelMergeInitialYieldRows|`druid.processing.merge.task.initialYieldNumRows`|Number
of rows to yield per ForkJoinPool merge task for parallel result merging on
the Broker, before forking off a new task to continue merging sequences. See
[Broker configuration](../configuration/index.md#broker) for more details.|
-|parallelMergeSmallBatchRows|`druid.processing.merge.task.smallBatchNumRows`|Size
of result batches to operate on in ForkJoinPool merge tasks for parallel
result merging on the Broker. See [Broker
configuration](../configuration/index.md#broker) for more details.|
-|useFilterCNF|`false`| If true, Druid will attempt to convert the query filter
to Conjunctive Normal Form (CNF). During query processing, columns can be
pre-filtered by intersecting the bitmap indexes of all values that match the
eligible filters, often greatly reducing the raw number of rows which need to
be scanned. But this effect only happens for the top level filter, or
individual clauses of a top level 'and' filter. As such, filters in CNF
potentially have a higher chance to utiliz [...]
-|secondaryPartitionPruning|`true`|Enable secondary partition pruning on the
Broker. The Broker will always prune unnecessary segments from the input scan
based on a filter on time intervals, but if the data is further partitioned
with hash or range partitioning, this option will enable additional pruning
based on a filter on secondary partition dimensions.|
-|enableJoinLeftTableScanDirect|`false`|This flag applies to queries which have
joins. For joins, where left child is a simple scan with a filter, by default,
druid will run the scan as a query and the join the results to the right child
on broker. Setting this flag to true overrides that behavior and druid will
attempt to push the join to data servers instead. Please note that the flag
could be applicable to queries even if there is no explicit join. since queries
can internally transla [...]
-|debug| `false` | Flag indicating whether to enable debugging outputs for the
query. When set to false, no additional logs will be produced (logs produced
will be entirely dependent on your logging level). When set to true, the
following addition logs will be produced:<br />- Log the stack trace of the
exception (if any) produced by the query |
-|setProcessingThreadNames|`true`| Whether processing thread names will be set
to `queryType_dataSource_intervals` while processing a query. This aids in
interpreting thread dumps, and is on by default. Query overhead can be reduced
slightly by setting this to `false`. This has a tiny effect in most scenarios,
but can be meaningful in high-QPS, low-per-segment-processing-time scenarios. |
-|maxNumericInFilters|`-1`|Max limit for the amount of numeric values that can
be compared for a string type dimension when the entire SQL WHERE clause of a
query translates only to an [OR](../querying/filters.md#or) of [Bound
filter](../querying/filters.md#bound-filter). By default, Druid does not
restrict the amount of of numeric Bound Filters on String columns, although
this situation may block other queries from running. Set this property to a
smaller value to prevent Druid from runni [...]
-|inSubQueryThreshold|`2147483647`| Threshold for minimum number of values in
an IN clause to convert the query to a JOIN operation on an inlined table
rather than a predicate. A threshold of 0 forces usage of an inline table in
all cases; a threshold of [Integer.MAX_VALUE] forces usage of OR in all cases. |
+|Parameter |Default | Description
|
+|-------------------|----------------------------------------|----------------------|
+|`timeout` | `druid.server.http.defaultQueryTimeout`| Query timeout
in millis, beyond which unfinished queries will be cancelled. 0 timeout means
`no timeout` (up to the server-side maximum query timeout,
`druid.server.http.maxQueryTimeout`). To set the default timeout and maximum
timeout, see [Broker configuration](../configuration/index.md#broker) |
+|`priority` | The default priority is one of the following:
<ul><li>Value of `priority` in the query context, if set</li><li>The value of
the runtime property `druid.query.default.context.priority`, if set and not
null</li><li>`0` if the priority is not set in the query context or runtime
properties</li></ul>| Query priority. Queries with higher priority get
precedence for computational resources.|
+|`lane` | `null` | Query lane,
used to control usage limits on classes of queries. See [Broker
configuration](../configuration/index.md#broker) for more details.|
+|`queryId` | auto-generated | Unique
identifier given to this query. If a query ID is set or known, this can be used
to cancel the query |
+|`brokerService` | `null` | Broker service
to which this query should be routed. This parameter is honored only by a
broker selector strategy of type *manual*. See [Router
strategies](../design/router.md#router-strategies) for more details.|
+|`useCache` | `true` | Flag indicating
whether to leverage the query cache for this query. When set to false, it
disables reading from the query cache for this query. When set to true, Apache
Druid uses `druid.broker.cache.useCache` or `druid.historical.cache.useCache`
to determine whether or not to read from the query cache |
+|`populateCache` | `true` | Flag indicating
whether to save the results of the query to the query cache. Primarily used for
debugging. When set to false, it disables saving the results of this query to
the query cache. When set to true, Druid uses
`druid.broker.cache.populateCache` or `druid.historical.cache.populateCache` to
determine whether or not to save the results of this query to the query cache |
+|`useResultLevelCache`| `true` | Flag indicating whether
to leverage the result level cache for this query. When set to false, it
disables reading from the query cache for this query. When set to true, Druid
uses `druid.broker.cache.useResultLevelCache` to determine whether or not to
read from the result-level query cache |
+|`populateResultLevelCache` | `true` | Flag indicating
whether to save the results of the query to the result level cache. Primarily
used for debugging. When set to false, it disables saving the results of this
query to the query cache. When set to true, Druid uses
`druid.broker.cache.populateResultLevelCache` to determine whether or not to
save the results of this query to the result-level query cache |
+|`bySegment` | `false` | Native queries
only. Return "by segment" results. Primarily used for debugging, setting it to
`true` returns results associated with the data segment they came from |
+|`finalize` | `true` | Flag indicating
whether to "finalize" aggregation results. Primarily used for debugging. For
instance, the `hyperUnique` aggregator will return the full HyperLogLog sketch
instead of the estimated cardinality when this flag is set to `false` |
+|`maxScatterGatherBytes`| `druid.server.http.maxScatterGatherBytes` | Maximum
number of bytes gathered from data processes such as Historicals and realtime
processes to execute a query. This parameter can be used to further reduce
`maxScatterGatherBytes` limit at query time. See [Broker
configuration](../configuration/index.md#broker) for more details.|
+|`maxQueuedBytes` | `druid.broker.http.maxQueuedBytes` | Maximum
number of bytes queued per query before exerting backpressure on the channel to
the data server. Similar to `maxScatterGatherBytes`, except unlike that
configuration, this one will trigger backpressure rather than query failure.
Zero means disabled.|
+|`serializeDateTimeAsLong`| `false` | If true, DateTime is serialized as
long in the result returned by Broker and the data transportation between
Broker and compute process|
+|`serializeDateTimeAsLongInner`| `false` | If true, DateTime is serialized as
long in the data transportation between Broker and compute process|
+|`enableParallelMerge`|`true`|Enable parallel result merging on the Broker.
Note that `druid.processing.merge.useParallelMergePool` must be enabled for
this setting to be set to `true`. See [Broker
configuration](../configuration/index.md#broker) for more details.|
+|`parallelMergeParallelism`|`druid.processing.merge.pool.parallelism`|Maximum
number of parallel threads to use for parallel result merging on the Broker.
See [Broker configuration](../configuration/index.md#broker) for more details.|
+|`parallelMergeInitialYieldRows`|`druid.processing.merge.task.initialYieldNumRows`|Number
of rows to yield per ForkJoinPool merge task for parallel result merging on
the Broker, before forking off a new task to continue merging sequences. See
[Broker configuration](../configuration/index.md#broker) for more details.|
+|`parallelMergeSmallBatchRows`|`druid.processing.merge.task.smallBatchNumRows`|Size
of result batches to operate on in ForkJoinPool merge tasks for parallel
result merging on the Broker. See [Broker
configuration](../configuration/index.md#broker) for more details.|
+|`useFilterCNF`|`false`| If true, Druid will attempt to convert the query
filter to Conjunctive Normal Form (CNF). During query processing, columns can
be pre-filtered by intersecting the bitmap indexes of all values that match the
eligible filters, often greatly reducing the raw number of rows which need to
be scanned. But this effect only happens for the top level filter, or
individual clauses of a top level 'and' filter. As such, filters in CNF
potentially have a higher chance to util [...]
+|`secondaryPartitionPruning`|`true`|Enable secondary partition pruning on the
Broker. The Broker will always prune unnecessary segments from the input scan
based on a filter on time intervals, but if the data is further partitioned
with hash or range partitioning, this option will enable additional pruning
based on a filter on secondary partition dimensions.|
+|`enableJoinLeftTableScanDirect`|`false`|This flag applies to queries which
have joins. For joins, where left child is a simple scan with a filter, by
default, druid will run the scan as a query and the join the results to the
right child on broker. Setting this flag to true overrides that behavior and
druid will attempt to push the join to data servers instead. Please note that
the flag could be applicable to queries even if there is no explicit join.
since queries can internally trans [...]
+|`debug`| `false` | Flag indicating whether to enable debugging outputs for
the query. When set to false, no additional logs will be produced (logs
produced will be entirely dependent on your logging level). When set to true,
the following addition logs will be produced:<br />- Log the stack trace of the
exception (if any) produced by the query |
+|`setProcessingThreadNames`|`true`| Whether processing thread names will be
set to `queryType_dataSource_intervals` while processing a query. This aids in
interpreting thread dumps, and is on by default. Query overhead can be reduced
slightly by setting this to `false`. This has a tiny effect in most scenarios,
but can be meaningful in high-QPS, low-per-segment-processing-time scenarios. |
+|`maxNumericInFilters`|`-1`|Max limit for the amount of numeric values that
can be compared for a string type dimension when the entire SQL WHERE clause of
a query translates only to an [OR](../querying/filters.md#or) of [Bound
filter](../querying/filters.md#bound-filter). By default, Druid does not
restrict the amount of of numeric Bound Filters on String columns, although
this situation may block other queries from running. Set this parameter to a
smaller value to prevent Druid from ru [...]
+|`inSubQueryThreshold`|`2147483647`| Threshold for minimum number of values in
an IN clause to convert the query to a JOIN operation on an inlined table
rather than a predicate. A threshold of 0 forces usage of an inline table in
all cases; a threshold of [Integer.MAX_VALUE] forces usage of OR in all cases. |
## Druid SQL parameters
@@ -76,25 +76,25 @@ Some query types offer context parameters specific to that
query type.
### TopN
-|property |default | description |
+|Parameter |Default | Description |
|-----------------|---------------------|----------------------|
-|minTopNThreshold | `1000` | The top minTopNThreshold local
results from each segment are returned for merging to determine the global
topN. |
+|`minTopNThreshold` | `1000` | The top minTopNThreshold local
results from each segment are returned for merging to determine the global
topN. |
### Timeseries
-|property |default | description |
+|Parameter |Default | Description |
|-----------------|---------------------|----------------------|
-|skipEmptyBuckets | `false` | Disable timeseries zero-filling
behavior, so only buckets with results will be returned. |
+|`skipEmptyBuckets` | `false` | Disable timeseries zero-filling
behavior, so only buckets with results will be returned. |
### Join filter
-|property |default | description |
+|Parameter |Default | Description |
|-----------------|---------------------|----------------------|
-|enableJoinFilterPushDown | `true` | Controls whether a join query will
attempt filter push down, which reduces the number of rows that have to be
compared in a join operation.|
-|enableJoinFilterRewrite | `true` | Controls whether filter clauses that
reference non-base table columns will be rewritten into filters on base table
columns.|
-|enableJoinFilterRewriteValueColumnFilters | `false` | Controls whether Druid
rewrites non-base table filters on non-key columns in the non-base table.
Requires a scan of the non-base table.|
-|enableRewriteJoinToFilter | `true` | Controls whether a join can be pushed
partial or fully to the base table as a filter at runtime.|
-|joinFilterRewriteMaxSize | `10000` | The maximum size of the correlated value
set used for filter rewrites. Set this limit to prevent excessive memory use.|
+|`enableJoinFilterPushDown` | `true` | Controls whether a join query will
attempt filter push down, which reduces the number of rows that have to be
compared in a join operation.|
+|`enableJoinFilterRewrite` | `true` | Controls whether filter clauses that
reference non-base table columns will be rewritten into filters on base table
columns.|
+|`enableJoinFilterRewriteValueColumnFilters` | `false` | Controls whether
Druid rewrites non-base table filters on non-key columns in the non-base table.
Requires a scan of the non-base table.|
+|`enableRewriteJoinToFilter` | `true` | Controls whether a join can be pushed
partial or fully to the base table as a filter at runtime.|
+|`joinFilterRewriteMaxSize` | `10000` | The maximum size of the correlated
value set used for filter rewrites. Set this limit to prevent excessive memory
use.|
### GroupBy
@@ -120,11 +120,11 @@ include "selector", "bound", "in", "like", "regex",
"search", "and", "or", and "
- Only immutable segments (not real-time).
- Only [table datasources](datasource.md#table) (not joins, subqueries,
lookups, or inline datasources).
-Other query types (like TopN, Scan, Select, and Search) ignore the "vectorize"
parameter, and will execute without
-vectorization. These query types will ignore the "vectorize" parameter even if
it is set to `"force"`.
+Other query types (like TopN, Scan, Select, and Search) ignore the `vectorize`
parameter, and will execute without
+vectorization. These query types will ignore the `vectorize` parameter even if
it is set to `"force"`.
-|property|default| description|
-|--------|-------|------------|
-|vectorize|`true`|Enables or disables vectorized query execution. Possible
values are `false` (disabled), `true` (enabled if possible, disabled otherwise,
on a per-segment basis), and `force` (enabled, and groupBy or timeseries
queries that cannot be vectorized will fail). The `"force"` setting is meant to
aid in testing, and is not generally useful in production (since real-time
segments can never be processed with vectorized execution, any queries on
real-time data will fail). This wil [...]
-|vectorSize|`512`|Sets the row batching size for a particular query. This will
override `druid.query.default.context.vectorSize` if it's set.|
-|vectorizeVirtualColumns|`true`|Enables or disables vectorized query
processing of queries with virtual columns, layered on top of `vectorize`
(`vectorize` must also be set to true for a query to utilize vectorization).
Possible values are `false` (disabled), `true` (enabled if possible, disabled
otherwise, on a per-segment basis), and `force` (enabled, and groupBy or
timeseries queries with virtual columns that cannot be vectorized will fail).
The `"force"` setting is meant to aid in te [...]
+|Parameter|Default| Description|
+|---------|-------|------------|
+|`vectorize`|`true`|Enables or disables vectorized query execution. Possible
values are `false` (disabled), `true` (enabled if possible, disabled otherwise,
on a per-segment basis), and `force` (enabled, and groupBy or timeseries
queries that cannot be vectorized will fail). The `"force"` setting is meant to
aid in testing, and is not generally useful in production (since real-time
segments can never be processed with vectorized execution, any queries on
real-time data will fail). This w [...]
+|`vectorSize`|`512`|Sets the row batching size for a particular query. This
will override `druid.query.default.context.vectorSize` if it's set.|
+|`vectorizeVirtualColumns`|`true`|Enables or disables vectorized query
processing of queries with virtual columns, layered on top of `vectorize`
(`vectorize` must also be set to true for a query to utilize vectorization).
Possible values are `false` (disabled), `true` (enabled if possible, disabled
otherwise, on a per-segment basis), and `force` (enabled, and groupBy or
timeseries queries with virtual columns that cannot be vectorized will fail).
The `"force"` setting is meant to aid in [...]
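
Reader's note: the updated docs say that context parameters travel in a JSON object named `context`, both for Druid SQL over HTTP POST and for native queries. The sketch below only builds such request bodies and does not contact a server. The helper names, the `wikipedia` datasource, and the sample values are illustrative assumptions; the parameter names (`timeout`, `priority`, `useCache`, `vectorize`, `skipEmptyBuckets`) are taken from the tables in this diff.

```python
import json


def sql_request_body(sql, context):
    """Build a JSON body for a Druid SQL HTTP POST request (hypothetical helper)."""
    return json.dumps({"query": sql, "context": context}, sort_keys=True)


def native_query_with_context(native_query, context):
    """Return a copy of a native query with a `context` object attached."""
    return {**native_query, "context": context}


# Sample context values, chosen only for illustration.
context = {"timeout": 30000, "priority": 5, "useCache": False, "vectorize": "false"}

# Druid SQL: the context sits next to the SQL string in the POST body.
print(sql_request_body("SELECT COUNT(*) FROM wikipedia", context))

# Native timeseries query against an assumed `wikipedia` datasource.
timeseries = {
    "queryType": "timeseries",
    "dataSource": "wikipedia",
    "intervals": ["2022-10-01/2022-10-02"],
    "granularity": "hour",
    "aggregations": [{"type": "count", "name": "rows"}],
}
print(json.dumps(native_query_with_context(timeseries, {"skipEmptyBuckets": True})))
```

Because the helpers return new objects instead of mutating their inputs, the same base query can be reused with different contexts when comparing settings.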