techdocsmith commented on a change in pull request #12128: URL: https://github.com/apache/druid/pull/12128#discussion_r780446061
########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@ +--- +id: query-concurrency +title: Query concurrency +sidebar_label: Query concurrency +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ "License"); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +If you frequently run concurrent, mixed workloads on your Druid cluster, configure Druid to properly allocate cluster resources and optimize your overall query performance. With proper resource isolation, you can execute long-running, low priority queries that are resource intensive without interfering with short-running, high priority queries that require fewer resources. By separating cluster resources, you prevent queries from competing with each other for resources such as CPU, memory, and network access. Review comment: I don't know if this is in the style, but I recommend "Apache Druid" at first mention. I wonder if we can change the setup of the introduction a little bit to answer: - What are concurrent, mixed (heterogeneous) workloads? Can we provide an example? - What problems can you run into if you don't optimize (for example, long-running, lower-priority queries hurting the performance of higher-priority queries)? 
Then explain that this topic covers strategies for optimizing Druid and avoiding performance problems. ########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@ +--- +id: query-concurrency +title: Query concurrency +sidebar_label: Query concurrency +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ "License"); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +If you frequently run concurrent, mixed workloads on your Druid cluster, configure Druid to properly allocate cluster resources and optimize your overall query performance. With proper resource isolation, you can execute long-running, low priority queries that are resource intensive without interfering with short-running, high priority queries that require fewer resources. By separating cluster resources, you prevent queries from competing with each other for resources such as CPU, memory, and network access. + +There are two approaches to isolate your resources for improving query concurrency: query laning and cluster tiering. Use query laning to set a limit on the maximum number of long-running queries executed on each Broker. Use cluster tiering to define separate groups of Historicals and Brokers to which different queries can be directed based on their priority. 
+ +## Query laning + +Query laning directs Druid to restrict resource usage for less urgent queries to ensure dedicated resources for higher priority queries. Query laning is ideal when you need to run many concurrent queries having heterogeneous workloads. + +Query lanes are analogous to carpool and normal lanes on the freeway. With query laning, Druid restricts low priority queries to low lanes and allows high priority queries to run wherever possible, whether in a high or low lane. In this way, higher priority queries may bypass other queries in lower priority lanes. Review comment: I like the freeway lanes analogy! Something about "low" lanes and "high" lanes feels undefined to me. Perhaps a sentence stating that you define the high-priority query lanes and low-priority query lanes? Also wonder if we can use an example here. Maybe if you set up an example in the intro, you can just refer to the same high priority/low priority example throughout? ########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@ +--- +id: query-concurrency +title: Query concurrency +sidebar_label: Query concurrency +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ "License"); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. 
+ --> + +If you frequently run concurrent, mixed workloads on your Druid cluster, configure Druid to properly allocate cluster resources and optimize your overall query performance. With proper resource isolation, you can execute long-running, low priority queries that are resource intensive without interfering with short-running, high priority queries that require fewer resources. By separating cluster resources, you prevent queries from competing with each other for resources such as CPU, memory, and network access. + +There are two approaches to isolate your resources for improving query concurrency: query laning and cluster tiering. Use query laning to set a limit on the maximum number of long-running queries executed on each Broker. Use cluster tiering to define separate groups of Historicals and Brokers to which different queries can be directed based on their priority. + +## Query laning + +Query laning directs Druid to restrict resource usage for less urgent queries to ensure dedicated resources for higher priority queries. Query laning is ideal when you need to run many concurrent queries having heterogeneous workloads. + +Query lanes are analogous to carpool and normal lanes on the freeway. With query laning, Druid restricts low priority queries to low lanes and allows high priority queries to run wherever possible, whether in a high or low lane. In this way, higher priority queries may bypass other queries in lower priority lanes. + +In Druid, query lanes reserve resources for Broker HTTP threads. Each Druid query requires one Broker thread. The number of threads on a Broker is defined by the `druid.server.http.numThreads` parameter. Broker threads may be occupied by tasks other than queries, such as health checks. You can use query laning to limit the number of HTTP threads designated for resource-intensive queries, leaving other threads available for short-running queries and other tasks. 
+ +### General properties + +Set the following query laning properties in the `broker/runtime.properties` file. + +* `druid.query.scheduler.numThreads` – The total number of queries that can be served per Broker. We recommend setting this value to 1-2 less than `druid.server.http.numThreads`. + > The query scheduler by default does not limit the number of Broker HTTP threads. Setting this property to a bounded number limits the thread count. If the allocated threads are all occupied, any incoming query, including interactive queries, will be rejected with an HTTP 429 status code. + +* `druid.query.scheduler.laning.strategy` – The strategy used to assign queries to lanes. You can [define your own laning strategy manually](../configuration/index.md#manual-laning-strategy) or use the built-in [“high/low” laning strategy](../configuration/index.md#highlow-laning-strategy). + +Consider also defining a [prioritization strategy](../configuration/index.md#prioritization-strategies) for how queries are labeled high or low priority. Otherwise, manually set the priority for incoming queries on the query context. + +### Lane-specific properties + +If you use a manual laning strategy, set the following: + +* `druid.query.scheduler.laning.lanes.{name}` +* `druid.query.scheduler.laning.isLimitPercent` + +If you use the high/low laning strategy, set the following: + +* `druid.query.scheduler.laning.maxLowPercent` – The maximum percent of query threads to handle low priority queries. The remaining query threads are dedicated to higher priority queries. 
+ +### Example + +Example config for query laning with the high/low laning strategy: + +``` +# Limit the number of HTTP threads for query processing +# This value should be less than druid.server.http.numThreads +druid.query.scheduler.numThreads=40 + +# Laning strategy +druid.query.scheduler.laning.strategy=hilo +druid.query.scheduler.laning.maxLowPercent=20 +``` + +See [Query prioritization and laning](../configuration/index.md#query-prioritization-and-laning) for details on query laning and the available query laning strategies. + +## Historical tiering + +Configure segment loading and tiers for Historicals as described in this section. In the examples below, we set up two tiers—hot and cold—for the Historicals and for the Brokers. We’ll instruct the Brokers to serve hot queries before cold queries. Short-running queries will be routed to the hot tiers, and long-running queries will be routed to the cold tiers. + +It is possible to separate Historical processes into tiers without having separate Broker tiers. This way, you can assign data from specific time intervals to specific tiers in order to support higher concurrency on hot data. + +### Configure segment loading + +Druid stores data in segment files. Define a [load rule](rule-configuration.md#load-rules) to indicate how segment replicas should be assigned to different Historical tiers. For example, you may store segments of more recent data on more powerful hardware for better performance. + +There are three types of load rules: forever, interval, and period. Select the load rule that matches your use case for each Historical, whether you want all segments to be loaded, segments within a certain time interval, or segments within a certain time period. + +In the load rule, define tiers in the `tieredReplicants` property. Provide descriptive names for your tiers, and specify how many replicas each tier should have. 
You can designate a higher number of replicas for the hot tier to increase the concurrency for processing queries. + +Example load rule with two Historical tiers, named “hot” and “\_default\_tier”: + +``` +{ + "type" : "loadByPeriod", + "period" : "P1M", + "includeFuture" : true, + "tieredReplicants": { + "hot": 3, + "_default_tier" : 1 + } +} +``` + +See [Load rules](rule-configuration.md#load-rules) for more information on segment load rules. Visit [Tutorial: Configuring data retention](../tutorials/tutorial-retention.md) for an example of setting retention rules from the Druid web console. + +### Assign Historicals to tiers + +Assign the Historical to tiers by labeling the tier name and setting the priority value in the `historical/runtime.properties` files. + +Example config for a Historical in the hot tier: + +``` +druid.server.tier=hot +druid.server.priority=1 +``` + +Example config for a Historical in the cold tier: + +``` +druid.server.tier=_default_tier +druid.server.priority=0 +``` + +See [Historical general configuration](../configuration/index.md#historical-general-configuration) for more details on these properties. + +## Broker tiering + +To set up Broker tiering, assign Brokers to tiers and configure query routing by the Router. You must set up Historical tiering before you can use Broker tiering. Review comment: Why would I set up Broker tiering? What is the added benefit? Consider stating that it's dependent on Historical tiering first. ########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@ +--- +id: query-concurrency +title: Query concurrency +sidebar_label: Query concurrency +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. 
The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ "License"); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +If you frequently run concurrent, mixed workloads on your Druid cluster, configure Druid to properly allocate cluster resources and optimize your overall query performance. With proper resource isolation, you can execute long-running, low priority queries that are resource intensive without interfering with short-running, high priority queries that require fewer resources. By separating cluster resources, you prevent queries from competing with each other for resources such as CPU, memory, and network access. + +There are two approaches to isolate your resources for improving query concurrency: query laning and cluster tiering. Use query laning to set a limit on the maximum number of long-running queries executed on each Broker. Use cluster tiering to define separate groups of Historicals and Brokers to which different queries can be directed based on their priority. + +## Query laning + +Query laning directs Druid to restrict resource usage for less urgent queries to ensure dedicated resources for higher priority queries. Query laning is ideal when you need to run many concurrent queries having heterogeneous workloads. + +Query lanes are analogous to carpool and normal lanes on the freeway. With query laning, Druid restricts low priority queries to low lanes and allows high priority queries to run wherever possible, whether in a high or low lane. 
In this way, higher priority queries may bypass other queries in lower priority lanes. + +In Druid, query lanes reserve resources for Broker HTTP threads. Each Druid query requires one Broker thread. The number of threads on a Broker is defined by the `druid.server.http.numThreads` parameter. Broker threads may be occupied by tasks other than queries, such as health checks. You can use query laning to limit the number of HTTP threads designated for resource-intensive queries, leaving other threads available for short-running queries and other tasks. + +### General properties + +Set the following query laning properties in the `broker/runtime.properties` file. + +* `druid.query.scheduler.numThreads` – The total number of queries that can be served per Broker. We recommend setting this value to 1-2 less than `druid.server.http.numThreads`. + > The query scheduler by default does not limit the number of Broker HTTP threads. Setting this property to a bounded number limits the thread count. If the allocated threads are all occupied, any incoming query, including interactive queries, will be rejected with an HTTP 429 status code. + +* `druid.query.scheduler.laning.strategy` – The strategy used to assign queries to lanes. You can [define your own laning strategy manually](../configuration/index.md#manual-laning-strategy) or use the built-in [“high/low” laning strategy](../configuration/index.md#highlow-laning-strategy). + +Consider also defining a [prioritization strategy](../configuration/index.md#prioritization-strategies) for how queries are labeled high or low priority. Otherwise, manually set the priority for incoming queries on the query context. 
+ +### Lane-specific properties + +If you use a manual laning strategy, set the following: + +* `druid.query.scheduler.laning.lanes.{name}` +* `druid.query.scheduler.laning.isLimitPercent` + +If you use the high/low laning strategy, set the following: + +* `druid.query.scheduler.laning.maxLowPercent` – The maximum percent of query threads to handle low priority queries. The remaining query threads are dedicated to higher priority queries. + +### Example + +Example config for query laning with the high/low laning strategy: + +``` +# Limit the number of HTTP threads for query processing +# This value should be less than druid.server.http.numThreads +druid.query.scheduler.numThreads=40 + +# Laning strategy +druid.query.scheduler.laning.strategy=hilo +druid.query.scheduler.laning.maxLowPercent=20 +``` + +See [Query prioritization and laning](../configuration/index.md#query-prioritization-and-laning) for details on query laning and the available query laning strategies. + +## Historical tiering + +Configure segment loading and tiers for Historicals as described in this section. In the examples below, we set up two tiers—hot and cold—for the Historicals and for the Brokers. We’ll instruct the Brokers to serve hot queries before cold queries. Short-running queries will be routed to the hot tiers, and long-running queries will be routed to the cold tiers. + +It is possible to separate Historical processes into tiers without having separate Broker tiers. This way, you can assign data from specific time intervals to specific tiers in order to support higher concurrency on hot data. + +### Configure segment loading + +Druid stores data in segment files. Define a [load rule](rule-configuration.md#load-rules) to indicate how segment replicas should be assigned to different Historical tiers. For example, you may store segments of more recent data on more powerful hardware for better performance. + +There are three types of load rules: forever, interval, and period. 
Select the load rule that matches your use case for each Historical, whether you want all segments to be loaded, segments within a certain time interval, or segments within a certain time period. + +In the load rule, define tiers in the `tieredReplicants` property. Provide descriptive names for your tiers, and specify how many replicas each tier should have. You can designate a higher number of replicas for the hot tier to increase the concurrency for processing queries. + +Example load rule with two Historical tiers, named “hot” and “\_default\_tier”: + +``` +{ + "type" : "loadByPeriod", + "period" : "P1M", + "includeFuture" : true, + "tieredReplicants": { + "hot": 3, + "_default_tier" : 1 + } +} +``` + +See [Load rules](rule-configuration.md#load-rules) for more information on segment load rules. Visit [Tutorial: Configuring data retention](../tutorials/tutorial-retention.md) for an example of setting retention rules from the Druid web console. + +### Assign Historicals to tiers + +Assign the Historical to tiers by labeling the tier name and setting the priority value in the `historical/runtime.properties` files. + +Example config for a Historical in the hot tier: Review comment: ```suggestion Example Historical in the hot tier: ``` ########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@ +--- +id: query-concurrency +title: Query concurrency +sidebar_label: Query concurrency +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ "License"); you may not use this file except in compliance + ~ with the License. 
You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +If you frequently run concurrent, mixed workloads on your Druid cluster, configure Druid to properly allocate cluster resources and optimize your overall query performance. With proper resource isolation, you can execute long-running, low priority queries that are resource intensive without interfering with short-running, high priority queries that require fewer resources. By separating cluster resources, you prevent queries from competing with each other for resources such as CPU, memory, and network access. + +There are two approaches to isolate your resources for improving query concurrency: query laning and cluster tiering. Use query laning to set a limit on the maximum number of long-running queries executed on each Broker. Use cluster tiering to define separate groups of Historicals and Brokers to which different queries can be directed based on their priority. + +## Query laning + +Query laning directs Druid to restrict resource usage for less urgent queries to ensure dedicated resources for higher priority queries. Query laning is ideal when you need to run many concurrent queries having heterogeneous workloads. + +Query lanes are analogous to carpool and normal lanes on the freeway. With query laning, Druid restricts low priority queries to low lanes and allows high priority queries to run wherever possible, whether in a high or low lane. In this way, higher priority queries may bypass other queries in lower priority lanes. + +In Druid, query lanes reserve resources for Broker HTTP threads. 
Each Druid query requires one Broker thread. The number of threads on a Broker is defined by the `druid.server.http.numThreads` parameter. Broker threads may be occupied by tasks other than queries, such as health checks. You can use query laning to limit the number of HTTP threads designated for resource-intensive queries, leaving other threads available for short-running queries and other tasks. + +### General properties + +Set the following query laning properties in the `broker/runtime.properties` file. + +* `druid.query.scheduler.numThreads` – The total number of queries that can be served per Broker. We recommend setting this value to 1-2 less than `druid.server.http.numThreads`. + > The query scheduler by default does not limit the number of Broker HTTP threads. Setting this property to a bounded number limits the thread count. If the allocated threads are all occupied, any incoming query, including interactive queries, will be rejected with an HTTP 429 status code. + +* `druid.query.scheduler.laning.strategy` – The strategy used to assign queries to lanes. You can [define your own laning strategy manually](../configuration/index.md#manual-laning-strategy) or use the built-in [“high/low” laning strategy](../configuration/index.md#highlow-laning-strategy). + +Consider also defining a [prioritization strategy](../configuration/index.md#prioritization-strategies) for how queries are labeled high or low priority. Otherwise, manually set the priority for incoming queries on the query context. + +### Lane-specific properties + +If you use a manual laning strategy, set the following: + +* `druid.query.scheduler.laning.lanes.{name}` +* `druid.query.scheduler.laning.isLimitPercent` + +If you use the high/low laning strategy, set the following: + +* `druid.query.scheduler.laning.maxLowPercent` – The maximum percent of query threads to handle low priority queries. The remaining query threads are dedicated to higher priority queries. 
+ +### Example + +Example config for query laning with the high/low laning strategy: + +``` +# Limit the number of HTTP threads for query processing +# This value should be less than druid.server.http.numThreads +druid.query.scheduler.numThreads=40 + +# Laning strategy +druid.query.scheduler.laning.strategy=hilo +druid.query.scheduler.laning.maxLowPercent=20 +``` + +See [Query prioritization and laning](../configuration/index.md#query-prioritization-and-laning) for details on query laning and the available query laning strategies. + +## Historical tiering + +Configure segment loading and tiers for Historicals as described in this section. In the examples below, we set up two tiers—hot and cold—for the Historicals and for the Brokers. We’ll instruct the Brokers to serve hot queries before cold queries. Short-running queries will be routed to the hot tiers, and long-running queries will be routed to the cold tiers. + +It is possible to separate Historical processes into tiers without having separate Broker tiers. This way, you can assign data from specific time intervals to specific tiers in order to support higher concurrency on hot data. + +### Configure segment loading + +Druid stores data in segment files. Define a [load rule](rule-configuration.md#load-rules) to indicate how segment replicas should be assigned to different Historical tiers. For example, you may store segments of more recent data on more powerful hardware for better performance. + +There are three types of load rules: forever, interval, and period. Select the load rule that matches your use case for each Historical, whether you want all segments to be loaded, segments within a certain time interval, or segments within a certain time period. + +In the load rule, define tiers in the `tieredReplicants` property. Provide descriptive names for your tiers, and specify how many replicas each tier should have. 
You can designate a higher number of replicas for the hot tier to increase the concurrency for processing queries. + +Example load rule with two Historical tiers, named “hot” and “\_default\_tier”: + +``` +{ + "type" : "loadByPeriod", + "period" : "P1M", + "includeFuture" : true, + "tieredReplicants": { + "hot": 3, + "_default_tier" : 1 + } +} +``` + +See [Load rules](rule-configuration.md#load-rules) for more information on segment load rules. Visit [Tutorial: Configuring data retention](../tutorials/tutorial-retention.md) for an example of setting retention rules from the Druid web console. + +### Assign Historicals to tiers + +Assign the Historical to tiers by labeling the tier name and setting the priority value in the `historical/runtime.properties` files. + +Example config for a Historical in the hot tier: + +``` +druid.server.tier=hot +druid.server.priority=1 +``` + +Example config for a Historical in the cold tier: Review comment: ```suggestion Example Historical in the cold tier: ``` ########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@ +--- +id: query-concurrency +title: Query concurrency +sidebar_label: Query concurrency +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ "License"); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. 
See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +If you frequently run concurrent, mixed workloads on your Druid cluster, configure Druid to properly allocate cluster resources and optimize your overall query performance. With proper resource isolation, you can execute long-running, low priority queries that are resource intensive without interfering with short-running, high priority queries that require fewer resources. By separating cluster resources, you prevent queries from competing with each other for resources such as CPU, memory, and network access. + +There are two approaches to isolate your resources for improving query concurrency: query laning and cluster tiering. Use query laning to set a limit on the maximum number of long-running queries executed on each Broker. Use cluster tiering to define separate groups of Historicals and Brokers to which different queries can be directed based on their priority. + +## Query laning + +Query laning directs Druid to restrict resource usage for less urgent queries to ensure dedicated resources for higher priority queries. Query laning is ideal when you need to run many concurrent queries having heterogeneous workloads. + +Query lanes are analogous to carpool and normal lanes on the freeway. With query laning, Druid restricts low priority queries to low lanes and allows high priority queries to run wherever possible, whether in a high or low lane. In this way, higher priority queries may bypass other queries in lower priority lanes. + +In Druid, query lanes reserve resources for Broker HTTP threads. Each Druid query requires one Broker thread. The number of threads on a Broker is defined by the `druid.server.http.numThreads` parameter. Broker threads may be occupied by tasks other than queries, such as health checks. 
You can use query laning to limit the number of HTTP threads designated for resource-intensive queries, leaving other threads available for short-running queries and other tasks. + +### General properties + +Set the following query laning properties in the `broker/runtime.properties` file. + +* `druid.query.scheduler.numThreads` – The total number of queries that can be served per Broker. We recommend setting this value to 1-2 less than `druid.server.http.numThreads`. + > The query scheduler by default does not limit the number of Broker HTTP threads. Setting this property to a bounded number limits the thread count. If the allocated threads are all occupied, any incoming query, including interactive queries, will be rejected with an HTTP 429 status code. + +* `druid.query.scheduler.laning.strategy` – The strategy used to assign queries to lanes. You can [define your own laning strategy manually](../configuration/index.md#manual-laning-strategy) or use the built-in [“high/low” laning strategy](../configuration/index.md#highlow-laning-strategy). + +Consider also defining a [prioritization strategy](../configuration/index.md#prioritization-strategies) for how queries are labeled high or low priority. Otherwise, manually set the priority for incoming queries on the query context. + +### Lane-specific properties + +If you use a manual laning strategy, set the following: + +* `druid.query.scheduler.laning.lanes.{name}` +* `druid.query.scheduler.laning.isLimitPercent` + +If you use the high/low laning strategy, set the following: + +* `druid.query.scheduler.laning.maxLowPercent` – The maximum percent of query threads to handle low priority queries. The remaining query threads are dedicated to higher priority queries. 
+ +### Example + +Example config for query laning with the high/low laning strategy: + +``` +# Limit the number of HTTP threads for query processing +# This value should be less than druid.server.http.numThreads +druid.query.scheduler.numThreads=40 + +# Laning strategy +druid.query.scheduler.laning.strategy=hilo +druid.query.scheduler.laning.maxLowPercent=20 +``` + +See [Query prioritization and laning](../configuration/index.md#query-prioritization-and-laning) for details on query laning and the available query laning strategies. + +## Historical tiering + +Configure segment loading and tiers for Historicals as described in this section. In the examples below, we set up two tiers—hot and cold—for the Historicals and for the Brokers. We’ll instruct the Brokers to serve hot queries before cold queries. Short-running queries will be routed to the hot tiers, and long-running queries will be routed to the cold tiers. + +It is possible to separate Historical processes into tiers without having separate Broker tiers. This way, you can assign data from specific time intervals to specific tiers in order to support higher concurrency on hot data. + +### Configure segment loading + +Druid stores data in segment files. Define a [load rule](rule-configuration.md#load-rules) to indicate how segment replicas should be assigned to different Historical tiers. For example, you may store segments of more recent data on more powerful hardware for better performance. Review comment: The blunt statement about how Druid stores data, feels abrupt. It's also not exactly the subject of this article. Maybe talk about it in the context of data distribution across Historicals. Also is this date-related tiering? it seems like the statement in line 78 about fast-running queries vs slow-running queries doesn't apply here? Like we could issue a really slow query on the most current data, right? 
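For reference, the arithmetic behind the high/low laning example quoted in the hunk above (`druid.query.scheduler.numThreads=40`, `druid.query.scheduler.laning.maxLowPercent=20`) can be sketched in Python. This is only an illustration, not Druid code: the function name `lane_capacities`, the dictionary keys, and the round-up behavior are assumptions made for the sketch.

```python
import math


def lane_capacities(num_threads: int, max_low_percent: int) -> dict:
    """Estimate how the scheduler's query slots split under a high/low strategy.

    Hypothetical helper for illustration; it only mirrors the two property
    values shown in the quoted example config.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be positive")
    if not 1 <= max_low_percent <= 100:
        raise ValueError("max_low_percent must be between 1 and 100")

    # Assumption: the low-lane cap is the given percentage of all slots,
    # rounded up to a whole number of queries.
    low_max = math.ceil(num_threads * max_low_percent / 100)

    return {
        "total": num_threads,                        # all queries, any priority
        "low_max": low_max,                          # most low priority queries at once
        "kept_free_of_low": num_threads - low_max,   # slots low priority queries cannot take
    }


caps = lane_capacities(num_threads=40, max_low_percent=20)
print(caps)
```

With the quoted values, at most 8 of the 40 scheduler slots can be held by low priority queries, which leaves 32 that low priority queries cannot occupy.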
########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@
+ +Consider also defining a [prioritization strategy](../configuration/index.md#prioritization-strategies) for how queries are labeled high or low priority. Otherwise, manually set the priority for incoming queries on the query context. + +### Lane-specific properties + +If you use a manual laning strategy, set the following: + +* `druid.query.scheduler.laning.lanes.{name}` +* `druid.query.scheduler.laning.isLimitPercent` + +If you use the high/low laning strategy, set the following: + +* `druid.query.scheduler.laning.maxLowPercent` – The maximum percent of query threads to handle low priority queries. The remaining query threads are dedicated to higher priority queries. + +### Example + +Example config for query laning with the high/low laning strategy: + +``` +# Limit the number of HTTP threads for query processing +# This value should be less than druid.server.http.numThreads +druid.query.scheduler.numThreads=40 + +# Laning strategy +druid.query.scheduler.laning.strategy=hilo +druid.query.scheduler.laning.maxLowPercent=20 +``` + +See [Query prioritization and laning](../configuration/index.md#query-prioritization-and-laning) for details on query laning and the available query laning strategies. + +## Historical tiering + +Configure segment loading and tiers for Historicals as described in this section. In the examples below, we set up two tiers—hot and cold—for the Historicals and for the Brokers. We’ll instruct the Brokers to serve hot queries before cold queries. Short-running queries will be routed to the hot tiers, and long-running queries will be routed to the cold tiers. Review comment: - I'm missing the conceptual introduction like was there for Query laning. - Are Historical tiering and Broker tiering coupled in a single approach? If so, I think they should go into a "Service tiering" or "Historical and Broker service tiering" section. - Avoid "we'll" You can configure Brokers... - What are "hot" queries? "cold" queries? 
########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@ + +### Configure segment loading + +Druid stores data in segment files.
Define a [load rule](rule-configuration.md#load-rules) to indicate how segment replicas should be assigned to different Historical tiers. For example, you may store segments of more recent data on more powerful hardware for better performance. + +There are three types of load rules: forever, interval, and period. Select the load rule that matches your use case for each Historical, whether you want all segments to be loaded, segments within a certain time interval, or segments within a certain time period. + +In the load rule, define tiers in the `tieredReplicants` property. Provide descriptive names for your tiers, and specify how many replicas each tier should have. You can designate a higher number of replicas for the hot tier to increase the concurrency for processing queries. + +Example load rule with two Historical tiers, named “hot” and “\_default\_tier”: Review comment: make sure to include a description of what it does (loads only the last month of data on these tiers?) ########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@ +--- +id: query-concurrency +title: Query concurrency +sidebar_label: Query concurrency +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ "License"); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. 
+ --> + +If you frequently run concurrent, mixed workloads on your Druid cluster, configure Druid to properly allocate cluster resources and optimize your overall query performance. With proper resource isolation, you can execute long-running, low priority queries that are resource intensive without interfering with short-running, high priority queries that require fewer resources. By separating cluster resources, you prevent queries from competing with each other for resources such as CPU, memory, and network access. + +There are two approaches to isolate your resources for improving query concurrency: query laning and cluster tiering. Use query laning to set a limit on the maximum number of long-running queries executed on each Broker. Use cluster tiering to define separate groups of Historicals and Brokers to which different queries can be directed based on their priority. Review comment: ```suggestion Druid provides the following approaches to isolate resources and improve query concurrency: - **Query laning** where you set a limit on the maximum number of long-running queries executed on each Broker. - **Cluster tiering** which defines separate groups of Historicals and Brokers to receive different query assignments based on query priority. ``` Consider giving the options more space and make them easier to scan. Avoid naming a specific number of options when possible, to make it simpler to add new options. ########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@ +--- +id: query-concurrency +title: Query concurrency +sidebar_label: Query concurrency +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. 
The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ "License"); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + -->
+ +### Lane-specific properties + +If you use a manual laning strategy, set the following: + +* `druid.query.scheduler.laning.lanes.{name}` +* `druid.query.scheduler.laning.isLimitPercent` + +If you use the high/low laning strategy, set the following: + +* `druid.query.scheduler.laning.maxLowPercent` – The maximum percent of query threads to handle low priority queries. The remaining query threads are dedicated to higher priority queries. + +### Example + +Example config for query laning with the high/low laning strategy: + +``` +# Limit the number of HTTP threads for query processing +# This value should be less than druid.server.http.numThreads +druid.query.scheduler.numThreads=40 + +# Laning strategy +druid.query.scheduler.laning.strategy=hilo +druid.query.scheduler.laning.maxLowPercent=20 +``` + +See [Query prioritization and laning](../configuration/index.md#query-prioritization-and-laning) for details on query laning and the available query laning strategies. + +## Historical tiering + +Configure segment loading and tiers for Historicals as described in this section. In the examples below, we set up two tiers—hot and cold—for the Historicals and for the Brokers. We’ll instruct the Brokers to serve hot queries before cold queries. Short-running queries will be routed to the hot tiers, and long-running queries will be routed to the cold tiers. + +It is possible to separate Historical processes into tiers without having separate Broker tiers. This way, you can assign data from specific time intervals to specific tiers in order to support higher concurrency on hot data. Review comment: Even though Broker tiering is optional, it is dependent on Historical Tiering, so I'd keep them in the same section. In the beginning I'd make it clear why you'd want to choose to do tiering for Historicals only or both Historicals and Brokers. 
########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@
+ +## Query laning + +Query laning directs Druid to restrict resource usage for less urgent queries to ensure dedicated resources for higher priority queries. Query laning is ideal when you need to run many concurrent queries having heterogeneous workloads. Review comment: If Broker tiering is the ideal solution, should we put it first as the option? @jihoonson how can we help the user decide when to choose one approach over the other? For example, why would you choose Query laning if Broker tiering is ideal? Does it make sense to have a Strategies section like this one: https://druid.apache.org/docs/0.22.1/ingestion/compaction.html#compaction-strategies ########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@ +--- +id: query-concurrency +title: Query concurrency +sidebar_label: Query concurrency +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ "License"); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +If you frequently run concurrent, mixed workloads on your Druid cluster, configure Druid to properly allocate cluster resources and optimize your overall query performance. 
+ +## Historical tiering + +Configure segment loading and tiers for Historicals as described in this section.
In the examples below, we set up two tiers—hot and cold—for the Historicals and for the Brokers. We’ll instruct the Brokers to serve hot queries before cold queries. Short-running queries will be routed to the hot tiers, and long-running queries will be routed to the cold tiers. + +It is possible to separate Historical processes into tiers without having separate Broker tiers. This way, you can assign data from specific time intervals to specific tiers in order to support higher concurrency on hot data. + +### Configure segment loading + +Druid stores data in segment files. Define a [load rule](rule-configuration.md#load-rules) to indicate how segment replicas should be assigned to different Historical tiers. For example, you may store segments of more recent data on more powerful hardware for better performance. + +There are three types of load rules: forever, interval, and period. Select the load rule that matches your use case for each Historical, whether you want all segments to be loaded, segments within a certain time interval, or segments within a certain time period. + +In the load rule, define tiers in the `tieredReplicants` property. Provide descriptive names for your tiers, and specify how many replicas each tier should have. You can designate a higher number of replicas for the hot tier to increase the concurrency for processing queries. + +Example load rule with two Historical tiers, named “hot” and “\_default\_tier”: + +``` +{ + "type" : "loadByPeriod", + "period" : "P1M", + "includeFuture" : true, + "tieredReplicants": { + "hot": 3, + "_default_tier" : 1 + } +} +``` + +See [Load rules](rule-configuration.md#load-rules) for more information on segment load rules. Visit [Tutorial: Configuring data retention](../tutorials/tutorial-retention.md) for an example of setting retention rules from the Druid web console. 
+ +### Assign Historicals to tiers + +Assign the Historical to tiers by labeling the tier name and setting the priority value in the `historical/runtime.properties` files. Review comment: ```suggestion To assign a Historical to a tier, add a label for the tier name and set the priority value in the `historical/runtime.properties` for the Historical. ``` ########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@ +--- +id: query-concurrency +title: Query concurrency +sidebar_label: Query concurrency +--- + +If you frequently run concurrent, mixed workloads on your Druid cluster, configure Druid to properly allocate cluster resources and optimize your overall query performance. With proper resource isolation, you can execute long-running, low priority queries that are resource intensive without interfering with short-running, high priority queries that require fewer resources. By separating cluster resources, you prevent queries from competing with each other for resources such as CPU, memory, and network access. 
+ +There are two approaches to isolate your resources for improving query concurrency: query laning and cluster tiering. Use query laning to set a limit on the maximum number of long-running queries executed on each Broker. Use cluster tiering to define separate groups of Historicals and Brokers to which different queries can be directed based on their priority. + +## Query laning + +Query laning directs Druid to restrict resource usage for less urgent queries to ensure dedicated resources for higher priority queries. Query laning is ideal when you need to run many concurrent queries having heterogeneous workloads. + +Query lanes are analogous to carpool and normal lanes on the freeway. With query laning, Druid restricts low priority queries to low lanes and allows high priority queries to run wherever possible, whether in a high or low lane. In this way, higher priority queries may bypass other queries in lower priority lanes. Review comment: The piece I am missing from this topic is how a Broker (or Druid) decides which query goes into which lane. I think it merits a brief mention of how you can specify a lane in the query context or set up thresholds (depending). Then, in your example, show a query getting routed to high/low lanes. ########## File path: docs/operations/query-concurrency.md ########## @@ -0,0 +1,174 @@ +--- +id: query-concurrency +title: Query concurrency Review comment: I agree with @jihoonson. Configure Druid for mixed use workloads (this title would be very long for the left nav, but you get the idea). This is mostly task based, so a "verb" title would be good. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
