techdocsmith commented on code in PR #18252: URL: https://github.com/apache/druid/pull/18252#discussion_r2216603867
########## docs/querying/dart.md: ########## @@ -0,0 +1,101 @@ +--- +id: dart +title: "SQL queries using the Dart query engine" +sidebar_label: "Dart query engine" +description: Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +:::info[Experimental] + +Dart is experimental. For production use, we recommend using the other available query engines. + +::: + + +Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. For example, use Dart for GROUP BY queries that have intermediate results consisting of hundreds of millions of rows. Dart works well for these sorts of queries because its multi-threaded workers perform in-memory shuffles using locally cached data. There's no time spent hitting deep storage. + +You can query batch or realtime datasources with Dart. + +## Enable Dart + +To enable Dart, add the following line to your `broker/runtime.properties` and `historical/runtime.properties` files: + +``` +druid.msq.dart.enabled = true +``` + +### Additional configs + +There are additional configs that provide some control over Dart's resource consumption. Review Comment: ```suggestion You can configure the Broker and the Historical to tune Dart's resource consumption. ``` ########## docs/querying/dart.md: ########## @@ -0,0 +1,101 @@ +--- +id: dart +title: "SQL queries using the Dart query engine" +sidebar_label: "Dart query engine" +description: Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +:::info[Experimental] + +Dart is experimental. For production use, we recommend using the other available query engines. 
+ +::: + + +Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. For example, use Dart for GROUP BY queries that have intermediate results consisting of hundreds of millions of rows. Dart works well for these sorts of queries because its multi-threaded workers perform in-memory shuffles using locally cached data. There's no time spent hitting deep storage. Review Comment: This is a little confusing since MSQ is for ingestion only AFAIK. Or can DART do INSERT INTO/ingestion too? ```suggestion Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. For example, use Dart for GROUP BY queries that have intermediate results consisting of hundreds of millions of rows. In this case, the Dart engine's multi-threaded workers perform in-memory shuffles using locally cached data without pulling from deep storage. ``` ########## docs/querying/dart.md: ########## @@ -0,0 +1,101 @@ +--- +id: dart +title: "SQL queries using the Dart query engine" +sidebar_label: "Dart query engine" +description: Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +:::info[Experimental] + +Dart is experimental. For production use, we recommend using the other available query engines. + +::: + + +Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. For example, use Dart for GROUP BY queries that have intermediate results consisting of hundreds of millions of rows. Dart works well for these sorts of queries because its multi-threaded workers perform in-memory shuffles using locally cached data. There's no time spent hitting deep storage. + +You can query batch or realtime datasources with Dart. 
+ +## Enable Dart + +To enable Dart, add the following line to your `broker/runtime.properties` and `historical/runtime.properties` files: + +``` +druid.msq.dart.enabled = true +``` + +### Additional configs + +There are additional configs that provide some control over Dart's resource consumption. + +For Brokers, you can set the following configs: + +- `druid.msq.dart.controller.concurrentQueries`: The maximum number of query controllers that can run concurrently on that Broker. Additional controllers are queued. Defaults to 1. +- `druid.msq.dart.query.context.targetPartitionsPerWorker`: The number of partitions per worker to create during a shuffle. We recommend setting this to the number of threads available on workers to fully take advantage of multi-threaded processing of shuffled data. Review Comment: suggest using a table for configuration reference. ########## docs/querying/dart.md: ########## @@ -0,0 +1,101 @@ +--- +id: dart +title: "SQL queries using the Dart query engine" +sidebar_label: "Dart query engine" +description: Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +:::info[Experimental] + +Dart is experimental. For production use, we recommend using the other available query engines. + +::: + + +Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. For example, use Dart for GROUP BY queries that have intermediate results consisting of hundreds of millions of rows. Dart works well for these sorts of queries because its multi-threaded workers perform in-memory shuffles using locally cached data. There's no time spent hitting deep storage. + +You can query batch or realtime datasources with Dart. 
+ +## Enable Dart + +To enable Dart, add the following line to your `broker/runtime.properties` and `historical/runtime.properties` files: + +``` +druid.msq.dart.enabled = true +``` + +### Additional configs Review Comment: Prefer "configurations" over configs ########## docs/querying/dart.md: ########## @@ -0,0 +1,101 @@ +--- +id: dart +title: "SQL queries using the Dart query engine" +sidebar_label: "Dart query engine" +description: Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +:::info[Experimental] + +Dart is experimental. For production use, we recommend using the other available query engines. + +::: + + +Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. For example, use Dart for GROUP BY queries that have intermediate results consisting of hundreds of millions of rows. Dart works well for these sorts of queries because its multi-threaded workers perform in-memory shuffles using locally cached data. There's no time spent hitting deep storage. + +You can query batch or realtime datasources with Dart. + +## Enable Dart + +To enable Dart, add the following line to your `broker/runtime.properties` and `historical/runtime.properties` files: + +``` +druid.msq.dart.enabled = true +``` + +### Additional configs + +There are additional configs that provide some control over Dart's resource consumption. + +For Brokers, you can set the following configs: + +- `druid.msq.dart.controller.concurrentQueries`: The maximum number of query controllers that can run concurrently on that Broker. Additional controllers are queued. Defaults to 1. +- `druid.msq.dart.query.context.targetPartitionsPerWorker`: The number of partitions per worker to create during a shuffle. 
We recommend setting this to the number of threads available on workers to fully take advantage of multi-threaded processing of shuffled data. + +For Historicals, you can set the following configs: + +- `druid.msq.dart.worker.concurrentQueries`: The maximum number of query workers that can run concurrently on that Historical. Default is equal to the number of merge buffers because each query needs one merge buffer. Ideally, this should be equal to or larger than the sum of the `concurrentQueries` setting on yourl Brokers. +- `druid.msq.dart.worker.heapFraction`: The maximum amount of heap available for use across all Dart queries as a decimal. The default is 0.35, 35% of heap. + + +## Run a Dart query + +Once enabled, you can select Dart from the available engines in the Druid console or the API to issue queries like with other query engines. Review Comment: ```suggestion Once enabled, you can use Dart in the Druid console or the (Query?) API to issue queries. ``` This reads awkwardly because, with the API, you don't select Dart from a list of available engines. I don't think "like with other query engines" adds much here. ########## docs/querying/dart.md: ########## @@ -0,0 +1,101 @@ +--- +id: dart +title: "SQL queries using the Dart query engine" +sidebar_label: "Dart query engine" +description: Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +:::info[Experimental] + +Dart is experimental. For production use, we recommend using the other available query engines. + +::: + + +Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. For example, use Dart for GROUP BY queries that have intermediate results consisting of hundreds of millions of rows. Dart works well for these sorts of queries because its multi-threaded workers perform in-memory shuffles using locally cached data. 
There's no time spent hitting deep storage. + +You can query batch or realtime datasources with Dart. + +## Enable Dart + +To enable Dart, add the following line to your `broker/runtime.properties` and `historical/runtime.properties` files: + +``` +druid.msq.dart.enabled = true +``` + +### Additional configs + +There are additional configs that provide some control over Dart's resource consumption. + +For Brokers, you can set the following configs: + +- `druid.msq.dart.controller.concurrentQueries`: The maximum number of query controllers that can run concurrently on that Broker. Additional controllers are queued. Defaults to 1. +- `druid.msq.dart.query.context.targetPartitionsPerWorker`: The number of partitions per worker to create during a shuffle. We recommend setting this to the number of threads available on workers to fully take advantage of multi-threaded processing of shuffled data. Review Comment: ```suggestion - `druid.msq.dart.query.context.targetPartitionsPerWorker`: The number of partitions per worker to create during a shuffle. Set this to the number of available threads on workers to fully take advantage of multi-threaded processing of shuffled data. ``` ########## docs/querying/dart.md: ########## @@ -0,0 +1,101 @@ +--- +id: dart +title: "SQL queries using the Dart query engine" +sidebar_label: "Dart query engine" +description: Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +:::info[Experimental] + +Dart is experimental. For production use, we recommend using the other available query engines. + +::: + + +Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. For example, use Dart for GROUP BY queries that have intermediate results consisting of hundreds of millions of rows. 
Dart works well for these sorts of queries because its multi-threaded workers perform in-memory shuffles using locally cached data. There's no time spent hitting deep storage. + +You can query batch or realtime datasources with Dart. + +## Enable Dart + +To enable Dart, add the following line to your `broker/runtime.properties` and `historical/runtime.properties` files: + +``` +druid.msq.dart.enabled = true +``` + +### Additional configs + +There are additional configs that provide some control over Dart's resource consumption. + +For Brokers, you can set the following configs: + +- `druid.msq.dart.controller.concurrentQueries`: The maximum number of query controllers that can run concurrently on that Broker. Additional controllers are queued. Defaults to 1. +- `druid.msq.dart.query.context.targetPartitionsPerWorker`: The number of partitions per worker to create during a shuffle. We recommend setting this to the number of threads available on workers to fully take advantage of multi-threaded processing of shuffled data. + +For Historicals, you can set the following configs: + +- `druid.msq.dart.worker.concurrentQueries`: The maximum number of query workers that can run concurrently on that Historical. Default is equal to the number of merge buffers because each query needs one merge buffer. Ideally, this should be equal to or larger than the sum of the `concurrentQueries` setting on yourl Brokers. +- `druid.msq.dart.worker.heapFraction`: The maximum amount of heap available for use across all Dart queries as a decimal. The default is 0.35, 35% of heap. + + +## Run a Dart query + +Once enabled, you can select Dart from the available engines in the Druid console or the API to issue queries like with other query engines. + +### Druid console + +In the **Query** view, select **Engine: SQL (Dart)** from the engine selector menu. Review Comment: consider a screen capture. not absolutely necessary. 
########## docs/querying/dart.md: ########## @@ -0,0 +1,101 @@ +--- +id: dart +title: "SQL queries using the Dart query engine" +sidebar_label: "Dart query engine" +description: Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +:::info[Experimental] + +Dart is experimental. For production use, we recommend using the other available query engines. + +::: + + +Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. For example, use Dart for GROUP BY queries that have intermediate results consisting of hundreds of millions of rows. Dart works well for these sorts of queries because its multi-threaded workers perform in-memory shuffles using locally cached data. There's no time spent hitting deep storage. + +You can query batch or realtime datasources with Dart. + +## Enable Dart + +To enable Dart, add the following line to your `broker/runtime.properties` and `historical/runtime.properties` files: + +``` +druid.msq.dart.enabled = true +``` + +### Additional configs + +There are additional configs that provide some control over Dart's resource consumption. + +For Brokers, you can set the following configs: + +- `druid.msq.dart.controller.concurrentQueries`: The maximum number of query controllers that can run concurrently on that Broker. Additional controllers are queued. Defaults to 1. +- `druid.msq.dart.query.context.targetPartitionsPerWorker`: The number of partitions per worker to create during a shuffle. We recommend setting this to the number of threads available on workers to fully take advantage of multi-threaded processing of shuffled data. + +For Historicals, you can set the following configs: + +- `druid.msq.dart.worker.concurrentQueries`: The maximum number of query workers that can run concurrently on that Historical. 
Default is equal to the number of merge buffers because each query needs one merge buffer. Ideally, this should be equal to or larger than the sum of the `concurrentQueries` setting on yourl Brokers. +- `druid.msq.dart.worker.heapFraction`: The maximum amount of heap available for use across all Dart queries as a decimal. The default is 0.35, 35% of heap. + + +## Run a Dart query + +Once enabled, you can select Dart from the available engines in the Druid console or the API to issue queries like with other query engines. + +### Druid console + +In the **Query** view, select **Engine: SQL (Dart)** from the engine selector menu. + +### API + +Dart uses the SQL endpoint `/druid/v2/sql` like the other SQL query engines. To use Dart, include the query context parameter `engine` and set it to `msq-dart`: Review Comment: I don't know that "like the other SQL query engines" does much work here.
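For reference on the API section quoted in the last hunk, here is a minimal sketch of what a request selecting Dart might look like. It relies only on what the quoted doc states: the `/druid/v2/sql` endpoint and the `engine` context parameter set to `msq-dart`. The host and port, the `wikipedia` datasource, and the `build_dart_request` helper are assumptions for illustration and are not part of the PR. The sketch builds the request without sending it.

```python
import json
import urllib.request

# Assumption: a Router or Broker is reachable at this address; adjust for your cluster.
DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"


def build_dart_request(sql: str, url: str = DRUID_SQL_URL) -> urllib.request.Request:
    """Build (but do not send) a SQL API request that selects the Dart engine.

    Hypothetical helper for illustration only.
    """
    payload = {
        "query": sql,
        # Per the quoted doc: set the `engine` context parameter to `msq-dart`.
        "context": {"engine": "msq-dart"},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical datasource and query.
request = build_dart_request(
    'SELECT channel, COUNT(*) AS cnt FROM "wikipedia" GROUP BY channel'
)
body = json.loads(request.data)
print(body["context"]["engine"])
```

To actually issue the query, pass `request` to `urllib.request.urlopen` against a cluster where `druid.msq.dart.enabled` is set on the Brokers and Historicals, as the quoted "Enable Dart" section describes.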
