317brian commented on code in PR #18252: URL: https://github.com/apache/druid/pull/18252#discussion_r2224259367
##########
docs/querying/dart.md:
##########
@@ -0,0 +1,101 @@
+---
+id: dart
+title: "SQL queries using the Dart query engine"
+sidebar_label: "Dart query engine"
+description: Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine.
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+:::info[Experimental]
+
+Dart is experimental. For production use, we recommend using the other available query engines.
+
+:::
+
+
+Use the Dart query engine for light-weight queries that don't need all the capabilities of the MSQ task engine. For example, use Dart for GROUP BY queries that have intermediate results consisting of hundreds of millions of rows. Dart works well for these sorts of queries because its multi-threaded workers perform in-memory shuffles using locally cached data. There's no time spent hitting deep storage.
+
+You can query batch or realtime datasources with Dart.
+
+## Enable Dart
+
+To enable Dart, add the following line to your `broker/runtime.properties` and `historical/runtime.properties` files:
+
+```
+druid.msq.dart.enabled = true
+```
+
+### Additional configs
+
+There are additional configs that provide some control over Dart's resource consumption.
+
+For Brokers, you can set the following configs:
+
+- `druid.msq.dart.controller.concurrentQueries`: The maximum number of query controllers that can run concurrently on that Broker. Additional controllers are queued. Defaults to 1.
+- `druid.msq.dart.query.context.targetPartitionsPerWorker`: The number of partitions per worker to create during a shuffle. We recommend setting this to the number of threads available on workers to fully take advantage of multi-threaded processing of shuffled data.
+
+For Historicals, you can set the following configs:
+
+- `druid.msq.dart.worker.concurrentQueries`: The maximum number of query workers that can run concurrently on that Historical. Default is equal to the number of merge buffers because each query needs one merge buffer. Ideally, this should be equal to or larger than the sum of the `concurrentQueries` setting on your Brokers.
+- `druid.msq.dart.worker.heapFraction`: The maximum amount of heap available for use across all Dart queries as a decimal. The default is 0.35, 35% of heap.
+
+
+## Run a Dart query
+
+Once enabled, you can select Dart from the available engines in the Druid console or the API to issue queries like with other query engines.
+
+### Druid console
+
+In the **Query** view, select **Engine: SQL (Dart)** from the engine selector menu.
+
+### API
+
+Dart uses the SQL endpoint `/druid/v2/sql` like the other SQL query engines. To use Dart, include the query context parameter `engine` and set it to `msq-dart`:

Review Comment:
```suggestion
Dart uses the SQL endpoint `/druid/v2/sql`. To use Dart, include the query context parameter `engine` and set it to `msq-dart`:
```
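
To illustrate the API sentence under review, here is a minimal Python sketch of a request that sets the `engine` context parameter to `msq-dart` on the `/druid/v2/sql` endpoint. It only builds the request and does not send it. The base URL `http://localhost:8888`, the helper name `build_dart_request`, and the `wikipedia` datasource are assumptions for illustration and do not come from the PR.

```python
import json
from urllib import request


def build_dart_request(sql, base_url="http://localhost:8888"):
    """Build (but do not send) a POST request for the Druid SQL endpoint.

    base_url is an assumed address; point it at your own Router or Broker.
    """
    payload = {
        "query": sql,
        # The query context parameter named in the docs text under review.
        "context": {"engine": "msq-dart"},
    }
    return request.Request(
        url=f"{base_url}/druid/v2/sql",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical datasource name, used only to make the example concrete.
req = build_dart_request("SELECT COUNT(*) FROM wikipedia")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would submit the query; that step is left out because it needs a running cluster with `druid.msq.dart.enabled = true` set on the Brokers and Historicals.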
