Re: [PR] docs: msq autocompaction (druid)

via GitHub Tue, 15 Oct 2024 14:15:10 -0700


vtlim commented on code in PR #16681:
URL: https://github.com/apache/druid/pull/16681#discussion_r1801766042



##########
docs/data-management/automatic-compaction.md:
##########
@@ -131,7 +78,62 @@ maximize performance and minimize disk usage of the 
`compact` tasks launched by
 
 For more details on each of the specs in an auto-compaction configuration, see 
[Automatic compaction dynamic 
configuration](../configuration/index.md#automatic-compaction-dynamic-configuration).
 
-### Set frequency of compaction runs
+## Auto-compaction using Coordinator duties
+
+You can control how often the Coordinator checks to see if auto-compaction is 
needed. The Coordinator [indexing 
period](../configuration/index.md#coordinator-operation), 
`druid.coordinator.period.indexingPeriod`, controls the frequency of compaction 
tasks.
+The default indexing period is 30 minutes, meaning that the Coordinator first 
checks for segments to compact at most 30 minutes from when auto-compaction is 
enabled.
+This time period also affects other Coordinator duties such as cleanup of 
unused segments and stale pending segments.
+To configure the auto-compaction time period without interfering with 
`indexingPeriod`, see [Set frequency of compaction 
runs](#change-compaction-frequency).
+
+At every invocation of auto-compaction, the Coordinator initiates a [segment 
search](../design/coordinator.md#segment-search-policy-in-automatic-compaction) 
to determine eligible segments to compact.
+When there are eligible segments to compact, the Coordinator issues compaction 
tasks based on available worker capacity.
+If a compaction task takes longer than the indexing period, the Coordinator 
waits for it to finish before resuming the period for segment search.
+
+No additional configuration is needed to run automatic compaction tasks using 
the Coordinator and native engine. This is the default behavior for Druid.
+You can configure it for a datasource through the web console or 
programmatically via an API.
+This process differs for manual compaction tasks, which can be submitted from 
the [Tasks view of the web console](../operations/web-console.md) or the [Tasks 
API](../api-reference/tasks-api.md).
+
+### Manage auto-compaction using the web-console

Review Comment:
   ```suggestion
   ### Manage auto-compaction using the web console
   ```



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:

Review Comment:
   ```suggestion
   You can run automatic compaction using compaction supervisors on the 
Overlord rather than the Coordinator. Compaction supervisors provide the 
following benefits over Coordinator duties:
   ```



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state

Review Comment:
   Is it easier to use the supervisor framework compared to `GET 
/druid/coordinator/v1/compaction/status`?
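
For context on this question, here is a rough sketch that puts the two status lookups side by side. The Coordinator path is the one quoted in the comment. The supervisor path follows the general supervisor status API. The base URL (borrowed from the `curl` example in this PR) and the supervisor ID are placeholders assumed for illustration, not values confirmed by the PR.

```sh
#!/bin/sh
# Assumption: one base URL reaches both the Coordinator and Overlord APIs.
DRUID_URL="${DRUID_URL:-http://localhost:8081}"

# Print the Coordinator endpoint for compaction status
# (the path quoted in the review comment).
coordinator_status_url() {
  printf '%s/druid/coordinator/v1/compaction/status\n' "$DRUID_URL"
}

# Print the supervisor status endpoint for a given supervisor ID.
# The ID passed in is a placeholder; look up the real one in the Supervisors view.
supervisor_status_url() {
  printf '%s/druid/indexer/v1/supervisor/%s/status\n' "$DRUID_URL" "$1"
}

# Fetch a status URL. Only call this against a running cluster.
fetch_status() {
  curl --silent --show-error --location --request GET "$1"
}
```

With a cluster running, `fetch_status "$(coordinator_status_url)"` and `fetch_status "$(supervisor_status_url YOUR_SUPERVISOR_ID)"` would return the two responses to compare.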



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord 
runtime properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  `druid.supervisor.compaction.engine` to  `msq` to specify the MSQ task 
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some 
differences. Specifically, you submit a supervisor spec with the `type` set to 
`autocompact` and the auto-compaction config in the `spec` to configure 
auto-compaction.
+  
+For information about the syntax, see [automatic-compaction 
syntax](#auto-compaction-syntax). 
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform 
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+     - The type of supervisor spec by setting `"type": "autocompact"`
+     - The compaction configuration by adding it to the `spec` field
+    ```json
+    {
+   "type": "autocompact",
+   "spec": {
+      "dataSource": YOUR_DATASOURCE,
+    ...
+    ...
+   }
+    ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint 
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia` 
datasource:
+
+```sh
+curl --location --request POST 
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+   "type": "autocompact",    // required
+   "suspended": false,         // optional
+   "spec": {                           // required
+       "dataSource": "wikipedia",          // required
+       "tuningConfig": {...},                    // optional
+       "granularitySpec": {...},               // optional
+       "engine": <native|msq>,  //optional
+       ...
+   }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine. 
You can control the default compaction engine with the 
`druid.supervisor.compaction.engine` Overlord runtime property. If 
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid 
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure 
auto-compaction to use compaction supervisors. To use the MSQ task engine for 
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ  task engine extension 
loaded](../multi-stage-query/index.md#load-the-extension).

Review Comment:
   ```suggestion
   * [Load the MSQ task engine 
extension](../multi-stage-query/index.md#load-the-extension).
   ```



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord 
runtime properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  `druid.supervisor.compaction.engine` to  `msq` to specify the MSQ task 
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some 
differences. Specifically, you submit a supervisor spec with the `type` set to 
`autocompact` and the auto-compaction config in the `spec` to configure 
auto-compaction.
+  
+For information about the syntax, see [automatic-compaction 
syntax](#auto-compaction-syntax). 
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform 
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+     - The type of supervisor spec by setting `"type": "autocompact"`
+     - The compaction configuration by adding it to the `spec` field
+    ```json
+    {
+   "type": "autocompact",
+   "spec": {
+      "dataSource": YOUR_DATASOURCE,
+    ...
+    ...
+   }
+    ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint 
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia` 
datasource:
+
+```sh

Review Comment:
   Align comments for easier reading?
   
   
![image](https://github.com/user-attachments/assets/62901f17-8ec8-4982-8440-0af2d86169c6)
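
The screenshot is not reproduced in this text. As a sketch of one possible alignment (an assumption about the intended layout, not the reviewer's exact rendering), the request body from the example could line up its trailing comments in a single column. The function below only prints the annotated template. The `//` comments, `{...}`, and `<native|msq>` are documentation placeholders carried over from the example, so the output is not valid JSON as is.

```sh
#!/bin/sh
# Print the annotated request body with every trailing comment in one column.
# The quoted heredoc delimiter keeps the placeholders literal.
print_annotated_spec() {
  cat <<'EOF'
{
   "type": "autocompact",           // required
   "suspended": false,              // optional
   "spec": {                        // required
       "dataSource": "wikipedia",   // required
       "tuningConfig": {...},       // optional
       "granularitySpec": {...},    // optional
       "engine": <native|msq>,      // optional
       ...
   }
}
EOF
}
```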
   



##########
docs/ingestion/supervisor.md:
##########
@@ -23,22 +23,22 @@ sidebar_label: Supervisor
   ~ under the License.
   -->
 
-A supervisor manages streaming ingestion from external streaming sources into 
Apache Druid.
-Supervisors oversee the state of indexing tasks to coordinate handoffs, manage 
failures, and ensure that the scalability and replication requirements are 
maintained.
+Apache Druid uses supervisors to manage streaming ingestion from external 
streaming sources into Druid.
+Supervisors oversee the state of indexing tasks to coordinate handoffs, manage 
failures, and ensure that the scalability and replication requirements are 
maintained. They can also be used to perform [automatic 
compaction](../data-management/automatic-compaction.md) after data has been 
ingested.
 
 This topic uses the Apache Kafka term offset to refer to the identifier for 
records in a partition. If you are using Amazon Kinesis, the equivalent is 
sequence number.
 
 ## Supervisor spec
 
-Druid uses a JSON specification, often referred to as the supervisor spec, to 
define streaming ingestion tasks.
-The supervisor spec specifies how Druid should consume, process, and index 
streaming data.
+Druid uses a JSON specification, often referred to as the supervisor spec, to 
define tasks used for streaming ingestion or auto-compaction.
+The supervisor spec specifies how Druid should consume, process, and index 
data from an external stream or Druid itself.
 
 The following table outlines the high-level configuration options for a 
supervisor spec:
 
 |Property|Type|Description|Required|
 |--------|----|-----------|--------|
-|`type`|String|The supervisor type. One of `kafka`or `kinesis`.|Yes|
-|`spec`|Object|The container object for the supervisor configuration.|Yes|
+|`type`|String|The supervisor type. For streaming ingestion, this can be 
either `kafka`, `kinesis` or `rabbit`. For automatic compaction, set the type 
to `autocompact`. |Yes|

Review Comment:
   ```suggestion
   |`type`|String|The supervisor type. For streaming ingestion, this can be 
either `kafka`, `kinesis`, or `rabbit`. For automatic compaction, set the type 
to `autocompact`. |Yes|
   ```



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord 
runtime properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  `druid.supervisor.compaction.engine` to  `msq` to specify the MSQ task 
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some 
differences. Specifically, you submit a supervisor spec with the `type` set to 
`autocompact` and the auto-compaction config in the `spec` to configure 
auto-compaction.
+  
+For information about the syntax, see [automatic-compaction 
syntax](#auto-compaction-syntax). 
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform 
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+     - The type of supervisor spec by setting `"type": "autocompact"`
+     - The compaction configuration by adding it to the `spec` field
+    ```json
+    {
+   "type": "autocompact",
+   "spec": {
+      "dataSource": YOUR_DATASOURCE,
+    ...
+    ...
+   }
+    ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint 
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia` 
datasource:
+
+```sh
+curl --location --request POST 
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+   "type": "autocompact",    // required
+   "suspended": false,         // optional
+   "spec": {                           // required
+       "dataSource": "wikipedia",          // required
+       "tuningConfig": {...},                    // optional
+       "granularitySpec": {...},               // optional
+       "engine": <native|msq>,  //optional
+       ...
+   }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine. 
You can control the default compaction engine with the 
`druid.supervisor.compaction.engine` Overlord runtime property. If 
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid 
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure 
auto-compaction to use compaction supervisors. To use the MSQ task engine for 
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ  task engine extension 
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify 
the MSQ task engine as the default compaction engine. If you don't do this, 
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec 
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set 
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine 
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context 
parameters](../multi-stage-query/reference.md#context-parameters) in 
`spec.taskContext` when configuring your datasource for automatic compaction, 
such as setting the maximum number of tasks using the 
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context 
parameters overlap with automatic compaction parameters. When these settings 
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through 
the UI or API with the type `autocompact` and the `spec` where you define the 
compaction behavior using the [automatic compaction 
syntax](#auto-compaction-syntax). You can also use the [web 
console](#manage-compaction-supervisors-with-the-web-console).
+
+
+#### MSQ task engine limitations
+
+When using the MSQ task engine for auto-compaction, keep the following 
limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more 
information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported
+- Set `rollup`  to `true` if and only if `metricSpec` is not empty or null.
+- You can only partition on string dimensions. However, multi-valued string 
dimensions are not supported.
+- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use 
`maxRowsPerSegment` instead.
+- Segments can only be sorted on `__time` as the first column.
+
+##### Supported aggregators
+
+Auto-compaction using the MSQ task engine supports only aggregators that 
satisfy the following properties: 
+a) mergeability: can also be used to combine partial aggregates
+b) idempotency: produce the same results on repeated runs of the aggregator on 
previously aggregated values in a column
+
+This is exemplified by the following `longSum` aggregator:
+
+```
+{"name": "added", "type": "longSum", "fieldName": "added"}
+```
+
+where `longSum` being capable of combining partial results satisfies 
mergeability, while input and output column being the same (`added`) ensures 
idempotency.
+
+The following are some examples of aggregators that aren't supported since at 
least one of the required conditions aren't satisfied:
+
+*  `longSum` aggregator where the `added` column rolls up into `sum_added` 
column discarding the input `added` column, violating idempotency, as 
subsequent runs would no longer find the `added` column:
+    ```
+    {"name": "sum_added", "type": "longSum", "fieldName": "added" }

Review Comment:
   ```suggestion
       {"name": "sum_added", "type": "longSum", "fieldName": "added"}
   ```



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available

Review Comment:
   A comparison table might help with some of these points. It could also be 
more explicit about the Coordinator's limitations (for example, does the 
Coordinator compact an interval repeatedly?).



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord 
runtime properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  `druid.supervisor.compaction.engine` to  `msq` to specify the MSQ task 
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some 
differences. Specifically, you submit a supervisor spec with the `type` set to 
`autocompact` and the auto-compaction config in the `spec` to configure 
auto-compaction.
+  
+For information about the syntax, see [automatic-compaction 
syntax](#auto-compaction-syntax). 
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform 
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+     - The type of supervisor spec by setting `"type": "autocompact"`
+     - The compaction configuration by adding it to the `spec` field

Review Comment:
   Should this be included in the code block?
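
To illustrate the point, here is a minimal sketch in which the code block alone carries both pieces of information, the `type` and the `spec` container, so the two bullets above it become optional. The function name is hypothetical, and the spec includes only `dataSource`, which the `curl` example in this PR marks as the one required field inside `spec`.

```sh
#!/bin/sh
# Print a minimal autocompact supervisor spec for the datasource given as $1.
# Assumption: the datasource name needs no JSON escaping.
autocompact_spec() {
  cat <<EOF
{
  "type": "autocompact",
  "spec": {
    "dataSource": "$1"
  }
}
EOF
}
```

Running `autocompact_spec wikipedia` prints a spec with balanced braces that could be pasted into the **Submit JSON supervisor** dialog.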



##########
docs/multi-stage-query/known-issues.md:
##########
@@ -68,3 +68,16 @@ properties, and the `indexSpec` 
[`tuningConfig`](../ingestion/ingestion-spec.md#
 - The maximum number of elements in a window cannot exceed a value of 100,000. 
 - To avoid `leafOperators` in MSQ engine, window functions have an extra scan 
stage after the window stage for cases 
 where native engine has a non-empty `leafOperator`.
+
+## Automatic compaction
+
+<!--This list also exists in data-management/automatic-compaction-->

Review Comment:
   ```suggestion
   <!-- If you update this list, also update 
data-management/automatic-compaction.md -->
   ```



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord 
runtime properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  `druid.supervisor.compaction.engine` to  `msq` to specify the MSQ task 
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some 
differences. Specifically, you submit a supervisor spec with the `type` set to 
`autocompact` and the auto-compaction config in the `spec` to configure 
auto-compaction.
+  
+For information about the syntax, see [automatic-compaction 
syntax](#auto-compaction-syntax). 

Review Comment:
   ```suggestion
   For information about the syntax, see [automatic compaction 
syntax](#auto-compaction-syntax). 
   ```



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord 
runtime properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  `druid.supervisor.compaction.engine` to  `msq` to specify the MSQ task 
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some 
differences. Specifically, you submit a supervisor spec with the `type` set to 
`autocompact` and the auto-compaction config in the `spec` to configure 
auto-compaction.
+  
+For information about the syntax, see [automatic-compaction 
syntax](#auto-compaction-syntax). 
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform 
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+     - The type of supervisor spec by setting `"type": "autocompact"`
+     - The compaction configuration by adding it to the `spec` field
+    ```json
+    {
+   "type": "autocompact",
+   "spec": {
+      "dataSource": YOUR_DATASOURCE,
+    ...
+    ...
+   }
+    ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint 
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia` 
datasource:
+
+```sh
+curl --location --request POST 
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+   "type": "autocompact",    // required
+   "suspended": false,         // optional
+   "spec": {                           // required
+       "dataSource": "wikipedia",          // required
+       "tuningConfig": {...},                    // optional
+       "granularitySpec": {...},               // optional
+       "engine": <native|msq>,  //optional
+       ...
+   }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine. 
You can control the default compaction engine with the 
`druid.supervisor.compaction.engine` Overlord runtime property. If 
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid 
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure 
auto-compaction to use compaction supervisors. To use the MSQ task engine for 
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ  task engine extension 
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify 
the MSQ task engine as the default compaction engine. If you don't do this, 
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec 
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set 
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine 
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context 
parameters](../multi-stage-query/reference.md#context-parameters) in 
`spec.taskContext` when configuring your datasource for automatic compaction, 
such as setting the maximum number of tasks using the 
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context 
parameters overlap with automatic compaction parameters. When these settings 
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through 
the UI or API with the type `autocompact` and the `spec` where you define the 
compaction behavior using the [automatic compaction 
syntax](#auto-compaction-syntax). You can also use the [web 
console](#manage-compaction-supervisors-with-the-web-console).
+
+
+#### MSQ task engine limitations
+
+When using the MSQ task engine for auto-compaction, keep the following 
limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more 
information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported

Review Comment:
   ```suggestion
   - Only dynamic and range-based partitioning are supported.
   ```



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).

Review Comment:
   Maybe "Coordinator-managed"? That is, if the compaction itself happens in 
the same way and is just managed by a different service.



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord 
runtime properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task

Review Comment:
   ```suggestion
     *  `druid.supervisor.compaction.enabled` to `true` so that compaction 
tasks can be run as supervisor tasks
   ```



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord 
runtime properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  `druid.supervisor.compaction.engine` to  `msq` to specify the MSQ task 
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some 
differences. Specifically, you submit a supervisor spec with the `type` set to 
`autocompact` and the auto-compaction config in the `spec` to configure 
auto-compaction.
+  
+For information about the syntax, see [automatic-compaction 
syntax](#auto-compaction-syntax). 
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform 
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+     - The type of supervisor spec by setting `"type": "autocompact"`
+     - The compaction configuration by adding it to the `spec` field
+    ```json
+    {
+   "type": "autocompact",
+   "spec": {
+      "dataSource": YOUR_DATASOURCE,
+    ...

Review Comment:
   ```suggestion
   ```



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord 
runtime properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  `druid.supervisor.compaction.engine` to  `msq` to specify the MSQ task 
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some 
differences. Specifically, you submit a supervisor spec with the `type` set to 
`autocompact` and the auto-compaction config in the `spec` to configure 
auto-compaction.
+  
+For information about the syntax, see [automatic-compaction 
syntax](#auto-compaction-syntax). 
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform 
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+     - The type of supervisor spec by setting `"type": "autocompact"`
+     - The compaction configuration by adding it to the `spec` field
+    ```json
+    {
+   "type": "autocompact",
+   "spec": {
+      "dataSource": YOUR_DATASOURCE,
+    ...
+    ...
+   }
+    ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint 
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia` 
datasource:
+
+```sh
+curl --location --request POST 
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+   "type": "autocompact",    // required
+   "suspended": false,         // optional
+   "spec": {                           // required
+       "dataSource": "wikipedia",          // required
+       "tuningConfig": {...},                    // optional
+       "granularitySpec": {...},               // optional
+       "engine": <native|msq>,  //optional
+       ...
+   }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine. 
You can control the default compaction engine with the 
`druid.supervisor.compaction.engine` Overlord runtime property. If 
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid 
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure 
auto-compaction to use compaction supervisors. To use the MSQ task engine for 
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ  task engine extension 
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify 
the MSQ task engine as the default compaction engine. If you don't do this, 
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec 
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set 
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine 
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context 
parameters](../multi-stage-query/reference.md#context-parameters) in 
`spec.taskContext` when configuring your datasource for automatic compaction, 
such as setting the maximum number of tasks using the 
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context 
parameters overlap with automatic compaction parameters. When these settings 
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through 
the UI or API with the type `autocompact` and the `spec` where you define the 
compaction behavior using the [automatic compaction 
syntax](#auto-compaction-syntax). You can also use the [web 
console](#manage-compaction-supervisors-with-the-web-console).
+
+
+#### MSQ task engine limitations
+
+When using the MSQ task engine for auto-compaction, keep the following 
limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more 
information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported
+- Set `rollup`  to `true` if and only if `metricSpec` is not empty or null.
+- You can only partition on string dimensions. However, multi-valued string 
dimensions are not supported.
+- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use 
`maxRowsPerSegment` instead.
+- Segments can only be sorted on `__time` as the first column.
+
+##### Supported aggregators
+
+Auto-compaction using the MSQ task engine supports only aggregators that 
satisfy the following properties: 
+a) mergeability: can also be used to combine partial aggregates
+b) idempotency: produce the same results on repeated runs of the aggregator on 
previously aggregated values in a column

Review Comment:
   ```suggestion
   * __Mergeability__: can combine partial aggregates
   * __Idempotency__: produces the same results on repeated runs of the 
aggregator on previously aggregated values in a column
   ```
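
To make the two properties concrete, here is a toy sketch in plain Python. It is not Druid code: `rollup_long_sum` is a hypothetical helper that only mimics a `longSum` rollup, and the sample rows are made up. It shows why keeping the input and output column the same lets compaction run repeatedly over already-compacted rows.

```python
def rollup_long_sum(rows, field_name, name):
    """Toy model of a longSum rollup: sum `field_name` over rows into `name`.

    Returns a single rolled-up row; raises KeyError if `field_name` is missing.
    """
    return [{name: sum(row[field_name] for row in rows)}]


raw = [{"added": 3}, {"added": 4}, {"added": 5}]

# Idempotency: same input and output column, so a second pass changes nothing.
once = rollup_long_sum(raw, "added", "added")
twice = rollup_long_sum(once, "added", "added")
assert once == twice == [{"added": 12}]

# Mergeability: combining partial sums matches summing everything at once.
partial = (
    rollup_long_sum(raw[:2], "added", "added")
    + rollup_long_sum(raw[2:], "added", "added")
)
assert rollup_long_sum(partial, "added", "added") == once

# Renaming the output column breaks the second pass: `added` is gone.
renamed = rollup_long_sum(raw, "added", "sum_added")
try:
    rollup_long_sum(renamed, "added", "sum_added")
    second_pass_ok = True
except KeyError:
    second_pass_ok = False
assert second_pass_ok is False
```

The last case corresponds to the `sum_added` example in the diff, where subsequent runs no longer find the `added` column.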



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord 
runtime properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  `druid.supervisor.compaction.engine` to  `msq` to specify the MSQ task 
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some 
differences. Specifically, you submit a supervisor spec with the `type` set to 
`autocompact` and the auto-compaction config in the `spec` to configure 
auto-compaction.
+  
+For information about the syntax, see [automatic-compaction 
syntax](#auto-compaction-syntax). 
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform 
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+     - The type of supervisor spec by setting `"type": "autocompact"`
+     - The compaction configuration by adding it to the `spec` field
+    ```json
+    {
+   "type": "autocompact",
+   "spec": {
+      "dataSource": YOUR_DATASOURCE,
+    ...
+    ...
+   }
+    ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint 
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia` 
datasource:
+
+```sh
+curl --location --request POST 
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+   "type": "autocompact",    // required
+   "suspended": false,         // optional
+   "spec": {                           // required
+       "dataSource": "wikipedia",          // required
+       "tuningConfig": {...},                    // optional
+       "granularitySpec": {...},               // optional
+       "engine": <native|msq>,  //optional
+       ...
+   }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine. 
You can control the default compaction engine with the 
`druid.supervisor.compaction.engine` Overlord runtime property. If 
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid 
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure 
auto-compaction to use compaction supervisors. To use the MSQ task engine for 
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ  task engine extension 
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify 
the MSQ task engine as the default compaction engine. If you don't do this, 
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec 
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set 
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine 
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context 
parameters](../multi-stage-query/reference.md#context-parameters) in 
`spec.taskContext` when configuring your datasource for automatic compaction, 
such as setting the maximum number of tasks using the 
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context 
parameters overlap with automatic compaction parameters. When these settings 
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through 
the UI or API with the type `autocompact` and the `spec` where you define the 
compaction behavior using the [automatic compaction 
syntax](#auto-compaction-syntax). You can also use the [web 
console](#manage-compaction-supervisors-with-the-web-console).
+
+
+#### MSQ task engine limitations
+
+When using the MSQ task engine for auto-compaction, keep the following 
limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more 
information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported
+- Set `rollup`  to `true` if and only if `metricSpec` is not empty or null.
+- You can only partition on string dimensions. However, multi-valued string 
dimensions are not supported.
+- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use 
`maxRowsPerSegment` instead.
+- Segments can only be sorted on `__time` as the first column.
+
+##### Supported aggregators
+
+Auto-compaction using the MSQ task engine supports only aggregators that 
satisfy the following properties: 
+a) mergeability: can also be used to combine partial aggregates
+b) idempotency: produce the same results on repeated runs of the aggregator on 
previously aggregated values in a column
+
+This is exemplified by the following `longSum` aggregator:
+
+```
+{"name": "added", "type": "longSum", "fieldName": "added"}
+```
+
+where `longSum` being capable of combining partial results satisfies 
mergeability, while input and output column being the same (`added`) ensures 
idempotency.
+
+The following are some examples of aggregators that aren't supported since at 
least one of the required conditions aren't satisfied:
+
+*  `longSum` aggregator where the `added` column rolls up into `sum_added` 
column discarding the input `added` column, violating idempotency, as 
subsequent runs would no longer find the `added` column:
+    ```
+    {"name": "sum_added", "type": "longSum", "fieldName": "added" }
+    ```
+* Partial sketches which cannot themselves be used to combine partial 
aggregates and need merging aggregators -- such as `HLLSketchMerge` required 
for `HLLSketchBuild` aggregator below -- violating mergeability:
+    ```
+    {"name": added, "type":"HLLSketchBuild", fieldName: added}

Review Comment:
   ```suggestion
       {"name": "added", "type": "HLLSketchBuild", "fieldName": "added"}
   ```
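
A quick way to sanity-check that the aggregator snippets parse as JSON, using only Python's standard `json` module. This is just a sketch for reviewing the docs examples; the strings below are the `HLLSketchBuild` example with and without quotes around the keys and values.

```python
import json

# Fully quoted form of the HLLSketchBuild example parses cleanly.
snippet = '{"name": "added", "type": "HLLSketchBuild", "fieldName": "added"}'
parsed = json.loads(snippet)
assert parsed["type"] == "HLLSketchBuild"

# Bare identifiers (unquoted `added` and `fieldName`) are rejected by a
# strict JSON parser.
try:
    json.loads('{"name": added, "type": "HLLSketchBuild", fieldName: added}')
    bare_ok = True
except json.JSONDecodeError:
    bare_ok = False
assert bare_ok is False
```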



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available

Review Comment:
   Does that mean the Coordinator can't submit a task as soon as a slot is 
available, but has to wait for the next indexing period?



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord 
runtime properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  `druid.supervisor.compaction.engine` to  `msq` to specify the MSQ task 
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some 
differences. Specifically, you submit a supervisor spec with the `type` set to 
`autocompact` and the auto-compaction config in the `spec` to configure 
auto-compaction.
+  
+For information about the syntax, see [automatic-compaction 
syntax](#auto-compaction-syntax). 
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform 
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+     - The type of supervisor spec by setting `"type": "autocompact"`
+     - The compaction configuration by adding it to the `spec` field
+    ```json
+    {
+   "type": "autocompact",
+   "spec": {
+      "dataSource": YOUR_DATASOURCE,
+    ...
+    ...
+   }
+    ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint 
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia` 
datasource:
+
+```sh
+curl --location --request POST 
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+   "type": "autocompact",    // required
+   "suspended": false,         // optional
+   "spec": {                           // required
+       "dataSource": "wikipedia",          // required
+       "tuningConfig": {...},                    // optional
+       "granularitySpec": {...},               // optional
+       "engine": <native|msq>,  //optional
+       ...
+   }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine. 
You can control the default compaction engine with the 
`druid.supervisor.compaction.engine` Overlord runtime property. If 
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid 
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure 
auto-compaction to use compaction supervisors. To use the MSQ task engine for 
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ  task engine extension 
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify 
the MSQ task engine as the default compaction engine. If you don't do this, 
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec 
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set 
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine 
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context 
parameters](../multi-stage-query/reference.md#context-parameters) in 
`spec.taskContext` when configuring your datasource for automatic compaction, 
such as setting the maximum number of tasks using the 
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context 
parameters overlap with automatic compaction parameters. When these settings 
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through 
the UI or API with the type `autocompact` and the `spec` where you define the 
compaction behavior using the [automatic compaction 
syntax](#auto-compaction-syntax). You can also use the [web 
console](#manage-compaction-supervisors-with-the-web-console).
+
+
+#### MSQ task engine limitations
+
+When using the MSQ task engine for auto-compaction, keep the following 
limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more 
information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported
+- Set `rollup`  to `true` if and only if `metricSpec` is not empty or null.
+- You can only partition on string dimensions. However, multi-valued string 
dimensions are not supported.
+- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use 
`maxRowsPerSegment` instead.
+- Segments can only be sorted on `__time` as the first column.
+
+##### Supported aggregators
+
+Auto-compaction using the MSQ task engine supports only aggregators that 
satisfy the following properties: 
+a) mergeability: can also be used to combine partial aggregates
+b) idempotency: produce the same results on repeated runs of the aggregator on 
previously aggregated values in a column
+
+This is exemplified by the following `longSum` aggregator:
+
+```
+{"name": "added", "type": "longSum", "fieldName": "added"}
+```
+
+where `longSum` being capable of combining partial results satisfies 
mergeability, while input and output column being the same (`added`) ensures 
idempotency.
+
+The following are some examples of aggregators that aren't supported since at 
least one of the required conditions aren't satisfied:
+
+*  `longSum` aggregator where the `added` column rolls up into `sum_added` 
column discarding the input `added` column, violating idempotency, as 
subsequent runs would no longer find the `added` column:
+    ```
+    {"name": "sum_added", "type": "longSum", "fieldName": "added" }
+    ```
+* Partial sketches which cannot themselves be used to combine partial 
aggregates and need merging aggregators -- such as `HLLSketchMerge` required 
for `HLLSketchBuild` aggregator below -- violating mergeability:
+    ```
+    {"name": added, "type":"HLLSketchBuild", fieldName: added}
+    ```
+* Count aggregator since it cannot be used to combine partial aggregates and 
it rolls up into a different `count` column discarding the input column(s), 
violating both mergeability and idempotency.
+    ```
+    { "type" : "count", "name" : "count" }

Review Comment:
   ```suggestion
       {"type": "count", "name": "count"}
   ```
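
A toy illustration of why `count` fails on a second compaction pass. This is plain Python, not Druid code; `rollup_count` is a hypothetical helper that only mimics counting input rows, and the sample rows are made up.

```python
def rollup_count(rows):
    """Toy model of a `count` rollup: emit the number of input rows."""
    return [{"count": len(rows)}]


raw = [{"added": 3}, {"added": 4}, {"added": 5}]

first = rollup_count(raw)      # counts the three raw rows
second = rollup_count(first)   # counts the single rolled-up row instead

print(first)   # [{'count': 3}]
print(second)  # [{'count': 1}]
```

Re-running the aggregator over its own output loses the original row count, so preserving it would need a different aggregator (a sum over the `count` column) rather than `count` itself.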



##########
docs/multi-stage-query/known-issues.md:
##########
@@ -68,3 +68,16 @@ properties, and the `indexSpec` 
[`tuningConfig`](../ingestion/ingestion-spec.md#
 - The maximum number of elements in a window cannot exceed a value of 100,000. 
 - To avoid `leafOperators` in MSQ engine, window functions have an extra scan 
stage after the window stage for cases 
 where native engine has a non-empty `leafOperator`.
+
+## Automatic compaction
+
+<!--This list also exists in data-management/automatic-compaction-->
+
+The following known issues and limitations affect automatic compaction with 
the MSQ task engine:

Review Comment:
   I made a couple of suggested edits to the other section that would apply 
here too.



##########
docs/ingestion/supervisor.md:
##########
@@ -23,22 +23,22 @@ sidebar_label: Supervisor
   ~ under the License.
   -->
 
-A supervisor manages streaming ingestion from external streaming sources into 
Apache Druid.
-Supervisors oversee the state of indexing tasks to coordinate handoffs, manage 
failures, and ensure that the scalability and replication requirements are 
maintained.
+Apache Druid uses supervisors to manage streaming ingestion from external 
streaming sources into Druid.
+Supervisors oversee the state of indexing tasks to coordinate handoffs, manage 
failures, and ensure that the scalability and replication requirements are 
maintained. They can also be used to perform [automatic 
compaction](../data-management/automatic-compaction.md) after data has been 
ingested.
 
 This topic uses the Apache Kafka term offset to refer to the identifier for 
records in a partition. If you are using Amazon Kinesis, the equivalent is 
sequence number.
 
 ## Supervisor spec
 
-Druid uses a JSON specification, often referred to as the supervisor spec, to 
define streaming ingestion tasks.
-The supervisor spec specifies how Druid should consume, process, and index 
streaming data.
+Druid uses a JSON specification, often referred to as the supervisor spec, to 
define tasks used for streaming ingestion or auto-compaction.
+The supervisor spec specifies how Druid should consume, process, and index 
data from an external stream or Druid itself.
 
 The following table outlines the high-level configuration options for a 
supervisor spec:
 
 |Property|Type|Description|Required|
 |--------|----|-----------|--------|
-|`type`|String|The supervisor type. One of `kafka`or `kinesis`.|Yes|
-|`spec`|Object|The container object for the supervisor configuration.|Yes|
+|`type`|String|The supervisor type. For streaming ingestion, this can be 
either `kafka`, `kinesis` or `rabbit`. For automatic compaction, set the type 
to `autocompact`. |Yes|

Review Comment:
   Just checking: is `rabbit` supposed to be added here?



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts 
updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord 
runtime properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  `druid.supervisor.compaction.engine` to  `msq` to specify the MSQ task 
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some 
differences. Specifically, you submit a supervisor spec with the `type` set to 
`autocompact` and the auto-compaction config in the `spec` to configure 
auto-compaction.
+  
+For information about the syntax, see [automatic-compaction 
syntax](#auto-compaction-syntax). 
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform 
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+     - The type of supervisor spec by setting `"type": "autocompact"`
+     - The compaction configuration by adding it to the `spec` field
+    ```json
+    {
+   "type": "autocompact",
+   "spec": {
+      "dataSource": YOUR_DATASOURCE,
+    ...
+    ...
+   }
+    ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint 
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia` 
datasource:
+
+```sh
+curl --location --request POST 'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+   "type": "autocompact",    // required
+   "suspended": false,         // optional
+   "spec": {                           // required
+       "dataSource": "wikipedia",          // required
+       "tuningConfig": {...},                    // optional
+       "granularitySpec": {...},               // optional
+       "engine": <native|msq>,  //optional
+       ...
+   }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine. 
You can control the default compaction engine with the 
`druid.supervisor.compaction.engine` Overlord runtime property. If 
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid 
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure 
auto-compaction to use compaction supervisors. To use the MSQ task engine for 
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ  task engine extension 
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify 
the MSQ task engine as the default compaction engine. If you don't do this, 
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec 
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set 
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine 
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context 
parameters](../multi-stage-query/reference.md#context-parameters) in 
`spec.taskContext` when configuring your datasource for automatic compaction, 
such as setting the maximum number of tasks using the 
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context 
parameters overlap with automatic compaction parameters. When these settings 
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through 
the UI or API with the type `autocompact` and the `spec` where you define the 
compaction behavior using the [automatic compaction 
syntax](#auto-compaction-syntax). You can also use the [web 
console](#manage-compaction-supervisors-with-the-web-console).

Review Comment:
   >To submit an automatic compaction task, you submit a supervisor spec 
through the UI or API with the type `autocompact` and the `spec` where you 
define the compaction behavior using the [automatic compaction 
syntax](#auto-compaction-syntax). You can also use the [web 
console](#manage-compaction-supervisors-with-the-web-console).
   
   
   Is this redundant with the two previous sections?



##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,133 @@ The following auto-compaction configuration compacts updates the `wikipedia` seg
 }
 ```
 
+## Auto-compaction using compaction supervisors  
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend 
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord 
rather than Coordinator duties. Compaction supervisors provide the following 
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the 
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task 
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord 
runtime properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task
+  *  `druid.supervisor.compaction.engine` to  `msq` to specify the MSQ task 
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some 
differences. Specifically, you submit a supervisor spec with the `type` set to 
`autocompact` and the auto-compaction config in the `spec` to configure 
auto-compaction.
+  
+For information about the syntax, see [automatic-compaction 
syntax](#auto-compaction-syntax). 
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform 
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+     - The type of supervisor spec by setting `"type": "autocompact"`
+     - The compaction configuration by adding it to the `spec` field
+    ```json
+    {
+   "type": "autocompact",
+   "spec": {
+      "dataSource": YOUR_DATASOURCE,
+    ...
+    ...
+   }
+    ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint 
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia` 
datasource:
+
+```sh
+curl --location --request POST 'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+   "type": "autocompact",    // required
+   "suspended": false,         // optional
+   "spec": {                           // required
+       "dataSource": "wikipedia",          // required
+       "tuningConfig": {...},                    // optional
+       "granularitySpec": {...},               // optional
+       "engine": <native|msq>,  //optional
+       ...
+   }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine. 
You can control the default compaction engine with the 
`druid.supervisor.compaction.engine` Overlord runtime property. If 
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid 
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor 
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure 
auto-compaction to use compaction supervisors. To use the MSQ task engine for 
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ  task engine extension 
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+  *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks 
can be run as a supervisor task

Review Comment:
   ```suggestion
     *  `druid.supervisor.compaction.enabled` to `true` so that compaction tasks can be run as a supervisor task.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


