gargvishesh commented on code in PR #16681:
URL: https://github.com/apache/druid/pull/16681#discussion_r1794682705
##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,128 @@ The following auto-compaction configuration compacts
updates the `wikipedia` seg
}
```
+## Auto-compaction using compaction supervisors
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord
rather than Coordinator duties. Compaction supervisors provide the following
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord
runtime properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * `druid.supervisor.compaction.engine` to `msq` to specify the MSQ task
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some
differences. Specifically, you submit a supervisor spec with the `type` set to
`autocompact` and the auto-compaction config in the `spec` to configure
auto-compaction.
+
+For information about the syntax, see [automatic-compaction
syntax](#auto-compaction-syntax).
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+ - The type of supervisor spec by setting `"type": "autocompact"`
+ - The compaction configuration by adding it to the `spec` field
+ ```json
+ {
+ "type": "autocompact",
+ "spec": {
+ "dataSource": YOUR_DATASOURCE,
+ ...
+ ...
+ }
+ ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia`
datasource:
+
+```sh
+curl --location --request POST
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+ "type": "autocompact", // required
+ "suspended": false, // optional
+ "spec": { // required
+ "dataSource": "wikipedia", // required
+ "tuningConfig": {...}, // optional
+ "granularitySpec": {...}, // optional
+ "engine": <native|msq>, //optional
+ ...
+ }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine.
You can control the default compaction engine with the
`druid.supervisor.compaction.engine` Overlord runtime property. If
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure
auto-compaction to use compaction supervisors. To use the MSQ task engine for
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ task engine extension
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify
the MSQ task engine as the default compaction engine. If you don't do this,
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context
parameters](../multi-stage-query/reference.md#context-parameters) in
`spec.taskContext` when configuring your datasource for automatic compaction,
such as setting the maximum number of tasks using the
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context
parameters overlap with automatic compaction parameters. When these settings
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through
the UI or API with the type `autocompact` and the `spec` where you define the
compaction behavior using the [automatic compaction
syntax](#auto-compaction-syntax). You can also use the [web
console](#manage-compaction-supervisors-with-the-web-console).
+
+
+#### MSQ task engine limitations
+
+When using the MSQ task engine for auto-compaction, keep the following
limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more
information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported
+- Set `rollup` to `true` if and only if `metricSpec` is not empty or null.
+- You can only partition on string dimensions. However, multi-valued string
dimensions are not supported.
+- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use
`maxRowsPerSegment` instead.
+
+##### Supported aggregators
+
+Auto-compaction using the MSQ task engine only supports aggregators. Only
aggregators where repeated runs of the aggregator on a column produce the same
results each time, such as the following `longSum` aggregator:
+
+```
+{"name": "added", "type": "longSum", "fieldName": "added"}
+```
+
+where the input and output column are both `added`.
Review Comment:
```suggestion
where `longSum` satisfies mergeability because it can combine partial results,
and using the same input and output column (`added`) ensures idempotency.
```
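To make the two properties concrete, here is a minimal toy sketch in Python. It is not Druid code: `rollup` is a hypothetical helper that collapses all rows into one and treats a missing input column as 0, which is an assumption made only for illustration.

```python
def rollup(rows, spec):
    """Toy longSum rollup: sum spec['fieldName'] across rows into spec['name'].

    Assumption for illustration: a missing input column counts as 0.
    """
    total = sum(row.get(spec["fieldName"], 0) for row in rows)
    return [{spec["name"]: total}]


raw = [{"added": 3}, {"added": 4}, {"added": 5}]

# Input and output column are the same: re-running the rollup is stable.
same = {"name": "added", "type": "longSum", "fieldName": "added"}
once = rollup(raw, same)
twice = rollup(once, same)

# Mergeability: combining partial sums gives the same answer as one pass.
partials = rollup(raw[:2], same) + rollup(raw[2:], same)
merged = rollup(partials, same)

# Output column differs from input column: the second run no longer finds
# `added`, so the result changes between runs.
renamed = {"name": "sum_added", "type": "longSum", "fieldName": "added"}
renamed_once = rollup(raw, renamed)
renamed_twice = rollup(renamed_once, renamed)
```

In this toy model, `once`, `twice`, and `merged` agree, while `renamed_once` and `renamed_twice` do not, which is the behavior the suggested wording is trying to describe.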
##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,128 @@ The following auto-compaction configuration compacts
updates the `wikipedia` seg
}
```
+## Auto-compaction using compaction supervisors
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord
rather than Coordinator duties. Compaction supervisors provide the following
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord
runtime properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * `druid.supervisor.compaction.engine` to `msq` to specify the MSQ task
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some
differences. Specifically, you submit a supervisor spec with the `type` set to
`autocompact` and the auto-compaction config in the `spec` to configure
auto-compaction.
+
+For information about the syntax, see [automatic-compaction
syntax](#auto-compaction-syntax).
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+ - The type of supervisor spec by setting `"type": "autocompact"`
+ - The compaction configuration by adding it to the `spec` field
+ ```json
+ {
+ "type": "autocompact",
+ "spec": {
+ "dataSource": YOUR_DATASOURCE,
+ ...
+ ...
+ }
+ ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia`
datasource:
+
+```sh
+curl --location --request POST
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+ "type": "autocompact", // required
+ "suspended": false, // optional
+ "spec": { // required
+ "dataSource": "wikipedia", // required
+ "tuningConfig": {...}, // optional
+ "granularitySpec": {...}, // optional
+ "engine": <native|msq>, //optional
+ ...
+ }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine.
You can control the default compaction engine with the
`druid.supervisor.compaction.engine` Overlord runtime property. If
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure
auto-compaction to use compaction supervisors. To use the MSQ task engine for
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ task engine extension
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify
the MSQ task engine as the default compaction engine. If you don't do this,
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context
parameters](../multi-stage-query/reference.md#context-parameters) in
`spec.taskContext` when configuring your datasource for automatic compaction,
such as setting the maximum number of tasks using the
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context
parameters overlap with automatic compaction parameters. When these settings
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through
the UI or API with the type `autocompact` and the `spec` where you define the
compaction behavior using the [automatic compaction
syntax](#auto-compaction-syntax). You can also use the [web
console](#manage-compaction-supervisors-with-the-web-console).
+
+
+#### MSQ task engine limitations
+
+When using the MSQ task engine for auto-compaction, keep the following
limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more
information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported
+- Set `rollup` to `true` if and only if `metricSpec` is not empty or null.
+- You can only partition on string dimensions. However, multi-valued string
dimensions are not supported.
+- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use
`maxRowsPerSegment` instead.
+
+##### Supported aggregators
+
+Auto-compaction using the MSQ task engine only supports aggregators. Only
aggregators where repeated runs of the aggregator on a column produce the same
results each time, such as the following `longSum` aggregator:
Review Comment:
```suggestion
Auto-compaction using the MSQ task engine supports only aggregators that
satisfy the following properties:

* Mergeability: the aggregator can also be used to combine partial aggregates.
* Idempotency: the aggregator produces the same results on repeated runs over
previously aggregated values in a column.

The following `longSum` aggregator satisfies both properties:
```
##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,128 @@ The following auto-compaction configuration compacts
updates the `wikipedia` seg
}
```
+## Auto-compaction using compaction supervisors
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord
rather than Coordinator duties. Compaction supervisors provide the following
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord
runtime properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * `druid.supervisor.compaction.engine` to `msq` to specify the MSQ task
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some
differences. Specifically, you submit a supervisor spec with the `type` set to
`autocompact` and the auto-compaction config in the `spec` to configure
auto-compaction.
+
+For information about the syntax, see [automatic-compaction
syntax](#auto-compaction-syntax).
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+ - The type of supervisor spec by setting `"type": "autocompact"`
+ - The compaction configuration by adding it to the `spec` field
+ ```json
+ {
+ "type": "autocompact",
+ "spec": {
+ "dataSource": YOUR_DATASOURCE,
+ ...
+ ...
+ }
+ ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia`
datasource:
+
+```sh
+curl --location --request POST
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+ "type": "autocompact", // required
+ "suspended": false, // optional
+ "spec": { // required
+ "dataSource": "wikipedia", // required
+ "tuningConfig": {...}, // optional
+ "granularitySpec": {...}, // optional
+ "engine": <native|msq>, //optional
+ ...
+ }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine.
You can control the default compaction engine with the
`druid.supervisor.compaction.engine` Overlord runtime property. If
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure
auto-compaction to use compaction supervisors. To use the MSQ task engine for
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ task engine extension
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify
the MSQ task engine as the default compaction engine. If you don't do this,
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context
parameters](../multi-stage-query/reference.md#context-parameters) in
`spec.taskContext` when configuring your datasource for automatic compaction,
such as setting the maximum number of tasks using the
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context
parameters overlap with automatic compaction parameters. When these settings
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through
the UI or API with the type `autocompact` and the `spec` where you define the
compaction behavior using the [automatic compaction
syntax](#auto-compaction-syntax). You can also use the [web
console](#manage-compaction-supervisors-with-the-web-console).
+
+
+#### MSQ task engine limitations
+
+When using the MSQ task engine for auto-compaction, keep the following
limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more
information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported
+- Set `rollup` to `true` if and only if `metricSpec` is not empty or null.
+- You can only partition on string dimensions. However, multi-valued string
dimensions are not supported.
+- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use
`maxRowsPerSegment` instead.
Review Comment:
The segment sorting limitation has since been lifted, but it still applies to
commits in Druid 31.
```suggestion
- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use
`maxRowsPerSegment` instead.
- Segments can only be sorted on `__time` as the first column.
```
##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,128 @@ The following auto-compaction configuration compacts
updates the `wikipedia` seg
}
```
+## Auto-compaction using compaction supervisors
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord
rather than Coordinator duties. Compaction supervisors provide the following
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord
runtime properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * `druid.supervisor.compaction.engine` to `msq` to specify the MSQ task
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some
differences. Specifically, you submit a supervisor spec with the `type` set to
`autocompact` and the auto-compaction config in the `spec` to configure
auto-compaction.
+
+For information about the syntax, see [automatic-compaction
syntax](#auto-compaction-syntax).
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+ - The type of supervisor spec by setting `"type": "autocompact"`
+ - The compaction configuration by adding it to the `spec` field
+ ```json
+ {
+ "type": "autocompact",
+ "spec": {
+ "dataSource": YOUR_DATASOURCE,
+ ...
+ ...
+ }
+ ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia`
datasource:
+
+```sh
+curl --location --request POST
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+ "type": "autocompact", // required
+ "suspended": false, // optional
+ "spec": { // required
+ "dataSource": "wikipedia", // required
+ "tuningConfig": {...}, // optional
+ "granularitySpec": {...}, // optional
+ "engine": <native|msq>, //optional
+ ...
+ }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine.
You can control the default compaction engine with the
`druid.supervisor.compaction.engine` Overlord runtime property. If
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure
auto-compaction to use compaction supervisors. To use the MSQ task engine for
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ task engine extension
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify
the MSQ task engine as the default compaction engine. If you don't do this,
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context
parameters](../multi-stage-query/reference.md#context-parameters) in
`spec.taskContext` when configuring your datasource for automatic compaction,
such as setting the maximum number of tasks using the
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context
parameters overlap with automatic compaction parameters. When these settings
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through
the UI or API with the type `autocompact` and the `spec` where you define the
compaction behavior using the [automatic compaction
syntax](#auto-compaction-syntax). You can also use the [web
console](#manage-compaction-supervisors-with-the-web-console).
+
+
+#### MSQ task engine limitations
+
+When using the MSQ task engine for auto-compaction, keep the following
limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more
information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported
+- Set `rollup` to `true` if and only if `metricSpec` is not empty or null.
+- You can only partition on string dimensions. However, multi-valued string
dimensions are not supported.
+- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use
`maxRowsPerSegment` instead.
+
+##### Supported aggregators
+
+Auto-compaction using the MSQ task engine only supports aggregators. Only
aggregators where repeated runs of the aggregator on a column produce the same
results each time, such as the following `longSum` aggregator:
+
+```
+{"name": "added", "type": "longSum", "fieldName": "added"}
+```
+
+where the input and output column are both `added`.
+
+The following are some examples of aggregators that aren't supported, where
each run of the aggregator produces different results:
+
+* `longSum` aggregator where the `added` column rolls up into the `sum_added`
column:
+ ```
+ {"name": "sum_added", "type": "longSum", "fieldName": "added" }
+ ```
+* Partial sketches:
Review Comment:
```suggestion
* Partial sketches, which violate mergeability because they cannot combine
partial aggregates themselves and need a separate merging aggregator, such as
`HLLSketchMerge` for the `HLLSketchBuild` aggregator below:
```
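A rough illustration of why a build-style aggregator cannot combine its own partial results. This is a toy model, not the DataSketches or Druid implementation: exact `frozenset` objects stand in for sketches, and `sketch_build` and `sketch_merge` are hypothetical names.

```python
def sketch_build(values):
    """Toy 'build' aggregator: consumes raw values and emits a sketch.

    An exact frozenset stands in for a real probabilistic sketch.
    """
    return frozenset(values)


def sketch_merge(sketches):
    """Toy 'merge' aggregator: consumes sketches and emits their union."""
    merged = frozenset()
    for sketch in sketches:
        merged |= sketch
    return merged


segment_a = ["alice", "bob"]
segment_b = ["bob", "carol"]

partial_a = sketch_build(segment_a)
partial_b = sketch_build(segment_b)

# Combining partial sketches with the merge aggregator counts distinct users.
correct = sketch_merge([partial_a, partial_b])

# Re-applying the build aggregator treats each sketch as one opaque value,
# so it counts distinct sketches rather than distinct users.
wrong = sketch_build([partial_a, partial_b])
```

Here `len(correct)` reflects the three distinct users, while `len(wrong)` only reflects the two partial sketches, so the build aggregator alone does not satisfy mergeability in this model.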
##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,126 @@ The following auto-compaction configuration compacts
updates the `wikipedia` seg
}
```
+## Auto-compaction using compaction supervisors
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord
rather than Coordinator duties. Compaction supervisors provide the following
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine and [MSQ task
engine](#use-msq-for-auto-compaction) are available
+* More reactive and submit tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord
runtime properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * `druid.supervisor.compaction.engine` to `msq` to specify the MSQ task
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some
differences. Specifically, you submit a supervisor spec with the `type` set to
`autocompact` and the auto-compaction config in the `spec` to configure
auto-compaction.
+
+For information about the syntax, see [automatic-compaction
syntax](#auto-compaction-syntax).
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+ - The type of supervisor spec by setting `"type": "autocompact"`
+ - The compaction configuration by adding it to the `spec` field
+ ```json
+ {
+ "type": "autocompact",
+ "spec": {
+ "dataSource": YOUR_DATASOURCE,
+ ...
+ ...
+ }
+ ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia`
datasource:
+
+```sh
+curl --location --request POST
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+ "type": "autocompact", // required
+ "suspended": false, // optional
+ "spec": { // required
+ "dataSource": "wikipedia", // required
+ "tuningConfig": {...}, // optional
+ "granularitySpec": {...}, // optional
+ ...
Review Comment:
Should we also include the above on the `Automatic compaction dynamic
configuration` page?
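If we do, it may help to show the request body as valid JSON, since standard JSON does not allow the `//` annotations used in the curl example. A minimal sketch, assuming only the fields shown in the example; `build_autocompact_spec` is a hypothetical helper, not part of any Druid client:

```python
import json


def build_autocompact_spec(datasource, engine=None, suspended=False):
    """Build a compaction supervisor spec body (hypothetical helper).

    `engine` is optional; when omitted, the key is left out of the spec.
    """
    spec = {"dataSource": datasource}
    if engine is not None:
        if engine not in ("native", "msq"):
            raise ValueError("engine must be 'native' or 'msq'")
        spec["engine"] = engine
    return {"type": "autocompact", "suspended": suspended, "spec": spec}


# Body that could be sent to the supervisor endpoint from the example.
payload = json.dumps(build_autocompact_spec("wikipedia", engine="msq"), indent=2)
print(payload)
```

Optional fields such as `tuningConfig` and `granularitySpec` are left out here because the example elides their contents.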
##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,128 @@ The following auto-compaction configuration compacts
updates the `wikipedia` seg
}
```
+## Auto-compaction using compaction supervisors
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord
rather than Coordinator duties. Compaction supervisors provide the following
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord
runtime properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * `druid.supervisor.compaction.engine` to `msq` to specify the MSQ task
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some
differences. Specifically, you submit a supervisor spec with the `type` set to
`autocompact` and the auto-compaction config in the `spec` to configure
auto-compaction.
+
+For information about the syntax, see [automatic-compaction
syntax](#auto-compaction-syntax).
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+ - The type of supervisor spec by setting `"type": "autocompact"`
+ - The compaction configuration by adding it to the `spec` field
+ ```json
+ {
+ "type": "autocompact",
+ "spec": {
+ "dataSource": YOUR_DATASOURCE,
+ ...
+ ...
+ }
+ ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia`
datasource:
+
+```sh
+curl --location --request POST
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+ "type": "autocompact", // required
+ "suspended": false, // optional
+ "spec": { // required
+ "dataSource": "wikipedia", // required
+ "tuningConfig": {...}, // optional
+ "granularitySpec": {...}, // optional
+ "engine": <native|msq>, //optional
+ ...
+ }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine.
You can control the default compaction engine with the
`druid.supervisor.compaction.engine` Overlord runtime property. If
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure
auto-compaction to use compaction supervisors. To use the MSQ task engine for
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ task engine extension
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify
the MSQ task engine as the default compaction engine. If you don't do this,
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context
parameters](../multi-stage-query/reference.md#context-parameters) in
`spec.taskContext` when configuring your datasource for automatic compaction,
such as setting the maximum number of tasks using the
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context
parameters overlap with automatic compaction parameters. When these settings
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through
the UI or API with the type `autocompact` and the `spec` where you define the
compaction behavior using the [automatic compaction
syntax](#auto-compaction-syntax). You can also use the [web
console](#manage-compaction-supervisors-with-the-web-console).
+
+
+#### MSQ task engine limitations
+
+When using the MSQ task engine for auto-compaction, keep the following
limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more
information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported
+- Set `rollup` to `true` if and only if `metricSpec` is not empty or null.
+- You can only partition on string dimensions. However, multi-valued string
dimensions are not supported.
+- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use
`maxRowsPerSegment` instead.
+
+##### Supported aggregators
+
+Auto-compaction using the MSQ task engine only supports aggregators. Only
aggregators where repeated runs of the aggregator on a column produce the same
results each time, such as the following `longSum` aggregator:
+
+```
+{"name": "added", "type": "longSum", "fieldName": "added"}
+```
+
+where the input and output column are both `added`.
+
+The following are some examples of aggregators that aren't supported, where
each run of the aggregator produces different results:
+
+* `longSum` aggregator where the `added` column rolls up into the `sum_added`
column:
+ ```
+ {"name": "sum_added", "type": "longSum", "fieldName": "added" }
+ ```
+* Partial sketches:
+ ```
+ {"name": added, "type":"HLLSketchMerge", fieldName: added}
Review Comment:
```suggestion
   {"name": "added", "type": "HLLSketchBuild", "fieldName": "added"}
```
##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,128 @@ The following auto-compaction configuration compacts
updates the `wikipedia` seg
}
```
+## Auto-compaction using compaction supervisors
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord
rather than Coordinator duties. Compaction supervisors provide the following
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord
runtime properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * `druid.supervisor.compaction.engine` to `msq` to specify the MSQ task
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some
differences. Specifically, you submit a supervisor spec with the `type` set to
`autocompact` and the auto-compaction config in the `spec` to configure
auto-compaction.
+
+For information about the syntax, see [automatic-compaction
syntax](#auto-compaction-syntax).
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+ - The type of supervisor spec by setting `"type": "autocompact"`
+ - The compaction configuration by adding it to the `spec` field
+ ```json
+ {
+ "type": "autocompact",
+ "spec": {
+ "dataSource": YOUR_DATASOURCE,
+ ...
+ ...
+ }
+ ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia`
datasource:
+
+```sh
+curl --location --request POST
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+ "type": "autocompact", // required
+ "suspended": false, // optional
+ "spec": { // required
+ "dataSource": "wikipedia", // required
+ "tuningConfig": {...}, // optional
+ "granularitySpec": {...}, // optional
+ "engine": <native|msq>, //optional
+ ...
+ }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine.
You can control the default compaction engine with the
`druid.supervisor.compaction.engine` Overlord runtime property. If
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure
auto-compaction to use compaction supervisors. To use the MSQ task engine for
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ task engine extension
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify
the MSQ task engine as the default compaction engine. If you don't do this,
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context
parameters](../multi-stage-query/reference.md#context-parameters) in
`spec.taskContext` when configuring your datasource for automatic compaction,
such as setting the maximum number of tasks using the
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context
parameters overlap with automatic compaction parameters. When these settings
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through
the UI or API with the type `autocompact` and the `spec` where you define the
compaction behavior using the [automatic compaction
syntax](#auto-compaction-syntax). You can also use the [web
console](#manage-compaction-supervisors-with-the-web-console).
+
+
+#### MSQ task engine limitations
+
+When using the MSQ task engine for auto-compaction, keep the following
limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more
information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported
+- Set `rollup` to `true` if and only if `metricSpec` is not empty or null.
+- You can only partition on string dimensions. However, multi-valued string
dimensions are not supported.
+- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use
`maxRowsPerSegment` instead.
+
+##### Supported aggregators
+
+Auto-compaction using the MSQ task engine only supports aggregators. Only
aggregators where repeated runs of the aggregator on a column produce the same
results each time, such as the following `longSum` aggregator:
+
+```
+{"name": "added", "type": "longSum", "fieldName": "added"}
+```
+
+where the input and output column are both `added`.
+
+The following are some examples of aggregators that aren't supported, where
each run of the aggregator produces different results:
+
+* `longSum` aggregator where the `added` column rolls up into the `sum_added`
column:
Review Comment:
```suggestion
* `longSum` aggregator where the `added` column rolls up into the `sum_added`
column, discarding the input `added` column. This violates idempotency because
subsequent runs would no longer find the `added` column:
```
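To illustrate the idempotency point in the suggestion above, here is a minimal Python sketch. It is a toy model of rollup, not Druid's implementation: the `compact` helper, the `page` dimension, and the sample rows are all hypothetical. It only shows why an aggregator whose output column differs from its input column cannot be re-run on its own output.

```python
from collections import defaultdict


def compact(rows, dims, agg):
    """Toy rollup: group rows by `dims` and apply one longSum-style aggregator.

    Hypothetical helper for illustration only; not how Druid compacts segments.
    The output keeps just the dimensions and the aggregator's output column.
    """
    sums = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dims)
        # Raises KeyError if a previous run discarded the input column.
        sums[key] += row[agg["fieldName"]]
    return [
        dict(zip(dims, key), **{agg["name"]: total})
        for key, total in sorted(sums.items())
    ]


# Hypothetical sample data.
rows = [
    {"page": "A", "added": 3},
    {"page": "A", "added": 4},
    {"page": "B", "added": 5},
]

same_column = {"name": "added", "type": "longSum", "fieldName": "added"}
renamed = {"name": "sum_added", "type": "longSum", "fieldName": "added"}

# Input and output column are both `added`: a second run changes nothing.
once = compact(rows, ["page"], same_column)
twice = compact(once, ["page"], same_column)
idempotent = once == twice

# Output column is `sum_added`: the second run cannot find `added` anymore.
first = compact(rows, ["page"], renamed)
try:
    compact(first, ["page"], renamed)
    second_run_ok = True
except KeyError:
    second_run_ok = False
```

In this model, `idempotent` is true for the same-column spec, while the renamed spec fails on the second pass because `added` no longer exists in the compacted rows.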
##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,128 @@ The following auto-compaction configuration compacts
updates the `wikipedia` seg
}
```
+## Auto-compaction using compaction supervisors
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord
rather than Coordinator duties. Compaction supervisors provide the following
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord
runtime properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * `druid.supervisor.compaction.engine` to `msq` to specify the MSQ task
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some
differences. Specifically, you submit a supervisor spec with the `type` set to
`autocompact` and the auto-compaction config in the `spec` to configure
auto-compaction.
+
+For information about the syntax, see [automatic-compaction
syntax](#auto-compaction-syntax).
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+ - The type of supervisor spec by setting `"type": "autocompact"`
+ - The compaction configuration by adding it to the `spec` field
+ ```json
+ {
+ "type": "autocompact",
+ "spec": {
+ "dataSource": YOUR_DATASOURCE,
+ ...
+ ...
+ }
+ ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia`
datasource:
+
+```sh
+curl --location --request POST
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+ "type": "autocompact", // required
+ "suspended": false, // optional
+ "spec": { // required
+ "dataSource": "wikipedia", // required
+ "tuningConfig": {...}, // optional
+ "granularitySpec": {...}, // optional
+ "engine": <native|msq>, //optional
+ ...
+ }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine.
You can control the default compaction engine with the
`druid.supervisor.compaction.engine` Overlord runtime property. If
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure
auto-compaction to use compaction supervisors. To use the MSQ task engine for
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ task engine extension
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify
the MSQ task engine as the default compaction engine. If you don't do this,
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context
parameters](../multi-stage-query/reference.md#context-parameters) in
`spec.taskContext` when configuring your datasource for automatic compaction,
such as setting the maximum number of tasks using the
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context
parameters overlap with automatic compaction parameters. When these settings
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through
the UI or API with the type `autocompact` and the `spec` where you define the
compaction behavior using the [automatic compaction
syntax](#auto-compaction-syntax). You can also use the [web
console](#manage-compaction-supervisors-with-the-web-console).
+
+
+#### MSQ task engine limitations
+
+When using the MSQ task engine for auto-compaction, keep the following
limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more
information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported
+- Set `rollup` to `true` if and only if `metricSpec` is not empty or null.
+- You can only partition on string dimensions. However, multi-valued string
dimensions are not supported.
+- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use
`maxRowsPerSegment` instead.
+
+##### Supported aggregators
+
+Auto-compaction using the MSQ task engine only supports aggregators. Only
aggregators where repeated runs of the aggregator on a column produce the same
results each time, such as the following `longSum` aggregator:
+
+```
+{"name": "added", "type": "longSum", "fieldName": "added"}
+```
+
+where the input and output column are both `added`.
+
+The following are some examples of aggregators that aren't supported, where
each run of the aggregator produces different results:
Review Comment:
```suggestion
The following are some examples of aggregators that aren't supported because
at least one of the required conditions isn't satisfied:
```
##########
docs/data-management/automatic-compaction.md:
##########
@@ -221,6 +223,128 @@ The following auto-compaction configuration compacts
updates the `wikipedia` seg
}
```
+## Auto-compaction using compaction supervisors
+
+:::info Experimental
+Compaction supervisors are experimental. For production use, we recommend
[Coordinator-based auto-compaction](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord
rather than Coordinator duties. Compaction supervisors provide the following
benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the
auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task
engine](#use-msq-for-auto-compaction)
+* More reactive and submits tasks as soon as a compaction slot is available
+* Tracked compaction task status to avoid re-compacting an interval repeatedly
+
+
+To use compaction supervisors, set the following properties in your Overlord
runtime properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * `druid.supervisor.compaction.engine` to `msq` to specify the MSQ task
engine as the compaction engine or to `native` to use the native engine.
+
+Compaction uses the same syntax as Coordinator-based auto-compaction with some
differences. Specifically, you submit a supervisor spec with the `type` set to
`autocompact` and the auto-compaction config in the `spec` to configure
auto-compaction.
+
+For information about the syntax, see [automatic-compaction
syntax](#auto-compaction-syntax).
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform
the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+ - The type of supervisor spec by setting `"type": "autocompact"`
+ - The compaction configuration by adding it to the `spec` field
+ ```json
+ {
+ "type": "autocompact",
+ "spec": {
+ "dataSource": YOUR_DATASOURCE,
+ ...
+ ...
+ }
+ ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint
as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia`
datasource:
+
+```sh
+curl --location --request POST
'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+ "type": "autocompact", // required
+ "suspended": false, // optional
+ "spec": { // required
+ "dataSource": "wikipedia", // required
+ "tuningConfig": {...}, // optional
+ "granularitySpec": {...}, // optional
+ "engine": <native|msq>, //optional
+ ...
+ }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine.
You can control the default compaction engine with the
`druid.supervisor.compaction.engine` Overlord runtime property. If
`spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid
defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor
through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure
auto-compaction to use compaction supervisors. To use the MSQ task engine for
automatic compaction, make sure the following requirements are met:
+
+* Have the [MSQ task engine extension
loaded](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks
can be run as a supervisor task
+ * Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify
the MSQ task engine as the default compaction engine. If you don't do this,
you'll need to set `spec.engine` to `msq` for each compaction supervisor spec
where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set
`compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine
requires at least two tasks to run, one controller task and one worker task.
+
+You can use [MSQ task engine context
parameters](../multi-stage-query/reference.md#context-parameters) in
`spec.taskContext` when configuring your datasource for automatic compaction,
such as setting the maximum number of tasks using the
`spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context
parameters overlap with automatic compaction parameters. When these settings
overlap, set one or the other.
+To submit an automatic compaction task, you submit a supervisor spec through
the UI or API with the type `autocompact` and the `spec` where you define the
compaction behavior using the [automatic compaction
syntax](#auto-compaction-syntax). You can also use the [web
console](#manage-compaction-supervisors-with-the-web-console).
+
+
+#### MSQ task engine limitations
+
+When using the MSQ task engine for auto-compaction, keep the following
limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more
information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported
+- Set `rollup` to `true` if and only if `metricSpec` is not empty or null.
+- You can only partition on string dimensions. However, multi-valued string
dimensions are not supported.
+- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use
`maxRowsPerSegment` instead.
+
+##### Supported aggregators
+
+Auto-compaction using the MSQ task engine only supports aggregators. Only
aggregators where repeated runs of the aggregator on a column produce the same
results each time, such as the following `longSum` aggregator:
+
+```
+{"name": "added", "type": "longSum", "fieldName": "added"}
+```
+
+where the input and output column are both `added`.
+
+The following are some examples of aggregators that aren't supported, where
each run of the aggregator produces different results:
+
+* `longSum` aggregator where the `added` column rolls up into the `sum_added`
column:
+ ```
+ {"name": "sum_added", "type": "longSum", "fieldName": "added" }
+ ```
+* Partial sketches:
+ ```
+ {"name": added, "type":"HLLSketchMerge", fieldName: added}
+ ```
+* Count aggregators since it rolls up into a different count column
Review Comment:
```suggestion
* The `count` aggregator, because it cannot be used to combine partial
aggregates and it rolls up into a different `count` column, discarding the
input columns. This violates both mergeability and idempotency.
```
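As a rough illustration of the mergeability point, the following Python sketch uses a toy model (the `count_rows` and `sum_counts` helpers and the sample rows are hypothetical, not Druid code). Re-applying a count to already rolled-up rows counts the rolled-up rows rather than the original ones, whereas summing the stored counts preserves the totals.

```python
def count_rows(rows, dim):
    """Toy count aggregator: number of input rows per dimension value."""
    counts = {}
    for row in rows:
        counts[row[dim]] = counts.get(row[dim], 0) + 1
    return [{dim: key, "count": value} for key, value in sorted(counts.items())]


def sum_counts(rows, dim):
    """Toy merge step: add up previously stored `count` values."""
    totals = {}
    for row in rows:
        totals[row[dim]] = totals.get(row[dim], 0) + row["count"]
    return [{dim: key, "count": value} for key, value in sorted(totals.items())]


# Hypothetical raw rows: two for page A, one for page B.
raw = [{"page": "A"}, {"page": "A"}, {"page": "B"}]

first_pass = count_rows(raw, "page")       # A has count 2, B has count 1
recounted = count_rows(first_pass, "page")  # counts rolled-up rows: A and B both 1
merged = sum_counts(first_pass, "page")     # sums stored counts: A stays 2, B stays 1
```

Here `recounted` loses the original row count for page A, while `merged` keeps it, which is why a count cannot stand in for its own merge step.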