317brian commented on code in PR #15218: URL: https://github.com/apache/druid/pull/15218#discussion_r1369286165
##########
docs/data-management/automatic-compaction.md:
##########
@@ -203,6 +203,82 @@ The following auto-compaction configuration compacts updates the `wikipedia` seg
 }
 ```
+## Concurrent compaction
+
+:::info
+Concurrent compaction is an [experimental feature](../development/experimental.md) and is not currently available for SQL-based ingestion.
+:::
+
+If you enable automatic compaction, you can also use concurrent compaction for streaming and legacy JSON-based batch ingestion. Concurrent compaction compacts the data as you ingest it.
+
+Setting up concurrent compaction is a two-step process. The first is to update your datasource and the second is to update your ingestion job.
+
+Using concurrent compaction in the following scenarios can be beneficial:
+
+- If the job with an `APPEND` task and the job with a `REPLACE` task have the same segment granularity. For example, when a datasource and its streaming ingestion job have the same granularity.
+- If the job with an `APPEND` task has a finer segment granularity than the replacing job.
+
+We do not recommend using concurrent compaction when the job with an `APPEND` task has a coarser granularity than the job with a `REPLACE` task. For example, if the `APPEND` job has a yearly granularity and the `REPLACE` job has a monthly granularity. The job that finishes second will fail.
+
+### Configure concurrent compaction
+
+##### Update the compaction settings with the API
+
+ First, prepare your datasource for concurrent compaction by setting its task lock type to `REPLACE`.
+Add the `taskContext` like you would any other auto-compaction setting through the API:
+
+```shell
+curl --location --request POST 'http://localhost:8081/druid/coordinator/v1/config/compaction' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+    "dataSource": "YOUR_DATASOURCE",
+    "taskContext": {
+        "taskLockType": "REPLACE"
+    }
+}'
+```
+
+##### Update the compaction settings with the UI
+
+In the **Compaction config** for a datasource, set **Allow concurrent compaction append tasks** to **True**.

Review Comment:
```suggestion
In the **Compaction config** for a datasource, set **Allow concurrent compactions (experimental)** to **True**.
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
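
For anyone scripting the API step from the diff instead of using the UI, here is a minimal sketch that builds the same payload for a given datasource. The function name `build_compaction_payload` and the `DRUID_COORDINATOR` environment variable are hypothetical, introduced only for illustration; the endpoint path and the `taskContext` field are the ones quoted in the diff above. It assumes the datasource name contains no characters that need JSON escaping.

```shell
#!/bin/sh
# Hypothetical helper: print the compaction config payload from the diff
# for the datasource named in $1. Assumes the name needs no JSON escaping.
build_compaction_payload() {
  datasource="$1"
  printf '{"dataSource": "%s", "taskContext": {"taskLockType": "REPLACE"}}\n' "$datasource"
}

payload="$(build_compaction_payload "YOUR_DATASOURCE")"
echo "$payload"

# Only send the request when DRUID_COORDINATOR (hypothetical variable,
# for example http://localhost:8081) is set, so the sketch is safe to run as-is.
if [ -n "${DRUID_COORDINATOR:-}" ]; then
  curl --location --request POST "${DRUID_COORDINATOR}/druid/coordinator/v1/config/compaction" \
    --header 'Content-Type: application/json' \
    --data-raw "$payload"
fi
```

Keeping the payload in one function means the datasource name is the only thing that changes between calls, which may make it easier to apply the same lock type to several datasources.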
