This is an automated email from the ASF dual-hosted git repository.
techdocsmith pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git
The following commit(s) were added to refs/heads/master by this push:
new 9142f4b8d7 docs: update note in automatic compaction doc (#14908)
9142f4b8d7 is described below
commit 9142f4b8d7acf8942f6a0df20052b32eb85d91e2
Author: Victoria Lim <[email protected]>
AuthorDate: Fri Aug 25 14:14:29 2023 -0700
docs: update note in automatic compaction doc (#14908)
---
docs/data-management/automatic-compaction.md | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/docs/data-management/automatic-compaction.md b/docs/data-management/automatic-compaction.md
index 1a27176986..8d696a86d4 100644
--- a/docs/data-management/automatic-compaction.md
+++ b/docs/data-management/automatic-compaction.md
@@ -23,6 +23,9 @@ title: "Automatic compaction"
-->
In Apache Druid, compaction is a special type of ingestion task that reads
data from a Druid datasource and writes it back into the same datasource. A
common use case for this is to [optimally size
segments](../operations/segment-optimization.md) after ingestion to improve
query performance. Automatic compaction, or auto-compaction, refers to the
system for automatic execution of compaction tasks managed by the [Druid
Coordinator](../design/coordinator.md).
+This topic guides you through setting up automatic compaction for your Druid cluster. See the [examples](#examples) for common use cases for automatic compaction.
+
+## How Druid manages automatic compaction
The Coordinator [indexing
period](../configuration/index.md#coordinator-operation),
`druid.coordinator.period.indexingPeriod`, controls the frequency of compaction
tasks.
The default indexing period is 30 minutes, meaning that the Coordinator first
checks for segments to compact at most 30 minutes from when auto-compaction is
enabled.
@@ -33,9 +36,12 @@ At every invocation of auto-compaction, the Coordinator initiates a [segment sea
When there are eligible segments to compact, the Coordinator issues compaction
tasks based on available worker capacity.
If a compaction task takes longer than the indexing period, the Coordinator
waits for it to finish before resuming the period for segment search.
+:::info
+ Auto-compaction skips datasources that have a segment granularity of `ALL`.
+:::
+
As a best practice, you should set up auto-compaction for all Druid
datasources. You can run compaction tasks manually for cases where you want to
allocate more system resources. For example, you may choose to run multiple
compaction tasks in parallel to compact an existing datasource for the first
time. See [Compaction](compaction.md) for additional details and use cases.
-This topic guides you through setting up automatic compaction for your Druid cluster. See the [examples](#examples) for common use cases for automatic compaction.
## Enable automatic compaction
@@ -174,10 +180,6 @@ The following auto-compaction configuration compacts existing `HOUR` segments in
}
```
-:::info
- Auto-compaction skips datasources containing ALL granularity segments when the target granularity is different.
-:::
-
### Update partitioning scheme
For your `wikipedia` datasource, you want to optimize segment access when
regularly ingesting data without compromising compute time when querying the
data. Your ingestion spec for batch append uses [dynamic
partitioning](../ingestion/native-batch.md#dynamic-partitioning) to optimize
for write-time operations, while your stream ingestion partitioning is
configured by the stream service. You want to implement auto-compaction to
reorganize the data with a suitable read-time partitioning us [...]