This is an automated email from the ASF dual-hosted git repository.
kfaraz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git
The following commit(s) were added to refs/heads/master by this push:
new 821769c6ce8 Docs: Add metrics and configs for embedded kill tasks
(#18124)
821769c6ce8 is described below
commit 821769c6ce8f62dc7f7925836bfcba8c62921d02
Author: Kashif Faraz <[email protected]>
AuthorDate: Thu Jun 12 09:17:36 2025 +0530
Docs: Add metrics and configs for embedded kill tasks (#18124)
Docs changes for #18028
- Document metrics and configs for embedded kill tasks
- Remove duplicate configs for Coordinator auto-kill from
`data-management/delete.md`
- Fix up references
---
docs/api-reference/tasks-api.md | 2 +-
docs/configuration/index.md | 27 +++++++++++++----
docs/data-management/automatic-compaction.md | 2 +-
docs/data-management/delete.md | 44 +++++++++++++++++++++++-----
docs/operations/clean-metadata-store.md | 11 +------
docs/operations/metrics.md | 14 +++++++++
6 files changed, 75 insertions(+), 25 deletions(-)
diff --git a/docs/api-reference/tasks-api.md b/docs/api-reference/tasks-api.md
index 69a0a015361..f53037f84e1 100644
--- a/docs/api-reference/tasks-api.md
+++ b/docs/api-reference/tasks-api.md
@@ -1601,7 +1601,7 @@ Content-Length: 134
Manually clean up pending segments table in metadata storage for `datasource`.
It returns a JSON object response with
`numDeleted` for the number of rows deleted from the pending segments table.
This API is used by the
-`druid.coordinator.kill.pendingSegments.on` [Coordinator
setting](../configuration/index.md#coordinator-operation)
+`druid.coordinator.kill.pendingSegments.on` [Coordinator
setting](../configuration/index.md#data-management)
which automates this operation to run periodically.
#### URL
diff --git a/docs/configuration/index.md b/docs/configuration/index.md
index 6c8ad4ec023..84cd1d3636e 100644
--- a/docs/configuration/index.md
+++ b/docs/configuration/index.md
@@ -878,9 +878,19 @@ These Coordinator static configurations can be defined in
the `coordinator/runti
|Property|Description|Default|
|--------|-----------|-------|
|`druid.coordinator.period`|The run period for the Coordinator. The
Coordinator operates by maintaining the current state of the world in memory
and periodically looking at the set of "used" segments and segments being
served to make decisions about whether any changes need to be made to the data
topology. This property sets the delay between each of these runs.|`PT60S`|
-|`druid.coordinator.period.indexingPeriod`|How often to send
compact/merge/conversion tasks to the indexing service. It's recommended to be
longer than `druid.manager.segments.pollDuration`|`PT1800S` (30 mins)|
|`druid.coordinator.startDelay`|The operation of the Coordinator works on the assumption that it has an up-to-date view of the state of the world when it runs. However, the current ZooKeeper interaction code is written in a way that doesn't allow the Coordinator to know for a fact that it's done loading the current state of the world. This delay is a hack to give it enough time to believe that it has all the data.|`PT300S`|
|`druid.coordinator.load.timeout`|The timeout duration for when the
Coordinator assigns a segment to a Historical service.|`PT15M`|
+|`druid.coordinator.balancer.strategy`|The [balancing
strategy](../design/coordinator.md#balancing-segments-in-a-tier) used by the
Coordinator to distribute segments among the Historical servers in a tier. The
`cost` strategy distributes segments by minimizing a cost function,
`diskNormalized` weights these costs with the disk usage ratios of the servers
and `random` distributes segments randomly.|`cost`|
+|`druid.coordinator.loadqueuepeon.http.repeatDelay`|The start and repeat delay
(in milliseconds) for the load queue peon, which manages the load/drop queue of
segments for any server.|1 minute|
+|`druid.coordinator.loadqueuepeon.http.batchSize`|Number of segment load/drop requests to batch in one HTTP request. Note that it must be smaller than or equal to the `druid.segmentCache.numLoadingThreads` config on the Historical service. If this value is not configured, the Coordinator uses the value of `numLoadingThreads` for the respective server.|`druid.segmentCache.numLoadingThreads`|
+|`druid.coordinator.asOverlord.enabled`|Boolean value for whether this Coordinator service should also act as an Overlord. This configuration allows users to simplify a Druid cluster by not having to deploy any standalone Overlord services. If set to `true`, the Overlord console is available at `http://coordinator-host:port/console.html`, and you must also set `druid.coordinator.asOverlord.overlordService`.|false|
+|`druid.coordinator.asOverlord.overlordService`|Required if `druid.coordinator.asOverlord.enabled` is `true`. This must be the same value as `druid.service` on standalone Overlord services and `druid.selectors.indexing.serviceName` on Middle Managers.|NULL|
+
+##### Data management
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.coordinator.period.indexingPeriod`|Period to run data management
duties on the Coordinator including launching compact tasks and performing
clean up of unused data. It is recommended to keep this value longer than
`druid.manager.segments.pollDuration`.|`PT1800S` (30 mins)|
|`druid.coordinator.kill.pendingSegments.on`|Boolean flag for whether or not the Coordinator should clean up old entries in the `pendingSegments` table of the metadata store. If set to true, the Coordinator checks the created time of the most recently completed task. If it doesn't exist, it finds the created time of the earliest running/pending/waiting tasks. Once the created time is found, then for all datasources not in the `killPendingSegmentsSkipList` (see [Dynamic configuration](#dynamic-configurat [...]
|`druid.coordinator.kill.on`|Boolean flag to enable the Coordinator to submit
a kill task for unused segments and delete them permanently from the metadata
store and deep storage.|false|
|`druid.coordinator.kill.period`| The frequency of sending kill tasks to the
indexing service. The value must be greater than or equal to
`druid.coordinator.period.indexingPeriod`. Only applies if kill is turned
on.|Same as `druid.coordinator.period.indexingPeriod`|
@@ -889,11 +899,6 @@ These Coordinator static configurations can be defined in
the `coordinator/runti
|`druid.coordinator.kill.bufferPeriod`|The amount of time that a segment must be unused before it can be permanently removed from metadata and deep storage. This can serve as a buffer period to prevent data loss if data ends up being needed after being marked unused.|`P30D`|
|`druid.coordinator.kill.maxSegments`|The number of unused segments to kill
per kill task. This number must be greater than 0. This only applies when
`druid.coordinator.kill.on=true`.|100|
|`druid.coordinator.kill.maxInterval`|The largest interval, as an [ISO 8601
duration](https://en.wikipedia.org/wiki/ISO_8601#Durations), of segments to
delete per kill task. Set to zero, e.g. `PT0S`, for unlimited. This only
applies when `druid.coordinator.kill.on=true`.|`P30D`|
-|`druid.coordinator.balancer.strategy`|The [balancing
strategy](../design/coordinator.md#balancing-segments-in-a-tier) used by the
Coordinator to distribute segments among the Historical servers in a tier. The
`cost` strategy distributes segments by minimizing a cost function,
`diskNormalized` weights these costs with the disk usage ratios of the servers
and `random` distributes segments randomly.|`cost`|
-|`druid.coordinator.loadqueuepeon.http.repeatDelay`|The start and repeat delay
(in milliseconds) for the load queue peon, which manages the load/drop queue of
segments for any server.|1 minute|
-|`druid.coordinator.loadqueuepeon.http.batchSize`|Number of segment load/drop
requests to batch in one HTTP request. Note that it must be smaller than or
equal to the `druid.segmentCache.numLoadingThreads` config on Historical
service. If this value is not configured, the coordinator uses the value of the
`numLoadingThreads` for the respective server. |
`druid.segmentCache.numLoadingThreads` |
-|`druid.coordinator.asOverlord.enabled`|Boolean value for whether this
Coordinator service should act like an Overlord as well. This configuration
allows users to simplify a Druid cluster by not having to deploy any standalone
Overlord services. If set to true, then Overlord console is available at
`http://coordinator-host:port/console.html` and be sure to set
`druid.coordinator.asOverlord.overlordService` also.|false|
-|`druid.coordinator.asOverlord.overlordService`| Required, if
`druid.coordinator.asOverlord.enabled` is `true`. This must be same value as
`druid.service` on standalone Overlord services and
`druid.selectors.indexing.serviceName` on Middle Managers.|NULL|
##### Metadata management
@@ -1187,6 +1192,16 @@ The following properties pertain to segment metadata
caching on the Overlord tha
|`druid.manager.segments.useIncrementalCache`|Denotes the usage mode of the
segment metadata incremental cache. Possible modes are: (a) `never`: Cache is
disabled. (b) `always`: Reads are always done from the cache. Service start-up
will be blocked until cache has synced with the metadata store at least once.
Transactions will block until cache has synced with the metadata store at least
once after becoming leader. (c) `ifSynced`: Reads are done from the cache only
if it has already sync [...]
|`druid.manager.segments.pollDuration`|Duration (in ISO 8601 format) between
successive syncs of the cache with the metadata store. This property is used
only when `druid.manager.segments.useIncrementalCache` is set to `always` or
`ifSynced`.|`PT1M` (1 minute)|
+##### Auto-kill unused segments (Experimental)
+
+These configs pertain to the new embedded mode of running [kill tasks on the
Overlord](../data-management/delete.md#auto-kill-data-on-the-overlord-experimental).
+None of the configs that apply to [auto-kill performed by the
Coordinator](../data-management/delete.md#auto-kill-data-using-coordinator-duties)
are used by this feature.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.manager.segments.killUnused.enabled`|Boolean flag to enable auto-kill
of eligible unused segments on the Overlord. This feature can be used only when
[segment metadata caching](#segment-metadata-cache-experimental) is enabled on
the Overlord and MUST NOT be enabled if `druid.coordinator.kill.on` is already
set to `true` on the Coordinator.|`true`|
+|`druid.manager.segments.killUnused.bufferPeriod`|Period after which a segment
marked as unused becomes eligible for auto-kill on the Overlord. This config is
effective only if `druid.manager.segments.killUnused.enabled` is set to
`true`.|`P30D` (30 days)|
+
#### Overlord dynamic configuration
The Overlord has dynamic configurations to tune how Druid assigns tasks to
workers.
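[Editor's note: to illustrate how the relocated Coordinator auto-kill settings in the hunk above fit together, here is a sketch of a `coordinator/runtime.properties` fragment. The values shown are illustrative defaults taken from the tables above, not tuning recommendations.]

```properties
# Run data-management duties (including kill-task submission) every 30 minutes
druid.coordinator.period.indexingPeriod=PT1800S

# Enable the Coordinator to submit kill tasks for unused segments
druid.coordinator.kill.on=true
# Must be >= druid.coordinator.period.indexingPeriod
druid.coordinator.kill.period=PT1800S

# Segments must be unused for 30 days before they can be permanently deleted
druid.coordinator.kill.bufferPeriod=P30D
# Limits on the size of each kill task
druid.coordinator.kill.maxSegments=100
druid.coordinator.kill.maxInterval=P30D
```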
diff --git a/docs/data-management/automatic-compaction.md
b/docs/data-management/automatic-compaction.md
index 9534a477fd6..1a0803bafb2 100644
--- a/docs/data-management/automatic-compaction.md
+++ b/docs/data-management/automatic-compaction.md
@@ -85,7 +85,7 @@ For more details on each of the specs in an auto-compaction
configuration, see [
## Auto-compaction using Coordinator duties
-You can control how often the Coordinator checks to see if auto-compaction is
needed. The Coordinator [indexing
period](../configuration/index.md#coordinator-operation),
`druid.coordinator.period.indexingPeriod`, controls the frequency of compaction
tasks.
+You can control how often the Coordinator checks to see if auto-compaction is
needed. The Coordinator [indexing
period](../configuration/index.md#data-management),
`druid.coordinator.period.indexingPeriod`, controls the frequency of compaction
tasks.
The default indexing period is 30 minutes, meaning that the Coordinator first
checks for segments to compact at most 30 minutes from when auto-compaction is
enabled.
This time period also affects other Coordinator duties such as cleanup of
unused segments and stale pending segments.
To configure the auto-compaction time period without interfering with
`indexingPeriod`, see [Set frequency of compaction
runs](#change-compaction-frequency).
diff --git a/docs/data-management/delete.md b/docs/data-management/delete.md
index 626f0f910e0..e37ba48b544 100644
--- a/docs/data-management/delete.md
+++ b/docs/data-management/delete.md
@@ -22,7 +22,7 @@ title: "Data deletion"
~ under the License.
-->
-## By time range, manually
+## Delete data for a time range manually
Apache Druid stores data [partitioned by time chunk](../design/storage.md) and
supports
deleting data for time chunks by dropping segments. This is a fast,
metadata-only operation.
@@ -42,17 +42,17 @@ For documentation on disabling segments using the
Coordinator API, see the
A data deletion tutorial is available at [Tutorial: Deleting
data](../tutorials/tutorial-delete-data.md).
-## By time range, automatically
+## Delete data automatically using drop rules
Druid supports [load and drop rules](../operations/rule-configuration.md),
which are used to define intervals of time
where data should be preserved, and intervals where data should be discarded.
Data that falls under a drop rule is
-marked unused, in the same manner as if you [manually mark that time range
unused](#by-time-range-manually). This is a
+marked unused, in the same manner as if you [manually mark that time range
unused](#delete-data-for-a-time-range-manually). This is a
fast, metadata-only operation.
Data that is dropped in this way is marked unused, but remains in deep
storage. To permanently delete it, use a
[`kill` task](#kill-task).
-## Specific records
+## Delete specific records
Druid supports deleting specific records using [reindexing](update.md#reindex)
with a filter. The filter specifies which
data remains after reindexing, so it must be the inverse of the data you want
to delete. Because segments must be
@@ -74,15 +74,15 @@ used to filter, modify, or enrich the data during the
reindexing job.
Data that is deleted in this way is marked unused, but remains in deep
storage. To permanently delete it, use a [`kill`
task](#kill-task).
-## Entire table
+## Delete an entire table
-Deleting an entire table works the same way as [deleting part of a table by
time range](#by-time-range-manually). First,
+Deleting an entire table works the same way as [deleting part of a table by
time range](#delete-data-for-a-time-range-manually). First,
mark all segments unused using the Coordinator API or web console. Then,
optionally, delete it permanently using a
[`kill` task](#kill-task).
<a name="kill-task"></a>
-## Permanently (`kill` task)
+## Delete data permanently using `kill` tasks
Data that has been overwritten or soft-deleted still remains as segments that
have been marked unused. You can use a
`kill` task to permanently delete this data.
@@ -116,3 +116,33 @@ Some of the parameters used in the task payload are
further explained below:
**WARNING:** The `kill` task permanently removes all information about the
affected segments from the metadata store and
deep storage. This operation cannot be undone.
+### Auto-kill data using Coordinator duties
+
+Instead of submitting `kill` tasks manually to permanently delete data for a
given interval, you can enable auto-kill of unused segments on the Coordinator.
+The Coordinator runs a duty periodically to identify intervals containing
unused segments that are eligible for kill. It then launches a `kill` task for
each of these intervals.
+
+Refer to [Data management on the
Coordinator](../configuration/index.md#data-management) to configure auto-kill
of unused segments on the Coordinator.
+
+### Auto-kill data on the Overlord (Experimental)
+
+:::info
+This is an experimental feature that:
+- Can be used only if [segment metadata
caching](../configuration/index.md#segment-metadata-cache-experimental) is
enabled on the Overlord.
+- MUST NOT be used if auto-kill of unused segments is already enabled on the
Coordinator.
+:::
+
+This feature runs kill tasks in an "embedded" mode on the Overlord itself.
+
+These embedded tasks offer several advantages over auto-kill performed by the
Coordinator as they:
+- avoid unnecessary REST API calls to the Overlord from tasks or the Coordinator.
+- kill unused segments as soon as they become eligible.
+- run on the Overlord and do not take up task slots.
+- finish faster as they save on the overhead of launching a task process.
+- kill a small number of segments per task, to ensure that locks on an
interval are not held for too long.
+- skip locked intervals to avoid head-of-line blocking in kill tasks.
+- require little to no configuration.
+- can keep up with a large number of unused segments in the cluster.
+- take advantage of the segment metadata cache on the Overlord.
+
+Refer to [Auto-kill unused segments on the
Overlord](../configuration/index.md#auto-kill-unused-segments-experimental) to
configure auto-kill of unused segments on the Overlord.
+See [Auto-kill metrics](../operations/metrics.md#auto-kill-unused-segments)
for the metrics emitted by embedded kill tasks.
diff --git a/docs/operations/clean-metadata-store.md
b/docs/operations/clean-metadata-store.md
index becc91bb483..65de3112302 100644
--- a/docs/operations/clean-metadata-store.md
+++ b/docs/operations/clean-metadata-store.md
@@ -79,16 +79,7 @@ Segment records and segments in deep storage become eligible
for deletion when b
- When they meet the eligibility requirement of kill task datasource
configuration according to `killDataSourceWhitelist` set in the Coordinator
dynamic configuration. See [Dynamic
configuration](../configuration/index.md#dynamic-configuration).
- When the `durationToRetain` time has passed since their creation.
-Kill tasks use the following configuration:
-- `druid.coordinator.kill.on`: When `true`, enables the Coordinator to submit
a kill task for unused segments, which deletes them completely from metadata
store and from deep storage.
-Only applies to the specified datasources in the dynamic configuration
parameter `killDataSourceWhitelist`.
-If `killDataSourceWhitelist` is not set or empty, then kill tasks can be
submitted for all datasources.
-- `druid.coordinator.kill.period`: Defines the frequency in [ISO 8601
format](https://en.wikipedia.org/wiki/ISO_8601#Durations) for the cleanup job
to check for and delete eligible segments. Defaults to
`druid.coordinator.period.indexingPeriod`. Must be greater than or equal to
`druid.coordinator.period.indexingPeriod`.
-- `druid.coordinator.kill.durationToRetain`: Defines the retention period in
[ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations) after
creation that segments become eligible for deletion.
-- `druid.coordinator.kill.ignoreDurationToRetain`: A way to override
`druid.coordinator.kill.durationToRetain`. When enabled, the coordinator
considers all unused segments as eligible to be killed.
-- `druid.coordinator.kill.bufferPeriod`: Defines the amount of time that a
segment must be unused before it can be permanently removed from metadata and
deep storage. This serves as a buffer period to prevent data loss if data ends
up being needed after being marked unused.
-- `druid.coordinator.kill.maxSegments`: Defines the maximum number of segments
to delete per kill task.
-- `druid.coordinator.kill.maxInterval`: Defines the largest interval, as an
[ISO 8601 duration](https://en.wikipedia.org/wiki/ISO_8601#Durations), of
segments to delete per kill task. Set to zero, e.g. `PT0S`, for unlimited.
+Refer to [Data management on the Coordinator](../configuration/index.md#data-management) to configure auto-kill of unused segments on the Coordinator.
### Audit records
diff --git a/docs/operations/metrics.md b/docs/operations/metrics.md
index a68cafac85c..a0f955f498a 100644
--- a/docs/operations/metrics.md
+++ b/docs/operations/metrics.md
@@ -354,6 +354,20 @@ The following metrics are emitted only when [segment
metadata caching](../config
|`segment/metadataCache/pending/updated`|Number of pending segments updated in
the cache during the latest sync.|`dataSource`|
|`segment/metadataCache/pending/skipped`|Number of unparseable pending segment
records that were skipped in the latest sync.|`dataSource`|
+### Auto-kill unused segments
+
+These metrics are emitted only if [auto-kill of unused
segments](../data-management/delete.md#auto-kill-data-on-the-overlord-experimental)
is enabled on the Overlord.
+
+|Metric|Description|Dimensions|
+|------|-----------|----------|
+|`segment/killed/metadataStore/count`|Number of segments permanently deleted
from the metadata store.|`taskId`, `groupId`, `taskType`(=`kill`), `dataSource`|
+|`segment/killed/deepStorage/count`|Number of segments permanently deleted from deep storage.|`taskId`, `groupId`, `taskType`(=`kill`), `dataSource`|
+|`segment/kill/unusedIntervals/count`|Number of intervals containing unused
segments for a given datasource.|`dataSource`|
+|`segment/kill/skippedIntervals/count`|Number of intervals that were skipped
for kill due to being already locked by another task.|`taskId`, `groupId`,
`taskType`(=`kill`), `dataSource`|
+|`segment/kill/queueReset/time`|Time taken in milliseconds to reset the kill
queue.||
+|`segment/kill/queueProcess/time`|Time taken in milliseconds to fully process
the kill queue.||
+|`segment/kill/jobsProcessed/count`|Number of jobs processed from the kill
queue for a given datasource.|`dataSource`|
+
## Shuffle metrics (Native parallel task)
The shuffle metrics can be enabled by adding
`org.apache.druid.indexing.worker.shuffle.ShuffleMonitor` in
`druid.monitoring.monitors`.
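[Editor's note: pulling together the new Overlord-side settings documented in this commit, a sketch of an Overlord `runtime.properties` fragment enabling embedded auto-kill might look like the following. Values are illustrative; per the docs above, `druid.coordinator.kill.on` must remain `false` on the Coordinator when this is enabled.]

```properties
# Embedded auto-kill requires the segment metadata cache on the Overlord
druid.manager.segments.useIncrementalCache=always
druid.manager.segments.pollDuration=PT1M

# Enable embedded kill tasks on the Overlord
druid.manager.segments.killUnused.enabled=true
# Unused segments become eligible for auto-kill after 30 days
druid.manager.segments.killUnused.bufferPeriod=P30D
```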
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]