maytasm commented on a change in pull request #11245:
URL: https://github.com/apache/druid/pull/11245#discussion_r634028939



##########
File path: docs/operations/clean-metadata-store.md
##########
@@ -0,0 +1,168 @@
+---
+id: clean-metadata-store
+title: "Automated cleanup for metadata records"
+sidebar_label: Automated metadata cleanup
+description: "Defines a strategy to maintain Druid metadata store performance by automatically removing leftover records for deleted entities: datasources, supervisors, rules, compaction configuration, audit records, and so on. Most applicable to databases with 'high-churn' datasources."
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+When you delete entities from Apache Druid, records related to those entities may remain in the metadata store, including:
+
+- segment records
+- audit records
+- supervisor records
+- rule records
+- compaction configuration records
+- datasource records created by supervisors
+
+If you have a high datasource churn rate, meaning you frequently create and 
delete many short-lived datasources or other related entities like compaction 
configuration or rules, the leftover records can start to fill your metadata 
store and cause performance issues.
+
+To maintain metadata store performance in this case, you can configure Apache 
Druid to automatically remove records associated with deleted entities from the 
metadata store.
+
+## Automated cleanup strategies
+There are several cases when you should consider automated cleanup of the 
metadata related to deleted datasources:
+- Proactively, if you know you have many high-churn datasources. For example, you have scripts that create and delete supervisors regularly.
+- If you have issues with the hard disk for your metadata database filling up.
+- If you run into performance issues with the metadata database. For example, API calls are very slow or fail to execute.
+
+If you have compliance requirements to keep audit records and you enable automated cleanup for audit records, use an alternative method to preserve the audit metadata. For example, periodically export audit metadata records to external storage.
+
+## Configure automated metadata cleanup
+
+You can configure cleanup on a per-entity basis with the following constraints:
+- You have to configure a [kill task for segment records](#kill-task) before 
you can configure automated cleanup for [rules](#rules-records) or [compaction 
configuration](#compaction-configuration-records).
+- You have to configure the scheduler for the cleanup jobs to run at the same frequency as, or more frequently than, your most frequent cleanup job. For example, if your most frequent cleanup job runs every hour, set the scheduler metadata store management period to one hour or less: `druid.coordinator.period.metadataStoreManagementPeriod=PT1H`.
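+
+For example, if your most frequent cleanup job runs hourly, a consistent scheduler setting (the value is illustrative, not a recommendation) looks like:
+
+```properties
+# Scheduler must run at least as often as the most frequent cleanup job
+druid.coordinator.period.metadataStoreManagementPeriod=PT1H
+```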
+
+For details on configuration properties, see [Metadata 
management](../configuration/index.md#metadata-management).
+
+<a name="kill-task"></a>
+### Segment records and segments in deep storage (kill task)
+Segment records and segments in deep storage become eligible for deletion:
+
+- When they meet the eligibility requirements of the kill task datasource configuration, according to `killDataSourceWhitelist` and `killAllDataSources` set in the Coordinator dynamic configuration. See [Dynamic configuration](../configuration/index.md#dynamic-configuration).
+- When the `durationToRetain` time has passed since their creation.
+
+Kill tasks use the following configuration:
+- `druid.coordinator.kill.on`: When `true`, enables the Coordinator to submit kill tasks for unused segments, which deletes them completely from the metadata store and from deep storage. Applies only to the datasources allowed by `killDataSourceWhitelist`, or to all datasources when `killAllDataSources` is set.
+- `druid.coordinator.kill.period`: Defines the frequency in [ISO 8601 
format](https://en.wikipedia.org/wiki/ISO_8601#Durations) for the cleanup job 
to check for and delete eligible segments. Defaults to `P1D`. Must be greater 
than `druid.coordinator.period.indexingPeriod`. 
+- `druid.coordinator.kill.durationToRetain`: Defines the retention period in 
[ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations) after 
creation that segments become eligible for deletion.
+- `druid.coordinator.kill.maxSegments`: Defines the maximum number of segments 
to delete per kill task.
+>The kill task is the only configuration in this topic that affects actual 
data in deep storage and not simply metadata or logs.
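+
+Putting these properties together, a hypothetical `runtime.properties` fragment (the values are illustrative, not recommendations) that removes unused segments 90 days after creation:
+
+```properties
+# Enable Coordinator-submitted kill tasks for unused segments
+druid.coordinator.kill.on=true
+# Check for eligible segments once per day
+druid.coordinator.kill.period=P1D
+# Segments become eligible for deletion 90 days after creation
+druid.coordinator.kill.durationToRetain=P90D
+# Delete at most 1000 segments per kill task
+druid.coordinator.kill.maxSegments=1000
+```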
+
+### Audit records
+All audit records become eligible for deletion when the `durationToRetain` 
time has passed since their creation.
+
+Audit cleanup uses the following configuration:
+ - `druid.coordinator.kill.audit.on`: When `true`, enables cleanup for audit 
records.
+ - `druid.coordinator.kill.audit.period`: Defines the frequency in [ISO 8601 
format](https://en.wikipedia.org/wiki/ISO_8601#Durations) for the cleanup job 
to check for and delete eligible audit records. Defaults to `P1D`.
+ - `druid.coordinator.kill.audit.durationToRetain`: Defines the retention 
period in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations) 
after creation that audit records become eligible for deletion.
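+
+For example, a sketch that retains 30 days of audit records (the values are illustrative):
+
+```properties
+druid.coordinator.kill.audit.on=true
+druid.coordinator.kill.audit.period=P1D
+druid.coordinator.kill.audit.durationToRetain=P30D
+```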
+
+### Supervisor records
+Supervisor records become eligible for deletion when the supervisor is 
terminated and the `durationToRetain` time has passed since their creation.
+
+Supervisor cleanup uses the following configuration:
+ - `druid.coordinator.kill.supervisor.on`: When `true`, enables cleanup for 
supervisor records.
+ - `druid.coordinator.kill.supervisor.period`: Defines the frequency in [ISO 
8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations) for the cleanup 
job to check for and delete eligible supervisor records. Defaults to `P1D`.
+ - `druid.coordinator.kill.supervisor.durationToRetain`: Defines the retention 
period in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations) 
after creation that supervisor records become eligible for deletion.
+
+### Rules records
+Rule records become eligible for deletion when all segments for the datasource 
have been killed by the kill task and the `durationToRetain` time has passed 
since their creation. Automated cleanup for rules requires a [kill 
task](#kill-task).
+
+Rule cleanup uses the following configuration:
+ - `druid.coordinator.kill.rule.on`: When `true`, enables cleanup for rules 
records.
+ - `druid.coordinator.kill.rule.period`: Defines the frequency in [ISO 8601 
format](https://en.wikipedia.org/wiki/ISO_8601#Durations) for the cleanup job 
to check for and delete eligible rules records. Defaults to `P1D`.
+ - `druid.coordinator.kill.rule.durationToRetain`: Defines the retention 
period in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations) 
after creation that rules records become eligible for deletion.
+
+### Compaction configuration records
+Compaction configuration records in the `druid_config` table become eligible for deletion when all segments for the datasource have been killed by the kill task. Automated cleanup for compaction configuration requires a [kill task](#kill-task).
+
+Compaction configuration cleanup uses the following configuration:
+ - `druid.coordinator.kill.compaction.on`: When `true`, enables cleanup for compaction configuration records.
+ - `druid.coordinator.kill.compaction.period`: Defines the frequency in [ISO 
8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations) for the cleanup 
job to check for and delete eligible compaction configuration records. Defaults 
to `P1D`.
+
+>If you already have an extremely large compaction configuration, you may not 
be able to delete compaction configuration due to size limits with the audit 
log. In this case you can set `druid.audit.manager.maxPayloadSizeBytes` and 
`druid.audit.manager.skipNullField` to avoid the auditing issue. See [Audit 
logging](../configuration/index.md#audit-logging).
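+
+For example, a sketch that enables compaction configuration cleanup and raises the audit payload limit for very large compaction configurations (the values are illustrative):
+
+```properties
+druid.coordinator.kill.compaction.on=true
+druid.coordinator.kill.compaction.period=P1D
+# Workaround for large compaction configurations that exceed the audit payload limit
+druid.audit.manager.maxPayloadSizeBytes=10485760
+druid.audit.manager.skipNullField=true
+```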
+
+### Datasource records created by supervisors
+Datasource records created by supervisors become eligible for deletion when 
the supervisor is terminated or does not exist in the `druid_supervisors` table 
and the `durationToRetain` time has passed since their creation.
+
+Datasource cleanup uses the following configuration:
+ - `druid.coordinator.kill.datasource.on`: When `true`, enables cleanup of datasource records created by supervisors.
+ - `druid.coordinator.kill.datasource.period`: Defines the frequency in [ISO 
8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations) for the cleanup 
job to check for and delete eligible datasource records. Defaults to `P1D`.
+ - `druid.coordinator.kill.datasource.durationToRetain`: Defines the retention 
period in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601#Durations) 
after creation that datasource records become eligible for deletion.
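+
+The supervisor, rule, and datasource cleanup jobs follow the same property pattern, so they can be sketched together; a combined example with an illustrative 30-day retention (remember that rule cleanup also requires a kill task):
+
+```properties
+druid.coordinator.kill.supervisor.on=true
+druid.coordinator.kill.supervisor.period=P1D
+druid.coordinator.kill.supervisor.durationToRetain=P30D
+druid.coordinator.kill.rule.on=true
+druid.coordinator.kill.rule.period=P1D
+druid.coordinator.kill.rule.durationToRetain=P30D
+druid.coordinator.kill.datasource.on=true
+druid.coordinator.kill.datasource.period=P1D
+druid.coordinator.kill.datasource.durationToRetain=P30D
+```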
+
+### Indexer task logs
+The Druid Overlord handles task log metadata management.
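+
+Task log cleanup is configured on the Overlord through the task logging properties rather than the Coordinator `kill` properties; a sketch with illustrative values (note that this retention period is expressed in milliseconds, not ISO 8601; see the task logging section of the configuration reference for details):
+
+```properties
+# Enable deletion of old task logs and their metadata entries
+druid.indexer.logs.kill.enabled=true
+# Task logs become eligible for deletion 30 days (in milliseconds) after creation
+druid.indexer.logs.kill.durationToRetain=2592000000
+```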

Review comment:
       Can you add that this also deletes the actual task log file (the file stored locally, in S3, in other cloud storage, etc.)?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


