jihoonson commented on a change in pull request #11501:
URL: https://github.com/apache/druid/pull/11501#discussion_r677779492
##########
File path: docs/ingestion/data-management.md
##########
@@ -95,18 +95,23 @@ A data deletion tutorial is available at [Tutorial:
Deleting data](../tutorials/
## Kill Task
-Kill tasks delete all information about a segment and removes it from deep
storage. Segments to kill must be unused (used==0) in the Druid segment table.
The available grammar is:
+Kill tasks delete all information about a segment and removes it from deep
storage. Segments to kill must be unused (used==0) in the Druid segment table.
Review comment:
Hmm maybe
```suggestion
The kill task deletes all information about segments and removes them from
deep storage. Segments to kill must be unused (used==0) in the Druid segment
table.
```
##########
File path: docs/ingestion/data-management.md
##########
@@ -95,18 +95,23 @@ A data deletion tutorial is available at [Tutorial:
Deleting data](../tutorials/
## Kill Task
-Kill tasks delete all information about a segment and removes it from deep
storage. Segments to kill must be unused (used==0) in the Druid segment table.
The available grammar is:
+Kill tasks delete all information about a segment and removes it from deep
storage. Segments to kill must be unused (used==0) in the Druid segment table.
+
+The available grammar is:
```json
{
"type": "kill",
"id": <task_id>,
"dataSource": <task_datasource>,
"interval" : <all_segments_in_this_interval_will_die!>,
+ "markAsUnused": <true|false>,
Review comment:
It seems worth to mention what the default is.
##########
File path:
server/src/main/java/org/apache/druid/metadata/IndexerSQLMetadataStorageCoordinator.java
##########
@@ -226,6 +226,33 @@ public void start()
return matchingSegments;
}
+ @Override
+ public int markSegmentsAsUnusedWithinInterval(String dataSource, Interval
interval)
+ {
+ int numSegmentsMarkedUnused = connector.retryTransaction(
+ (handle, status) -> {
+ return handle
+ .createStatement(
+ StringUtils.format(
+ "UPDATE %s SET used=false WHERE dataSource = :dataSource
"
+ + "AND start >= :start AND %2$send%2$s <= :end",
Review comment:
Hmm should the end be exclusive? I see
`IndexerSQLMetadataStorageCoordinator.retrieveUnusedSegmentsForInterval()` uses
the same filter of `"end" <= :end`, which seems to break the contract of
`IndexerMetadataStorageCoordinator.retrieveUnusedSegmentsForInterval()` because
the end time is exclusive for segment intervals. Maybe this hasn't caused much
troubles so far because `retrieveUnusedSegmentsForInterval` returns only unused
segments, even though it seems like a bug. But this method will unset the used
flag and thus probably should respect the exclusivity of the end time?
##########
File path: docs/ingestion/data-management.md
##########
@@ -95,18 +95,23 @@ A data deletion tutorial is available at [Tutorial:
Deleting data](../tutorials/
## Kill Task
-Kill tasks delete all information about a segment and removes it from deep
storage. Segments to kill must be unused (used==0) in the Druid segment table.
The available grammar is:
+Kill tasks delete all information about a segment and removes it from deep
storage. Segments to kill must be unused (used==0) in the Druid segment table.
+
+The available grammar is:
```json
{
"type": "kill",
"id": <task_id>,
"dataSource": <task_datasource>,
"interval" : <all_segments_in_this_interval_will_die!>,
+ "markAsUnused": <true|false>,
"context": <task context>
}
```
+If `markAsUnused` is true, the kill task will first mark any segments within
the specified interval as unused, before deleting the unused segments within
the interval.
Review comment:
Perhaps we could make this more scary because it cannot be undone once
they delete segments with `markAsUnused` set?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]