jihoonson commented on code in PR #10877:
URL: https://github.com/apache/druid/pull/10877#discussion_r864264390


##########
docs/configuration/index.md:
##########
@@ -812,7 +812,8 @@ These Coordinator static configurations can be defined in 
the `coordinator/runti
 |`druid.coordinator.kill.pendingSegments.on`|Boolean flag for whether or not 
the Coordinator clean up old entries in the `pendingSegments` table of metadata 
store. If set to true, Coordinator will check the created time of most recently 
complete task. If it doesn't exist, it finds the created time of the earliest 
running/pending/waiting tasks. Once the created time is found, then for all 
dataSources not in the `killPendingSegmentsSkipList` (see [Dynamic 
configuration](#dynamic-configuration)), Coordinator will ask the Overlord to 
clean up the entries 1 day or more older than the found created time in the 
`pendingSegments` table. This will be done periodically based on 
`druid.coordinator.period.indexingPeriod` specified.|true|
 |`druid.coordinator.kill.on`|Boolean flag for whether or not the Coordinator 
should submit kill task for unused segments, that is, hard delete them from 
metadata store and deep storage. If set to true, then for all whitelisted 
dataSources (or optionally all), Coordinator will submit tasks periodically 
based on `period` specified. These kill tasks will delete all unused segments 
except for the last `durationToRetain` period. A whitelist can be set via 
dynamic configuration `killDataSourceWhitelist` described later.|true|
 |`druid.coordinator.kill.period`|How often to send kill tasks to the indexing 
service. Value must be greater than `druid.coordinator.period.indexingPeriod`. 
Only applies if kill is turned on.|P1D (1 Day)|
-|`druid.coordinator.kill.durationToRetain`| Do not kill unused segments in 
last `durationToRetain`, must be greater or equal to 0. Only applies and MUST 
be specified if kill is turned on.|`P90D`|
+|`druid.coordinator.kill.durationToRetain`|Only applies if you set 
`druid.coordinator.kill.on` to `true`. This value is ignored if 
`druid.coordinator.kill.ignoreDurationToRetain` is `true`. Valid configurations 
must be a ISO8601 period. Druid will not kill unused segments whose interval 
end date is beyond `now - durationToRetain`. `durationToRetain` can be a 
negative ISO8601 period, which would result in `now - durationToRetain` to be 
in the future.|`P90D`|

Review Comment:
   Good question. I would like to avoid adding a new column in `druid_segments` 
table if possible since it will complicate the migration process. We can 
perhaps add a new field in `DataSegment` instead which is stored in the 
`payload` column in the `druid_segments` table. This will increase the cost of 
`SqlSegmentsMetadataManager.getUnusedSegmentIntervals()` since it will no 
longer be able to use the index on the end date, but have to read each row, 
deserialize `payload`, and check the last update time in it. I guess it would 
be worth to run some benchmark to see how expensive this would be. I guess it 
will be OK if this ends up a bit expensive as the coordinator doesn't do much 
work. But, if you think this is going to be too expensive, then I think we have 
to either add a new column in `druid_segments` table or just stick to your 
change in this PR. If we are going to add a new column, we will have to 
document a right upgrade/downgrade path. Providing a script for auto migration 
woul
 d be nice, but not mandatory. 
   
   Another thing we should think about is what the last update time will be for 
existing unused segments as the new field will be missing for them. Perhaps we 
can keep the current behavior for the segments that the last update time is 
missing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to