abhishekrb19 commented on code in PR #16719:
URL: https://github.com/apache/druid/pull/16719#discussion_r1673243028
##########
server/src/main/java/org/apache/druid/server/coordinator/duty/KillUnusedSegments.java:
##########
@@ -163,30 +190,44 @@ private DruidCoordinatorRuntimeParams runInternal(final
DruidCoordinatorRuntimeP
* Spawn kill tasks for each datasource in {@code dataSourcesToKill} upto
{@code availableKillTaskSlots}.
*/
private void killUnusedSegments(
- @Nullable final Collection<String> dataSourcesToKill,
+ final Set<String> dataSourcesToKill,
final int availableKillTaskSlots,
final CoordinatorRunStats stats
)
{
- if (CollectionUtils.isNullOrEmpty(dataSourcesToKill) ||
availableKillTaskSlots <= 0) {
+ if (CollectionUtils.isNullOrEmpty(dataSourcesToKill)) {
+ log.info("Skipping KillUnusedSegments because there are no datasources
to kill.");
stats.add(Stats.Kill.SUBMITTED_TASKS, 0);
return;
}
+ final Iterator<String> dataSourcesToKillIterator =
this.datasourceIterator.getIterator();
+ final Set<String> remainingDatasourcesToKill = new
HashSet<>(dataSourcesToKill);
+ final Set<String> datasourcesKilled = new HashSet<>();
- final Collection<String> remainingDatasourcesToKill = new
ArrayList<>(dataSourcesToKill);
int submittedTasks = 0;
- for (String dataSource : dataSourcesToKill) {
+ while (dataSourcesToKillIterator.hasNext()) {
Review Comment:
> I'm not sure if its possible that this.datasourceIterator gets updated by
another run of this duty in another thread,
Each duty
[group](https://github.com/apache/druid/blob/master/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java#L505)
is allocated a single scheduled executor with only
[1](https://github.com/apache/druid/blob/master/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java#L549)
thread. And each duty within a group run sequentially. So I don't think we
can have more than one duty running concurrently at the same time.
I think that running the duties concurrently would cause race conditions and
other issues. Also, I noticed the `RoundRobinServerSelector`
[is](https://github.com/apache/druid/blob/master/server/src/main/java/org/apache/druid/server/coordinator/loading/RoundRobinServerSelector.java#L47)
`@NotThreadSafe`, which is used by a few coordinator duties. It looks like
they're operating under the same assumption?
@kfaraz, could you please confirm this assumption?
I tried to find documentation on how the duties operate but couldn't find
much information. I think adding some code/public-facing documentation on this
would help clarify this. :-)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]