suneet-s commented on PR #14662:
URL: https://github.com/apache/druid/pull/14662#issuecomment-1659564965

   @zachjsh  - First, thank you for making improvements in this area! I have a 
few comments and I have not yet read the code in detail. I think some 
additional details in the PR description about how you envision these changes 
to be used will be helpful in reviewing the patch.
   
   > > @zachjsh I really suggest breaking up this PR into smaller PRs where 
each PR captures a single improvement. (Also when a PR title is `Various Auto 
kill improvements` it really is telling you nothing much)
   > 
   > @maytasm , I changed the title to capture the change more succinctly; here 
we are adding a new kill type task which allows a list of provided segmentIds 
to be killed, and this new task type is used in the new coordinator duty.
   
   @zachjsh I agree with @maytasm. It would be good to split this patch up into 
multiple PRs to make it easier to review.
   
   > Also added a CoordinatorDynamicConfig property `killTaskSlotRatio`, which 
allows the user to control the maximum number of kill task slots in use in the 
cluster, as spawned by Druid's auto kill coordinator duty. This considers the 
total task capacity in the cluster, with autoscaling if configured, and allows 
for only that respective number of active kill tasks at a given time in the 
cluster, as spawned by the auto kill duty.
   
   ^ This part of the patch seems substantial enough to be it's own PR, and 
should definitely be called out in the release notes, but as written, it is 
very easy to miss this. wrt `killTaskSlotRatio`, have you considered combining 
this with the `compactTaskSlotRatio` so that an operator has one knob to tune 
how much capacity in the cluster should be given to "data management" 
operations? Compaction tasks also offer a maxCompactTaskSlot config, if we do 
decide to have a different set of knobs for kill tasks - is there a reason not 
to also introduce a maxKillTaskSlots config?
   
   Since we are now introducing a config to limit the number of concurrent kill 
tasks - is there any reason the kill duty should not run more frequently, say 
every minute by default?
   
   Are there any guardrails preventing an operator from setting the task slots 
used for data management operations higher than the available capacity of the 
cluster?
   
   > Implement new Kill_by_segment_ids task and use in auto kill coordinator 
duty
   
   WRT the `kill_by_segment_id` task - Why does this task need a different name 
from the kill task? Have you considered making the kill task accept an interval 
and a limit instead of a list of segment ids? This seems like an easier 
interface for operators to use and reason about.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to