Re: [PR] Track and emit segment loading rate for HttpLoadQueuePeon on Coordinator (druid)

via GitHub Wed, 10 Jul 2024 05:00:05 -0700


kfaraz commented on PR #16691:
URL: https://github.com/apache/druid/pull/16691#issuecomment-2220328893


   Thanks for the feedback, @AmatyaAvadhanula !
   
   > Would a simpler approach such as [sum(segment size) / time] across successful loads in a coordinator cycle not be sufficient?
   
   Segment loads are not tied to a coordinator cycle.
   
   "Coordinator cycle" or "coordinator run" simply refers to a single invocation of a duty
   like `RunRules` or `BalanceSegments`. After the duty has run and assigned a bunch
   of segments to the load queue, the segments may take any amount of time to finish loading.
   
   While summing up the sizes of successfully loaded segments is trivial,
   the definition of _time elapsed_ is what complicates the whole logic.
   
   Problems:
   1. We want some kind of a moving average.
   2. Segments assigned in one coordinator run may remain in the queue for several runs.
   So what are the start time and end time?
   3. While there are already segments in the queue, the next coordinator run may assign more segments.
   How would this affect the start time and end time?

   ---
   The simplest (and most intuitive) thing to do would be to track the load time of each segment
   individually. I actually started out doing this. Start time would be the time when the request to
   load that segment is first sent to the server. End time would be when the request succeeds.
   
   This design alternative has been alluded to in the PR description as well.
   
   __But this would be incorrect,__ since while a segment is being loaded on the historical by one thread,
   another thread could be loading another segment.
   
   In other words, _the segment load durations are not mutually exclusive,_ so we can't simply sum them up.
   If we did, the overlapping time would be counted more than once, and the computed loading rate
   would be lower than the actual rate (not the end of the world, but still).
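
   To make the overlap problem concrete, here is a minimal sketch (not code from this PR) that compares the summed per-segment durations with the wall-clock time during which at least one load was in flight. The class `LoadRateSketch`, its nested `Load` type, and all method names are hypothetical and exist only for illustration.

   ```java
   import java.util.ArrayList;
   import java.util.Comparator;
   import java.util.List;

   // Hypothetical sketch, not Druid code: two definitions of "time elapsed".
   final class LoadRateSketch
   {
     // One successful load: request sent at startMillis, succeeded at endMillis.
     static final class Load
     {
       final long startMillis;
       final long endMillis;
       final long sizeBytes;

       Load(long startMillis, long endMillis, long sizeBytes)
       {
         this.startMillis = startMillis;
         this.endMillis = endMillis;
         this.sizeBytes = sizeBytes;
       }
     }

     // Naive: add up each segment's own duration, so overlapping time is counted repeatedly.
     static long summedDurationMillis(List<Load> loads)
     {
       long total = 0;
       for (Load load : loads) {
         total += load.endMillis - load.startMillis;
       }
       return total;
     }

     // Union of the load intervals: time during which at least one load was in flight.
     static long busyTimeMillis(List<Load> loads)
     {
       List<Load> sorted = new ArrayList<>(loads);
       sorted.sort(Comparator.comparingLong((Load load) -> load.startMillis));
       long busy = 0;
       long coveredUntil = Long.MIN_VALUE;
       for (Load load : sorted) {
         // Only count the part of this interval not already covered by earlier ones.
         long from = Math.max(load.startMillis, coveredUntil);
         if (load.endMillis > from) {
           busy += load.endMillis - from;
           coveredUntil = load.endMillis;
         }
       }
       return busy;
     }

     static long totalBytes(List<Load> loads)
     {
       long total = 0;
       for (Load load : loads) {
         total += load.sizeBytes;
       }
       return total;
     }

     static double bytesPerSecond(long bytes, long millis)
     {
       return bytes * 1000.0 / millis;
     }
   }
   ```

   With two 100 MB segments loaded in parallel over the same 10 seconds, the summed duration is 20 seconds, which gives 10 MB/s, while the busy time is 10 seconds, which gives 20 MB/s.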
   
   That said, if there is only one loading thread on the server (which is often the case), then the naive logic works just fine.
   The number of loading threads is computed as:
   ```
   numLoadingThreads = Math.max(1, JvmUtils.getRuntimeInfo().getAvailableProcessors() / 6)
   ```
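
   As a quick illustration of why a single loading thread is common, the same arithmetic can be tried for a few processor counts. The helper below is hypothetical: it takes the processor count as a plain `int` parameter instead of calling `JvmUtils`, and assumes integer division as in the expression above.

   ```java
   // Hypothetical helper, not Druid code: same arithmetic as the expression above,
   // with the processor count passed in so it can be tried for several values.
   final class LoadingThreadsSketch
   {
     static int numLoadingThreads(int availableProcessors)
     {
       // Integer division: fewer than 6 processors gives 0, which max() lifts to 1.
       return Math.max(1, availableProcessors / 6);
     }
   }
   ```

   Under this formula, a server needs at least 12 available processors before it gets a second loading thread.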
   
   ---
   
   Let me know what you think.
   If you feel this seems too complicated and we could get away with the naive logic for now,
   I can just do that and save this convoluted design for a rainy day 😂. Once we have seen
   the feature in action, we will know for sure. In the future, if the Coordinator could know the number
   of loading threads on the server, we could just multiply the computed rate by the number of threads
   to offset the effect of summing up the times.
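
   A minimal sketch of that possible correction, assuming the Coordinator somehow knew the thread count (it does not today). The class and method names are hypothetical. Note the assumption baked in: the correction is exact only if all loading threads were busy for the whole period. If the loads actually ran one after another, the naive rate is already right and multiplying would overstate it.

   ```java
   // Hypothetical sketch, not Druid code: offsetting the summed-duration rate by thread count.
   final class ThreadCorrectionSketch
   {
     // Naive rate: total bytes divided by the sum of per-segment load durations.
     static double naiveBytesPerSecond(long totalBytes, long summedDurationMillis)
     {
       return totalBytes * 1000.0 / summedDurationMillis;
     }

     // Assumes every loading thread was busy throughout; otherwise this overestimates.
     static double correctedBytesPerSecond(double naiveBytesPerSecond, int numLoadingThreads)
     {
       return naiveBytesPerSecond * numLoadingThreads;
     }
   }
   ```

   For example, 200 MB loaded with a summed duration of 20 seconds gives a naive 10 MB/s, which becomes 20 MB/s after multiplying by 2 threads.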
   
   cc: @abhishekrb19 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

