abhishekrb19 commented on PR #16691:
URL: https://github.com/apache/druid/pull/16691#issuecomment-2259455745

   @kfaraz, apologies for the delay in getting back. 
   
   The docs recommend having at least 16 
[vCPUs](https://druid.apache.org/docs/latest/tutorials/cluster/#data-server) 
for data servers, so there will be at least 2 loading threads by default in 
production clusters. As to how much overlap there is between the time spent by 
loading threads, I'm not sure. Here are a few exploratory thoughts/ideas to 
simplify and track this more accurately:
   
   1. How about tracking the load rate directly in the historicals/data 
servers? I see you have listed that as a potential approach for the future. 
Besides it being useful and more accurate, I think it's also relatively 
straightforward to implement. Given that the `SegmentLoadDropHandler` code 
already processes batches of change requests, we could piggyback on that logic 
to add the tracking there. This way, we also wouldn't introduce another notion 
of "batch" in the coordinator's `HttpLoadQueuePeon` if we decide to revive 
that idea. 
   
   I think one downside to this is that the rate computed on the historicals 
won't account for the end-to-end time (e.g., time a request spends waiting in 
the load queue). If that turns out to be significant, we could perhaps track a 
separate metric on the coordinator?
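
   For illustration, here's a rough sketch of the kind of per-server tracking I have in mind (the class and method names are hypothetical, not existing Druid code):

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not actual Druid code: a tracker that batch-processing
// code like SegmentLoadDropHandler could update after each segment load,
// exposing an average load rate for the server.
class LoadRateTracker
{
  private final AtomicLong totalBytes = new AtomicLong();
  private final AtomicLong totalMillis = new AtomicLong();

  // Record one completed segment load: size in bytes and wall-clock duration.
  public void onSegmentLoaded(long bytes, long millis)
  {
    totalBytes.addAndGet(bytes);
    totalMillis.addAndGet(millis);
  }

  // Average load rate in KB/s; 0 until something has been tracked.
  public long getLoadRateKbps()
  {
    long millis = totalMillis.get();
    return millis == 0 ? 0 : totalBytes.get() * 1000 / (millis * 1024);
  }
}
```

   Note that naively summing wall-clock durations across concurrent loading threads overcounts elapsed time, which is exactly the overlap question above, so the durations may need to be measured per batch rather than per segment.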
   
   2. If we want to compute the aggregated rate from the coordinator, we could 
perhaps expose an internal API on the historicals that the coordinator can then 
query to get the requested info (# of loading threads, # of segment batches 
processed, etc.) if they're available. However, I think this approach might be 
overkill.
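
   To illustrate option 2, a rough sketch of how the coordinator could aggregate per-server totals returned by such an internal API (all names here are illustrative, not an existing API):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: aggregate per-server load stats into a cluster-wide
// rate by dividing total bytes by total busy time, rather than averaging the
// per-server rates, so slow servers are weighted by the time they spend.
class LoadStatsAggregator
{
  static class ServerLoadStats
  {
    final long bytesLoaded;
    final long millisSpent;

    ServerLoadStats(long bytesLoaded, long millisSpent)
    {
      this.bytesLoaded = bytesLoaded;
      this.millisSpent = millisSpent;
    }
  }

  // Cluster-wide average load rate in KB/s: total bytes over total busy time.
  static long aggregateRateKbps(List<ServerLoadStats> stats)
  {
    long bytes = 0;
    long millis = 0;
    for (ServerLoadStats s : stats) {
      bytes += s.bytesLoaded;
      millis += s.millisSpent;
    }
    return millis == 0 ? 0 : bytes * 1000 / (millis * 1024);
  }
}
```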
   
   Please let me know what you think.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
