Re: [PR] Implement per-segment query timeout on data nodes (druid)

via GitHub Mon, 16 Jun 2025 23:09:00 -0700


abhishekagarwal87 commented on PR #18148:
URL: https://github.com/apache/druid/pull/18148#issuecomment-2979064473


   I think this solves only a narrow class of problems and adds another 
parameter that may not see broad adoption. I do, however, agree that query 
scheduling at the data node level is an area worth exploring and tinkering 
with, though I wonder if there are better ways to solve this problem. One 
solution that comes to mind is to use virtual threads of some sort. Right now, 
we have a processing thread pool that essentially dictates the compute 
capacity available for segment processing. But if these threads are doing a 
lot more IO than CPU work, that capacity is being wasted. Java has recently 
gained support for virtual threads, and those could be used to run segment 
processing instead of directly using OS-level threads. 
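
   To illustrate the virtual-thread idea, here is a minimal sketch, assuming 
Java 21 or later. It is not Druid code: `VirtualThreadSegmentSketch`, 
`processSegment`, and `runOnVirtualThreads` are hypothetical names, and the 
`Thread.sleep` call only stands in for IO-bound segment work. 

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class VirtualThreadSegmentSketch {

    // Hypothetical stand-in for per-segment work that mostly waits on IO.
    static int processSegment(int segmentId) throws InterruptedException {
        // Simulated blocking IO. While a virtual thread is blocked here,
        // its carrier (OS) thread is free to run other virtual threads.
        Thread.sleep(50);
        return segmentId * 2;
    }

    // Runs one virtual thread per segment instead of queueing the work
    // on a fixed-size pool of OS threads.
    static List<Integer> runOnVirtualThreads(int segmentCount) throws Exception {
        List<Integer> results = new ArrayList<>();
        // ExecutorService is AutoCloseable; close() waits for submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < segmentCount; i++) {
                final int id = i;
                futures.add(executor.submit(() -> processSegment(id)));
            }
            for (Future<Integer> future : futures) {
                results.add(future.get());
            }
        }
        return results;
    }
}
```

   Note that virtual threads by themselves do not cap CPU-bound concurrency, 
so a real design would probably still need some limit (a semaphore, for 
example) on the CPU-heavy parts of segment processing. 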
   
   A higher-level comment is that we shouldn't make this change without some 
confidence that the solution makes life better for a good number of use 
cases. You should first build a test setup that can simulate query congestion 
at the data node level, along with metrics that reflect the degree of 
congestion, throughput, and fairness. Once such a system is in place, you can 
craft a few strategies and use the test setup to measure which strategy is 
best. 
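
   As a rough sketch of what such metrics could look like, the snippet below 
computes throughput, a simple congestion ratio, and Jain's fairness index. 
The choice of Jain's index and of "share of time spent queued" as the 
congestion measure are assumptions made for illustration, and 
`CongestionMetricsSketch` and its methods are hypothetical names, not Druid 
APIs. 

```java
public class CongestionMetricsSketch {

    // Completed queries per second over the measurement window.
    static double throughputPerSecond(long completedQueries, long elapsedMillis) {
        return completedQueries * 1000.0 / elapsedMillis;
    }

    // Share of total time that work spent queued rather than processing.
    // 0.0 means no waiting; values near 1.0 mean heavy congestion.
    static double congestionRatio(long queuedMillis, long processingMillis) {
        long total = queuedMillis + processingMillis;
        return total == 0 ? 0.0 : (double) queuedMillis / total;
    }

    // Jain's fairness index over per-tenant (or per-query-class) throughput:
    // (sum x)^2 / (n * sum x^2). It is 1.0 when all shares are equal and
    // 1/n when a single tenant gets everything.
    static double jainFairness(double[] perTenantThroughput) {
        double sum = 0.0;
        double sumOfSquares = 0.0;
        for (double x : perTenantThroughput) {
            sum += x;
            sumOfSquares += x * x;
        }
        if (sumOfSquares == 0.0) {
            return 1.0; // nobody got anything, which is trivially equal
        }
        return (sum * sum) / (perTenantThroughput.length * sumOfSquares);
    }
}
```

   Each candidate strategy could then be run against the same simulated 
workload and compared on these three numbers. 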


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


