jtuglu-netflix commented on PR #18148:
URL: https://github.com/apache/druid/pull/18148#issuecomment-2979079796

   > I think this solves only a narrow class of problems and adds another 
parameter that may not see broader adoption. I do, however, agree that query 
scheduling at the data level is an area worth exploring and tinkering with, 
though I wonder if there are better ways to solve this problem. One solution 
that comes to mind is to use virtual threads of some sort. Right now, we have 
a processing thread pool that essentially dictates the compute capacity 
available for segment processing. But if these threads are doing a lot more IO 
than CPU work, that capacity is being wasted. Java recently gained virtual 
threads, and those could be used to run segment processing instead of directly 
using OS-level threads.
   > 
   > A higher-level comment is that we shouldn't make this change without 
some confidence that our solution makes life better for a good number of use 
cases. You should first build a test setup that can simulate query congestion 
at the data level, along with metrics that reflect the degree of congestion, 
throughput, and fairness. Once such a system is in place, that's when you can 
craft a few strategies and use your test setup to measure which strategy is 
the best.
   
   Regarding virtual threads: yes, I thought about this as well. However, they 
were introduced in JDK 21, which is not yet fully supported (AFAIK).
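
   To make the JDK constraint concrete, here is a minimal sketch (not Druid 
code) of how an executor could use virtual threads when the runtime offers 
them and fall back to a fixed pool of OS threads otherwise. The class name 
`ProcessingPoolSketch`, both helper methods, and the sleep that stands in for 
IO-bound segment work are assumptions made up for illustration; only the 
`java.util.concurrent` calls are real JDK APIs.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ProcessingPoolSketch {

    // Hypothetical helper: looks up Executors.newVirtualThreadPerTaskExecutor()
    // reflectively so this class still compiles and runs on JDKs older than 21.
    static ExecutorService newProcessingExecutor(int fallbackThreads) {
        try {
            Method factory = Executors.class.getMethod("newVirtualThreadPerTaskExecutor");
            return (ExecutorService) factory.invoke(null);
        } catch (ReflectiveOperationException e) {
            // Method missing or not usable on this runtime: use a fixed pool
            // of platform (OS-level) threads instead.
            return Executors.newFixedThreadPool(fallbackThreads);
        }
    }

    // Hypothetical stand-in for IO-heavy segment processing: each task only
    // sleeps, which is the case where a small fixed pool sits mostly idle.
    static int runSimulatedSegments(int segments, int fallbackThreads) throws Exception {
        ExecutorService exec = newProcessingExecutor(fallbackThreads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < segments; i++) {
                tasks.add(() -> {
                    Thread.sleep(10); // simulated blocking IO
                    return 1;
                });
            }
            int completed = 0;
            for (Future<Integer> f : exec.invokeAll(tasks)) {
                completed += f.get();
            }
            return completed;
        } finally {
            exec.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runSimulatedSegments(100, 4));
    }
}
```

   On a runtime without virtual threads, the 100 simulated tasks queue behind 
4 platform threads; with virtual threads, each blocking task gets its own 
cheap thread. This says nothing about CPU-bound segment work, which would 
still need some cap on parallelism.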


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
