jasperjiaguo opened a new issue, #9055:
URL: https://github.com/apache/pinot/issues/9055

   Currently, the broker and server pre-empt a query at certain points during 
scheduling/execution when the remainingTime reaches 0. However, this is not 
sufficient for several use cases: 
   
   - When a user wants to initiate some ad-hoc queries for data analysis on a 
large production table but doesn't specify a tight data range. 
   - When unexpected slow queries are saturating CPU/memory on broker and 
servers.
   - When a large report generation query (especially one with a large 
group-by + order-by, select *, or other compute-intensive functions) is initiated.
   - ...
   
   Any of these cases can make a broker/server very slow or even bring down a 
host. Recovery can also take a long time, as the servers will struggle to 
finish the computation. 
   
   Our pre-emption logic can be extended on both brokers and servers to 
pre-empt queries based on two factors: 
   
   - CPU Time
   The broker currently has logic to return queries that have reached their 
timeout. However, this can be improved on both the broker and the server. We 
can potentially account for the total CPU workload on all worker threads on a 
server.
   
   - Memory consumption - to avoid out-of-memory errors
   We want to estimate the heap memory usage and pre-empt based on a threshold.
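
One way the CPU-time factor could be measured is sketched below. It is an illustration only, not existing Pinot code: `QueryCpuAccountant`, its methods, and the budget parameter are hypothetical names. It assumes each worker thread wraps its share of a query in `runAndCharge`, which reads the standard `java.lang.management.ThreadMXBean` per-thread CPU counter before and after the work and adds the difference to a counter shared by all threads of that query.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-query CPU accountant (illustrative name, not a Pinot class). */
final class QueryCpuAccountant {
  private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

  private final long cpuBudgetNanos;                            // assumed per-query budget
  private final AtomicLong consumedNanos = new AtomicLong();    // shared by all worker threads

  QueryCpuAccountant(long cpuBudgetNanos) {
    this.cpuBudgetNanos = cpuBudgetNanos;
  }

  /** Runs one unit of work on the calling thread and charges its CPU time to the query. */
  void runAndCharge(Runnable task) {
    long start = currentThreadCpuNanos();
    try {
      task.run();
    } finally {
      charge(Math.max(0L, currentThreadCpuNanos() - start));
    }
  }

  /** Adds CPU time measured elsewhere (for example on another worker thread). */
  void charge(long nanos) {
    consumedNanos.addAndGet(nanos);
  }

  long consumedNanos() {
    return consumedNanos.get();
  }

  /** True once the CPU time summed over all threads exceeds the budget. */
  boolean shouldPreempt() {
    return consumedNanos.get() > cpuBudgetNanos;
  }

  private static long currentThreadCpuNanos() {
    // Treat "unsupported" or "disabled" (-1) as zero, so nothing is charged.
    if (!THREADS.isCurrentThreadCpuTimeSupported()) {
      return 0L;
    }
    long nanos = THREADS.getCurrentThreadCpuTime();
    return nanos < 0 ? 0L : nanos;
  }
}
```

A worker would call `shouldPreempt()` at the same checkpoints where the timeout is checked today. Reading the per-thread counter has a cost of its own, so how often to sample it is part of the overhead question raised in this issue.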
   
   We would like to find the right instrumentation for measuring these factors 
and improve pre-emption with the least overhead.
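
For the memory factor, a minimal sketch of a threshold check is shown here, assuming the JVM-wide heap figure from the standard `java.lang.management.MemoryMXBean` is an acceptable first estimate. `HeapPreemptionCheck` and the threshold value are hypothetical and chosen for illustration; nothing here is existing Pinot code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical heap-usage check (illustrative name, not a Pinot class). */
final class HeapPreemptionCheck {
  private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

  private final double threshold;   // fraction of max heap, e.g. 0.85 (assumed value)

  HeapPreemptionCheck(double threshold) {
    if (threshold <= 0.0 || threshold > 1.0) {
      throw new IllegalArgumentException("threshold must be in (0, 1]");
    }
    this.threshold = threshold;
  }

  /** Pure comparison, kept separate so it can be tested without a real heap. */
  static boolean exceeds(long usedBytes, long maxBytes, double threshold) {
    if (maxBytes <= 0) {
      return false;   // MemoryUsage.getMax() is -1 when the maximum is undefined
    }
    return (double) usedBytes / maxBytes > threshold;
  }

  /** True when current heap usage is above the configured fraction of max heap. */
  boolean shouldPreempt() {
    MemoryUsage heap = MEMORY.getHeapMemoryUsage();
    return exceeds(heap.getUsed(), heap.getMax(), threshold);
  }
}
```

This figure covers the whole JVM and includes garbage not yet collected, so it says that the host is under pressure but not which query is responsible. On HotSpot JVMs, `com.sun.management.ThreadMXBean.getThreadAllocatedBytes` may be one option for attributing allocation to individual worker threads, at the cost of relying on a vendor-specific API.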


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

