jasperjiaguo opened a new issue, #9055: URL: https://github.com/apache/pinot/issues/9055
Currently, the broker and server pre-empt a query at certain points during scheduling/execution when the remainingTime reaches 0. However, this is not sufficient for several use cases:

- When a user initiates ad-hoc queries for data analysis on a large production table but doesn't specify a tight data range.
- When unexpected slow queries saturate CPU/memory on brokers and servers.
- When a large report-generation query (especially one with a large group-by + order-by, select *, or other compute-intensive functions) is initiated.
- ...

Any of these cases can make a broker/server very slow or bring down a host. Meanwhile, recovery can take a long time because the servers struggle to finish the computation.

Our pre-emption logic can be extended on both brokers and servers to pre-empt queries based on two factors:

- CPU time
  The broker currently has logic to return queries that have reached their timeout. However, this can be improved at the broker/server. We can potentially account for the total CPU workload on all worker threads on a server.
- Memory consumption (to avoid out-of-memory errors)
  We want to estimate the heap memory usage and pre-empt based on a threshold.

We would like to find the right instrument for measuring these factors and improve pre-emption with the least overhead.
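To make the two factors concrete, here is a minimal sketch of how they could be sampled on the JVM with the standard `java.lang.management` beans. The class `QueryResourceGuard`, its method names, and the budget/threshold values are hypothetical and are not existing Pinot code. It tracks CPU time for a single thread only; accounting for all worker threads of a query would need per-thread aggregation on top of this.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for illustration only; not part of Pinot.
final class QueryResourceGuard {
  private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();
  private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

  private final long cpuBudgetNanos;       // assumed per-query CPU budget
  private final double heapUsageThreshold; // assumed fraction of max heap, e.g. 0.95
  private final long startCpuNanos;        // CPU time of the creating thread, or -1

  QueryResourceGuard(long cpuBudgetNanos, double heapUsageThreshold) {
    this.cpuBudgetNanos = cpuBudgetNanos;
    this.heapUsageThreshold = heapUsageThreshold;
    this.startCpuNanos = currentThreadCpuNanos();
  }

  /** CPU time consumed by the calling thread in nanoseconds, or -1 if unavailable. */
  static long currentThreadCpuNanos() {
    if (!THREADS.isCurrentThreadCpuTimeSupported() || !THREADS.isThreadCpuTimeEnabled()) {
      return -1L;
    }
    return THREADS.getCurrentThreadCpuTime();
  }

  /** Fraction of the maximum heap currently in use; 0.0 if the maximum is undefined. */
  static double heapUsageRatio() {
    MemoryUsage heap = MEMORY.getHeapMemoryUsage();
    long max = heap.getMax();
    return max > 0 ? (double) heap.getUsed() / max : 0.0;
  }

  /**
   * CPU nanoseconds used since construction; 0 if CPU time is unavailable.
   * Must be called on the same thread that created the guard.
   */
  long cpuUsedNanos() {
    long now = currentThreadCpuNanos();
    return (startCpuNanos < 0 || now < 0) ? 0L : now - startCpuNanos;
  }

  /** True once either the CPU budget or the heap threshold is exceeded. */
  boolean shouldPreempt() {
    return cpuUsedNanos() > cpuBudgetNanos || heapUsageRatio() > heapUsageThreshold;
  }

  public static void main(String[] args) {
    // Assumed limits: 50 ms of CPU time, 95% of max heap.
    QueryResourceGuard guard = new QueryResourceGuard(50_000_000L, 0.95);
    long checksum = 0;
    long iterations = 0;
    // Simulated query work; the loop is bounded in case CPU timing is unavailable.
    while (iterations < 2_000_000_000L) {
      checksum += iterations * 31;
      iterations++;
      // Sample only every 2^20 iterations to keep the check overhead low.
      if ((iterations & 0xFFFFF) == 0 && guard.shouldPreempt()) {
        break;
      }
    }
    System.out.println("stopped after " + iterations + " iterations, cpuUsedNanos="
        + guard.cpuUsedNanos() + ", heapUsageRatio=" + heapUsageRatio()
        + ", checksum=" + checksum);
  }
}
```

Sampling at a coarse interval, as in the loop above, is one way to keep the measurement overhead small; what the right interval and thresholds are is exactly the open question in this issue.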
