yupeng9 opened a new issue #5627:
URL: https://github.com/apache/incubator-pinot/issues/5627
I want to open a discussion about how to add query-level priority and
control. Although Pinot supports tenants at the server and broker level, we
often need finer-grained control at the query level.
In production, we are facing the following common challenges:
- Bad queries. We have seen expensive queries, written by mistake or carelessly
by inexperienced users, scan a very large number of segments, causing high GC
pressure and ZooKeeper timeouts, and eventually bringing down the servers in
the entire tenant.
- Query priorities. Not all queries have the same priority. We observe two
main categories of queries today: 1) fixed queries that run periodically, for
example, to power dashboards; 2) ad-hoc queries to derive insights. The latter
category is often less important, but its workload is unpredictable and can
therefore breach the SLA of the first category.
I'd like to hear your thoughts on solving this type of challenge. While it's
possible to explore formal resource-based quota control at a finer granularity,
as other query engines such as Presto or Hive/YARN do, I'm also interested in
quick solutions to the first challenge (bad queries), which can happen often
and easily. For example, can we reject expensive queries based on a cost
estimate at the query planning phase? Or can we provide tools to identify
expensive in-flight queries and kill them?
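
As a rough illustration of the first idea, a cost-based admission check at
planning time might look like the sketch below. It is written in Python for
brevity and is not Pinot code: `SegmentStats`, `CostLimits`, `QueryRejected`,
`estimate_cost`, and `admit` are all hypothetical names, and the cost model
(segment count plus total document count) is only an assumption about what a
cheap pre-execution estimate could use.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class SegmentStats:
    """Hypothetical per-segment metadata available before execution."""
    name: str
    num_docs: int


@dataclass(frozen=True)
class CostLimits:
    """Hypothetical per-query limits, e.g. configured per tenant or table."""
    max_segments: int
    max_docs: int


class QueryRejected(Exception):
    """Raised when the estimated cost exceeds the configured limits."""


def estimate_cost(segments: Iterable[SegmentStats]) -> Tuple[int, int]:
    """Return (segments to scan, total docs) for the segments left after pruning."""
    segments = list(segments)
    return len(segments), sum(s.num_docs for s in segments)


def admit(segments: Iterable[SegmentStats], limits: CostLimits) -> Tuple[int, int]:
    """Reject the query before execution if its estimated cost is too high."""
    num_segments, num_docs = estimate_cost(segments)
    if num_segments > limits.max_segments:
        raise QueryRejected(
            f"query would scan {num_segments} segments (limit {limits.max_segments})"
        )
    if num_docs > limits.max_docs:
        raise QueryRejected(
            f"query would scan {num_docs} docs (limit {limits.max_docs})"
        )
    return num_segments, num_docs


if __name__ == "__main__":
    limits = CostLimits(max_segments=100, max_docs=1_000_000)
    small = [SegmentStats(f"seg_{i}", 1_000) for i in range(10)]
    large = [SegmentStats(f"seg_{i}", 1_000) for i in range(5_000)]
    print(admit(small, limits))
    try:
        admit(large, limits)
    except QueryRejected as exc:
        print(f"rejected: {exc}")
```

A check like this only catches queries whose cost is visible from metadata
after pruning. Queries that turn out to be expensive during execution would
still need the second mechanism, a way to identify and kill in-flight queries.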