gianm commented on issue #6993: [Proposal] Dynamic prioritization and laning
URL: 
https://github.com/apache/incubator-druid/issues/6993#issuecomment-488192666
 
 
   @sascha-coenen, thanks for your feedback. After reading it, it sounds to me 
like most of it can be done as additions to my original (and intentionally 
minimalist) proposal, so I'm not concerned that the first step would be 
misaligned with a more sophisticated solution. But would you please let me 
know what you think, and in particular, what you think of the comments below?
   
   **On resource pools,**
   I like the resource pool idea as a way of incorporating extrinsic 
information. I think it maps nicely onto how people want to think about query 
priorities: numbers don't really make sense, but `interactive` or `api` or 
`scheduled-report` do make sense. It's more of a pain to configure, though. 
Rather than a single number on the query context, there's an indirection from 
the query context to an identifier defined somewhere else (broker 
configuration? metadata db?).
   
   But even though it's a pain, I think it's a useful feature. In the context 
of my original proposal, resource pools would be a good place to define 
overrides for default priority and for the `druid.broker.priority.*` 
properties. (You might have more aggressive or more lenient thresholds for 
different kinds of queries.)
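
   As a rough sketch of that indirection (Java, with every name hypothetical: 
the `queryClass` context key, the `Overrides` and `Settings` types, and a 
single `thresholdMillis` value standing in for whatever the 
`druid.broker.priority.*` properties end up controlling), the broker could 
resolve a named group to its overrides and fall back to the systemwide 
defaults for anything the group does not set:

```java
import java.util.HashMap;
import java.util.Map;

public class QueryClassSketch {
    /** Hypothetical per-group overrides; a null field means "use the systemwide default". */
    public static final class Overrides {
        final Integer defaultPriority;
        final Long thresholdMillis;

        public Overrides(Integer defaultPriority, Long thresholdMillis) {
            this.defaultPriority = defaultPriority;
            this.thresholdMillis = thresholdMillis;
        }
    }

    /** Effective settings after overrides are applied. */
    public static final class Settings {
        final int defaultPriority;
        final long thresholdMillis;

        public Settings(int defaultPriority, long thresholdMillis) {
            this.defaultPriority = defaultPriority;
            this.thresholdMillis = thresholdMillis;
        }
    }

    /** Looks up the group named in the query context and layers its overrides on the defaults. */
    public static Settings resolve(
        Map<String, Object> queryContext,
        Map<String, Overrides> groups,
        Settings systemDefaults
    ) {
        Object name = queryContext.get("queryClass"); // hypothetical context key
        Overrides o = name == null ? null : groups.get(name.toString());
        if (o == null) {
            return systemDefaults; // no group named, or unknown group
        }
        return new Settings(
            o.defaultPriority != null ? o.defaultPriority : systemDefaults.defaultPriority,
            o.thresholdMillis != null ? o.thresholdMillis : systemDefaults.thresholdMillis
        );
    }

    public static void main(String[] args) {
        Map<String, Overrides> groups = new HashMap<>();
        groups.put("interactive", new Overrides(10, null));        // only priority overridden
        groups.put("scheduled-report", new Overrides(-5, 60000L)); // both overridden

        Map<String, Object> context = new HashMap<>();
        context.put("queryClass", "interactive");

        Settings s = resolve(context, groups, new Settings(0, 5000L));
        System.out.println(s.defaultPriority + " " + s.thresholdMillis);
    }
}
```

   Where the group definitions live (broker configuration or the metadata db) 
is left open here; the sketch just takes them as a map.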
   
   You're right about the name: it's not a good one based on how you've 
conceived it (it's not really a pool of resources, it's more like a way to 
group together queries that should be treated similarly).
   
   I don't think they'd need to be added immediately; it makes sense to me to 
do them as a follow-on that lets you override the systemwide defaults.
   
   **On incorporating multiple sources of intrinsic information,**
   My proposal is extraordinarily basic in the intrinsic information it 
incorporates (intentionally so, to keep the initial implementation simple), 
but I hope more can be added later. I think all the stuff in your proposal 
could be done as follow-ons if we wanted. Since mine is so simple, it won't 
leave much legacy cruft behind that a more sophisticated system would have to 
deal with. Just the `druid.broker.priority.*` properties.
   
   **On rejecting queries,**
   I think this is the biggest area where I'm not sure how to align your 
proposal with mine.
   
   The main conflict I see is that you envision not needing to reject queries, 
but I think we do need to reject queries, if only to prevent HTTP server thread 
starvation. Due to this issue, I don't think we can do more than two 'lanes': a 
fast lane (queries in this category are always accepted for processing) and a 
slow lane (queries are rejected if the lane is 'full'). This constrains how 
beautiful the solution can be, since a binary decision must be made by the 
broker (reject a query, or accept it and begin execution).
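
   To make the binary decision concrete, here is a minimal sketch (Java; the 
class name, the `Lane` enum, and the idea of a fixed slow-lane capacity are 
all assumptions for illustration, not an existing Druid API). The fast lane is 
always admitted, the slow lane is admitted only while it has free capacity, 
and the check never blocks, so an HTTP server thread is never parked waiting 
for a slot:

```java
import java.util.concurrent.Semaphore;

public class TwoLaneAdmissionSketch {
    public enum Lane { FAST, SLOW }

    // Hypothetical fixed capacity for the slow lane; the fast lane is unbounded here.
    private final Semaphore slowLanePermits;

    public TwoLaneAdmissionSketch(int slowLaneCapacity) {
        this.slowLanePermits = new Semaphore(slowLaneCapacity);
    }

    /**
     * Binary, non-blocking decision: returns true if the query is accepted and
     * may begin execution, false if it should be rejected immediately.
     */
    public boolean tryAdmit(Lane lane) {
        if (lane == Lane.FAST) {
            return true; // fast-lane queries are always accepted
        }
        return slowLanePermits.tryAcquire(); // false when the slow lane is full
    }

    /**
     * Must be called exactly once for each admitted query, when it finishes
     * or is canceled, so that its slow-lane slot becomes available again.
     */
    public void release(Lane lane) {
        if (lane == Lane.SLOW) {
            slowLanePermits.release();
        }
    }

    public static void main(String[] args) {
        TwoLaneAdmissionSketch admission = new TwoLaneAdmissionSketch(1);
        System.out.println(admission.tryAdmit(Lane.SLOW)); // slot is free
        System.out.println(admission.tryAdmit(Lane.SLOW)); // lane is now full
        System.out.println(admission.tryAdmit(Lane.FAST)); // fast lane ignores capacity
    }
}
```

   How a query gets assigned to a lane in the first place is outside this 
sketch; it only shows the accept-or-reject step.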
   
   By the way, HTTP server threads aren't the only resource that I think we 
need a binary laning system for: others include concurrent connections from 
brokers to historicals, and memory used for merging and storing partial 
results on historicals (both off-heap merge buffers and on-heap structures). 
All of these resources are limited and must be reserved early in the query 
processing lifecycle, and cannot be released without canceling the query.

