gianm commented on issue #6993: [Proposal] Dynamic prioritization and laning
URL: 
https://github.com/apache/incubator-druid/issues/6993#issuecomment-461095949
 
 
   @peferron,
   
   > Could you perhaps elaborate a bit more about why the dynamic 
prioritization scheme based on `periodThreshold` makes sense, as opposed to a 
`durationThreshold` for example? In our case, users may issue interactive 
queries that have a small interval duration (e.g. one hour) but are far in the 
past (e.g. six months ago).
   
   I was thinking a period threshold is likely to align with how people often 
set up historical tiers: a 'hot' tier for the latest 30 days and a 'cold' tier 
for older data.
   
   @Dylan1312,
   
   > Instead of using the interval to decide how to change the priority of the 
query, what do you think about using the number of segments that the query 
operates over? Appreciate that this mechanism is to be expanded upon in future 
work but in the short term I think this works better because it more directly 
correlates with the amount of work a query has to do.
   
   Well, same :)
   
   Maybe it's worth creating more than one threshold even in the first version. 
It sounds like there is demand for ways of thinking about prioritization 
beyond a 'hot' / 'cold' tier setup. How about starting with the three that have 
been suggested: period (from now), duration (of the query interval), and number 
of segments? In this case, if a user sets multiple, I'd suggest applying the 
adjustment _once_, if _any_ of the thresholds triggers. I don't see a reason 
for a query that triggers only one threshold to be prioritized above one that 
triggers two. Concretely, the properties would be:
   
   - druid.broker.priority.periodThreshold
   - druid.broker.priority.durationThreshold
   - druid.broker.priority.segmentCountThreshold
   - druid.broker.priority.adjustment
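
   To illustrate the "apply once if any threshold triggers" rule, here is a 
rough sketch in Python. It is not Druid code: `PriorityConfig` and 
`adjusted_priority` are hypothetical names, the strict comparisons are an 
assumption, and so is adding the adjustment to the base priority as-is (so a 
negative value lowers priority).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class PriorityConfig:
    # Hypothetical mirror of the proposed druid.broker.priority.* properties.
    # A threshold left as None is treated as "not set" and never triggers.
    period_threshold: Optional[timedelta] = None
    duration_threshold: Optional[timedelta] = None
    segment_count_threshold: Optional[int] = None
    adjustment: int = 0


def adjusted_priority(
    base: int,
    start: datetime,
    end: datetime,
    segment_count: int,
    config: PriorityConfig,
    now: datetime,
) -> int:
    """Return the base priority, adjusted at most once."""
    triggered = any([
        # Period: the query interval starts further back than the threshold.
        config.period_threshold is not None
        and start < now - config.period_threshold,
        # Duration: the query interval itself is longer than the threshold.
        config.duration_threshold is not None
        and end - start > config.duration_threshold,
        # Segment count: the query touches more segments than the threshold.
        config.segment_count_threshold is not None
        and segment_count > config.segment_count_threshold,
    ])
    # Assumption: the adjustment is added as-is, once, however many
    # thresholds triggered.
    return base + config.adjustment if triggered else base
```

   With this shape, a one-hour query over data from six months ago (period 
only) and a six-month query over many segments (all three) end up with the 
same adjusted priority.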
   
   I think we'll want to add number of rows in the future, but that information 
is tougher to come by right now (the broker doesn't cache it unless Druid SQL 
is enabled, and it only lives in the SQL module). So I would still leave that 
one for the future.

