agavra commented on issue #9615:
URL: https://github.com/apache/pinot/issues/9615#issuecomment-1284347711

   Adding notes from an offline discussion with @walterddr, @61yao, and 
@siddharthteotia. Feel free to add additional details if I missed anything.
   
   Design notes:
   1. the shared buffer itself should have a notion of fairness, so that it 
cannot fill up with data from a single query
   2. the operator chain task pool should wake up on either (a) new incoming 
data for the corresponding mailbox or (b) registration of a new operator chain 
for execution, so that there is no race between registering a new operator 
chain and receiving data from an upstream sending node
   3. the shared buffer should have configurable limits, ideally on both data 
size and number of blocks
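
As a rough illustration of notes 1 and 3, a shared buffer could keep one queue per query, cap each query's share, and enforce global limits on block count and bytes. This is only a sketch under stated assumptions: it is written in Python for brevity (Pinot itself is Java), and `FairSharedBuffer`, `Block`, `max_blocks`, `max_bytes` and `max_blocks_per_query` are hypothetical names that do not come from the Pinot codebase. Rotating across queries on `poll` is one possible fairness policy, not the agreed design.

```python
from collections import OrderedDict, deque
from dataclasses import dataclass


@dataclass
class Block:
    query_id: str    # hypothetical: the query that produced this block
    payload: bytes   # hypothetical: serialized block contents


class FairSharedBuffer:
    """Bounded buffer shared by all queries, with a per-query cap for fairness."""

    def __init__(self, max_blocks, max_bytes, max_blocks_per_query):
        self.max_blocks = max_blocks                      # global block-count limit
        self.max_bytes = max_bytes                        # global data-size limit
        self.max_blocks_per_query = max_blocks_per_query  # fairness cap per query
        self._queues = OrderedDict()  # query_id -> non-empty deque, in service order
        self._num_blocks = 0
        self._num_bytes = 0

    def offer(self, block):
        """Return True if the block was accepted, False if any limit would be exceeded."""
        queue = self._queues.get(block.query_id)
        queued = len(queue) if queue is not None else 0
        if queued >= self.max_blocks_per_query:
            return False  # this query already holds its full share
        if self._num_blocks + 1 > self.max_blocks:
            return False  # buffer is full by block count
        if self._num_bytes + len(block.payload) > self.max_bytes:
            return False  # buffer is full by data size
        if queue is None:
            queue = self._queues[block.query_id] = deque()
        queue.append(block)
        self._num_blocks += 1
        self._num_bytes += len(block.payload)
        return True

    def poll(self):
        """Remove and return the next block, rotating across queries; None if empty."""
        if not self._queues:
            return None
        # Serve the query at the front of the rotation.
        query_id, queue = self._queues.popitem(last=False)
        block = queue.popleft()
        if queue:
            # The query still has blocks: move it to the back of the rotation.
            self._queues[query_id] = queue
        self._num_blocks -= 1
        self._num_bytes -= len(block.payload)
        return block
```

A rejected `offer` is where backpressure towards the sender would be applied; how that is signalled is left open here.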
   
   Discussion around out-of-scope considerations: 
   1. retrying failed operator chains/queries (note: cascading a failure is in 
scope and should leverage [Query 
Preemption](https://docs.google.com/document/d/1Z9DYAfKznHQI9Wn8BjTWZYTcNRVGiPP0B8aEP3w_1jQ/edit?pli=1))
   2. implementing parallelism within an operator-chain
   3. implementing pipelining or partition-level parallelism of operator chains
   4. for v1, the operator chain scheduler will be round-robin; it should be 
pluggable so that it can later support priority scheduling that ensures 
fairness, but those implementations are out of scope
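
To make the pluggable round-robin scheduler and the wake-up rule from design note 2 more concrete, here is a minimal sketch, again in Python for brevity. All names (`OpChainScheduler`, `RoundRobinScheduler`, `register`, `on_data_available`, `next`) are hypothetical and not taken from Pinot. The sketch assumes a chain is runnable once it is both registered and has data in its mailbox, and that a worker drains the mailbox each time it runs the chain. Both events go through the same condition variable, so data that arrives before registration is remembered rather than lost.

```python
import threading
from abc import ABC, abstractmethod
from collections import deque


class OpChainScheduler(ABC):
    """Hypothetical pluggable policy that decides which runnable chain goes next."""

    @abstractmethod
    def register(self, chain_id): ...

    @abstractmethod
    def on_data_available(self, chain_id): ...

    @abstractmethod
    def next(self, timeout=None): ...


class RoundRobinScheduler(OpChainScheduler):
    def __init__(self):
        self._cond = threading.Condition()
        self._registered = set()  # chains registered for execution
        self._has_data = set()    # chains whose mailbox has unread data
        self._ready = deque()     # runnable chains, served in FIFO order

    def register(self, chain_id):
        with self._cond:
            self._registered.add(chain_id)
            self._enqueue_if_runnable(chain_id)

    def on_data_available(self, chain_id):
        with self._cond:
            self._has_data.add(chain_id)
            self._enqueue_if_runnable(chain_id)

    def _enqueue_if_runnable(self, chain_id):
        # Caller must hold self._cond. Either event (registration or new data)
        # can complete the pair, so both code paths end up here.
        if (chain_id in self._registered
                and chain_id in self._has_data
                and chain_id not in self._ready):
            self._ready.append(chain_id)
            self._cond.notify()  # wake one waiting worker

    def next(self, timeout=None):
        """Block until a chain is runnable; return its id, or None on timeout."""
        with self._cond:
            if not self._cond.wait_for(lambda: bool(self._ready), timeout):
                return None
            chain_id = self._ready.popleft()
            # Assumption: the worker drains the mailbox when it runs the chain.
            self._has_data.discard(chain_id)
            return chain_id
```

A priority-based policy could implement the same three methods with a different ready structure, which is the sense in which the scheduler is pluggable in this sketch.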
   
   Next steps: @agavra will come up with a PEP document with more 
implementation and design details after some prototyping. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

