agavra opened a new pull request, #9887:
URL: https://github.com/apache/pinot/pull/9887

   This PR follows up on #9753 and improves task scheduling so that operator 
chains are scheduled only when they have data available to process.
   
   **Review Guide**:
   1. Look at the changes to `RoundRobinScheduler` and the corresponding tests, 
which now take advantage of the newly wired `onDataAvailable` callback
   2. Look at the changes in `InMemoryReceivingMailbox` and 
`MailboxContentStreamObserver`, which no longer wait at all when they call 
`poll` on the underlying resource
   3. Look at the changed Operators (`AggregateOperator`, `SortOperator`, 
`HashJoinOperator`, `MailboxSendOperator`, `MailboxReceiveOperator`), which 
now consume all the data available to them instead of eagerly returning a 
NoOp block. This lets the scheduler be much more efficient: it schedules each 
operator once for all the data it has available instead of once per block. It 
also makes the code in `RoundRobinScheduler` much simpler, since it can clear 
the entire mail queue for an OpChain whenever that OpChain is scheduled
   4. The rest is just wiring the `gotMailCallback` callback into the 
underlying mailboxes. Note that `InMemoryMailbox` calls the callback on send 
instead of on receive, because receive is done synchronously with the 
operators (and therefore would never trigger a schedule)
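
   To make the scheduling idea in points 1 and 3 concrete, here is a minimal sketch of a scheduler that only hands out operator chains that have pending mail. It is not the actual `RoundRobinScheduler` code: the class name `DataAwareScheduler`, the `register`/`park` methods, and the use of plain `String` ids in place of OpChain objects are all assumptions made for illustration. Only the `onDataAvailable` callback name comes from this PR.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

/** Hypothetical sketch; names and structure are assumptions, not Pinot's implementation. */
class DataAwareScheduler {
  // Chains that have mail and are waiting for a worker, in round-robin order.
  private final Queue<String> ready = new ArrayDeque<>();
  // Chains that are registered but currently have nothing to process.
  private final Set<String> waiting = new HashSet<>();
  // Chains that received mail since they were last handed to a worker.
  private final Set<String> hasMail = new HashSet<>();

  /** Registers a new chain; it is not schedulable until data arrives. */
  synchronized void register(String opChainId) {
    park(opChainId);
  }

  /** Callback invoked by a mailbox when data arrives for a chain. */
  synchronized void onDataAvailable(String opChainId) {
    hasMail.add(opChainId);
    // Only a waiting chain needs to be moved; a running or already-ready
    // chain picks the mail up via hasMail.
    if (waiting.remove(opChainId)) {
      ready.add(opChainId);
    }
  }

  synchronized boolean hasNext() {
    return !ready.isEmpty();
  }

  /** Hands out the next ready chain and clears its mail flag in one step. */
  synchronized String next() {
    String opChainId = ready.remove(); // throws NoSuchElementException if empty
    // The chain is expected to consume everything available in this run.
    hasMail.remove(opChainId);
    return opChainId;
  }

  /** Called when a chain returns without finishing. */
  synchronized void park(String opChainId) {
    // Mail that arrived while the chain was running must not be lost.
    if (hasMail.contains(opChainId)) {
      ready.add(opChainId);
    } else {
      waiting.add(opChainId);
    }
  }
}
```

   Because each operator drains all the data available to it per run, the sketch can clear the mail flag for a chain in one step inside `next()` rather than tracking individual blocks.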
   
   There are also two bug fixes that were necessary for this change:
   - `GrpcReceivingMailbox` will now correctly return an EOS block instead of 
returning `null`, so that the `MailboxReceiveOperator` doesn't need to be 
called twice to close the stream
   - `MailboxReceiveOperator` will not return a NoOp block if it has read EOS 
blocks from all of its incoming mailboxes
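
   The intended receive behavior described in the two fixes can be sketched roughly as follows. This is an illustrative model only: `ReceiveSketch`, the `String` block markers, and the use of `java.util.Queue` as a stand-in for a non-blocking mailbox are assumptions, not the real `MailboxReceiveOperator` or `GrpcReceivingMailbox` code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch of a receive operator over non-blocking mailboxes. */
class ReceiveSketch {
  static final String EOS = "EOS";   // stand-in for an end-of-stream block
  static final String NOOP = "NOOP"; // stand-in for a NoOp block

  // Mailboxes that have not yet delivered their EOS block.
  private final List<Queue<String>> open;

  ReceiveSketch(List<Queue<String>> mailboxes) {
    this.open = new ArrayList<>(mailboxes);
  }

  String nextBlock() {
    Iterator<Queue<String>> it = open.iterator();
    while (it.hasNext()) {
      // Queue.poll() never blocks; it returns null when nothing is buffered.
      String block = it.next().poll();
      if (block == null) {
        continue; // nothing available from this mailbox right now
      }
      if (EOS.equals(block)) {
        it.remove(); // this mailbox is finished; keep checking the others
        continue;
      }
      return block; // a data block
    }
    // EOS once every mailbox has finished; NoOp only while some are still open.
    return open.isEmpty() ? EOS : NOOP;
  }
}
```

   In this model a mailbox signals completion with an explicit EOS block rather than `null`, so the operator can report EOS in the same call in which the last mailbox finishes.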



