agavra opened a new pull request, #9887: URL: https://github.com/apache/pinot/pull/9887
This PR follows up on #9753 and improves task scheduling so that operator chains are scheduled only when they have data available to process.

**Review Guide**:

1. Look at the changes to `RoundRobinScheduler` and its corresponding tests, which now take advantage of the newly wired `onDataAvailable` callback.
2. Look at the changes in `InMemoryReceivingMailbox` and `MailboxContentStreamObserver`, which no longer wait at all when they call `poll` on the underlying resource.
3. Look at the changed operators (`AggregateOperator`, `SortOperator`, `HashJoinOperator`, `MailboxSendOperator`, `MailboxReceiveOperator`), which now consume all the data available to them instead of eagerly returning a NoOp block. This makes the scheduler much more efficient, because it schedules them once for all the data they have available instead of once per block. It also makes the code in `RoundRobinScheduler` much simpler, since it can clear the entire mail queue for an OpChain whenever it is scheduled.
4. The rest is wiring the `gotMailCallback` callback into the underlying mailboxes. Note that `InMemoryMailbox` calls the callback on send instead of receive, because receive is done synchronously with the operators (and therefore would never trigger a schedule).

There are also two bug fixes that were necessary for this change:

- `GrpcReceivingMailbox` will now correctly return an EOS block instead of returning `null`, so that the `MailboxReceiveOperator` doesn't need to be called twice to close the stream
- `MailboxReceiveOperator` will not return a NoOp block if it has read EOS blocks from all of its incoming mailboxes
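
For reviewers who want a mental model before reading the diff, here is a minimal single-threaded sketch of the scheduling idea described in the review guide: a chain becomes runnable only after a data-available callback fires for it, and once scheduled it drains every block in its queue with a non-blocking `poll`. This is not the code in this PR. `SketchOpChain`, `DataAvailableSchedulerSketch`, and all of their methods are hypothetical names invented for illustration, and the real `RoundRobinScheduler` has to handle concurrency, which this sketch ignores.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

/** Hypothetical stand-in for an operator chain; not Pinot's OpChain class. */
final class SketchOpChain {
  final String id;
  final Queue<String> mailbox = new ArrayDeque<>(); // blocks waiting to be consumed
  final List<String> processed = new ArrayList<>();
  int timesScheduled = 0;

  SketchOpChain(String id) {
    this.id = id;
  }

  /** Consumes every block currently queued instead of one block per call. */
  void runOnce() {
    timesScheduled++;
    String block;
    while ((block = mailbox.poll()) != null) { // poll never blocks; null means "empty"
      processed.add(block);
    }
  }
}

/** Hypothetical scheduler: a chain is handed out only after data arrived for it. */
final class DataAvailableSchedulerSketch {
  private final Map<String, SketchOpChain> chains = new LinkedHashMap<>();
  // Insertion-ordered and de-duplicated: several blocks for one chain yield one run.
  private final Set<String> ready = new LinkedHashSet<>();

  void register(SketchOpChain chain) {
    chains.put(chain.id, chain);
  }

  /** Sender side: enqueue the block, then fire the callback (as on send in-memory). */
  void send(String chainId, String block) {
    chains.get(chainId).mailbox.add(block);
    onDataAvailable(chainId);
  }

  void onDataAvailable(String chainId) {
    ready.add(chainId);
  }

  boolean hasNext() {
    return !ready.isEmpty();
  }

  /** Returns the next chain with data, in arrival order, or null if none is ready. */
  SketchOpChain next() {
    Iterator<String> it = ready.iterator();
    if (!it.hasNext()) {
      return null;
    }
    String id = it.next();
    it.remove();
    return chains.get(id);
  }

  public static void main(String[] args) {
    DataAvailableSchedulerSketch scheduler = new DataAvailableSchedulerSketch();
    SketchOpChain chain = new SketchOpChain("a");
    scheduler.register(chain);
    scheduler.send("a", "block-1");
    scheduler.send("a", "EOS");
    while (scheduler.hasNext()) {
      scheduler.next().runOnce();
    }
    System.out.println(chain.id + " ran " + chain.timesScheduled + " time(s): " + chain.processed);
  }
}
```

Because the ready set de-duplicates chain ids, three sends to the same chain still produce a single scheduled run that drains all three blocks, which is the efficiency gain point 3 describes. A chain that never receives data is never handed out by `next()`.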
