walterddr commented on PR #10408:
URL: https://github.com/apache/pinot/pull/10408#issuecomment-1470319915
> > I have some concerns regarding (1) how the mailbox handles a received
> > block, and (2) what should be returned when a normal block or an error
> > block is received. Please kindly take a look.
>
> Hey @walterddr can you check if I addressed your comments in this area
> correctly? We can discuss this some more if required.
>
> Basically made changes to:
>
> 1. Clear the `_priorityQueue` when an error block is received / returned.
> 2. Always return a no-op when a block is received and the contents are
>    added to the `_priorityQueue`. Earlier I was waiting for all the mailboxes
>    to get processed in the loop before returning a no-op, now I return a
>    no-op immediately for each mailbox.
Yeah, after thinking about it more last night, I don't think we should return
a no-op block for every mailbox we receive from.
The problem is that we need to achieve 2 goals:
1. Collaborative multi-threading. When some mailboxes (or all mailboxes) are
   empty, the operator should give up the thread-worker so another query stage
   can run.
   - This is achieved by returning the no-op block.
   - Some operators don't follow this. For example, AggregateOperator holds
     onto the thread-worker until all inbound blocks are consumed and it has
     created an indexed map.
   - This can be problematic: when multiple stages are running on the same
     server, a stage could deadlock awaiting inbound messages while the stage
     that produces them cannot be scheduled due to thread starvation.
2. Fairness when grabbing from all mailboxes.
   - This is more related to the K-loser algorithm we will implement when the
     sender side is sorting. That way we will maintain some sort of state
     internally in the sender/receive operator.
   - However, we will actually return some blocks, as long as those blocks
     contain data smaller (or larger when ORDER DESC) than the top row of
     every mailbox's block.
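
To make the two goals concrete, here is a rough sketch of what I have in mind. It is not the code in this PR: `SortedReceiveSketch`, `Mailbox`, and `nextBlock` are made-up names, rows are plain integers sorted ascending, and a linear scan over the mailbox heads stands in for the K-loser structure. An empty result plays the role of the no-op block.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch only; none of these names are actual Pinot classes. */
public class SortedReceiveSketch {

  /** Stand-in for one inbound mailbox whose sender emits rows in ascending order. */
  static final class Mailbox {
    final Queue<Integer> rows = new ArrayDeque<>();
    boolean finished; // assumed flag: set once the sender signals end-of-stream

    void offer(int... values) {
      for (int v : values) {
        rows.add(v);
      }
    }
  }

  /**
   * Returns the rows that are safe to emit right now, in ascending order.
   * This never blocks. An empty list acts as the no-op block: the caller
   * should hand the worker thread back and be rescheduled later.
   */
  static List<Integer> nextBlock(List<Mailbox> mailboxes) {
    List<Integer> out = new ArrayList<>();
    while (true) {
      Mailbox best = null;
      for (Mailbox m : mailboxes) {
        if (m.rows.isEmpty()) {
          if (!m.finished) {
            // An open mailbox has no head row, so a smaller row may still
            // arrive. Stop here and return whatever was already proven safe.
            return out;
          }
          continue; // a drained, finished mailbox no longer constrains the merge
        }
        if (best == null || m.rows.peek() < best.rows.peek()) {
          best = m;
        }
      }
      if (best == null) {
        return out; // every mailbox is finished and drained
      }
      // This head is <= the head of every other mailbox, so it is safe to emit.
      out.add(best.rows.poll());
    }
  }
}
```

With mailboxes `a = [1, 4, 7]` and `b = []` (both still open), the first call returns an empty list, so the thread is released. Once `b` receives `[2, 3]`, the next call returns `[1, 2, 3]` and stops, because `b` is empty again and could still deliver a row smaller than 4. For ORDER DESC the comparison would flip.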
For now the implementation looks good. I will create an issue to
systematically address both points above.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]