yashmayya commented on PR #17323:
URL: https://github.com/apache/pinot/pull/17323#issuecomment-3628630591

   > Without this fix, `mailboxInfoList` won't have deterministic order of 
servers for each worker id. Where do we access it based on the index of the 
server?
   
   It's eventually used here through the `MailboxSendOperator` - 
https://github.com/apache/pinot/blob/538407935001e2ffa17fd61a766a41b2bd53f7bc/pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/exchange/HashExchange.java#L55-L79
   
   
https://github.com/apache/pinot/blob/538407935001e2ffa17fd61a766a41b2bd53f7bc/pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MailboxSendOperator.java#L168-L178
   
   > Is it easier to follow if we change `serverToWorkerIdsMap` to a 
`LinkedHashMap`? Essentially we want to preserve the worker id order in the 
generated server list
   
   Hm no, I believe the issue is more so that there's no guarantee that across 
different branches in the same query plan tree, we'll have the same partition 
<-> workerId mapping because the worker / mailbox order depends on the 
iteration order of various maps (enabled server instances, routing table etc.) 
in `WorkerManager`. The particular issue that this patch is aiming to fix is 
when inputs to a join are determined to be "pre-partitioned" based on something 
like a grouping aggregation using the same keys as the join keys. In this case, 
we need to ensure that the join _inputs_ have the same partitions / hash 
buckets on the same workers, because the way we do the distribution then relies 
on a fixed upstream worker -> downstream worker mapping - 
https://github.com/apache/pinot/blob/538407935001e2ffa17fd61a766a41b2bd53f7bc/pinot-query-planner/src/main/java/org/apache/pinot/query/planner/physical/MailboxAssignmentVisitor.java#L102-L123
   
   IMO, the cleanest solution is guaranteeing that the `workerId`s themselves 
are ordered here so that the partition assignment is deterministic (this also 
happens to be the same solution chosen by the v2 physical planner).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to