yashmayya commented on PR #17323: URL: https://github.com/apache/pinot/pull/17323#issuecomment-3628630591
> Without this fix, `mailboxInfoList` won't have deterministic order of servers for each worker id. Where do we access it based on the index of the server? It's eventually used here through the `MailboxSendOperator` - https://github.com/apache/pinot/blob/538407935001e2ffa17fd61a766a41b2bd53f7bc/pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/exchange/HashExchange.java#L55-L79 https://github.com/apache/pinot/blob/538407935001e2ffa17fd61a766a41b2bd53f7bc/pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/MailboxSendOperator.java#L168-L178 > Is it easier to follow if we change `serverToWorkerIdsMap` to a `LinkedHashMap`? Essentially we want to preserve the worker id order in the generated server list Hm no, I believe the issue is more so that there's no guarantee that across different branches in the same query plan tree, we'll have the same partition <-> workerId mapping because the worker / mailbox order depends on the iteration order of various maps (enabled server instances, routing table etc.) in `WorkerManager`. The particular issue that this patch is aiming to fix is when inputs to a join are determined to be "pre-partitioned" based on something like a grouping aggregation using the same keys as the join keys. In this case, we need to ensure that the join _inputs_ have the same partitions / hash buckets on the same workers, because the way we do the distribution then relies on a fixed upstream worker -> downstream worker mapping - https://github.com/apache/pinot/blob/538407935001e2ffa17fd61a766a41b2bd53f7bc/pinot-query-planner/src/main/java/org/apache/pinot/query/planner/physical/MailboxAssignmentVisitor.java#L102-L123 IMO, the cleanest solution is guaranteeing that the `workerId`s themselves are ordered here so that the partition assignment is deterministic (this also happens to be the same solution chosen by the v2 physical planner). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
