Jackie-Jiang commented on code in PR #17323:
URL: https://github.com/apache/pinot/pull/17323#discussion_r2596776589
##########
pinot-query-planner/src/main/java/org/apache/pinot/query/planner/physical/MailboxAssignmentVisitor.java:
##########
@@ -187,6 +188,12 @@ private void connectWorkers(int stageId, Map<Integer,
QueryServerInstance> serve
List<Integer> workerIds = entry.getValue();
mailboxInfoList.add(new MailboxInfo(server.getHostname(),
server.getQueryMailboxPort(), workerIds));
}
+ // Sort by first workerId to ensure deterministic ordering.
+ // This is critical for hash-distributed exchanges where (hash %
numMailboxes) is used as an index to choose the
+ // receiving worker.
+ // Without this sorting, different stages could route the same hash value
to different workers, resulting in
+ // incorrect join/union/intersect results for pre-partitioned sends (where
a full partition shuffle is skipped).
+ mailboxInfoList.sort(Comparator.comparingInt(info ->
info.getWorkerIds().get(0)));
Review Comment:
Is it easier to follow if we change `serverToWorkerIdsMap` to a
`LinkedHashMap`? Essentially we want to preserve the worker id order in the
generated server list
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]