gortiz commented on PR #18538: URL: https://github.com/apache/pinot/pull/18538#issuecomment-4499980856
The overall approach is sound — detecting the empty-leaf condition at planning time and short-circuiting at the broker is the right shape for this fix. **Three concerns:** **1. Cross-layer coupling in the plan rewrite.** `rewriteReduceStageForEmptyLeaves` and `inlineAllLeafStagesEmptyInputs` give the broker knowledge of plan-tree semantics: `MailboxReceiveNode → MailboxSendNode` traversal, stage ID re-numbering, and fragment construction. This is planner-layer knowledge. The broker's role should be limited to checking the flag and calling `runReducer`. **2. v2 physical optimizer silently skips the short-circuit.** `createDispatchableSubPlanV2` never calls `updateContextForLeafStage`, so `isAllNonReplicatedLeafStagesEmpty()` is permanently `false` on that path. When `USE_PHYSICAL_OPTIMIZER` is enabled, fully-pruned queries will still fan out to every server. The TODO comment is the only signal — there's no log warning, no test, and no documentation of the limitation. **3. `getQueryStageMap()` unmodifiable/mutable inconsistency.** The method now returns `Collections.unmodifiableMap(...)`, signalling immutability, but `replaceStage` writes through the same backing map as an escape hatch. The API contract is self-contradicting and there's no audit of existing callers to confirm none calls `.put()` on the returned reference. **Suggestion: move the rewrite into the planner.** Concerns 1 and 3 both go away if the plan tree is rewritten inside `finalizeDispatchableSubPlan` rather than at broker dispatch time. The `DispatchableSubPlan` would be born already in its final state — no post-construction mutation, no `replaceStage`, no planner logic in the broker. The flag still needs to exist to tell the broker to call `runReducer` instead of `submitAndReduce`, but that's all it needs to do. Concern 2 remains regardless: the v2 path needs to be wired separately either way. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
