yashmayya opened a new pull request, #18804: URL: https://github.com/apache/pinot/pull/18804
Adds a new query hint `setOpOptions(is_colocated_by_set_op_keys='...')` that forces (or disables) colocated, pre-partitioned exchanges for set operations (`UNION` / `UNION ALL` / `INTERSECT` / `EXCEPT`) in order to avoid a data shuffle when the inputs are already partitioned compatibly. This is the set-operation equivalent of the existing `joinOptions(is_colocated_by_join_keys='...')` join hint and the `windowOptions(is_partitioned_by_window_keys='...')` window hint (#17395).

By default the planner inserts a hash exchange (on the full output row) below every input of a set operation. When the inputs are already co-partitioned, that shuffle is unnecessary; this hint lets the user assert colocation so the planner emits a direct (1-to-1, no-shuffle) exchange instead. This also registers the `setOpOptions` hint strategy (`HintPredicates.SETOP`), which was previously not registered at all.

Like the equivalent join / window hints, this is opt-in and trusts the user's assertion. Because a set operation matches rows on the **entire** output row, forcing `is_colocated_by_set_op_keys='true'` is only correct when every input is partitioned the same way (same partition function and count) on one or more of the projected columns, so that rows that are equal across all projected columns land on the same worker. Forcing it on data that is not actually colocated will produce incorrect results for `INTERSECT`, `EXCEPT` and distinct `UNION` (`UNION ALL` only concatenates, so it is always safe).

The hint is honored by the V1 query planner; the V2 physical optimizer determines colocation on its own and ignores it.

**Hint placement.** Unlike a join/window node, a set operation is an ancestor of its branch `SELECT`s, so a hint on the leading `SELECT` does not naturally attach to it.
The hint is therefore resolved from either the set operation itself or its first branch, supporting two placements:

- Inline on the first branch: `SELECT /*+ setOpOptions(is_colocated_by_set_op_keys='true') */ col FROM a UNION ALL SELECT col FROM b`
- On an outer `SELECT` wrapping the set operation: `SELECT /*+ setOpOptions(is_colocated_by_set_op_keys='true') */ * FROM (SELECT col FROM a UNION ALL SELECT col FROM b)`

Two limitations worth calling out:

- Plain distinct `UNION` is rewritten to an aggregate over `UNION ALL` before the exchange rule runs, so the inline hint does not apply to it (use `UNION ALL`, or the outer-wrap form).
- For deeply-nested `INTERSECT`/`EXCEPT` the inline hint only colocates the innermost level (a safe degradation: the outer levels shuffle); the outer-wrap form covers all levels.

**Tests added:**

- Planner unit tests in `QueryCompilationTest` asserting the hint forces / disables a pre-partitioned exchange across `UNION ALL` / `INTERSECT` / `EXCEPT`, both the inline and outer-wrap placements, the no-hint baseline, auto-detection, the `='false'` override, and first-input-wins precedence when branches carry conflicting values.
- A before/after physical-plan contrast pair in `ExplainPhysicalPlans.json` showing the hint turn a full shuffle into a `[PARTITIONED]` (direct, 1-to-1) exchange.
- Runtime, H2-validated cases in `QueryHints.json` on physically-partitioned tables: `INTERSECT` / `EXCEPT` / `UNION ALL` with `='true'`, the `='false'` override, a multi-column set op colocated on a subset (the partition column) of the projected columns, and a mismatched-partition-count case where the planner cannot form a direct exchange and safely falls back to a shuffle.

**Follow-up:** user-facing documentation for the new hint will be added to the `pinot-docs` repo, mirroring the existing `is_colocated_by_join_keys` entry (scope, the V1-only note, and the partitioning precondition under which `'true'` is safe).
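For reviewers who want to see the partitioning precondition in action, here is a toy model of why `'true'` is only safe on genuinely colocated inputs. It is a sketch, not Pinot code: the helper names (`partition`, `local_intersect`, `local_union_all`), the modulo partition function, and the sample rows are all assumptions made up for illustration. Each "worker" only sees partition `i` of each input, which is what a direct 1-to-1 exchange implies.

```python
from collections import Counter

# Hypothetical helpers for illustration only; none of these names exist in Pinot.

def partition(rows, key_index, num_partitions, fn=lambda key, n: key % n):
    """Split rows across workers by applying a partition function to one column."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[fn(row[key_index], num_partitions)].append(row)
    return parts

def local_intersect(left_parts, right_parts):
    """INTERSECT evaluated per worker with no shuffle between workers."""
    result = set()
    for left, right in zip(left_parts, right_parts):
        result |= set(left) & set(right)
    return result

def local_union_all(left_parts, right_parts):
    """UNION ALL evaluated per worker: plain concatenation."""
    out = []
    for left, right in zip(left_parts, right_parts):
        out.extend(left)
        out.extend(right)
    return out

# Two projected columns; only the first one is the partition column.
a = [(1, "x"), (2, "y"), (3, "z"), (4, "w")]
b = [(1, "x"), (2, "q"), (3, "z"), (5, "w")]
expected = set(a) & set(b)  # the globally correct INTERSECT

# Colocated: same partition function, same count, same column on both inputs.
colocated = local_intersect(partition(a, 0, 2), partition(b, 0, 2))

# Not colocated: b uses a different partition function, so equal rows
# can end up on different workers and never meet.
shifted = lambda key, n: (key + 1) % n
not_colocated = local_intersect(partition(a, 0, 2), partition(b, 0, 2, shifted))

# UNION ALL only concatenates, so the placement of rows does not matter.
union_all = local_union_all(partition(a, 0, 2), partition(b, 0, 2, shifted))

print(colocated == expected)                 # True
print(not_colocated == expected)             # False: matching rows were missed
print(Counter(union_all) == Counter(a + b))  # True
```

In this model the inputs are partitioned on a subset of the projected columns (the first column only), yet the per-worker `INTERSECT` is still correct, because rows that are equal on every column are necessarily equal on the partition column and so share a worker.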
