adriangb opened a new pull request, #22760:
URL: https://github.com/apache/datafusion/pull/22760

   ## Which issue does this PR close?
   
   - Relates to https://github.com/apache/datafusion/issues/11900
   
   ## Rationale for this change
   
   This splits the test and benchmark scaffolding out of #21621 so the
   `PushDownTopKThroughJoin` optimizer rule itself can be reviewed in
   isolation, with a small, focused diff.
   
   The benchmark and SLT files here do not depend on the rule. They are
   committed first so that:
   
   1. The benchmark can measure the rule's effect against a baseline that
      does not register it.
   2. The follow-up rule PR's diff shows exactly which plans change, since
      the EXPLAIN plans here capture the current (pre-rule) behavior.
   
   ## What changes are included in this PR?
   
   - A `push_down_topk` benchmark (`dfbench push-down-topk`) that runs
     `ORDER BY <cols> LIMIT N` queries over outer joins against TPC-H
     `customer`/`orders`/`nation`, plus its query files under
     `benchmarks/queries/push_down_topk/`.
   - `push_down_topk_through_join.slt` covering the scenarios the rule
     handles: preserved-side sort keys, ineligible join types
     (inner/full/semi/anti), `ON`-clause filters, projection and
     `SubqueryAlias` resolution, existing child sorts, ties, multi-level
     joins, `OFFSET`, and volatile expressions.
   
   The EXPLAIN plans assert current behavior (TopK not yet pushed through
   the join). The follow-up PR that adds the rule updates those plans in
   place; the query-result checks hold regardless of whether the rule is
   enabled.
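   
   The claim that the query results do not depend on the rule can be
   illustrated with a small, self-contained sketch. It is not the
   optimizer rule and uses no DataFusion API: every type and function
   below (`Customer`, `Order`, `left_join`, `top_k`, `top_k_customers`,
   `sample_data`) is hypothetical. It assumes the usual argument for this
   kind of rewrite, namely that in a `LEFT JOIN` every preserved-side row
   yields at least one output row, so keeping only the first N
   preserved-side rows cannot change the first N joined rows. The same
   pre-filtering is not safe for an inner join, which is presumably why
   the SLT file lists inner joins among the ineligible types.
   
   ```rust
   use std::cmp::Reverse;
   
   // Hypothetical in-memory stand-ins for two TPC-H tables.
   #[derive(Clone, Debug, PartialEq)]
   struct Customer { custkey: u32, acctbal: i64 }
   
   #[derive(Clone, Debug, PartialEq)]
   struct Order { orderkey: u32, custkey: u32 }
   
   type Row = (Customer, Option<Order>);
   
   /// Naive LEFT JOIN on custkey: unmatched customers are kept with None.
   fn left_join(customers: &[Customer], orders: &[Order]) -> Vec<Row> {
       let mut out = Vec::new();
       for c in customers {
           let matches: Vec<&Order> =
               orders.iter().filter(|o| o.custkey == c.custkey).collect();
           if matches.is_empty() {
               out.push((c.clone(), None));
           } else {
               for o in matches {
                   out.push((c.clone(), Some(o.clone())));
               }
           }
       }
       out
   }
   
   /// Naive INNER JOIN: the left join minus the unmatched customers.
   fn inner_join(customers: &[Customer], orders: &[Order]) -> Vec<Row> {
       left_join(customers, orders)
           .into_iter()
           .filter(|(_, o)| o.is_some())
           .collect()
   }
   
   /// ORDER BY acctbal DESC, custkey, orderkey LIMIT n.
   /// custkey and orderkey are tie-breakers that make the order total.
   fn top_k(mut rows: Vec<Row>, n: usize) -> Vec<Row> {
       rows.sort_by_key(|(c, o)| {
           (Reverse(c.acctbal), c.custkey, o.as_ref().map(|o| o.orderkey))
       });
       rows.truncate(n);
       rows
   }
   
   /// The "pushed-down" copy of the TopK, applied to the preserved side only.
   fn top_k_customers(customers: &[Customer], n: usize) -> Vec<Customer> {
       let mut cs = customers.to_vec();
       cs.sort_by_key(|c| (Reverse(c.acctbal), c.custkey));
       cs.truncate(n);
       cs
   }
   
   /// Made-up data: customer 2 has the highest balance and no orders.
   fn sample_data() -> (Vec<Customer>, Vec<Order>) {
       let customers = vec![
           Customer { custkey: 1, acctbal: 500 },
           Customer { custkey: 2, acctbal: 900 },
           Customer { custkey: 3, acctbal: 700 },
           Customer { custkey: 4, acctbal: 100 },
       ];
       let orders = vec![
           Order { orderkey: 10, custkey: 3 },
           Order { orderkey: 11, custkey: 3 },
           Order { orderkey: 12, custkey: 4 },
           Order { orderkey: 13, custkey: 1 },
       ];
       (customers, orders)
   }
   ```
   
   Comparing `top_k(left_join(&customers, &orders), n)` with
   `top_k(left_join(&top_k_customers(&customers, n), &orders), n)` gives
   equal results on this data for any `n`, while the same comparison with
   `inner_join` and `n = 1` differs, because the only customer kept by the
   pre-filter has no orders. The sketch breaks ties explicitly; without a
   total order, which of several tied rows is returned may differ between
   the two plans.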
   
   The new optimizer rule, the `push_down_limit.rs` changes, and the
   `optimizer_rule_reference.md` update from #21621 are intentionally left
   for the follow-up PR.
   
   ## Are these changes tested?
   
   Yes. This PR consists of the tests themselves.
   `push_down_topk_through_join.slt` passes against `main`, and the
   benchmark binary compiles and runs.
   
   ## Are there any user-facing changes?
   
   No. No API changes; only new benchmark and test files plus benchmark CLI
   wiring.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

