JanKaul opened a new pull request, #22050:
URL: https://github.com/apache/datafusion/pull/22050

   ## Which issue does this PR close?
   
   - Part of #17719
   - Part of #18250
   
   ## Rationale for this change
   
   Join-ordering algorithms (DPhyp, DPccp, …) operate on a graph view of
   the join region rather than a `LogicalPlan` tree. DataFusion has no
   such structure today, so any future reordering rule has to re-derive
   one. This PR adds the data structure and the `LogicalPlan ⇄ JoinGraph`
   boundary so the follow-up enumeration work in epic #18249 has
   something concrete to build on.
   
   ## What changes are included in this PR?
   
   New `datafusion/optimizer/src/reorder_join/join_graph.rs`:
   
   - `JoinGraph`, `Node`, `Edge` with `NodeId` / `EdgeId` handles backed
     by an internal `VecMap` (stable indices, no reuse on removal).
   - `JoinGraph::try_from_logical_plan(plan) -> Result<(JoinGraph, Vec<LogicalPlan>)>`:
     - strips wrapper operators above the topmost join and returns them
       so the caller can reapply them after reordering;
     - decomposes inner joins into nodes (leaf relations) and edges
       (equi-join predicates);
     - hoists non-equi predicates — both `Join.filter` and `Filter` nodes
       sitting between inner joins — into a side-channel `filters` list;
     - treats non-inner joins and other operators nested between joins
       (Aggregate, Projection, …) as opaque leaves.
   - `reconstruct_plan(join_plan, wrappers)` re-applies the stripped
     wrappers after reordering.
   - Mutation API for the future enumerator: `add_node`,
     `add_node_with_edge`, `remove_node`, `remove_edge`,
     `Node::neighbours`, `Node::connections`.
   - Module exported from `datafusion/optimizer/src/lib.rs`.
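
To make the "stable indices, no reuse on removal" property concrete, here is a minimal, self-contained sketch of a graph backed by such a slot store. The names echo the list above, but every signature, field, and the `add_edge` helper are assumptions made for illustration; this is not the code in `join_graph.rs`.

```rust
// Hypothetical sketch only: fields and signatures are assumed, not taken from the PR.

/// Slot storage with stable indices: removal leaves a hole, indices are never reused.
struct VecMap<T> {
    slots: Vec<Option<T>>,
}

impl<T> VecMap<T> {
    fn new() -> Self {
        VecMap { slots: Vec::new() }
    }
    /// Appends a value and returns its permanent index.
    fn insert(&mut self, value: T) -> usize {
        self.slots.push(Some(value));
        self.slots.len() - 1
    }
    /// Empties the slot but keeps it allocated, so later inserts get fresh indices.
    fn remove(&mut self, index: usize) -> Option<T> {
        self.slots.get_mut(index).and_then(Option::take)
    }
    fn get(&self, index: usize) -> Option<&T> {
        self.slots.get(index).and_then(Option::as_ref)
    }
    fn get_mut(&mut self, index: usize) -> Option<&mut T> {
        self.slots.get_mut(index).and_then(Option::as_mut)
    }
    /// Number of live (non-removed) entries.
    fn len(&self) -> usize {
        self.slots.iter().filter(|s| s.is_some()).count()
    }
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct NodeId(usize);
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct EdgeId(usize);

struct Node {
    relation: String,
    connections: Vec<EdgeId>,
}
struct Edge {
    ends: (NodeId, NodeId),
    predicate: String,
}
struct JoinGraph {
    nodes: VecMap<Node>,
    edges: VecMap<Edge>,
}

impl JoinGraph {
    fn new() -> Self {
        JoinGraph { nodes: VecMap::new(), edges: VecMap::new() }
    }
    fn add_node(&mut self, relation: &str) -> NodeId {
        let node = Node { relation: relation.to_string(), connections: Vec::new() };
        NodeId(self.nodes.insert(node))
    }
    /// Connects two live nodes; returns `None` if either handle is stale.
    fn add_edge(&mut self, a: NodeId, b: NodeId, predicate: &str) -> Option<EdgeId> {
        if self.nodes.get(a.0).is_none() || self.nodes.get(b.0).is_none() {
            return None;
        }
        let edge = Edge { ends: (a, b), predicate: predicate.to_string() };
        let id = EdgeId(self.edges.insert(edge));
        for n in [a, b] {
            self.nodes.get_mut(n.0)?.connections.push(id);
        }
        Some(id)
    }
    /// Removes a node together with its incident edges.
    fn remove_node(&mut self, id: NodeId) -> Option<Node> {
        let node = self.nodes.remove(id.0)?;
        for edge_id in &node.connections {
            if let Some(edge) = self.edges.remove(edge_id.0) {
                let other = if edge.ends.0 == id { edge.ends.1 } else { edge.ends.0 };
                if let Some(n) = self.nodes.get_mut(other.0) {
                    n.connections.retain(|e| e != edge_id);
                }
            }
        }
        Some(node)
    }
    /// Nodes reachable over one edge, in edge-insertion order.
    fn neighbours(&self, id: NodeId) -> Vec<NodeId> {
        let Some(node) = self.nodes.get(id.0) else { return Vec::new() };
        node.connections
            .iter()
            .filter_map(|e| self.edges.get(e.0))
            .map(|edge| if edge.ends.0 == id { edge.ends.1 } else { edge.ends.0 })
            .collect()
    }
}
```

Because a removed slot is never handed out again, a handle held by an enumerator cannot silently start pointing at a different relation; at worst it resolves to nothing. The cost is that holes accumulate, which seems acceptable for graphs that live for a single optimizer pass.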
   
   No optimizer rule is registered; nothing consumes `JoinGraph` outside
   tests.
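
The wrapper round trip (strip the operators above the topmost join, reorder, then reapply them) can be sketched on a toy plan type. Everything below is hypothetical: the `Plan` and `Wrapper` enums stand in for `LogicalPlan`, the PR returns the wrappers as `Vec<LogicalPlan>` rather than a separate enum, and the outermost-first ordering is an assumption.

```rust
// Hypothetical toy plan type; DataFusion's `LogicalPlan` is much richer.
#[derive(Clone, Debug, PartialEq)]
enum Plan {
    Scan(String),
    Join(Box<Plan>, Box<Plan>),
    Limit(usize, Box<Plan>),
    Sort(String, Box<Plan>),
}

/// A wrapper operator with its input detached.
#[derive(Clone, Debug, PartialEq)]
enum Wrapper {
    Limit(usize),
    Sort(String),
}

/// Peels single-input operators until the first join (or leaf) is reached.
/// Wrappers are collected outermost first.
fn strip_wrappers(mut plan: Plan) -> (Plan, Vec<Wrapper>) {
    let mut wrappers = Vec::new();
    loop {
        plan = match plan {
            Plan::Limit(n, input) => {
                wrappers.push(Wrapper::Limit(n));
                *input
            }
            Plan::Sort(key, input) => {
                wrappers.push(Wrapper::Sort(key));
                *input
            }
            core => return (core, wrappers),
        };
    }
}

/// Re-applies the wrappers innermost first, restoring the original nesting.
fn reconstruct_plan(join_plan: Plan, wrappers: Vec<Wrapper>) -> Plan {
    wrappers.into_iter().rev().fold(join_plan, |input, w| match w {
        Wrapper::Limit(n) => Plan::Limit(n, Box::new(input)),
        Wrapper::Sort(key) => Plan::Sort(key, Box::new(input)),
    })
}
```

With this shape, stripping and immediately reconstructing returns the original plan unchanged, and a reordered join tree ends up under the same wrapper stack.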
   
   ## Are these changes tested?
   
   Yes — unit tests in `join_graph.rs`:
   
   - three-way inner join with a non-equi `Join.filter` (predicate lands
     in side-channel);
   - `Filter` between two inner joins (hoisted; both joins still
     decompose);
   - `Aggregate` between two inner joins (opaque leaf);
   - `LEFT` join nested inside an inner chain (opaque leaf);
   - top-level non-inner join (single opaque leaf).
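
The cases listed above can be illustrated with a toy decomposition. This is a sketch under stated assumptions, not the PR's implementation: the `Plan` enum and `Decomposed` struct are hypothetical, edges are kept as raw key pairs rather than being resolved to node handles, and a `Filter` is hoisted only when its direct input is an inner join.

```rust
// Hypothetical toy types; the real code works on DataFusion's `LogicalPlan`.
#[derive(Clone, Debug, PartialEq)]
enum Plan {
    Scan(String),
    /// Inner join: equi-join key pairs plus an optional non-equi filter.
    InnerJoin {
        left: Box<Plan>,
        right: Box<Plan>,
        on: Vec<(String, String)>,
        filter: Option<String>,
    },
    LeftJoin(Box<Plan>, Box<Plan>),
    Filter(String, Box<Plan>),
    Aggregate(Box<Plan>),
}

#[derive(Debug, Default)]
struct Decomposed {
    /// Leaf relations, including opaque subplans (graph nodes).
    leaves: Vec<Plan>,
    /// Equi-join predicates (graph edges).
    edges: Vec<(String, String)>,
    /// Non-equi predicates hoisted into the side channel.
    filters: Vec<String>,
}

fn decompose(plan: Plan, out: &mut Decomposed) {
    match plan {
        Plan::InnerJoin { left, right, on, filter } => {
            out.edges.extend(on);
            out.filters.extend(filter);
            decompose(*left, out);
            decompose(*right, out);
        }
        // A filter directly above an inner join is hoisted; the join below it
        // still decomposes.
        Plan::Filter(pred, input) => match *input {
            join @ Plan::InnerJoin { .. } => {
                out.filters.push(pred);
                decompose(join, out);
            }
            other => out.leaves.push(Plan::Filter(pred, Box::new(other))),
        },
        // Scans, non-inner joins, aggregates, ... become opaque leaves.
        other => out.leaves.push(other),
    }
}
```

Note that an `Aggregate` over an inner join stays a single leaf here: the join beneath it is not visited, which matches the "opaque leaf" wording in the test list.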
   
   No sqllogictest changes — no planner-visible behavior yet.
   
   ## Are there any user-facing changes?
   
   No. `JoinGraph` is a new internal data structure in
   `datafusion-optimizer`; no existing API changes and no rule consumes
   it yet.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

