sjhddh opened a new pull request, #22871:
URL: https://github.com/apache/datafusion/pull/22871

   ## Which issue does this PR close?
   
   Closes #15774.
   
   ## Rationale for this change
   
   The `extending-operators` user guide only documented the µWheel optimizer at 
a high level. The full worked example of building a custom operator lived in 
`datafusion/core/tests/user_defined/user_defined_plan.rs`, whose own module 
header noted the code "is better to put ... in examples". #15774 asks to move 
that example into the user guide, using `custom-table-providers.md` as the 
format reference.
   
   ## What changes are included in this PR?
   
   - Expand `docs/source/library-user-guide/extending-operators.md` with a 
complete `TopK` walkthrough:
     - the problem and the naive `Sort` + `Limit` plan it improves on,
     - the logical node (`UserDefinedLogicalNodeCore`),
     - the `OptimizerRule` that rewrites `Limit` + `Sort` into the node,
     - the physical operator (`ExecutionPlan`) and its streaming reader,
     - the `ExtensionPlanner` / `QueryPlanner` wiring, and
     - how to register everything on a `SessionState` and run a query.
   - Trim the now-redundant narrative from the `user_defined_plan.rs` header so 
the guide is the single source of the walkthrough; the header now links to the 
guide.
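
For readers who want the intuition behind the first bullet without opening the guide, the difference between the naive `Sort` + `Limit` plan and a `TopK` operator can be sketched in plain Rust. This is a minimal illustration only, not the code from `user_defined_plan.rs` or the guide: the function names `sort_then_limit` and `top_k` and the use of `BinaryHeap` are assumptions made for this sketch, and it works on a plain `Vec<i64>` rather than on record batches.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Naive plan: sort every value in descending order, then keep the first `k`.
/// The whole input is materialized and sorted even though only `k` rows survive.
fn sort_then_limit(mut values: Vec<i64>, k: usize) -> Vec<i64> {
    values.sort_unstable_by(|a, b| b.cmp(a));
    values.truncate(k);
    values
}

/// TopK-style plan: stream over the input and retain only the `k` largest
/// values seen so far, so memory stays proportional to `k`.
fn top_k(values: impl IntoIterator<Item = i64>, k: usize) -> Vec<i64> {
    // `Reverse` turns the max-heap into a min-heap, so the smallest
    // retained value sits at the top and is the one evicted.
    let mut heap: BinaryHeap<Reverse<i64>> = BinaryHeap::with_capacity(k + 1);
    for v in values {
        heap.push(Reverse(v));
        if heap.len() > k {
            heap.pop();
        }
    }
    // Ascending order of `Reverse<i64>` is descending order of the values.
    heap.into_sorted_vec()
        .into_iter()
        .map(|Reverse(v)| v)
        .collect()
}
```

Both functions return the same rows; the second never holds more than `k + 1` values at once, which is the property the optimizer rule exploits when it rewrites `Limit` + `Sort` into a single node.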
   
   This addresses the two follow-ups alamb raised on #15832:
   1. Remove the redundant example. The explanatory walkthrough is removed from 
the test and now lives only in the guide.
   2. Add more detail. Each component has prose explaining what the trait 
methods are for, not just the code.
   
   On the first point: I kept the implementation in `user_defined_plan.rs` 
rather than deleting the file, because the module has grown to also test 
user-defined plan invariants (`InvariantMock`, the `topk_invariants*` tests). 
Those tests are not documentation and would be lost if the file were deleted 
outright. Happy to move them elsewhere or delete more aggressively if you'd 
prefer.
   
   Following `custom-table-providers.md`, the code blocks use `rust,ignore`: 
they reference the surrounding types and a test-only schema, so they are 
illustrative rather than standalone-compilable.
   
   ## Are these changes tested?
   
   The migrated code is the existing, tested `TopK` implementation; 
`cargo test --test user_defined_integration -p datafusion topk` still passes 
(4 tests, including the invariant tests). The guide itself is rendered 
documentation only.
   
   ## Are there any user-facing changes?
   
   Documentation only. No API changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

