sjhddh commented on PR #22871: URL: https://github.com/apache/datafusion/pull/22871#issuecomment-4673735046
Thanks for the review @alamb! Pushed an update: 1. Reframed the intro to point out that DataFusion already ships an optimized TopK and that this operator is just for demonstration. Took your suggested wording verbatim. The `Sort` -> `Limit` EXPLAIN below it is now described as the original plan you get with LimitPushdown disabled, rather than the default. I left it as an illustrative block since there isn't a clean SQL knob to disable a single logical rule mid-session - happy to wire the example up to actually drop the pass from the optimizer if you'd prefer that over the prose note. 2. Swapped `df.show()` for `assert_batches_eq!` in the "Putting It Together" block so the expected output is captured inline. Rows are taken from the runnable `user_defined_plan.rs` test. On the follow-on: agreed the invariant tests piggybacking on the example hurts readability. I'll move `InvariantMock` / the `topk_invariants*` tests into proper unit tests in a separate PR so this one stays focused on the docs example. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
