theirix opened a new pull request, #17633:
URL: https://github.com/apache/datafusion/pull/17633

   ## Which issue does this PR close?
   
   - Closes #13563.
   
   ## Rationale for this change
   
   The rationale, together with known syntax examples, is explained in detail in 
https://github.com/apache/datafusion/issues/13563.
   
   This is the third design for the table sample support.
   
   1. My first design added an explicit rewrite function baked into the 
select logical plan - #16325
   
   2. The second design introduced dedicated, flexible logical and physical 
plans, but they were tied to datafusion core - #16505
   
   3. This third design abstracts the second design out of datafusion core into 
extensions.
   
   ## What changes are included in this PR?
   
   All changes are bundled into a single example file, since this is a PoC of 
extensibility, as discussed with @alamb in 
https://github.com/apache/datafusion/issues/13563#issuecomment-3201702314 .
   
   If the idea is viable, the code could be modularised, which is not possible 
in a `datafusion-examples` crate.
   
   It adds several components:
   - a custom `TableSamplePlanNode` trait for a sampling logical plan
   - a query planner `TableSampleQueryPlanner` (trait `QueryPlanner`)
   - an execution plan `SampleExec` - mostly adapted from the second design, 
all kudos and thanks to @chenkovsky !
   - an extension planner (trait `ExtensionPlanner`) to build a physical plan
   - tests
   - an example runner 
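
   To illustrate how the components listed above relate to each other, here is a 
minimal, self-contained sketch. It does not use the DataFusion API: every type 
in it (`ToyLogicalPlan`, `ToyExec`, `ToyScanExec`, `ToySampleExec`) and the 
helpers `sample_scan` and `create_physical_plan` are hypothetical stand-ins. 
It also assumes simple seeded Bernoulli row sampling with a SplitMix64-style 
generator, which may differ from what `SampleExec` in this PR actually does.

```rust
// Toy stand-ins only: none of these types are DataFusion's.
use std::fmt::Debug;

/// Toy logical plan: a scan, optionally wrapped in a sampling node.
#[derive(Debug)]
enum ToyLogicalPlan {
    Scan { rows: Vec<i64> },
    /// Plays the role of the custom sampling logical node (assumed shape).
    Sample { input: Box<ToyLogicalPlan>, fraction: f64, seed: u64 },
}

/// Toy physical operator: produces rows when executed.
trait ToyExec: Debug {
    fn execute(&self) -> Vec<i64>;
}

#[derive(Debug)]
struct ToyScanExec {
    rows: Vec<i64>,
}

impl ToyExec for ToyScanExec {
    fn execute(&self) -> Vec<i64> {
        self.rows.clone()
    }
}

/// Plays the role of a sampling execution plan: keeps each input row
/// independently with probability `fraction` (assumed Bernoulli sampling).
#[derive(Debug)]
struct ToySampleExec {
    input: Box<dyn ToyExec>,
    fraction: f64,
    seed: u64,
}

impl ToyExec for ToySampleExec {
    fn execute(&self) -> Vec<i64> {
        let mut state = self.seed;
        self.input
            .execute()
            .into_iter()
            .filter(|_| next_unit(&mut state) < self.fraction)
            .collect()
    }
}

/// SplitMix64-style step, mapped to a float in [0, 1).
fn next_unit(state: &mut u64) -> f64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^= z >> 31;
    (z >> 11) as f64 / (1u64 << 53) as f64
}

/// Plays the role of the extension planner: maps each logical node to
/// the physical operator that executes it.
fn create_physical_plan(plan: &ToyLogicalPlan) -> Box<dyn ToyExec> {
    match plan {
        ToyLogicalPlan::Scan { rows } => Box::new(ToyScanExec { rows: rows.clone() }),
        ToyLogicalPlan::Sample { input, fraction, seed } => Box::new(ToySampleExec {
            input: create_physical_plan(input),
            fraction: *fraction,
            seed: *seed,
        }),
    }
}

/// Hypothetical helper: wraps a scan over `rows` in a sampling node.
fn sample_scan(rows: Vec<i64>, fraction: f64, seed: u64) -> ToyLogicalPlan {
    ToyLogicalPlan::Sample {
        input: Box::new(ToyLogicalPlan::Scan { rows }),
        fraction,
        seed,
    }
}

fn main() {
    let plan = sample_scan((0..20).collect(), 0.5, 42);
    let exec = create_physical_plan(&plan);
    println!("{:?}", exec.execute());
}
```

   The point of the layering is that the sampling node stays a plain logical 
description, while the planner alone decides which operator executes it. With a 
fixed seed the toy operator returns the same rows on every run.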
   
   The setup, as seen in main, is a bit cumbersome, but it works. Building a 
SQL extension that has access to the AST, without introducing a whole new 
statement (the approach large projects like arroyo, greptime or cube take when 
they introduce new syntax), is complicated. 
   For full modularity, I would propose adding a few extension points to 
`SqlToRel`. They could help avoid the manual parsing / logical plan / physical 
plan / execution conversions while keeping the client syntax concise.
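
   As a rough illustration of what such an extension point might look like, the 
sketch below uses toy types only. `ToySqlToRel`, `ToyRelationHook`, 
`ToyTableFactor`, `ToyPlan` and `TableSampleHook` are all hypothetical names 
invented for this example; none of them exist in DataFusion, and the actual 
extension points could look quite different.

```rust
// Hypothetical sketch: none of these names exist in DataFusion.

/// Toy AST fragment: a table reference with an optional sample percentage.
struct ToyTableFactor {
    name: String,
    sample_percent: Option<f64>,
}

/// Toy logical plan produced by the toy planner.
#[derive(Debug, PartialEq)]
enum ToyPlan {
    Scan { table: String },
    Sample { input: Box<ToyPlan>, fraction: f64 },
}

/// Hypothetical extension point: a hook that sees the AST node together
/// with the plan already built for it, and may wrap or replace that plan.
trait ToyRelationHook {
    fn rewrite(&self, factor: &ToyTableFactor, planned: ToyPlan) -> ToyPlan;
}

/// Hook that wraps the planned relation in a sampling node when the
/// AST node carries a sample percentage.
struct TableSampleHook;

impl ToyRelationHook for TableSampleHook {
    fn rewrite(&self, factor: &ToyTableFactor, planned: ToyPlan) -> ToyPlan {
        match factor.sample_percent {
            Some(percent) => ToyPlan::Sample {
                input: Box::new(planned),
                fraction: percent / 100.0,
            },
            None => planned,
        }
    }
}

/// Toy stand-in for a SQL-to-plan converter that consults registered hooks.
struct ToySqlToRel {
    hooks: Vec<Box<dyn ToyRelationHook>>,
}

impl ToySqlToRel {
    fn plan_table_factor(&self, factor: &ToyTableFactor) -> ToyPlan {
        // Default planning first, then each hook may rewrite the result.
        let base = ToyPlan::Scan { table: factor.name.clone() };
        self.hooks
            .iter()
            .fold(base, |plan, hook| hook.rewrite(factor, plan))
    }
}

fn main() {
    let planner = ToySqlToRel { hooks: vec![Box::new(TableSampleHook)] };
    let factor = ToyTableFactor { name: "t".to_string(), sample_percent: Some(50.0) };
    println!("{:?}", planner.plan_table_factor(&factor));
}
```

   With a hook of this kind, the client would only register the hook and write 
ordinary SQL, instead of driving the parsing and plan conversions by hand.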
   
   
   ## Are these changes tested?
   
   1. A set of unit tests
   2. An example with asserts
   
   ## Are there any user-facing changes?
   
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

