alamb opened a new pull request #8097:
URL: https://github.com/apache/arrow/pull/8097


   This PR is based on the prototype design in 
https://github.com/apache/arrow/pull/8020 and the discussion in the [Design 
Document](https://docs.google.com/document/d/1IHCGkCuUvnE9BavkykPULn6Ugxgqc1JShT4nz1vMi7g/edit#)
 and the [discussion on the mailing 
list](https://lists.apache.org/thread.html/rf8ae7d1147e93e3f6172bc2e4fa50a38abcb35f046cc5830e09da6cc%40%3Cdev.arrow.apache.org%3E).
 See also https://issues.apache.org/jira/browse/ARROW-9821.
   
   This PR adds:
   1. An `ExtensionNode` trait for defining user-defined behavior in 
LogicalPlanNodes
   2. Support for planning (both logical and physical) for such plan nodes
   3. An end-to-end example and test of using LogicalPlanNode to implement a 
simple "topK" operator using a custom-defined plan node.
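
   As a rough illustration of the shape described in the list above, here is a minimal, self-contained sketch. It is not the code in this PR: the trait `ExtensionPlanNode`, its methods `name` and `inputs`, the toy `LogicalPlan` enum, `TopKNode`, and the helper `top_k` are all hypothetical stand-ins, and the real DataFusion types differ.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::fmt::Debug;
use std::sync::Arc;

/// Hypothetical stand-in for a user-defined plan node trait.
/// The method set here is an assumption made for illustration only.
trait ExtensionPlanNode: Debug {
    /// Human-readable operator name.
    fn name(&self) -> &str;
    /// Child plans feeding this node.
    fn inputs(&self) -> Vec<Arc<LogicalPlan>>;
}

/// Toy logical plan with a single built-in node plus an extension slot.
#[derive(Debug)]
enum LogicalPlan {
    Scan { table: String },
    /// User-defined node, shared through `Arc` rather than `Box`.
    Extension { node: Arc<dyn ExtensionPlanNode> },
}

/// Hypothetical "topK" node: keep the `k` largest rows of its input.
#[derive(Debug)]
struct TopKNode {
    k: usize,
    input: Arc<LogicalPlan>,
}

impl ExtensionPlanNode for TopKNode {
    fn name(&self) -> &str {
        "TopK"
    }

    fn inputs(&self) -> Vec<Arc<LogicalPlan>> {
        vec![Arc::clone(&self.input)]
    }
}

/// Return the `k` largest values in descending order.
/// A min-heap bounded to `k` entries keeps memory proportional to `k`.
fn top_k(values: &[i64], k: usize) -> Vec<i64> {
    let mut heap: BinaryHeap<Reverse<i64>> = BinaryHeap::new();
    for &v in values {
        heap.push(Reverse(v));
        if heap.len() > k {
            // Evict the smallest value seen so far.
            heap.pop();
        }
    }
    let mut out: Vec<i64> = heap.into_iter().map(|Reverse(v)| v).collect();
    out.sort_unstable_by(|a, b| b.cmp(a));
    out
}
```

   In this sketch a plan would be built by wrapping a `TopKNode` in `LogicalPlan::Extension`, and an execution operator would apply something like `top_k` to each input.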
   
   The idea of the end-to-end example is both to serve as documentation and 
to ensure the API can be used for a non-trivial example operator.
   
   Major Differences from the prototype:
   1. Renamed `LogicalPlanNode` to `ExtensionPlanNode` as I think that better 
reflects what it is
   2. I did not change the built in `LimitNode` to use the new interface, but 
instead created an end-to-end demonstration
   3. Demonstration of how to provide a custom physical planner (per comment 
https://github.com/apache/arrow/pull/8020#discussion_r475953168 from @andygrove)
   4. The code for `Extension` plan node is less of a mess (uses `Arc` instead 
of `Box`)
   5. Register the new optimization passes so that the high level 
`ExecutionContext::sql` can be used rather than the low level APIs.
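
   To make the custom physical planner item above more concrete, here is a hedged, self-contained sketch of one way a planner could hand extension nodes to user-supplied code. Every name in it (`ExtensionPlanner`, `plan_extension`, `PhysicalPlanner`, `TopKPlanner`, and the toy plan enums) is hypothetical and chosen for illustration; it does not reproduce the `DefaultPhysicalPlanner` changes in this PR.

```rust
use std::sync::Arc;

/// Toy logical plan; the extension node is identified by name only.
#[derive(Debug)]
enum LogicalPlan {
    Scan { table: String },
    Extension { name: String, input: Arc<LogicalPlan> },
}

/// Toy physical plan produced by the planner.
#[derive(Debug, Clone, PartialEq)]
enum PhysicalPlan {
    ScanExec { table: String },
    TopKExec { input: Box<PhysicalPlan> },
}

/// Hypothetical hook: user code that knows how to plan some extension nodes.
trait ExtensionPlanner {
    /// Return `Some(plan)` if this planner handles `name`, else `None`.
    fn plan_extension(&self, name: &str, input: &PhysicalPlan) -> Option<PhysicalPlan>;
}

/// Example hook that only understands the "TopK" node.
struct TopKPlanner;

impl ExtensionPlanner for TopKPlanner {
    fn plan_extension(&self, name: &str, input: &PhysicalPlan) -> Option<PhysicalPlan> {
        if name == "TopK" {
            Some(PhysicalPlan::TopKExec { input: Box::new(input.clone()) })
        } else {
            None
        }
    }
}

/// Planner that handles built-in nodes itself and delegates extension nodes.
struct PhysicalPlanner {
    extension_planners: Vec<Arc<dyn ExtensionPlanner>>,
}

impl PhysicalPlanner {
    fn create_physical_plan(&self, plan: &LogicalPlan) -> Result<PhysicalPlan, String> {
        match plan {
            LogicalPlan::Scan { table } => Ok(PhysicalPlan::ScanExec { table: table.clone() }),
            LogicalPlan::Extension { name, input } => {
                // Plan the child first, then ask each registered hook in order.
                let child = self.create_physical_plan(input)?;
                for planner in &self.extension_planners {
                    if let Some(exec) = planner.plan_extension(name, &child) {
                        return Ok(exec);
                    }
                }
                Err(format!("no planner registered for extension node '{}'", name))
            }
        }
    }
}
```

   The design choice sketched here is that unknown extension nodes produce an error instead of being silently dropped, so a missing registration surfaces at planning time.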
   
   I am sorry for the relatively large PR -- most of the code is the new 
example with comments. 
   
   I can also break this into a few smaller PRs if that would be easier (the 
changes to `ExecutionConfig` and `DefaultPhysicalPlanner` could naturally be 
broken out).
   



