[GitHub] [arrow-datafusion] isidentical commented on issue #462: Add support for recursive CTEs

GitBox Thu, 06 Oct 2022 13:07:45 -0700


isidentical commented on issue #462:
URL: 
https://github.com/apache/arrow-datafusion/issues/462#issuecomment-1270620697


   I was thinking about how we can split this work. An initial plan might look
like the one below, assuming there are no objections to landing
`ContinuanceStream` as its own patch (if it sounds better, the first two steps
can also be combined).
   
   ## Possible roadmap?
   
   - [ ] Add continuance streams (a "working table" operation for DataFusion 
that actually uses streams under the hood).
   
   The implementation is self-contained enough that I think it could be split
out (with tests). It would include the
`push_relation_handler`/`pop_relation_handler` piece in task contexts, as well
as the implementation of the physical operation. The only question is whether
it is fine to add a new physical operation that has no immediate use.
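
   To make the handler idea concrete, here is a rough sketch of how a named
working table could be wired up through a context object. Everything in it is
an assumption for illustration: `ContextSketch` is a hypothetical stand-in for
the task context, the signatures of `push_relation_handler` and
`pop_relation_handler` are guesses, and a `std::sync::mpsc` channel stands in
for an async record batch stream.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical stand-in for a task context; not DataFusion's real type.
struct ContextSketch<T> {
    // One pending receiver per named working table.
    handlers: HashMap<String, Receiver<T>>,
}

impl<T> ContextSketch<T> {
    fn new() -> Self {
        ContextSketch { handlers: HashMap::new() }
    }

    /// Assumed signature: registers a working table under `name` and returns
    /// the sending half, which the recursive operator would feed rows into.
    fn push_relation_handler(&mut self, name: &str) -> Sender<T> {
        let (tx, rx) = channel();
        self.handlers.insert(name.to_string(), rx);
        tx
    }

    /// Assumed signature: removes and returns the receiving half, which the
    /// working-table operator would drain as its stream of rows.
    fn pop_relation_handler(&mut self, name: &str) -> Option<Receiver<T>> {
        self.handlers.remove(name)
    }
}

fn main() {
    let mut ctx = ContextSketch::new();
    let tx = ctx.push_relation_handler("nums");
    let rx = ctx.pop_relation_handler("nums").expect("handler was registered");

    tx.send(1).unwrap();
    tx.send(2).unwrap();
    drop(tx); // dropping the sender ends the stream

    let rows: Vec<i32> = rx.iter().collect();
    assert_eq!(rows, vec![1, 2]);
    // The handler is consumed once it has been popped.
    assert!(ctx.pop_relation_handler("nums").is_none());
}
```

   The point of the sketch is only that the operator can be exercised in
isolation, by pushing rows in from a test rather than from a recursive query.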
   
   - [ ] Implement recursive queries (as both a physical and a logical
operation).
   
   This would be a sizable change that implements the initial piece of logic
(without distinct), where we keep executing the query until a certain
condition has been met. It would also include the new logical operations
(`RecursiveQuery` and `NamedRelation`) and the first actual usage of the
continuance streams.
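
   As a rough illustration of the "execute until a condition is met" loop,
here is a minimal sketch of `UNION ALL`-style recursion over plain vectors.
`run_recursive` and its parameters are hypothetical names, not part of the
PR, and the stop condition is assumed to be "an iteration produced no rows".

```rust
/// Hypothetical helper, not DataFusion API: evaluates a recursive query with
/// `UNION ALL` semantics over vectors instead of record batch streams.
fn run_recursive<T>(
    static_term: Vec<T>,
    recursive_term: impl Fn(&[T]) -> Vec<T>,
) -> Vec<T> {
    let mut output = Vec::new();
    // The working table starts as the result of the static term.
    let mut working_table = static_term;
    // Assumed stop condition: an iteration that produces no rows.
    while !working_table.is_empty() {
        // The recursive term only sees the rows of the previous iteration.
        let next = recursive_term(&working_table);
        output.extend(working_table);
        working_table = next;
    }
    output
}

fn main() {
    // Roughly: WITH RECURSIVE nums AS (
    //   SELECT 1 AS n UNION ALL SELECT n + 1 FROM nums WHERE n < 5
    // ) SELECT * FROM nums
    let rows = run_recursive(vec![1], |prev| {
        prev.iter().filter(|n| **n < 5).map(|n| n + 1).collect()
    });
    assert_eq!(rows, vec![1, 2, 3, 4, 5]);
}
```

   Note that without deduplication this loop never terminates if the
recursive term keeps producing rows, for example on cyclic data.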
   
   - [ ] Enable SQL planning
   
   The SQL side of the implementation is completely decoupled from the actual
logical/physical representation, so I think it can be added last. The
algorithm basically plans against a temporary CTE and then replaces it with
the original form; more details are in the main PR.
   
   - [ ] Start supporting `UNION`
   
   This would require us to record which values we have already collected
(probably not direct references, but hashes), and it would be a bit less
efficient than the `UNION ALL` solution.
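
   A minimal sketch of the hash-tracking idea, under the assumption that a
row is dropped when its hash has been seen in any earlier iteration.
`hash_row` and `run_recursive_distinct` are hypothetical names, and rows are
plain values rather than record batches.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: reduces a row to a 64-bit hash.
fn hash_row<T: Hash>(row: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    row.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical helper, not DataFusion API: `UNION`-style recursion that
/// keeps only row hashes, not the rows themselves, to detect duplicates.
/// Caveat: storing bare hashes means a collision would drop a distinct row.
fn run_recursive_distinct<T: Hash>(
    static_term: Vec<T>,
    recursive_term: impl Fn(&[T]) -> Vec<T>,
) -> Vec<T> {
    let mut seen: HashSet<u64> = HashSet::new();
    let mut output = Vec::new();
    // `HashSet::insert` returns false for an already-present hash.
    let mut working_table: Vec<T> = static_term
        .into_iter()
        .filter(|row| seen.insert(hash_row(row)))
        .collect();
    while !working_table.is_empty() {
        let next: Vec<T> = recursive_term(&working_table)
            .into_iter()
            .filter(|row| seen.insert(hash_row(row)))
            .collect();
        output.extend(working_table);
        working_table = next;
    }
    output
}

fn main() {
    // Nodes of a cycle 1 -> 2 -> 3 -> 1; `UNION ALL` would loop forever here.
    let rows = run_recursive_distinct(vec![1], |prev| {
        prev.iter().map(|n| n % 3 + 1).collect()
    });
    assert_eq!(rows, vec![1, 2, 3]);
}
```

   The extra hashing and set lookups on every row are where the cost
relative to `UNION ALL` comes from.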


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
