A few thoughts:

Algebra approach.


If I understand the model correctly, I actually implemented this in a
different project last year (not knowing it was now a design pattern). My
main finding was that while it had certain benefits, it was pretty
confusing for most people.

I am planning to go again over the extensions of RelVisitor/RelShuttle in
> Calcite and Hive and see if there are other changes that may make sense to
> incorporate in the new design.
>

This would be very helpful. I hope others will do the same. I hate
introducing new apis at this level so let's make sure we're accomplishing
support of enough key use cases to make it worth it.

use default methods to avoid forcing everybody to implement if not
> necessary;


Agreed.


> base dispatching on abstract relational expressions and not logical ones.


Agreed again. A wish (maybe difficult without making a big mess) would
ideally one could do something similar to how the relmetadataquery handlers
work where things are priority bound (from specific to generic). For
example, I may want to handle LogicalProject in a special way that is
different than the base class of Project (imagine a tree that might be
mixing multiple conventions or have multiple variations of a project e.g. a
columnar project and a row-wise project).

Another wish that the algebra model seems to hit on: have a clean pattern
for new kinds of relnodes outside the core. Let's say I want to make a
visitor that handles DruidAggregateNode in a special way. Is there a clean
way to extend traversal to work with that node type cleanly?



On Sat, Dec 4, 2021 at 2:53 PM Stamatis Zampetakis <zabe...@gmail.com>
wrote:

> The visitor pattern is not something written in stone. We all have a
> general understanding of what a visitor is and how it is used but there are
> also widely used variations.
>
> The following snippet (referring to the visitor pattern) is taken from
> "Design Patterns: Elements of Reusable Object-Oriented Software", the first
> reference of Wikipedia and a book I like a lot.
>
> "Who is responsible for traversing the object structure? A visitor must
> visit each
> element of the object structure. The question is, how does it get there? We
> can
> put responsibility for traversal in any of three places: in the object
> structure,
> in the visitor, or in a separate iterator object (see Iterator (257))."
>
> There is not really right or wrong here; there are alternatives with
> different advantages/disadvantages.
>
> I like the idea of having a generic return type for the shuttle, I think it
> is useful.
> I think we should take the opportunity to see how we can make the new
> API useful for the vast majority of use-cases.
> Ideally, I would like to see a solution that can effectively replace both
> RelVisitor and RelShuttle interfaces.
> I am not saying remove RelVisitor, and RelShuttle directly but it would be
> nice if we were able to deprecate them and remove them in a few releases
> by providing a better alternative.
>
> Both RelVisitor, and RelShuttle, have been in the code base for quite some
> time now and they have many extensions both in the Calcite repo and in
> other projects using Calcite.
> I would suggest spending a bit of time on these extensions and see what
> other makes sense to introduce in the new API.
> I am planning to go again over the extensions of RelVisitor/RelShuttle in
> Calcite and Hive and see if there are other changes that may make sense to
> incorporate in the new design.
> I would encourage others to do the same.
>
> A few things that I had in mind at some point regarding the design of the
> visitor are:
> use default methods to avoid forcing everybody to implement if not
> necessary;
> do not rely on method overloading;
> base dispatching on abstract relational expressions and not logical ones.
>
> Example
>
> Instead of:
>
> RelNode visit(LogicalFilter filter);
>
> use:
>
> default <T extends Filter> R visitFilter(T filter) {
>     return null;
> }
>
> and other similar things but in order to get this right I think we have to
> work out some concrete examples and rework existing use-cases.
>
> The object-algebra approach shared by Vladimir sounds interesting but again
> we should evaluate the existing use-cases and see what we really need.
>
> Many thanks Jacques for taking this initiative, and all the rest for the
> valuable feedback. I really think it is worth pushing this forward and
> getting feedback is important to get this right.
>
> Best,
> Stamatis
>
>
> On Fri, Dec 3, 2021 at 10:30 PM James Starr <jamesst...@gmail.com> wrote:
>
> > What if RelNode.accept(RelShuttle) was deprecated and the RelShuttleImpl
> > had an if(node instanceof Logical*Node) instead?
> >
> > I am against further enhancing RelNode.accept(RelShuttle), because, as
> > Vladimir pointed out, it does not actually follow a visitor pattern(aka
> > walking the tree) and fixing that could be a breaking change.  I would
> > prefer if the tree traversal and the dispatch was independent of the
> > RelNode implementation.  This would help with loose coupling so we do not
> > have email threads talking visit patterns when people like want to do
> > something rather simple.
> >
> > James
> >
> > On Fri, Dec 3, 2021 at 1:11 PM Jacques Nadeau <jacq...@apache.org>
> wrote:
> >
> > > I think this thread has forked into two distinct topics:
> > >
> > >    1. A general discussion of the utility of visitors
> > >    2. Support for generic return type for RelShuttle (the original
> > purpose
> > >    of the thread)
> > >
> > >
> > > My quick thoughts on #1: I've historically used the visitors in Calcite
> > > extensively (external to Calcite and typically not using RelShuttleImpl
> > > which requires an instance per user/thread due to internal use of a
> > deque.
> > > I don't support the removal of visitors or the current impl of accept
> > from
> > > RelNode. I think that would be disruptive to the community.
> > >
> > > For #2: It seems like most people are supportive of this kind of
> pattern
> > > assuming it doesn't impact compatibility. I 100% agree on compat and am
> > > exploring the best way to actually expose this safely. The problem I
> > > haven't yet found a great solution for is if someone has built a new
> > > relnode type that overrides the default RelShuttle based accept
> methods.
> > In
> > > this situation, to maintain compatibility, you can't actually delegate
> > the
> > > relshuttle use to the generic method (as you would skip over the
> > > customization). The options I've identified:
> > >
> > > a) Have a (* instanceof RelShuttle) check in the generic path that
> routes
> > > to the specific accept methods.
> > > b) Use reflection to statically determine if a particular relnode class
> > has
> > > a relsthuttle based override and then route to it in those
> circumstances
> > > (similar to (a) but more complex and likely slightly more performant).
> > > c) Don't try to collapse the existing RelShuttle into a new pattern and
> > > instead just introduce a new pattern independently.
> > > d) Start with a or b but also deprecate the RelShuttle specific
> 'accept'
> > > methods. Then in a subsequent release, remove the RelShuttle specific
> > > methods entirely.
> > >
> > > I'm currently leaning towards starting with (a) since it is the easiest
> > for
> > > users to understand. However, I'd love to hear if others have better
> > ideas
> > > or opinions on these existing ideas.
> > >
> > > Thanks!
> > > Jacques
> > >
> > >
> > >
> > >
> > > On Thu, Dec 2, 2021 at 10:22 AM Vladimir Sitnikov <
> > > sitnikov.vladi...@gmail.com> wrote:
> > >
> > > > Steven>Agree with James, and that's not even including
> implementations
> > in
> > > > other
> > > > Steven>codebases that use calcite (e.g. there are dozens of
> > > implementations
> > > > in
> > > > Steven>Dremio's codebase).
> > > >
> > > > Can you please show RelShuttle in Dremio codebase?
> > > > I see no RelShuttle implementations:
> > > > https://github.com/dremio/dremio-oss/search?q=relshuttle
> > > >
> > > > Note: RelShuttleImpl does not count since RelShuttleImpl does not
> > really
> > > > need RelNode cooperation.
> > > > RelShuttleImpl just enumerates children of the relations, and it does
> > not
> > > > rely on accept(..) calls.
> > > >
> > > > It can do the very same thing without calling accept(this), so we can
> > > > remove  RelNode#accept(RelNode) methods,
> > > > and we can still get the same RelShuttleImpl.
> > > > It can have a series of if (... instanceof) checks or a
> > > Map#get(getClass())
> > > > or even https://github.com/forax/exotic#visitor---javadoc
> > > >
> > > > Vladimir
> > > >
> > >
> >
>

Reply via email to