You need to complete validation in order to confirm that all tables are
real. For example, validation includes clarifying ambiguities (since users
can rely on implicit resolution but rel trees cannot).

For #2: this is a really hard thing to do with Calcite. We did this in
Apache Drill by basically watching for field and table name requests to the
catalog interface and automatically adding them but it's relatively
brittle. You can see the entry to this around the code here [1] (and the
type holder this class uses).

For #1: If you really want to do this, walking the AST is the best option.
That being said, I don't think it makes sense to add this as a public
interface to the parser. There are too many places where it won't make
sense (or work inconsistently, confusing consumers). The first place I
would do it would be once you have a validated tree since the validated
tree has identified problems and clarified ambiguities. This would be one
step before converting to rel (which seems useful in several cases). That
being said, you'd also get some weird externalities even at that step.
Views are not converted into plans until SqlToRel which means you'd only
get direct tables referenced, not tables referenced through a view. You
have to expand all the views to get the complete set (thus the need for a
rel tree for the utility method).


[1]
https://github.com/apache/drill/blob/fd836a2a6e933c294a53271f7e9b6ba3fea7cc8d/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DynamicDrillTable.java

On Wed, Apr 13, 2022 at 1:11 AM Yanjing Wang <[email protected]>
wrote:

> Hi Alessandro,
>
> Thanks for your feedback, the RelOptUtil#findAllTables requires its input
> is a relNode, But many times we need extract all tables before validation,
> this has the following advantages.
>
> 1. no metadata need and fast.
> 2. we can compose a on-demand catalog, rather get all schemas, tables to
> match.
>
> Alessandro Solimando <[email protected]> 于2022年4月13日周三
> 16:04写道:
>
> > Hi Yanjing,
> >
> > I found a discussion in the ML suggesting to use RelOptUtil#findAllTables
> > to get all tables involved in a query:
> > https://lists.apache.org/thread/wcfv2bhdytwsoyc3qlsnxcd7nmnnlb06
> >
> > Can you check if that's what you want?
> >
> > Best regards,
> > Alessandro
> >
> > On Wed, 13 Apr 2022 at 09:33, Yanjing Wang <[email protected]>
> > wrote:
> >
> > > Hi Commnunity,
> > >
> > > In real scenarios, many developers want to get what tables a query
> > > statement contains.
> > >
> > > such as 'WITH t as (SELECT * FROM t1) SELECT * FROM t, t2', it only
> > > contains table t1 and t2.
> > >
> > > I tried to visit the SqlNode and found this solution is very
> complicated
> > > and has very many checks, otherwise easy to get in parser.
> > > such as add one line 'involvedTables.add(nameList);'  in
> > > 'CompoundTableIdentifier' production rule.
> > >
> > > I know that this isn't the duty of parser, but I think it's the easiest
> > way
> > > to get the tables, I want to add this functionality to parser, how do
> you
> > > think?
> > >
> >
>

Reply via email to