Thanks Julian, I have looked into using Table Functions in Drill. I had to make some modifications to the planner so that the function lookup in the Storage plugin works. I will submit a patch for that.
I had a few questions:

*1)* For this particular use case it seems that we could use TableMacro, as all the logic can happen in the planner. Should I look into that?
- Drill Schema returns a DrillTable (which implements Table).
- A TableMacro returns a TranslatableTable.
- It is not clear to me what a TableFunction returns, as it defines only methods that return types.

Ideally I'd like to produce a DrillTable, as getTable in Schema does. The only difference from getTable is that we would use the function parameters when producing the table.

For reference, Drill's getTable is here:
https://github.com/apache/drill/blob/bb69f2202ed6115b39bd8681e59c6ff6091e9b9e/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java#L235
It indirectly calls:
https://github.com/apache/drill/blob/bb69f2202ed6115b39bd8681e59c6ff6091e9b9e/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java#L317

*2)* The getFunctions method in Schema does not seem to be aware of the context it is called in. I would want to return different functions depending on where we are in the query (table functions in the FROM clause, regular functions in WHERE). Is there a way to know whether we are in the context of a FROM or a WHERE clause?

*3)* Is the table(...) wrapping syntax necessary?
Note:
- In Drill, backticks are used for identifiers containing a dot or slash, like the path to a file used as a table name: dfs.`/path/to/file.ext`
- Single quotes are used to delimit strings: 'my string passed as a parameter'

The current syntax is something like:
  select * from table(dfs.delimitedFile(path => '/path/to/file', delimiter => '|'))
  select * from table(dfs.`/path/to/file`(type => 'text', delimiter => '|'))
  select * from table(dfs.`/path/to/file`(type => 'json'))
It seems that table(...) is redundant since we are in the FROM clause.
It could simply be:
  select * from dfs.delimitedFile(path => '/path/to/file', delimiter => '|')
  select * from dfs.`/path/to/file`(type => 'text', delimiter => '|')
  select * from dfs.`/path/to/file`(type => 'json')

*4)* Can a table be a parameter? If yes, how do we declare a table parameter? (Note the backticks instead of single quotes.)
  select * from dfs.delimitedFile(table => dfs.`/path/to/file`, delimiter => '|')

Thank you!
Julien

On Sun, Nov 1, 2015 at 8:54 AM, Julian Hyde <[email protected]> wrote:

> On Sun, Oct 25, 2015 at 10:13 PM, Jacques Nadeau <[email protected]> wrote:
> > Agreed. We need both select with option and .drill (by etl process or by
> > sql ascribe metadata).
> >
> > Let's start with the select with options. My only goal would be to make
> > sure that creation of .drill file through SQL uses a similar pattern to the
> > select with options. It is also important that tables names are still
> > expressed as identifiers instead of strings (people already have enough
> > trouble with remembering whether to use single quotes or backticks). If the
> > table function approach is everybody's preferred approach, I think it is
> > important to have named parameters per Julian's notes.
> >
> > @Julian, how hard do you think it will be to add named parameters?
>
> I just checked in a fix for
> https://issues.apache.org/jira/browse/CALCITE-941. Check it out.
>

-- Julien
