Here you go: https://issues.apache.org/jira/browse/CALCITE-967 I was planning on providing patch for both master and the fork, but I haven't started yet.
On Thu, Nov 12, 2015 at 8:34 PM, Julian Hyde <[email protected]> wrote: > You’re hitting the grammar ambiguity I expected. > > I think that base Calcite should require the full verbose syntax: the > TABLE keyword for table functions and the EXTEND keyword for extends > clauses. Then Drill can override to make TABLE optional, and Phoenix can > override to make EXTEND optional. > > Are you changing the parser in your forked copy of Calcite, or are you > changing Drill’s extensions to that parser? > > If the former, you (or I) should add extension points to Calcite’s parser > make the TABLE keyword optional and to make the EXTEND keyword optional. No > project should enable both extension points — otherwise they’ll end up with > an ambiguous grammar. If you agree create a Calcite JIRA case for this. > > Julian > > > > On Nov 11, 2015, at 1:55 PM, Julien Le Dem <[email protected]> wrote: > > > > Hi, > > I've been trying to enable this but it looks like in the current grammar > > (before my change) you can not use table functions and EXTEND together. > > That's because they are on difference branches of an | in the grammar. > > So I would suggest that we treat those as two separate improvement in two > > different pull requests: > > - not require table(...) to call table functions > > - allow using table functions and extend together. > > Does it make sense? > > Julien > > > > > > On Tue, Nov 10, 2015 at 12:51 PM, Julian Hyde <[email protected]> wrote: > > > >> To be clear, it should be possible to use a table function with all of > >> the options -- EXTENDS clause, OVER clause, AS with alias and column > >> aliases, TABLESAMPLE. > >> > >> I'm surprised that the parser didn't need more lookahead to choose > >> between 't (x, y)' and 't (x INTEGER, y DATE)'. > >> > >> On Tue, Nov 10, 2015 at 12:28 PM, Julien Le Dem <[email protected]> > wrote: > >>> In the patch I just sent, probably not. > >>> I will adjust it and add the corresponding test. > >>> > >>> On Tue, Nov 10, 2015 at 11:51 AM, Julian Hyde <[email protected]> > wrote: > >>> > >>>> Can you use both together? Say > >>>> > >>>> select columns > >>>> from dfs.`/path/to/myfile`(type => 'TEXT', fieldDelimiter => '|’) > >> EXTEND > >>>> (foo INTEGER) > >>>> > >>>> Julian > >>>> > >>>> > >>>> > >>>>> On Nov 10, 2015, at 10:51 AM, Julien Le Dem <[email protected]> > >> wrote: > >>>>> > >>>>> I took a stab at adding the TableFunction syntax without table(...) > in > >>>>> Calcite. > >>>>> I have verified that both the table function and extend (with or > >> without > >>>>> keyword) work > >>>>> > >>>> > >> > https://github.com/julienledem/calcite/commit/b18f335c49e273294c2d475e359c610aaed3da34 > >>>>> > >>>>> These work: > >>>>> > >>>>> select columns from dfs.`/path/to/myfile`(type => 'TEXT', > >> fieldDelimiter > >>>> => > >>>>> '|') > >>>>> > >>>>> select columns from table(dfs.`/path/to/myfile`(type => 'TEXT', > >>>>> fieldDelimiter => '|')) > >>>>> > >>>>> select columns from table(dfs.`/path/to/myfile`('JSON')) > >>>>> > >>>>> select columns from dfs.`/path/to/myfile`('JSON') > >>>>> > >>>>> select columns from dfs.`/path/to/myfile`(type => 'JSON') > >>>>> > >>>>> On Sat, Nov 7, 2015 at 5:15 PM, Jacques Nadeau <[email protected]> > >>>> wrote: > >>>>> > >>>>>> Drill does implicitly what Phoenix does explicitly so I don't think > >> we > >>>>>> should constrain ourselves to having a union of the two syntaxes. > >>>>>> > >>>>>> > >>>>>> That being said, I think we could make these work together... maybe. > >>>>>> > >>>>>> Remove the EXTENDS without keyword syntax from the grammar. > >>>>>> > >>>>>> Create a new sub block in the table block that requires no keyword. > >>>> There > >>>>>> would be two paths (and would probably require some lookahead) > >>>>>> > >>>>>> option 1> unnamed parameters (1,2,3) > >>>>>> option 2> named parameters (a => 1, b=>2, c=> 3) > >>>>>> option 3> create table field pattern (favoriteBand VARCHAR(100), > >>>>>> golfHandicap INTEGER) > >>>>>> > >>>>>> Then we create a table function with options 1 & 2, an EXTENDS > clause > >>>> for > >>>>>> option 3. > >>>>>> > >>>>>> Best of both worlds? > >>>>>> > >>>>>> On Sat, Nov 7, 2015 at 4:44 PM, James Taylor < > [email protected] > >>> > >>>>>> wrote: > >>>>>> > >>>>>>> Phoenix already supports columns at read-time using the syntax > >> without > >>>>>> the > >>>>>>> EXTENDS keyword as Julian indicated: > >>>>>>> SELECT * FROM Emp (favoriteBand VARCHAR(100), golfHandicap > >> INTEGER) > >>>>>>> WHERE goldHandicap < 10; > >>>>>>> > >>>>>>> Changing this by requiring the EXTENDS keyword would create a > >> backward > >>>>>>> compatibility problem. > >>>>>>> > >>>>>>> I think it'd be good if both of these extensions worked in Drill & > >>>>>> Phoenix > >>>>>>> given our Drillix initiative. > >>>>>>> > >>>>>>> On Sat, Nov 7, 2015 at 3:34 PM, Jacques Nadeau <[email protected] > > > >>>>>> wrote: > >>>>>>> > >>>>>>>> My proposal was an a or b using the freemarker template in the > >>>> grammar, > >>>>>>>> not something later. > >>>>>>>> > >>>>>>>> Actually, put another way: we may want to consider stating that we > >>>> only > >>>>>>>> incorporate SQL standards in our primary grammar. Any extensions > >>>> should > >>>>>>> be > >>>>>>>> optional grammar. We could simply have grammar plugins in Calcite > >> (the > >>>>>>> same > >>>>>>>> way we plug in external things in Drill). > >>>>>>>> > >>>>>>>> Trying to get every project to agree on extensions seems like it > >> may > >>>> be > >>>>>>>> hard. > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> -- > >>>>>>>> Jacques Nadeau > >>>>>>>> CTO and Co-Founder, Dremio > >>>>>>>> > >>>>>>>> On Sat, Nov 7, 2015 at 2:45 PM, Julian Hyde <[email protected]> > >> wrote: > >>>>>>>> > >>>>>>>>> I can see why Jacques wants this syntax. > >>>>>>>>> > >>>>>>>>> However a “switch" in a grammar is a bad idea. Grammars need to > be > >>>>>>>>> predictable. Any variation should happen at validation time, or > >>>> later. > >>>>>>>>> > >>>>>>>>> Also, we shouldn’t add configuration parameters as a way of > >> avoiding > >>>> a > >>>>>>>>> tough design discussion. > >>>>>>>>> > >>>>>>>>> EXTENDS and eliding TABLE are both extensions to standard SQL, > and > >>>>>> they > >>>>>>>>> are both applicable to Drill and Phoenix. I think Drill and > >> Phoenix > >>>>>> (by > >>>>>>>>> which I mean Jacques and James, I guess) need to agree what the > >> SQL > >>>>>>> syntax > >>>>>>>>> should be. > >>>>>>>>> > >>>>>>>>> Julian > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> On Nov 7, 2015, at 10:40 AM, Jim Scott <[email protected]> > >> wrote: > >>>>>>>>>> > >>>>>>>>>> Looking at those two examples I agree with Jacques. The first > >>>>>> appears > >>>>>>>>> more > >>>>>>>>>> like a hint from the syntactic sugar point of view. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Fri, Nov 6, 2015 at 11:53 PM, Jacques Nadeau < > >> [email protected] > >>>>>>> > >>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> Since EXTEND is custom functionality, it seems reasonable that > >> we > >>>>>>> could > >>>>>>>>>>> have a switch. Given that SQL Server and Postgres support it > >> seems > >>>>>>>>>>> reasonable to support the table functions without the TABLE > >> syntax. > >>>>>>>>>>> > >>>>>>>>>>> I for one definitely think the TABLE syntax is much more > >> confusing > >>>>>> to > >>>>>>>>> use, > >>>>>>>>>>> especially in the example that we're looking to support, such > >> as: > >>>>>>>>>>> > >>>>>>>>>>> select * from dfs.`/myfolder/mytable` (type => 'CSV', > >>>>>> fieldDelimiter > >>>>>>> => > >>>>>>>>>>> '|', skipFirstRow => true) > >>>>>>>>>>> > >>>>>>>>>>> This seems much clearer than: > >>>>>>>>>>> > >>>>>>>>>>> select * from TABLE(dfs.`/myfolder/mytable` (type => 'CSV', > >>>>>>>>> fieldDelimiter > >>>>>>>>>>> => '|', skipFirstRow => true)) > >>>>>>>>>>> > >>>>>>>>>>> It also looks much more like a hint to the table (which is our > >>>>>> goal). > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> -- > >>>>>>>>>>> Jacques Nadeau > >>>>>>>>>>> CTO and Co-Founder, Dremio > >>>>>>>>>>> > >>>>>>>>>>> On Fri, Nov 6, 2015 at 9:15 PM, Julian Hyde <[email protected]> > >>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> Thanks for doing the legwork and finding what the other > vendors > >>>>>> do. > >>>>>>>>> It is > >>>>>>>>>>>> indeed compelling that SQL Server and Postgres go beyond the > >>>>>>> standard > >>>>>>>>> an > >>>>>>>>>>>> make the TABLE keyword optional. > >>>>>>>>>>>> > >>>>>>>>>>>> I tried that syntax in Calcite and discovered that there is a > >>>>>> clash > >>>>>>>>> with > >>>>>>>>>>>> one of our own (few) extensions. In > >>>>>>>>>>>> https://issues.apache.org/jira/browse/CALCITE-493 we added > the > >>>>>>>>> EXTENDS > >>>>>>>>>>>> clause. You can write > >>>>>>>>>>>> > >>>>>>>>>>>> SELECT * > >>>>>>>>>>>> FROM Emp EXTEND (favoriteBand VARCHAR(100), golfHandicap > >> INTEGER) > >>>>>>>>>>>> WHERE goldHandicap < 10; > >>>>>>>>>>>> > >>>>>>>>>>>> to tell Calcite that there are two undeclared columns in the > >> Emp > >>>>>>> table > >>>>>>>>>>> but > >>>>>>>>>>>> you would like to use them in this particular query. We chose > >> to > >>>>>>> make > >>>>>>>>> the > >>>>>>>>>>>> EXTEND keyword optional, so you could instead write > >>>>>>>>>>>> > >>>>>>>>>>>> SELECT * > >>>>>>>>>>>> FROM Emp (favoriteBand VARCHAR(100), golfHandicap INTEGER) > >>>>>>>>>>>> WHERE goldHandicap < 10; > >>>>>>>>>>>> > >>>>>>>>>>>> That is uncomfortably close to > >>>>>>>>>>>> > >>>>>>>>>>>> SELECT * > >>>>>>>>>>>> FROM EmpFunction (favoriteBand, golfHandicap); > >>>>>>>>>>>> > >>>>>>>>>>>> so we would require > >>>>>>>>>>>> > >>>>>>>>>>>> SELECT * > >>>>>>>>>>>> FROM TABLE(EmpFunction (favoriteBand, golfHandicap)); > >>>>>>>>>>>> > >>>>>>>>>>>> if EmpFunction was a table-function. You could combine the two > >>>>>> forms > >>>>>>>>> like > >>>>>>>>>>>> this: > >>>>>>>>>>>> > >>>>>>>>>>>> SELECT * > >>>>>>>>>>>> FROM TABLE(EmpFunction (favoriteBand, golfHandicap)) EXTEND > >>>>>>>>>>>> (anotherAttribute INTEGER); > >>>>>>>>>>>> > >>>>>>>>>>>> We could revisit whether EXTEND is optional, I suppose. But we > >>>>>>> should > >>>>>>>>>>> also > >>>>>>>>>>>> ask whether requiring folks to type TABLE is such a hardship. > >>>>>>>>>>>> > >>>>>>>>>>>> Julian > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>>> On Nov 6, 2015, at 2:20 PM, Julien Le Dem <[email protected] > > > >>>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>> - Table function syntax: I did a quick search and it seems > >>>>>> there's > >>>>>>> no > >>>>>>>>>>>>> consensus about this. > >>>>>>>>>>>>> It seems that Posgres [1] and SQL Server [2] both allow > >> calling > >>>>>>> table > >>>>>>>>>>>>> functions without the table(...) wrapper while Oracle [3] and > >> DB2 > >>>>>>> [4] > >>>>>>>>>>>>> expect it. > >>>>>>>>>>>>> MySQL does not have table functions [5] > >>>>>>>>>>>>> 2 for, 2 against and 1 undecided: that's a draw :) > >>>>>>>>>>>>> Would it be reasonable to allow a switch in the grammar > >>>>>> generation > >>>>>>> to > >>>>>>>>>>>> have > >>>>>>>>>>>>> a posgres compatible syntax? Currently in Drill we use the > >> MySQL > >>>>>>> like > >>>>>>>>>>>>> syntax (back ticks for identifiers etc) > >>>>>>>>>>>>> > >>>>>>>>>>>>> [1] > >>>>>>>>>>> > >>>>>> http://www.postgresql.org/docs/7.3/static/xfunc-tablefunctions.html > >>>>>>>>>>>>> [2] > >>>>>>>>>>> > >>>>>> https://technet.microsoft.com/en-us/library/aa214485(v=sql.80).aspx > >>>>>>>>>>>>> [3] > >>>>>>> https://oracle-base.com/articles/misc/pipelined-table-functions > >>>>>>>>>>>>> [4] > >>>>>>> http://www.ibm.com/developerworks/ibmi/library/i-power-of-udtf/ > >>>>>>>>>>>>> [5] > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>>>> > >>>> > >> > http://stackoverflow.com/questions/12163666/mysql-function-to-return-a-table > >>>>>>>>>>>>> > >>>>>>>>>>>>> - It seems a simple change in SqlCallBinding fixes the > >> function > >>>>>>>>>>>>> overloading: > https://github.com/apache/calcite/pull/166/files > >>>>>>>>>>>>> But that seems too easy to be true. Possibly this method is > >>>>>> called > >>>>>>>>> more > >>>>>>>>>>>>> than once (before and after the function has been resolved?) > >>>>>>>>>>>>> > >>>>>>>>>>>>> FYI this would happen only when using named parameter. We do > >> want > >>>>>>> to > >>>>>>>>>>>>> overload in this case, which is why I'm looking into it. > >>>>>>>>>>>>> > >>>>>>>>>>>>> I'll fill a JIRA for my other branch > >>>>>>>>>>>>> > >>>>>>>>>>>>> Julien > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Thu, Nov 5, 2015 at 5:39 PM, Julian Hyde < > [email protected] > >>> > >>>>>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Nov 5, 2015, at 5:00 PM, Julien Le Dem < > [email protected] > >>> > >>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> TL;DR: TableMacro works for me; I need help with a bug in > >>>>>> Calcite > >>>>>>>>> when > >>>>>>>>>>>>>> there's more than 1 function with the same name. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Yes; see below. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> FYI: I have a prototype of TableMacro working in Drill. For > >> now > >>>>>>> just > >>>>>>>>>>>> being > >>>>>>>>>>>>>> able to specify the delimiter for csv files. > >>>>>>>>>>>>>> So it seem the answer to my question 1) is that TableMacros > >> are > >>>>>>> the > >>>>>>>>>>> way > >>>>>>>>>>>> to > >>>>>>>>>>>>>> go. > >>>>>>>>>>>>>> I'm still wondering about *3) is the table(...) wrapping > >> syntax > >>>>>>>>>>>>>> necessary?* > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Consider: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> select * from myTable as f(x, y) > >>>>>>>>>>>>>> select * from myTable f(x, y) > >>>>>>>>>>>>>> select * from myFunction(x, y) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> #1 and #2 mean the same thing; #2 and #3 look awfully > >> similar. > >>>>>>> Also, > >>>>>>>>>>> if > >>>>>>>>>>>> f > >>>>>>>>>>>>>> is a function with zero arguments, could you invoke it like > >>>>>> this?: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> select * from f > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I don’t know the actual rationale. But I know that the SQL > >>>>>>> standards > >>>>>>>>>>>>>> people in their wisdom decided to add a keyword to > >> disambiguate. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I had to fix some things in Calcite to enable this: > >>>>>>>>>>>>>> https://github.com/dremio/calcite/pull/1/files > >>>>>>>>>>>>>> Drill uses Frameworks.getPlanner() that does not seem to be > >> used > >>>>>>> in > >>>>>>>>>>>>>> Calcite for the Maze example. > >>>>>>>>>>>>>> Which is why some hooks were missing. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Can you log a jira case to track this bug? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I think I found a bug in Calcite but I'd need help to fix > it. > >>>>>>>>>>>>>> Here is a test that reproduces the problem: > >>>>>>>>>>>>>> https://github.com/apache/calcite/pull/166 > >>>>>>>>>>>>>> If we return more than 1 TableFunction with the same name, > we > >>>>>> get > >>>>>>> a > >>>>>>>>>>> NPE > >>>>>>>>>>>>>> later on. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Yes, I knew there was a problem with overloading. Please log > >> a > >>>>>>> JIRA > >>>>>>>>>>> case > >>>>>>>>>>>>>> on resolution of overloaded functions when invoked with > named > >>>>>>>>>>> arguments. > >>>>>>>>>>>>>> (It probably applies to all functions, not just table > >>>>>> functions.) > >>>>>>>>> The > >>>>>>>>>>>> fix > >>>>>>>>>>>>>> will take a while (if you wait for me to write it). > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> For now please tell your users not to overload. :) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Julian > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> -- > >>>>>>>>>>>>> Julien > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> -- > >>>>>>>>>> *Jim Scott* > >>>>>>>>>> Director, Enterprise Strategy & Architecture > >>>>>>>>>> +1 (347) 746-9281 > >>>>>>>>>> @kingmesal <https://twitter.com/kingmesal> > >>>>>>>>>> > >>>>>>>>>> <http://www.mapr.com/> > >>>>>>>>>> [image: MapR Technologies] <http://www.mapr.com> > >>>>>>>>>> > >>>>>>>>>> Now Available - Free Hadoop On-Demand Training > >>>>>>>>>> < > >>>>>>>>> > >>>>>>> > >>>>>> > >>>> > >> > http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>> > >>>>> > >>>>> > >>>>> > >>>>> -- > >>>>> Julien > >>>> > >>>> > >>> > >>> > >>> -- > >>> Julien > >> > > > > > > > > -- > > Julien > > -- Julien
