I can see why Jacques wants this syntax. However a “switch" in a grammar is a bad idea. Grammars need to be predictable. Any variation should happen at validation time, or later.
Also, we shouldn’t add configuration parameters as a way of avoiding a tough design discussion. EXTENDS and eliding TABLE are both extensions to standard SQL, and they are both applicable to Drill and Phoenix. I think Drill and Phoenix (by which I mean Jacques and James, I guess) need to agree what the SQL syntax should be. Julian > On Nov 7, 2015, at 10:40 AM, Jim Scott <[email protected]> wrote: > > Looking at those two examples I agree with Jacques. The first appears more > like a hint from the syntactic sugar point of view. > > > On Fri, Nov 6, 2015 at 11:53 PM, Jacques Nadeau <[email protected]> wrote: > >> Since EXTEND is custom functionality, it seems reasonable that we could >> have a switch. Given that SQL Server and Postgres support it seems >> reasonable to support the table functions without the TABLE syntax. >> >> I for one definitely think the TABLE syntax is much more confusing to use, >> especially in the example that we're looking to support, such as: >> >> select * from dfs.`/myfolder/mytable` (type => 'CSV', fieldDelimiter => >> '|', skipFirstRow => true) >> >> This seems much clearer than: >> >> select * from TABLE(dfs.`/myfolder/mytable` (type => 'CSV', fieldDelimiter >> => '|', skipFirstRow => true)) >> >> It also looks much more like a hint to the table (which is our goal). >> >> >> >> >> >> >> >> -- >> Jacques Nadeau >> CTO and Co-Founder, Dremio >> >> On Fri, Nov 6, 2015 at 9:15 PM, Julian Hyde <[email protected]> wrote: >> >>> Thanks for doing the legwork and finding what the other vendors do. It is >>> indeed compelling that SQL Server and Postgres go beyond the standard an >>> make the TABLE keyword optional. >>> >>> I tried that syntax in Calcite and discovered that there is a clash with >>> one of our own (few) extensions. In >>> https://issues.apache.org/jira/browse/CALCITE-493 we added the EXTENDS >>> clause. You can write >>> >>> SELECT * >>> FROM Emp EXTEND (favoriteBand VARCHAR(100), golfHandicap INTEGER) >>> WHERE goldHandicap < 10; >>> >>> to tell Calcite that there are two undeclared columns in the Emp table >> but >>> you would like to use them in this particular query. We chose to make the >>> EXTEND keyword optional, so you could instead write >>> >>> SELECT * >>> FROM Emp (favoriteBand VARCHAR(100), golfHandicap INTEGER) >>> WHERE goldHandicap < 10; >>> >>> That is uncomfortably close to >>> >>> SELECT * >>> FROM EmpFunction (favoriteBand, golfHandicap); >>> >>> so we would require >>> >>> SELECT * >>> FROM TABLE(EmpFunction (favoriteBand, golfHandicap)); >>> >>> if EmpFunction was a table-function. You could combine the two forms like >>> this: >>> >>> SELECT * >>> FROM TABLE(EmpFunction (favoriteBand, golfHandicap)) EXTEND >>> (anotherAttribute INTEGER); >>> >>> We could revisit whether EXTEND is optional, I suppose. But we should >> also >>> ask whether requiring folks to type TABLE is such a hardship. >>> >>> Julian >>> >>> >>>> On Nov 6, 2015, at 2:20 PM, Julien Le Dem <[email protected]> wrote: >>>> >>>> - Table function syntax: I did a quick search and it seems there's no >>>> consensus about this. >>>> It seems that Posgres [1] and SQL Server [2] both allow calling table >>>> functions without the table(...) wrapper while Oracle [3] and DB2 [4] >>>> expect it. >>>> MySQL does not have table functions [5] >>>> 2 for, 2 against and 1 undecided: that's a draw :) >>>> Would it be reasonable to allow a switch in the grammar generation to >>> have >>>> a posgres compatible syntax? Currently in Drill we use the MySQL like >>>> syntax (back ticks for identifiers etc) >>>> >>>> [1] >> http://www.postgresql.org/docs/7.3/static/xfunc-tablefunctions.html >>>> [2] >> https://technet.microsoft.com/en-us/library/aa214485(v=sql.80).aspx >>>> [3] https://oracle-base.com/articles/misc/pipelined-table-functions >>>> [4] http://www.ibm.com/developerworks/ibmi/library/i-power-of-udtf/ >>>> [5] >>>> >>> >> http://stackoverflow.com/questions/12163666/mysql-function-to-return-a-table >>>> >>>> - It seems a simple change in SqlCallBinding fixes the function >>>> overloading: https://github.com/apache/calcite/pull/166/files >>>> But that seems too easy to be true. Possibly this method is called more >>>> than once (before and after the function has been resolved?) >>>> >>>> FYI this would happen only when using named parameter. We do want to >>>> overload in this case, which is why I'm looking into it. >>>> >>>> I'll fill a JIRA for my other branch >>>> >>>> Julien >>>> >>>> On Thu, Nov 5, 2015 at 5:39 PM, Julian Hyde <[email protected]> wrote: >>>> >>>>> >>>>> On Nov 5, 2015, at 5:00 PM, Julien Le Dem <[email protected]> wrote: >>>>> >>>>> TL;DR: TableMacro works for me; I need help with a bug in Calcite when >>>>> there's more than 1 function with the same name. >>>>> >>>>> >>>>> Yes; see below. >>>>> >>>>> FYI: I have a prototype of TableMacro working in Drill. For now just >>> being >>>>> able to specify the delimiter for csv files. >>>>> So it seem the answer to my question 1) is that TableMacros are the >> way >>> to >>>>> go. >>>>> I'm still wondering about *3) is the table(...) wrapping syntax >>>>> necessary?* >>>>> >>>>> >>>>> Consider: >>>>> >>>>> select * from myTable as f(x, y) >>>>> select * from myTable f(x, y) >>>>> select * from myFunction(x, y) >>>>> >>>>> #1 and #2 mean the same thing; #2 and #3 look awfully similar. Also, >> if >>> f >>>>> is a function with zero arguments, could you invoke it like this?: >>>>> >>>>> select * from f >>>>> >>>>> I don’t know the actual rationale. But I know that the SQL standards >>>>> people in their wisdom decided to add a keyword to disambiguate. >>>>> >>>>> I had to fix some things in Calcite to enable this: >>>>> https://github.com/dremio/calcite/pull/1/files >>>>> Drill uses Frameworks.getPlanner() that does not seem to be used in >>>>> Calcite for the Maze example. >>>>> Which is why some hooks were missing. >>>>> >>>>> >>>>> Can you log a jira case to track this bug? >>>>> >>>>> >>>>> I think I found a bug in Calcite but I'd need help to fix it. >>>>> Here is a test that reproduces the problem: >>>>> https://github.com/apache/calcite/pull/166 >>>>> If we return more than 1 TableFunction with the same name, we get a >> NPE >>>>> later on. >>>>> >>>>> >>>>> Yes, I knew there was a problem with overloading. Please log a JIRA >> case >>>>> on resolution of overloaded functions when invoked with named >> arguments. >>>>> (It probably applies to all functions, not just table functions.) The >>> fix >>>>> will take a while (if you wait for me to write it). >>>>> >>>>> For now please tell your users not to overload. :) >>>>> >>>>> >>>>> Julian >>>>> >>>>> >>>> >>>> >>>> -- >>>> Julien >>> >>> >> > > > > -- > *Jim Scott* > Director, Enterprise Strategy & Architecture > +1 (347) 746-9281 > @kingmesal <https://twitter.com/kingmesal> > > <http://www.mapr.com/> > [image: MapR Technologies] <http://www.mapr.com> > > Now Available - Free Hadoop On-Demand Training > <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>
