Hi,
I've been trying to enable this but it looks like in the current grammar
(before my change) you can not use table functions and EXTEND together.
That's because they are on difference branches of an | in the grammar.
So I would suggest that we treat those as two separate improvement in two
different pull requests:
 - not require table(...) to call table functions
 - allow using table functions and extend together.
Does it make sense?
Julien


On Tue, Nov 10, 2015 at 12:51 PM, Julian Hyde <[email protected]> wrote:

> To be clear, it should be possible to use a table function with all of
> the options -- EXTENDS clause, OVER clause, AS with alias and column
> aliases, TABLESAMPLE.
>
> I'm surprised that the parser didn't need more lookahead to choose
> between 't (x, y)' and 't (x INTEGER, y DATE)'.
>
> On Tue, Nov 10, 2015 at 12:28 PM, Julien Le Dem <[email protected]> wrote:
> > In the patch I just sent, probably not.
> > I will adjust it and add the corresponding test.
> >
> > On Tue, Nov 10, 2015 at 11:51 AM, Julian Hyde <[email protected]> wrote:
> >
> >> Can you use both together? Say
> >>
> >>   select columns
> >>   from dfs.`/path/to/myfile`(type => 'TEXT', fieldDelimiter => '|’)
> EXTEND
> >> (foo INTEGER)
> >>
> >> Julian
> >>
> >>
> >>
> >> > On Nov 10, 2015, at 10:51 AM, Julien Le Dem <[email protected]>
> wrote:
> >> >
> >> > I took a stab at adding the TableFunction syntax without table(...) in
> >> > Calcite.
> >> > I have verified that both the table function and extend (with or
> without
> >> > keyword) work
> >> >
> >>
> https://github.com/julienledem/calcite/commit/b18f335c49e273294c2d475e359c610aaed3da34
> >> >
> >> > These work:
> >> >
> >> > select columns from dfs.`/path/to/myfile`(type => 'TEXT',
> fieldDelimiter
> >> =>
> >> > '|')
> >> >
> >> > select columns from table(dfs.`/path/to/myfile`(type => 'TEXT',
> >> > fieldDelimiter => '|'))
> >> >
> >> > select columns from table(dfs.`/path/to/myfile`('JSON'))
> >> >
> >> > select columns from dfs.`/path/to/myfile`('JSON')
> >> >
> >> > select columns from dfs.`/path/to/myfile`(type => 'JSON')
> >> >
> >> > On Sat, Nov 7, 2015 at 5:15 PM, Jacques Nadeau <[email protected]>
> >> wrote:
> >> >
> >> >> Drill does implicitly what Phoenix does explicitly so I don't think
> we
> >> >> should constrain ourselves to having a union of the two syntaxes.
> >> >>
> >> >>
> >> >> That being said, I think we could make these work together... maybe.
> >> >>
> >> >> Remove the EXTENDS without keyword syntax from the grammar.
> >> >>
> >> >> Create a new sub block in the table block that requires no keyword.
> >> There
> >> >> would be two paths (and would probably require some lookahead)
> >> >>
> >> >> option 1> unnamed parameters (1,2,3)
> >> >> option 2> named parameters (a => 1, b=>2, c=> 3)
> >> >> option 3> create table field pattern (favoriteBand VARCHAR(100),
> >> >> golfHandicap INTEGER)
> >> >>
> >> >> Then we create a table function with options 1 & 2, an EXTENDS clause
> >> for
> >> >> option 3.
> >> >>
> >> >> Best of both worlds?
> >> >>
> >> >> On Sat, Nov 7, 2015 at 4:44 PM, James Taylor <[email protected]
> >
> >> >> wrote:
> >> >>
> >> >>> Phoenix already supports columns at read-time using the syntax
> without
> >> >> the
> >> >>> EXTENDS keyword as Julian indicated:
> >> >>>   SELECT * FROM Emp (favoriteBand VARCHAR(100), golfHandicap
> INTEGER)
> >> >>>   WHERE goldHandicap < 10;
> >> >>>
> >> >>> Changing this by requiring the EXTENDS keyword would create a
> backward
> >> >>> compatibility problem.
> >> >>>
> >> >>> I think it'd be good if both of these extensions worked in Drill &
> >> >> Phoenix
> >> >>> given our Drillix initiative.
> >> >>>
> >> >>> On Sat, Nov 7, 2015 at 3:34 PM, Jacques Nadeau <[email protected]>
> >> >> wrote:
> >> >>>
> >> >>>> My proposal was an a or b using the freemarker template in the
> >> grammar,
> >> >>>> not something later.
> >> >>>>
> >> >>>> Actually, put another way: we may want to consider stating that we
> >> only
> >> >>>> incorporate SQL standards in our primary grammar. Any extensions
> >> should
> >> >>> be
> >> >>>> optional grammar. We could simply have grammar plugins in Calcite
> (the
> >> >>> same
> >> >>>> way we plug in external things in Drill).
> >> >>>>
> >> >>>> Trying to get every project to agree on extensions seems like it
> may
> >> be
> >> >>>> hard.
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> --
> >> >>>> Jacques Nadeau
> >> >>>> CTO and Co-Founder, Dremio
> >> >>>>
> >> >>>> On Sat, Nov 7, 2015 at 2:45 PM, Julian Hyde <[email protected]>
> wrote:
> >> >>>>
> >> >>>>> I can see why Jacques wants this syntax.
> >> >>>>>
> >> >>>>> However a “switch" in a grammar is a bad idea. Grammars need to be
> >> >>>>> predictable. Any variation should happen at validation time, or
> >> later.
> >> >>>>>
> >> >>>>> Also, we shouldn’t add configuration parameters as a way of
> avoiding
> >> a
> >> >>>>> tough design discussion.
> >> >>>>>
> >> >>>>> EXTENDS and eliding TABLE are both extensions to standard SQL, and
> >> >> they
> >> >>>>> are both applicable to Drill and Phoenix. I think Drill and
> Phoenix
> >> >> (by
> >> >>>>> which I mean Jacques and James, I guess) need to agree what the
> SQL
> >> >>> syntax
> >> >>>>> should be.
> >> >>>>>
> >> >>>>> Julian
> >> >>>>>
> >> >>>>>
> >> >>>>>> On Nov 7, 2015, at 10:40 AM, Jim Scott <[email protected]>
> wrote:
> >> >>>>>>
> >> >>>>>> Looking at those two examples I agree with Jacques. The first
> >> >> appears
> >> >>>>> more
> >> >>>>>> like a hint from the syntactic sugar point of view.
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> On Fri, Nov 6, 2015 at 11:53 PM, Jacques Nadeau <
> [email protected]
> >> >>>
> >> >>>>> wrote:
> >> >>>>>>
> >> >>>>>>> Since EXTEND is custom functionality, it seems reasonable that
> we
> >> >>> could
> >> >>>>>>> have a switch. Given that SQL Server and Postgres support it
> seems
> >> >>>>>>> reasonable to support the table functions without the TABLE
> syntax.
> >> >>>>>>>
> >> >>>>>>> I for one definitely think the TABLE syntax is much more
> confusing
> >> >> to
> >> >>>>> use,
> >> >>>>>>> especially in the example that we're looking to support, such
> as:
> >> >>>>>>>
> >> >>>>>>> select * from dfs.`/myfolder/mytable` (type => 'CSV',
> >> >> fieldDelimiter
> >> >>> =>
> >> >>>>>>> '|', skipFirstRow => true)
> >> >>>>>>>
> >> >>>>>>> This seems much clearer than:
> >> >>>>>>>
> >> >>>>>>> select * from TABLE(dfs.`/myfolder/mytable` (type => 'CSV',
> >> >>>>> fieldDelimiter
> >> >>>>>>> => '|', skipFirstRow => true))
> >> >>>>>>>
> >> >>>>>>> It also looks much more like a hint to the table (which is our
> >> >> goal).
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> --
> >> >>>>>>> Jacques Nadeau
> >> >>>>>>> CTO and Co-Founder, Dremio
> >> >>>>>>>
> >> >>>>>>> On Fri, Nov 6, 2015 at 9:15 PM, Julian Hyde <[email protected]>
> >> >>> wrote:
> >> >>>>>>>
> >> >>>>>>>> Thanks for doing the legwork and finding what the other vendors
> >> >> do.
> >> >>>>> It is
> >> >>>>>>>> indeed compelling that SQL Server and Postgres go beyond the
> >> >>> standard
> >> >>>>> an
> >> >>>>>>>> make the TABLE keyword optional.
> >> >>>>>>>>
> >> >>>>>>>> I tried that syntax in Calcite and discovered that there is a
> >> >> clash
> >> >>>>> with
> >> >>>>>>>> one of our own (few) extensions. In
> >> >>>>>>>> https://issues.apache.org/jira/browse/CALCITE-493 we added the
> >> >>>>> EXTENDS
> >> >>>>>>>> clause. You can write
> >> >>>>>>>>
> >> >>>>>>>> SELECT *
> >> >>>>>>>> FROM Emp EXTEND (favoriteBand VARCHAR(100), golfHandicap
> INTEGER)
> >> >>>>>>>> WHERE goldHandicap < 10;
> >> >>>>>>>>
> >> >>>>>>>> to tell Calcite that there are two undeclared columns in the
> Emp
> >> >>> table
> >> >>>>>>> but
> >> >>>>>>>> you would like to use them in this particular query. We chose
> to
> >> >>> make
> >> >>>>> the
> >> >>>>>>>> EXTEND keyword optional, so you could instead write
> >> >>>>>>>>
> >> >>>>>>>> SELECT *
> >> >>>>>>>> FROM Emp (favoriteBand VARCHAR(100), golfHandicap INTEGER)
> >> >>>>>>>> WHERE goldHandicap < 10;
> >> >>>>>>>>
> >> >>>>>>>> That is uncomfortably close to
> >> >>>>>>>>
> >> >>>>>>>> SELECT *
> >> >>>>>>>> FROM EmpFunction (favoriteBand, golfHandicap);
> >> >>>>>>>>
> >> >>>>>>>> so we would require
> >> >>>>>>>>
> >> >>>>>>>> SELECT *
> >> >>>>>>>> FROM TABLE(EmpFunction (favoriteBand, golfHandicap));
> >> >>>>>>>>
> >> >>>>>>>> if EmpFunction was a table-function. You could combine the two
> >> >> forms
> >> >>>>> like
> >> >>>>>>>> this:
> >> >>>>>>>>
> >> >>>>>>>> SELECT *
> >> >>>>>>>> FROM TABLE(EmpFunction (favoriteBand, golfHandicap)) EXTEND
> >> >>>>>>>> (anotherAttribute INTEGER);
> >> >>>>>>>>
> >> >>>>>>>> We could revisit whether EXTEND is optional, I suppose. But we
> >> >>> should
> >> >>>>>>> also
> >> >>>>>>>> ask whether requiring folks to type TABLE is such a hardship.
> >> >>>>>>>>
> >> >>>>>>>> Julian
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>>> On Nov 6, 2015, at 2:20 PM, Julien Le Dem <[email protected]>
> >> >>> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>> - Table function syntax: I did a quick search and it seems
> >> >> there's
> >> >>> no
> >> >>>>>>>>> consensus about this.
> >> >>>>>>>>> It seems that Posgres [1] and SQL Server [2] both allow
> calling
> >> >>> table
> >> >>>>>>>>> functions without the table(...) wrapper while Oracle [3] and
> DB2
> >> >>> [4]
> >> >>>>>>>>> expect it.
> >> >>>>>>>>> MySQL does not have table functions [5]
> >> >>>>>>>>> 2 for, 2 against and 1 undecided: that's a draw :)
> >> >>>>>>>>> Would it be reasonable to allow a switch in the grammar
> >> >> generation
> >> >>> to
> >> >>>>>>>> have
> >> >>>>>>>>> a posgres compatible syntax? Currently in Drill we use the
> MySQL
> >> >>> like
> >> >>>>>>>>> syntax (back ticks for identifiers etc)
> >> >>>>>>>>>
> >> >>>>>>>>> [1]
> >> >>>>>>>
> >> >> http://www.postgresql.org/docs/7.3/static/xfunc-tablefunctions.html
> >> >>>>>>>>> [2]
> >> >>>>>>>
> >> >> https://technet.microsoft.com/en-us/library/aa214485(v=sql.80).aspx
> >> >>>>>>>>> [3]
> >> >>> https://oracle-base.com/articles/misc/pipelined-table-functions
> >> >>>>>>>>> [4]
> >> >>> http://www.ibm.com/developerworks/ibmi/library/i-power-of-udtf/
> >> >>>>>>>>> [5]
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>
> >> >>>
> >> >>
> >>
> http://stackoverflow.com/questions/12163666/mysql-function-to-return-a-table
> >> >>>>>>>>>
> >> >>>>>>>>> - It seems a simple change in SqlCallBinding fixes the
> function
> >> >>>>>>>>> overloading: https://github.com/apache/calcite/pull/166/files
> >> >>>>>>>>> But that seems too easy to be true. Possibly this method is
> >> >> called
> >> >>>>> more
> >> >>>>>>>>> than once (before and after the function has been resolved?)
> >> >>>>>>>>>
> >> >>>>>>>>> FYI this would happen only when using named parameter. We do
> want
> >> >>> to
> >> >>>>>>>>> overload in this case, which is why I'm looking into it.
> >> >>>>>>>>>
> >> >>>>>>>>> I'll fill a JIRA for my other branch
> >> >>>>>>>>>
> >> >>>>>>>>> Julien
> >> >>>>>>>>>
> >> >>>>>>>>> On Thu, Nov 5, 2015 at 5:39 PM, Julian Hyde <[email protected]
> >
> >> >>>>> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>> On Nov 5, 2015, at 5:00 PM, Julien Le Dem <[email protected]
> >
> >> >>>>> wrote:
> >> >>>>>>>>>>
> >> >>>>>>>>>> TL;DR: TableMacro works for me; I need help with a bug in
> >> >> Calcite
> >> >>>>> when
> >> >>>>>>>>>> there's more than 1 function with the same name.
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>> Yes; see below.
> >> >>>>>>>>>>
> >> >>>>>>>>>> FYI: I have a prototype of TableMacro working in Drill. For
> now
> >> >>> just
> >> >>>>>>>> being
> >> >>>>>>>>>> able to specify the delimiter for csv files.
> >> >>>>>>>>>> So it seem the answer to my question 1) is that TableMacros
> are
> >> >>> the
> >> >>>>>>> way
> >> >>>>>>>> to
> >> >>>>>>>>>> go.
> >> >>>>>>>>>> I'm still wondering about *3) is the table(...) wrapping
> syntax
> >> >>>>>>>>>> necessary?*
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>> Consider:
> >> >>>>>>>>>>
> >> >>>>>>>>>> select * from myTable as f(x, y)
> >> >>>>>>>>>> select * from myTable f(x, y)
> >> >>>>>>>>>> select * from myFunction(x, y)
> >> >>>>>>>>>>
> >> >>>>>>>>>> #1 and #2 mean the same thing; #2 and #3 look awfully
> similar.
> >> >>> Also,
> >> >>>>>>> if
> >> >>>>>>>> f
> >> >>>>>>>>>> is a function with zero arguments, could you invoke it like
> >> >> this?:
> >> >>>>>>>>>>
> >> >>>>>>>>>> select * from f
> >> >>>>>>>>>>
> >> >>>>>>>>>> I don’t know the actual rationale. But I know that the SQL
> >> >>> standards
> >> >>>>>>>>>> people in their wisdom decided to add a keyword to
> disambiguate.
> >> >>>>>>>>>>
> >> >>>>>>>>>> I had to fix some things in Calcite to enable this:
> >> >>>>>>>>>> https://github.com/dremio/calcite/pull/1/files
> >> >>>>>>>>>> Drill uses Frameworks.getPlanner() that does not seem to be
> used
> >> >>> in
> >> >>>>>>>>>> Calcite for the Maze example.
> >> >>>>>>>>>> Which is why some hooks were missing.
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>> Can you log a jira case to track this bug?
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>> I think I found a bug in Calcite but I'd need help to fix it.
> >> >>>>>>>>>> Here is a test that reproduces the problem:
> >> >>>>>>>>>> https://github.com/apache/calcite/pull/166
> >> >>>>>>>>>> If we return more than 1 TableFunction with the same name, we
> >> >> get
> >> >>> a
> >> >>>>>>> NPE
> >> >>>>>>>>>> later on.
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>> Yes, I knew there was a problem with overloading. Please log
> a
> >> >>> JIRA
> >> >>>>>>> case
> >> >>>>>>>>>> on resolution of overloaded functions when invoked with named
> >> >>>>>>> arguments.
> >> >>>>>>>>>> (It probably applies to all functions, not just table
> >> >> functions.)
> >> >>>>> The
> >> >>>>>>>> fix
> >> >>>>>>>>>> will take a while (if you wait for me to write it).
> >> >>>>>>>>>>
> >> >>>>>>>>>> For now please tell your users not to overload. :)
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>> Julian
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> --
> >> >>>>>>>>> Julien
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> --
> >> >>>>>> *Jim Scott*
> >> >>>>>> Director, Enterprise Strategy & Architecture
> >> >>>>>> +1 (347) 746-9281
> >> >>>>>> @kingmesal <https://twitter.com/kingmesal>
> >> >>>>>>
> >> >>>>>> <http://www.mapr.com/>
> >> >>>>>> [image: MapR Technologies] <http://www.mapr.com>
> >> >>>>>>
> >> >>>>>> Now Available - Free Hadoop On-Demand Training
> >> >>>>>> <
> >> >>>>>
> >> >>>
> >> >>
> >>
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >> >>>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Julien
> >>
> >>
> >
> >
> > --
> > Julien
>



-- 
Julien

Reply via email to