Re: select from table with options

Julian Hyde Sat, 07 Nov 2015 14:46:22 -0800

I can see why Jacques wants this syntax.

However a “switch" in a grammar is a bad idea. Grammars need to be predictable. 
Any variation should happen at validation time, or later.


Also, we shouldn’t add configuration parameters as a way of avoiding a tough 
design discussion. 

EXTENDS and eliding TABLE are both extensions to standard SQL, and they are 
both applicable to Drill and Phoenix. I think Drill and Phoenix (by which I 
mean Jacques and James, I guess) need to agree what the SQL syntax should be.

Julian


> On Nov 7, 2015, at 10:40 AM, Jim Scott <[email protected]> wrote:
> 
> Looking at those two examples I agree with Jacques. The first appears more
> like a hint from the syntactic sugar point of view.
> 
> 
> On Fri, Nov 6, 2015 at 11:53 PM, Jacques Nadeau <[email protected]> wrote:
> 
>> Since EXTEND is custom functionality, it seems reasonable that we could
>> have a switch. Given that SQL Server and Postgres support it seems
>> reasonable to support the table functions without the TABLE syntax.
>> 
>> I for one definitely think the TABLE syntax is much more confusing to use,
>> especially in the example that we're looking to support, such as:
>> 
>> select * from dfs.`/myfolder/mytable` (type => 'CSV', fieldDelimiter =>
>> '|', skipFirstRow => true)
>> 
>> This seems much clearer than:
>> 
>> select * from TABLE(dfs.`/myfolder/mytable` (type => 'CSV', fieldDelimiter
>> => '|', skipFirstRow => true))
>> 
>> It also looks much more like a hint to the table (which is our goal).
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> --
>> Jacques Nadeau
>> CTO and Co-Founder, Dremio
>> 
>> On Fri, Nov 6, 2015 at 9:15 PM, Julian Hyde <[email protected]> wrote:
>> 
>>> Thanks for doing the legwork and finding what the other vendors do. It is
>>> indeed compelling that SQL Server and Postgres go beyond the standard an
>>> make the TABLE keyword optional.
>>> 
>>> I tried that syntax in Calcite and discovered that there is a clash with
>>> one of our own (few) extensions. In
>>> https://issues.apache.org/jira/browse/CALCITE-493 we added the EXTENDS
>>> clause. You can write
>>> 
>>>  SELECT *
>>>  FROM Emp EXTEND (favoriteBand VARCHAR(100), golfHandicap INTEGER)
>>>  WHERE goldHandicap < 10;
>>> 
>>> to tell Calcite that there are two undeclared columns in the Emp table
>> but
>>> you would like to use them in this particular query. We chose to make the
>>> EXTEND keyword optional, so you could instead write
>>> 
>>>  SELECT *
>>>  FROM Emp (favoriteBand VARCHAR(100), golfHandicap INTEGER)
>>>  WHERE goldHandicap < 10;
>>> 
>>> That is uncomfortably close to
>>> 
>>>  SELECT *
>>>  FROM EmpFunction (favoriteBand, golfHandicap);
>>> 
>>> so we would require
>>> 
>>>  SELECT *
>>>  FROM TABLE(EmpFunction (favoriteBand, golfHandicap));
>>> 
>>> if EmpFunction was a table-function. You could combine the two forms like
>>> this:
>>> 
>>>  SELECT *
>>>  FROM TABLE(EmpFunction (favoriteBand, golfHandicap)) EXTEND
>>> (anotherAttribute INTEGER);
>>> 
>>> We could revisit whether EXTEND is optional, I suppose. But we should
>> also
>>> ask whether requiring folks to type TABLE is such a hardship.
>>> 
>>> Julian
>>> 
>>> 
>>>> On Nov 6, 2015, at 2:20 PM, Julien Le Dem <[email protected]> wrote:
>>>> 
>>>> - Table function syntax: I did a quick search and it seems there's no
>>>> consensus about this.
>>>> It seems that Posgres [1] and SQL Server [2] both allow calling table
>>>> functions without the table(...) wrapper while Oracle [3] and DB2 [4]
>>>> expect it.
>>>> MySQL does not have table functions [5]
>>>> 2 for, 2 against and 1 undecided: that's a draw :)
>>>> Would it be reasonable to allow a switch in the grammar generation to
>>> have
>>>> a posgres compatible syntax? Currently in Drill we use the MySQL like
>>>> syntax (back ticks for identifiers etc)
>>>> 
>>>> [1]
>> http://www.postgresql.org/docs/7.3/static/xfunc-tablefunctions.html
>>>> [2]
>> https://technet.microsoft.com/en-us/library/aa214485(v=sql.80).aspx
>>>> [3] https://oracle-base.com/articles/misc/pipelined-table-functions
>>>> [4] http://www.ibm.com/developerworks/ibmi/library/i-power-of-udtf/
>>>> [5]
>>>> 
>>> 
>> http://stackoverflow.com/questions/12163666/mysql-function-to-return-a-table
>>>> 
>>>> - It seems a simple change in SqlCallBinding fixes the function
>>>> overloading: https://github.com/apache/calcite/pull/166/files
>>>> But that seems too easy to be true. Possibly this method is called more
>>>> than once (before and after the function has been resolved?)
>>>> 
>>>> FYI this would happen only when using named parameter. We do want to
>>>> overload in this case, which is why I'm looking into it.
>>>> 
>>>> I'll fill a JIRA for my other branch
>>>> 
>>>> Julien
>>>> 
>>>> On Thu, Nov 5, 2015 at 5:39 PM, Julian Hyde <[email protected]> wrote:
>>>> 
>>>>> 
>>>>> On Nov 5, 2015, at 5:00 PM, Julien Le Dem <[email protected]> wrote:
>>>>> 
>>>>> TL;DR: TableMacro works for me; I need help with a bug in Calcite when
>>>>> there's more than 1 function with the same name.
>>>>> 
>>>>> 
>>>>> Yes; see below.
>>>>> 
>>>>> FYI: I have a prototype of TableMacro working in Drill. For now just
>>> being
>>>>> able to specify the delimiter for csv files.
>>>>> So it seem the answer to my question 1) is that TableMacros are the
>> way
>>> to
>>>>> go.
>>>>> I'm still wondering about *3) is the table(...) wrapping syntax
>>>>> necessary?*
>>>>> 
>>>>> 
>>>>> Consider:
>>>>> 
>>>>> select * from myTable as f(x, y)
>>>>> select * from myTable f(x, y)
>>>>> select * from myFunction(x, y)
>>>>> 
>>>>> #1 and #2 mean the same thing; #2 and #3 look awfully similar. Also,
>> if
>>> f
>>>>> is a function with zero arguments, could you invoke it like this?:
>>>>> 
>>>>> select * from f
>>>>> 
>>>>> I don’t know the actual rationale. But I know that the SQL standards
>>>>> people in their wisdom decided to add a keyword to disambiguate.
>>>>> 
>>>>> I had to fix some things in Calcite to enable this:
>>>>> https://github.com/dremio/calcite/pull/1/files
>>>>> Drill uses Frameworks.getPlanner() that does not seem to be used in
>>>>> Calcite for the Maze example.
>>>>> Which is why some hooks were missing.
>>>>> 
>>>>> 
>>>>> Can you log a jira case to track this bug?
>>>>> 
>>>>> 
>>>>> I think I found a bug in Calcite but I'd need help to fix it.
>>>>> Here is a test that reproduces the problem:
>>>>> https://github.com/apache/calcite/pull/166
>>>>> If we return more than 1 TableFunction with the same name, we get a
>> NPE
>>>>> later on.
>>>>> 
>>>>> 
>>>>> Yes, I knew there was a problem with overloading. Please log a JIRA
>> case
>>>>> on resolution of overloaded functions when invoked with named
>> arguments.
>>>>> (It probably applies to all functions, not just table functions.) The
>>> fix
>>>>> will take a while (if you wait for me to write it).
>>>>> 
>>>>> For now please tell your users not to overload. :)
>>>>> 
>>>>> 
>>>>> Julian
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Julien
>>> 
>>> 
>> 
> 
> 
> 
> -- 
> *Jim Scott*
> Director, Enterprise Strategy & Architecture
> +1 (347) 746-9281
> @kingmesal <https://twitter.com/kingmesal>
> 
> <http://www.mapr.com/>
> [image: MapR Technologies] <http://www.mapr.com>
> 
> Now Available - Free Hadoop On-Demand Training
> <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>

Re: select from table with options

Reply via email to