My initial reaction to a table function was that it sounded kind of sketchy, but given Julian's elaboration and description it sounds like a great idea.
From a user perspective this is easy to understand and flexible. I see this table function model as effectively a hint for how to handle the data, and I think others will see it that way too.

+1

On Tue, Oct 20, 2015 at 1:32 PM, Julian Hyde <[email protected]> wrote:

> +1 to use table functions
>
> In Calcite (and I presume Drill) a “table function” may actually function
> more like a (Lisp) macro. The function gets called at prepare time to yield
> a RelNode (say a TableScan). So a table function is every bit as efficient
> as using a table, but it allows extra parameters.
>
> If the table function has a lot of parameters it might be nice to support
> named parameters:
>
> select * from table(disitributedFile(path => '/path/to/something.psv',
> delimiter => '|'));
>
> Named parameters are in the SQL standard but are not supported by
> Calcite’s parser currently. Parameters can be specified in any order, and
> those not specified have a default value.
>
> Julian
>
>
> > On Oct 19, 2015, at 5:18 PM, Ted Dunning <[email protected]> wrote:
> >
> > Wouldn't a table function be a better option?
> >
> > Something like this perhaps?
> >
> > select * from
> > delimitedFile(dfs.`default`.`/path/to/file/something.psv`, '|')
> >
> > ?
> >
> > Or how about fake-o parameters that the delimited record scanner knows how
> > to push down into the scanning of the data? That would look like this:
> >
> > select * from
> > dfs.`default`.`/path/to/file/something.psv`
> > where magicFieldDelimiter = '|';
> >
> >
> > On Mon, Oct 19, 2015 at 2:28 PM, Julien Le Dem <[email protected]> wrote:
> >
> >> I'm looking into passing information on how to interpret a file through the
> >> select clause in Drill.
> >> Something along the lines of:
> >> *select * from
> >> dfs.`default`.`/path/to/file/something.psv?type=text&delimiter=|`;*
> >> (In this example, we want to specify a specific delimiter, but that would
> >> apply to any *type* of format)
> >>
> >> Which would allow to read a file without having to centrally configure
> >> formats: https://drill.apache.org/docs/querying-plain-text-files/
> >> Which makes it easier to try to read an existing file.
> >> Typically once the user has found the proper settings, they would update
> >> the central configuration.
> >>
> >> thoughts?
> >>
> >> --
> >> Julien
> >>
> >

--
*Jim Scott*
Director, Enterprise Strategy & Architecture
+1 (347) 746-9281
@kingmesal <https://twitter.com/kingmesal>
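P.S. To make the named-parameter idea from the thread concrete, here is a minimal sketch in Python. It is not Drill or Calcite code: `ScanSpec`, `delimited_file`, and the default values are hypothetical names I made up for illustration. It only shows the behavior Julian describes, where parameters can be given in any order and unspecified ones take a default, and where the function yields a plan-time description of the scan rather than the data itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanSpec:
    """Plan-time description of a scan (hypothetical stand-in for a TableScan)."""
    path: str
    file_type: str
    delimiter: str


def delimited_file(*, path: str, delimiter: str = ",", file_type: str = "text") -> ScanSpec:
    """Hypothetical table function with keyword-only parameters.

    The defaults here are assumptions for the sketch, not Drill's actual defaults.
    """
    if len(delimiter) != 1:
        raise ValueError("delimiter must be a single character")
    return ScanSpec(path=path, file_type=file_type, delimiter=delimiter)


# Parameters may be given in any order; omitted ones fall back to their defaults.
a = delimited_file(path="/path/to/something.psv", delimiter="|")
b = delimited_file(delimiter="|", path="/path/to/something.psv")
print(a == b)
print(a)
```

Both calls produce the same scan description, which is why I read the table function as a per-query hint: the user can experiment with settings and move them into the central configuration once they work.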
