Currently Calcite supports the following syntax, apparently used in Phoenix:

*select empno + x from EMP_MODIFIABLEVIEW extend (x int not null)*
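To make the idea concrete, here is a rough sketch of how that EXTEND form might look when reading a self-describing file; the dfs.`clicks.json` path and the column names are invented for illustration and are not taken from the thread:

  select user_id, click_ts
  from dfs.`clicks.json` extend (user_id varchar, click_ts timestamp not null)

The appeal is that the type declarations travel with the query text itself, so no separate schema definition is needed.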
Another option to consider is hint syntax (many databases use it): essentially a
multiline comment followed by a plus, *select /*+ ... */ col_name from t*.
This would allow us to pass not only a schema but also join / index hints etc.
Example: *select /*+ SCHEMA(a int not null, b int) */ a from t*
One downside is that we would need to implement this in Calcite first, assuming
the Calcite community is in favor of such a change.

Kind regards,
Arina

On Mon, Sep 10, 2018 at 7:42 AM Paul Rogers <[email protected]> wrote:

> Hi Weijie,
>
> Thanks for the paper pointer. F1 uses the same syntax as Scope (the system
> cited in my earlier note): data type after the name.
>
> Another description is [1]. Neither paper describes how F1 handles arrays.
> However, this second paper points out that Protobuf is F1's native format,
> and so F1 has support for nested types. Drill does also, but in Drill, a
> reference to "customer.phone.cell" causes the nested "cell" column to be
> projected as a top-level column. And neither paper says whether F1 is used
> with O/JDBC, and if so, how they handle the mapping from nested types to
> the flat tuple structure required by xDBC.
>
> Have you come across these details?
>
> Thanks,
> - Paul
>
>
> On Thursday, September 6, 2018, 8:43:57 PM PDT, weijie tong <
> [email protected]> wrote:
>
> Google's latest paper about F1 [1] claims to support arbitrary data sources
> through an extension API called TVF (see section 6.3). Column data types
> also need to be declared before the query.
>
> [1] http://www.vldb.org/pvldb/vol11/p1835-samwel.pdf
>
> On Fri, Sep 7, 2018 at 9:47 AM Paul Rogers <[email protected]>
> wrote:
>
> > Hi All,
> >
> > We've discussed quite a few times whether Drill should or should not
> > support or require schemas, and if so, how the user might express the
> > schema.
> >
> > I came across a paper [1] that suggests a simple, elegant SQL extension:
> >
> > EXTRACT <column>[:<type>] {,<column>[:<type>]}
> > FROM <stream_name>
> >
> > Paraphrasing into Drill's SQL:
> >
> > SELECT <column>[:<type>] [AS <alias>] {,<column>[:<type>] [AS <alias>]}
> > FROM <table_name>
> >
> > Have a collection of JSON files in which string column `foo` appears in
> > only half the files? Don't want schema conflicts between VARCHAR and
> > nullable INT? Just do:
> >
> > SELECT name:VARCHAR, age:INT, foo:VARCHAR
> > FROM `my-dir` ...
> >
> > Not only can the syntax be used to specify the "natural" type for a
> > column, it might also specify a preferred type. For example, "age:INT"
> > says that "age" is an INT, even though JSON would normally parse it as a
> > BIGINT. Similarly, using this syntax is an easy way to tell Drill how to
> > convert CSV columns from strings to DATE, INT, FLOAT, etc. without the
> > need for CAST functions. (CAST functions read the data in one format,
> > then convert it to another in a Project operator. Using a column type
> > might let the reader do the conversion -- something that is easy to
> > implement if using the "result set loader" mechanism.)
> >
> > Plus, the syntax fits nicely into the existing view file structure. If
> > the types appear in views, then client tools can continue to use
> > standard SQL without the type information.
> >
> > When this idea came up in the past, someone mentioned the issue of
> > nullable vs. non-nullable. (Let's also include arrays, since Drill
> > supports that.) Maybe add a suffix to the name:
> >
> > SELECT req:VARCHAR NOT NULL, opt:INT NULL, arr:FLOAT[] FROM ...
> >
> > Not pretty, but it works with the existing SQL syntax rules.
> >
> > Obviously, Drill has much on its plate, so I'm not suggesting that Drill
> > should do this soon. Just passing it along as yet another option to
> > consider.
> >
> > Thanks,
> > - Paul
> >
> > [1] http://www.cs.columbia.edu/~jrzhou/pub/Scope-VLDBJ.pdf
>
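The quoted message contrasts the proposed column:TYPE syntax with today's CAST
approach, where conversion happens in a Project operator rather than in the
reader. As a rough sketch of the difference, using the same column and
directory names as the example above (the proposed form is not yet valid
syntax anywhere; the CAST form is ordinary SQL):

  -- Proposed: the reader itself produces the requested types
  SELECT name:VARCHAR, age:INT FROM `my-dir`

  -- Today: the reader picks its own types, then a Project operator
  -- applies the CASTs to convert them
  SELECT CAST(name AS VARCHAR) AS name, CAST(age AS INT) AS age FROM `my-dir`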
