[ 
https://issues.apache.org/jira/browse/TAJO-809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyunsik Choi reassigned TAJO-809:
---------------------------------

    Assignee: Hyunsik Choi  (was: David Chen)

> Language extension for non-scalar types
> ---------------------------------------
>
>                 Key: TAJO-809
>                 URL: https://issues.apache.org/jira/browse/TAJO-809
>             Project: Tajo
>          Issue Type: New Feature
>            Reporter: David Chen
>            Assignee: Hyunsik Choi
>
> This ticket tracks the work of defining the syntax for nested schemas, 
> maps, arrays, and unions, and of adding that syntax to the parser. 
> Initially, we can add stubs for the parser endpoints, which will then be 
> fleshed out when support for each data type is actually implemented (see the 
> other subtasks of TAJO-710).
> I have an idea for a possible DDL syntax for these types, and I would like to 
> get your feedback on it. I considered simply adopting Hive's syntax, but I 
> felt that it was not the best fit for these types.
> Instead of calling nested records "structs" as Hive does, I simply call them 
> records as well and use the same syntax as for declaring the top-level 
> record fields:
> {code}
> create table record_example (
>     nested_field record (
>       field1 int,
>       field2 double),
>     two_levels_nested record (
>       inner_nested record (
>         field3 string,
>         field4 int),
>       field5 int)
>   ) using parquet;
> {code}
> For arrays, maps, and unions, I am using a syntax inspired by Scala's syntax 
> for generics:
> {code}
> create table array_example (
>     int_array array[int],
>     record_array array[record (
>       field1 int,
>       field2 string)]
>   ) using avro;
> create table map_example (
>     string_to_int map[string, int],
>     int_to_record map[int, record (
>       field1 string,
>       field2 int)]
>   ) using avro;
> create table union_example (
>     integers union[bit, smallint, integer, bigint]
>   ) using parquet;
> {code}
> Of course, we may change the syntax when we implement these data types, but 
> for now I think we should define an initial language. Once the initial 
> syntax has stabilized, I will write a formal grammar for it.
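
To make the proposed type syntax concrete, here is a minimal sketch of a recursive-descent parser for the type expressions used in the examples above (record (...), array[...], map[k, v], union[...]). It is not Tajo's parser and not a formal grammar: the function and class names, the tuple-based output, the lower-case-only keywords, and the rejection of a trailing comma before a closing delimiter are all assumptions made for illustration. It handles only a single type expression, not a full "create table" statement.

```python
import re

# One token: an identifier/keyword or one of the punctuation marks [ ] ( ) ,
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[\[\](),])")


def tokenize(text):
    """Split a type expression into tokens, rejecting unknown characters."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class TypeParser:
    """Hypothetical parser for the proposed syntax (assumed grammar):

    type  := 'record' '(' field (',' field)* ')'
           | 'array' '[' type ']'
           | 'map' '[' type ',' type ']'
           | 'union' '[' type (',' type)* ']'
           | IDENT
    field := IDENT type
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def expect(self, want):
        tok = self.next()
        if tok != want:
            raise SyntaxError(f"expected {want!r}, got {tok!r}")

    def ident(self):
        tok = self.next()
        if not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"expected a name, got {tok!r}")
        return tok

    def parse_list(self, item, close):
        # One or more comma-separated items followed by the closing delimiter.
        items = [item()]
        while self.peek() == ",":
            self.next()
            items.append(item())
        self.expect(close)
        return items

    def parse_field(self):
        name = self.ident()
        return (name, self.parse_type())

    def parse_type(self):
        tok = self.ident()
        if tok == "record":
            self.expect("(")
            return ("record", self.parse_list(self.parse_field, ")"))
        if tok == "array":
            self.expect("[")
            elem = self.parse_type()
            self.expect("]")
            return ("array", elem)
        if tok == "map":
            self.expect("[")
            key = self.parse_type()
            self.expect(",")
            value = self.parse_type()
            self.expect("]")
            return ("map", key, value)
        if tok == "union":
            self.expect("[")
            return ("union", self.parse_list(self.parse_type, "]"))
        return tok  # any other identifier is treated as a scalar type name


def parse_type(text):
    """Parse one complete type expression into nested tuples and lists."""
    parser = TypeParser(tokenize(text))
    result = parser.parse_type()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input: {parser.peek()!r}")
    return result
```

For example, parse_type("map[int, record (field1 string, field2 int)]") yields a ("map", key, value) tuple whose value is a ("record", [...]) tuple listing the (name, type) pairs. Nested tuples were chosen only to keep the sketch short; a real implementation would presumably build proper schema objects.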



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
