David Chen created TAJO-809:
-------------------------------

             Summary: Language extension for non-scalar types
                 Key: TAJO-809
                 URL: https://issues.apache.org/jira/browse/TAJO-809
             Project: Tajo
          Issue Type: New Feature
            Reporter: David Chen


This ticket tracks the work of defining the syntax for nested schemas, maps, 
arrays, and unions, and of adding that syntax to the parser. Initially, we can 
add stubs for the parser endpoints, which will then be fleshed out when support 
for each data type is actually implemented (see the other subtasks of 
TAJO-710).

I have an idea for a possible DDL syntax for these types, and I would like to 
get your feedback on it. I considered just using Hive's syntax, but I felt 
that it was not the best fit for these types.

Instead of calling nested records "structs" as Hive does, I simply call them 
records and reuse the syntax used for declaring the top-level record fields:

{code}
create table record_example (
    nested_field record (
      field1 int,
      field2 double),
    two_levels_nested record (
      inner_nested record (
        field3 string,
        field4 int),
      field5 int)
  ) using parquet;
{code}

For arrays, maps, and unions, I am using a syntax inspired by Scala's syntax 
for generics:

{code}
create table array_example (
    int_array array[int],
    record_array array[record (
      field1 int,
      field2 string)]
  ) using avro;

create table map_example (
    string_to_int map[string, int],
    int_to_record map[int, record (
      field1 string,
      field2 int)]
  ) using avro;

create table union_example (
    integers union[bit, smallint, integer, bigint]
  ) using parquet;
{code}

Of course, we may change the syntax when we implement these data types, but 
for now, I think we should define an initial language. Once the initial syntax 
has stabilized, I will write a formal grammar for it.
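
To make the proposed type syntax concrete before a formal grammar exists, here 
is a rough recursive-descent sketch in Python. It is not Tajo parser code. The 
names (tokenize, TypeParser, parse_type), the tuple-based output, and the 
grammar in the leading comment are assumptions made for illustration only. The 
sketch also assumes that a trailing comma inside a field or type list is an 
error.

```python
import re

# Assumed informal grammar for the proposed type syntax:
#   type  := 'record' '(' field (',' field)* ')'
#          | 'array' '[' type ']'
#          | 'map'   '[' type ',' type ']'
#          | 'union' '[' type (',' type)* ']'
#          | IDENT                      (a scalar type name such as int)
#   field := IDENT type

# Identifiers/keywords plus the punctuation used by the proposed syntax.
TOKEN_RE = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[()\[\],])")


def tokenize(text):
    """Split a type expression into identifier and punctuation tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class TypeParser:
    """Hypothetical parser that turns a type expression into nested tuples."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def identifier(self):
        token = self.peek()
        if token is None or not (token[0].isalpha() or token[0] == "_"):
            raise SyntaxError(f"expected identifier, found {token!r}")
        self.pos += 1
        return token

    def parse(self):
        result = self.type_expr()
        if self.peek() is not None:
            raise SyntaxError(f"trailing input at token {self.peek()!r}")
        return result

    def type_expr(self):
        name = self.identifier().lower()
        if name == "record":
            # record ( field, field, ... )
            self.expect("(")
            fields = [self.field()]
            while self.peek() == ",":
                self.pos += 1
                fields.append(self.field())
            self.expect(")")
            return ("record", fields)
        if name in ("array", "map", "union"):
            # array[T], map[K, V], union[T1, T2, ...]
            self.expect("[")
            args = [self.type_expr()]
            while self.peek() == ",":
                self.pos += 1
                args.append(self.type_expr())
            self.expect("]")
            arity = {"array": 1, "map": 2}.get(name)
            if arity is not None and len(args) != arity:
                raise SyntaxError(f"{name} takes {arity} type argument(s)")
            return (name, args)
        return name  # scalar type name

    def field(self):
        field_name = self.identifier()
        return (field_name, self.type_expr())


def parse_type(text):
    return TypeParser(text).parse()
```

For example, parse_type("map[int, record (field1 string, field2 int)]") 
returns a ("map", [...]) tuple whose second type argument is a ("record", 
[...]) tuple holding the two fields. Scalar type names are passed through 
unchecked, since the set of valid scalar types is outside the scope of this 
sketch.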



--
This message was sent by Atlassian JIRA
(v6.2#6252)
