[ 
https://issues.apache.org/jira/browse/TAJO-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804800#comment-14804800
 ] 

ASF GitHub Bot commented on TAJO-1832:
--------------------------------------

Github user hyunsik commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/756#discussion_r39816943
  
    --- Diff: tajo-client/src/main/proto/ClientProtos.proto ---
    @@ -198,6 +198,7 @@ message CreateTableRequest {
       required TableProto meta = 4;
       required string path = 5;
       optional PartitionMethodProto partition = 6;
    +  required bool hasSelfDescSchema = 7;
    --- End diff --
    
    In this message definition, SchemaProto is required. Does that mean it may
    carry an empty schema? That semantic seems to differ from TableDesc. It's a
    trivial point, but if the semantics were more consistent, the code would be
    easier to understand.
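
One way to read the consistency concern above is sketched below in Java. This is only an illustration under stated assumptions: `FlaggedRequest` and `OptionalSchemaRequest` are hypothetical classes, not Tajo's generated protobuf classes or its TableDesc, and the rule that a self-describing table should carry an empty schema is an assumption drawn from the question in the comment. The first class mirrors a required schema plus a separate flag, where the two fields can disagree; the second derives the flag from whether a schema is present at all.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class SchemaSemanticsSketch {
    // Hypothetical stand-in for the design in the diff:
    // the schema is always present, and a separate flag marks self-describing tables.
    static final class FlaggedRequest {
        final List<String> columns;       // stands in for the required SchemaProto
        final boolean hasSelfDescSchema;  // stands in for field 7 in the diff

        FlaggedRequest(List<String> columns, boolean hasSelfDescSchema) {
            this.columns = columns;
            this.hasSelfDescSchema = hasSelfDescSchema;
        }

        // Assumed rule: a self-describing table carries an empty schema.
        // The two fields can contradict each other, so callers have to check.
        boolean isConsistent() {
            return !hasSelfDescSchema || columns.isEmpty();
        }
    }

    // Hypothetical alternative: the schema is optional and its absence
    // is the only signal that the table is self-describing.
    static final class OptionalSchemaRequest {
        final Optional<List<String>> columns;

        OptionalSchemaRequest(Optional<List<String>> columns) {
            this.columns = columns;
        }

        boolean hasSelfDescSchema() {
            return !columns.isPresent();
        }
    }

    public static void main(String[] args) {
        FlaggedRequest contradictory = new FlaggedRequest(Arrays.asList("id"), true);
        System.out.println(contradictory.isConsistent());

        OptionalSchemaRequest selfDescribing = new OptionalSchemaRequest(Optional.empty());
        System.out.println(selfDescribing.hasSelfDescSchema());
    }
}
```

With the second shape there is no state in which the flag and the schema disagree, which is the kind of consistency the comment seems to be asking about.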


> Well support for self-describing data formats
> ---------------------------------------------
>
>                 Key: TAJO-1832
>                 URL: https://issues.apache.org/jira/browse/TAJO-1832
>             Project: Tajo
>          Issue Type: New Feature
>          Components: Planner/Optimizer
>            Reporter: Jihoon Son
>            Assignee: Jihoon Son
>
> *Problem*
> Tajo already supports self-describing data formats such as JSON, 
> Parquet, and ORC. Although these formats can provide schema information by 
> themselves, the current implementation still requires users to define a 
> schema before querying them. To remove this inconvenience, we have to improve 
> our query planner so that it supports self-describing data formats well. 
> *Solution*
> First, we need to allow the schema definition to be omitted from the create 
> table statement. When a query is submitted for a self-describing table, the 
> columns that don't exist in that table will be filled with nulls. 
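
The null-filling behavior described in the solution can be sketched as follows. This is a minimal illustration, not Tajo's planner or scanner code: `NullFillSketch` and `project` are hypothetical names, and a record is modeled as a plain `Map` from column name to value. The helper projects one self-describing record onto the columns a query asks for and yields null wherever the record has no such column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NullFillSketch {
    // Hypothetical helper, not Tajo's API: builds one output row in the order
    // of the requested columns, using null for columns the record lacks.
    static List<Object> project(Map<String, Object> record, List<String> columns) {
        List<Object> row = new ArrayList<>(columns.size());
        for (String column : columns) {
            // Map.get returns null when the key is absent, which is the fill value.
            row.add(record.get(column));
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new HashMap<>();
        record.put("id", 1);
        record.put("name", "tajo");

        // "age" does not exist in the record, so its slot is null.
        System.out.println(project(record, Arrays.asList("id", "name", "age")));
        // prints [1, tajo, null]
    }
}
```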



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
