[jira] [Commented] (DRILL-6178) Drill does not project extra columns in some cases

Paul Rogers (JIRA) Wed, 21 Feb 2018 20:57:32 -0800

    [ 
https://issues.apache.org/jira/browse/DRILL-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372408#comment-16372408
 ]


Paul Rogers commented on DRILL-6178:
------------------------------------

May not be directly relevant to this ticket, but text files are special. 
Columns that are projected but do not exist in the file are blank, not null. 
The reasoning is apparently that if the column did exist, it could never be 
null (as CSV only supports blanks, not nulls). So, to ensure that non-existent 
columns are compatible with existing columns, the non-existent columns are 
defined as blank non-nullable Varchar.

To be clear, imagine we have two files, one with (a) and the other with 
(a, b). We do {{SELECT a, b FROM ourFile.csv}}. When reading the first file, b 
is missing, so we make it an empty non-nullable Varchar. In the second file, 
column b exists and is defined as a non-nullable Varchar. Since the two columns 
have the same name and type, they can be merged later in, say, a Merge Receiver.
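
A minimal sketch of the fill rule described above, in Python rather than 
Drill's own Java. All names here (ColumnSchema, read_projected, the sample 
headers and rows) are hypothetical and are not Drill APIs; the sketch only 
illustrates why filling a missing column with blanks yields the same schema 
for both files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnSchema:
    """Hypothetical column descriptor: name, type, and nullability."""
    name: str
    type: str
    nullable: bool


def read_projected(header, rows, projected):
    """Project `projected` columns from one CSV-like file.

    Assumption for illustration: every projected column, present or not,
    is typed as a non-nullable VARCHAR, and a column missing from the
    file is filled with the empty string instead of None.
    """
    schema = [ColumnSchema(name, "VARCHAR", nullable=False) for name in projected]
    position = {name: i for i, name in enumerate(header)}
    out = []
    for row in rows:
        out.append([row[position[n]] if n in position else "" for n in projected])
    return schema, out


# First file has only column a; second file has columns a and b.
schema_1, rows_1 = read_projected(["a"], [["1"], ["2"]], ["a", "b"])
schema_2, rows_2 = read_projected(["a", "b"], [["3", "x"]], ["a", "b"])

# Both files report the same schema, so their batches can be combined.
schemas_match = schema_1 == schema_2
merged = rows_1 + rows_2
```

Had the missing column been filled with nulls, its type would have to be 
nullable, and the two schemas in this sketch would no longer compare equal.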

Given this explanation, it is not clear why the example output has null 
columns. It should have blank columns as in the second column of the example 
output.

> Drill does not project extra columns in some cases
> --------------------------------------------------
>
>                 Key: DRILL-6178
>                 URL: https://issues.apache.org/jira/browse/DRILL-6178
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.12.0
>            Reporter: Robert Hou
>            Assignee: Pritesh Maker
>            Priority: Major
>         Attachments: 10.tbl
>
>
> Drill is supposed to project extra columns as null columns.  This table has 
> 10 columns.  The extra columns are shown as null:
> {noformat}
> 0: jdbc:drill:zk=10.10.104.85:5181> select columns[0], columns[3], 
> columns[4], columns[5], columns[6], columns[7], columns[8], columns[9], 
> columns[10], columns[11], columns[12], columns[13], columns[14], columns[15] 
> from `resource-manager/1.tbl`;
> +---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+----------+----------+----------+----------+
> | EXPR$0 | EXPR$1 | EXPR$2 | EXPR$3 | EXPR$4 | EXPR$5 | EXPR$6 | EXPR$7 | EXPR$8 | EXPR$9 | EXPR$10 | EXPR$11 | EXPR$12 | EXPR$13 |
> +---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+----------+----------+----------+----------+
> | 1 | | null | null | null | null | -61 | -255.0 | null | null | null | null | null | null |
> +---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+----------+----------+----------+----------+{noformat}
>  
> If I run the same query against a table with 10 rows and 10 columns (attached 
> to the Jira), only the 10 columns are shown.
>  
> {noformat}
> select columns[0], columns[1], columns[2], columns[3], columns[4], 
> columns[5], columns[6], columns[7], columns[8], columns[9], columns[10], 
> columns[11], columns[12], columns[13], columns[14], columns[15] from 
> `10.tbl`{noformat}
>  
>  
>  5kwidecolumns_500k.tbl



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
