[
https://issues.apache.org/jira/browse/DRILL-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Venki Korukanti updated DRILL-2342:
-----------------------------------
Attachment: DRILL-2342-3.patch
Updated the patch to treat the "nullable" property as "true" if the view
definition doesn't contain it. Manually tested by running DESCRIBE on views
that were created using the master version of Drill.
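
For illustration only, the defaulting rule described above could be sketched as follows. This is not the code from DRILL-2342-3.patch: the class name, the method name, and the use of a plain property map to stand in for one field entry of a stored view definition are all assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not the code from DRILL-2342-3.patch.
final class ViewFieldNullability {

    // Returns the effective nullability of a view field described by a
    // property map (an assumed stand-in for one field entry of a stored
    // view definition). A missing "nullable" entry is treated as true.
    static boolean isNullable(Map<String, Object> fieldProperties) {
        Object value = fieldProperties.get("nullable");
        if (value == null) {
            return true; // property absent: fall back to nullable
        }
        return Boolean.parseBoolean(value.toString());
    }

    public static void main(String[] args) {
        Map<String, Object> field = new LinkedHashMap<>();
        field.put("name", "EXPR$0");
        field.put("type", "INTEGER");
        // No "nullable" entry, as in a view definition that lacks the property.
        System.out.println(isNullable(field));
    }
}
```

Defaulting to "true" is the conservative choice here: reporting a column as nullable when it never holds nulls costs nothing in correctness, while the opposite mistake is the problem this issue describes.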
> Nullability property of the view created from parquet file is not correct
> -------------------------------------------------------------------------
>
> Key: DRILL-2342
> URL: https://issues.apache.org/jira/browse/DRILL-2342
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata
> Affects Versions: 0.8.0
> Reporter: Victoria Markman
> Assignee: Venki Korukanti
> Priority: Critical
> Fix For: 0.9.0
>
> Attachments: DRILL-2342-1.patch, DRILL-2342-3.patch,
> DRILL-2343-2.patch, t1.parquet
>
>
> Here is my t1 table definition:
> {code}
> message root {
> optional int32 a1;
> optional binary b1 (UTF8);
> optional int32 c1 (DATE);
> }
> {code}
> I created a view on top of it:
> {code}
> 0: jdbc:drill:schema=dfs> create view v1 as select cast(a1 as int), cast(b1
> as varchar(10)), cast(c1 as date) from t1;
> +------------+------------+
> | ok | summary |
> +------------+------------+
> | true | View 'v1' created successfully in 'dfs.aggregation' schema |
> +------------+------------+
> 1 row selected (0.096 seconds)
> {code}
> IS_NULLABLE says 'NO', which is incorrect.
> {code}
> 0: jdbc:drill:schema=dfs> describe v1;
> +-------------+------------+-------------+
> | COLUMN_NAME | DATA_TYPE | IS_NULLABLE |
> +-------------+------------+-------------+
> | EXPR$0 | INTEGER | NO |
> | EXPR$1 | VARCHAR | NO |
> | EXPR$2 | DATE | NO |
> +-------------+------------+-------------+
> 3 rows selected (0.067 seconds)
> {code}
> This is potentially dangerous: if Calcite decided tomorrow to take advantage
> of this property and added an optimization that drops an "is null" predicate
> when the column is not nullable, the query "select * from v1 where x is null"
> would return an incorrect result.
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select * from v1 where z is null;
> +------------+------------+
> | text | json |
> +------------+------------+
> | 00-00 Screen
> 00-01 Project(x=[$0], y=[$1], z=[$2])
> 00-02 SelectionVectorRemover
> 00-03 Filter(condition=[IS NULL($2)])
> 00-04 Project(x=[CAST($2):ANY NOT NULL], y=[CAST($1):ANY NOT
> NULL], z=[CAST($0):ANY NOT NULL])
> 00-05 Scan(groupscan=[ParquetGroupScan
> [entries=[ReadEntryWithPath [path=maprfs:/aggregation/t1]],
> selectionRoot=/aggregation/t1, numFiles=1, columns=[`a1`, `b1`, `c1`]]])
> {code}
> It seems to me that columns in views should always be nullable.
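
The concern raised in the description can be illustrated with a small, self-contained simulation. It is only a sketch of the hazard: the class and method names are hypothetical, nothing here is a Calcite or Drill API, and the "folding" step is an imagined optimization of the kind the description worries about, not one the source says exists.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical simulation; none of these names come from Calcite or Drill.
final class IsNullFoldingHazard {

    // Evaluates "WHERE col IS NULL" faithfully against the actual data.
    static List<Integer> filterIsNull(List<Integer> column) {
        return column.stream().filter(Objects::isNull).collect(Collectors.toList());
    }

    // Imagined optimization: when metadata declares the column NOT NULL,
    // "col IS NULL" is folded to FALSE, so no rows are returned at all.
    static List<Integer> filterIsNullWithFolding(List<Integer> column, boolean declaredNullable) {
        if (!declaredNullable) {
            return Collections.emptyList();
        }
        return filterIsNull(column);
    }

    public static void main(String[] args) {
        // The underlying column is optional and really does contain a null.
        List<Integer> column = Arrays.asList(1, null, 3);
        int faithful = filterIsNull(column).size();
        int wrongMetadata = filterIsNullWithFolding(column, false).size();
        int rightMetadata = filterIsNullWithFolding(column, true).size();
        System.out.println(faithful + " " + wrongMetadata + " " + rightMetadata);
    }
}
```

With metadata that wrongly reports the column as not nullable, the folded filter returns no rows even though the data contains a null. With the column reported as nullable, both paths agree.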