[ https://issues.apache.org/jira/browse/DRILL-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chun Chang closed DRILL-3551.
-----------------------------
Assignee: Chun Chang (was: Hanifi Gunes)
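Verified the fix. The table created with CTAS returns the same null and non-null counts for `others.additional` as the JSON source: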
{code}
0: jdbc:drill:schema=dfs.drillTestDirDropTabl> select count(*) from `DRILL-3551.json`;
+---------+
| EXPR$0  |
+---------+
| 20203   |
+---------+
1 row selected (0.285 seconds)
0: jdbc:drill:schema=dfs.drillTestDirDropTabl> select count(*) from `DRILL-3551.json` t where t.others.additional is not null;
+---------+
| EXPR$0  |
+---------+
| 201     |
+---------+
1 row selected (0.343 seconds)
0: jdbc:drill:schema=dfs.drillTestDirDropTabl> select count(*) from `DRILL-3551.json` t where t.others.additional is null;
+---------+
| EXPR$0  |
+---------+
| 20002   |
+---------+
1 row selected (0.344 seconds)
0: jdbc:drill:schema=dfs.drillTestDirDropTabl> create table tp as select * from `DRILL-3551.json`;
+-----------+----------------------------+
| Fragment  | Number of records written  |
+-----------+----------------------------+
| 0_0       | 20203                      |
+-----------+----------------------------+
1 row selected (1.006 seconds)
0: jdbc:drill:schema=dfs.drillTestDirDropTabl> select count(*) from tp t where t.others.additional is not null;
+---------+
| EXPR$0  |
+---------+
| 201     |
+---------+
1 row selected (0.381 seconds)
0: jdbc:drill:schema=dfs.drillTestDirDropTabl> select count(*) from tp t where t.others.additional is null;
+---------+
| EXPR$0  |
+---------+
| 20002   |
+---------+
1 row selected (0.314 seconds)
{code}
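The three counts above can be cross-checked outside Drill. The following is only a sketch: it assumes the attachment stores one JSON object per line (as in the samples quoted in the issue), and the helper name `count_additional` is hypothetical.

```python
import json


def count_additional(lines):
    """Return (total, not_null, null) counts for the nested field others.additional.

    `lines` is any iterable of strings, each assumed to hold one JSON object.
    Blank lines are skipped.
    """
    total = 0
    not_null = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        total += 1
        # A missing "others" map or a missing "additional" key both count as null,
        # mirroring how the `is null` predicate treats an absent field.
        others = record.get("others") or {}
        if others.get("additional") is not None:
            not_null += 1
    return total, not_null, total - not_null
```

Usage, assuming the file is available locally under its attachment name:

```python
with open("DRILL-3551.json", encoding="utf-8") as fh:
    total, not_null, null = count_additional(fh)
print(total, not_null, null)
```

The null and non-null counts should add up to the total, as they do in the output above (201 + 20002 = 20203).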
> CTAS from complex Json source with schema change is not written (and hence
> not read back) correctly
> -----------------------------------------------------------------------------------------------------
>
> Key: DRILL-3551
> URL: https://issues.apache.org/jira/browse/DRILL-3551
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Data Types
> Affects Versions: 1.1.0
> Reporter: Parth Chandra
> Assignee: Chun Chang
> Priority: Critical
> Fix For: 1.2.0
>
> Attachments: DRILL-3551.json
>
>
> The source data contains:
> 20K rows of the form
> {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes"}}
> and 200 rows of the form
> {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes","additional":"last entries only"}}
> Creating a table from this file and reading it back returns incorrect data:
> CREATE TABLE testparquet AS SELECT * FROM `test.json`;
> SELECT * FROM testparquet;
> yields
> | yes | {"other":"true","all":"false","sometimes":"yes"} |
> | yes | {"other":"true","all":"false","sometimes":"yes"} |
> | yes | {"other":"true","all":"false","sometimes":"yes"} |
> | yes | {"other":"true","all":"false","sometimes":"yes"} |
> The "additional" field is missing from all records.
> The Parquet metadata for the created file does not contain the "additional" field either.
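A file with the shape described in the issue can be regenerated with a short script. This is a minimal sketch, not the script used to produce the attachment: the function names, the output file name `test.json`, and the one-object-per-line layout are assumptions, and the row counts follow the description (20K and 200) rather than the attached file, which the verification output shows to be slightly larger.

```python
import json

# Record shape taken from the issue description.
BASE_OTHERS = {"other": "true", "all": "false", "sometimes": "yes"}


def make_records(n_base=20000, n_extended=200):
    """Yield n_base records without others.additional, then n_extended with it."""
    base = {"some": "yes", "others": dict(BASE_OTHERS)}
    extended = {
        "some": "yes",
        "others": {**BASE_OTHERS, "additional": "last entries only"},
    }
    for _ in range(n_base):
        yield base
    for _ in range(n_extended):
        yield extended


def write_json_lines(path, records):
    """Write one compact JSON object per line."""
    with open(path, "w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, separators=(",", ":")) + "\n")


if __name__ == "__main__":
    write_json_lines("test.json", make_records())
```

Placing the records that carry the extra field at the end means the schema change arrives only after many rows with the narrower schema have been read.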
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)