Parth Chandra created DRILL-3551:
------------------------------------
Summary: CTAS from complex JSON source with schema change is not written (and hence not read back) correctly
Key: DRILL-3551
URL: https://issues.apache.org/jira/browse/DRILL-3551
Project: Apache Drill
Issue Type: Bug
Components: Execution - Data Types
Affects Versions: 1.1.0
Reporter: Parth Chandra
Assignee: Hanifi Gunes
Priority: Critical
Fix For: 1.2.0
The source data contains -
20K rows with the following -
{"some":"yes","others":{"other":"true","all":"false","sometimes":"yes"}}
200 rows with the following -
{"some":"yes","others":{"other":"true","all":"false","sometimes":"yes","additional":"last entries only"}}
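
For anyone trying to reproduce this, the source file described above could be generated with a small script along the following lines. This is only a sketch, not the reporter's actual tooling: the function name write_test_json is hypothetical, "20K" is assumed to mean exactly 20000, and the 200 extended rows are assumed to come after the 20K base rows (as the "last entries only" value suggests).

```python
import json


def write_test_json(path="test.json", base_rows=20000, extra_rows=200):
    """Write newline-delimited JSON resembling the data in this report.

    Assumption: the rows carrying the extra "additional" field are written
    after all of the base rows, so the schema change occurs late in the file.
    """
    base = {
        "some": "yes",
        "others": {"other": "true", "all": "false", "sometimes": "yes"},
    }
    # Same record, with one more field inside the nested "others" map.
    extended = {
        "some": "yes",
        "others": dict(base["others"], additional="last entries only"),
    }
    with open(path, "w") as f:
        for _ in range(base_rows):
            f.write(json.dumps(base, separators=(",", ":")) + "\n")
        for _ in range(extra_rows):
            f.write(json.dumps(extended, separators=(",", ":")) + "\n")
    return base_rows + extra_rows


if __name__ == "__main__":
    write_test_json()
```

Running the script writes test.json to the current directory; the file then needs to be placed where the Drill workspace used for the queries below can read it.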
Creating a table and reading it back returns incorrect data -
CREATE TABLE testparquet as select * from `test.json`;
SELECT * from testparquet;
Yields
| yes | {"other":"true","all":"false","sometimes":"yes"} |
| yes | {"other":"true","all":"false","sometimes":"yes"} |
| yes | {"other":"true","all":"false","sometimes":"yes"} |
| yes | {"other":"true","all":"false","sometimes":"yes"} |
The "additional" field is missing from all records.
The Parquet metadata for the created file does not contain the "additional" field either.