Parth Chandra created DRILL-3551:
------------------------------------

             Summary: CTAS from complex JSON source with schema change is not written (and hence not read back) correctly
                 Key: DRILL-3551
                 URL: https://issues.apache.org/jira/browse/DRILL-3551
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Data Types
    Affects Versions: 1.1.0
            Reporter: Parth Chandra
            Assignee: Hanifi Gunes
            Priority: Critical
             Fix For: 1.2.0


The source data contains - 

20K rows with the following - 
{"some":"yes","others":{"other":"true","all":"false","sometimes":"yes"}}   

200 rows with the following - 
{"some":"yes","others":{"other":"true","all":"false","sometimes":"yes","additional":"last entries only"}}
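
To regenerate a source file of this shape, something like the following Python sketch may help. It assumes the 200 records carrying the "additional" field come after the 20K base records (the field's value suggests this, but the report does not state the ordering explicitly) and that the file holds one record per line. The helper names build_rows and write_source are made up for illustration.

```python
import json

# Row counts are taken from the report; placing the extra-field rows
# last is an assumption.
BASE_ROWS = 20_000
EXTRA_ROWS = 200


def build_rows():
    """Return 20K base records followed by 200 records with "additional"."""
    base = {
        "some": "yes",
        "others": {"other": "true", "all": "false", "sometimes": "yes"},
    }
    extra = {
        "some": "yes",
        "others": {
            "other": "true",
            "all": "false",
            "sometimes": "yes",
            "additional": "last entries only",
        },
    }

    def dump(record):
        # Compact separators reproduce the formatting of the sample records.
        return json.dumps(record, separators=(",", ":"))

    return [dump(base)] * BASE_ROWS + [dump(extra)] * EXTRA_ROWS


def write_source(path):
    """Write one JSON record per line to `path`."""
    with open(path, "w") as f:
        f.write("\n".join(build_rows()) + "\n")
```

Calling write_source("test.json") in a directory visible to Drill should produce a file matching the description above.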

Creating a table and reading it back returns incorrect data - 

CREATE TABLE testparquet AS SELECT * FROM `test.json`;
SELECT * FROM testparquet;

Yields 

| yes  | {"other":"true","all":"false","sometimes":"yes"}  |
| yes  | {"other":"true","all":"false","sometimes":"yes"}  |
| yes  | {"other":"true","all":"false","sometimes":"yes"}  |
| yes  | {"other":"true","all":"false","sometimes":"yes"}  |

The "additional" field is missing from all records.

The Parquet metadata for the created file does not contain the "additional" field either.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
