Could you attach a sample input file that reproduces the problem? My impression from the outset was that a field selection bug we recently fixed might have caused this.
Thanks.
-Hanifi

On Wed, Jul 29, 2015 at 5:07 PM, Stefán Baxter <[email protected]> wrote:

> Hi,
>
> I think that this problem only showed it self for large datasets where
> assumptions were being made after 1k records.
>
> Were you able to reproduce this with a smaller set?
>
> Regards,
> -Stefan
>
>
> On Wed, Jul 29, 2015 at 2:01 PM, Hanifi Gunes (JIRA) <[email protected]>
> wrote:
>
> > [ https://issues.apache.org/jira/browse/DRILL-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> >
> > Hanifi Gunes resolved DRILL-3551.
> > ---------------------------------
> >     Resolution: Fixed
> >
> > Tested on a small input file of 20 mixed records with and w/o the
> > additional field. Looks like the good old field projection problem
> > surfaces here. So quite likely fixed by DRILL-3476. Please re-open
> > attaching an input file if not fixed.
> >
> > > CTAS from complex Json source with schema change is not written (and
> > > hence not read back ) correctly
> > > -----------------------------------------------------------------------------------------------------
> > >
> > >                 Key: DRILL-3551
> > >                 URL: https://issues.apache.org/jira/browse/DRILL-3551
> > >             Project: Apache Drill
> > >          Issue Type: Bug
> > >          Components: Execution - Data Types
> > >    Affects Versions: 1.1.0
> > >            Reporter: Parth Chandra
> > >            Assignee: Hanifi Gunes
> > >            Priority: Critical
> > >             Fix For: 1.2.0
> > >
> > >
> > > The source data contains -
> > > 20K rows with the following -
> > > {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes"}}
> > > 200 rows with the following -
> > > {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes","additional":"last entries only"}}
> > > Creating a table and reading it back returns incorrect data -
> > > CREATE TABLE testparquet as select * from `test.json`;
> > > SELECT * from testparquet;
> > > Yields
> > > | yes | {"other":"true","all":"false","sometimes":"yes"} |
> > > | yes | {"other":"true","all":"false","sometimes":"yes"} |
> > > | yes | {"other":"true","all":"false","sometimes":"yes"} |
> > > | yes | {"other":"true","all":"false","sometimes":"yes"} |
> > > The "additional" field is missing in all records
> > > Parquet metadata for the created file does not have the 'additional'
> > > field
> >
> >
> >
> > --
> > This message was sent by Atlassian JIRA
> > (v6.3.4#6332)
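
Since the report describes the input only by shape (20K rows without the "additional" field, then 200 rows with it), a full-size file can be regenerated rather than attached. The following is a minimal sketch under stated assumptions: the file name test.json and the two record layouts come from the report, while the function name write_sample, its parameters, and the one-record-per-line layout are hypothetical choices made for illustration.

```python
import json


def write_sample(path="test.json", base_rows=20_000, extra_rows=200):
    """Write base_rows records without "additional", then extra_rows with it.

    Hypothetical helper: the record contents mirror the DRILL-3551 report,
    but the one-JSON-object-per-line layout is an assumption.
    """
    base = {
        "some": "yes",
        "others": {"other": "true", "all": "false", "sometimes": "yes"},
    }
    # Same record, with the extra nested field appearing only at the end.
    extended = {
        "some": "yes",
        "others": {**base["others"], "additional": "last entries only"},
    }
    # Compact separators reproduce the formatting shown in the report.
    compact = (",", ":")
    with open(path, "w", encoding="utf-8") as fh:
        for _ in range(base_rows):
            fh.write(json.dumps(base, separators=compact) + "\n")
        for _ in range(extra_rows):
            fh.write(json.dumps(extended, separators=compact) + "\n")
    return base_rows + extra_rows
```

Calling write_sample() with its defaults produces a 20,200-line test.json, so the schema change arrives well past the first 1k records; passing smaller counts (for example base_rows=10, extra_rows=10) gives a small mixed file comparable to the 20-record test mentioned in the resolution.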
