Just an FYI, I dropped a comment under the issue.

-H+
On Wed, Jul 29, 2015 at 5:40 PM, Hanifi Gunes <[email protected]> wrote:

> Would you attach a sample input file manifesting the problem? My
> impression from the outset was that a field selection bug that we recently
> fixed might have caused this.
>
>
> Thanks.
> -Hanifi
>
> On Wed, Jul 29, 2015 at 5:07 PM, Stefán Baxter <[email protected]>
> wrote:
>
>> Hi,
>>
>> I think that this problem only showed itself for large datasets where
>> assumptions were being made after 1k records.
>>
>> Were you able to reproduce this with a smaller set?
>>
>> Regards,
>> -Stefan
>>
>>
>> On Wed, Jul 29, 2015 at 2:01 PM, Hanifi Gunes (JIRA) <[email protected]>
>> wrote:
>>
>> >
>> > [
>> > https://issues.apache.org/jira/browse/DRILL-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>> > ]
>> >
>> > Hanifi Gunes resolved DRILL-3551.
>> > ---------------------------------
>> > Resolution: Fixed
>> >
>> > Tested on a small input file of 20 mixed records with and w/o the
>> > additional field. Looks like the good old field projection problem
>> > surfaces here. So quite likely fixed by DRILL-3476. Please re-open
>> > attaching an input file if not fixed.
>> >
>> > > CTAS from complex Json source with schema change is not written (and
>> > > hence not read back) correctly
>> > > -----------------------------------------------------------------------------------------------------
>> > >
>> > > Key: DRILL-3551
>> > > URL: https://issues.apache.org/jira/browse/DRILL-3551
>> > > Project: Apache Drill
>> > > Issue Type: Bug
>> > > Components: Execution - Data Types
>> > > Affects Versions: 1.1.0
>> > > Reporter: Parth Chandra
>> > > Assignee: Hanifi Gunes
>> > > Priority: Critical
>> > > Fix For: 1.2.0
>> > >
>> > >
>> > > The source data contains -
>> > > 20K rows with the following -
>> > > {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes"}}
>> > > 200 rows with the following -
>> > > {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes","additional":"last entries only"}}
>> > > Creating a table and reading it back returns incorrect data -
>> > > CREATE TABLE testparquet as select * from `test.json`;
>> > > SELECT * from testparquet;
>> > > Yields
>> > > | yes | {"other":"true","all":"false","sometimes":"yes"} |
>> > > | yes | {"other":"true","all":"false","sometimes":"yes"} |
>> > > | yes | {"other":"true","all":"false","sometimes":"yes"} |
>> > > | yes | {"other":"true","all":"false","sometimes":"yes"} |
>> > > The "additional" field is missing in all records
>> > > Parquet metadata for the created file does not have the 'additional'
>> > > field
>> >
>> >
>> >
>> > --
>> > This message was sent by Atlassian JIRA
>> > (v6.3.4#6332)
>> >
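For anyone wanting to reproduce the report, the input described in DRILL-3551 could be regenerated with a short script. The following is a minimal Python sketch and not something posted in the thread: the function names `build_rows` and `write_sample` are hypothetical, and only the row shapes, the counts (20K and 200), and the file name `test.json` are taken from the issue text.

```python
import json


def build_rows(base_count=20_000, extra_count=200):
    """Return JSON lines: base_count rows without 'additional',
    followed by extra_count rows that include it (the schema change)."""
    base = {
        "some": "yes",
        "others": {"other": "true", "all": "false", "sometimes": "yes"},
    }
    # Same record, with the extra nested field appended last.
    extended = {
        "some": "yes",
        "others": {**base["others"], "additional": "last entries only"},
    }
    compact = (",", ":")  # match the compact JSON shown in the issue
    rows = [json.dumps(base, separators=compact)] * base_count
    rows += [json.dumps(extended, separators=compact)] * extra_count
    return rows


def write_sample(path="test.json", base_count=20_000, extra_count=200):
    """Write one JSON record per line to `path` (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(build_rows(base_count, extra_count)) + "\n")


if __name__ == "__main__":
    write_sample()
```

Running the CREATE TABLE and SELECT statements from the issue against the generated file should then show whether the `additional` field survives the round trip.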
