[ https://issues.apache.org/jira/browse/DRILL-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Denys Ordynskiy closed DRILL-6874.
----------------------------------

Successfully tested on parquet, json and csv files in CTAS.

> CTAS from json to parquet is not working on S3 storage
> ------------------------------------------------------
>
>                 Key: DRILL-6874
>                 URL: https://issues.apache.org/jira/browse/DRILL-6874
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.14.0
>            Reporter: Denys Ordynskiy
>            Assignee: Bohdan Kazydub
>            Priority: Major
>              Labels: ready-to-commit
>             Fix For: 1.15.0
>
>         Attachments: ctasjsontoparquet.zip, drillbit.log,
> drillbit_queries.json, s3src.json, sqlline.log
>
>
> Json file "s3src.json" was uploaded to the s3 storage.
> Query from Json works fine:
> select * from s3.tmp.`s3src.json`;
> | id | first_name  | last_name  |
> | 1  | first_name1 | last_name1 |
> | 2  | first_name2 | last_name2 |
> | 3  | first_name3 | last_name3 |
> | 4  | first_name4 | last_name4 |
> | 5  | first_name5 | last_name5 |
> 5 rows selected (2.803 seconds)
> CTAS from this json file returns a successful result:
> create table s3.tmp.`ctasjsontoparquet` as select * from s3.tmp.`s3src.json`;
> | Fragment | Number of records written |
> | 0_0      | 5                         |
> 1 row selected (9.264 seconds)
> *Query from the created parquet table {color:#d04437}throws an error:{color}*
> select * from s3.tmp.`ctasjsontoparquet`;
> {code:java}
> Error: INTERNAL_ERROR ERROR: Error in parquet record reader.
> Message: Failure in setting up reader
> Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {
> optional int64 id;
> optional binary first_name (UTF8);
> optional binary last_name (UTF8);
> }
> , metadata: {drill-writer.version=2, drill.version=1.15.0-SNAPSHOT}}, blocks:
> [BlockMetaData{5, 360 [ColumnMetaData{UNCOMPRESSED [id] optional int64 id
> [BIT_PACKED, RLE, PLAIN], 4}, ColumnMetaData{UNCOMPRESSED [first_name]
> optional binary first_name (UTF8) [BIT_PACKED, RLE, PLAIN], 111},
> ColumnMetaData{UNCOMPRESSED [last_name] optional binary last_name (UTF8)
> [BIT_PACKED, RLE, PLAIN], 241}]}]}
> Fragment 0:0
> Please, refer to logs for more information.
> [Error Id: 885723e4-8385-4fb0-87dd-c08b0570db95 on maprhost:31010]
> (state=,code=0)
> {code}
> The same CTAS query works fine on MapRFS and FileSystem storages.
> Log files, json file and created parquet file from S3 are in the attachments.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)