I also just noticed that the field names in these files were written with whitespace in them, and that some include raw strings describing data types, like "(INT_16)", in case that makes a difference.
On Thu, May 7, 2015 at 3:46 PM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:

> Here's the schema of that record.parquet file (edited for brevity):
>
> $ ~/hadoop-2.6.0/bin/hadoop jar ~/parquet-mr/parquet-tools/target/parquet-tools-1.8.0-SNAPSHOT.jar schema ~/record.parquet
> message some/message/tokens path/identifying/data/location {
>   repeated double a;
>   repeated double b;
>   repeated float c;
>   ...
>   repeated double z;
> }
>
> On Thu, May 7, 2015 at 3:41 PM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:
>
>> I'm trying to read a parquet file in Pig, using parquet-mr jars built
>> from master. Should I be building from a release tag?
>>
>> Pig version is binary 0.14.
>>
>> grunt> register /home/akm/parquet-mr/parquet-*/target/parquet-*-1.8.0-SNAPSHOT.jar;
>> grunt> a = load '/home/akm/record.parquet' using org.apache.parquet.pig.ParquetLoader;
>> 2015-05-07 15:39:41,860 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
>> 2015-05-07 15:39:41,878 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2218: Invalid resource schema: bag schema must have tuple as its field
>> Details at logfile: /home/akm/pig_1431036955635.log
>>
>> And in that logfile:
>>
>> Pig Stack Trace
>> ---------------
>> ERROR 2218: Invalid resource schema: bag schema must have tuple as its field
>>
>> Failed to parse: Can not retrieve schema from loader org.apache.parquet.pig.ParquetLoader@1be72d8
>>     at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:201)
>>     at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1707)
>>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1680)
>>     at org.apache.pig.PigServer.registerQuery(PigServer.java:623)
>>     at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1063)
>>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
>>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
>>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
>>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
>>     at org.apache.pig.Main.run(Main.java:558)
>>     at org.apache.pig.Main.main(Main.java:170)
>> Caused by: java.lang.RuntimeException: Can not retrieve schema from loader org.apache.parquet.pig.ParquetLoader@1be72d8
>>     at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:91)
>>     at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:901)
>>     at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
>>     at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
>>     at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
>>     at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
>>     at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
>>     at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
>>     ... 10 more
>> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2245: Cannot get schema from loadFunc org.apache.parquet.pig.ParquetLoader
>>     at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:179)
>>     at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
>>     ... 17 more
>> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2218: Invalid resource schema: bag schema must have tuple as its field
>>     at org.apache.pig.ResourceSchema$ResourceFieldSchema.throwInvalidSchemaException(ResourceSchema.java:216)
>>     at org.apache.pig.impl.logicalLayer.schema.Schema.getPigSchema(Schema.java:1916)
>>     at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:176)
>>     ... 18 more
>>
>> ================================================================================