Hi there, I’m new to Hadoop and Pig. While using them recently, I ran into a problem that I can’t explain. It may be easy for someone more experienced. Can anybody help? Thanks a lot!
It’s about the MAPREDUCE operator. Here is my .pig file in short:

    register ../biopig/target/biopig-job.jar;
    %default reads 'test.fas';
    A = load '$reads' using gov.jgi.meta.pig.storage.FastaStorage as (id: chararray, d: int, seq: bytearray);
    … blabla …
    LG = foreach LG generate group.id1, group.id2;
    GAP = mapreduce 'GPartition.jar' STORE A into 'input' LOAD 'output' as (id:chararray, read:chararray);
    dump GAP;

Error message:

    Pig Stack Trace
    ---------------
    ERROR 1200: null

    org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. null
        at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1725)
        at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1420)
        at org.apache.pig.PigServer.parseAndBuild(PigServer.java:364)
        at org.apache.pig.PigServer.executeBatch(PigServer.java:389)
        at org.apache.pig.PigServer.executeBatch(PigServer.java:375)
        at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:170)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:747)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:228)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:203)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
        at org.apache.pig.Main.run(Main.java:608)
        at org.apache.pig.Main.main(Main.java:156)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
    Caused by: Failed to parse: null
        at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:198)
        at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1712)
        ... 18 more
    Caused by: java.lang.NullPointerException
        at org.apache.pig.parser.LogicalPlanBuilder.unquote(LogicalPlanBuilder.java:1329)
        at org.apache.pig.parser.LogicalPlanGenerator.mr_clause(LogicalPlanGenerator.java:18238)
        at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1911)
        at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
        at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
        at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
        at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
        ... 19 more
    ================================================================================

Since the message gives so little useful information, I cannot figure out what caused this. Everything before the “GAP” line works fine: if I add “DUMP LG” before the “GAP” line, I get the expected results. So I think the “GAP” line causes the error.

As for GPartition.jar, I tested it using “$ hadoop jar Partition.jar” and it works well. It reads from the file “input” and stores its results in the file “output”. However, it is not a real MapReduce job: it defines neither a mapper class nor a reducer class. It’s a serial program that works with HDFS. Will this be a problem? Or is there just a syntax error in my pig file?

BTW, I use Hadoop-2.7.1 and pig-0.15.0. Partition.jar is in the same directory as the .pig file.
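
For comparison, the MAPREDUCE example in the Pig documentation, as far as I can tell, also passes the jar's main class and its arguments in backticks after the schema, which my statement does not do. Adapted to my script, that form would look roughly like the sketch below. The class name GPartition and the two arguments are placeholders (assumptions), not the real names in my jar, and I don't know whether the missing backtick part is related to the NullPointerException.

```
-- Sketch only: 'GPartition' (main class) and the arguments 'input' and 'output'
-- are hypothetical placeholders, not taken from the actual jar.
GAP = MAPREDUCE 'GPartition.jar'
      STORE A INTO 'input'
      LOAD 'output' AS (id:chararray, read:chararray)
      `GPartition input output`;
DUMP GAP;
```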