I think this specific issue is worth filing as a new bug, since there wasn't a ticket on my radar that covered getSchemaFromString failing on this kind of schema.
2012/2/5 Russell Jurney <russell.jur...@gmail.com>

> Do I file this, or is it a dupe? I saw lots of existing tickets that look similar.
>
> On Sun, Feb 5, 2012 at 1:53 PM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:
>
>> That tuple name has been made optional, but I guess some places still assume it exists.
>> + jon.
>>
>> On Sun, Feb 5, 2012 at 1:16 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
>>
>>> This now seems like a bug in Utils.getSchemaFromString.
>>>
>>> On Sun, Feb 5, 2012 at 1:02 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
>>>
>>>> To answer my own question, this is because the schemas differ. The schema in the working case has a named tuple via AvroStorage. Storing to Mongo works when I name the tuple:
>>>>
>>>> ...
>>>> sent_topics = FOREACH froms GENERATE FLATTEN(group) AS (from, to),
>>>>     pairs.subject AS pairs:bag {column:tuple (subject:chararray)};
>>>>
>>>> STORE sent_topics INTO 'mongodb://localhost/test.pigola' USING MongoStorage();
>>>>
>>>> I will stop cross-posting to myself now.
>>>>
>>>> On Sun, Feb 5, 2012 at 12:47 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>
>>>>> sent_topics = LOAD '/tmp/pair_titles.avro' USING AvroStorage();
>>>>> STORE sent_topics INTO 'mongodb://localhost/test.pigola' USING MongoStorage();
>>>>>
>>>>> That works. Why does MongoStorage only work when the intermediate processing doesn't happen? Strangeness.
>>>>>
>>>>> On Sun, Feb 5, 2012 at 12:31 AM, Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>>
>>>>>> MongoStorage is failing for me now, on a script that was working before. Is anyone else using it? The schema is [from:chararray, to:chararray, pairs:{null:(subject:chararray)}], which worked before.
>>>>>>
>>>>>> 2012-02-05 00:27:54,991 [Thread-15] INFO com.mongodb.hadoop.pig.MongoStorage - Store Location Config: Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, /tmp/hadoop-rjurney/mapred/local/localRunner/job_local_0001.xml For URI: mongodb://localhost/test.pigola
>>>>>> 2012-02-05 00:27:54,993 [Thread-15] INFO com.mongodb.hadoop.pig.MongoStorage - OutputFormat... com.mongodb.hadoop.MongoOutputFormat@4eb7cd92
>>>>>> 2012-02-05 00:27:55,291 [Thread-15] INFO com.mongodb.hadoop.pig.MongoStorage - Preparing to write to com.mongodb.hadoop.output.MongoRecordWriter@333ec758
>>>>>> Failed to parse: <line 1, column 35> rule identifier failed predicate: {!input.LT(1).getText().equalsIgnoreCase("NULL")}?
>>>>>>     at org.apache.pig.parser.QueryParserDriver.parseSchema(QueryParserDriver.java:79)
>>>>>>     at org.apache.pig.parser.QueryParserDriver.parseSchema(QueryParserDriver.java:93)
>>>>>>     at org.apache.pig.impl.util.Utils.parseSchema(Utils.java:175)
>>>>>>     at org.apache.pig.impl.util.Utils.getSchemaFromString(Utils.java:166)
>>>>>>     at com.mongodb.hadoop.pig.MongoStorage.prepareToWrite(MongoStorage.java:186)
>>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.<init>(PigOutputFormat.java:125)
>>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:86)
>>>>>>     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:553)
>>>>>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>>>>>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
>>>>>> 2012-02-05 00:27:55,320 [Thread-15] INFO com.mongodb.hadoop.pig.MongoStorage - Stored Schema: [from:chararray, to:chararray, pairs:{null:(subject:chararray)}]
>>>>>> 2012-02-05 00:27:55,323 [Thread-15] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
>>>>>> java.io.IOException: java.lang.NullPointerException
>>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:464)
>>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:427)
>>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:407)
>>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:261)
>>>>>>     at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
>>>>>>     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
>>>>>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>>>>>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
>>>>>> Caused by: java.lang.NullPointerException
>>>>>>     at com.mongodb.hadoop.pig.MongoStorage.putNext(MongoStorage.java:68)
>>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>>>>>>     at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:508)
>>>>>>     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>>>>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:462)
>>>>>>     ... 7 more
>
> --
> Russell Jurney
> twitter.com/rjurney
> russell.jur...@gmail.com
> datasyndrome.com