Ok, yes, I found a Jira issue that reports a bug matching what I'm describing, along with a workaround: http://search-hadoop.com/m/N2nF12WQ02y/+%2522log+file%2522&subj=dynamically+calling+STORE
Thanks for your help!

On Mon, Oct 10, 2011 at 9:06 PM, Dustin Whitney <[email protected]> wrote:

> Thanks for your help. I'm using Elastic Map Reduce, so Pig 0.6, and
> running:
>
> STORE FILES INTO '/mnt/output' USING
> org.apache.pig.piggybank.storage.MultiStorage('/mnt/output','0', 'gz', '\\t');
>
> And getting an error (stack trace below) that it can't create a directory.
> I see that it's creating a file called /mnt/output, but not a directory. Is
> this perhaps a bug in the version of Pig running on Elastic Map Reduce?
>
> Pig Stack Trace
> ---------------
> ERROR 2135: Received error from store function.Mkdirs failed to create
> /mnt/output/tmcustomer-2011-10-07-GET-200
>
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to
> store alias 699
>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1004)
>     at org.apache.pig.PigServer.registerQuery(PigServer.java:386)
>     at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:739)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
>     at org.apache.pig.Main.main(Main.java:374)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR
> 2135: Received error from store function.Mkdirs failed to create
> /mnt/output/tmcustomer-2011-10-07-GET-200
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:140)
>     at org.apache.pig.backend.local.executionengine.LocalPigLauncher.runPipeline(LocalPigLauncher.java:149)
>     at org.apache.pig.backend.local.executionengine.LocalPigLauncher.launchPig(LocalPigLauncher.java:110)
>     at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:165)
>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:781)
>     at org.apache.pig.PigServer.execute(PigServer.java:774)
>     at org.apache.pig.PigServer.access$100(PigServer.java:90)
>     at org.apache.pig.PigServer$Graph.execute(PigServer.java:952)
>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:999)
>     ... 12 more
> Caused by: java.io.IOException: Mkdirs failed to create
> /mnt/output/tmcustomer-2011-10-07-GET-200
>     at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:367)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:524)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:505)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:412)
>     at org.apache.pig.piggybank.storage.MultiStorage.createOutputStream(MultiStorage.java:205)
>     at org.apache.pig.piggybank.storage.MultiStorage.getStore(MultiStorage.java:225)
>     at org.apache.pig.piggybank.storage.MultiStorage.putNext(MultiStorage.java:246)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:127)
>     ... 20 more
>
> ================================================================================
>
> On Mon, Oct 10, 2011 at 8:36 PM, Norbert Burger <[email protected]> wrote:
>
>> In case it's not obvious, you'd also need a FLATTEN(group) in there before
>> the FOREACH to break the tuple apart so that the fields could be
>> synthesized into a filename.
>>
>> Norbert
>>
>> On Mon, Oct 10, 2011 at 12:57 PM, Jacob Perkins <[email protected]> wrote:
>>
>> > You'll have to run a FOREACH...GENERATE over the data first and generate
>> > a single key to look like the filename you want. Then you can use
>> > MultiStorage() from the piggybank. See:
>> >
>> > org.apache.pig.piggybank.storage.MultiStorage
>> >
>> > in the pig api docs.
>> >
>> > --jacob
>> > @thedatachef
>> >
>> > On Mon, 2011-10-10 at 18:43 +0200, Dustin Whitney wrote:
>> > > Hello all,
>> > >
>> > > I'm new to Hadoop and Pig, and I've got a question. I've got a relation
>> > > that looks like this via GROUP
>> > >
>> > > ((customer1,2011-10-07,GET,200),{....})
>> > > ((customer1,2011-10-07,PUT,201),{....})
>> > > ((customer1,2011-10-07,PUT,202),{....})
>> > > ((customer2,2011-10-07,GET,200),{....})
>> > > ((customer2,2011-10-07,PUT,201),{....})
>> > > ((customer2,2011-10-07,PUT,202),{....})
>> > >
>> > > I'd like each group (i.e. the data in the {...}) stored separately, and
>> > > I'd like to use the values in the first tuple to name my file, so the
>> > > first file would be customer1-2011-10-07-GET-200, and the second would
>> > > be customer1-2011-10-07-PUT-201, etc. Is this possible? I can only see
>> > > how to save a single full relation to file, and I can't find any
>> > > documentation that states how I might use variables to name things.
>> > >
>> > > Thanks,
>> > > Dustin
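
For anyone finding this thread later, the approach suggested in the quoted replies (flatten the group key, build a single key field, then hand it to MultiStorage) might look roughly like the sketch below. It is untested and only illustrative: the input path, the relation names (logs, grouped, flat, keyed), the field names, and the all-chararray schema are assumptions that do not come from the thread. Only the STORE statement mirrors the one quoted above. It also assumes CONCAT accepts two arguments at a time, hence the nesting, and it does not include the workaround from the linked issue.

```pig
-- Hypothetical jar location, input path, relation names, and schema.
REGISTER piggybank.jar;

logs = LOAD '/mnt/input' USING PigStorage('\t')
       AS (customer:chararray, day:chararray, method:chararray,
           status:chararray, line:chararray);

grouped = GROUP logs BY (customer, day, method, status);

-- FLATTEN(group) breaks the group tuple back into separate fields;
-- FLATTEN(logs.line) emits one output row per record in each bag.
flat = FOREACH grouped GENERATE
           FLATTEN(group) AS (customer, day, method, status),
           FLATTEN(logs.line) AS line;

-- Build one key per row, shaped like customer1-2011-10-07-GET-200.
keyed = FOREACH flat GENERATE
            CONCAT(customer, CONCAT('-', CONCAT(day, CONCAT('-',
            CONCAT(method, CONCAT('-', status)))))) AS key,
            line;

-- '0' is the index of the field to split on (the key built above).
STORE keyed INTO '/mnt/output' USING
    org.apache.pig.piggybank.storage.MultiStorage('/mnt/output', '0', 'gz', '\\t');
```

If the grouping is only there to name the output, the same key could presumably be built directly from the ungrouped records, skipping the GROUP and FLATTEN steps.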
