Re: Store Groups Separately

Dustin Whitney Mon, 10 Oct 2011 12:49:09 -0700

Ok, yes, I found a Jira ticket that says there was a bug for what I'm
describing, with a workaround:
http://search-hadoop.com/m/N2nF12WQ02y/+%2522log+file%2522&subj=dynamically+calling+STORE


Thanks for your help!

On Mon, Oct 10, 2011 at 9:06 PM, Dustin Whitney
<[email protected]>wrote:

> Thanks for your help. I'm using Elastic Map Reduce, so Pig 0.6, and
> running:
>
> STORE FILES INTO '/mnt/output' USING
> org.apache.pig.piggybank.storage.MultiStorage('/mnt/output','0', 'gz',
> '\\t');
>
> And getting an error (stack trace below) that it can't create a directory.
> I see that it's creating a file called /mnt/output, but not a directory. Is
> this perhaps a bug in the version of Pig running on Elastic Map Reduce?
>
> Pig Stack Trace
> ---------------
> ERROR 2135: Received error from store function.Mkdirs failed to create
> /mnt/output/tmcustomer-2011-10-07-GET-200
>
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to
> store alias 699
>         at
> org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1004)
>         at org.apache.pig.PigServer.registerQuery(PigServer.java:386)
>         at
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:739)
>         at
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
>         at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>         at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
>         at org.apache.pig.Main.main(Main.java:374)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR
> 2135: Received error from store function.Mkdirs failed to create
> /mnt/output/tmcustomer-2011-10-07-GET-200
>         at
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:140)
>         at
> org.apache.pig.backend.local.executionengine.LocalPigLauncher.runPipeline(LocalPigLauncher.java:149)
>         at
> org.apache.pig.backend.local.executionengine.LocalPigLauncher.launchPig(LocalPigLauncher.java:110)
>         at
> org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:165)
>         at
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:781)
>         at org.apache.pig.PigServer.execute(PigServer.java:774)
>         at org.apache.pig.PigServer.access$100(PigServer.java:90)
>         at org.apache.pig.PigServer$Graph.execute(PigServer.java:952)
>         at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:999)
>         ... 12 more
> Caused by: java.io.IOException: Mkdirs failed to create
> /mnt/output/tmcustomer-2011-10-07-GET-200
>         at
> org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:367)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:524)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:505)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:412)
>         at
> org.apache.pig.piggybank.storage.MultiStorage.createOutputStream(MultiStorage.java:205)
>         at
> org.apache.pig.piggybank.storage.MultiStorage.getStore(MultiStorage.java:225)
>         at
> org.apache.pig.piggybank.storage.MultiStorage.putNext(MultiStorage.java:246)
>         at
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:127)
>         ... 20 more
>
> ================================================================================
>
>
> On Mon, Oct 10, 2011 at 8:36 PM, Norbert Burger 
> <[email protected]>wrote:
>
>> In case it's not obvious, you'd also need a FLATTEN(group) in there before
>> the FOREACH to break the tuple apart so that the fields could be
>> synthesized into a filename.
>>
>> Norbert
>>
>> On Mon, Oct 10, 2011 at 12:57 PM, Jacob Perkins
>> <[email protected]>wrote:
>>
>> > You'll have to run a FOREACH...GENERATE over the data first and generate
>> > a single key to look like the filename you want. Then you can use
>> > MultiStorage() from the piggybank. See:
>> >
>> > org.apache.pig.piggybank.storage.MultiStorage
>> >
>> > in the pig api docs.
>> >
>> > --jacob
>> > @thedatachef
>> >
>> > On Mon, 2011-10-10 at 18:43 +0200, Dustin Whitney wrote:
>> > > Hello all,
>> > >
>> > > I'm new to Hadoop and Pig, and I've got a question.  I've got a
>> > > relation that looks like this via GROUP
>> > >
>> > > ((customer1,2011-10-07,GET,200),{....})
>> > > ((customer1,2011-10-07,PUT,201),{....})
>> > > ((customer1,2011-10-07,PUT,202),{....})
>> > > ((customer2,2011-10-07,GET,200),{....})
>> > > ((customer2,2011-10-07,PUT,201),{....})
>> > > ((customer2,2011-10-07,PUT,202),{....})
>> > >
>> > >
>> > > I'd like each group (i.e. the data in the {...}) stored separately, and
>> > > I'd like to use the values in the first tuple to name my file, so the
>> > > first file would be customer1-2011-10-07-GET-200, and the second would
>> > > be customer1-2011-10-07-PUT-201, etc.  Is this possible? I can only see
>> > > how to save a single full relation to file, and I can't find any
>> > > documentation that states how I might use variables to name things.
>> > >
>> > > Thanks,
>> > > Dustin
>> >
>> >
>> >
>>
>
>
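
The approach suggested in this thread (flatten the group key, build a single
key that looks like the desired filename, then store with MultiStorage) could
be sketched roughly as below. This is an untested sketch, not the script from
the thread: the input path, the relation names, the field names, and the
location of piggybank.jar are all assumptions made for illustration. Only the
MultiStorage arguments are taken from the message above. It does not address
the "Mkdirs failed" error reported on Elastic Map Reduce.

```pig
-- Assumed jar location; adjust to wherever piggybank.jar actually lives.
REGISTER piggybank.jar;

-- Hypothetical input and schema, for illustration only.
logs = LOAD '/mnt/input' USING PigStorage('\t')
       AS (customer:chararray, log_date:chararray, verb:chararray,
           status:chararray, payload:chararray);

grouped = GROUP logs BY (customer, log_date, verb, status);

-- FLATTEN(group) breaks the key tuple into four fields ($0..$3);
-- FLATTEN(logs.payload) emits one row per record in the bag ($4).
flat = FOREACH grouped GENERATE FLATTEN(group), FLATTEN(logs.payload);

-- Build a single key shaped like the desired filename,
-- e.g. customer1-2011-10-07-GET-200. CONCAT is nested because it is
-- used here in its two-argument form.
keyed = FOREACH flat GENERATE
          CONCAT($0, CONCAT('-', CONCAT($1, CONCAT('-',
            CONCAT($2, CONCAT('-', $3)))))) AS filekey,
          $4 AS payload;

-- '0' is the index of the field used to split the output (filekey).
STORE keyed INTO '/mnt/output' USING
    org.apache.pig.piggybank.storage.MultiStorage('/mnt/output', '0', 'gz',
    '\\t');
```

Since MultiStorage splits the output on the key field itself, the GROUP step
is kept here only to match the relation described in the original question.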
