Raghu,

I changed the code to what you suggested, but I got an exception when I try
to store:

java.io.IOException: File already exists: file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-a/part-r-00000
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:228)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:335)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:368)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:484)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:465)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:372)
at com.bluekai.analytics.pig.storage.BkMultiStorage$MultiStorageOutputFormat$1.createOutputStream(BkMultiStorage.java:325)
at com.bluekai.analytics.pig.storage.BkMultiStorage$MultiStorageOutputFormat$1.getStore(BkMultiStorage.java:304)
at com.bluekai.analytics.pig.storage.BkMultiStorage$MultiStorageOutputFormat$1.getStore(BkMultiStorage.java:298)
at com.bluekai.analytics.pig.storage.BkMultiStorage$MultiStorageOutputFormat$1.write(BkMultiStorage.java:285)
at com.bluekai.analytics.pig.storage.BkMultiStorage$MultiStorageOutputFormat$1.write(BkMultiStorage.java:261)
at com.bluekai.analytics.pig.storage.BkMultiStorage.putNext(BkMultiStorage.java:184)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:508)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:395)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:381)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:250)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)

where prefix-a is dynamically generated based on my tuple.

final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-0/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-1/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-2/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-3/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-4/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-5/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-6/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-7/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-8/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-9/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-A/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-B/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-C/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-D/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-E/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-F/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-G/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-H/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-I/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-J/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-K/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-L/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-M/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-N/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-O/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-P/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-Q/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-R/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-S/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-T/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-U/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-V/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-W/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-X/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-Y/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-Z/part-r-00000
final output stores at file:/user/bbuda/ids/_temporary/_attempt_local_0001_r_000000_0/prefix-a/part-r-00000

I am wondering if it is because the local filesystem path is case-insensitive,
so creating prefix-a collides with the prefix-A directory that already exists?
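
For what it's worth, here is a quick way I could check that hypothesis with plain java.io, no Hadoop involved (the probe file names are made up for the test):

```java
import java.io.File;
import java.io.IOException;

// Probe whether a directory lives on a case-insensitive filesystem
// (e.g. a default HFS+ volume on Mac OS X), which would make a new
// prefix-a directory collide with an already-created prefix-A.
public class CaseSensitivityCheck {

    public static boolean isCaseInsensitive(File parent) throws IOException {
        File probe = new File(parent, "case-probe-a");
        File twin = new File(parent, "CASE-PROBE-A");
        probe.delete();
        if (!probe.createNewFile()) {
            throw new IOException("could not create probe file in " + parent);
        }
        // On a case-insensitive filesystem the differently-cased twin
        // resolves to the very same file we just created.
        boolean insensitive = twin.exists();
        probe.delete();
        return insensitive;
    }

    public static void main(String[] args) throws IOException {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        System.out.println("case-insensitive: " + isCaseInsensitive(tmp));
    }
}
```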



Thanks,

Felix
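
P.S. In case it helps to see what I mean by the substitution: the {0}/{1} expansion my storer performs is essentially the following (a hypothetical helper for illustration, not the actual BkMultiStorage code):

```java
import java.text.MessageFormat;

// Sketch of the per-tuple path substitution: a template such as
// "ns_{0}/site_{1}" is filled in from tuple fields, then resolved
// under the base output directory that setStoreLocation registered.
public class PathTemplate {

    public static String resolve(String baseDir, String template, Object... fields) {
        return baseDir + "/" + MessageFormat.format(template, fields);
    }

    public static void main(String[] args) {
        System.out.println(resolve("/Users/felix/Documents/pig/multi_store_output",
                                   "ns_{0}/site_{1}", "news", "42"));
        // -> /Users/felix/Documents/pig/multi_store_output/ns_news/site_42
    }
}
```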



On Fri, Nov 4, 2011 at 3:31 PM, Raghu Angadi <[email protected]> wrote:

> You need to set the output path to
> '/Users/felix/Documents/pig/multi_store_output' in your setStoreLocation().
> Alternatively, for clarity, you could modify your store UDF to be more like:
> store load_log INTO '/Users/felix/Documents/pig/multi_store_output' using
> MyMultiStorage('ns_{0}/site_{1}', '2,1', '1,2');
>
> The reason FileOutputFormat needs a real path is that, at run time, Hadoop
> actually uses a temporary path and then moves the output to the correct
> path if the job succeeds.
>
> Raghu.
>
> On Thu, Nov 3, 2011 at 9:45 AM, Dmitriy Ryaboy <[email protected]> wrote:
>
> > Don't use FileOutputFormat? Or rather, use something that extends it and
> > overrides the validation.
> >
> > On Wed, Nov 2, 2011 at 3:19 PM, felix gao <[email protected]> wrote:
> >
> > > If you don't call that function, Hadoop is going to throw an exception
> > > for not having output set for the job, something like:
> > > Caused by: org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.
> > > at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:120)
> > > at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:87)
> > >
> > > So I have to set it and then somehow delete it after Pig completes.
> > >
> > >
> > >
> > >
> > > On Wed, Nov 2, 2011 at 3:00 PM, Ashutosh Chauhan <[email protected]
> > > >wrote:
> > >
> > > > Looks like I am missing something here.
> > > >
> > > > Ashutosh
> > > > On Wed, Nov 2, 2011 at 14:10, felix gao <[email protected]> wrote:
> > > >
> > > > > Ashutosh,
> > > > >
> > > > > My problem is I don't want to use that location at all, since I am
> > > > > constructing the output location based on the tuple input. The location
> > > > > is just a dummy placeholder for me to substitute the right parameters.
> > > > >
> > > > > Felix
> > > > >
> > > > > On Wed, Nov 2, 2011 at 10:47 AM, Ashutosh Chauhan <[email protected]> wrote:
> > > > >
> > > > > > Hey Felix,
> > > > > >
> > > > > > >> The only problem is that in the setStoreLocation function we
> > > > > > >> have to call FileOutputFormat.setOutputPath(job, new Path(location));
> > > > > >
> > > > > > Can't you massage location into the appropriate string you want?
> > > > > >
> > > > > > Ashutosh
> > > > > >
> > > > > > On Tue, Nov 1, 2011 at 18:07, felix gao <[email protected]> wrote:
> > > > > >
> > > > > > > I have written a custom store function that is primarily based on
> > > > > > > the MultiStorage store function. The way I use it is:
> > > > > > >
> > > > > > >
> > > > > > > store load_log INTO
> > > > > > > '/Users/felix/Documents/pig/multi_store_output/ns_{0}/site_{1}'
> > > > > > > using MyMultiStorage('2,1', '1,2');
> > > > > > > where {0} and {1} will be substituted with the tuple fields at
> > > > > > > index 0 and index 1. Everything is fine and all the data is written
> > > > > > > to the correct place. The only problem is that in the
> > > > > > > setStoreLocation function we have to call
> > > > > > > FileOutputFormat.setOutputPath(job, new Path(location)); I have
> > > > > > > '/Users/felix/Documents/pig/multi_store_output/ns_{0}/site_{1}' as
> > > > > > > my output location, so there is actually a folder created in my fs
> > > > > > > with ns_{0} and site_{1}. Is there a way to tell Hadoop not to
> > > > > > > create those output directories?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Felix
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
