Great, I'll submit a patch later in the day.

Best,

stan

On Tue, Jan 24, 2012 at 11:29 AM, Bill Graham <[email protected]> wrote:
> Oops, I meant to address Stan in my last email. :)
>
> ---------- Forwarded message ----------
> From: Bill Graham <[email protected]>
> Date: Tue, Jan 24, 2012 at 8:28 AM
> Subject: Re: Multiple files with AvroStorage and comma separated lists
> To: [email protected]
>
> Hi Philipp,
>
> This is in fact a bug, so if you wouldn't mind submitting the patch, that
> would be great.
>
> thanks,
> Bill
>
> On Tue, Jan 24, 2012 at 8:22 AM, Stan Rosenberg <[email protected]> wrote:
>
>> Philipp,
>>
>> I would say that it is a bug. I ran into the same problem some time
>> ago. Essentially, AvroStorage does not recognize globs and does not
>> recognize commas, both of which are supported by Hadoop's
>> FileInputFormat. I ended up patching AvroStorage to make it compatible
>> with Hadoop's semantics of input paths. I haven't submitted a patch
>> though. If there is some interest, I'd be more than glad to submit it.
>>
>> Best,
>>
>> stan
>>
>> On Tue, Jan 24, 2012 at 4:26 AM, Philipp <[email protected]> wrote:
>> > Dear Pig users,
>> >
>> > I tried to load several files with AvroStorage by using a comma separated
>> > list. The statement I used is:
>> >
>> > test_data = LOAD 'repo_1/part-r-00000.avro,repo_2/part-r-00000.avro' USING
>> > org.apache.pig.piggybank.storage.avro.AvroStorage();
>> >
>> > Pig states that no input paths were specified in job. Please see the
>> > stacktrace below.
>> > I tried Pig version 0.8.1-cdh3u2 and 0.9.1.
>> >
>> > Does anyone observe the same behavior? Is it a bug or a feature?
>> >
>> > Thanks, Philipp
>> >
>> > /Stacktrace:/
>> >
>> > org.apache.pig.backend.executionengine.ExecException: ERROR 2118: No input paths specified in job
>> >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:282)
>> >     at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
>> >     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
>> >     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
>> >     at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
>> >     at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>> >     at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>> >     at java.lang.Thread.run(Thread.java:679)
>> > Caused by: java.io.IOException: No input paths specified in job
>> >     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:186)
>> >     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
>> >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:270)
>> >     ... 7 more
>
> --
> *Note that I'm no longer using my Yahoo! email address. Please email me at
> [email protected] going forward.*

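For readers following the thread: the input-path semantics Stan mentions (comma-separated lists, plus globs that may themselves contain commas inside curly braces) can be illustrated with a small standalone sketch. This is not Stan's patch and not code from Pig or Hadoop; the class name `InputPathSplitter` and the method `splitPaths` are hypothetical and exist only for illustration. It assumes a comma separates paths only when it is outside a `{...}` glob alternation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not part of Pig or Hadoop).
public class InputPathSplitter {

    /**
     * Splits a comma-separated location string into individual path strings.
     * Commas inside curly braces are kept, because there they belong to a
     * glob alternation such as {a,b} and do not separate paths.
     */
    public static List<String> splitPaths(String location) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int braceDepth = 0;
        for (char c : location.toCharArray()) {
            if (c == '{') {
                braceDepth++;
            } else if (c == '}' && braceDepth > 0) {
                braceDepth--;
            }
            if (c == ',' && braceDepth == 0) {
                // A top-level comma ends the current path.
                addIfNotEmpty(paths, current);
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        addIfNotEmpty(paths, current);
        return paths;
    }

    // Trims the buffered text and records it unless it is empty.
    private static void addIfNotEmpty(List<String> paths, StringBuilder sb) {
        String path = sb.toString().trim();
        if (!path.isEmpty()) {
            paths.add(path);
        }
    }

    public static void main(String[] args) {
        // The location string from Philipp's LOAD statement.
        String location = "repo_1/part-r-00000.avro,repo_2/part-r-00000.avro";
        for (String path : splitPaths(location)) {
            System.out.println(path);
        }
    }
}
```

In a real loader, each resulting string would presumably be handed to Hadoop as an input path, or the unsplit string passed to `FileInputFormat.setInputPaths(Job, String)`, which accepts a comma-separated list; glob expansion itself is left to Hadoop in this sketch.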