Re: Problem with Pig Store command

hc busy Tue, 21 Sep 2010 15:22:00 -0700

I'm not sure then. Maybe ask other people for suggestions.

The fact that the output path is not absolute seems suspicious. Also try
using ',' instead of a space as the delimiter. Did you try


store W into '/tmp/wordtesting' using PigStorage(',');

and see if that does the trick?

Err, let's see... maybe you're looking at the wrong Hadoop cluster? In the
same grunt session where you ran the store above, did you try

ls /tmp/wordtesting

and see if that lists anything? If so, your hadoop command-line client and
Pig are pointing to different Hadoop clusters.
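
One way to run that check end to end in the same grunt session is sketched
below. It assumes the store above wrote to /tmp/wordtesting with ',' as the
delimiter. The aliases S, G and C are made up for illustration, and the count
step just reuses the group/COUNT pattern quoted further down in this thread.

```pig
-- List what the store actually wrote (part files live inside the directory).
ls /tmp/wordtesting

-- Read the stored output back with the same delimiter and schema.
S = load '/tmp/wordtesting' using PigStorage(',') as (f1:int, f2:int,
    name:chararray, type:chararray);

-- Count the records that come back.
G = group S all;
C = foreach G generate COUNT(S);
dump C;
```

If dump C prints a non-zero count, the store did write data on the cluster
that Pig is talking to.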


imo.

On Tue, Sep 21, 2010 at 2:53 PM, Alex Wang <wanga...@gmail.com> wrote:

> Hi hc,
>
> Sorry that I didn't mention it. But load works ok. Here is a portion of the
> output of dump W
>
> (2162,4111,yellow,a)
> (4652,1317,yep,interjection)
> (157,60592,yes,interjection)
> (533,19459,yesterday,adv)
> (265,35058,yet,adv)
> (4040,1626,yield,n)
> (3339,2139,yield,v)
>
> Only the store command is not working...
>
> Alex
>
>
> On Tue, Sep 21, 2010 at 2:48 PM, hc busy <hc.b...@gmail.com> wrote:
>
> > Probably because load failed.
> >
> > W = load 'wordbag' using PigStorage(' ') as (f1:int, f2:int,
> > name:chararray,
> > type:chararray);
> > T = group W all;
> > U = foreach T generate COUNT(W);
> > dump U;
> >
> > will probably say that the wordbag contained nothing. Debug the loading
> > portion to fix this problem.
> >
> >
> >
> >
> > On Tue, Sep 21, 2010 at 1:50 PM, Alex Wang <wanga...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > >
> > >
> > > I am using pig 0.7.0 in hadoop mapreduce mode.
> > >
> > >
> > >
> > > The problem I have is that I simply can't use
> > >
> > >
> > >
> > > STORE alias INTO 'directory' USING PigStorage();
> > >
> > >
> > >
> > > I can load a dataset and write UDFs to manipulate it, but I can't
> > > store it. The output is a directory in HDFS with 0 bytes.
> > >
> > >
> > >
> > > As an example, I've been testing with a simple script:
> > >
> > >
> > >
> > > W = load 'wordbag' using PigStorage(' ') as (f1:int, f2:int,
> > > name:chararray,
> > > type:chararray);
> > >
> > > store W into 'wordtesting' using PigStorage(' ');
> > >
> > >
> > >
> > > I run the code in grunt, and the output of hadoop fs -ls is:
> > >
> > >
> > >
> > > drwxr-xr-x   - awang supergroup          0 2010-09-21 13:45
> > > /user/awang/wordtesting
> > >
> > >
> > >
> > > The grunt messages are:
> > >
> > >
> > >
> > > grunt> store filteredW into 'wordtesting' using PigStorage(' ');
> > >
> > > 2010-09-21 13:45:35,210 [main] INFO
> > > org.apache.pig.impl.logicalLayer.optimizer.PruneColumns
> > > - No column pruned for W
> > >
> > > 2010-09-21 13:45:35,210 [main] INFO
> > > org.apache.pig.impl.logicalLayer.optimizer.PruneColumns
> > > - No map keys pruned for W
> > >
> > > 2010-09-21 13:45:35,440 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
> > > - (Name: Store(hdfs://pineal:9000/user/awang/wordtesting:PigStorage(' ')) - 1-46 Operator Key: 1-46)
> > >
> > > 2010-09-21 13:45:35,498 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
> > > - MR plan size before optimization: 1
> > >
> > > 2010-09-21 13:45:35,498 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
> > > - MR plan size after optimization: 1
> > >
> > > 2010-09-21 13:45:35,549 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> > > - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> > >
> > > 2010-09-21 13:45:38,100 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> > > - Setting up single store job
> > >
> > > 2010-09-21 13:45:38,166 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - 1 map-reduce job(s) waiting for submission.
> > >
> > > 2010-09-21 13:45:38,173 [Thread-15] WARN
> > > org.apache.hadoop.mapred.JobClient
> > > - Use GenericOptionsParser for parsing the arguments. Applications should
> > > implement Tool for the same.
> > >
> > > 2010-09-21 13:45:38,307 [Thread-15] INFO
> > > org.apache.hadoop.mapreduce.lib.input.FileInputFormat
> > > - Total input paths to process : 1
> > >
> > > 2010-09-21 13:45:38,307 [Thread-15] INFO
> > > org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil
> > > - Total input paths to process : 1
> > >
> > > 2010-09-21 13:45:38,670 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - HadoopJobId: job_201009211320_0002
> > >
> > > 2010-09-21 13:45:38,670 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - More information at:
> > > http://pineal:50030/jobdetails.jsp?jobid=job_201009211320_0002
> > >
> > > 2010-09-21 13:45:38,673 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - 0% complete
> > >
> > > 2010-09-21 13:45:48,755 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - 50% complete
> > >
> > > 2010-09-21 13:45:53,835 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - 100% complete
> > >
> > > 2010-09-21 13:45:53,835 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - Successfully stored result in:
> > > "hdfs://pineal:9000/user/awang/wordtesting"
> > >
> > > 2010-09-21 13:45:53,846 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - Records written : 1
> > >
> > > 2010-09-21 13:45:53,846 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - Bytes written : 20
> > >
> > > 2010-09-21 13:45:53,846 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - Spillable Memory Manager spill count : 0
> > >
> > > 2010-09-21 13:45:53,847 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - Proactive spill count : 0
> > >
> > > 2010-09-21 13:45:53,847 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - Success!
> > >
> > >
> > >
> > >
> > >
> > > I've been struggling with this for a long time. It works if I have a
> > > single bytearray in my tuple, but once I define my schema, it no longer
> > > works.
> > >
> > >
> > >
> > > Does anyone have any idea? Please help!! Thanks!
> > >
> > >
> > >
> > > Best regards,
> > >
> > > Alex
> > >
> >
>
