Re: Problem with Pig Store command

hc busy Tue, 21 Sep 2010 14:49:26 -0700

Probably because the load failed.

W = load 'wordbag' using PigStorage(' ') as (f1:int, f2:int, name:chararray,
type:chararray);
T = group W all;
U = foreach T generate COUNT(W);
dump U;


Running this will probably show that wordbag contained nothing. Debug the
load step to fix this problem.
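
One way to debug the load, as a rough sketch only. It assumes the same
'wordbag' path and space delimiter as above; the aliases raw, few, bad and
few_bad are made up for illustration. The idea is to look at the data
before any schema is applied, then check whether the int fields survive
the cast (as far as I know, a field that cannot be cast comes back null).

```pig
-- Assumption: 'wordbag' is the same input path used in the script above.
-- Load with no schema so no casts are applied, then inspect a few raw rows.
raw = load 'wordbag' using PigStorage(' ');
few = limit raw 5;
dump few;

-- Reload with the schema and look for rows whose int fields came back null,
-- which would suggest a delimiter or type mismatch in the input.
W = load 'wordbag' using PigStorage(' ') as (f1:int, f2:int, name:chararray, type:chararray);
bad = filter W by f1 is null or f2 is null;
few_bad = limit bad 5;
dump few_bad;
```

If the first dump shows each line as a single field, the delimiter passed to
PigStorage probably does not match the file.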




On Tue, Sep 21, 2010 at 1:50 PM, Alex Wang <wanga...@gmail.com> wrote:

> Hi,
>
>
>
> I am using pig 0.7.0 in hadoop mapreduce mode.
>
>
>
> The problem I have is that I simply can't use
>
>
>
> STORE alias INTO 'directory' USING PigStorage();
>
>
>
> I can load the dataset and write UDFs to manipulate it, but I can't
> store it. The output is a directory in HDFS with 0 bytes.
>
>
>
> As an example, I've been testing with a simple script:
>
>
>
> W = load 'wordbag' using PigStorage(' ') as (f1:int, f2:int,
> name:chararray,
> type:chararray);
>
> store W into 'wordtesting' using PigStorage(' ');
>
>
>
> I run the code in grunt, and the output of hadoop fs -ls is:
>
>
>
> drwxr-xr-x   - awang supergroup          0 2010-09-21 13:45
> /user/awang/wordtesting
>
>
>
> The grunt messages are:
>
>
>
> grunt> store filteredW into 'wordtesting' using PigStorage(' ');
>
> 2010-09-21 13:45:35,210 [main] INFO
> org.apache.pig.impl.logicalLayer.optimizer.PruneColumns
> - No column pruned for W
>
> 2010-09-21 13:45:35,210 [main] INFO
> org.apache.pig.impl.logicalLayer.optimizer.PruneColumns
> - No map keys pruned for W
>
> 2010-09-21 13:45:35,440 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine
> - (Name: Store(hdfs://pineal:9000/user/awang/wordtesting:PigStorage(' ')) -
> 1-46 Operator Key: 1-46)
>
> 2010-09-21 13:45:35,498 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
> - MR plan size before optimization: 1
>
> 2010-09-21 13:45:35,498 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
> - MR plan size after optimization: 1
>
> 2010-09-21 13:45:35,549 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
>
> 2010-09-21 13:45:38,100 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - Setting up single store job
>
> 2010-09-21 13:45:38,166 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 1 map-reduce job(s) waiting for submission.
>
> 2010-09-21 13:45:38,173 [Thread-15] WARN
>  org.apache.hadoop.mapred.JobClient
> - Use GenericOptionsParser for parsing the arguments. Applications should
> implement Tool for the same.
>
> 2010-09-21 13:45:38,307 [Thread-15] INFO
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat
> - Total input paths to process : 1
>
> 2010-09-21 13:45:38,307 [Thread-15] INFO
> org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil
> - Total input paths to process : 1
>
> 2010-09-21 13:45:38,670 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - HadoopJobId: job_201009211320_0002
>
> 2010-09-21 13:45:38,670 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - More information at:
> http://pineal:50030/jobdetails.jsp?jobid=job_201009211320_0002
>
> 2010-09-21 13:45:38,673 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 0% complete
>
> 2010-09-21 13:45:48,755 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 50% complete
>
> 2010-09-21 13:45:53,835 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 100% complete
>
> 2010-09-21 13:45:53,835 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Successfully stored result in:
> "hdfs://pineal:9000/user/awang/wordtesting"
>
> 2010-09-21 13:45:53,846 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Records written : 1
>
> 2010-09-21 13:45:53,846 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Bytes written : 20
>
> 2010-09-21 13:45:53,846 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Spillable Memory Manager spill count : 0
>
> 2010-09-21 13:45:53,847 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Proactive spill count : 0
>
> 2010-09-21 13:45:53,847 [main] INFO
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Success!
>
>
>
>
>
> I've been struggling with this for a long time. It works if I have a single
> bytearray in my tuple, but once I define my schema, it no longer works.
>
>
>
> Does anyone have any idea? Please help! Thanks!
>
>
>
> Best regards,
>
> Alex
>
