Hi All,

I am running the following test case with mr1+hdfs2. The MapReduce job 
succeeds, but no output data file "part-m-00000" is generated. Below are the 
details of the test case and what I have found so far. I would like to trace 
this issue, so please give me your suggestions, such as which classes or 
functions I should pay attention to while debugging. Thanks~


cat $PIG_HOME/bin/test/student
lynn,28,3
ff,22,4
chen,27,5
John,20,4
Mary,25,4
Bill,30,5
Joe,40,4

Start the Pig grunt shell with the command "$PIG_HOME/bin/pig":
grunt> copyFromLocal $PIG_HOME/pig/bin/test/student /user/pig/student
grunt> A = load 'student' using PigStorage(',') as (name:chararray, age:int, gpa:float);
grunt> B = foreach A generate name;
grunt> store B into 'result';


The correct output folder "result" on HDFS should look like the following:

hadoop fs -ls /user/pig/result
Found 3 items
-rw-r--r--   2 pig pig          0 2013-07-30 00:52 /user/pig/result/_SUCCESS
drwxr-xr-x   - pig pig          0 2013-07-30 00:52 /user/pig/result/_logs
-rw-r--r--   2 pig pig         23 2013-07-30 00:52 /user/pig/result/part-m-00000

But in this test case, no output data (part-m-00000) is stored on HDFS:
grunt> fs -ls /user/pig/result
Found 2 items
-rw-r--r--   1 pig pig          0 2013-07-30 01:37 /user/pig/result/_SUCCESS
drwx------   - pig pig          0 2013-07-30 01:37 /user/pig/result/_logs



While the test case is running, I can see that the output data is generated 
on HDFS at 
"/user/pig/result/_temporary/_attempt_201308010000_0008_m_000000_0/part-m-00000".
This "_temporary" directory is deleted at the end of the job. But the file 
"part-m-00000" is never saved as "/user/biadmin/tmpuser0/part-m-00000" on HDFS 
via the rename command.
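
To make the sequence I expect more concrete, here is a minimal sketch of it on 
the local filesystem. It is only an analogy written in plain sh, not Hadoop's 
actual code: the scratch directory, the "moved" counter and the three sample 
names are assumptions for illustration, and "mv" stands in for the HDFS rename.

```shell
#!/bin/sh
# Local-filesystem analogy of the expected task commit; NOT Hadoop's actual code.
set -eu

work=$(mktemp -d)                      # scratch directory standing in for /user/pig
out="$work/result"                     # job output directory
attempt="$out/_temporary/_attempt_201308010000_0008_m_000000_0"

# 1. The map task writes its output under the attempt directory.
mkdir -p "$attempt"
printf 'lynn\nff\nchen\n' > "$attempt/part-m-00000"

# 2. Task commit: every file in the attempt directory is moved (renamed)
#    into the output directory.
moved=0
for f in "$attempt"/*; do
    [ -e "$f" ] || continue            # skip the unexpanded glob if the directory is empty
    mv "$f" "$out/"
    moved=$((moved + 1))
done

# 3. Job cleanup: the temporary tree is removed and the marker file is created.
rm -rf "$out/_temporary"
: > "$out/_SUCCESS"

echo "moved $moved file(s)"
ls "$out"
```

In my failing run, step 2 seems to be skipped or to fail without any error, 
so step 3 deletes the only copy of the data and only "_SUCCESS" and "_logs" 
are left.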
