Do you observe the same thing when running without Hadoop (cat, map, sort, and then reduce)?
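For example, something along these lines on a small local sample -- mapper.py and reducer.py here stand in for your actual scripts (I don't know their real names, and they need to be executable, e.g. chmod +x), and input.txt is a local copy of your input:

    cat input.txt | ./mapper.py | sort | ./reducer.py

If that local pipe also produces nothing, the problem is in the scripts themselves rather than in the job configuration.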
Could you provide the counters of your job? You should be able to get them through the JobTracker web interface. Without more information, the most probable explanation is that your reducer does not output any <key, value> pairs (a minimal, illustrative mapper/reducer sketch follows at the end of this message).

Regards,
Bertrand

On Thu, Aug 30, 2012 at 5:52 AM, Periya.Data <periya.d...@gmail.com> wrote:
> Hi All,
> My Hadoop streaming job (in Python) runs to "completion" (both map and
> reduce say 100% complete). But when I look at the output directory in
> HDFS, the part files are empty. I do not know what might be causing this
> behavior. I understand that the percentages represent the records that
> have been read in (not processed).
>
> The following are some of the logs. The detailed logs from Cloudera
> Manager say that there were no map outputs... which is interesting. Any
> suggestions?
>
> 12/08/30 03:27:14 INFO streaming.StreamJob: To kill this job, run:
> 12/08/30 03:27:14 INFO streaming.StreamJob: /usr/lib/hadoop-0.20/bin/hadoop
>   job -Dmapred.job.tracker=xxxxx.yyy.com:8021 -kill job_201208232245_3182
> 12/08/30 03:27:14 INFO streaming.StreamJob: Tracking URL:
>   http://xxxxxx.yyyy.com:60030/jobdetails.jsp?jobid=job_201208232245_3182
> 12/08/30 03:27:15 INFO streaming.StreamJob: map 0% reduce 0%
> 12/08/30 03:27:20 INFO streaming.StreamJob: map 33% reduce 0%
> 12/08/30 03:27:23 INFO streaming.StreamJob: map 67% reduce 0%
> 12/08/30 03:27:29 INFO streaming.StreamJob: map 100% reduce 0%
> 12/08/30 03:27:33 INFO streaming.StreamJob: map 100% reduce 100%
> 12/08/30 03:27:35 INFO streaming.StreamJob: Job complete: job_201208232245_3182
> 12/08/30 03:27:35 INFO streaming.StreamJob: Output: /user/GHU
> Thu Aug 30 03:27:24 GMT 2012
> *** END
>
> bash-3.2$ hadoop fs -ls /user/ghu/
> Found 5 items
> -rw-r--r--  3 ghu hadoop  0 2012-08-30 03:27 /user/GHU/_SUCCESS
> drwxrwxrwx  - ghu hadoop  0 2012-08-30 03:27 /user/GHU/_logs
> -rw-r--r--  3 ghu hadoop  0 2012-08-30 03:27 /user/GHU/part-00000
> -rw-r--r--  3 ghu hadoop  0 2012-08-30 03:27 /user/GHU/part-00001
> -rw-r--r--  3 ghu hadoop  0 2012-08-30 03:27 /user/GHU/part-00002
> bash-3.2$
>
> ----------------------------------------------------------------------
>
> Metadata
>   Status: Succeeded
>   Type: MapReduce
>   Id: job_201208232245_3182
>   Name: CaidMatch
>   User: srisrini
>   Mapper class: PipeMapper
>   Reducer class:
>   Scheduler pool name: default
>   Job input directory: hdfs://xxxxx.yyy.txt,hdfs://xxxx.yyyy.com/user/GHUcaidlist.txt
>   Job output directory: hdfs://xxxx.yyyy.com/user/GHU/
>
> Timing
>   Duration: 20.977s
>   Submit time: Wed, 29 Aug 2012 08:27 PM
>   Start time: Wed, 29 Aug 2012 08:27 PM
>   Finish time: Wed, 29 Aug 2012 08:27 PM
>
> Progress and Scheduling
>   Map progress: 100.0%
>   Reduce progress: 100.0%
>   Launched maps: 4
>   Data-local maps: 3
>   Rack-local maps: 1
>   Other local maps:
>   Desired maps: 3
>   Launched reducers:
>   Desired reducers: 0
>   Fairscheduler running tasks:
>   Fairscheduler minimum share:
>   Fairscheduler demand:
>
> Current Resource Usage
>   Current User CPUs: 0
>   Current System CPUs: 0
>   Resident memory: 0 B
>   Running maps: 0
>   Running reducers: 0
>
> Aggregate Resource Usage and Counters
>   User CPU: 0s
>   System CPU: 0s
>   Map Slot Time: 12.135s
>   Reduce slot time: 0s
>   Cumulative disk reads:
>   Cumulative disk writes: 155.0 KiB
>   Cumulative HDFS reads: 3.6 KiB
>   Cumulative HDFS writes:
>   Map input bytes: 2.5 KiB
>   Map input records: 45
>   Map output records: 0
>   Reducer input groups:
>   Reducer input records:
>   Reducer output records:
>   Reducer shuffle bytes:
>   Spilled records:

--
Bertrand Dechoux
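P.S. For reference, here is a minimal word-count-style pair of streaming scripts in Python. This is an illustrative sketch only -- I do not know what your CaidMatch scripts actually compute, so the logic and the names mapper.py / reducer.py are placeholders. What it demonstrates: a streaming mapper or reducer must write each output record to stdout as a single line, with key and value separated by a tab. Note that the counters you pasted show 45 map input records but 0 map output records, i.e. the mapper consumed its input without printing anything.

    #!/usr/bin/env python
    # mapper.py (placeholder name) -- reads raw lines on stdin and writes
    # one "key<TAB>value" line to stdout per output record. Streaming
    # counts each such line as a map output record; if nothing is written
    # here, the job still "succeeds" but the part files come out empty.
    import sys

    for line in sys.stdin:
        for word in line.split():
            # Tab separates key from value; newline ends the record.
            sys.stdout.write('%s\t%s\n' % (word, 1))

    #!/usr/bin/env python
    # reducer.py (placeholder name) -- stdin arrives sorted by key, so
    # equal keys are adjacent; sum the values per key and emit one
    # "key<TAB>total" line per group.
    import sys

    current_key, current_count = None, 0
    for line in sys.stdin:
        key, _, value = line.rstrip('\n').partition('\t')
        if key == current_key:
            current_count += int(value)
        else:
            # Key changed: flush the previous group before starting anew.
            if current_key is not None:
                sys.stdout.write('%s\t%d\n' % (current_key, current_count))
            current_key, current_count = key, int(value)
    # Flush the final group.
    if current_key is not None:
        sys.stdout.write('%s\t%d\n' % (current_key, current_count))

If your scripts already look like this, check that they are not printing diagnostics to stdout instead of stderr, and that print/write calls are actually reached for your input format.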