Do you observe the same thing when running without Hadoop (cat, map, sort, and then reduce)?
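For example, something along these lines on a small local sample -- mapper.py and reducer.py here stand in for your actual scripts (I don't know their real names, and they need to be executable, e.g. chmod +x), and input.txt is a local copy of your input:

    cat input.txt | ./mapper.py | sort | ./reducer.py

If that local pipe also produces nothing, the problem is in the scripts themselves rather than in the job configuration.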
Could you provide the counters of your job? You should be able to get them through the JobTracker web interface. Without more information, the most probable explanation is that your reducer does not output any <key, value> pairs (a minimal, illustrative mapper/reducer sketch follows at the end of this message).

Regards,
Bertrand

On Thu, Aug 30, 2012 at 5:52 AM, Periya.Data <periya.d...@gmail.com> wrote:
> Hi All,
> My Hadoop streaming job (in Python) runs to "completion" (both map and
> reduce say 100% complete). But when I look at the output directory in
> HDFS, the part files are empty. I do not know what might be causing this
> behavior. I understand that the percentages represent the records that
> have been read in (not processed).
>
> The following are some of the logs. The detailed logs from Cloudera
> Manager say that there were no map outputs... which is interesting. Any
> suggestions?
>
> 12/08/30 03:27:14 INFO streaming.StreamJob: To kill this job, run:
> 12/08/30 03:27:14 INFO streaming.StreamJob: /usr/lib/hadoop-0.20/bin/hadoop
>   job -Dmapred.job.tracker=xxxxx.yyy.com:8021 -kill job_201208232245_3182
> 12/08/30 03:27:14 INFO streaming.StreamJob: Tracking URL:
>   http://xxxxxx.yyyy.com:60030/jobdetails.jsp?jobid=job_201208232245_3182
> 12/08/30 03:27:15 INFO streaming.StreamJob: map 0% reduce 0%
> 12/08/30 03:27:20 INFO streaming.StreamJob: map 33% reduce 0%
> 12/08/30 03:27:23 INFO streaming.StreamJob: map 67% reduce 0%
> 12/08/30 03:27:29 INFO streaming.StreamJob: map 100% reduce 0%
> 12/08/30 03:27:33 INFO streaming.StreamJob: map 100% reduce 100%
> 12/08/30 03:27:35 INFO streaming.StreamJob: Job complete: job_201208232245_3182
> 12/08/30 03:27:35 INFO streaming.StreamJob: Output: /user/GHU
> Thu Aug 30 03:27:24 GMT 2012
> *** END
>
> bash-3.2$ hadoop fs -ls /user/ghu/
> Found 5 items
> -rw-r--r--  3 ghu hadoop  0 2012-08-30 03:27 /user/GHU/_SUCCESS
> drwxrwxrwx  - ghu hadoop  0 2012-08-30 03:27 /user/GHU/_logs
> -rw-r--r--  3 ghu hadoop  0 2012-08-30 03:27 /user/GHU/part-00000
> -rw-r--r--  3 ghu hadoop  0 2012-08-30 03:27 /user/GHU/part-00001
> -rw-r--r--  3 ghu hadoop  0 2012-08-30 03:27 /user/GHU/part-00002
> bash-3.2$
>
> ----------------------------------------------------------------------
>
> Metadata
>   Status: Succeeded
>   Type: MapReduce
>   Id: job_201208232245_3182
>   Name: CaidMatch
>   User: srisrini
>   Mapper class: PipeMapper
>   Reducer class:
>   Scheduler pool name: default
>   Job input directory: hdfs://xxxxx.yyy.txt,hdfs://xxxx.yyyy.com/user/GHUcaidlist.txt
>   Job output directory: hdfs://xxxx.yyyy.com/user/GHU/
>
> Timing
>   Duration: 20.977s
>   Submit time: Wed, 29 Aug 2012 08:27 PM
>   Start time: Wed, 29 Aug 2012 08:27 PM
>   Finish time: Wed, 29 Aug 2012 08:27 PM
>
> Progress and Scheduling
>   Map progress: 100.0%
>   Reduce progress: 100.0%
>   Launched maps: 4
>   Data-local maps: 3
>   Rack-local maps: 1
>   Other local maps:
>   Desired maps: 3
>   Launched reducers:
>   Desired reducers: 0
>   Fairscheduler running tasks:
>   Fairscheduler minimum share:
>   Fairscheduler demand:
>
> Current Resource Usage
>   Current User CPUs: 0
>   Current System CPUs: 0
>   Resident memory: 0 B
>   Running maps: 0
>   Running reducers: 0
>
> Aggregate Resource Usage and Counters
>   User CPU: 0s
>   System CPU: 0s
>   Map Slot Time: 12.135s
>   Reduce slot time: 0s
>   Cumulative disk reads:
>   Cumulative disk writes: 155.0 KiB
>   Cumulative HDFS reads: 3.6 KiB
>   Cumulative HDFS writes:
>   Map input bytes: 2.5 KiB
>   Map input records: 45
>   Map output records: 0
>   Reducer input groups:
>   Reducer input records:
>   Reducer output records:
>   Reducer shuffle bytes:
>   Spilled records:

--
Bertrand Dechoux
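P.S. For reference, here is a minimal word-count-style pair of streaming scripts in Python. This is an illustrative sketch only -- I do not know what your CaidMatch scripts actually compute, so the logic and the names mapper.py / reducer.py are placeholders. What it demonstrates: a streaming mapper or reducer must write each output record to stdout as a single line, with key and value separated by a tab. Note that the counters you pasted show 45 map input records but 0 map output records, i.e. the mapper consumed its input without printing anything.

    #!/usr/bin/env python
    # mapper.py (placeholder name) -- reads raw lines on stdin and writes
    # one "key<TAB>value" line to stdout per output record. Streaming
    # counts each such line as a map output record; if nothing is written
    # here, the job still "succeeds" but the part files come out empty.
    import sys

    for line in sys.stdin:
        for word in line.split():
            # Tab separates key from value; newline ends the record.
            sys.stdout.write('%s\t%s\n' % (word, 1))

    #!/usr/bin/env python
    # reducer.py (placeholder name) -- stdin arrives sorted by key, so
    # equal keys are adjacent; sum the values per key and emit one
    # "key<TAB>total" line per group.
    import sys

    current_key, current_count = None, 0
    for line in sys.stdin:
        key, _, value = line.rstrip('\n').partition('\t')
        if key == current_key:
            current_count += int(value)
        else:
            # Key changed: flush the previous group before starting anew.
            if current_key is not None:
                sys.stdout.write('%s\t%d\n' % (current_key, current_count))
            current_key, current_count = key, int(value)
    # Flush the final group.
    if current_key is not None:
        sys.stdout.write('%s\t%d\n' % (current_key, current_count))

If your scripts already look like this, check that they are not printing diagnostics to stdout instead of stderr, and that print/write calls are actually reached for your input format.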