You can find the reduce container from the RM's web page. BTW, from the above log you can also check whether the application master crashed.
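For example, assuming log aggregation is enabled (yarn.log-aggregation-enable=true), something like the following should pull all container logs for the failed run so you can grep for the reduce attempt ids; the output file name here is just an example:

    # dump the aggregated container logs for the failed application
    yarn logs -applicationId application_1397454060494_0003 > app_0003_containers.log
    # find the lines that mention reduce attempts of this job
    grep -n "attempt_1397454060494_0003_r_" app_0003_containers.log

If log aggregation is off, you can usually identify the reduce containers by grepping the per-container syslogs on each NodeManager; the path below assumes the default log layout under $HADOOP_HOME/logs, so adjust it to your install:

    # list container dirs whose syslog mentions a reduce attempt of this job
    grep -l "attempt_1397454060494_0003_r_" \
      $HADOOP_HOME/logs/userlogs/application_1397454060494_0003/container_*/syslog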
On Mon, Apr 14, 2014 at 3:12 PM, Rahul Singh <[email protected]> wrote:

> How do I identify a reduce container? There are multiple container dirs
> in my application id folder in userlogs.
>
>
> On Mon, Apr 14, 2014 at 12:29 PM, Gordon Wang <[email protected]> wrote:
>
>> Hi Rahul,
>>
>> What is the log of the reduce container? Please paste the log and we can
>> see the reason.
>>
>>
>> On Mon, Apr 14, 2014 at 2:38 PM, Rahul Singh <[email protected]> wrote:
>>
>>> Hi,
>>>   I am running a job (the wordcount example) on a 3-node cluster (1 master
>>> and 2 slaves). Sometimes the job passes but sometimes it fails because the
>>> reduce fails, even though the input data is only a few KB.
>>>   I am not able to nail down the reason for this inconsistency.
>>>
>>> Failed job log:
>>>
>>> 14/04/14 11:57:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>> 14/04/14 11:57:25 INFO client.RMProxy: Connecting to ResourceManager at /20.0.1.206:8032
>>> 14/04/14 11:57:26 INFO input.FileInputFormat: Total input paths to process : 1
>>> 14/04/14 11:57:26 INFO mapreduce.JobSubmitter: number of splits:1
>>> 14/04/14 11:57:26 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1397454060494_0003
>>> 14/04/14 11:57:26 INFO impl.YarnClientImpl: Submitted application application_1397454060494_0003
>>> 14/04/14 11:57:26 INFO mapreduce.Job: The url to track the job: http://20.0.1.206:8088/proxy/application_1397454060494_0003/
>>> 14/04/14 11:57:26 INFO mapreduce.Job: Running job: job_1397454060494_0003
>>> 14/04/14 11:57:34 INFO mapreduce.Job: Job job_1397454060494_0003 running in uber mode : false
>>> 14/04/14 11:57:34 INFO mapreduce.Job: map 0% reduce 0%
>>> 14/04/14 11:57:40 INFO mapreduce.Job: map 100% reduce 0%
>>> 14/04/14 11:57:46 INFO mapreduce.Job: map 100% reduce 13%
>>> 14/04/14 11:57:48 INFO mapreduce.Job: map 100% reduce 25%
>>> 14/04/14 11:57:49 INFO mapreduce.Job: map 100% reduce 38%
>>> 14/04/14 11:57:50 INFO mapreduce.Job: map 100% reduce 50%
>>> 14/04/14 11:57:54 INFO mapreduce.Job: Task Id : attempt_1397454060494_0003_r_000003_0, Status : FAILED
>>> 14/04/14 11:57:54 INFO mapreduce.Job: Task Id : attempt_1397454060494_0003_r_000001_0, Status : FAILED
>>> 14/04/14 11:57:56 INFO mapreduce.Job: Task Id : attempt_1397454060494_0003_r_000005_0, Status : FAILED
>>> 14/04/14 11:57:56 INFO mapreduce.Job: Task Id : attempt_1397454060494_0003_r_000007_0, Status : FAILED
>>> 14/04/14 11:58:02 INFO mapreduce.Job: map 100% reduce 63%
>>> 14/04/14 11:58:04 INFO mapreduce.Job: map 100% reduce 75%
>>> 14/04/14 11:58:09 INFO mapreduce.Job: Task Id : attempt_1397454060494_0003_r_000003_1, Status : FAILED
>>> 14/04/14 11:58:11 INFO mapreduce.Job: Task Id : attempt_1397454060494_0003_r_000005_1, Status : FAILED
>>> 14/04/14 11:58:24 INFO mapreduce.Job: Task Id : attempt_1397454060494_0003_r_000003_2, Status : FAILED
>>> 14/04/14 11:58:26 INFO mapreduce.Job: Task Id : attempt_1397454060494_0003_r_000005_2, Status : FAILED
>>> 14/04/14 11:58:40 INFO mapreduce.Job: map 100% reduce 100%
>>> 14/04/14 11:58:40 INFO mapreduce.Job: Job job_1397454060494_0003 failed with state FAILED due to: Task failed task_1397454060494_0003_r_000003
>>> Job failed as tasks failed. failedMaps:0 failedReduces:1
>>>
>>> 14/04/14 11:58:40 INFO mapreduce.Job: Counters: 51
>>>   File System Counters
>>>     FILE: Number of bytes read=80
>>>     FILE: Number of bytes written=596766
>>>     FILE: Number of read operations=0
>>>     FILE: Number of large read operations=0
>>>     FILE: Number of write operations=0
>>>     HDFS: Number of bytes read=175
>>>     HDFS: Number of bytes written=28
>>>     HDFS: Number of read operations=21
>>>     HDFS: Number of large read operations=0
>>>     HDFS: Number of write operations=12
>>>   Job Counters
>>>     Failed reduce tasks=9
>>>     Killed reduce tasks=1
>>>     Launched map tasks=1
>>>     Launched reduce tasks=16
>>>     Data-local map tasks=1
>>>     Total time spent by all maps in occupied slots (ms)=3477
>>>     Total time spent by all reduces in occupied slots (ms)=148867
>>>     Total time spent by all map tasks (ms)=3477
>>>     Total time spent by all reduce tasks (ms)=148867
>>>     Total vcore-seconds taken by all map tasks=3477
>>>     Total vcore-seconds taken by all reduce tasks=148867
>>>     Total megabyte-seconds taken by all map tasks=3560448
>>>     Total megabyte-seconds taken by all reduce tasks=152439808
>>>   Map-Reduce Framework
>>>     Map input records=3
>>>     Map output records=13
>>>     Map output bytes=110
>>>     Map output materialized bytes=112
>>>     Input split bytes=117
>>>     Combine input records=13
>>>     Combine output records=6
>>>     Reduce input groups=4
>>>     Reduce shuffle bytes=80
>>>     Reduce input records=4
>>>     Reduce output records=4
>>>     Spilled Records=10
>>>     Shuffled Maps =6
>>>     Failed Shuffles=0
>>>     Merged Map outputs=6
>>>     GC time elapsed (ms)=142
>>>     CPU time spent (ms)=6420
>>>     Physical memory (bytes) snapshot=1100853248
>>>     Virtual memory (bytes) snapshot=4468314112
>>>     Total committed heap usage (bytes)=1406992384
>>>   Shuffle Errors
>>>     BAD_ID=0
>>>     CONNECTION=0
>>>     IO_ERROR=0
>>>     WRONG_LENGTH=0
>>>     WRONG_MAP=0
>>>     WRONG_REDUCE=0
>>>   File Input Format Counters
>>>     Bytes Read=58
>>>   File Output Format Counters
>>>     Bytes Written=28
>>>
>>> Passing job log:
>>> hadoop jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0.jar wordcount /user/hduser/input /user/hduser/output_wordcount9
>>> 14/04/14 11:47:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>> 14/04/14 11:47:28 INFO client.RMProxy: Connecting to ResourceManager at /20.0.1.206:8032
>>> 14/04/14 11:47:28 INFO input.FileInputFormat: Total input paths to process : 1
>>> 14/04/14 11:47:29 INFO mapreduce.JobSubmitter: number of splits:1
>>> 14/04/14 11:47:29 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1397454060494_0002
>>> 14/04/14 11:47:29 INFO impl.YarnClientImpl: Submitted application application_1397454060494_0002
>>> 14/04/14 11:47:29 INFO mapreduce.Job: The url to track the job: http://20.0.1.206:8088/proxy/application_1397454060494_0002/
>>> 14/04/14 11:47:29 INFO mapreduce.Job: Running job: job_1397454060494_0002
>>> 14/04/14 11:47:36 INFO mapreduce.Job: Job job_1397454060494_0002 running in uber mode : false
>>> 14/04/14 11:47:36 INFO mapreduce.Job: map 0% reduce 0%
>>> 14/04/14 11:47:50 INFO mapreduce.Job: Task Id : attempt_1397454060494_0002_m_000000_0, Status : FAILED
>>> 14/04/14 11:48:05 INFO mapreduce.Job: Task Id : attempt_1397454060494_0002_m_000000_1, Status : FAILED
>>> 14/04/14 11:48:20 INFO mapreduce.Job: Task Id : attempt_1397454060494_0002_m_000000_2, Status : FAILED
>>> 14/04/14 11:48:26 INFO mapreduce.Job: map 100% reduce 0%
>>> 14/04/14 11:48:34 INFO mapreduce.Job: map 100% reduce 13%
>>> 14/04/14 11:48:35 INFO mapreduce.Job: map 100% reduce 25%
>>> 14/04/14 11:48:37 INFO mapreduce.Job: map 100% reduce 50%
>>> 14/04/14 11:48:41 INFO mapreduce.Job: Task Id : attempt_1397454060494_0002_r_000001_0, Status : FAILED
>>> 14/04/14 11:48:42 INFO mapreduce.Job: Task Id : attempt_1397454060494_0002_r_000003_0, Status : FAILED
>>> 14/04/14 11:48:43 INFO mapreduce.Job: Task Id : attempt_1397454060494_0002_r_000005_0, Status : FAILED
>>> 14/04/14 11:48:44 INFO mapreduce.Job: Task Id : attempt_1397454060494_0002_r_000007_0, Status : FAILED
>>> 14/04/14 11:48:50 INFO mapreduce.Job: map 100% reduce 63%
>>> 14/04/14 11:48:51 INFO mapreduce.Job: map 100% reduce 75%
>>> 14/04/14 11:48:52 INFO mapreduce.Job: map 100% reduce 88%
>>> 14/04/14 11:48:58 INFO mapreduce.Job: Task Id : attempt_1397454060494_0002_r_000005_1, Status : FAILED
>>> 14/04/14 11:49:05 INFO mapreduce.Job: map 100% reduce 100%
>>> 14/04/14 11:49:06 INFO mapreduce.Job: Job job_1397454060494_0002 completed successfully
>>> 14/04/14 11:49:06 INFO mapreduce.Job: Counters: 52
>>>   File System Counters
>>>     FILE: Number of bytes read=112
>>>     FILE: Number of bytes written=767175
>>>     FILE: Number of read operations=0
>>>     FILE: Number of large read operations=0
>>>     FILE: Number of write operations=0
>>>     HDFS: Number of bytes read=175
>>>     HDFS: Number of bytes written=40
>>>     HDFS: Number of read operations=27
>>>     HDFS: Number of large read operations=0
>>>     HDFS: Number of write operations=16
>>>   Job Counters
>>>     Failed map tasks=3
>>>     Failed reduce tasks=5
>>>     Launched map tasks=4
>>>     Launched reduce tasks=13
>>>     Other local map tasks=3
>>>     Data-local map tasks=1
>>>     Total time spent by all maps in occupied slots (ms)=41629
>>>     Total time spent by all reduces in occupied slots (ms)=104530
>>>     Total time spent by all map tasks (ms)=41629
>>>     Total time spent by all reduce tasks (ms)=104530
>>>     Total vcore-seconds taken by all map tasks=41629
>>>     Total vcore-seconds taken by all reduce tasks=104530
>>>     Total megabyte-seconds taken by all map tasks=42628096
>>>     Total megabyte-seconds taken by all reduce tasks=107038720
>>>   Map-Reduce Framework
>>>     Map input records=3
>>>     Map output records=13
>>>     Map output bytes=110
>>>     Map output materialized bytes=112
>>>     Input split bytes=117
>>>     Combine input records=13
>>>     Combine output records=6
>>>     Reduce input groups=6
>>>     Reduce shuffle bytes=112
>>>     Reduce input records=6
>>>     Reduce output records=6
>>>     Spilled Records=12
>>>     Shuffled Maps =8
>>>     Failed Shuffles=0
>>>     Merged Map outputs=8
>>>     GC time elapsed (ms)=186
>>>     CPU time spent (ms)=8890
>>>     Physical memory (bytes) snapshot=1408913408
>>>     Virtual memory (bytes) snapshot=5727019008
>>>     Total committed heap usage (bytes)=1808990208
>>>   Shuffle Errors
>>>     BAD_ID=0
>>>     CONNECTION=0
>>>     IO_ERROR=0
>>>     WRONG_LENGTH=0
>>>     WRONG_MAP=0
>>>     WRONG_REDUCE=0
>>>   File Input Format Counters
>>>     Bytes Read=58
>>>   File Output Format Counters
>>>     Bytes Written=40
>>>
>>> Thanks and Regards,
>>> -Rahul Singh
>>>
>>
>>
>> --
>> Regards
>> Gordon Wang
>>
>

--
Regards
Gordon Wang
