[ https://issues.apache.org/jira/browse/HADOOP-3604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-3604:
----------------------------------
Description:
I was running gridmix with Hadoop 0.18.
I set the map output compression to true.
Most of the jobs completed just fine.
Three jobs, however, got stuck.
Each has one reducer stuck in the shuffle phase.
Here is the log:
2008-06-20 00:06:01,264 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=SHUFFLE, sessionId=
2008-06-20 00:06:01,415 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/bin/cat]
2008-06-20 00:06:01,463 INFO org.apache.hadoop.mapred.ReduceTask: ShuffleRamManager: MemoryLimit=134217728, MaxSingleShuffleLimit=33554432
2008-06-20 00:06:01,474 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2008-06-20 00:06:01,475 INFO org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2008-06-20 00:06:01,476 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,477 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,477 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,478 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,478 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,486 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,486 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,487 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,487 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,488 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,488 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,489 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,489 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,489 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,493 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,496 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,496 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,496 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,497 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,497 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2008-06-20 00:06:01,500 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0 Thread started: Thread for merging on-disk files
2008-06-20 00:06:01,500 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0 Thread waiting: Thread for merging on-disk files
2008-06-20 00:06:01,502 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0 Need another 270 map output(s) where 0 is already in progress
2008-06-20 00:06:01,503 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0 Thread started: Thread for merging in memory files
2008-06-20 00:06:01,503 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0: Got 0 new map-outputs & number of known map outputs is 0
2008-06-20 00:06:01,504 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
2008-06-20 00:06:06,654 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0: Got 269 new map-outputs & number of known map outputs is 269
2008-06-20 00:06:06,656 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0 Scheduled 229 of 269 known outputs (0 slow hosts and 40 dup hosts)
2008-06-20 00:06:07,163 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 846183 bytes (210104 raw bytes) into RAM-FS from attempt_200806192318_0450_m_000089_0
2008-06-20 00:06:07,163 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 820890 bytes (204371 raw bytes) into RAM-FS from attempt_200806192318_0450_m_000083_0
2008-06-20 00:06:07,166 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 835672 bytes (208085 raw bytes) into RAM-FS from attempt_200806192318_0450_m_000122_0
Priority: Blocker (was: Major)
Assignee: Arun C Murthy
Runping, did you see any timeouts? Or was that all of the reduce's logs?
I assume you are using the DefaultCodec.
I'm marking this as a blocker till we understand what is happening...
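As a quick sanity check on the figures in the reducer log above, here is a minimal Python sketch. It is not part of Hadoop: the `summarize` helper and the `LOG` excerpt (log lines with the wrapping undone and the timestamp prefixes dropped) are hypothetical names introduced only for illustration. It assumes the first figure in each "Shuffling" line is the size counted against the in-memory shuffle limits.

```python
import re

# Excerpt of the reducer log from this issue; wrapped lines rejoined,
# timestamps and logger names dropped for brevity.
LOG = """\
ShuffleRamManager: MemoryLimit=134217728, MaxSingleShuffleLimit=33554432
attempt_200806192318_0450_r_000016_0 Need another 270 map output(s) where 0 is already in progress
attempt_200806192318_0450_r_000016_0 Scheduled 229 of 269 known outputs (0 slow hosts and 40 dup hosts)
Shuffling 846183 bytes (210104 raw bytes) into RAM-FS from attempt_200806192318_0450_m_000089_0
Shuffling 820890 bytes (204371 raw bytes) into RAM-FS from attempt_200806192318_0450_m_000083_0
Shuffling 835672 bytes (208085 raw bytes) into RAM-FS from attempt_200806192318_0450_m_000122_0
"""


def summarize(log):
    """Pull the shuffle counters out of a reducer log and cross-check them."""
    limits = re.search(r"MemoryLimit=(\d+), MaxSingleShuffleLimit=(\d+)", log)
    memory_limit, single_limit = (int(g) for g in limits.groups())

    needed = int(re.search(r"Need another (\d+) map output", log).group(1))

    # Use the most recent "Scheduled" line in the excerpt.
    sched = re.findall(
        r"Scheduled (\d+) of (\d+) known outputs "
        r"\((\d+) slow hosts and (\d+) dup hosts\)",
        log,
    )
    scheduled, known, slow, dup = (int(g) for g in sched[-1])

    # Assumption: the first number is what is charged against the RAM limits.
    sizes = [int(n) for n in re.findall(r"Shuffling (\d+) bytes", log)]

    return {
        "needed": needed,
        "known": known,
        "not_yet_known": needed - known,
        "scheduled": scheduled,
        "slow_hosts": slow,
        "dup_hosts": dup,
        "bytes_in_ram": sum(sizes),
        "all_fit_single_limit": all(s <= single_limit for s in sizes),
        "fits_memory_limit": sum(sizes) <= memory_limit,
        "single_limit_fraction": single_limit / memory_limit,
    }


if __name__ == "__main__":
    for key, value in summarize(LOG).items():
        print(key, value)
```

On this excerpt the scheduled and dup-host counts add up to the 269 known outputs, one of the 270 required outputs is not yet known to the reducer, and the three transfers that were started total 2502745 bytes, which is well under both limits. The excerpt by itself therefore does not show the reducer exhausting its shuffle memory.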
> Reduce stuck at shuffling phase
> -------------------------------
>
> Key: HADOOP-3604
> URL: https://issues.apache.org/jira/browse/HADOOP-3604
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Runping Qi
> Assignee: Arun C Murthy
> Priority: Blocker