It's possible that they are compressing the output. I'm now rebuilding the code after commenting out the setOutputCompress(true) call.
I will also run with the compression param set to false, but it's still quite surprising that compression should take so long (8--10 minutes).

On Thu, Oct 9, 2014 at 11:06 AM, Yang <[email protected]> wrote:
> my Q-Job MR job shows as 100% mapper complete (it's a map-only job) very
> quickly, but the job itself does not finish until about 10 minutes later.
> this is rather surprising. my input is a sparse vector of 37000 rows, and
> the column count is 8000, with each row usually having < 10 elements set to
> non-zero. so the input size is fairly small.
>
> I looked at the Q-job code, it seems rather normal, i.e. it's not doing
> anything special after the map() function is completed. so I wonder why
> it's lagging so long after 100% ?
>
> here is the syslog from hadoop:
>
> 2014-10-09 10:37:40,504 INFO [main] org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
> 2014-10-09 10:37:40,538 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.gz]
> 2014-10-09 10:37:40,548 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.gz]
> 2014-10-09 10:37:40,548 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.gz]
> 2014-10-09 10:37:40,549 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.gz]
> 2014-10-09 10:39:39,143 WARN [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: Error reading the stream java.io.IOException: No such process
> 2014-10-09 10:40:09,117 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
> 2014-10-09 10:46:23,991 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.deflate]
> 2014-10-09 10:46:23,992 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.deflate]
> 2014-10-09 10:46:23,992 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.deflate]
> 2014-10-09 10:46:23,992 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.deflate]
> 2014-10-09 10:46:31,219 INFO [LeaseRenewer:[email protected]:8020] org.apache.hadoop.ipc.Client: Retrying connect to server: apollo-phx-nn.vip.ebay.com/10.115.201.75:8020. Already tried 0 time(s); maxRetries=45
> 2014-10-09 10:47:45,241 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
> 2014-10-09 10:47:46,571 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1412781120464_7857_m_000000_0 is done. And is in the process of committing
> 2014-10-09 10:47:46,739 INFO [main] org.apache.hadoop.mapred.Task: Task attempt_1412781120464_7857_m_000000_0 is allowed to commit now
> 2014-10-09 10:47:47,389 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved output of task 'attempt_1412781120464_7857_m_000000_0' to hdfs://apollo-phx-nn.vip.ebay.com:8020/user/yyang15/CIReco/shoes/ssvd/tmp/ssvd/Q-job/_temporary/1/task_1412781120464_7857_m_000000
> 2014-10-09 10:47:47,574 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1412781120464_7857_m_000000_0' done.
> 2014-10-09 10:47:47,575 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
> 2014-10-09 10:47:47,576 INFO [ganglia] org.apache.hadoop.metrics2.impl.MetricsSinkAdapter: ganglia thread interrupted.
> 2014-10-09 10:47:47,576 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
> 2014-10-09 10:47:47,576 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.
>
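For what it's worth, output compression can often be toggled without rebuilding the jar, via Hadoop's generic options, assuming the job driver implements Tool and forwards its Configuration to the submitted job (whether Mahout's SSVD driver honors this, and the exact driver class and arguments, are assumptions to verify for your setup):

```shell
# Hedged sketch: disable MR job output compression from the command line.
# Property name is for Hadoop 2.x; on older clusters the equivalent
# is mapred.output.compress. <DriverClass> and the trailing arguments
# are placeholders for your actual invocation.
hadoop jar mahout-job.jar <DriverClass> \
  -D mapreduce.output.fileoutputformat.compress=false \
  <other job arguments...>
```

If the driver sets compression programmatically (as setOutputCompress(true) suggests), the in-code setting will win over the command-line property, so the rebuild may still be necessary.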
