[
https://issues.apache.org/jira/browse/MAPREDUCE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rajat Jain updated MAPREDUCE-6321:
----------------------------------
Affects Version/s: 2.6.0
> Map tasks take a lot of time to start up
> ----------------------------------------
>
> Key: MAPREDUCE-6321
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6321
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Affects Versions: 2.6.0
> Reporter: Rajat Jain
> Priority: Critical
> Labels: performance
>
> I have noticed repeatedly that the map tasks take a lot of time to startup on
> YARN clusters. This is not the scheduling part, this is after the actual
> container is launched containing the Map task. Take for example, the sample
> log from a mapper of a Pi job that I launched. The command I used to launch
> the Pi job was:
> {code}
> hadoop jar
> /usr/lib/hadoop/share/hadoop/mapreduce/hadoop*mapreduce*examples*jar pi 10 100
> {code}
> This is the sample job from one of the mappers which took 14 seconds to
> complete. If you notice from the logs, most of the time taken by this job is
> during the start up. I notice that the most mappers take anywhere between 7
> to 15 seconds during start up and have seen this behavior consistent across
> mapreduce jobs. This really affects the performance of short running mappers.
> I run a hadoop2 / yarn cluster on a 4-5 node m1.xlarge cluster, and the
> mapper memory is always specified as 2048m and so on.
> Log:
> {code}
> 2015-04-18 06:48:34,081 INFO [main]
> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
> hadoop-metrics2.properties
> 2015-04-18 06:48:34,637 INFO [main]
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period
> at 10 second(s).
> 2015-04-18 06:48:34,637 INFO [main]
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
> started
> 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild:
> Executing with tokens:
> 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind:
> mapreduce.job, Service: job_1429338752209_0059, Ident:
> (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@5d48e5d6)
> 2015-04-18 06:48:35,391 INFO [main] org.apache.hadoop.mapred.YarnChild:
> Sleeping for 0ms before retrying again. Got null now.
> 2015-04-18 06:48:36,656 INFO [main] org.apache.hadoop.mapred.YarnChild:
> mapreduce.cluster.local.dir for child:
> /media/ephemeral3/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral1/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral0/yarn/local/usercache/rjain/appcache/application_1429338752209_0059
> 2015-04-18 06:48:36,657 INFO [main] org.apache.hadoop.mapred.YarnChild:
> mapreduce.cluster.reserved.local.dir for child:
> 2015-04-18 06:48:36,706 INFO [main]
> org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
> deprecated. Instead, use fs.defaultFS
> 2015-04-18 06:48:37,387 INFO [main]
> org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
> deprecated. Instead, use fs.defaultFS
> 2015-04-18 06:48:39,388 INFO [main]
> org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated.
> Instead, use dfs.metrics.session-id
> 2015-04-18 06:48:39,448 INFO [main]
> org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
> deprecated. Instead, use fs.defaultFS
> 2015-04-18 06:48:41,060 INFO [main]
> org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to
> org.apache.hadoop.mapred.Task$TaskReporter@601211d0 comment setting up
> progress from Task
> 2015-04-18 06:48:41,098 INFO [main] org.apache.hadoop.mapred.Task: Using
> ResourceCalculatorProcessTree : [ ]
> 2015-04-18 06:48:41,585 INFO [main] org.apache.hadoop.mapred.MapTask:
> Processing split:
> hdfs://ec2-54-211-109-245.compute-1.amazonaws.com:9000/user/rjain/QuasiMonteCarlo_1429339685772_504558444/in/part4:0+118
> 2015-04-18 06:48:43,926 INFO [main] org.apache.hadoop.mapred.MapTask:
> (EQUATOR) 0 kvi 234881020(939524080)
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask:
> mapreduce.task.io.sort.mb: 896
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: soft
> limit at 657666880
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask:
> bufstart = 0; bufvoid = 939524096
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart
> = 234881020; length = 58720256
> 2015-04-18 06:48:43,946 INFO [main] org.apache.hadoop.mapred.MapTask: Map
> output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask:
> Starting flush of map output
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask:
> Spilling map output
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask:
> bufstart = 0; bufend = 18; bufvoid = 939524096
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart
> = 234881020(939524080); kvend = 234881016(939524064); length = 5/58720256
> 2015-04-18 06:48:44,065 INFO [main]
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext$DirSelector:
> Returning directory:
> /media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/attempt_1429338752209_0059_m_000004_0_spill_0.out
> 2015-04-18 06:48:44,089 INFO [main] org.apache.hadoop.io.compress.CodecPool:
> Got brand-new compressor [.snappy]
> 2015-04-18 06:48:44,100 INFO [main] org.apache.hadoop.mapred.MapTask:
> Finished spill 0
> 2015-04-18 06:48:44,111 INFO [main] org.apache.hadoop.mapred.Task:
> Task:attempt_1429338752209_0059_m_000004_0 is done. And is in the process of
> committing
> 2015-04-18 06:48:44,164 INFO [main]
> org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to null
> comment clearing progress now that task is done
> 2015-04-18 06:48:44,191 INFO [main] org.apache.hadoop.mapred.Task: Counters
> for attempt_1429338752209_0059_m_000004_0
> 2015-04-18 06:48:44,203 INFO [main] org.apache.hadoop.mapred.Task: File
> System Counters
> 2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of bytes read = 0
> 2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of bytes written = 116166
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of read operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of large read operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of write operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of bytes read = 297
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of bytes written = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of read operations = 4
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of large read operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of write operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce
> Framework
> 2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map input
> records = 1
> 2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map output
> records = 2
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output
> bytes = 18
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output
> materialized bytes = 34
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Input
> split bytes = 179
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Combine
> input records = 0
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Spilled
> Records = 2
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Failed
> Shuffles = 0
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Merged Map
> outputs = 0
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: GC time
> elapsed (ms) = 158
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: CPU time
> spent (ms) = 2080
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Physical
> memory (bytes) snapshot = 1104715776
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Virtual
> memory (bytes) snapshot = 2900013056
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Total
> committed heap usage (bytes) = 1251475456
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File Input
> Format Counters
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Bytes Read
> = 118
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File
> System Counters
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of bytes read = 0
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of bytes written = 116166
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of read operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of large read operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of write operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of bytes read = 297
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of bytes written = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of read operations = 4
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of large read operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of write operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce
> Framework
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map input
> records = 1
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
> records = 2
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
> bytes = 18
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
> materialized bytes = 34
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Input
> split bytes = 179
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Combine
> input records = 0
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Spilled
> Records = 2
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Failed
> Shuffles = 0
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Merged Map
> outputs = 0
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: GC time
> elapsed (ms) = 158
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: CPU time
> spent (ms) = 2080
> 2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Physical
> memory (bytes) snapshot = 1104715776
> 2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Virtual
> memory (bytes) snapshot = 2900013056
> 2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Total
> committed heap usage (bytes) = 1251475456
> 2015-04-18 06:48:44,214 INFO [main] org.apache.hadoop.mapred.Task: File path:
> /media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/output/attempt_1429338752209_0059_m_000004_0/file.out
> 2015-04-18 06:48:44,223 INFO [main] org.apache.hadoop.mapred.Task: Task
> 'attempt_1429338752209_0059_m_000004_0' done.
> 2015-04-18 06:48:44,225 INFO [main]
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics
> system...
> 2015-04-18 06:48:44,231 INFO [main]
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
> stopped.
> 2015-04-18 06:48:44,233 INFO [main]
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
> shutdown complete.
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)