Rajat Jain created MAPREDUCE-6321:
-------------------------------------
Summary: Map tasks take a lot of time to start up
Key: MAPREDUCE-6321
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6321
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Rajat Jain
I have noticed repeatedly that the map tasks take a lot of time to startup on
YARN clusters. This is not the scheduling part, this is after the actual
container is launched containing the Map task. Take for example, the sample log
from a mapper of a Pi job that I launched. The command I used to launch the Pi
job was:
{code}
hadoop jar /usr/lib/hadoop/share/hadoop/mapreduce/hadoop*mapreduce*examples*jar
pi 10 100
{code}
This is the sample job from one of the mappers which took 14 seconds to
complete. If you notice from the logs, most of the time taken by this job is
during the start up. I notice that the most mappers take anywhere between 7 to
15 seconds during start up and have seen this behavior consistent across
mapreduce jobs. This really affects the performance of short running mappers.
I run a hadoop2 / yarn cluster on a 4-5 node m1.xlarge cluster, and the mapper
memory is always specified as 2048m and so on.
Log:
{code}
2015-04-18 06:48:34,081 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
hadoop-metrics2.properties
2015-04-18 06:48:34,637 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at
10 second(s).
2015-04-18 06:48:34,637 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
started
2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild:
Executing with tokens:
2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind:
mapreduce.job, Service: job_1429338752209_0059, Ident:
(org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@5d48e5d6)
2015-04-18 06:48:35,391 INFO [main] org.apache.hadoop.mapred.YarnChild:
Sleeping for 0ms before retrying again. Got null now.
2015-04-18 06:48:36,656 INFO [main] org.apache.hadoop.mapred.YarnChild:
mapreduce.cluster.local.dir for child:
/media/ephemeral3/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral1/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral0/yarn/local/usercache/rjain/appcache/application_1429338752209_0059
2015-04-18 06:48:36,657 INFO [main] org.apache.hadoop.mapred.YarnChild:
mapreduce.cluster.reserved.local.dir for child:
2015-04-18 06:48:36,706 INFO [main]
org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
deprecated. Instead, use fs.defaultFS
2015-04-18 06:48:37,387 INFO [main]
org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
deprecated. Instead, use fs.defaultFS
2015-04-18 06:48:39,388 INFO [main]
org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated.
Instead, use dfs.metrics.session-id
2015-04-18 06:48:39,448 INFO [main]
org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
deprecated. Instead, use fs.defaultFS
2015-04-18 06:48:41,060 INFO [main]
org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to
org.apache.hadoop.mapred.Task$TaskReporter@601211d0 comment setting up progress
from Task
2015-04-18 06:48:41,098 INFO [main] org.apache.hadoop.mapred.Task: Using
ResourceCalculatorProcessTree : [ ]
2015-04-18 06:48:41,585 INFO [main] org.apache.hadoop.mapred.MapTask:
Processing split:
hdfs://ec2-54-211-109-245.compute-1.amazonaws.com:9000/user/rjain/QuasiMonteCarlo_1429339685772_504558444/in/part4:0+118
2015-04-18 06:48:43,926 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR)
0 kvi 234881020(939524080)
2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask:
mapreduce.task.io.sort.mb: 896
2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: soft
limit at 657666880
2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart
= 0; bufvoid = 939524096
2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart =
234881020; length = 58720256
2015-04-18 06:48:43,946 INFO [main] org.apache.hadoop.mapred.MapTask: Map
output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: Starting
flush of map output
2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling
map output
2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart
= 0; bufend = 18; bufvoid = 939524096
2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart =
234881020(939524080); kvend = 234881016(939524064); length = 5/58720256
2015-04-18 06:48:44,065 INFO [main]
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext$DirSelector:
Returning directory:
/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/attempt_1429338752209_0059_m_000004_0_spill_0.out
2015-04-18 06:48:44,089 INFO [main] org.apache.hadoop.io.compress.CodecPool:
Got brand-new compressor [.snappy]
2015-04-18 06:48:44,100 INFO [main] org.apache.hadoop.mapred.MapTask: Finished
spill 0
2015-04-18 06:48:44,111 INFO [main] org.apache.hadoop.mapred.Task:
Task:attempt_1429338752209_0059_m_000004_0 is done. And is in the process of
committing
2015-04-18 06:48:44,164 INFO [main]
org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to null
comment clearing progress now that task is done
2015-04-18 06:48:44,191 INFO [main] org.apache.hadoop.mapred.Task: Counters for
attempt_1429338752209_0059_m_000004_0
2015-04-18 06:48:44,203 INFO [main] org.apache.hadoop.mapred.Task: File System
Counters
2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of bytes read = 0
2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of bytes written = 116166
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of read operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of large read operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of write operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of bytes read = 297
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of bytes written = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of read operations = 4
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of large read operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of write operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce
Framework
2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map input
records = 1
2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map output
records = 2
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output
bytes = 18
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output
materialized bytes = 34
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Input split
bytes = 179
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Combine
input records = 0
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Spilled
Records = 2
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Failed
Shuffles = 0
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Merged Map
outputs = 0
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: GC time
elapsed (ms) = 158
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: CPU time
spent (ms) = 2080
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Physical
memory (bytes) snapshot = 1104715776
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Virtual
memory (bytes) snapshot = 2900013056
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Total
committed heap usage (bytes) = 1251475456
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File Input
Format Counters
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Bytes Read =
118
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File System
Counters
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of bytes read = 0
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of bytes written = 116166
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of read operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of large read operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of write operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of bytes read = 297
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of bytes written = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of read operations = 4
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of large read operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of write operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce
Framework
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map input
records = 1
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
records = 2
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
bytes = 18
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
materialized bytes = 34
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Input split
bytes = 179
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Combine
input records = 0
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Spilled
Records = 2
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Failed
Shuffles = 0
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Merged Map
outputs = 0
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: GC time
elapsed (ms) = 158
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: CPU time
spent (ms) = 2080
2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Physical
memory (bytes) snapshot = 1104715776
2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Virtual
memory (bytes) snapshot = 2900013056
2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Total
committed heap usage (bytes) = 1251475456
2015-04-18 06:48:44,214 INFO [main] org.apache.hadoop.mapred.Task: File path:
/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/output/attempt_1429338752209_0059_m_000004_0/file.out
2015-04-18 06:48:44,223 INFO [main] org.apache.hadoop.mapred.Task: Task
'attempt_1429338752209_0059_m_000004_0' done.
2015-04-18 06:48:44,225 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics
system...
2015-04-18 06:48:44,231 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
stopped.
2015-04-18 06:48:44,233 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
shutdown complete.
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)