[
https://issues.apache.org/jira/browse/MAPREDUCE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rajat Jain updated MAPREDUCE-6321:
----------------------------------
Description:
I have noticed repeatedly that the map tasks take a lot of time to startup on
YARN clusters. This is not the scheduling part, this is after the actual
container is launched containing the Map task. Take for example, the sample log
from a mapper of a Pi job that I launched. The command I used to launch the Pi
job was:
{code}
hadoop jar /usr/lib/hadoop/share/hadoop/mapreduce/hadoop*mapreduce*examples*jar
pi 10 100
{code}
This is the sample job from one of the mappers which took 14 seconds to
complete. If you notice from the logs, most of the time taken by this job is
during the start up. I notice that the most mappers take anywhere between 7 to
15 seconds during start up and have seen this behavior consistent across
mapreduce jobs. This really affects the performance of short running mappers.
I run a hadoop2 / yarn cluster on a 4-5 node m1.xlarge cluster, and the mapper
memory is always specified as 2048m and so on.
Log:
{code}
2015-04-18 06:48:34,081 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
hadoop-metrics2.properties
2015-04-18 06:48:34,637 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at
10 second(s).
2015-04-18 06:48:34,637 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
started
2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild:
Executing with tokens:
2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind:
mapreduce.job, Service: job_1429338752209_0059, Ident:
(org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@5d48e5d6)
2015-04-18 06:48:35,391 INFO [main] org.apache.hadoop.mapred.YarnChild:
Sleeping for 0ms before retrying again. Got null now.
2015-04-18 06:48:36,656 INFO [main] org.apache.hadoop.mapred.YarnChild:
mapreduce.cluster.local.dir for child:
/media/ephemeral3/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral1/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral0/yarn/local/usercache/rjain/appcache/application_1429338752209_0059
2015-04-18 06:48:36,706 INFO [main]
org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
deprecated. Instead, use fs.defaultFS
2015-04-18 06:48:37,387 INFO [main]
org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
deprecated. Instead, use fs.defaultFS
2015-04-18 06:48:39,388 INFO [main]
org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated.
Instead, use dfs.metrics.session-id
2015-04-18 06:48:39,448 INFO [main]
org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
deprecated. Instead, use fs.defaultFS
2015-04-18 06:48:41,060 INFO [main]
org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to
org.apache.hadoop.mapred.Task$TaskReporter@601211d0 comment setting up progress
from Task
2015-04-18 06:48:41,098 INFO [main] org.apache.hadoop.mapred.Task: Using
ResourceCalculatorProcessTree : [ ]
2015-04-18 06:48:41,585 INFO [main] org.apache.hadoop.mapred.MapTask:
Processing split:
hdfs://ec2-54-211-109-245.compute-1.amazonaws.com:9000/user/rjain/QuasiMonteCarlo_1429339685772_504558444/in/part4:0+118
2015-04-18 06:48:43,926 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR)
0 kvi 234881020(939524080)
2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask:
mapreduce.task.io.sort.mb: 896
2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: soft
limit at 657666880
2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart
= 0; bufvoid = 939524096
2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart =
234881020; length = 58720256
2015-04-18 06:48:43,946 INFO [main] org.apache.hadoop.mapred.MapTask: Map
output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: Starting
flush of map output
2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling
map output
2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart
= 0; bufend = 18; bufvoid = 939524096
2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart =
234881020(939524080); kvend = 234881016(939524064); length = 5/58720256
2015-04-18 06:48:44,065 INFO [main]
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext$DirSelector:
Returning directory:
/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/attempt_1429338752209_0059_m_000004_0_spill_0.out
2015-04-18 06:48:44,089 INFO [main] org.apache.hadoop.io.compress.CodecPool:
Got brand-new compressor [.snappy]
2015-04-18 06:48:44,100 INFO [main] org.apache.hadoop.mapred.MapTask: Finished
spill 0
2015-04-18 06:48:44,111 INFO [main] org.apache.hadoop.mapred.Task:
Task:attempt_1429338752209_0059_m_000004_0 is done. And is in the process of
committing
2015-04-18 06:48:44,164 INFO [main]
org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to null
comment clearing progress now that task is done
2015-04-18 06:48:44,191 INFO [main] org.apache.hadoop.mapred.Task: Counters for
attempt_1429338752209_0059_m_000004_0
2015-04-18 06:48:44,203 INFO [main] org.apache.hadoop.mapred.Task: File System
Counters
2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of bytes read = 0
2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of bytes written = 116166
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of read operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of large read operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of write operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of bytes read = 297
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of bytes written = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of read operations = 4
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of large read operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of write operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce
Framework
2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map input
records = 1
2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map output
records = 2
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output
bytes = 18
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output
materialized bytes = 34
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Input split
bytes = 179
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Combine
input records = 0
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Spilled
Records = 2
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Failed
Shuffles = 0
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Merged Map
outputs = 0
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: GC time
elapsed (ms) = 158
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: CPU time
spent (ms) = 2080
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Physical
memory (bytes) snapshot = 1104715776
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Virtual
memory (bytes) snapshot = 2900013056
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Total
committed heap usage (bytes) = 1251475456
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File Input
Format Counters
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Bytes Read =
118
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File System
Counters
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of bytes read = 0
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of bytes written = 116166
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of read operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of large read operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of write operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of bytes read = 297
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of bytes written = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of read operations = 4
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of large read operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of write operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce
Framework
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map input
records = 1
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
records = 2
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
bytes = 18
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
materialized bytes = 34
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Input split
bytes = 179
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Combine
input records = 0
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Spilled
Records = 2
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Failed
Shuffles = 0
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Merged Map
outputs = 0
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: GC time
elapsed (ms) = 158
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: CPU time
spent (ms) = 2080
2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Physical
memory (bytes) snapshot = 1104715776
2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Virtual
memory (bytes) snapshot = 2900013056
2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Total
committed heap usage (bytes) = 1251475456
2015-04-18 06:48:44,214 INFO [main] org.apache.hadoop.mapred.Task: File path:
/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/output/attempt_1429338752209_0059_m_000004_0/file.out
2015-04-18 06:48:44,223 INFO [main] org.apache.hadoop.mapred.Task: Task
'attempt_1429338752209_0059_m_000004_0' done.
2015-04-18 06:48:44,225 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics
system...
2015-04-18 06:48:44,231 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
stopped.
2015-04-18 06:48:44,233 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
shutdown complete.
{code}
was:
I have noticed repeatedly that the map tasks take a lot of time to startup on
YARN clusters. This is not the scheduling part, this is after the actual
container is launched containing the Map task. Take for example, the sample log
from a mapper of a Pi job that I launched. The command I used to launch the Pi
job was:
{code}
hadoop jar /usr/lib/hadoop/share/hadoop/mapreduce/hadoop*mapreduce*examples*jar
pi 10 100
{code}
This is the sample job from one of the mappers which took 14 seconds to
complete. If you notice from the logs, most of the time taken by this job is
during the start up. I notice that the most mappers take anywhere between 7 to
15 seconds during start up and have seen this behavior consistent across
mapreduce jobs. This really affects the performance of short running mappers.
I run a hadoop2 / yarn cluster on a 4-5 node m1.xlarge cluster, and the mapper
memory is always specified as 2048m and so on.
Log:
{code}
2015-04-18 06:48:34,081 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
hadoop-metrics2.properties
2015-04-18 06:48:34,637 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at
10 second(s).
2015-04-18 06:48:34,637 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
started
2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild:
Executing with tokens:
2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind:
mapreduce.job, Service: job_1429338752209_0059, Ident:
(org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@5d48e5d6)
2015-04-18 06:48:35,391 INFO [main] org.apache.hadoop.mapred.YarnChild:
Sleeping for 0ms before retrying again. Got null now.
2015-04-18 06:48:36,656 INFO [main] org.apache.hadoop.mapred.YarnChild:
mapreduce.cluster.local.dir for child:
/media/ephemeral3/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral1/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral0/yarn/local/usercache/rjain/appcache/application_1429338752209_0059
2015-04-18 06:48:36,657 INFO [main] org.apache.hadoop.mapred.YarnChild:
mapreduce.cluster.reserved.local.dir for child:
2015-04-18 06:48:36,706 INFO [main]
org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
deprecated. Instead, use fs.defaultFS
2015-04-18 06:48:37,387 INFO [main]
org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
deprecated. Instead, use fs.defaultFS
2015-04-18 06:48:39,388 INFO [main]
org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated.
Instead, use dfs.metrics.session-id
2015-04-18 06:48:39,448 INFO [main]
org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
deprecated. Instead, use fs.defaultFS
2015-04-18 06:48:41,060 INFO [main]
org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to
org.apache.hadoop.mapred.Task$TaskReporter@601211d0 comment setting up progress
from Task
2015-04-18 06:48:41,098 INFO [main] org.apache.hadoop.mapred.Task: Using
ResourceCalculatorProcessTree : [ ]
2015-04-18 06:48:41,585 INFO [main] org.apache.hadoop.mapred.MapTask:
Processing split:
hdfs://ec2-54-211-109-245.compute-1.amazonaws.com:9000/user/rjain/QuasiMonteCarlo_1429339685772_504558444/in/part4:0+118
2015-04-18 06:48:43,926 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR)
0 kvi 234881020(939524080)
2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask:
mapreduce.task.io.sort.mb: 896
2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: soft
limit at 657666880
2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart
= 0; bufvoid = 939524096
2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart =
234881020; length = 58720256
2015-04-18 06:48:43,946 INFO [main] org.apache.hadoop.mapred.MapTask: Map
output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: Starting
flush of map output
2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling
map output
2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart
= 0; bufend = 18; bufvoid = 939524096
2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart =
234881020(939524080); kvend = 234881016(939524064); length = 5/58720256
2015-04-18 06:48:44,065 INFO [main]
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext$DirSelector:
Returning directory:
/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/attempt_1429338752209_0059_m_000004_0_spill_0.out
2015-04-18 06:48:44,089 INFO [main] org.apache.hadoop.io.compress.CodecPool:
Got brand-new compressor [.snappy]
2015-04-18 06:48:44,100 INFO [main] org.apache.hadoop.mapred.MapTask: Finished
spill 0
2015-04-18 06:48:44,111 INFO [main] org.apache.hadoop.mapred.Task:
Task:attempt_1429338752209_0059_m_000004_0 is done. And is in the process of
committing
2015-04-18 06:48:44,164 INFO [main]
org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to null
comment clearing progress now that task is done
2015-04-18 06:48:44,191 INFO [main] org.apache.hadoop.mapred.Task: Counters for
attempt_1429338752209_0059_m_000004_0
2015-04-18 06:48:44,203 INFO [main] org.apache.hadoop.mapred.Task: File System
Counters
2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of bytes read = 0
2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of bytes written = 116166
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of read operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of large read operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of write operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of bytes read = 297
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of bytes written = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of read operations = 4
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of large read operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of write operations = 0
2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce
Framework
2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map input
records = 1
2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map output
records = 2
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output
bytes = 18
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output
materialized bytes = 34
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Input split
bytes = 179
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Combine
input records = 0
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Spilled
Records = 2
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Failed
Shuffles = 0
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Merged Map
outputs = 0
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: GC time
elapsed (ms) = 158
2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: CPU time
spent (ms) = 2080
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Physical
memory (bytes) snapshot = 1104715776
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Virtual
memory (bytes) snapshot = 2900013056
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Total
committed heap usage (bytes) = 1251475456
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File Input
Format Counters
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Bytes Read =
118
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File System
Counters
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of bytes read = 0
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of bytes written = 116166
2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of read operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of large read operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number
of write operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of bytes read = 297
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of bytes written = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of read operations = 4
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of large read operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number
of write operations = 0
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce
Framework
2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map input
records = 1
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
records = 2
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
bytes = 18
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
materialized bytes = 34
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Input split
bytes = 179
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Combine
input records = 0
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Spilled
Records = 2
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Failed
Shuffles = 0
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Merged Map
outputs = 0
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: GC time
elapsed (ms) = 158
2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: CPU time
spent (ms) = 2080
2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Physical
memory (bytes) snapshot = 1104715776
2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Virtual
memory (bytes) snapshot = 2900013056
2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Total
committed heap usage (bytes) = 1251475456
2015-04-18 06:48:44,214 INFO [main] org.apache.hadoop.mapred.Task: File path:
/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/output/attempt_1429338752209_0059_m_000004_0/file.out
2015-04-18 06:48:44,223 INFO [main] org.apache.hadoop.mapred.Task: Task
'attempt_1429338752209_0059_m_000004_0' done.
2015-04-18 06:48:44,225 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics
system...
2015-04-18 06:48:44,231 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
stopped.
2015-04-18 06:48:44,233 INFO [main]
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
shutdown complete.
{code}
> Map tasks take a lot of time to start up
> ----------------------------------------
>
> Key: MAPREDUCE-6321
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6321
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Affects Versions: 2.6.0
> Reporter: Rajat Jain
> Priority: Critical
> Labels: performance
>
> I have noticed repeatedly that the map tasks take a lot of time to startup on
> YARN clusters. This is not the scheduling part, this is after the actual
> container is launched containing the Map task. Take for example, the sample
> log from a mapper of a Pi job that I launched. The command I used to launch
> the Pi job was:
> {code}
> hadoop jar
> /usr/lib/hadoop/share/hadoop/mapreduce/hadoop*mapreduce*examples*jar pi 10 100
> {code}
> This is the sample job from one of the mappers which took 14 seconds to
> complete. If you notice from the logs, most of the time taken by this job is
> during the start up. I notice that the most mappers take anywhere between 7
> to 15 seconds during start up and have seen this behavior consistent across
> mapreduce jobs. This really affects the performance of short running mappers.
> I run a hadoop2 / yarn cluster on a 4-5 node m1.xlarge cluster, and the
> mapper memory is always specified as 2048m and so on.
> Log:
> {code}
> 2015-04-18 06:48:34,081 INFO [main]
> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
> hadoop-metrics2.properties
> 2015-04-18 06:48:34,637 INFO [main]
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period
> at 10 second(s).
> 2015-04-18 06:48:34,637 INFO [main]
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
> started
> 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild:
> Executing with tokens:
> 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind:
> mapreduce.job, Service: job_1429338752209_0059, Ident:
> (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@5d48e5d6)
> 2015-04-18 06:48:35,391 INFO [main] org.apache.hadoop.mapred.YarnChild:
> Sleeping for 0ms before retrying again. Got null now.
> 2015-04-18 06:48:36,656 INFO [main] org.apache.hadoop.mapred.YarnChild:
> mapreduce.cluster.local.dir for child:
> /media/ephemeral3/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral1/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral0/yarn/local/usercache/rjain/appcache/application_1429338752209_0059
> 2015-04-18 06:48:36,706 INFO [main]
> org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
> deprecated. Instead, use fs.defaultFS
> 2015-04-18 06:48:37,387 INFO [main]
> org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
> deprecated. Instead, use fs.defaultFS
> 2015-04-18 06:48:39,388 INFO [main]
> org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated.
> Instead, use dfs.metrics.session-id
> 2015-04-18 06:48:39,448 INFO [main]
> org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is
> deprecated. Instead, use fs.defaultFS
> 2015-04-18 06:48:41,060 INFO [main]
> org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to
> org.apache.hadoop.mapred.Task$TaskReporter@601211d0 comment setting up
> progress from Task
> 2015-04-18 06:48:41,098 INFO [main] org.apache.hadoop.mapred.Task: Using
> ResourceCalculatorProcessTree : [ ]
> 2015-04-18 06:48:41,585 INFO [main] org.apache.hadoop.mapred.MapTask:
> Processing split:
> hdfs://ec2-54-211-109-245.compute-1.amazonaws.com:9000/user/rjain/QuasiMonteCarlo_1429339685772_504558444/in/part4:0+118
> 2015-04-18 06:48:43,926 INFO [main] org.apache.hadoop.mapred.MapTask:
> (EQUATOR) 0 kvi 234881020(939524080)
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask:
> mapreduce.task.io.sort.mb: 896
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: soft
> limit at 657666880
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask:
> bufstart = 0; bufvoid = 939524096
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart
> = 234881020; length = 58720256
> 2015-04-18 06:48:43,946 INFO [main] org.apache.hadoop.mapred.MapTask: Map
> output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask:
> Starting flush of map output
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask:
> Spilling map output
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask:
> bufstart = 0; bufend = 18; bufvoid = 939524096
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart
> = 234881020(939524080); kvend = 234881016(939524064); length = 5/58720256
> 2015-04-18 06:48:44,065 INFO [main]
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext$DirSelector:
> Returning directory:
> /media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/attempt_1429338752209_0059_m_000004_0_spill_0.out
> 2015-04-18 06:48:44,089 INFO [main] org.apache.hadoop.io.compress.CodecPool:
> Got brand-new compressor [.snappy]
> 2015-04-18 06:48:44,100 INFO [main] org.apache.hadoop.mapred.MapTask:
> Finished spill 0
> 2015-04-18 06:48:44,111 INFO [main] org.apache.hadoop.mapred.Task:
> Task:attempt_1429338752209_0059_m_000004_0 is done. And is in the process of
> committing
> 2015-04-18 06:48:44,164 INFO [main]
> org.apache.hadoop.fs.s3native.NativeS3FileSystem: setting Progress to null
> comment clearing progress now that task is done
> 2015-04-18 06:48:44,191 INFO [main] org.apache.hadoop.mapred.Task: Counters
> for attempt_1429338752209_0059_m_000004_0
> 2015-04-18 06:48:44,203 INFO [main] org.apache.hadoop.mapred.Task: File
> System Counters
> 2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of bytes read = 0
> 2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of bytes written = 116166
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of read operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of large read operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of write operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of bytes read = 297
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of bytes written = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of read operations = 4
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of large read operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of write operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce
> Framework
> 2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map input
> records = 1
> 2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map output
> records = 2
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output
> bytes = 18
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output
> materialized bytes = 34
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Input
> split bytes = 179
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Combine
> input records = 0
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Spilled
> Records = 2
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Failed
> Shuffles = 0
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Merged Map
> outputs = 0
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: GC time
> elapsed (ms) = 158
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: CPU time
> spent (ms) = 2080
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Physical
> memory (bytes) snapshot = 1104715776
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Virtual
> memory (bytes) snapshot = 2900013056
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Total
> committed heap usage (bytes) = 1251475456
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File Input
> Format Counters
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Bytes Read
> = 118
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File
> System Counters
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of bytes read = 0
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of bytes written = 116166
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of read operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of large read operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE:
> Number of write operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of bytes read = 297
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of bytes written = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of read operations = 4
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of large read operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS:
> Number of write operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce
> Framework
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map input
> records = 1
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
> records = 2
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
> bytes = 18
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output
> materialized bytes = 34
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Input
> split bytes = 179
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Combine
> input records = 0
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Spilled
> Records = 2
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Failed
> Shuffles = 0
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Merged Map
> outputs = 0
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: GC time
> elapsed (ms) = 158
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: CPU time
> spent (ms) = 2080
> 2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Physical
> memory (bytes) snapshot = 1104715776
> 2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Virtual
> memory (bytes) snapshot = 2900013056
> 2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Total
> committed heap usage (bytes) = 1251475456
> 2015-04-18 06:48:44,214 INFO [main] org.apache.hadoop.mapred.Task: File path:
> /media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/output/attempt_1429338752209_0059_m_000004_0/file.out
> 2015-04-18 06:48:44,223 INFO [main] org.apache.hadoop.mapred.Task: Task
> 'attempt_1429338752209_0059_m_000004_0' done.
> 2015-04-18 06:48:44,225 INFO [main]
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics
> system...
> 2015-04-18 06:48:44,231 INFO [main]
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
> stopped.
> 2015-04-18 06:48:44,233 INFO [main]
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system
> shutdown complete.
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)