[
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960478#comment-14960478
]
Varun Saxena commented on MAPREDUCE-6513:
-----------------------------------------
The headroom is not very high (it sometimes comes back as 0 in the response
too) because other heavy apps are running. We notice that ramp up always
happens and ramp down never does, so reducers are scheduled too aggressively.
As can be seen below, there is no ramp down (except the first time, when 651
scheduled reduces were ramped down), while ramp up keeps happening.
{noformat}
2015-10-13 04:36:53,038 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:42,132 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:651
2015-10-13 04:53:43,135 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:44,137 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:45,140 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:46,143 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:47,146 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:48,149 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:49,152 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:50,155 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:51,158 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:52,161 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:53,164 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:54,167 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:55,170 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:56,181 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:57,184 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:58,187 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:53:59,190 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:00,193 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:01,205 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:02,208 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:03,211 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:04,213 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:05,216 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:06,219 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:07,221 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:08,225 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:09,228 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:10,231 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:11,235 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:12,239 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:13,242 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:14,245 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:15,248 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:16,276 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:17,280 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:18,283 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:19,286 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:20,289 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:21,292 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:22,295 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:23,298 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:24,301 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:25,304 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:54:26,307 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
{noformat}
{noformat}
2015-10-13 04:37:39,685 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: All maps assigned.
Ramping up all remaining reduces:651
2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 10
2015-10-13 04:55:05,923 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 5
2015-10-13 04:55:06,929 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 5
2015-10-13 04:55:07,945 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 1
2015-10-13 04:55:12,031 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 10
2015-10-13 04:55:13,053 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 12
2015-10-13 04:55:14,061 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 10
2015-10-13 04:55:16,075 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 2
2015-10-13 04:55:17,092 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 1
2015-10-13 04:55:20,147 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 1
2015-10-13 04:55:21,165 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 10
2015-10-13 04:55:22,175 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 2
2015-10-13 04:55:23,184 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 5
2015-10-13 04:55:24,197 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 1
2015-10-13 04:55:29,299 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 8
2015-10-13 04:55:30,311 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 15
2015-10-13 04:55:31,320 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 10
2015-10-13 04:55:32,327 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 1
2015-10-13 04:55:43,496 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 1
2015-10-13 04:55:44,509 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 4
2015-10-13 04:55:45,521 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 5
2015-10-13 04:55:46,530 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 4
2015-10-13 04:55:47,543 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 2
2015-10-13 04:55:57,680 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 1
2015-10-13 04:55:58,698 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 6
2015-10-13 04:55:59,715 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 5
2015-10-13 04:56:00,721 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 6
2015-10-13 04:56:05,795 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 2
2015-10-13 04:56:07,820 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 1
2015-10-13 04:56:08,831 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 2
2015-10-13 04:56:09,841 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 2
2015-10-13 04:56:10,853 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 2
2015-10-13 04:56:22,018 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 15
2015-10-13 04:56:23,036 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 2
2015-10-13 04:56:24,043 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 6
2015-10-13 04:56:29,114 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 2
2015-10-13 04:56:31,138 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 4
2015-10-13 04:56:32,148 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 3
2015-10-13 04:56:33,157 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 3
2015-10-13 04:56:45,328 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 6
2015-10-13 04:56:46,349 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 2
2015-10-13 04:56:47,356 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 3
2015-10-13 04:56:57,499 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 8
2015-10-13 04:56:58,514 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 7
2015-10-13 04:56:59,521 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 10
{noformat}
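For anyone who wants to tally these entries over a full AM log, here is a
minimal sketch in Python. It assumes the two message formats shown in the
excerpts above ("Ramping down all scheduled reduces:N" and "Ramping up N") and
that entries may be wrapped across lines as they are here. The names
summarize_ramps and SAMPLE are hypothetical helpers for illustration, not part
of Hadoop.

```python
import re

# A few wrapped entries copied from the excerpts above. A real run would read
# the AM log file instead (assumption: same message formats).
SAMPLE = """\
2015-10-13 04:53:42,132 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:651
2015-10-13 04:53:43,135 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all
scheduled reduces:0
2015-10-13 04:55:04,912 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 10
2015-10-13 04:55:05,923 INFO [RMCommunicator Allocator]
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping up 5
"""

RAMP_DOWN = re.compile(r"Ramping down all scheduled reduces:(\d+)")
# Matches only "Ramping up <number>"; the one-off
# "Ramping up all remaining reduces:N" message is deliberately not counted.
RAMP_UP = re.compile(r"Ramping up (\d+)")


def summarize_ramps(log_text):
    """Count ramp-up and ramp-down messages and the reduces they involve."""
    # Collapse all whitespace so messages wrapped across lines match again.
    flat = " ".join(log_text.split())
    downs = [int(n) for n in RAMP_DOWN.findall(flat)]
    ups = [int(n) for n in RAMP_UP.findall(flat)]
    return {
        "ramp_down_events": len(downs),
        "nonzero_ramp_downs": sum(1 for n in downs if n > 0),
        "reduces_ramped_down": sum(downs),
        "ramp_up_events": len(ups),
        "reduces_ramped_up": sum(ups),
    }


if __name__ == "__main__":
    print(summarize_ramps(SAMPLE))
```

A large ramp_up_events count next to a nonzero_ramp_downs count of 1 is the
pattern described in this comment.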
> MR job got hanged forever when one NM unstable for some time
> ------------------------------------------------------------
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: applicationmaster, resourcemanager
> Affects Versions: 2.7.0
> Reporter: Bob
> Assignee: Varun Saxena
> Priority: Critical
>
> While a job with many tasks was in progress, one node became unstable
> due to some OS issue. After the node became unstable, the maps on this node
> changed to the KILLED state.
> The maps which were running on the unstable node are rescheduled, and all
> are in the scheduled state, waiting for the RM to assign containers. Ask
> requests for maps were seen until the node was good again (all of those
> failed); there are no ask requests after this. But the AM keeps on
> preempting the reducers (it is recycling them).
> Finally, the reducers are waiting for the mappers to complete and the
> mappers did not get containers.
> My question is:
> ============
> Why did the AM not send map requests once the node had recovered?