[
https://issues.apache.org/jira/browse/HADOOP-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579343#action_12579343
]
Amar Kamat commented on HADOOP-2119:
------------------------------------
The attached patch does the following:
Maps :
1) Replaces {{ArrayList}} with {{LinkedList}} for the currently used caches
(call them the *NR* caches).
2) Failed TIPs are added (when possible) at the front of the *NR* caches. [for
fail-early]
3) Removal of a TIP from the *NR* caches is on demand, i.e. running/non-runnable
TIPs are removed while searching for a new TIP.
4) Maintains a new set of caches, the *R* caches, for running TIPs. These
caches are similar to the *NR* caches but provide faster removal. Additions to
the caches are appends. Removal is one shot, i.e. a non-running TIP is removed
from all the *R* caches at once. [for speculation] (See the cache sketch after
the Reduces list below.)
Reduces :
1) Maintains a {{LinkedList}} of non-running reducers, i.e. the *NR* cache. [for
non-running tasks]
2) Failed reducers are added to the front of the *NR* cache. [for fail-early]
3) Maintains a set of running reducers with faster removal. [for
speculation]
----
Also,
1) Search preference is as follows: {{FAILED}}, {{NON-RUNNING}}, {{RUNNING}}.
2) Search order is as follows (a lookup sketch follows at the end of this list):
{noformat}
1. Search the local cache, i.e. strong locality.
2. Search bottom-up (i.e. from the node's parent to the node's top-level
ancestor) for a TIP, i.e. weak locality.
3. Search breadth-wise across top-level ancestors for a TIP, i.e. for a
non-local TIP.
{noformat}
3) Introduces a _default-node_. TIPs that are not local to any node are treated
as local to the _default-node_. This node takes care of random-writer-like
cases, i.e. it adapts jobs with no data locality to the cache structure. The
_default-node_ belongs to the _default-rack_, and hence all the nodes share the
non-local TIPs through the _default-rack_.
4) The JobTracker need not be synchronized for providing reports to the
JobClient, hence these APIs don't lock the JT. Some staleness is okay (see the
snapshot sketch at the end of this list).
5) Commits are now done in batches of a fixed number of tasks at a time
(default 5000), so at most 5000 tasks are batch-committed in one go. The reason
for this fixed-size batching is that committing too many TIPs at once locks the
JobTracker for a very long time, causing *lost rpc/tracker* issues. (A batching
sketch follows at the end of this list.)
6) TIPs use the tracker's hostname instead of the tracker name for maintaining
the list of machines where the TIP failed.
7) One major bottleneck we observed was in {{JobInProgress.isJobComplete()}},
where all the TIPs were scanned. This is expensive because {{isJobComplete()}}
is called once for every completed/failed task (via the {{TaskCommit}} thread),
which adds up for jobs with a large number of maps. Now this check uses the
counts of finished TIPs (sketched at the end of this list).
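For (2) and (3), here is a rough sketch of the lookup order, reusing the
{{NonRunningCache}}/{{TIP}} stand-ins from the earlier sketch. The {{Node}}
interface below is hypothetical and not the patch's actual API.
{noformat}
import java.util.Collection;

// Sketch only: locality-aware search from a tracker's node outwards.
class LocalitySearch {
  interface Node {
    Node getParent();                  // null once past the top-level ancestor
    NonRunningCache getCache();        // NR cache attached to this node/rack
  }

  NonRunningCache.TIP findTaskFor(Node trackerNode,
                                  Collection<Node> topLevelNodes) {
    // 1. Strong locality: the tracker's own node.
    NonRunningCache.TIP tip = trackerNode.getCache().findNextRunnable();
    if (tip != null) return tip;

    // 2. Weak locality: walk bottom-up from the node's parent to its
    //    top-level ancestor (rack and above).
    for (Node n = trackerNode.getParent(); n != null; n = n.getParent()) {
      tip = n.getCache().findNextRunnable();
      if (tip != null) return tip;
    }

    // 3. Non-local: search breadth-wise across all top-level ancestors.
    //    TIPs local to no real node hang off default-node/default-rack, so
    //    every tracker picks them up in this pass (the random-writer case).
    for (Node top : topLevelNodes) {
      tip = top.getCache().findNextRunnable();
      if (tip != null) return tip;
    }
    return null;
  }
}
{noformat}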
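For (4), one way to picture the lock-free reporting is a volatile snapshot that
the report RPCs read without taking the JobTracker lock. This only illustrates
the "some staleness is okay" idea; it is not necessarily how the patch
implements it, and the names are made up.
{noformat}
// Sketch only: reporting without holding the JobTracker lock.
class JobStatusHolder {
  // Immutable snapshot of whatever the JobClient asks for.
  static final class Snapshot {
    final float mapProgress;
    final float reduceProgress;
    Snapshot(float mapProgress, float reduceProgress) {
      this.mapProgress = mapProgress;
      this.reduceProgress = reduceProgress;
    }
  }

  // volatile gives safe publication; readers may see a slightly stale
  // snapshot, which is fine for status reports.
  private volatile Snapshot latest = new Snapshot(0f, 0f);

  // Called from the (locked) scheduling/commit paths.
  void publish(float mapProgress, float reduceProgress) {
    latest = new Snapshot(mapProgress, reduceProgress);
  }

  // Called from the report RPCs; no JobTracker lock is taken.
  Snapshot report() {
    return latest;
  }
}
{noformat}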
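For (5), the fixed-size batching can be sketched as draining at most 5000
pending commits per lock acquisition. All names here are hypothetical.
{noformat}
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch only: commit completed tasks in fixed-size batches so the
// JobTracker lock is never held across one huge commit.
class BatchCommitter {
  interface Task { void commit(); }

  static final int MAX_BATCH = 5000;           // default batch size

  private final Object jobTrackerLock = new Object();
  private final Queue<Task> pendingCommits = new ConcurrentLinkedQueue<Task>();

  void enqueue(Task t) {
    pendingCommits.offer(t);
  }

  void commitPending() {
    boolean more = true;
    while (more) {
      synchronized (jobTrackerLock) {
        // Commit at most MAX_BATCH tasks per lock acquisition.
        int committed = 0;
        Task t;
        while (committed < MAX_BATCH && (t = pendingCommits.poll()) != null) {
          t.commit();
          committed++;
        }
        more = !pendingCommits.isEmpty();
      }
      // Lock released here, so heartbeats and client RPCs get a chance
      // to run before the next batch is committed.
    }
  }
}
{noformat}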
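And for (7), the scan over all TIPs is replaced by counts of finished TIPs;
roughly as below, with hypothetical fields rather than the actual
{{JobInProgress}} code.
{noformat}
// Sketch only: O(1) job-completion check from running counts of finished
// TIPs instead of scanning every TIP on each completed/failed task.
class CompletionCheck {
  private final int numMapTasks;
  private final int numReduceTasks;
  private int finishedMaps;        // bumped when a map TIP completes
  private int finishedReduces;     // bumped when a reduce TIP completes
  private boolean failed;          // set when the job is failed/killed

  CompletionCheck(int numMapTasks, int numReduceTasks) {
    this.numMapTasks = numMapTasks;
    this.numReduceTasks = numReduceTasks;
  }

  synchronized void mapFinished()    { finishedMaps++; }
  synchronized void reduceFinished() { finishedReduces++; }
  synchronized void jobFailed()      { failed = true; }

  // Previously isJobComplete() walked every TIP on each call from the
  // TaskCommit thread; with counts it is a constant-time comparison.
  synchronized boolean isJobComplete() {
    return failed
        || (finishedMaps == numMapTasks && finishedReduces == numReduceTasks);
  }
}
{noformat}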
> JobTracker becomes non-responsive if the task trackers finish task too fast
> ---------------------------------------------------------------------------
>
> Key: HADOOP-2119
> URL: https://issues.apache.org/jira/browse/HADOOP-2119
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.16.0
> Reporter: Runping Qi
> Assignee: Amar Kamat
> Priority: Critical
> Fix For: 0.17.0
>
> Attachments: HADOOP-2119-v4.1.patch, hadoop-2119.patch,
> hadoop-jobtracker-thread-dump.txt
>
>
> I ran a job with 0 reducers on a cluster with 390 nodes.
> The mappers ran very fast.
> The jobtracker lags behind on committing completed mapper tasks.
> The number of running mappers displayed on the web UI kept getting bigger and
> bigger.
> The job tracker eventually stopped responding to the web UI.
> No progress is reported afterwards.
> Job tracker is running on a separate node.
> The job tracker process consumed 100% cpu, with vm size 1.01g (reaching the
> heap space limit).