[
https://issues.apache.org/jira/browse/MAPREDUCE-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868865#action_12868865
]
dhruba borthakur commented on MAPREDUCE-1797:
---------------------------------------------
We are seeing a scalability bottleneck in our 1600-node map-reduce cluster.
A job is submitted to the JT. The JT spawns a thread A to read in the splits
from HDFS. Thread A holds the JobInProgress lock while it reads the splits
file. While this is happening, another client issues a getCounter() call to
the JT. The getCounter RPC is handled by a thread B, which first acquires the
JT lock and then blocks while trying to acquire the JobInProgress lock.
Because thread B still holds the JT lock, the entire JT blocks until thread A
releases the JobInProgress lock.
One suggestion is to change the JobTracker lock into a reader/writer lock.
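A minimal sketch of what that could look like, assuming the global lock is
replaced by a java.util.concurrent ReentrantReadWriteLock. The class and
method names here (TrackerLockSketch, JobState, submitJob, incrementCounter)
are hypothetical stand-ins for illustration and are not the actual JobTracker
or JobInProgress code. Read-only RPCs such as getCounter take the shared read
lock, so one reader stuck on a per-job lock does not stall other readers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the JobTracker; not the real Hadoop class.
class TrackerLockSketch {

    // Hypothetical stand-in for JobInProgress: per-job state guarded by
    // the object's own monitor (the "per-job lock").
    static final class JobState {
        private long counter;
        synchronized void increment() { counter++; }
        synchronized long read() { return counter; }
    }

    // Replaces the single global monitor with a reader/writer lock.
    private final ReentrantReadWriteLock trackerLock = new ReentrantReadWriteLock();
    private final Map<String, JobState> jobs = new HashMap<>();

    // Changes the job table, so it needs the exclusive write lock.
    void submitJob(String jobId) {
        trackerLock.writeLock().lock();
        try {
            jobs.put(jobId, new JobState());
        } finally {
            trackerLock.writeLock().unlock();
        }
    }

    // Only looks up the job, so the shared read lock is enough.
    void incrementCounter(String jobId) {
        trackerLock.readLock().lock();
        try {
            lookup(jobId).increment();
        } finally {
            trackerLock.readLock().unlock();
        }
    }

    // Same lock order as described above (tracker lock, then job lock),
    // but the tracker lock is held in shared mode. If the job lock is
    // busy, this thread waits without blocking other readers.
    long getCounter(String jobId) {
        trackerLock.readLock().lock();
        try {
            return lookup(jobId).read();
        } finally {
            trackerLock.readLock().unlock();
        }
    }

    // Caller must hold the read or write lock.
    private JobState lookup(String jobId) {
        JobState job = jobs.get(jobId);
        if (job == null) {
            throw new IllegalArgumentException("unknown job: " + jobId);
        }
        return job;
    }
}
```

Note that a thread waiting for the write lock (for example a new job
submission) would still wait behind a reader that is blocked on a per-job
lock, so this reduces the stall rather than removing it.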
> The JobTracker lock can be a reader/writer lock
> -----------------------------------------------
>
> Key: MAPREDUCE-1797
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1797
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: jobtracker
> Reporter: dhruba borthakur
>
> The JobTracker has a global lock and a per-job JobInProgress lock. The aim
> of the JobInProgress lock is to support the ability to lock a single job's
> metadata without blocking out the entire JobTracker. However, many code
> paths acquire the JobTracker lock and then acquire the JobInProgress lock
> while still holding the JobTracker lock. This largely defeats the benefit
> of having a per-job lock.