[
https://issues.apache.org/jira/browse/MAPREDUCE-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459713#comment-13459713
]
Kihwal Lee commented on MAPREDUCE-4662:
---------------------------------------
bq. One solution is to specify maximum number of queued requests for
LinkedBlockingQueue.
That could be it, but that solution needs more changes. When the queue is full
and the maximum number of threads are running, new tasks will be rejected. We
could apply CallerRunsPolicy, but the whole point of having a ThreadPoolExecutor
is to avoid blocking the JobTracker while doing job completion.
I think the main requirements here are:
* Absorb bursty job completions - either queueing with sufficient capacity or
fast dispatching with a large thread pool.
* Avoid limiting job throughput - a sufficient number of worker threads.
* Minimize consumption of extra resources - limit the number of worker threads.
* Don't drop anything.
To satisfy the first and second requirements, one can think of the following
two approaches.
* Have a bounded queue and a sufficiently large thread pool. Since we cannot
drop any job completion, we want CallerRunsPolicy for rejected ones.
* Alternatively, use an unbounded queue and a reasonable number of core
threads. No work will be rejected in this case.
Between the two, the second one has the advantage, considering the third
requirement and its simplicity. The question is, what is a reasonable number of
core threads to avoid lagging behind forever? Based on our experience, 3 to 5
seems reasonable. The moveToDone() throughput varies a lot, but it topped out at
around 0.8/second in one of the busiest clusters I've seen. If the job
completion rate stays above this for a long time, the queue will grow and
history won't show up for most newer jobs.
Here are the two approaches in code:
* The queue is bounded but will absorb bursts of about 100 jobs. If the core
thread cannot keep up, up to 9 more threads (10 total) will be created to help
it drain the queue. If the queue still cannot be drained fast enough, the
caller will execute the work directly. This will block the JobTracker, since
JobTracker#finalizeJob() is a synchronized method, so the thread pool size and
the queue size must be sufficiently large.
{noformat}
executor = new ThreadPoolExecutor(1, 10, 1, TimeUnit.HOURS,
    new LinkedBlockingQueue<Runnable>(100),
    new ThreadPoolExecutor.CallerRunsPolicy());
{noformat}
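To make the CallerRunsPolicy behavior concrete, here is a minimal, runnable sketch (the class name, latch, and tiny pool/queue sizes are just for illustration, not from the patch): with the single worker busy and the queue full, the next submission is rejected and runs on the submitting thread, which is exactly the blocking we want to keep rare.

```java
import java.util.concurrent.*;

public class CallerRunsDemo {
    public static void main(String[] args) throws Exception {
        // One worker thread, a queue of capacity 1, and CallerRunsPolicy.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
            1, 1, 1, TimeUnit.HOURS,
            new LinkedBlockingQueue<Runnable>(1),
            new ThreadPoolExecutor.CallerRunsPolicy());

        CountDownLatch block = new CountDownLatch(1);
        // Task 1 occupies the only worker thread.
        executor.execute(() -> {
            try { block.await(); } catch (InterruptedException ignored) {}
        });
        // Task 2 fills the queue.
        executor.execute(() -> {});
        // Task 3: pool at max, queue full -> rejected -> runs on the caller.
        executor.execute(() ->
            System.out.println("ran on " + Thread.currentThread().getName()));

        block.countDown();
        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

With the sizes in the snippet above (queue of 100, 10 threads), this fallback only kicks in under extreme bursts, but when it does, the caller is the JobTracker itself.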
* The following will eventually start up 5 threads and keep them running. It is
non-blocking and requires the least amount of change.
{noformat}
executor = new ThreadPoolExecutor(5, 5, 1, TimeUnit.HOURS,
    new LinkedBlockingQueue<Runnable>());
{noformat}
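A quick sketch of why this variant never rejects work, and also of the behavior the bug report below describes (the class name and latch are illustrative only): with an unbounded LinkedBlockingQueue, offer() always succeeds, so the pool never grows past its core size and every extra task simply queues.

```java
import java.util.concurrent.*;

public class UnboundedQueueDemo {
    public static void main(String[] args) throws Exception {
        // Unbounded queue: the pool never grows past core size, so the
        // maximum-pool-size argument (here equal to core) is effectively moot.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
            5, 5, 1, TimeUnit.HOURS, new LinkedBlockingQueue<Runnable>());

        CountDownLatch block = new CountDownLatch(1);
        for (int i = 0; i < 20; i++) {
            // The first 5 tasks each start a core thread; the rest just queue.
            executor.execute(() -> {
                try { block.await(); } catch (InterruptedException ignored) {}
            });
        }
        System.out.println("pool size: " + executor.getPoolSize());
        System.out.println("queued: " + executor.getQueue().size());

        block.countDown();
        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

Nothing is ever rejected here; the trade-off is that a sustained overload shows up as queue growth rather than as backpressure.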
What do you think is better? Or can you think of any better approaches?
> JobHistoryFilesManager thread pool never expands
> ------------------------------------------------
>
> Key: MAPREDUCE-4662
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4662
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobhistoryserver
> Affects Versions: 1.0.2
> Reporter: Thomas Graves
>
> The job history file manager creates a threadpool with core size 1 thread,
> max pool size 3. It never goes beyond 1 thread though, because it's using a
> LinkedBlockingQueue which doesn't have a max size.
> void start() {
> executor = new ThreadPoolExecutor(1, 3, 1,
> TimeUnit.HOURS, new LinkedBlockingQueue<Runnable>());
> }
> According to the ThreadPoolExecutor javadoc, it only increases the
> number of threads when the queue is full. Since the queue we are using has no
> max size, it never fills up and we never get more than 1 thread.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira