[ https://issues.apache.org/jira/browse/TEZ-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508247#comment-14508247 ]
Bikas Saha commented on TEZ-1897:
---------------------------------
The latest patch builds on the previous one and actually uses the concurrent
dispatcher to run Task and TaskAttempt events concurrently. A configuration
option turns this on or off. When it is off, the code follows exactly the same
path as it does today, so the change is safe with the option disabled.
To keep things sane, events for a given Task and its attempts are serialized
on the same thread, using a serializing hash derived from the TezTaskID.
Different tasks can run on different threads. That takes care of a lot of
locking issues. In addition, Vertex holds a direct reference to its DAG, Task
to its Vertex, and Attempt to its Task and Vertex. This avoids the locking and
delays incurred when these objects are fetched through the AppContext (get the
dag/vertex/task etc., then look each one up in its parent's internal map).
This change should be beneficial in general, since it reduces lock contention
compared to today.
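For reviewers who want a picture of the serializing-hash idea, here is a
minimal sketch. It is not the code in the patch: the class
HashedDispatcherSketch, its methods, and the "task_N" string keys are all
made up for illustration, and String.hashCode() stands in for whatever hash
the patch derives from the TezTaskID. Each lane is a single-threaded
executor, so two events with the same hash always land on the same thread in
submission order, while events with different hashes may run in parallel.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; not the dispatcher from the TEZ-1897 patch.
final class HashedDispatcherSketch {
    // One single-threaded executor per lane; each lane runs its events in FIFO order.
    private final ExecutorService[] lanes;

    HashedDispatcherSketch(int numLanes) {
        lanes = new ExecutorService[numLanes];
        for (int i = 0; i < numLanes; i++) {
            lanes[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Maps a serializing hash to a lane index; floorMod keeps negative hashes in range.
    int laneFor(int serializingHash) {
        return Math.floorMod(serializingHash, lanes.length);
    }

    // Events that share a hash are queued on the same lane, hence the same thread.
    void dispatch(int serializingHash, Runnable handler) {
        lanes[laneFor(serializingHash)].execute(handler);
    }

    // Stops accepting events and waits for the queued ones to drain.
    boolean shutdownAndWait(long timeoutMillis) {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
        boolean allDone = true;
        try {
            for (ExecutorService lane : lanes) {
                allDone &= lane.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return allDone;
    }

    // Dispatches interleaved events for several hypothetical tasks and records,
    // per task, each event's sequence number and the thread that handled it.
    static Map<String, List<String>> runDemo(int numLanes, int numTasks, int eventsPerTask) {
        HashedDispatcherSketch dispatcher = new HashedDispatcherSketch(numLanes);
        Map<String, List<String>> logs = new ConcurrentHashMap<>();
        for (int t = 0; t < numTasks; t++) {
            logs.put("task_" + t, new CopyOnWriteArrayList<>());
        }
        for (int seq = 0; seq < eventsPerTask; seq++) {
            for (int t = 0; t < numTasks; t++) {
                final String taskKey = "task_" + t;
                final int eventSeq = seq;
                // Task and attempt events would both use the owning task's hash.
                dispatcher.dispatch(taskKey.hashCode(), () ->
                    logs.get(taskKey).add(eventSeq + "@" + Thread.currentThread().getName()));
            }
        }
        if (!dispatcher.shutdownAndWait(10_000)) {
            throw new IllegalStateException("events did not drain in time");
        }
        return logs;
    }

    public static void main(String[] args) {
        Map<String, List<String>> logs = runDemo(4, 8, 3);
        logs.forEach((task, log) -> System.out.println(task + " -> " + log));
    }
}
```

In this sketch each task's log shows its events in order and all handled by
one thread, which is why per-task state needs no extra locking. Tasks whose
hashes collide modulo the lane count simply share a thread.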
Added a simulation test that runs 50000 tasks at a concurrency of 1000. It
runs up to 30% faster with the change than without.
The patch has the config turned on so that the patch test run exercises it.
It will be off by default and is marked private, so that only advanced users
try it, on large clusters where 10-20K tasks can be running concurrently.
Please review.
> Create a concurrent version of AsyncDispatcher
> ----------------------------------------------
>
> Key: TEZ-1897
> URL: https://issues.apache.org/jira/browse/TEZ-1897
> Project: Apache Tez
> Issue Type: Task
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Attachments: TEZ-1897.1.patch, TEZ-1897.2.patch, TEZ-1897.3.patch,
> TEZ-1897.4.patch, TEZ-1897.5.patch, TEZ-1897.5.patch
>
>
> Currently, AsyncDispatcher processes events on a single thread. For events
> that can be executed in parallel, e.g. vertex manager events, allowing higher
> concurrency may be beneficial.