[ https://issues.apache.org/jira/browse/TEZ-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508247#comment-14508247 ]
Bikas Saha commented on TEZ-1897:
---------------------------------

The last patch builds on the previous patch to actually use the concurrent dispatcher to run Task and TaskAttempt events concurrently. There is a configuration to turn this on or off, and when it is turned off the code runs exactly the same path as it does today, so this change is very safe.

In order to keep things sane, events for a given Task and its attempts are serialized on the same thread. This is done by using a serializing hash determined from the TezTaskID. Different tasks run on different threads. That takes care of a lot of locking issues.

Next, Vertex now has a reference to its DAG, Task has a reference to its Vertex, and Attempt has references to its Task and Vertex. This removes the unnecessary locking and delays that occur when these objects are fetched through the AppContext (get the dag/vertex/task, then look up into their internal maps). This change would be beneficial in general, since it reduces lock contention compared to today.

Added a simulation test that runs 50000 tasks at a concurrency of 1000; it runs up to 30% faster with the change than without.

The patch has the config turned on for patch test execution. It will be off by default and is marked private, so only advanced users can try it on large clusters where we can get 10-20K concurrently running tasks.

Please review.

> Create a concurrent version of AsyncDispatcher
> ----------------------------------------------
>
>                 Key: TEZ-1897
>                 URL: https://issues.apache.org/jira/browse/TEZ-1897
>             Project: Apache Tez
>          Issue Type: Task
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>         Attachments: TEZ-1897.1.patch, TEZ-1897.2.patch, TEZ-1897.3.patch, TEZ-1897.4.patch, TEZ-1897.5.patch, TEZ-1897.5.patch
>
>
> Currently, it processes events on a single thread. For events that can be executed in parallel, e.g. vertex manager events, allowing higher concurrency may be beneficial.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
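
For readers unfamiliar with the approach, the per-task serialization described in the comment can be illustrated roughly as follows. This is a minimal sketch, not the code in the attached patches: the class name SerializingDispatcherSketch, its methods, and the use of a plain int as the serializing hash are all assumptions made for illustration. In the patch the hash is derived from the TezTaskID; here any int stands in for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of hash-based event serialization; not the Tez implementation. */
class SerializingDispatcherSketch {
    private final List<ExecutorService> workers = new ArrayList<>();

    SerializingDispatcherSketch(int numThreads) {
        for (int i = 0; i < numThreads; i++) {
            // One single-threaded executor per slot: events in a slot run in FIFO order.
            workers.add(Executors.newSingleThreadExecutor());
        }
    }

    /** Maps a serializing hash to a worker slot; floorMod keeps the index non-negative. */
    int slotFor(int serializingHash) {
        return Math.floorMod(serializingHash, workers.size());
    }

    /** Events with the same hash always land on the same thread, so they never race. */
    void dispatch(int serializingHash, Runnable handler) {
        workers.get(slotFor(serializingHash)).execute(handler);
    }

    /** Stops accepting events and waits for the queued ones to finish. */
    boolean shutdownAndAwait(long timeoutSeconds) throws InterruptedException {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
        boolean done = true;
        for (ExecutorService w : workers) {
            done &= w.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        }
        return done;
    }
}
```

Because every event for a given task (and its attempts) carries the same hash, those events are handled in submission order on one thread and need no locking against each other, while events for tasks that hash to different slots proceed in parallel.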