[
https://issues.apache.org/jira/browse/TEZ-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265147#comment-14265147
]
Siddharth Seth commented on TEZ-1897:
-------------------------------------
Snippet from TEZ-1867
bq. Ideally we should not have cross-locks (and we mostly don't) in a
messaging-based system like ours. There are many cases where large numbers of
some events (e.g. task attempt events during job start for a large map) prevent
other events from being processed in a timely manner (e.g. vertex events during
the backlog of attempt events). So we should definitely try out offloading
events to separate handlers. The good thing is that if there are many issues
after that, it's an easy change to revert after this patch: only the setup
of the dispatchers in DAGAppMaster needs to change. Given that, trying it out
seems reasonable cost-wise.
There are bottlenecks in running everything on a single thread, but I don't
think they add up to much time: likely under 1-2 seconds for large queries.
This is worth exploring, but it will likely introduce a lot of races that
won't be caught easily, and it may not be worth the effort for a 1-2 second
gain. Agreed that keeping the option to always fall back to a single thread
is definitely required.
It would be useful to have an overview of the dispatcher-related changes: how
this relates to TEZ-1914, for example, and whether the same mechanism will be
used to move committers off the central dispatcher. Is affinity required?
Within a single vertex, for example, events should not be processed out of
order. Just setting up multiple handler threads doesn't help this case.
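To make the affinity question concrete, here is a minimal sketch of one way to
get per-vertex ordering with more than one handler thread. It is not the Tez or
YARN AsyncDispatcher API: the class AffinityDispatcher, its methods, and the use
of a string affinity key (e.g. a vertex id) are all hypothetical names chosen
for illustration. Each key is hashed to a fixed single-threaded lane, so events
with the same key run in submission order while different keys may run in
parallel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch only; not the Tez/YARN AsyncDispatcher API. */
final class AffinityDispatcher {
    // One single-threaded executor per lane; each lane runs its tasks in FIFO order.
    private final List<ExecutorService> lanes = new ArrayList<>();

    AffinityDispatcher(int numLanes) {
        if (numLanes < 1) {
            throw new IllegalArgumentException("numLanes must be >= 1");
        }
        for (int i = 0; i < numLanes; i++) {
            lanes.add(Executors.newSingleThreadExecutor());
        }
    }

    /**
     * Routes the handler to the lane chosen by the affinity key (assumed here
     * to be something like a vertex id). The same key always maps to the same
     * lane, so its events are never processed out of order.
     */
    void dispatch(String affinityKey, Runnable handler) {
        int lane = Math.floorMod(affinityKey.hashCode(), lanes.size());
        lanes.get(lane).execute(handler);
    }

    /** Stops accepting events and waits (up to 10s per lane) for queued ones to finish. */
    void shutdownAndWait() {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
        try {
            for (ExecutorService lane : lanes) {
                lane.awaitTermination(10, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
        }
    }
}
```

Constructing it with a single lane degenerates to the current single-threaded
behaviour, which is one way the fallback option could be kept. Note that this
only orders events per key; it does nothing about races between handlers on
different lanes that touch shared state.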
> Allow higher concurrency in AsyncDispatcher
> -------------------------------------------
>
> Key: TEZ-1897
> URL: https://issues.apache.org/jira/browse/TEZ-1897
> Project: Apache Tez
> Issue Type: Task
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Attachments: TEZ-1897.1.patch, TEZ-1897.2.patch
>
>
> Currently, AsyncDispatcher processes events on a single thread. For events
> that can be executed in parallel, e.g. vertex manager events, allowing higher
> concurrency may be beneficial.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)