[
https://issues.apache.org/jira/browse/TEZ-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213353#comment-14213353
]
Bikas Saha commented on TEZ-14:
-------------------------------
Looking at the code further, using a separate dispatcher would not be a trivial
change, and it would further complicate and bloat this patch. The new dispatcher
would have to be passed around and then selected by event type so that it
carries only speculator events. It would also have to be shut down upon
completion, and we would need to make sure shutdown does not get stuck on it.
Each event handler today is either global (like the scheduler) or has an id
(dagid/vertexid/etc.). There is no addressable id for a speculator, so we would
have to address it via the vertex, but then why not send the event to the
vertex itself? The speculator is in any case a vertex-specific entity that
belongs inside the vertex and should adhere to the vertex lifecycle, e.g.
invoked only when the vertex is running. It is not clear that it is an
independently addressable entity. Also, if we were to move to a separate
dispatcher, we should ideally finish TEZ-93 first so that multiple dispatchers
are transparent to most of the code, rather than having to pass around
different dispatchers and choose one based on the event everywhere in the code.
TEZ-93 is something we should target for 0.6 or 0.7, or else we will continue
to be susceptible to event backlogs.
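
To illustrate the routing argument above, here is a minimal sketch of
addressing the speculator through its vertex. Every name in it
(SpeculatorRoutingSketch, Dispatcher, Vertex, Speculator, SpeculatorEvent) is
hypothetical and not taken from the patch or the Tez code base, and the
dispatcher is synchronous to keep the example short. It only shows the shape
of the idea: the event carries a vertex id, the vertex owns the speculator,
and the vertex forwards events only while it is running.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; these are NOT the classes used in the TEZ-14 patch.
public class SpeculatorRoutingSketch {

    enum VertexState { NEW, RUNNING, SUCCEEDED }

    // Speculator events are addressed by vertex id, since the speculator
    // has no addressable id of its own.
    static final class SpeculatorEvent {
        final String vertexId;
        final String taskAttemptId;

        SpeculatorEvent(String vertexId, String taskAttemptId) {
            this.vertexId = vertexId;
            this.taskAttemptId = taskAttemptId;
        }
    }

    // Stand-in speculator that just records the attempts it was told about.
    static final class Speculator {
        final List<String> seen = new ArrayList<>();

        void handle(SpeculatorEvent event) {
            seen.add(event.taskAttemptId);
        }
    }

    // The vertex owns its speculator and ties it to the vertex lifecycle.
    static final class Vertex {
        final String id;
        final Speculator speculator = new Speculator();
        VertexState state = VertexState.NEW;

        Vertex(String id) {
            this.id = id;
        }

        void handle(SpeculatorEvent event) {
            // Forward only while running; otherwise the event is dropped.
            if (state == VertexState.RUNNING) {
                speculator.handle(event);
            }
        }
    }

    // Synchronous stand-in for the central dispatcher: it looks up the
    // handler by vertex id and knows nothing about speculators.
    static final class Dispatcher {
        private final Map<String, Vertex> vertices = new HashMap<>();

        void register(Vertex vertex) {
            vertices.put(vertex.id, vertex);
        }

        void dispatch(SpeculatorEvent event) {
            Vertex vertex = vertices.get(event.vertexId);
            if (vertex != null) {
                vertex.handle(event);
            }
        }
    }

    public static void main(String[] args) {
        Dispatcher dispatcher = new Dispatcher();
        Vertex vertex = new Vertex("vertex_1");
        dispatcher.register(vertex);

        // Dropped: the vertex is not running yet.
        dispatcher.dispatch(new SpeculatorEvent("vertex_1", "attempt_1"));

        vertex.state = VertexState.RUNNING;
        dispatcher.dispatch(new SpeculatorEvent("vertex_1", "attempt_2"));

        System.out.println(vertex.speculator.seen); // prints [attempt_2]
    }
}
```

In this arrangement no second dispatcher has to be passed around or shut
down; the cost is that speculator events share the central queue with all
other events.
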
Based on the above, unless I hear otherwise in the next couple of days, I am
going to commit this patch and create a follow-up for moving speculation off
the central async dispatcher, potentially after a full or partial
implementation of TEZ-93.
> Support for speculation of slow tasks
> -------------------------------------
>
> Key: TEZ-14
> URL: https://issues.apache.org/jira/browse/TEZ-14
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Attachments: TEZ-14.1.patch, TEZ-14.2.patch, TEZ-14.3.patch,
> TEZ-14.4.patch
>
>