[
https://issues.apache.org/jira/browse/TEZ-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119403#comment-14119403
]
Jeff Zhang commented on TEZ-1345:
---------------------------------
[~hitesh] I have attached 2 patches, one for each of the following 2 solutions:
* (Tez-1345-5.patch) Add VertexManagerInitializationDone to the recovery log.
This turned out to be more complicated than I expected: I have to create a new
Event and a new HistoryEvent for it, and in this case the code of
StartRecoveryTransition also needs to change.
* (Tez-1345-6.patch) On second thought, I think the reason that init events
may be written to the recovery log after VertexInitEvent is that currently we
just put them in the eventQueue of AsyncDispatcher (by using
RouteEventTransition). I think the easy way is to handle the init events
synchronously (Tez-1345.patch, Tez-1345-2.patch). I created the new patch
Tez-1345-6.patch to cache the root init events in RootInitializerManager and
write these events before writing VertexInitializedEvent. This ensures that
all the root init events are written before VertexInitializedEvent, and in
this way we don't need to change the code in StartRecoveryTransition.
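
To illustrate the ordering idea behind the second solution, here is a minimal
sketch in plain Java. It is not the code from Tez-1345-6.patch: the classes
SketchRecoveryLog, SketchRootInitEventCache and SketchOrderingDemo are
hypothetical stand-ins, and events are modeled as plain strings. It only shows
root init events being cached and then flushed to the log before the
"vertex initialized" record is written.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a recovery log; keeps entries in write order. */
final class SketchRecoveryLog {
    private final List<String> entries = new ArrayList<>();

    void write(String entry) {
        entries.add(entry);
    }

    List<String> entries() {
        return Collections.unmodifiableList(entries);
    }
}

/** Hypothetical stand-in for the caching described for RootInitializerManager. */
final class SketchRootInitEventCache {
    private final List<String> pendingInitEvents = new ArrayList<>();
    private final SketchRecoveryLog log;

    SketchRootInitEventCache(SketchRecoveryLog log) {
        this.log = log;
    }

    /** Cache a root init event instead of handing it to an async queue. */
    void onRootInitEvent(String event) {
        pendingInitEvents.add(event);
    }

    /** Write every cached init event first, then the initialized record. */
    void onVertexInitialized(String vertexName) {
        for (String event : pendingInitEvents) {
            log.write("INIT_EVENT:" + event);
        }
        pendingInitEvents.clear();
        log.write("VERTEX_INITIALIZED:" + vertexName);
    }
}

final class SketchOrderingDemo {
    static List<String> run() {
        SketchRecoveryLog log = new SketchRecoveryLog();
        SketchRootInitEventCache cache = new SketchRootInitEventCache(log);
        cache.onRootInitEvent("e1");
        cache.onRootInitEvent("e2");
        cache.onVertexInitialized("v1");
        return log.entries();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because the flush and the final record happen in one method call, no init
event can land in the log after the initialized record in this sketch.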
Regarding the performance concern [~bikassaha] mentioned, I think there is
actually no performance issue here. Because the recovery log is written in a
separate thread in RecoveryService, it won't block the AsyncDispatcher in Vertex.
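
As a rough illustration of why handing events to a separate writer thread does
not block the caller, here is a small sketch. It is an assumption-laden model
and not the real RecoveryService: SketchAsyncRecoveryWriter is a hypothetical
class, events are plain strings, and the "log" is an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical writer that drains a queue on its own thread. */
final class SketchAsyncRecoveryWriter {
    // Sentinel used only by this sketch to stop the writer thread.
    private static final String STOP = "__STOP__";

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> written = new ArrayList<>();
    private final Thread writer;

    SketchAsyncRecoveryWriter() {
        writer = new Thread(() -> {
            try {
                while (true) {
                    String event = queue.take(); // waits on the writer thread only
                    if (event.equals(STOP)) {
                        return;
                    }
                    synchronized (written) {
                        written.add(event); // stands in for the slow log write
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "sketch-recovery-writer");
        writer.start();
    }

    /** Called from the dispatcher side: enqueue and return immediately. */
    void handle(String event) {
        queue.add(event);
    }

    /** Stop the writer after it has drained the queue; return what it wrote. */
    List<String> stopAndGetWritten() {
        queue.add(STOP);
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while stopping writer", e);
        }
        synchronized (written) {
            return new ArrayList<>(written);
        }
    }
}
```

In this model the caller of handle() only pays for an enqueue, while the
single writer thread preserves the order in which events were handed over.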
> Add checks to guarantee all init events are written to recovery to consider
> vertex initialized
> ----------------------------------------------------------------------------------------------
>
> Key: TEZ-1345
> URL: https://issues.apache.org/jira/browse/TEZ-1345
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Hitesh Shah
> Assignee: Jeff Zhang
> Attachments: Tez-1345-2.patch, Tez-1345-3.patch, Tez-1345-4.patch,
> Tez-1345-5.patch, Tez-1345-6.patch, Tez-1345.patch
>
>
> Related to issue discovered in TEZ-1033