[ 
https://issues.apache.org/jira/browse/TEZ-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126920#comment-14126920
 ] 

Jeff Zhang commented on TEZ-1539:
---------------------------------

First, I'd like to confirm where InputInitializerEvent is generated: in the AM 
or the NN? I ask because I don't see any real use of InputInitializerEvent in 
Tez. Maybe Hive is using it; I haven't checked.

Regarding the impact on recovery, I don't think this change will affect the 
current recovery, because an InputInitializerEvent only exists between NEW and 
INITED. If the recovered state is NEW, these events should be regenerable 
(not sure, see the comment below). If the recovered state is INITED, init is 
already done and the InputInitializerEvents are no longer needed.

But this raises a concern: we currently don't write InputInitializerEvents to 
the recovery log, so they cannot be recovered. The recovery issue may 
therefore already exist, independent of this change.
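
To make the reasoning above concrete, here is a minimal sketch of the decision 
at recovery time. It is only an illustration: the class, enum, and method 
names are hypothetical and are not Tez APIs, and it assumes only the two 
states discussed here (NEW and INITED) plus a flag saying whether the events 
were written to the recovery log.

```java
/** Hypothetical sketch; these names are illustrative and are not Tez APIs. */
public class InitializerEventRecoverySketch {

    /** Only the two vertex states discussed in this comment. */
    public enum RecoveredState { NEW, INITED }

    /** What recovery would have to do about InputInitializerEvents. */
    public enum Action { NOT_NEEDED, REPLAY_FROM_LOG, MUST_REGENERATE }

    public static Action onRecovery(RecoveredState state, boolean eventsLogged) {
        if (state == RecoveredState.INITED) {
            // Init already finished, so the events have served their purpose.
            return Action.NOT_NEEDED;
        }
        // State is NEW: the initializer still needs its events. If they were
        // never logged, the only option is to have them generated again.
        return eventsLogged ? Action.REPLAY_FROM_LOG : Action.MUST_REGENERATE;
    }

    public static void main(String[] args) {
        System.out.println(onRecovery(RecoveredState.NEW, false));
        System.out.println(onRecovery(RecoveredState.INITED, false));
    }
}
```

With events not logged (the situation I describe above), the NEW case always 
falls into MUST_REGENERATE, which is the part I'm not sure is guaranteed.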

> Allow a FIRE_ONCE_ON_SUCCESS model for events generated by user code
> --------------------------------------------------------------------
>
>                 Key: TEZ-1539
>                 URL: https://issues.apache.org/jira/browse/TEZ-1539
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>         Attachments: TEZ-1539.1.wip.txt, TEZ-1539.2.txt
>
>
> Specifically for InputInitializerEvents and VertexManagerEvents.
> Pasting comment from TEZ-1447:
> In a majority of cases, events generated by different attempts of the same 
> task will be identical, in which case just using the events generated by 
> the first successful attempt is adequate. Doing something like this means 
> that users don't need to worry about retries, indices, etc., and can just 
> rely on receiving a set of events to be processed once the vertex succeeds.
> If different attempts of the same workload generate different events, 
> processing is likely to be incorrect, since it's very possible for all data 
> to be processed (VERTEX successful), followed by a failure and a retry which 
> generates a different event. The initializer doesn't even run at this point, 
> since it has already done its work and is complete. Handling such scenarios 
> likely involves re-running the entire initializer and restarting, from 
> scratch, the vertex which processed the event. In situations like this, 
> where the data generated may differ, the best bet is for speculation to be 
> disabled (when it's supported) and max-attempts to be set to 1.
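
The FIRE_ONCE_ON_SUCCESS model described in the quoted issue could be sketched 
roughly as below. This is an illustration under stated assumptions, not the 
approach taken in the attached patches: the class and method names are 
hypothetical, events are represented as plain strings, and tasks and attempts 
are identified by integers. Events are buffered per attempt and released only 
for the first attempt of a task that succeeds; events from any other attempt 
of that task are dropped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; these names are illustrative and are not Tez APIs. */
public class FireOnceOnSuccessBuffer {
    // Buffered events, keyed by task index and then by attempt number.
    private final Map<Integer, Map<Integer, List<String>>> pending = new HashMap<>();
    // Tasks whose events have already been released to the consumer.
    private final Set<Integer> fired = new HashSet<>();

    /** Buffer an event from user code; nothing is delivered yet. */
    public void onEvent(int task, int attempt, String event) {
        if (fired.contains(task)) {
            return; // a successful attempt already supplied this task's events
        }
        pending.computeIfAbsent(task, t -> new HashMap<>())
               .computeIfAbsent(attempt, a -> new ArrayList<>())
               .add(event);
    }

    /**
     * Called when an attempt succeeds. Returns the events to deliver, or an
     * empty list if another attempt of the same task has already fired.
     */
    public List<String> onAttemptSucceeded(int task, int attempt) {
        if (!fired.add(task)) {
            return new ArrayList<>();
        }
        // Discard events buffered by all other attempts of this task.
        Map<Integer, List<String>> byAttempt = pending.remove(task);
        if (byAttempt == null) {
            return new ArrayList<>();
        }
        List<String> events = byAttempt.get(attempt);
        return events == null ? new ArrayList<>() : events;
    }

    public static void main(String[] args) {
        FireOnceOnSuccessBuffer buffer = new FireOnceOnSuccessBuffer();
        buffer.onEvent(0, 0, "event-from-attempt-0");
        buffer.onEvent(0, 1, "event-from-attempt-1");
        System.out.println(buffer.onAttemptSucceeded(0, 1));
        System.out.println(buffer.onAttemptSucceeded(0, 0));
    }
}
```

Note that this sketch does nothing about the case where attempts generate 
different events; it simply picks whichever attempt succeeds first.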


