[
https://issues.apache.org/jira/browse/TEZ-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14177986#comment-14177986
]
Jeff Zhang commented on TEZ-1267:
---------------------------------
bq. ROUTE_EVENT_TRANSITIONS from the NEW / INITIALIZING / INITED state. This,
generally, will not process the relevant event - except for the case of
VertexManagerEvents. I think it's safe to leave the change as is - i.e. allow a
transition into FAILED state, but I don't think it'll be triggered, at least
not for NEW. VertexManagers aren't setup before the NEW state - and we likely
need to put in checks for this before - as part of a separate jira.
Only allow the transition into the FAILED state from INITIALIZING / INITED. Created
[TEZ-1691|https://issues.apache.org/jira/browse/TEZ-1691] to check for the
VertexManager before handling a VertexManagerEvent.
bq. SOURCE_TASK_ATTEMPT_COMPLETE while in NEW / INITIALIZING / INITED - the
events will always be cached, and hence should not lead to problems. We could
choose to leave these transitions as is, or allow the FAILED state.
Leave these transitions as is; no transition to the FAILED state.
bq. Transition from INITED to RUNNING - it's possible for the VertexManager to
have scheduled tasks before generating an error. In such cases, I think we need
to try killing any invoked tasks - rather than transitioning directly to FAILED
state. (Technically, the schedule could be in any of the states - but it
shouldn't be used before the vertex starts.)
Kill the tasks and transition to TERMINATING if an exception happens during the
INITED to RUNNING transition.
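A minimal sketch of that behaviour, assuming a much simplified vertex model. The names VertexStartSketch, ManagerCallback and onVertexStarted are hypothetical and are not Tez APIs; the real patch works through the vertex state machine. The point it illustrates is that the user callback may already have scheduled tasks when it throws, so those tasks are killed and the vertex moves to TERMINATING instead of going directly to FAILED.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a vertex; none of these names are Tez APIs.
class VertexStartSketch {
    enum State { INITED, RUNNING, TERMINATING }

    /** Stand-in for user-supplied VertexManager code that may schedule tasks and may throw. */
    interface ManagerCallback {
        void onVertexStarted(List<String> scheduledTasks) throws Exception;
    }

    private State state = State.INITED;
    private final List<String> scheduledTasks = new ArrayList<>();
    private final List<String> killedTasks = new ArrayList<>();
    private final List<String> diagnostics = new ArrayList<>();

    /** Attempts the INITED -> RUNNING transition. */
    void start(ManagerCallback manager) {
        if (state != State.INITED) {
            throw new IllegalStateException("start() is only valid in INITED, was " + state);
        }
        try {
            manager.onVertexStarted(scheduledTasks);
            state = State.RUNNING;
        } catch (Exception e) {
            // The callback may have scheduled tasks before it threw, so kill
            // them and wait in TERMINATING rather than jumping to FAILED.
            diagnostics.add("Exception in VertexManager: " + e.getMessage());
            killedTasks.addAll(scheduledTasks);
            scheduledTasks.clear();
            state = State.TERMINATING;
        }
    }

    State getState() { return state; }
    List<String> getKilledTasks() { return killedTasks; }
    List<String> getDiagnostics() { return diagnostics; }
}
```

With this sketch, a callback that schedules a task and then throws leaves the vertex in TERMINATING with that task recorded as killed, while a callback that returns normally moves it to RUNNING.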
bq. Minor: Log messages say ", vertexId=" + vertex.logIdentifier + ",". This
should just be vertex instead of vertexId, since logIdentifier contains both.
Changed it to "vertex".
bq. These changes are required for the EdgePlugins as well, and also for
InputInitializers. I think we should get this in, and handle the Edges and
InputInitializers in a separate jira.
Created [TEZ-1689|https://issues.apache.org/jira/browse/TEZ-1689] to track
exception handling in Edges and InputInitializers.
In addition, make the following changes:
* Move some common logic into finish(), such as adding diagnostics and shutting
down rootInputInitializerManager.
> Exception handling when Routing Events
> --------------------------------------
>
> Key: TEZ-1267
> URL: https://issues.apache.org/jira/browse/TEZ-1267
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Siddharth Seth
> Assignee: Jeff Zhang
> Priority: Critical
> Attachments: Tez-1267.patch
>
>
> Events are generated by user code. In some places they're also handled by
> user code within the AM. Currently, exceptions which are generated when
> handling user code will end up killing the AM (and hence leading to a retry).
> Instead, failure to handle such events should cause the application to fail.