[
https://issues.apache.org/jira/browse/TEZ-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180863#comment-14180863
]
Bikas Saha commented on TEZ-1267:
---------------------------------
For code like this, and in other places, would it have made sense to put all the
logging and tryEnactKill code into a common method to reduce verbosity and
duplication? VertexManager code is all Tez code. Maybe it could do all of this
and return a diagnostic object, because the real try-catch around user code
lives in there. That would avoid having the Tez VertexImpl code do a try-catch
around Tez VertexManager code.
{code}
-  vertexManager.onVertexStarted(pendingReportedSrcCompletions);
+  try {
+    vertexManager.onVertexStarted(pendingReportedSrcCompletions);
+  } catch (AMUserCodeException e) {
+    String msg = "Exception in " + e.getSource() + ", vertex=" + logIdentifier;
+    LOG.error(msg, e);
+    addDiagnostic(msg + "," + ExceptionUtils.getStackTrace(e.getCause()));
+    tryEnactKill(VertexTerminationCause.AM_USERCODE_FAILURE,
+        TaskTerminationCause.AM_USERCODE_FAILURE);
+    return VertexState.TERMINATING;
+  }
{code}
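
A minimal sketch of the common-method idea, not the actual patch. Everything here is a hypothetical stand-in so that the example compiles on its own: the AMUserCodeException and VertexState types are simplified copies, and VertexSketch, UserCodeCall, runUserCode, diagnostics and killRequested are invented names. The killRequested flag stands in for the tryEnactKill(...) call, and logging is omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the Tez exception type; not the real class.
class AMUserCodeException extends Exception {
  private final String source;

  AMUserCodeException(String source, Throwable cause) {
    super(cause);
    this.source = source;
  }

  String getSource() {
    return source;
  }
}

// Simplified stand-in; the real VertexState has more values.
enum VertexState { RUNNING, TERMINATING }

// Hypothetical holder for the shared failure handling.
class VertexSketch {
  // A VertexManager call that may surface a user-code failure.
  interface UserCodeCall {
    void run() throws AMUserCodeException;
  }

  final List<String> diagnostics = new ArrayList<>();
  boolean killRequested = false; // stands in for tryEnactKill(...)
  private final String logIdentifier;

  VertexSketch(String logIdentifier) {
    this.logIdentifier = logIdentifier;
  }

  // Runs the call. On a user-code failure it records a diagnostic,
  // requests the kill and returns TERMINATING; otherwise it returns
  // the state the caller asked for.
  VertexState runUserCode(UserCodeCall call, VertexState onSuccess) {
    try {
      call.run();
      return onSuccess;
    } catch (AMUserCodeException e) {
      String msg = "Exception in " + e.getSource() + ", vertex=" + logIdentifier;
      diagnostics.add(msg + "," + e.getCause());
      killRequested = true;
      return VertexState.TERMINATING;
    }
  }

  public static void main(String[] args) {
    VertexSketch vertex = new VertexSketch("vertex_1");
    VertexState state = vertex.runUserCode(() -> {
      throw new AMUserCodeException("VertexManager", new IllegalStateException("boom"));
    }, VertexState.RUNNING);
    System.out.println(state + " " + vertex.diagnostics);
  }
}
```

With a helper of this shape, each call site shrinks to a single line that passes the VertexManager call as a lambda, and the logging, diagnostic and kill logic exists in one place. Which state to return on success depends on the transition, so it is passed in as a parameter here.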
> Exception handling for VertexManager
> ------------------------------------
>
> Key: TEZ-1267
> URL: https://issues.apache.org/jira/browse/TEZ-1267
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Siddharth Seth
> Assignee: Jeff Zhang
> Priority: Critical
> Fix For: 0.5.2
>
> Attachments: TEZ-1267-2.patch, TEZ-1267-3.patch, TEZ-1267-4.patch,
> Tez-1267.patch
>
>
> Events are generated by user code. In some places they're also handled by
> user code within the AM. Currently, exceptions generated while handling
> user code end up killing the AM (and hence lead to a retry).
> Instead, failure to handle such events should cause the application to fail.