[
https://issues.apache.org/jira/browse/TEZ-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077152#comment-14077152
]
Chen He commented on TEZ-707:
-----------------------------
{quote}
When not in local mode, the RM informs the AM if a container crashes. That's
missing in the current patch - there's no notification to the scheduler or the
rest of the AM. Typically, when a task fails - we end up killing the JVM - but
that isn't always going to be the case - more so in local mode, since that ends
up killing the AM. The errors need to be handled, so that the AM eventually
reaches a correct state. This is where the callbacks can help (there's
obviously other ways to implement this though).
{quote}
I made some updates to track tasks and let maximumpoolsize = pool size.
Actually, exceptions thrown while a task is running are handled in the
try...catch block and are finally sent to the AM through
"sendContainerLaunchFailedMsg". I do not see an uncovered case. During our Pig
on Tez local mode test, the DAGAM received the "container" failure information
through "sendContainerLaunchFailedMsg". Maybe I am wrong.
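
For illustration only, here is a minimal sketch of the pattern described above: a
fixed-size thread pool (core size equal to maximum size) where each task runs
inside a try...catch and any failure is reported through a callback. This is not
the code in the patch. The class LocalLauncherSketch, the FailureListener
interface and its onContainerLaunchFailed method are hypothetical names; the
callback only stands in for whatever path "sendContainerLaunchFailedMsg" takes
to the AM.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the Tez implementation.
class LocalLauncherSketch {

    // Hypothetical callback standing in for the failure message sent to the AM.
    public interface FailureListener {
        void onContainerLaunchFailed(String containerId, Throwable cause);
    }

    private final ThreadPoolExecutor pool;
    private final FailureListener listener;

    LocalLauncherSketch(int poolSize, FailureListener listener) {
        // corePoolSize == maximumPoolSize, so the pool never grows past poolSize.
        this.pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        this.listener = listener;
    }

    // Runs the task on a pool thread instead of in a separate container JVM.
    void launch(final String containerId, final Runnable task) {
        pool.execute(() -> {
            try {
                task.run();
            } catch (Throwable t) {
                // The thread survives; the failure is reported, not swallowed.
                listener.onContainerLaunchFailed(containerId, t);
            }
        });
    }

    // Stops accepting work and waits for running tasks to finish.
    boolean shutdownAndWait(long timeoutMs) throws InterruptedException {
        pool.shutdown();
        return pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> failed = new CopyOnWriteArrayList<>();
        LocalLauncherSketch launcher =
                new LocalLauncherSketch(2, (id, cause) -> failed.add(id));
        launcher.launch("container_1", () -> { });
        launcher.launch("container_2", () -> {
            throw new IllegalStateException("task failed");
        });
        launcher.shutdownAndWait(5000);
        System.out.println(failed); // prints [container_2]
    }
}
```

In this sketch only failures that surface as a Throwable inside the task body
reach the listener. A task that hangs, or one that calls System.exit, would not
be reported this way, which may be the kind of case the quoted review has in
mind.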
> Create LocalContainerLauncher
> -----------------------------
>
> Key: TEZ-707
> URL: https://issues.apache.org/jira/browse/TEZ-707
> Project: Apache Tez
> Issue Type: Sub-task
> Affects Versions: 0.3.0
> Reporter: Chen He
> Assignee: Chen He
> Priority: Blocker
> Attachments: TEZ-707-2014-7015.patch, TEZ-707-2014-7015.patch.review,
> TEZ-707-v3.patch, TEZ-707-v4.patch, TEZ-707.patch, TEZ-707.patch,
> TEZ-707.patch, Tez-707.patch.v2, tez-707.patch
>
>
> Create LocalContainerLauncher and make it work for a single stage DAG. The
> TaskSchedulerEventHandler still asks RM for new container but
> LocalContainerLauncher will run TezTask in form of thread instead of using
> this container from yarn.
--
This message was sent by Atlassian JIRA
(v6.2#6252)