[ 
https://issues.apache.org/jira/browse/TEZ-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-3161:
--------------------------------
    Attachment: TEZ-3161.4.txt

Updated patch with the following changes.
- FailureType renamed to TaskFailureType
- Have retained the APIs introduced in the patch. The existing API is going to 
get confusing otherwise. Added specific javadocs on fatalError explaining the 
behaviour, along with deprecation. This seems like the least confusing option 
to me.
- Marked killSelf as private
- Renamed unsuccessfulEnd to taskFailureType
- Added writing to history. Is there some place where ATS data is being read 
back as well? I couldn't find one.
- Changed the TaskImpl log line to be easier to understand

bq. Wouldn't there be only one specific termination cause to indicate that the 
user-code told the framework to abort itself or kill itself?
The TaskAttemptEndReason is set based on which component reported the error 
(Input / Processor / Output), at least for errors reported from the task. There 
are a number of other EndReasons which are independent of this. TaskFailureType 
now indicates the type of failure, on top of whatever EndReason is set.
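
As a rough sketch of that separation (not code from the patch): the end reason 
records which component reported the error, while the failure type separately 
records whether another attempt is worthwhile. Every name below (EndReason, 
FailureKind, AttemptFailure, shouldRetry, maxAttempts) is a hypothetical 
stand-in made up for this illustration, not the Tez API.

```java
import java.util.Objects;

// Hypothetical stand-ins for illustration only; these are not the Tez classes.
enum EndReason { INPUT_ERROR, PROCESSOR_ERROR, OUTPUT_ERROR }

enum FailureKind { NON_FATAL, FATAL }

final class AttemptFailure {
    final EndReason endReason;     // which component reported the error
    final FailureKind failureKind; // whether the error would repeat on a retry

    AttemptFailure(EndReason endReason, FailureKind failureKind) {
        this.endReason = Objects.requireNonNull(endReason);
        this.failureKind = Objects.requireNonNull(failureKind);
    }

    // Assumed policy: a fatal failure is expected to repeat on every attempt,
    // so only non-fatal failures are retried, up to an assumed attempt limit.
    boolean shouldRetry(int attemptsSoFar, int maxAttempts) {
        return failureKind == FailureKind.NON_FATAL && attemptsSoFar < maxAttempts;
    }
}
```

With this shape, the same end reason (say PROCESSOR_ERROR) can be paired with 
either failure kind, and only the failure kind affects the retry decision.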

> Allow task to report different kinds of errors - fatal / kill
> -------------------------------------------------------------
>
>                 Key: TEZ-3161
>                 URL: https://issues.apache.org/jira/browse/TEZ-3161
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>         Attachments: TEZ-3161.1.txt, TEZ-3161.2.txt, TEZ-3161.3.txt, 
> TEZ-3161.4.txt
>
>
> In some cases, task failures will be the same across all attempts - e.g. 
> exceeding memory utilization on an operation. In this case, there's no point 
> in running another attempt of the same task.
> There are other cases where a task may want to mark itself as KILLED - i.e. a 
> temporary error. An example of this is pipelined shuffle.
> Tez should allow both operations.
> cc [~vikram.dixit], [~rajesh.balamohan]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
