[ 
https://issues.apache.org/jira/browse/GOBBLIN-273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhixiong Chen updated GOBBLIN-273:
----------------------------------
    Component/s: gobblin-core

> Add failure monitoring
> ----------------------
>
>                 Key: GOBBLIN-273
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-273
>             Project: Apache Gobblin
>          Issue Type: Task
>          Components: gobblin-core
>            Reporter: Zhixiong Chen
>            Assignee: Zhixiong Chen
>
> When a job fails with a very long log, it is hard to dig through the log 
> and find the cause of the failure. Here a reporter is plugged into the 
> Gobblin Metrics architecture to collect job failure events into a file. A job 
> now gets task-level and dataset-level failure events reported for free.
> h3. `MetricContext#submitFailureEvent`
> When a failure event needs to be reported, it should be submitted with this 
> method, which wraps the event in a `FailureEventNotification`.
> h3. `FileFailureEventReporter`
> Reports all failure events to a file. Each job has its own report folder.
> h3. Configurations
> To enable job failure reporting, the following configurations are required:
> {code:java}
> # Required for failure events to be reported
> metrics.enabled=true
> # File system URI; by default, the local file system is used
> fs.uri=<file system uri>
> # Root folder of all jobs' failure reports
> failure.log.dir=<root folder of all jobs failure reports>
> {code}
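
As a rough illustration of how the three keys above might be read and checked, here is a minimal Java sketch. It is not Gobblin code: the class `FailureReportConfigSketch`, its method names, the `file:///` fallback, the choice to treat a missing `metrics.enabled` as disabled, and the `<failure.log.dir>/<jobId>` folder layout are all assumptions made for this example. The description only states that the local file system is the default and that each job has its own report folder.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper, not part of Gobblin. It only illustrates how the three
// keys from the issue description could be read and validated.
final class FailureReportConfigSketch {

    // Assumed fallback for "local file system"; the real default value is not
    // given in the description.
    static final String DEFAULT_FS_URI = "file:///";

    // Assumption of this sketch: a missing metrics.enabled key counts as disabled.
    static boolean metricsEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty("metrics.enabled", "false"));
    }

    // Returns fs.uri if set, otherwise the assumed local file system URI.
    static String fsUri(Properties props) {
        return props.getProperty("fs.uri", DEFAULT_FS_URI);
    }

    // Assumed layout: <failure.log.dir>/<jobId>. The real layout may differ.
    static String reportDir(Properties props, String jobId) {
        String root = props.getProperty("failure.log.dir");
        if (root == null || root.isEmpty()) {
            throw new IllegalArgumentException("failure.log.dir is required");
        }
        // Drop a single trailing slash so the joined path has exactly one separator.
        String trimmed = root.endsWith("/") ? root.substring(0, root.length() - 1) : root;
        return trimmed + "/" + jobId;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(
                "metrics.enabled=true\nfailure.log.dir=/tmp/gobblin-failures\n"));
        System.out.println(metricsEnabled(props));
        System.out.println(fsUri(props));
        System.out.println(reportDir(props, "job_example_1"));
    }
}
```

Failing fast on a missing `failure.log.dir` is a design choice of the sketch, so that a misconfigured job is noticed before any failure event needs to be written.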



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
