[ 
https://issues.apache.org/jira/browse/NUTCH-2518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16398326#comment-16398326
 ] 

Sebastian Nagel commented on NUTCH-2518:
----------------------------------------

Hi [~kpm1985], have a look at NUTCH-2442 and [PR 
#239|https://github.com/apache/nutch/pull/239/files], which fixed the same 
problem for Injector and a few other classes. The clean-up actions to take when 
a job fails (job.waitForCompletion(true) returns false) are the same as those 
taken when running the job throws an exception. If the clean-up requires more 
than a single statement, it's better to move it into a separate clean-up 
method. Just follow the way it's done in PR #239.
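
Roughly, the pattern looks like the sketch below. It is only an illustration, 
not the code from PR #239: CompletableJob is a hypothetical stand-in for the 
single Hadoop Job method used here, so that the example compiles without 
Hadoop on the classpath, and the Runnable stands for whatever clean-up the 
tool needs (removing temporary data, unlocking the CrawlDb, etc.). The class 
and method names are assumptions as well.

```java
import java.io.IOException;

class JobCompletionSketch {

  /** Hypothetical stand-in for Hadoop's Job; only the one method needed here. */
  interface CompletableJob {
    boolean waitForCompletion(boolean verbose)
        throws IOException, InterruptedException, ClassNotFoundException;
  }

  /**
   * Runs the job and performs the same clean-up whether the job reports
   * failure (returns false) or running it throws an exception.
   */
  static void runJob(String name, CompletableJob job, Runnable cleanup)
      throws IOException, InterruptedException, ClassNotFoundException {
    try {
      boolean success = job.waitForCompletion(true);
      if (!success) {
        cleanup.run();
        // Unchecked, so it is not caught (and cleaned up twice) below.
        throw new RuntimeException(name + " job did not succeed");
      }
    } catch (IOException | InterruptedException | ClassNotFoundException e) {
      cleanup.run();
      throw e;
    }
  }

  /** Maps the outcome to an exit value: 0 on success, non-zero otherwise. */
  static int run(String name, CompletableJob job, Runnable cleanup) {
    try {
      runJob(name, job, cleanup);
      return 0;
    } catch (Exception e) {
      System.err.println(name + " failed: " + e.getMessage());
      return 1;
    }
  }
}
```

Throwing after the clean-up stops the caller from proceeding with further 
steps, and the non-zero return value can be handed to System.exit() by the 
tool's main method.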

> Must check return value of job.waitForCompletion()
> --------------------------------------------------
>
>                 Key: NUTCH-2518
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2518
>             Project: Nutch
>          Issue Type: Bug
>          Components: crawldb, fetcher, generator, hostdb, linkdb
>    Affects Versions: 1.15
>            Reporter: Sebastian Nagel
>            Assignee: Kenneth McFarland
>            Priority: Critical
>             Fix For: 1.15
>
>
> The return value of job.waitForCompletion() of the new MapReduce API 
> (NUTCH-2375) must always be checked. If it's not true, the job has 
> failed or been killed. Accordingly, the program
> - should not proceed with further jobs/steps
> - must clean up temporary data, unlock CrawlDB, etc.
> - must exit with a non-zero exit value, so that scripts running the crawl 
> workflow can handle the failure
> Cf. NUTCH-2076, NUTCH-2442, [NUTCH-2375 PR 
> #221|https://github.com/apache/nutch/pull/221#issuecomment-332941883].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
