[ https://issues.apache.org/jira/browse/NUTCH-2518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16403094#comment-16403094 ]
Kenneth McFarland commented on NUTCH-2518:
------------------------------------------

Yes, this is just a proper subset of the fix, for sure. I started with a grep on the function call, and there are a few more classes to be checked. I wanted to start with the class you mentioned so that I had a base case. I just wanted to open the PR ASAP so that I could adapt to feedback and to any reordering of tasks if need be. I'm not done for the day by any means, but I haven't hit my free time interval just yet.

> Must check return value of job.waitForCompletion()
> --------------------------------------------------
>
>                 Key: NUTCH-2518
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2518
>             Project: Nutch
>          Issue Type: Bug
>          Components: crawldb, fetcher, generator, hostdb, linkdb
>    Affects Versions: 1.15
>            Reporter: Sebastian Nagel
>            Assignee: Kenneth McFarland
>            Priority: Critical
>             Fix For: 1.15
>
>
> The return value of job.waitForCompletion() of the new MapReduce API
> (NUTCH-2375) must always be checked. If it's not true, the job has
> failed or been killed. Accordingly, the program
> - should not proceed with further jobs/steps
> - must clean up temporary data, unlock CrawlDB, etc.
> - must exit with a non-zero exit value, so that scripts running the
>   crawl workflow can handle the failure
> Cf. NUTCH-2076, NUTCH-2442, [NUTCH-2375 PR #221|https://github.com/apache/nutch/pull/221#issuecomment-332941883].

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
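
For illustration only, here is a minimal sketch of the pattern the issue description asks for. It is not the Nutch patch. WaitableJob is a hypothetical stand-in for org.apache.hadoop.mapreduce.Job so that the sketch compiles without Hadoop on the classpath, and the Runnable cleanup hook is likewise an assumption standing in for whatever a given tool must undo (removing temporary data, unlocking the CrawlDB, and so on).

```java
import java.util.List;

public class JobCheckSketch {

  /**
   * Hypothetical stand-in for org.apache.hadoop.mapreduce.Job. The real
   * waitForCompletion(boolean) also returns true only if the job succeeded.
   */
  interface WaitableJob {
    boolean waitForCompletion(boolean verbose) throws Exception;
  }

  /**
   * Runs a single job and checks its result.
   * Returns 0 on success. On failure, runs the cleanup hook and returns 1.
   */
  static int runJob(WaitableJob job, Runnable cleanup) {
    boolean success;
    try {
      success = job.waitForCompletion(true);
    } catch (Exception e) {
      if (e instanceof InterruptedException) {
        // Preserve the interrupt status for callers further up the stack.
        Thread.currentThread().interrupt();
      }
      System.err.println("Job threw an exception: " + e);
      success = false;
    }
    if (!success) {
      // Failed or killed: undo partial work before reporting the failure.
      cleanup.run();
      return 1;
    }
    return 0;
  }

  /**
   * Runs the jobs in order and stops at the first failure, so later
   * steps never run on top of incomplete data.
   */
  static int runWorkflow(List<WaitableJob> jobs, Runnable cleanup) {
    for (WaitableJob job : jobs) {
      int exitCode = runJob(job, cleanup);
      if (exitCode != 0) {
        return exitCode;
      }
    }
    return 0;
  }
}
```

A tool's main method would pass the returned code to System.exit(), so that a script driving the crawl workflow sees the failure as a non-zero exit status.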