[ https://issues.apache.org/jira/browse/NUTCH-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783622#action_12783622 ]
Julien Nioche commented on NUTCH-770:
-------------------------------------

"Time limit" is definitely better than "timebomb" (but not as amusing).

FetchQueues: having it there has the advantage that we can count how many URLs have been skipped due to the time limit. That's in the same spirit as https://issues.apache.org/jira/browse/NUTCH-658, which I have been using for a while. It's very useful to know what happens to the input URLs, and it reveals quite a lot about the behaviour of the fetch.

Code style: I suppose the following Eclipse code style is the one to use?
(http://wiki.apache.org/lucene-java/HowToContribute?action=AttachFile&do=view&target=Eclipse-Lucene-Codestyle.xml)

> Timebomb for Fetcher
> --------------------
>
>                 Key: NUTCH-770
>                 URL: https://issues.apache.org/jira/browse/NUTCH-770
>             Project: Nutch
>          Issue Type: Improvement
>            Reporter: Julien Nioche
>         Attachments: log-770, NUTCH-770.patch
>
>
> This patch provides the Fetcher with a timebomb mechanism. By default the
> timebomb is not activated; it can be set using the parameter
> fetcher.timebomb.mins. The number of minutes is relative to the start of the
> Fetch job. When the number of minutes is reached, the QueueFeeder skips all
> remaining entries, then all active queues are purged. This keeps the
> Fetch step under control and works well in combination with NUTCH-769.
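
To make the discussion concrete, here is a rough sketch of the behaviour described in the issue: a limit in minutes relative to the job start, a feeder that skips entries once the limit is reached, a purge of whatever is still queued, and a counter for the skipped URLs. It is not the code from NUTCH-770.patch. The class TimeLimitSketch and all of its methods are hypothetical names made up for illustration, and the sketch assumes that any value of zero or below means "no limit".

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of a fetch time limit; not the actual Nutch classes. */
class TimeLimitSketch {
    // Absolute deadline in milliseconds, or -1 when no limit is set (assumed default).
    private final long deadlineMillis;
    // One queue of URLs per host, standing in for the fetch queues.
    private final Map<String, Deque<String>> queues = new LinkedHashMap<>();
    // Number of URLs dropped because of the time limit (skipped or purged).
    private int skippedByTimeLimit = 0;

    TimeLimitSketch(long jobStartMillis, long timeLimitMins) {
        // The limit is relative to the start of the fetch job.
        this.deadlineMillis = timeLimitMins > 0
                ? jobStartMillis + timeLimitMins * 60_000L
                : -1L;
    }

    boolean expired(long nowMillis) {
        return deadlineMillis >= 0 && nowMillis >= deadlineMillis;
    }

    /** Feeder side: queue the URL, or skip and count it once the limit is reached. */
    boolean feed(String host, String url, long nowMillis) {
        if (expired(nowMillis)) {
            skippedByTimeLimit++;
            return false;
        }
        queues.computeIfAbsent(host, h -> new ArrayDeque<>()).add(url);
        return true;
    }

    /** Queue side: once the limit is reached, drop and count everything still queued. */
    int purgeIfExpired(long nowMillis) {
        if (!expired(nowMillis)) {
            return 0;
        }
        int purged = 0;
        for (Deque<String> queue : queues.values()) {
            purged += queue.size();
            queue.clear();
        }
        skippedByTimeLimit += purged;
        return purged;
    }

    int skipped() {
        return skippedByTimeLimit;
    }

    int queued() {
        int total = 0;
        for (Deque<String> queue : queues.values()) {
            total += queue.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // One-minute limit, with the job starting at t = 0 ms.
        TimeLimitSketch sketch = new TimeLimitSketch(0L, 1L);
        sketch.feed("a.example", "http://a.example/1", 0L);
        sketch.feed("a.example", "http://a.example/2", 30_000L);
        sketch.feed("b.example", "http://b.example/1", 60_000L); // limit reached: skipped
        sketch.purgeIfExpired(60_000L);                          // queued URLs are purged
        System.out.println("skipped=" + sketch.skipped() + " queued=" + sketch.queued());
    }
}
```

Keeping the counter next to the queues, rather than in the feeder alone, means that both the skipped and the purged URLs end up in a single figure that could be reported once the job finishes.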