I have created an issue for this functionality: https://issues.apache.org/jira/browse/NUTCH-2481
Sent: Thursday, December 14, 2017 at 2:07 PM
From: "Semyon Semyonov" <[email protected]>
To: "usernutch.apache.org" <[email protected]>
Subject: Usage previous stage HostDb data for generate(fetched deltas)

Dear all,

I plan to extend the HostDb functionality so that a DB_FETCHED delta is available to the generate stage.

Let's say that for each website we keep generating while the number of fetched pages is < 150. The problem is that for some websites this limit will (almost) never be reached, because of the site's structure. For example:

1) Round 1: 1 page
2) Round 2: 10 pages
3) Round 3: 80 pages
4) Round 4: 1 page
5) Round 5: 1 page
...etc.

I would like to add a delta condition on fetched that describes the speed of the process. Let's say: generate while number of fetched < 150 && delta_fetched > 1. In this case the process should stop at round 5, with the total number of fetched pages equal to 92.

To do this, I plan to modify the updatehostdb function and add a delta variable for fetched to hostdatum.

Do you think it is a good idea to do it this way?

Semyon.
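To make the proposed stop condition concrete, here is a minimal standalone sketch in Java. It is not Nutch code: the class FetchedDeltaSketch and its methods are hypothetical names used only for illustration. It also assumes that no delta is available until updatehostdb has run at least twice for a host, and that the delta check is skipped in that case; without that assumption, round 1 in the example (1 page) would already fail delta_fetched > 1.

```java
// Hypothetical illustration only; none of these names exist in Apache Nutch.
final class FetchedDeltaSketch {

    /**
     * Proposed condition: generate while fetched < maxFetched && delta > minDelta.
     * A null delta means "no previous HostDb record to compare against"
     * (an assumption of this sketch), in which case only the total is checked.
     */
    static boolean shouldGenerate(long totalFetched, Long deltaFetched,
                                  long maxFetched, long minDelta) {
        return totalFetched < maxFetched
                && (deltaFetched == null || deltaFetched > minDelta);
    }

    /** Replays per-round fetch counts and returns the total fetched at stop time. */
    static long simulate(long[] fetchedPerRound, long maxFetched, long minDelta) {
        long total = 0;
        Long recorded = null; // fetched count stored by the previous host update
        Long delta = null;    // change in fetched count between the last two updates
        for (long fetched : fetchedPerRound) {
            if (!shouldGenerate(total, delta, maxFetched, minDelta)) {
                break; // this round is not generated
            }
            total += fetched;
            // Mimics the host update step: compute the delta, then store the new count.
            delta = (recorded == null) ? null : total - recorded;
            recorded = total;
        }
        return total;
    }
}
```

Calling FetchedDeltaSketch.simulate(new long[] {1, 10, 80, 1, 1}, 150, 1) replays the five rounds from the example: after round 4 the delta is 1, so round 5 is not generated and the total stays at 92.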

