Adaptive crawl delay
--------------------

Key: NUTCH-475
URL: https://issues.apache.org/jira/browse/NUTCH-475
Project: Nutch
Issue Type: Improvement
Components: fetcher
Reporter: Doğacan Güney
Fix For: 1.0.0

The current fetcher implementation waits a default interval before making another request to the same server (if crawl-delay is not specified in robots.txt). IMHO, an adaptive implementation would be better. If the server is under little load and can serve requests fast, then the fetcher can ask for more pages in a given interval. Similarly, if the server is suffering from heavy load, the fetcher can slow down (with respect to that host), easing the load on the server.
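
To make the idea concrete, here is a minimal sketch of one way a per-host adaptive delay could work. It is not Nutch code and not a proposed patch: it is written in Python for brevity, and the class name `AdaptiveDelay`, its parameters (`default_delay`, `min_delay`, `max_delay`, `factor`, `alpha`) and the choice of a smoothed response time as the load signal are all assumptions made for illustration. The issue itself does not say how server load should be measured.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class AdaptiveDelay:
    """Hypothetical per-host delay policy; all defaults are illustrative."""
    default_delay: float = 5.0   # used until a host has been measured (seconds)
    min_delay: float = 1.0       # never hit a host more often than this
    max_delay: float = 30.0      # never back off longer than this
    factor: float = 5.0          # wait this multiple of the smoothed response time
    alpha: float = 0.5           # weight of the newest sample in the moving average
    _avg: Dict[str, float] = field(default_factory=dict)

    def record(self, host: str, response_time: float) -> None:
        """Fold one observed response time (seconds) into the host's average."""
        prev = self._avg.get(host)
        if prev is None:
            self._avg[host] = response_time
        else:
            self._avg[host] = self.alpha * response_time + (1 - self.alpha) * prev

    def delay_for(self, host: str, robots_delay: Optional[float] = None) -> float:
        """Return the wait before the next request to this host."""
        # An explicit crawl-delay from robots.txt always takes precedence.
        if robots_delay is not None:
            return robots_delay
        avg = self._avg.get(host)
        if avg is None:
            return self.default_delay
        # Fast hosts get a short delay, slow hosts a long one, within bounds.
        return min(self.max_delay, max(self.min_delay, self.factor * avg))
```

In this sketch the fetcher would call `record(host, elapsed)` after each response and `delay_for(host)` before scheduling the next request to that host. The lower and upper bounds keep a very fast server from being hammered and a very slow one from stalling the crawl indefinitely.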