Adaptive crawl delay
--------------------

                 Key: NUTCH-475
                 URL: https://issues.apache.org/jira/browse/NUTCH-475
             Project: Nutch
          Issue Type: Improvement
          Components: fetcher
            Reporter: Doğacan Güney
             Fix For: 1.0.0


The current fetcher implementation waits a default interval before making another 
request to the same server (if crawl-delay is not specified in robots.txt). 
IMHO, an adaptive implementation would be better. If the server is under little 
load and can serve requests quickly, the fetcher can ask for more pages in a 
given interval. Similarly, if the server is suffering from heavy load, the fetcher 
can slow down (w.r.t. that host), easing the load on the server.
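
To make the idea concrete, here is a rough sketch of one way it could work. 
It is not taken from the Nutch code base: the class name AdaptiveDelay, its 
methods, and the parameters (factor, alpha, min/max bounds) are all 
hypothetical. It assumes response time is a usable proxy for server load, 
keeps a smoothed per-host response time, and derives the delay from it. The 
adaptive delay only applies when robots.txt gives no crawl-delay; the caller 
is expected to check that first.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of a per-host adaptive delay. Not part of Nutch.
 * Assumption: a slow response indicates a loaded server.
 */
class AdaptiveDelay {
    private final long minDelayMs;   // never wait less than this
    private final long maxDelayMs;   // never wait more than this
    private final double factor;     // delay = factor * smoothed response time
    private final double alpha;      // weight of the newest sample, in (0, 1]
    private final Map<String, Double> avgResponseMs = new HashMap<>();

    AdaptiveDelay(long minDelayMs, long maxDelayMs, double factor, double alpha) {
        this.minDelayMs = minDelayMs;
        this.maxDelayMs = maxDelayMs;
        this.factor = factor;
        this.alpha = alpha;
    }

    /** Record how long the last request to this host took. */
    synchronized void record(String host, long responseMs) {
        Double prev = avgResponseMs.get(host);
        // Exponentially weighted moving average; the first sample is used as-is.
        double next = (prev == null)
                ? responseMs
                : alpha * responseMs + (1 - alpha) * prev;
        avgResponseMs.put(host, next);
    }

    /**
     * Delay to wait before the next request to this host.
     * Falls back to defaultDelayMs until at least one sample is recorded.
     */
    synchronized long delayFor(String host, long defaultDelayMs) {
        Double avg = avgResponseMs.get(host);
        if (avg == null) {
            return defaultDelayMs;
        }
        long delay = Math.round(factor * avg);
        // Clamp so a very fast server is not hammered and a slow one is not starved.
        return Math.max(minDelayMs, Math.min(maxDelayMs, delay));
    }
}
```

With factor = 2.0, a host that answers in about 100 ms would be clamped to the 
minimum delay, while a host that starts taking several seconds would see its 
delay grow up to the maximum. The bounds matter: without a minimum, a fast 
server could be hit far harder than the current default allows.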



