Hi.
Check the mail archive; some of these things were already discussed, and I believe people already have some code / plans, but it is not yet part of the sources.
In any case, such contributions are very welcome from my point of view.

Stefan


On 24.01.2006 at 11:08, Guenter, Matthias wrote:

Hi
Would it be of interest for the project to have an extension of crawl that allows:
- shaping the bandwidth used (inbound)
- keeping the number of requests per second within a certain limit (see the sketch after this list)
- scheduling these limits differently for working hours and night-time
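Just to make the requests-per-second point concrete, here is a minimal, hypothetical sketch in plain Java (not existing Nutch code; the class name, the 2 req/s figure and the scheduler remark are all invented for illustration):

import java.util.concurrent.TimeUnit;

// Minimal fixed-rate throttle: blocks callers so that at most
// maxRequestsPerSecond acquire() calls complete per second.
public class RequestThrottle {

    private final long minIntervalMillis;   // minimum gap between two requests
    private long lastRequestMillis = 0;

    public RequestThrottle(double maxRequestsPerSecond) {
        this.minIntervalMillis = (long) (1000.0 / maxRequestsPerSecond);
    }

    // Block until the next request is allowed, then record its timestamp.
    public synchronized void acquire() throws InterruptedException {
        long waitFor = lastRequestMillis + minIntervalMillis - System.currentTimeMillis();
        if (waitFor > 0) {
            TimeUnit.MILLISECONDS.sleep(waitFor);
        }
        lastRequestMillis = System.currentTimeMillis();
    }

    public static void main(String[] args) throws InterruptedException {
        // e.g. 2 requests/second during working hours; a scheduler could
        // swap in a more generous throttle at night.
        RequestThrottle throttle = new RequestThrottle(2.0);
        for (int i = 0; i < 5; i++) {
            throttle.acquire();
            System.out.println("fetching page " + i);
        }
    }
}

Inbound bandwidth shaping could work the same way, only counting bytes read per interval instead of requests.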

And an extension that crawls only file: / http: URLs which have changed after a given date? Something like:
sh ./nutch crawl -changedafter="2006-01-04"
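Purely as an illustration of what such a -changedafter cutoff could check (again not existing Nutch code; the class name and date format are assumptions), here is an http:-only sketch based on the Last-Modified / If-Modified-Since headers; a file: URL could be handled analogously with File.lastModified():

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.text.SimpleDateFormat;

// Hypothetical filter: fetch a URL only if it appears to have changed
// after the given cutoff date.
public class ChangedAfterFilter {

    private final long cutoffMillis;

    public ChangedAfterFilter(String changedAfter) throws java.text.ParseException {
        this.cutoffMillis = new SimpleDateFormat("yyyy-MM-dd").parse(changedAfter).getTime();
    }

    // Returns true if the server reports a modification after the cutoff,
    // or reports nothing (in which case we fetch to be safe).
    public boolean shouldFetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("HEAD");
        conn.setIfModifiedSince(cutoffMillis);
        int status = conn.getResponseCode();
        long lastModified = conn.getLastModified();   // 0 if the header is missing
        conn.disconnect();
        if (status == HttpURLConnection.HTTP_NOT_MODIFIED) {
            return false;                             // server says: unchanged since cutoff
        }
        return lastModified == 0 || lastModified > cutoffMillis;
    }

    public static void main(String[] args) throws Exception {
        ChangedAfterFilter filter = new ChangedAfterFilter("2006-01-04");
        System.out.println(filter.shouldFetch("http://lucene.apache.org/nutch/"));
    }
}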

The code could be delivered at the end of April as part of a student project.

Kind regards

Matthias Günter




