Hi.
Check the mailing list archive; some of these things have already been
discussed, and I guess people already have some code / plans, but it is
not yet part of the sources.
In any case, such contributions are very welcome from my point of view.
Stefan
On 24.01.2006 at 11:08, Guenter, Matthias wrote:
Hi
Would it be of interest for the project to have an extension of
crawl that allows:
- shaping the bandwidth used (inbound)
- keeping the number of requests per second within a given limit
- scheduling these limits differently for working hours and for night
time (a rough sketch follows this list)?
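Purely for illustration (this is not existing Nutch code, and every class
and method name below is made up): such a throttle could boil down to
enforcing a minimum gap between requests, with the gap recomputed from a
working-hours or night-time requests-per-second cap.

import java.util.Calendar;

/**
 * Hypothetical sketch of a scheduled request-rate throttle.
 * Not part of Nutch; names and the 08:00-18:00 window are assumptions.
 */
public class ScheduledRateLimiter {

    private final double workHoursRps; // allowed requests/second, 08:00-18:00
    private final double nightRps;     // allowed requests/second otherwise
    private long lastRequestNanos = System.nanoTime();

    public ScheduledRateLimiter(double workHoursRps, double nightRps) {
        this.workHoursRps = workHoursRps;
        this.nightRps = nightRps;
    }

    /** Current cap, depending on the wall-clock hour. */
    private double currentRps() {
        int hour = Calendar.getInstance().get(Calendar.HOUR_OF_DAY);
        boolean working = hour >= 8 && hour < 18;
        return working ? workHoursRps : nightRps;
    }

    /** Block until the next request is allowed under the current cap. */
    public synchronized void acquire() throws InterruptedException {
        long minGapNanos = (long) (1000000000L / currentRps());
        long now = System.nanoTime();
        long earliest = lastRequestNanos + minGapNanos;
        if (now < earliest) {
            long sleepMillis = (earliest - now) / 1000000L;
            Thread.sleep(Math.max(1, sleepMillis));
        }
        lastRequestNanos = System.nanoTime();
    }
}

Inbound bandwidth shaping would need a similar gate around the bytes read
per interval rather than around the request count, but the scheduling idea
(different caps for day and night) stays the same.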
A second extension would fetch only file:/http: URLs that have changed
after a given date.
Something like: sh ./nutch crawl -changedafter="2006-01-04"?
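Again only as a sketch of the idea, not of Nutch's actual fetcher code (the
-changedafter flag above is the proposal, not an existing option): for
http: URLs the check could lean on conditional requests (If-Modified-Since
/ Last-Modified), and for file: URLs on a plain lastModified() comparison.
All names below are hypothetical.

import java.net.HttpURLConnection;
import java.net.URL;
import java.text.SimpleDateFormat;
import java.util.Date;

/**
 * Hypothetical illustration of a "changed after" filter for http: URLs.
 * Not part of Nutch.
 */
public class ChangedAfterFilter {

    /** Returns true if the URL appears to have changed after the cut-off. */
    public static boolean changedAfter(String url, Date cutoff) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("HEAD");
        conn.setIfModifiedSince(cutoff.getTime());   // If-Modified-Since header
        int status = conn.getResponseCode();
        if (status == HttpURLConnection.HTTP_NOT_MODIFIED) {
            return false;                            // server says: unchanged
        }
        long lastModified = conn.getLastModified();  // 0 if header is missing
        // Without a Last-Modified header we cannot tell, so refetch to be safe.
        return lastModified == 0 || lastModified > cutoff.getTime();
    }

    public static void main(String[] args) throws Exception {
        Date cutoff = new SimpleDateFormat("yyyy-MM-dd").parse("2006-01-04");
        System.out.println(changedAfter("http://example.com/", cutoff));
    }
}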
The code could be delivered end of April as part of a student project.
Kind regards
Matthias Günter