Is there a way to measure (some sort of stats) how many requests Nutch
sent to a website in one day or one hour? I would like to measure the
crawl rate.

Here are the options I have tried so far (using a dump I created from the crawldb):

- use the "tstamp" field in the index, aggregate on it, and count the
entries for each unique date/hour
- filter the crawldb by modified date (limited to the date being analyzed)
and then aggregate again by date/hour (to make sure we don't count only
db_fetched, but every other status as well).
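
To make the aggregation step concrete, here is a minimal sketch of the per-hour counting in Python. It assumes the timestamps have already been extracted (from the index or the dump) into plain strings of the form YYYY-MM-DDTHH:MM:SS; that format, the function name count_per_hour, and the sample values are made up for illustration and are not anything Nutch produces as-is.

```python
from collections import Counter
from datetime import datetime


def count_per_hour(timestamps):
    """Count how many timestamps fall into each date/hour bucket.

    Assumes each timestamp is a string like "2013-03-04T10:15:00"
    (an assumed format, adjust the pattern to match the real data).
    """
    counts = Counter()
    for ts in timestamps:
        dt = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S")
        # Truncate to the hour so all entries in that hour share a key.
        counts[dt.strftime("%Y-%m-%d %H:00")] += 1
    return counts


# Made-up sample values, only to show the shape of the result.
sample = [
    "2013-03-04T10:15:00",
    "2013-03-04T10:45:30",
    "2013-03-04T11:05:00",
]

hourly = count_per_hour(sample)
for bucket in sorted(hourly):
    print(bucket, hourly[bucket])
```

Changing the bucket pattern to "%Y-%m-%d" would give per-day counts instead of per-hour counts.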

Thanks
Srini
