2007/4/19, Briggs <[EMAIL PROTECTED]>:
> Nutch 0.9
>
> Anyone know if it is possible to be more granular regarding crawl
> frequency? Meaning that I would like some sites to be crawled more
> often than others. For example, a news site should be crawled every
> day, but your average business website should be crawled every 30
> days. So, is it possible to specify a crawl frequency for specific
> URLs, or is it only global within the crawl db? I suppose I could
> have several crawldbs or something like that, and deal with it, but
> I'm just curious.

There's something like that in the Nutch JIRA (I couldn't find it, though), but that issue is about an adaptive algorithm, as opposed to user-provided settings, which would determine the rate of content change at any given URL and adapt the crawl frequency accordingly. I don't know if it's more than a wish at this point.

Cheers,
t.n.a.

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
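
For illustration only, the adaptive idea described in the reply could look roughly like the sketch below: shorten the re-crawl interval when a page was found changed, lengthen it when it was not, and clamp the result. This is not Nutch code and not the JIRA proposal. The class name `AdaptiveInterval`, the method `nextInterval`, and the factors 0.5 and 1.5 are all assumptions made up for this example. The 1-day and 30-day bounds are simply borrowed from the figures in the question.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of an adaptive re-crawl interval; not part of Nutch. */
public class AdaptiveInterval {
    // Bounds taken from the question: daily for news, 30 days for slow sites.
    static final long MIN_SECONDS = TimeUnit.DAYS.toSeconds(1);
    static final long MAX_SECONDS = TimeUnit.DAYS.toSeconds(30);

    /**
     * Returns the next fetch interval in seconds.
     * If the page changed since the last fetch, the interval is halved;
     * otherwise it grows by 50%. Both factors are assumed values.
     */
    static long nextInterval(long currentSeconds, boolean changed) {
        double factor = changed ? 0.5 : 1.5;
        long next = Math.round(currentSeconds * factor);
        // Clamp so a URL is never fetched more than daily or less than monthly.
        return Math.max(MIN_SECONDS, Math.min(MAX_SECONDS, next));
    }

    public static void main(String[] args) {
        long interval = TimeUnit.DAYS.toSeconds(8); // arbitrary starting point
        boolean[] observations = {true, true, false, true};
        for (boolean changed : observations) {
            interval = nextInterval(interval, changed);
            System.out.println((changed ? "changed" : "unchanged")
                    + " -> next fetch in " + interval / 86400.0 + " days");
        }
    }
}
```

With something along these lines, a frequently updated news site would drift toward the daily minimum and a static business site toward the 30-day maximum, without anyone configuring per-URL settings by hand.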
