Hi, in my project I actually re-crawl the website every time, and add a URL-dedup step to the crawl job. I mean, when Nutch finishes crawling the web site, URL dedup follows.
Any good ideas?

/Jack

On 5/27/05, k-team <[EMAIL PROTECTED]> wrote:
> hi Jack,
>
> > You can use an operating-system built-in scheduler such as crontab on
> > Unix, or a Java library such as Quartz.
>
> mmm, maybe I explained myself badly. Yeah, I know cron, but I was
> wondering how Nutch decides to recrawl -- for example -- URLs that are
> one week old.
>
> thanks.
>
> ciao,
> Marco

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
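The crawl-then-dedup job Jack describes can be chained in a small script and scheduled with cron, as the earlier reply suggested. A minimal sketch, assuming a 2005-era Nutch install; the install path, output directory, crawl depth, and the exact arguments to the `crawl` and `dedup` subcommands are assumptions here — check `bin/nutch` in your own version before relying on them:

```shell
#!/bin/sh
# recrawl.sh -- re-crawl the site, then run URL dedup on the result.
# NUTCH_HOME and CRAWL_DIR are hypothetical paths; adjust for your setup.

NUTCH_HOME=/opt/nutch
CRAWL_DIR=/data/crawl

cd "$NUTCH_HOME" || exit 1

# Re-crawl from the seed URL list into the crawl directory.
./bin/nutch crawl urls -dir "$CRAWL_DIR" -depth 3

# When the crawl finishes, deduplicate the fetched segments
# (verify the dedup subcommand's arguments for your Nutch version).
./bin/nutch dedup "$CRAWL_DIR"/segments/*

# Example crontab entry (crontab -e) to run this every Sunday at 02:00:
# 0 2 * * 0 /opt/nutch/recrawl.sh >> /var/log/recrawl.log 2>&1
```

As for Marco's question: if I remember right, the time-based re-fetch decision is driven by a fetch-interval setting in the WebDB configuration (`db.default.fetch.interval` in nutch-default.xml, defaulting to 30 days), so pages are only re-selected for fetching once their interval has elapsed — worth verifying against your version's nutch-default.xml.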
