Which version are you using?

On 8/3/06, Nahuel ANGELINETTI <[EMAIL PROTECTED]> wrote:
But the websites just added haven't been crawled yet... And they're not
crawled during the recrawl... Will "bin/nutch purge" restart everything?

On Thu, 3 Aug 2006 09:21:04 -0300,
"Lourival Júnior" <[EMAIL PROTECTED]> wrote:

> In the nutch conf/nutch-default.xml configuration file there is a
> property called db.default.fetch.interval. When you crawl a site, nutch
> schedules the next fetch for "today + db.default.fetch.interval" days.
> If you execute the recrawl command and the pages you fetched haven't
> reached this date, they won't be re-fetched. When you add new urls to
> the webdb, they will be ready to be fetched. So at this moment only
> these pages will be fetched by the recrawl script.
>
> I hope I helped you. If I said something wrong, please correct me :)
>
> Regards
>
> On 8/3/06, Nahuel ANGELINETTI <[EMAIL PROTECTED]> wrote:
> >
> > I have another question. I did what you suggested... It injects
> > the new urls and "recrawls" them, but unlike the first crawl it
> > doesn't download the web pages and really crawl them... perhaps I'm
> > mistaken somewhere...
> > Any idea?
> >
> > Regards,
> >
> > --
> > Nahuel ANGELINETTI
> >
> > On Thu, 3 Aug 2006 08:31:22 -0300,
> > "Lourival Júnior" <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Nahuel!
> > >
> > > You could use the command bin/nutch inject $nutch-dir/db -urlfile
> > > urlfile.txt. To recrawl your WebDB you can use this script:
> > > <http://today.java.net/pub/a/today/2006/02/16/introduction-to-nutch-2.html>
> > >
> > > Take a look at the adddays argument and at the configuration
> > > property db.default.fetch.interval. They influence the result.
> > >
> > > Regards!
> > >
> > > On 8/3/06, Nahuel ANGELINETTI <[EMAIL PROTECTED]> wrote:
> > > >
> > > > Hello,
> > > >
> > > > I was searching for a way to add new urls to the crawling
> > > > url list and to recrawl all urls...
> > > >
> > > > Can you help me?
> > > >
> > > > thanks,
> > > >
> > > > --
> > > > Nahuel ANGELINETTI
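
The scheduling rule described in the quoted reply can be sketched roughly as follows. This is only an illustration of the idea, not Nutch's actual code: the function names are made up, the 30-day interval is an assumed value for db.default.fetch.interval, and adddays is modelled simply as shifting "today" forward when deciding what is due.

```python
from datetime import date, timedelta

# Assumption: illustrative value for db.default.fetch.interval (in days).
FETCH_INTERVAL_DAYS = 30


def next_fetch_date(fetched_on, interval_days=FETCH_INTERVAL_DAYS):
    """Hypothetical helper: a fetched page is scheduled for
    'today + db.default.fetch.interval' days."""
    return fetched_on + timedelta(days=interval_days)


def is_due(next_fetch, today, add_days=0):
    """Hypothetical helper: a page is selected for fetching once its
    scheduled date has been reached. add_days models the adddays
    argument by pretending the current date is that many days later."""
    return next_fetch <= today + timedelta(days=add_days)


first_crawl = date(2006, 8, 3)
recrawl_day = date(2006, 8, 4)

# A page fetched during the first crawl is not due one day later...
old_page = next_fetch_date(first_crawl)
print(is_due(old_page, recrawl_day))               # False

# ...unless adddays pushes the reference date past the schedule.
print(is_due(old_page, recrawl_day, add_days=31))  # True

# A newly injected url has never been fetched, so it is due right away.
new_url = recrawl_day
print(is_due(new_url, recrawl_day))                # True
```

Under these assumptions, a recrawl run shortly after the first crawl would pick up only the newly injected urls, unless adddays is set larger than the fetch interval.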
--
Lourival Junior
Universidade Federal do Pará
Curso de Bacharelado em Sistemas de Informação
http://www.ufpa.br/cbsi
Msn: [EMAIL PROTECTED]