Version 0.7.2 of Nutch.

On Thu, 3 Aug 2006 09:37:24 -0300,
"Lourival Júnior" <[EMAIL PROTECTED]> wrote:

> Which version are you using?
> 
> On 8/3/06, Nahuel ANGELINETTI <[EMAIL PROTECTED]> wrote:
> >
> > But the websites I just added haven't been crawled yet... And they
> > aren't crawled during the recrawl either...
> > Will "bin/nutch purge" restart everything?
> >
> >
> >
> > On Thu, 3 Aug 2006 09:21:04 -0300,
> > "Lourival Júnior" <[EMAIL PROTECTED]> wrote:
> >
> > > The Nutch configuration file conf/nutch-default.xml contains a
> > > property called db.default.fetch.interval. When you crawl a site,
> > > Nutch schedules the next fetch for "today +
> > > db.default.fetch.interval" days. If you run the recrawl command
> > > before the pages you fetched reach that date, they won't be
> > > re-fetched. New URLs added to the webdb, however, are ready to
> > > be fetched right away. So at this moment only these pages will be
> > > fetched by the recrawl script.
> > >
> > > I hope this helps. If I got something wrong, please correct
> > > me :)
> > >
> > > Regards
> > >
> > > On 8/3/06, Nahuel ANGELINETTI <[EMAIL PROTECTED]> wrote:
> > > >
> > > > I have another question. I did what you suggested, and it
> > > > injects the new URLs and "recrawls" them, but unlike the first
> > > > crawl it doesn't download the web pages and really crawl
> > > > them... Perhaps I'm making a mistake somewhere...
> > > > Any ideas?
> > > >
> > > > Regards,
> > > >
> > > > --
> > > > Nahuel ANGELINETTI
> > > >
> > > > On Thu, 3 Aug 2006 08:31:22 -0300,
> > > > "Lourival Júnior" <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Hi Nahuel!
> > > > >
> > > > > You could use the command bin/nutch inject $nutch-dir/db
> > > > > -urlfile urlfile.txt. To recrawl your WebDB you can use this
> > > > > script:
> > > > > <http://today.java.net/pub/a/today/2006/02/16/introduction-to-nutch-2.html>
> > > > >
> > > > > Take a look at the adddays argument and at the configuration
> > > > > property db.default.fetch.interval. Both influence the
> > > > > result.
> > > > >
> > > > > Regards!
> > > > >
> > > > > On 8/3/06, Nahuel ANGELINETTI <[EMAIL PROTECTED]> wrote:
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > I am looking for a way to add new URLs to the crawl URL
> > > > > > list, and for a way to recrawl all URLs...
> > > > > >
> > > > > > Can you help me?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > --
> > > > > > Nahuel ANGELINETTI
> > > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> > >
> > >
> >
> 
> 
> 
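
For reference, the scheduling rule described in the quoted replies can be sketched in a few lines of Python. This is only an illustration, not Nutch code: the function names are hypothetical, the 30-day interval is an assumed value for db.default.fetch.interval, and treating adddays as a forward shift of the reference date is an assumption about how the recrawl script uses it.

```python
from datetime import date, timedelta

# Assumed value of db.default.fetch.interval (in days); check your own
# conf/nutch-default.xml or nutch-site.xml for the real setting.
FETCH_INTERVAL_DAYS = 30


def next_fetch(fetched_on, interval_days=FETCH_INTERVAL_DAYS):
    """Date on which a page fetched on `fetched_on` becomes due again."""
    return fetched_on + timedelta(days=interval_days)


def is_due(next_fetch_on, today, add_days=0):
    """True if a page scheduled for `next_fetch_on` would be selected.

    `add_days` stands in for the adddays argument: it moves the
    reference date forward so that pages become due earlier.
    """
    return next_fetch_on <= today + timedelta(days=add_days)


today = date(2006, 8, 3)
old_page = next_fetch(date(2006, 8, 1))  # crawled two days ago
new_page = today  # assumption: freshly injected URLs are due immediately

print(is_due(old_page, today))               # False: not yet due
print(is_due(new_page, today))               # True: new URL is fetched
print(is_due(old_page, today, add_days=31))  # True: adddays forces it
```

Under these assumptions, a recrawl run shortly after the first crawl picks up only the newly injected URLs, unless adddays is larger than the remaining interval.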
