and recrawl
Hi,
62 docs are in the index.
José
From: Alexander Aristov [EMAIL PROTECTED]
Sent: Tuesday, 2 December 2008 06:58
To: nutch-user@lucene.apache.org
Subject: Re: RE: Problem with crawl and recrawl
Maybe a silly question, but
How to know how many
From: Julien Nioche [mailto:[EMAIL PROTECTED]
Sent: Monday, 8 December 2008 18:22
To: nutch-user@lucene.apache.org
Subject: Re: RE: Problem with crawl and recrawl
Hello Jose,
Sorry if I am suggesting something obvious, but after you've done the
*updateDB*, do you call *generate* to get a new segment
When you run the generate and fetch commands, are you also running an
updatedb command, and then multiple generate and fetch cycles? The depth 3
parameter automates this in the crawl command.
Dennis
José Mestre wrote:
Hi,
I'm using Nutch to index part of an intranet website.
When I use the
-----Original Message-----
From: Dennis Kubes [mailto:[EMAIL PROTECTED]
Sent: Monday, 1 December 2008 18:48
To: nutch-user@lucene.apache.org
Subject: Re: Problem with crawl and recrawl
When you run the generate and fetch commands, are you also running an
updatedb command, and then multiple generate and fetch cycles
(DB_fetched): 62
CrawlDb statistics: done
I don't understand why URLs remain unfetched.
Regards.
Jo
From: José Mestre [EMAIL PROTECTED]
Sent: Monday, 1 December 2008 19:01
To: nutch-user@lucene.apache.org
Subject: RE: Problem with crawl and recrawl
Hi,
I use the script, and I've already tried running it line by line.
Yes, after the fetch I do an updatedb, and after that I fetch again, ... as
many fetches as the depth value
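
For reference, the cycle discussed in this thread (generate a segment, fetch it, then updatedb, repeated once per depth level) could be sketched roughly as below. This is only an illustrative sketch, not the script used above: it assumes the Nutch 0.9/1.0-style `bin/nutch generate`, `fetch`, and `updatedb` commands, the helper name `crawl_cycle_commands` and the `SEGMENT_<n>` names are hypothetical placeholders, and the real segment directory is whatever `generate` creates on each round.

```python
from typing import List


def crawl_cycle_commands(crawldb: str, segments_dir: str, depth: int) -> List[List[str]]:
    """Build the command list for `depth` generate/fetch/updatedb rounds.

    Assumption: Nutch 0.9/1.0-style command line. SEGMENT_<n> is a
    placeholder; the actual segment directory is created by `generate`
    and has to be looked up before `fetch` is called.
    """
    commands: List[List[str]] = []
    for round_no in range(1, depth + 1):
        # Hypothetical placeholder for the segment produced in this round.
        segment = f"{segments_dir}/SEGMENT_{round_no}"
        # A new segment must be generated in every round, not only the first.
        commands.append(["bin/nutch", "generate", crawldb, segments_dir])
        commands.append(["bin/nutch", "fetch", segment])
        # updatedb feeds newly discovered links back into the crawldb,
        # so the next generate has fresh URLs to select.
        commands.append(["bin/nutch", "updatedb", crawldb, segment])
    return commands


if __name__ == "__main__":
    for cmd in crawl_cycle_commands("crawl/crawldb", "crawl/segments", depth=3):
        print(" ".join(cmd))
```

The point of the sketch is the ordering: if a round skips `generate` after `updatedb`, the following `fetch` has no new segment to work on, which may be one reason URLs stay unfetched.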