> On the first crawl I have no problems, but when I recrawl, my crawl
> database still contains the pages and links from the previous run. So if
> I first crawl with depth 1 and later recrawl with depth 1 again, the
> result is like a depth-2 crawl. For example:
> 
> I do a depth-1 crawl of www.fgfgfgfgfgfgf.com ; it fetches that page,
> and the fetched content contains a link to
> www.vbvbvbvbvbvbvbvb.com.   When I recrawl with depth 1, it crawls
> both the first site and the second one, which was added during the
> first crawl. So this is like a depth-2 crawl of the first site, not a
> depth-1 recrawl.

I think you are looking for this property setting:

<property>
  <name>db.update.additions.allowed</name>
  <value>false</value>
  <description>If true, updatedb will add newly discovered URLs, if false
  only already existing URLs in the CrawlDb will be updated and no new
  URLs will be added.
  </description>
</property>
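
To make the override take effect, the property should go in conf/nutch-site.xml, which takes precedence over the defaults in conf/nutch-default.xml. A minimal sketch of what that file would look like (the surrounding <configuration> element is the standard Hadoop-style config wrapper; everything else on your system stays as it is):

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: site-specific overrides of nutch-default.xml -->
<configuration>
  <property>
    <name>db.update.additions.allowed</name>
    <value>false</value>
    <description>Stop updatedb from adding newly discovered URLs;
    only URLs already in the CrawlDb will be updated.</description>
  </property>
</configuration>
```

With this in place, the updatedb step of a recrawl only refreshes the fetch status of URLs already in the CrawlDb, so repeated depth-1 recrawls will not keep widening the frontier.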

I hope that helps.

JohnM

-- 
john mendenhall
[EMAIL PROTECTED]
surf utopia
internet services
