> In the first crawl I have no problems, but when I recrawl, my crawl
> database already contains pages and links from the previous run. So if
> I first crawl with depth 1 and later recrawl with depth 1 again, it is
> like a depth-2 crawl. For example:
>
> I do a depth-1 crawl on www.fgfgfgfgfgfgf.com; it fetches information
> from that page, and in that information there is a link to
> www.vbvbvbvbvbvbvbvb.com. When I recrawl with depth 1 again, it
> crawls both the first site and the second one, which was added during
> the first crawl. So this is like a depth-2 crawl of the first site,
> not a depth-1 recrawl.
I think you are looking for this property setting:

  <property>
    <name>db.update.additions.allowed</name>
    <value>false</value>
    <description>If true, updatedb will add newly discovered URLs, if
    false only already existing URLs in the CrawlDb will be updated
    and no new URLs will be added.
    </description>
  </property>

I hope that helps.

JohnM

-- 
john mendenhall
[EMAIL PROTECTED]
surf utopia
internet services
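P.S. As a minimal sketch, assuming a standard Nutch installation: the
usual place to override this property is conf/nutch-site.xml (rather
than editing conf/nutch-default.xml directly), for example:

  <?xml version="1.0"?>
  <configuration>
    <!-- Stop updatedb from adding newly discovered URLs to the
         CrawlDb, so a depth-1 recrawl stays a depth-1 recrawl. -->
    <property>
      <name>db.update.additions.allowed</name>
      <value>false</value>
    </property>
  </configuration>

Depending on your Nutch version, the updatedb command may also accept a
-noAdditions switch that disables additions for a single run without
changing the configuration file; check "bin/nutch updatedb" usage
output for your release.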
