2006/10/18, [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
Btw, we have some virtual local hosts; how does db.ignore.external.links deal with those?
Update: setting db.ignore.external.links to true in nutch-site.xml (and later also in nutch-default.xml, as a sanity check) *doesn't work*: I feed the crawl process a handful of URLs and can only watch helplessly as the crawl spreads to dozens of other sites. To answer your question: it seems pointless to talk about virtual host handling when the elementary filtering logic doesn't work... :-\

t.n.a.
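
For reference, the override described above would normally be written as a property entry like the one below. This is a sketch of the usual Nutch property syntax, not a copy of the poster's actual file; it is assumed to sit inside the existing root element of nutch-site.xml.

```xml
<!-- Assumed override: drop outlinks that point away from the page they were found on -->
<property>
  <name>db.ignore.external.links</name>
  <value>true</value>
</property>
```

A value set in nutch-site.xml is meant to take precedence over the same property in nutch-default.xml, so editing the default file should not normally be needed.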