2006/10/18, [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
> Btw we have some virtual local hosts, how does the db.ignore.external.links
> deal with that?

Update: setting db.ignore.external.links to true in nutch-site (and later also in nutch-default as a sanity check) *doesn't work*: I feed the crawl process a handful of URLs and can only watch helplessly as the crawl spreads to dozens of other sites.

In answer to your question, it seems pointless to talk about virtual host handling if the elementary filtering logic doesn't work... :-\

t.n.a.

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
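
For reference, here is a rough sketch of what an "ignore external links" filter would be expected to do, and why the virtual host question matters. It is not Nutch's code: the function names (`host_of`, `is_external`, `keep_internal`) are made up for illustration, and it assumes the check is a plain hostname comparison between the page a link was found on and the link target.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercased hostname of a URL, or None if it has none."""
    return urlsplit(url).hostname


def is_external(from_url, to_url):
    """Treat a link as external when the two hostnames differ.

    Assumption: only the hostname is compared, so the port and the
    underlying IP address play no role in the decision.
    """
    from_host = host_of(from_url)
    to_host = host_of(to_url)
    return from_host is None or to_host is None or from_host != to_host


def keep_internal(from_url, outlinks):
    """Drop every outlink that points to a different hostname."""
    return [link for link in outlinks if not is_external(from_url, link)]
```

Under that assumption, two name-based virtual hosts on the same machine would count as different sites, because their hostnames differ even though the IP address is shared. A filter like this should also never let a crawl seeded with a handful of URLs wander off to dozens of unrelated hosts, which is what makes the behaviour described above look like a bug or a setting that is not being picked up.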
