"... don't you achieve the same functionality using the db.ignore.external.links property in nutch-default.xml?"
I have a similar doubt. Setting db.ignore.external.links won't keep Nutch from reaching external domains that it arrives at through a redirection.

As extracted from http://lucene.472066.n3.nabble.com/db-ignore-external-links-true-and-redirects-td615411.html:

"if I start at http://www.xyz.com and Nutch finds a link pointing to http://www.xyz.com/blog which is actually a redirection to http://blog.xyz.com then Nutch will start fetching pages from http://blog.xyz.com even though it was not in seed url file"

Does this patch solve this?
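For reference, this is roughly how the property in question would be switched on. It is a minimal sketch, assuming the usual practice of overriding values from nutch-default.xml in conf/nutch-site.xml rather than editing the defaults file directly. It only shows the setting under discussion and does nothing by itself about the redirect case described above.

```xml
<!-- conf/nutch-site.xml: overrides the default from nutch-default.xml -->
<configuration>
  <property>
    <name>db.ignore.external.links</name>
    <!-- true: outlinks pointing to external hosts are ignored -->
    <value>true</value>
  </property>
</configuration>
```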

