> Hello List,
>
> I have a newbie question and I hope that someone can help me.
> I am doing a whole-web crawl, but I don't want to leave the injected
> domains, i.e. links to external domains should not be followed.
>
> How can I do that?
Hi,

I haven't seen any option to do that in my experience with Nutch.
The way I do it is that, at the same time as I generate the list of URLs to crawl, I also update regex-urlfilter.txt so that it only accepts those domains.
Note that this will slow down the crawl a bit, since Nutch goes through that file for every URL.

Hope that helps,
Bogdan
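
A minimal sketch of how the filter file could be generated alongside the seed list. This is not part of Nutch: the helper name build_url_filter and the example seed URLs are made up for illustration, and it assumes the usual regex-urlfilter.txt format, where each rule is a "+" (accept) or "-" (reject) followed by a regex and the first matching rule decides.

```python
from urllib.parse import urlparse


def build_url_filter(seed_urls):
    """Return regex-urlfilter.txt content that accepts only the seed hosts.

    Hypothetical helper: one "+" rule per injected host, then a final
    "-." rule that rejects every URL not matched earlier.
    """
    # Collect the distinct host names of the injected URLs.
    hosts = sorted(
        {urlparse(url).hostname for url in seed_urls if urlparse(url).hostname}
    )

    lines = ["# accept only the injected hosts (generated file)"]
    for host in hosts:
        # Host names contain only letters, digits, hyphens and dots,
        # so escaping the dots is enough to make the name a literal regex.
        escaped = host.replace(".", r"\.")
        # Optional port, then a slash or end of string, so that
        # www.example.com.other.org does not slip through.
        lines.append("+^https?://" + escaped + "(:[0-9]+)?(/|$)")

    lines.append("# reject everything else")
    lines.append("-.")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example seeds; in practice these would be the URLs you inject.
    seeds = ["http://www.example.com/index.html", "https://nutch.apache.org/"]
    print(build_url_filter(seeds), end="")
```

The output would then be saved as regex-urlfilter.txt in the Nutch conf directory before each crawl. Keep in mind that the generated file replaces the default rules, so any default exclusions you still want (for example for image or archive suffixes) would need to be added before the "+" lines.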
