Thank you very much for this patch. This is great news, and I am looking forward to using it.
Before I do, I have a small question: I already have an index of around 1 million pages built with Nutch 0.9, and I would first like to remove from it all pages whose domain does not end in ".be". Is this possible, and if so, how? Or would it be better to start over from scratch with a new index using Nutch 1.0?

Regards

--- On Sat, 12/13/08, Dennis Kubes <[email protected]> wrote:

> From: Dennis Kubes <[email protected]>
> Subject: Updated Domain URLFilter
> To: [email protected]
> Date: Saturday, December 13, 2008, 8:57 AM
> An updated patch has been added for the domain urlfilter.
> This now includes the matching against domain suffix, domain
> name, and hostname in that order.
>
> Dennis
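
For reference, the matching order described in the quoted message (domain suffix, then domain name, then hostname) could be sketched roughly as below. This is only an illustration, not the code from the patch: the class name `DomainMatchSketch`, the `accepts` method, and the allow-list contents are all hypothetical, and the suffix is simplified to the last host label, whereas a real filter would need a proper suffix list to handle cases such as "co.uk".

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of suffix -> domain name -> hostname matching.
// Not the Nutch domain urlfilter; names and behavior here are assumptions.
public class DomainMatchSketch {

    // Allow-list entries may be a suffix ("be"), a domain name
    // ("example.org"), or a full hostname ("www.example.com").
    private final Set<String> allowed;

    public DomainMatchSketch(Set<String> allowed) {
        this.allowed = allowed;
    }

    /** Returns true if the URL's suffix, domain name, or hostname is allowed. */
    public boolean accepts(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return false; // unparsable URL: reject
        }
        if (host == null) {
            return false; // no host component: reject
        }
        host = host.toLowerCase(Locale.ROOT);

        String[] labels = host.split("\\.");
        // Simplification: treat the last label as the suffix.
        String suffix = labels[labels.length - 1];
        // Simplification: treat the last two labels as the domain name.
        String domain = labels.length >= 2
                ? labels[labels.length - 2] + "." + suffix
                : host;

        // Check in the order given in the message: suffix, domain, hostname.
        return allowed.contains(suffix)
                || allowed.contains(domain)
                || allowed.contains(host);
    }

    public static void main(String[] args) {
        DomainMatchSketch filter = new DomainMatchSketch(Set.of("be"));
        System.out.println(filter.accepts("http://www.example.be/page.html"));
        System.out.println(filter.accepts("http://www.example.com/page.html"));
    }
}
```

With only "be" in the allow-list, the first URL is accepted and the second rejected. The sketch only decides whether a single URL passes; it does not by itself remove anything from an existing index.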
