Thank you very much for this patch. This is great news, and I am looking
forward to using it.

Before I do, I have a small question: I already have an index of around 1
million pages built with Nutch 0.9, and I would first like to remove from this
index all pages whose host does not end with ".be". Is this possible, and if
so, how? Or would it be better to start over from scratch with a new index
using Nutch 1.0?
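
To make clear what I mean by "ending with .be", here is a minimal Java sketch
of the check I have in mind. The class and method names are hypothetical and
are not part of Nutch or of the patch; the sketch only tests the host of a URL
and does not touch the index itself.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper, for illustration only (not a Nutch class).
class BeSuffixCheck {

    // Returns true when the host part of the URL ends with ".be".
    // URLs that cannot be parsed, or that have no host, are rejected.
    public static boolean hasBeHost(String url) {
        try {
            String host = new URI(url).getHost();
            if (host == null) {
                return false;
            }
            return host.toLowerCase(Locale.ROOT).endsWith(".be");
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Host is www.example.be, so this page would be kept.
        System.out.println(hasBeHost("http://www.example.be/page.html"));
        // Only the path ends with ".be", so this page would be removed.
        System.out.println(hasBeHost("http://www.example.com/index.be"));
    }
}
```

The check looks at the host rather than the whole URL, so a page such as
http://www.example.com/index.be would not be kept by accident.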

Regards



--- On Sat, 12/13/08, Dennis Kubes <[email protected]> wrote:

> From: Dennis Kubes <[email protected]>
> Subject: Updated Domain URLFilter
> To: [email protected]
> Date: Saturday, December 13, 2008, 8:57 AM
> An updated patch has been added for the domain urlfilter. 
> This now includes the matching against domain suffix, domain
> name, and hostname in that order.
> 
> Dennis


      
