Thank you very much.


> Subject: RE: Nutch Content Filtering
> From: [email protected]
> To: [email protected]
> Date: Tue, 17 Jul 2012 07:55:28 +0000
> 
> Use either an HtmlParseFilter or an IndexingFilter. If you do want to parse
> the page and extract its outlinks, use an indexing filter to keep pages from
> being indexed. If you just want to throw away the whole page and its outlinks
> when it does not contain your terms, implement HtmlParseFilter. See the
> plugins for examples.
> 
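For anyone reading this in the archive, here is a minimal sketch of the term check that either kind of filter would need. It is plain Java with no Nutch dependency; the class name TermMatcher and its methods are hypothetical, not part of Nutch. The assumption is that an IndexingFilter plugin would call something like this from its filter(...) method and return null when nothing matches (in Nutch 1.x, as far as I know, a null return discards the document), while an HtmlParseFilter plugin would run the same check on the parsed text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, NOT a Nutch class: reports whether a page's text
// contains at least one required term (case-insensitive substring match).
class TermMatcher {
    private final List<String> terms = new ArrayList<>();

    TermMatcher(List<String> requiredTerms) {
        // Normalize once so every later check is a plain contains() call.
        for (String term : requiredTerms) {
            if (term != null && !term.isEmpty()) {
                terms.add(term.toLowerCase(Locale.ROOT));
            }
        }
    }

    // True if the text contains any required term; false for null text.
    boolean matches(String pageText) {
        if (pageText == null) {
            return false;
        }
        String lower = pageText.toLowerCase(Locale.ROOT);
        for (String term : terms) {
            if (lower.contains(term)) {
                return true;
            }
        }
        return false;
    }
}
```

Inside a real plugin the check would run on the text the parser extracted for the page, and the list of terms would more likely come from the Nutch configuration than be hard-coded.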