I think that in Nutch 2.x HTMLParseFilter was renamed to ParseFilter. This is
not the case for 1.x; see NUTCH-1482.

 https://issues.apache.org/jira/browse/NUTCH-1482
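
For what it's worth, here is a rough sketch of the extraction step itself, using only the JDK's DOM classes. It walks a DocumentFragment and collects the name/content pairs of its meta elements. The class and method names (CustomMetaExtractor, collectMeta, extract, sampleFragment) are made up for illustration and are not part of Nutch. In a real 2.x plugin I would expect this logic to be called from the filter method of a class implementing org.apache.nutch.parse.ParseFilter, but please check the exact method signature against the 2.2 apidocs rather than taking it from here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper, not a Nutch class: shows only the DOM-walking part
// that a parse filter plugin would typically perform.
public class CustomMetaExtractor {

  // Recursively collects <meta name="..." content="..."> pairs under a node.
  public static void collectMeta(Node node, Map<String, String> out) {
    if (node.getNodeType() == Node.ELEMENT_NODE
        && "meta".equalsIgnoreCase(node.getNodeName())) {
      Element e = (Element) node;
      String name = e.getAttribute("name");       // "" if the attribute is absent
      String content = e.getAttribute("content");
      if (!name.isEmpty()) {
        out.put(name, content);
      }
    }
    for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
      collectMeta(c, out);
    }
  }

  // Returns the extracted pairs in document order.
  public static Map<String, String> extract(DocumentFragment doc) {
    Map<String, String> out = new LinkedHashMap<>();
    collectMeta(doc, out);
    return out;
  }

  // Builds a tiny fragment by hand so the sketch runs without a crawler.
  public static DocumentFragment sampleFragment() throws Exception {
    Document d = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder().newDocument();
    DocumentFragment frag = d.createDocumentFragment();
    Element html = d.createElement("html");
    Element head = d.createElement("head");
    Element meta = d.createElement("meta");
    meta.setAttribute("name", "author");
    meta.setAttribute("content", "Tony");
    head.appendChild(meta);
    html.appendChild(head);
    frag.appendChild(html);
    return frag;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(extract(sampleFragment()));
  }
}
```

The extracted map would then need to be stored with the page so that an indexing filter can pick it up later; how that is done depends on the 2.x storage API, so I have left it out.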

 
 
-----Original message-----
> From:Tony Mullins <[email protected]>
> Sent: Wed 12-Jun-2013 14:37
> To: [email protected]
> Subject: HTMLParseFilter equivalent in Nutch 2.2 ???
> 
> Hi,
> 
> If I go to http://wiki.apache.org/nutch/AboutPlugins, it shows that
> HTMLParseFilter is the extension point for adding custom metadata to HTML,
> and that its 'filter' method's signature is 'public ParseResult filter(Content
> content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment
> doc)', but that is in the 1.4 API docs.
> 
> I am on Nutch 2.2, and there is no class named HTMLParseFilter in the v2.2
> API docs:
> http://nutch.apache.org/apidocs-2.2/allclasses-noframe.html
> 
> So please tell me which class to use in the v2.2 API to add a custom rule
> that extracts some data from an HTML page (is it ParseFilter?) and adds it to
> the HTML metadata, so that I can later add it to Solr using an indexfilter
> plugin.
> 
> 
> Thanks,
> Tony.
> 
