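Hi Amit,

This can be done with a scoring filter. The ScoringFilter extension point has a passScoreBeforeParsing() method that receives both the CrawlDatum and the Content, so you can copy metadata from one to the other there.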
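
A rough sketch of the copy logic, in case it helps. It is not a working plugin: plain maps stand in for the CrawlDatum and Content metadata so it runs without Nutch on the classpath, and the class name, method name and the "redirect.original" key are made up for illustration. In a real plugin the same check would sit inside ScoringFilter.passScoreBeforeParsing(Text url, CrawlDatum datum, Content content). Whether the datum actually carries status 4 at that point in the fetch cycle is something you would need to verify.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in, NOT the Nutch API: plain maps replace the
// CrawlDatum and Content metadata objects.
class RedirectMetadataSketch {
    // Numeric value of db_redir_temp, the "Status = 4" from the question.
    static final int STATUS_DB_REDIR_TEMP = 4;
    // CrawlDatum metadata key named in the question.
    static final String DATUM_KEY = "_pst_";
    // Hypothetical key under which the value is stored in Content metadata.
    static final String CONTENT_KEY = "redirect.original";

    // Copies the datum value into the content metadata, but only for
    // temporary redirects and only when the datum value is present.
    static void copyOnRedirect(int status,
                               Map<String, String> datumMeta,
                               Map<String, String> contentMeta) {
        if (status != STATUS_DB_REDIR_TEMP) {
            return;
        }
        String value = datumMeta.get(DATUM_KEY);
        if (value != null) {
            contentMeta.put(CONTENT_KEY, value);
        }
    }

    public static void main(String[] args) {
        Map<String, String> datumMeta = new HashMap<>();
        datumMeta.put(DATUM_KEY, "http://example.com/old");
        Map<String, String> contentMeta = new HashMap<>();

        copyOnRedirect(STATUS_DB_REDIR_TEMP, datumMeta, contentMeta);
        System.out.println(contentMeta);
    }
}
```

The null check keeps the Content metadata untouched when the datum has no such entry, so pages that were not redirected pass through unchanged.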
Julien
On Tuesday, 26 November 2013, Amit Sela wrote:
> Hi all,
>
> I'd like to write a plugin that, after a successful fetch where the
> fetched URL is a redirect target (not the URL originally requested),
> adds the original URL Nutch attempted to fetch to Content's
> Metadata.
>
> Example:
>
> If a fetch was successful, and CrawlDatum's Status = 4 (db_redir_temp), I'd
> like to add the URL in CrawlDatum's Metadata.get("_pst_") to Content's
> Metadata.
>
> Which extension point should I implement?
>
> Thanks,
>
> Amit.
>
--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble