Hi all,

I'd like to write a plugin that handles successful fetches where the
fetched URL is a redirect target (not the URL Nutch originally attempted
to fetch). In that case, the original URL should be added to Content's
Metadata.

Example:

If a fetch was successful and CrawlDatum's status is 4 (db_redir_temp), I'd
like to add the URL from CrawlDatum's Metadata.get("_pst_") to Content's
Metadata.
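
To make the example above concrete, here is a rough sketch of the logic I have
in mind. It does not use the Nutch API: plain maps stand in for the metadata
of CrawlDatum and Content, the "_pst_" value is assumed to be readable as a
plain URL string, and the key "originalUrl" and all class and method names are
made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class RedirectSketch {
    // Numeric value of db_redir_temp, as quoted in the example above.
    static final int STATUS_DB_REDIR_TEMP = 4;

    // Hypothetical key for the original URL in Content's metadata.
    static final String ORIGINAL_URL_KEY = "originalUrl";

    // Copies the URL stored under "_pst_" in the datum metadata into the
    // content metadata, but only when the status is db_redir_temp.
    // Both maps are stand-ins for the real Nutch metadata objects.
    static void copyOriginalUrl(int status,
                                Map<String, String> datumMeta,
                                Map<String, String> contentMeta) {
        if (status != STATUS_DB_REDIR_TEMP) {
            return; // not a temporary redirect, nothing to record
        }
        String url = datumMeta.get("_pst_");
        if (url != null) {
            contentMeta.put(ORIGINAL_URL_KEY, url);
        }
    }

    public static void main(String[] args) {
        Map<String, String> datumMeta = new HashMap<>();
        datumMeta.put("_pst_", "http://example.com/old");
        Map<String, String> contentMeta = new HashMap<>();
        copyOriginalUrl(STATUS_DB_REDIR_TEMP, datumMeta, contentMeta);
        System.out.println(contentMeta.get(ORIGINAL_URL_KEY));
    }
}
```

What I don't know is where in the plugin system this kind of step belongs.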

Which extension point should I implement?

Thanks,

Amit.
