Sebastian Nagel created NUTCH-2398:
--------------------------------------
Summary: Fetcher saving redirected robots.txt under redirect target URL
Key: NUTCH-2398
URL: https://issues.apache.org/jira/browse/NUTCH-2398
Project: Nutch
Issue Type: Bug
Components: fetcher
Affects Versions: 1.13
Reporter: Sebastian Nagel
Priority: Minor
Fix For: 1.14
NUTCH-2300 lets the Fetcher optionally store the robots.txt response (content
and HTTP status). If '.../robots.txt' is redirected, the content fetched from
the redirect target is also stored, but with the redirect source URL as key. It
should be stored under the redirect target URL instead. Otherwise both responses
share one key and one of them is overwritten in the segment's map file.
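
The key collision can be illustrated with a minimal sketch. It is not Fetcher
code: a TreeMap stands in for the segment's map file, and the class name,
method names, URLs and record strings are all hypothetical, chosen only to show
why keying both responses by the redirect source URL loses one of them.

```java
import java.util.Map;
import java.util.TreeMap;

// Simplified stand-in for a segment's map file: one record per URL key.
// All names below are hypothetical and not part of the Nutch API.
final class RobotsTxtKeyDemo {

    // Reported behavior: both responses are keyed by the redirect source URL.
    static Map<String, String> storeUnderSourceUrl(String source, String target) {
        Map<String, String> segment = new TreeMap<>();
        segment.put(source, "301 redirect to " + target);
        // Same key again: the redirect record above is replaced.
        segment.put(source, "200 robots.txt content");
        return segment;
    }

    // Proposed behavior: the redirected content is keyed by the target URL.
    static Map<String, String> storeUnderTargetUrl(String source, String target) {
        Map<String, String> segment = new TreeMap<>();
        segment.put(source, "301 redirect to " + target);
        segment.put(target, "200 robots.txt content");
        return segment;
    }

    public static void main(String[] args) {
        String source = "http://example.com/robots.txt";
        String target = "https://www.example.com/robots.txt";
        // One record survives when both share the source URL as key.
        System.out.println(storeUnderSourceUrl(source, target).size()); // prints 1
        // Both records survive when each response has its own key.
        System.out.println(storeUnderTargetUrl(source, target).size()); // prints 2
    }
}
```

In this sketch the second put() silently replaces the first, which is the
analogue of the overwritten response described above.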