Sujen Shah created NUTCH-2331:
---------------------------------
Summary: REST API Fetch fails to retrieve HDFS path on distributed
mode
Key: NUTCH-2331
URL: https://issues.apache.org/jira/browse/NUTCH-2331
Project: Nutch
Issue Type: Bug
Components: fetcher, REST_api
Reporter: Sujen Shah
Assignee: Sujen Shah
Currently, in the REST API, if the user specifies only the crawlId and not the
absolute path of the segment to fetch, the fetcher finds the most recently
generated segment and uses that.
At present, however, this works only in local mode; see
https://github.com/apache/nutch/blob/master/src/java/org/apache/nutch/fetcher/Fetcher.java#L562-L573.
These lines need to be updated so that the fetcher can read the segments
directory and list its files on HDFS as well.
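
A rough sketch of the idea, not the actual patch: pick the latest segment
through a filesystem-neutral listing step, so the same logic can run locally
or on HDFS. Everything named here (LatestSegment, Entry, DirectoryLister,
latestSegment) is hypothetical, and the "<crawlId>/segments" layout and the
choice of "latest" by modification time are assumptions. In Nutch the lister
would presumably be backed by Hadoop's Path.getFileSystem(Configuration),
FileSystem.listStatus(Path) and FileStatus.getModificationTime() instead of
local file listing.

```java
import java.io.IOException;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class LatestSegment {

    /** Hypothetical stand-in for a directory entry: its path and modification time. */
    static final class Entry {
        final String path;
        final long modificationTime;

        Entry(String path, long modificationTime) {
            this.path = path;
            this.modificationTime = modificationTime;
        }
    }

    /**
     * Hypothetical abstraction over "list the children of a directory".
     * An HDFS-aware implementation could wrap FileSystem.listStatus(Path).
     */
    interface DirectoryLister {
        List<Entry> list(String dir) throws IOException;
    }

    /**
     * Returns the path of the most recently modified entry under
     * "<crawlId>/segments" (assumed layout), or empty if there is none.
     */
    static Optional<String> latestSegment(DirectoryLister lister, String crawlId)
            throws IOException {
        String segmentsDir = crawlId + "/segments";
        return lister.list(segmentsDir).stream()
                .max(Comparator.comparingLong((Entry e) -> e.modificationTime))
                .map(e -> e.path);
    }

    private LatestSegment() {
        // static helpers only
    }
}
```

Keeping the listing behind an interface means the selection logic does not
care which filesystem it runs on; only the lister implementation would need
to go through the Hadoop FileSystem API rather than local files.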