Sujen Shah created NUTCH-2331:
---------------------------------

             Summary: REST API Fetch fails to retrieve HDFS path on distributed mode
                 Key: NUTCH-2331
                 URL: https://issues.apache.org/jira/browse/NUTCH-2331
             Project: Nutch
          Issue Type: Bug
          Components: fetcher, REST_api
            Reporter: Sujen Shah
            Assignee: Sujen Shah


Currently in the REST API, if the user specifies only the crawlId and not the 
absolute path of the segment to fetch, the fetcher finds the most recently 
generated segment and uses that. 

However, this functionality currently works only in local mode; see 
https://github.com/apache/nutch/blob/master/src/java/org/apache/nutch/fetcher/Fetcher.java#L562-L573.

These lines need to be updated so that the fetcher can read the segments 
directory and list its files on HDFS as well. 
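
As a rough illustration of the idea (not the actual patch), the "pick the latest segment" step can be written so that it does not care which filesystem produced the directory listing. The sketch below uses only the Java standard library. The `Entry` class, the `pickLatest` method, and the example paths are hypothetical and are not Nutch or Hadoop names. It also assumes that "latest" means the most recent modification time. In distributed mode the entries would presumably be built from Hadoop's `FileSystem.listStatus(Path)`, whose `FileStatus` results expose `getPath()`, `getModificationTime()` and `isDirectory()`, rather than from `java.io.File.listFiles()`.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class LatestSegment {

    /** Hypothetical stand-in for one directory entry (not a Nutch or Hadoop type). */
    public static final class Entry {
        public final String path;
        public final long modificationTime; // milliseconds since the epoch
        public final boolean directory;

        public Entry(String path, long modificationTime, boolean directory) {
            this.path = path;
            this.modificationTime = modificationTime;
            this.directory = directory;
        }
    }

    /**
     * Returns the path of the most recently modified directory in the listing,
     * or an empty Optional if the listing contains no directories.
     * Assumption: "latest segment" means the highest modification time.
     */
    public static Optional<String> pickLatest(List<Entry> entries) {
        return entries.stream()
                .filter((Entry e) -> e.directory)
                .max(Comparator.comparingLong((Entry e) -> e.modificationTime))
                .map((Entry e) -> e.path);
    }

    public static void main(String[] args) {
        // Hypothetical listing; host name and segment names are made up.
        List<Entry> listing = Arrays.asList(
                new Entry("hdfs://namenode/crawl/segments/20170101000000", 1000L, true),
                new Entry("hdfs://namenode/crawl/segments/20170102000000", 2000L, true),
                new Entry("hdfs://namenode/crawl/segments/_SUCCESS", 3000L, false));

        System.out.println(pickLatest(listing).orElse("no segments found"));
    }
}
```

Returning an `Optional` also makes the "no segments under this crawlId" case explicit instead of failing on an empty or null array.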



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
