Which Nutch version are you using?

Sebastian and others worked on this a while ago, but I don't know how far it got. There are almost certainly open or resolved tickets for it on Jira, so please look there.

Thank you,
Lewis
On Wed, Mar 27, 2013 at 12:26 PM, Bai Shen wrote:
> I'm trying to crawl a local file system. I've made the changes to not
> ignore file urls and added protocol-file to the plugins list. I've
> included file:///data/mydir in my url file.
>
> However, when I run the fetch, Nutch tries to connect to file://data/mydir
> and therefore returns a 404 error. I think the root slash is being
> stripped during the injection, but I can't seem to find out why.
>
> Anybody have any suggestions or ideas?
>
> Thanks.

-- 
*Lewis*
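
P.S. To illustrate why the missing slash in the quoted message matters, here is a minimal sketch using Python's standard urllib.parse. It does not reproduce Nutch's injector or URL normalizers; it only shows how a generic URL parser reads the two forms. The helper name split_file_url is hypothetical and exists only for this example.

```python
from urllib.parse import urlsplit


def split_file_url(url):
    """Return (host, path) as a generic URL parser sees them.

    Hypothetical helper for illustration only; not part of Nutch.
    """
    parts = urlsplit(url)
    return parts.netloc, parts.path


# Three slashes: empty host, absolute path on the local file system.
print(split_file_url("file:///data/mydir"))  # ('', '/data/mydir')

# Two slashes: "data" is parsed as the host, and only "/mydir" is the path.
print(split_file_url("file://data/mydir"))   # ('data', '/mydir')
```

With only two slashes, "data" is read as a host name rather than the first directory, so a lookup of the intended local path would be expected to fail.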

