Please also see

https://issues.apache.org/jira/browse/NUTCH-1484

Sebastian resolved this one and, AFAIK, fixed the problem.
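
For anyone following along, here is a minimal sketch of why the lost slash matters in the symptom quoted below (file:///data/mydir turning into file://data/mydir). It uses Python's standard urllib.parse purely as an illustration; it is not Nutch code and it does not reproduce Nutch's normalizers. The helper name split_file_url is made up for this example.

```python
from urllib.parse import urlsplit


def split_file_url(url):
    """Return (host, path) as a generic URL parser sees them.

    Hypothetical helper for illustration only; not part of Nutch.
    """
    parts = urlsplit(url)
    return parts.netloc, parts.path


if __name__ == "__main__":
    # Three slashes: empty host, absolute path.
    # Two slashes: the first path segment is parsed as the host.
    for url in ("file:///data/mydir", "file://data/mydir"):
        host, path = split_file_url(url)
        print(f"{url!r}: host={host!r} path={path!r}")
```

With three slashes the host is empty and the path is /data/mydir. With two slashes "data" is read as the host and the path becomes /mydir, which would be consistent with the fetch failing to find the directory.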

On Thu, Mar 28, 2013 at 6:09 AM, Bai Shen <[email protected]> wrote:

> Finally found it in JIRA.
>
> https://issues.apache.org/jira/browse/NUTCH-1483
>
> I'll give the patch a try and see if that fixes my issue.
>
> On Wed, Mar 27, 2013 at 4:29 PM, Lewis John Mcgibbney <
> [email protected]> wrote:
>
> > Nutch version please?
> > Sebastian and others worked on this a while ago.
> > I don't know about the progress on it. There are most certainly
> > open/resolved tickets for it on Jira; please look there.
> > Thank you
> > Lewis
> >
> > On Wed, Mar 27, 2013 at 12:26 PM, Bai Shen <[email protected]>
> > wrote:
> >
> > > I'm trying to crawl a local file system.  I've made the changes to not
> > > ignore file urls and added protocol-file to the plugins list.  I've
> > > included file:///data/mydir in my url file.
> > >
> > > However, when I run the fetch, Nutch tries to connect to
> > > file://data/mydir and therefore returns a 404 error.  I think the root
> > > slash is being stripped during the injection, but I can't seem to find
> > > out why.
> > >
> > > Anybody have any suggestions or ideas?
> > >
> > > Thanks.
> > >
> >
> >
> >
> > --
> > *Lewis*
> >
>



-- 
*Lewis*
