Because the directory itself is a URL (file:/C:/temp/html/), so Nutch
fetches and indexes it like any other page.
You can either filter out this kind of URL by configuring
crawl-urlfilter.txt (-^.*/$ may help, but I'm not sure about the
regular expression) or filter the search results (for that you would
need to develop a Nutch plugin).
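
To sanity-check the regular expression before putting it in
crawl-urlfilter.txt, here is a minimal sketch. It uses Python's re
module as a stand-in for the Java regex engine that Nutch uses; for a
pattern this simple the two should behave the same, but that is an
assumption. The names DIRECTORY_URL and is_excluded and the sample
URLs are made up for illustration and are not part of Nutch.

```python
import re

# Pattern taken from the suggested rule "-^.*/$". The leading "-" is the
# exclude marker in crawl-urlfilter.txt, not part of the regex itself.
DIRECTORY_URL = re.compile(r"^.*/$")


def is_excluded(url: str) -> bool:
    """Return True if the exclude pattern matches the URL.

    The pattern is anchored with ^ and $, so searching anywhere in the
    string gives the same result as matching the whole string.
    """
    return DIRECTORY_URL.search(url) is not None


# Hypothetical sample URLs, for illustration only.
samples = [
    "file:/C:/temp/html/",
    "file:/C:/temp/html/index.html",
]

for url in samples:
    print(url, "->", "excluded" if is_excluded(url) else "kept")
```

So the pattern only catches URLs that end in a slash. One thing to
watch: the files are probably discovered through the directory
listing, so excluding the directory at crawl time may also stop the
files under it from being fetched. If that happens, filtering at
search time is the safer route.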


On Thu, Apr 29, 2010 at 4:33 AM, BK <> wrote:
> While indexing files on local file system, why does NUTCH interpret the
> directory as a URL - fetching file:/C:/temp/html/
> This causes the index page of this directory to show up on search results.
> Any solutions for this issue??
> Bharteesh Kulkarni
