In my opinion you should be able to use Lucene directly; Nutch itself relies on Lucene for indexing and retrieval. In your case there is nothing to crawl, since the files are local, static HTML. You only need to index the files and then retrieve them by querying the index, so Lucene should be all you need.
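To make the "index once, then query the index" workflow concrete, here is a rough sketch in plain Java. Please note that this is not Lucene's API: it is a toy inverted index using only the standard library, and the names `LocalHtmlIndex`, `addDocument`, `addDirectory` and `search` are made up for illustration. Lucene provides the same two steps, plus proper analysis, scoring and on-disk storage.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Toy stand-in for what Lucene does; hypothetical class, not part of Lucene or Nutch.
class LocalHtmlIndex {
    // term -> ids (file paths) of the documents containing that term
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Crude tag stripping, good enough for a sketch; a real setup would use an HTML parser.
    static String stripTags(String html) {
        return html.replaceAll("(?s)<[^>]*>", " ");
    }

    // Index one document: lowercase its text, split it into terms, record the postings.
    void addDocument(String id, String html) {
        String text = stripTags(html).toLowerCase(Locale.ROOT);
        for (String term : text.split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
    }

    // No crawling needed: just walk the local directory tree and index every .html file.
    void addDirectory(Path root) throws IOException {
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) {
            files = paths
                .filter(p -> Files.isRegularFile(p) && p.toString().endsWith(".html"))
                .collect(Collectors.toList());
        }
        for (Path p : files) {
            // Assumes the files are UTF-8 encoded.
            String html = new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
            addDocument(p.toString(), html);
        }
    }

    // AND query: return the ids of documents that contain every term of the query.
    Set<String> search(String query) {
        Set<String> result = null;
        for (String term : query.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (term.isEmpty()) {
                continue;
            }
            Set<String> hits = postings.getOrDefault(term, Collections.<String>emptySet());
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits);
            }
        }
        return result == null ? new TreeSet<String>() : result;
    }
}
```

You would call `addDirectory` once on the folder holding the HTML files and then `search` as often as you like. With Lucene the indexing step and the search step play the same roles, but the index is persisted on disk and the results come back ranked.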
