In my opinion you should be able to use Lucene directly; Nutch itself
relies on Lucene for indexing and retrieval. In your case you don't
need to crawl, since the files are just local, static HTML. You only
need to index the files and retrieve them by querying the index, so
Lucene should be all you need.
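
To make the "index, then query" idea concrete, here is a toy sketch in
plain Java. It is not Lucene's API: the class TinyIndex and its methods
are hypothetical names made up for illustration, the HTML stripping is
a crude regex, and the documents are inline strings rather than files
read from disk. Lucene does the same job with proper analyzers, ranked
scoring, and an on-disk index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index (hypothetical, NOT Lucene's API): maps each
// lowercased term to the sorted set of document names containing it.
public class TinyIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Crudely drop HTML tags, lowercase, and split on non-alphanumerics.
    static List<String> tokenize(String html) {
        String text = html.replaceAll("<[^>]*>", " ").toLowerCase(Locale.ROOT);
        List<String> terms = new ArrayList<>();
        for (String t : text.split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Indexing step: record every term of the document under its name.
    public void add(String docName, String html) {
        for (String term : tokenize(html)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docName);
        }
    }

    // Query step: return the documents that contain ALL query terms.
    public Set<String> search(String query) {
        Set<String> result = null;
        for (String term : tokenize(query)) {
            Set<String> docs = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("a.html", "<html><body><h1>Lucene</h1> indexing guide</body></html>");
        index.add("b.html", "<html><body>Crawling with Nutch</body></html>");
        System.out.println(index.search("lucene indexing")); // prints [a.html]
    }
}
```

Note that there is no crawling step anywhere: the documents are handed
straight to the index, which is the situation with local static files.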
