Hi,

I am new to Nutch and Lucene. I have a task to build a framework for
webpage caching on the local system (i.e. downloading and storing webpages
in the local filesystem), indexing (indexing pages by keyword), and search
(searching the local webpage cache by keyword). The preference is to build
the framework using Java APIs available in third-party jars.

At first glance, Nutch + Hadoop + Lucene seems like a good option for
building this framework. Do you think it is the right choice? Any ideas or
links would be appreciated.

Regards,
Amit
