>Hello,
>I've been searching for three days and I still haven't found a solution.
>What I need is to use Nutch from Java just to crawl a number of URLs and then
>use Lucene to index the pages that Nutch finds. I have to integrate this in
>my app, so I need to do it all from Java code. I know how to index with
>Lucene, but don't know how to just crawl with Nutch,

There are numerous tutorials scattered across the net for this (try the Nutch wiki for starters).

>do it programmatically

I don't really understand this terminology. Could you please be more specific?

>and from where get the urls to index with Lucene.

I assume you will specify the URLs you wish Nutch to fetch. It's then a case of specifying that the URLs will be indexed by Lucene. Adding the lucene-core(version).jar to your Nutch environment will let it use Lucene as the indexer.

>Also, the urls will be dynamically added and removed.

Added and removed from where? crawl-urlfilter? regex-urlfilter? Or from Lucene once they have been indexed?

>Any help would be appreciated.
>Thanks!
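
If "added and removed" means the list of URLs handed to Nutch, one possible approach is to keep that list in your app and rewrite a plain-text seed file (one URL per line) before each crawl. The following is only a rough sketch using the Java standard library; the class name SeedList and the file location are hypothetical, nothing in it calls Nutch itself, and it assumes your crawl is configured to read its seed URLs from that file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Nutch or Lucene: holds the seed URLs in
// memory and rewrites a plain-text seed file (one URL per line) on demand.
public class SeedList {
    // TreeSet keeps the URLs unique and in a stable, sorted order.
    private final Set<String> urls = new TreeSet<>();
    private final Path seedFile;

    public SeedList(Path seedFile) {
        this.seedFile = seedFile;
    }

    // Returns true if the URL was not already in the list.
    public boolean add(String url) {
        return urls.add(url.trim());
    }

    // Returns true if the URL was in the list and has been removed.
    public boolean remove(String url) {
        return urls.remove(url.trim());
    }

    // Overwrites the seed file with the current list, creating the
    // parent directory first if it does not exist yet.
    public void flush() throws IOException {
        Path parent = seedFile.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        Files.write(seedFile, urls, StandardCharsets.UTF_8);
    }
}
```

Note that this only controls what goes into later crawls. A URL that has already been crawled and indexed would presumably still need to be deleted from your Lucene index separately.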

