Hello! Sounds very interesting. Solr can run embedded in a Java application through the EmbeddedSolrServer class, which avoids the servlet container and HTTP layer. You would need to make some changes to the SolrIndexer tools in Nutch to use it.
Cheers

-----Original message-----
> From: Emre Çelikten <[email protected]>
> Sent: Thu 07-Jun-2012 22:24
> To: [email protected]
> Subject: Building Lucene index with Nutch 1.4
>
> Hello everybody,
>
> As part of a project, I am working on a FOSS tool that will build language
> models using data obtained from the web which will then be used for speech
> recognition. I plan to make this tool quite compact by encapsulating as
> much as I can in a single Java application and not requiring the user to
> install/configure tons of stuff.
>
> I have managed to set up Nutch and am able to crawl a website inside a Java
> application. The next thing I need to do is to search for certain keywords
> in the obtained data. I have read that the ability to build Lucene indexes
> has been removed from Nutch and we now need to use Solr instead. The way
> Solr works (servlets, HTTP) is not really appropriate for a tool that only
> needs search functionality that is invisible to the user.
>
> What would you recommend me to do in this case? Is there absolutely no way
> of building Lucene indexes? I could not find anything other than
> recommendations to use Solr instead. Should I try to use an older version
> of Nutch?
>
> Thanks in advance,
>
> Emre

