Solr itself does all three things. There is no need for Nutch: that is for crawling web sites, not file systems, and the original question is about a file system.
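For the indexing side (questions 1 and 3), here is a rough sketch rather than a tested setup. It assumes the ExtractingRequestHandler (Solr Cell) is mapped at /update/extract on a local Solr, as in the example configuration; the URL and the names changed_files and extract_url are mine, purely for illustration. It walks a folder recursively, picks out files modified since a given time, and builds the URL each file would be posted to. The upload itself is left out.

```python
import os
import urllib.parse

# Assumption: Solr runs locally and the extracting handler is mapped here.
SOLR_EXTRACT_URL = "http://localhost:8983/solr/update/extract"


def changed_files(root, since=0.0):
    """Recursively yield files under root modified after `since` (epoch seconds)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) > since:
                    yield path
            except OSError:
                # The file disappeared between listing and stat; skip it.
                continue


def extract_url(path, base=SOLR_EXTRACT_URL):
    """Build the URL for posting one file, using its path as the document id."""
    params = urllib.parse.urlencode({"literal.id": path, "commit": "true"})
    return base + "?" + params
```

Calling changed_files(root) with the default since=0.0 covers the initial full index. A scheduled job could then record the time of each pass and hand it in as `since` on the next one, so only altered files are posted again.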
Solr operates as a web service, running in any Java servlet container.
Detecting changes to files is trickier: there is no implementation of a
real-time update mechanism available for Windows, so you would have to
write that yourself. Otherwise you can poll the file system periodically
and re-index altered files.

On Fri, Jan 14, 2011 at 4:54 AM, Markus Jelsma
<markus.jel...@openindex.io> wrote:
> Nutch can crawl the file system as well. Nutch 1.x can also provide search but
> this is delegated to Solr in Nutch 2.x. Solr can provide the search and Nutch
> can provide Solr with content from your intranet.
>
> On Friday 14 January 2011 13:17:52 Cathy Hemsley wrote:
>> Hi,
>> Thanks for suggesting this.
>> However, I'm not sure a 'crawler' will work: as the various pages are not
>> necessarily linked (it's complicated: basically our intranet is a dynamic
>> and managed collection of independently published web sites, and users
>> found information using categorisation and/or text searching), so we need
>> something that will index all the files in a given folder, rather than
>> follow links like a crawler. Can Nutch do this? As well as the other
>> requirements below?
>> Regards
>> Cathy
>>
>> On 14 January 2011 12:09, Markus Jelsma <markus.jel...@openindex.io> wrote:
>> > Please visit the Nutch project. It is a powerful crawler and can
>> > integrate with Solr.
>> >
>> > http://nutch.apache.org/
>> >
>> > > Hi Solr users,
>> > >
>> > > I hope you can help. We are migrating our intranet web site management
>> > > system to Windows 2008 and need a replacement for Index Server to do
>> > > the text searching. I am trying to establish if Lucene and Solr is a
>> > > feasible replacement, but I cannot find the answers to these questions:
>> > >
>> > > 1. Can Solr be set up to recursively index a folder containing an
>> > > indeterminate and variable large number of subfolders, containing files
>> > > of all types: XML, HTML, PDF, DOC, spreadsheets, powerpoint
>> > > presentations, text files etc. If so, how?
>> > > 2. Can Solr be queried over the web and return a list of files that
>> > > match a search query entered by a user, and also return the abstracts
>> > > for these files, as well as 'hit highlighting'. If so, how?
>> > > 3. Can Solr be run as a service (like Index Server) that automatically
>> > > detects changes to the files within the indexed folder and updates the
>> > > index? If so, how?
>> > >
>> > > Thanks for your help
>> > >
>> > > Cathy Hemsley
>
> --
> Markus Jelsma - CTO - Openindex
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350

--
Lance Norskog
goks...@gmail.com