The KDE thing is very interesting, thanks for the link! I was hoping for something cross-platform though.
As regards using Nutch: how would it handle file updates? It seems to me a web crawler would only pick up new files and changes on each crawl, whereas a desktop search engine such as Spotlight indexes a file as soon as it is created or modified.

There's also this document I found on the Web, which describes some problems with using Nutch on the personal scale owing to its specialization for web crawling: it says there is a limit on the number of files crawled per directory and on the size of files crawled. This was all I was able to find under "Nutch desktop search" in Google. However, now that I look at it more closely, it's from 2004, so Nutch may well have fixed these problems in the interim.

http://docs.google.com/viewer?a=v&q=cache:bDjjs__eYPcJ:www.commercenet.com/images/0/06/CN-TR-04-04.pdf+nutch+desktop+search&hl=en&gl=us&pid=bl&srcid=ADGEESg12Bq0VDGk3FpevwOHIdbfr1bCkEZ3CH1yojEliyfeCJv_3JhGRe1gMPx66LiywsUYFWJhKKzsLBVoCtATNcghrW4DRLWlT5sd4YhIWMVaQjMKs5xN-8vqTOHFV2pw9bzCtoQY&sig=AHIEtbTpxSL0xmZJxa5CWm8MzDWD4vyAAg

Thanks,

Andrew

On Mon, Aug 15, 2011 at 6:07 AM, Markus Jelsma <[email protected]> wrote:

> With Nutch you can crawl your FS with ease and index to a Solr instance.
> It'll surely work. But you may also be interested in the cool KDE
> technologies that are specifically built for desktop search.
>
> http://thomasmcguire.wordpress.com/2009/10/03/akonadi-nepomuk-and-strigi-explained/
>
> On Monday 15 August 2011 04:41:11 Andrew Naylor wrote:
> > Any suggestions for the best way to get desktop search in the
> > Lucene/Solr/Nutch/Tika ecosystem? I want to be able to access (from my
> > own program) lists of terms that are indexed and weights for each file,
> > for example, but if a filesystem indexer and index updater already
> > exists somewhere I'd like to use it rather than write my own.
> >
> > I'm planning on working in Clojure, btw, not that that should make any
> > difference---
> >
> > Thanks,
> >
> > Andrew
>
> --
> Markus Jelsma - CTO - Openindex
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350
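
To make the file-update question above concrete, here is a minimal sketch of what a crawl-style updater does on each pass: it walks the tree and reports only files that are new or changed since the previous pass. This is not how Nutch, Solr, or Spotlight is implemented. The names `find_changed` and `forget_deleted` are hypothetical, Python is used purely for illustration, and using modification time plus size as the change signal is an assumption.

```python
import os


def find_changed(root, seen):
    """Return files under `root` that are new or modified since the last pass.

    `seen` maps path -> (mtime_ns, size) and is updated in place, so the
    caller keeps it between passes (hypothetical bookkeeping, not a Nutch API).
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                # The file vanished or is unreadable; skip it on this pass.
                continue
            stamp = (st.st_mtime_ns, st.st_size)
            if seen.get(path) != stamp:
                seen[path] = stamp
                changed.append(path)
    return changed


def forget_deleted(seen):
    """Drop entries whose files no longer exist and return their paths."""
    gone = [path for path in seen if not os.path.exists(path)]
    for path in gone:
        del seen[path]
    return gone
```

A caller would run `find_changed` on a schedule and hand the returned paths to whatever does the indexing, so a change only becomes searchable after the next pass. An event-driven indexer instead gets notified by the operating system at the moment a file is written, which is the difference the question is about.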

