On 28/03/2008, Jan Lehnardt <[EMAIL PROTECTED]> wrote:
>
> On Mar 27, 2008, at 15:38 , Johan Sørensen wrote:
> > On Thu, Mar 27, 2008 at 10:04 PM, Chris Anderson <[EMAIL PROTECTED]>
> > wrote:
> >> This is all still academic, but Lucene seems like the best fit for
> >> lightweight integration, with Solr and Sphinx providing a wide range
> >> of target support. Eg. if the API can support them, it'd be hard to
> >> imagine what it couldn't support.
> >
> > If we're throwing out alternatives I'd put a vote down for Xapian
> > (http://www.xapian.org/), in fact I'd be willing to put some code
> > behind it, once/if there's some sort of API in place on the CouchDB
> > side of things.

For fun, I've been playing with integrating Hyperestraier
(http://hyperestraier.sourceforge.net/) into CouchDB. So far I only have
a prototype indexer working, but it's successfully indexing and removing
CouchDB documents. I'm using Hyperestraier's HTTP interface
(http://hyperestraier.sourceforge.net/nguide-en.html#protocol), but it
should be relatively easy to use an in-memory index instead.

I have to say, it's been very easy to get CouchDB and Hyperestraier
plugged together so far. One of the big helpers is that Hyperestraier
identifies documents by URL, just like CouchDB :).

Of course, the indexer needs to be nicely configurable to tell it what
documents to index, what indexes the documents get added to, and what
bits of the document go in the various indexes.

Also, an indexer isn't much use if there's no searcher. I'll have to
work on that too, but I'm guessing that will be easier.

By the way, how do you perform a full text search on CouchDB? I'm
probably missing something really obvious, but I didn't spot anything
that actually calls couch_ft_query:execute/2. I was planning to trace it
back from there to discover the URL.

> There's no need to vote if you come up with code :)

Yes, I should do that too :).

- Matt
