I would suggest having the ability to tie a Lucene index entry to an emitted key; then, when we get a match in Lucene, we map that to a list of emitted keys and reduce from there. However, I think the original question from Behrad still stands: is it more efficient to emit multiple keys per document, or to create multiple views?
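To make the first idea a bit more concrete, here is a rough sketch in plain Python. It is only an illustration of the data flow (search hit -> emitted keys -> reduce), not anything couchdb-lucene actually provides: EMITTED_ROWS, build_key_index and reduce_hits are made-up names, and the list of matching doc ids stands in for whatever a full-text query would return.

```python
from collections import defaultdict

# Hypothetical stand-in for the rows a view's map function would emit:
# (doc_id, emitted_key, emitted_value)
EMITTED_ROWS = [
    ("doc1", ["color", "red"], 1),
    ("doc1", ["size", "large"], 1),
    ("doc2", ["color", "red"], 1),
    ("doc3", ["color", "blue"], 1),
]


def build_key_index(rows):
    """Map each doc id (what a full-text hit identifies) to its emitted rows."""
    index = defaultdict(list)
    for doc_id, key, value in rows:
        index[doc_id].append((tuple(key), value))
    return index


def reduce_hits(hit_doc_ids, key_index):
    """Sum the emitted values per key, restricted to the matched documents."""
    totals = defaultdict(int)
    for doc_id in hit_doc_ids:
        for key, value in key_index.get(doc_id, []):
            totals[key] += value
    return dict(totals)


key_index = build_key_index(EMITTED_ROWS)
# Assumption: the full-text search matched doc1 and doc2.
result = reduce_hits(["doc1", "doc2"], key_index)
print(result)
```

The sketch recomputes the reduce over the hit list on every query, so it says nothing about how such a reduce could be stored or updated incrementally, which is presumably the hard part.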
On Tue, Jun 29, 2010 at 09:54, Robert Newson <[email protected]> wrote:

> You are correct that you cannot do map/reduce with the Lucene
> full-text indexing engine. People keep asking for it, but no one can
> explain how it could be implemented. :)
>
> B.
>
> On Tue, Jun 29, 2010 at 2:50 PM, Luke Driscoll <[email protected]>
> wrote:
> > One of the reasons that I have not been using couchdb-lucene is that it
> > doesn't seem to allow us to use the reduce functionality of a view, and
> > that the results are just the documents. Am I way off in this statement?
> >
> > Luke
> >
> > On Tue, Jun 29, 2010 at 09:38, Sebastian Cohnen
> > <[email protected]> wrote:
> >
> >> in that case, you want the power of a search engine like lucene/solr.
> >> you should definitely have a look at couchdb-lucene.
> >>
> >> You wrote in another reply:
> >> > but we are querying through multiple document "keys" not key contents!
> >> > and we are not bios toward
> >> > full-text indexers for now in this project.
> >>
> >> you can actually define very precisely what you want to have indexed by
> >> lucene, may it be only one key, or everything down to attachments.
> >>
> >>
> >> On 29.06.2010, at 15:10, Behrad Zari wrote:
> >>
> >> > It's AND-logic. I read my post again and found that I've written "OR"
> >> > mistakenly.
> >> >
> >> > Actually talking, we are gonna filter results that are starting with
> >> > key[i]=val[i] AND starting with key[j]=val[j]
> >> > (our searches use startkey and endkey to filter doc ranges)
> >> > So, we may emit compound/array keys: [key1, ...] to simulate
> >> > AND-logic.
> >> >
> >> > I get your idea Sebastian, but CouchDB is preferred to do the actual
> >> > filtering (unity of all _byKey view results) since each of returned
> >> > rows from the server may contain huge number of results! and this is
> >> > not an efficient solution instead of a MAY-BE-EXISTING serverside
> >> > solution. (based on B+trees)
> >> >
> >> > --Behrad
