On Tue, Apr 20, 2010 at 9:42 PM, Ning Tan <[email protected]> wrote:
> On Tue, Apr 20, 2010 at 9:08 PM, Nick Poulden <[email protected]> wrote:
> > My question is how to approach this filtering problem with couchdb. At the
> > moment I have different views on 'account_type', 'registration_date' and
> > 'last_login_date'. My program queries each view separately and returns an
> > intersection of the resulting key arrays. This is working ok while my
> > database is small but it's not very scalable as all the document keys have
> > to be loaded in to memory before the intersection can happen.
>
> For these "ad hoc" queries, I would suggest a separate indexing system
> (e.g. Couch-Lucene, etc).
>
> We use a Solr system for similar purposes.

Historically the solution to this problem has always been *run all your changes through something else*. I'm pretty down on that as an actual solution -- it's feasible, but it makes *something else* a pretty bigass bottleneck. Now you need a distributed *something else* as well as a distributed couch. There are obvious benefits to a distributed couch, but your *something else* is just serving up answers to specific ad-hoc questions...

There's a pretty slick solution built into couch -- replication -- but the hard part is getting couch's replication to talk to your *something else*. I'm trying to fix that now, and it ought to be easy. I'll hit this list back once I get couch replication working on persevere 2.0, but the idea is that all your couch _changes should be able to flow into some other defined store (mongo, lucene, sql, whatever), and from your couchapp (or anywhere else) you should be able to hit your *something else* to resolve multiterm queries.

I'll post back in a few days, but I just wanted to jump in and point out that this kind of thing ought to be totally doable.
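
To make the _changes idea a bit more concrete, here is a rough Python sketch of the flow, not the persevere work itself. It assumes a couch that supports include_docs=true on the _changes feed. SideStore is a made-up in-memory stand-in for whatever *something else* you pick (mongo, lucene, sql), and the function names, base URL and database name are placeholders.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode


def fetch_changes(base_url, db, since=0):
    """Fetch one batch of rows from couch's _changes feed, docs included."""
    params = urlencode({"since": since, "include_docs": "true"})
    url = f"{base_url}/{quote(db, safe='')}/_changes?{params}"
    with urllib.request.urlopen(url) as resp:
        body = json.load(resp)
    # last_seq is the checkpoint to pass as `since` on the next poll.
    return body["results"], body["last_seq"]


class SideStore:
    """Hypothetical stand-in for the external store that answers queries."""

    def __init__(self):
        self.docs = {}

    def apply(self, rows):
        """Mirror a batch of _changes rows: upsert live docs, drop deleted ones."""
        for row in rows:
            if row.get("deleted"):
                self.docs.pop(row["id"], None)
            else:
                self.docs[row["id"]] = row["doc"]

    def query(self, **terms):
        """Return sorted ids of docs matching every field=value term (exact match only)."""
        return sorted(
            doc_id
            for doc_id, doc in self.docs.items()
            if all(doc.get(field) == value for field, value in terms.items())
        )
```

You would poll with something like rows, seq = fetch_changes("http://localhost:5984", "users") and then call store.apply(rows), keeping seq around for the next poll. Once the docs are mirrored, store.query(account_type="pro", country="US") answers the multiterm question in one place, with no key arrays to intersect on the client. A real store would also give you range queries on the date fields, which this exact-match toy does not.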
