On Tue, Jan 12, 2010 at 7:32 PM, Roger Binns <[email protected]> wrote:
> Note that stale=ok is also not a sufficient solution. You do not know in
> advance if it will return you stale data, or if CouchDB will decide to
> update the view (and take ages to do so).
I'd rather always query with stale=ok from the client, and query
without stale from, say, a server-side cron job running curl. That way
the client always gets fast and (eventually) consistent data, even if
Couch is reindexing (in the background, you might say) at that time.
I mention this only to make sure you haven't overlooked this usage
pattern.
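
To make the pattern concrete, here is a minimal sketch of the two kinds
of view request. The host, database name (parts), design document
(catalog) and view name (by_partno) are made up for illustration, and
limit=0 on the cron side is my own assumption for keeping the response
small, since that request is only there to trigger the index update.

```javascript
// Sketch only: host, database, design doc and view names are hypothetical.
function viewUrl(base, db, ddoc, view, opts = {}) {
  const params = new URLSearchParams();
  // View keys are JSON-encoded in the query string.
  if (opts.key !== undefined) params.set("key", JSON.stringify(opts.key));
  // stale=ok asks CouchDB to answer from the existing index without updating it.
  if (opts.stale) params.set("stale", "ok");
  if (opts.limit !== undefined) params.set("limit", String(opts.limit));
  const query = params.toString();
  const path = `${base}/${db}/_design/${ddoc}/_view/${view}`;
  return query ? `${path}?${query}` : path;
}

const base = "http://localhost:5984";

// Client side: always fast, possibly slightly out of date.
const clientUrl = viewUrl(base, "parts", "catalog", "by_partno", {
  key: "ZZZ945051",
  stale: true,
});

// Cron side: no stale parameter, so this request waits for the index update.
// limit=0 (an assumption of this sketch) avoids fetching rows nobody reads.
const cronUrl = viewUrl(base, "parts", "catalog", "by_partno", { limit: 0 });

console.log(clientUrl);
console.log(cronUrl);
```

The cron job would then simply curl the second URL every few minutes,
while clients only ever use the first form.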
It doesn't change things in general, but it works for me for a
dataset under 10M docs. Sure, the initial indexing is rather slow, but
once indexing and compaction are done, reindexing an additional, say,
10-20k docs and compacting again doesn't take much time or space. For
me it's about 1000 docs/sec when updating the view index (so a view
update takes 10-20 sec), about 10 MB of view file growth per 10000
docs, and an additional 2-3 GB and ~10 min for view compaction. All of
that runs on an Amazon High-CPU Medium instance with the data on an
EBS volume.
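
As a back-of-the-envelope check of those numbers, here is a small
sketch that scales them linearly. The linear scaling is an assumption,
and the rates are just what I measured on my setup, so treat the result
as a rough guide only.

```javascript
// Rates taken from my own measurements; they are workload-specific.
const DOCS_PER_SEC = 1000;        // view index update throughput
const INDEX_MB_PER_10K_DOCS = 10; // view file growth per 10000 docs

// Estimate time and view file growth for indexing `newDocs` new documents,
// assuming both scale linearly with the number of documents.
function estimateReindex(newDocs) {
  return {
    seconds: newDocs / DOCS_PER_SEC,
    indexGrowthMb: (newDocs / 10000) * INDEX_MB_PER_10K_DOCS,
  };
}

console.log(estimateReindex(10000));
console.log(estimateReindex(20000));
```

View compaction (the extra 2-3 GB and ~10 min) is left out on purpose,
since I only have one data point for it.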
The average document in my case looks like
{
  "name": "POS.LEUCHT",
  "partno": "ZZZ945051",
  "make": "Seat",
  "ships": 5,
  "date": 1245704400,
  "price0": 13.05
}
And the view map functions call emit(doc.partno) and
emit([doc.make, doc.partno]), with no reduce function.
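
Spelled out, the two map functions could look roughly like the sketch
below. The null value, the guards against missing fields, and the
design document and view names (catalog, by_partno, by_make_partno) are
assumptions added for illustration. The emit() defined here is only a
local stand-in so the snippet runs outside CouchDB; inside CouchDB the
view server provides it.

```javascript
// Local stand-in for CouchDB's emit(), so the sketch can run under Node.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// View keyed by part number.
function byPartno(doc) {
  if (doc.partno) emit(doc.partno, null);
}

// View keyed by [make, partno], which allows range queries per make.
function byMakePartno(doc) {
  if (doc.make && doc.partno) emit([doc.make, doc.partno], null);
}

// Hypothetical design document: map functions are stored as source strings.
const designDoc = {
  _id: "_design/catalog",
  views: {
    by_partno: { map: byPartno.toString() },
    by_make_partno: { map: byMakePartno.toString() },
  },
};

// Run both map functions over the sample document.
const sample = {
  name: "POS.LEUCHT",
  partno: "ZZZ945051",
  make: "Seat",
  ships: 5,
  date: 1245704400,
  price0: 13.05,
};
byPartno(sample);
byMakePartno(sample);
console.log(JSON.stringify(rows));
```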
(In very small letters:) I'd guess that with some further CouchDB
improvements the same scenario could work well for 100M docs on one
server, and if your data is bigger than that, you'd be better off
partitioning it.
Hope that helps,
--
DU