I ran some experiments and figured out a few things:

If a view's key has multiple components, CouchDB does indeed maintain intermediate reduce results at each group level, and it uses them to calculate arbitrary ranges efficiently. For example, when I asked for sum(2008-03-11 to 2008-07-25), CouchDB called reduce twice. The first call summed all the included days in March and July. The second call had combine=true and summed the previous result with April, May and June, whose sums were already in the index.
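
To make the two calls concrete, here is a toy Python model of that range query. It is only a sketch of the behaviour I observed, not CouchDB's actual code: the data (one transaction of amount 1 per day), the month_sums dict standing in for the stored intermediate results, and the reduce_sum name are all made up for illustration. The rereduce flag plays the role of combine=true.

```python
from datetime import date, timedelta

def reduce_sum(keys, values, rereduce=False):
    # For a plain sum, reduce and rereduce are the same operation.
    return sum(values)

# Hypothetical data: one transaction of amount 1 on every day of 2008.
rows = {}
d = date(2008, 1, 1)
while d.year == 2008:
    rows[d] = 1
    d += timedelta(days=1)

# Month-level sums, standing in for intermediate results kept in the index.
month_sums = {}
for day, amount in rows.items():
    month_sums[day.month] = month_sums.get(day.month, 0) + amount

start, end = date(2008, 3, 11), date(2008, 7, 25)

# Call 1: reduce over the individual days of the two partial months.
edge_days = [v for k, v in rows.items()
             if start <= k <= end and k.month in (start.month, end.month)]
edge_total = reduce_sum(None, edge_days)

# Call 2: rereduce, combining call 1 with the stored whole-month sums.
whole_months = [month_sums[m] for m in range(start.month + 1, end.month)]
total = reduce_sum(None, [edge_total] + whole_months, rereduce=True)

print(total)  # 137
```

Only the 46 edge days are visited individually; April, May and June contribute one precomputed value each.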

CouchDB also seems to intrinsically partition the keyspace into groups of approximately 43-45. I don't know the significance of this number, but it is probably some tuned threshold value for the B-tree algorithm.

The bottom line is that reduced views with arbitrary key ranges run in logarithmic time, without doing anything special.

Chris Anderson wrote:
On Sat, Nov 22, 2008 at 9:09 PM, Jedediah Smith
<[EMAIL PROTECTED]> wrote:
A possible compromise would be to use group_level to find the balance per
component and then add those together on the client. Example:

balance(2008-11-22) =
 sum(-inf to 2007-) +
 sum(2008-01- to 2008-10-) +
 sum(2008-11-01 to 2008-11-22)
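
A minimal Python sketch of that client-side addition, under stated assumptions: the ledger list and range_sum function are hypothetical stand-ins for the view and for one reduced range query each, and the boundaries are written as full dates rather than the truncated keys above.

```python
from datetime import date, timedelta

# Hypothetical ledger: (date, amount) pairs standing in for view rows.
ledger = [
    (date(2006, 5, 1), 100),
    (date(2007, 12, 31), -20),
    (date(2008, 3, 15), 50),
    (date(2008, 10, 31), -5),
    (date(2008, 11, 1), 7),
    (date(2008, 11, 22), 3),
    (date(2008, 11, 23), 1000),  # after the target date, must be excluded
]

def range_sum(start, end):
    """Stand-in for one reduced view query over the inclusive range."""
    return sum(amount for day, amount in ledger if start <= day <= end)

def balance(on):
    """Add three partial sums on the client: prior years, prior months, this month."""
    prior_years = range_sum(date.min, date(on.year - 1, 12, 31))
    if on.month > 1:
        last_month_end = date(on.year, on.month, 1) - timedelta(days=1)
        prior_months = range_sum(date(on.year, 1, 1), last_month_end)
    else:
        prior_months = 0  # no complete months yet this year
    this_month = range_sum(date(on.year, on.month, 1), on)
    return prior_years + prior_months + this_month

print(balance(date(2008, 11, 22)))  # 135
```

The three pieces never overlap, so their total matches a single sum from the beginning of the ledger up to the target date.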


This looks like the right way to combine multiple time ranges to me.
Adding on the client is a fine thing in a case like this. However, I
think you can do it in a single query.

If a view like the above existed and I updated an old transaction, there
would only be one rereduce for each group level, right?

Querying with group=false will be faster, I think. (I should benchmark this...)

In the normal case, with a modest amount of data, that's about right.
Each grouped view query (I think... I really should bust out the log()
in the views to know for sure...) will fire at least one JavaScript
rereduce. In the case of very very much data and a first time reduce
query over that range, the rereduce could run a few times, but the #
of rereduces run should increase only logarithmically with the # of
rows, if I'm not mistaken. It's only when you run multiple queries (or
multiple reduces for groups within a range) that you're likely to run
into a linear increase in the number of rereduces. Again, this should
be explored in the log, but I think you'll get a minimum of 1 rereduce
per group query.

The simplest query to get someone's running balance would be something like:

_view/viewname?startkey=["bob", BEGINNING_OF_TIME]&endkey=["bob", CURRENT_DATE]

which has an implicit reduce=true&group=false.
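
When building that URL in code, the keys need to be JSON-encoded and then URL-escaped. Here is a rough Python sketch using only the standard library. The balance_query helper is hypothetical, and so is the key layout: it assumes the date is stored as an array like [2008, 11, 22] in the second key slot, and that null sorts before any such array, so it can stand in for BEGINNING_OF_TIME.

```python
import json
from urllib.parse import urlencode

def balance_query(view_path, user, begin, end):
    """Build the query string for a reduced range query on one user.

    view_path, user, begin and end are illustrative parameters; adjust
    them to match the actual view and key layout.
    """
    params = {
        # Keys must be valid JSON before they are URL-escaped.
        "startkey": json.dumps([user, begin]),
        "endkey": json.dumps([user, end]),
    }
    return view_path + "?" + urlencode(params)

# Assumption: None (JSON null) collates before any date array.
url = balance_query("_view/viewname", "bob", None, [2008, 11, 22])
print(url)
```

Nothing about reduce or group is passed, which leaves the implicit reduce=true&group=false in effect.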

BTW Jan I really like your array date format.
