On Jun 17, 2010, at 9:29 AM, afters wrote:

> On 17 June 2010 18:10, J Chris Anderson <[email protected]> wrote:
> 
>> 
>> The reduce-limit is a general heuristic, because some very bad reduces will
>> actually grow asymptotically so that the full reduce contains as much data
>> as the entire group=true reduce. It sounds like yours is OK (large but not
>> growing) so you are probably fine (although keeping 4kb of stuff in the
>> intermediate reduction value storage is going to kill performance).
> 
> 
> I could limit it to 1kb perhaps - at this point it doesn't matter too much.
> I imagine it would still maim, if not kill, performance. Correct?

I bet 1kb will be more than 4 times faster than 4kb, so it's worth a shot. But 
I'm guessing you're better off, in terms of scalability, keeping a lean reduce 
index and using the results from it to know which document to fetch.
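
Something like this, for example (a rough sketch; I'm assuming your docs have 
an 'entity' field and that the big blob of text lives on the doc itself):

    // map: emit just the key, never the 4kb blob
    function (doc) {
      if (doc.entity) {
        emit(doc.entity, null);
      }
    }

    // reduce: the built-in row counter
    _count

Query that with ?group=true to find the keys you care about, then hit the same 
view with ?reduce=false&include_docs=true (or GET the docs directly) to pull 
the full text for only the rows you need.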

OTOH if you are gonna be working only with smaller data sets, then you may even 
be fine with what you've got. Just be aware that with large reductions 
(especially reductions that are giant when called without group=true) you are 
introducing a bunch of overhead, and things will slow down as your database 
grows.
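
To make the failure mode concrete, this is the kind of reduce I mean (a 
made-up example of what NOT to do):

    // Anti-pattern: the reduced value grows with the number of input rows,
    // so the intermediate values stored in the view btree get huge.
    function (keys, values, rereduce) {
      if (rereduce) {
        // flatten the arrays coming back from partial reductions
        return [].concat.apply([], values);
      }
      // returns every raw value instead of a small summary
      return values;
    }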

If you keep your reduces simple, like the built-in _sum and _count, or other 
reduces that return similarly small values, you should be fine.
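
E.g. in your design doc (just a sketch, the view name is made up):

    {
      "views": {
        "entity_counts": {
          "map": "function (doc) { if (doc.entity) { emit(doc.entity, 1); } }",
          "reduce": "_sum"
        }
      }
    }

The built-in reduces run natively inside the server, so they also skip the 
round trip to the JavaScript view server.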

Read this for a survey of reduction techniques that can scale: 
http://labs.google.com/papers/sawzall.html

> 
>> Any way to break it up and maybe use the reduce to know which document to
>> query to get the big blob of text?
>> 
>> 
> I could certainly do that. Indeed my original plan, before discovering the
> magic of 'group=true', was to fetch each piece of entity-data separately.
> 
> a.
> 
> 
>> Chris
>> 
>> 
