Re: Possible bug in indexer... (really)

Paul Davis Sat, 04 Jul 2009 17:31:28 -0700

2009/7/4 Göran Krampe <[email protected]>:
> Adam Kocoloski wrote:
>>
>> Not sure if it's described, but it is by design.  The reduce function
>> executes when the btree is modified.  We can't afford to cache KVs from an
>> index update in memory regardless of size; we have to set some threshold
>> when we flush them to disk.
>
> And I presume you can't write KVs *without* doing the reduce?
>
> When I wrote "described" I am referring to the blog post by Ricky Ho btw. It
> seems to imply a strict ordering, map -> reduce -> rereduce. IIRC.
>


That was probably just the theoretical description. Maps always happen
first, obviously. Then, when the key/values are inserted into the btree
during a flush, the entire tree is built, which means one or more
reduces are called and then re-reduces are run to fill out the tree.
At the moment we aren't delaying re-reduce calls because that would
require a major overhaul of the btree code.
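
To make the ordering concrete, here is a rough toy sketch in Python. It is
not the actual btree code; build_tree, fanout and the calls counter are
made-up names for illustration, and the reduce is a plain sum. It only shows
the shape: leaves get reduce calls, inner nodes get re-reduce calls, and both
happen as part of building the tree.

```python
# Toy model only: not CouchDB's btree code, just the shape of the idea.
calls = {"reduce": 0, "rereduce": 0}


def reduce_fn(keys, values, rereduce):
    """Sum reduce, in the style of a view reduce function."""
    calls["rereduce" if rereduce else "reduce"] += 1
    return sum(values)


def build_tree(kvs, fanout=4):
    """Compute reductions bottom-up over sorted key/value pairs.

    Leaf chunks are reduced with rereduce=False; each inner level combines
    the reductions of its children with rereduce=True, up to a single root.
    """
    kvs = sorted(kvs)
    # Leaf level: one reduce call per chunk of raw key/value pairs.
    level = [
        reduce_fn([k for k, _ in chunk], [v for _, v in chunk], False)
        for chunk in (kvs[i:i + fanout] for i in range(0, len(kvs), fanout))
    ]
    # Inner levels: re-reduce child reductions until one root value is left.
    while len(level) > 1:
        level = [
            reduce_fn(None, level[i:i + fanout], True)
            for i in range(0, len(level), fanout)
        ]
    return level[0] if level else None


# 20 rows of value 1, as a flush of freshly mapped key/values might look.
root = build_tree([(i, 1) for i in range(20)])
```

With 20 rows and a fanout of 4, the leaves need 5 reduce calls, and filling
out the two levels above them takes 3 re-reduce calls, all within the same
build.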

>> I think the fundamental question is why the flush operations were
>> occurring so frequently the second time around.  Is it because you were
>> building up a largish hash for the reduce value?  Probably.  Nevertheless,
>> I'd like to have a better handle on that.
>
> Yeah, well, I am on vacation now - but some other guys are not. We could of
> course start by trying to rewrite this the Right Way first as Chris said.
>
> I am curious if it can be done using grouping because we dismissed grouping
> due to its relatively slow performance (it runs lots of reduces at query
> time IIRC) :)
>
> Btw, the solution used now DOES return the map for a full year in about 230
> ms, including parsing on client side. So query time was perfectly fine, but
> view generation was not. This shows to me that it *can* work.
>
> regards, Göran
>
>
