2009/7/5 Göran Krampe <[email protected]>:
> Hi Paul!
>
> Paul Davis wrote:
>>
>> 2009/7/4 Göran Krampe <[email protected]>:
>>>
>>> Adam Kocoloski wrote:
>>>>
>>>> Not sure if it's described, but it is by design. The reduce function
>>>> executes when the btree is modified. We can't afford to cache KVs from
>>>> an index update in memory regardless of size; we have to set some
>>>> threshold when we flush them to disk.
>>>
>>> And I presume you can't write KVs *without* doing the reduce?
>>>
>>> When I wrote "described" I am referring to the blog post by Ricky Ho btw.
>>> It seems to imply a strict ordering, map -> reduce -> rereduce. IIRC.
>>
>> That was probably just the theoretical aspect. Maps always happen
>> first obviously, and then when the key/values are inserted into the
>> btree during a flush the entire tree is built which means that > 0
>> reduces are called and then re-reduces are run to fill out the tree.
>> At the moment we aren't delaying re-reduce calls because it'd require
>> a major overhaul to the btree code.
>
> Right, so there is nothing preventing delaying the reduce and rereduce calls
> until the end - except of course the problem of actually implementing it.
> Just want to understand.
>

Just to be clear, it's not an implementation I expect to actually
materialize. It'd require reworking the btree updates so heavily that it
would change some of the fundamental behavior, and unless I'm just not as
creative as I usually am at 5am, I can't think of a decent way of
implementing such an update effectively.

> regards, Göran
>
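
To make the ordering concrete, here is a rough sketch in Python (not the actual Erlang btree code) of why reduce and rereduce calls end up interleaved with maps. Everything in it is a made-up stand-in: `map_doc`, `reduce_sum`, `build_tree`, `update_index`, the flush threshold and the fanout are all hypothetical names and values. It also rebuilds the whole toy tree on every flush, which the real btree does not do; the only point is that each flush triggers reduces on the leaves and rereduces on the inner nodes before the remaining docs have been mapped.

```python
from typing import Callable, List, Tuple

KV = Tuple[str, int]
log: List[str] = []  # records the order of map / reduce / rereduce calls


def map_doc(doc: dict) -> List[KV]:
    """Toy map function: emit (word, 1) for every word in the doc."""
    log.append("map")
    return [(word, 1) for word in doc["text"].split()]


def reduce_sum(keys, values: List[int], rereduce: bool) -> int:
    """Toy reduce: a sum looks the same for reduce and rereduce."""
    log.append("rereduce" if rereduce else "reduce")
    return sum(values)


def build_tree(kvs: List[KV], fanout: int, reduce_fn: Callable) -> int:
    """Chunk sorted KVs into leaves, reduce each leaf, then rereduce upward."""
    if not kvs:
        return 0
    kvs = sorted(kvs)
    leaves = [kvs[i:i + fanout] for i in range(0, len(kvs), fanout)]
    # One reduce call per leaf node.
    level = [
        reduce_fn([k for k, _ in leaf], [v for _, v in leaf], False)
        for leaf in leaves
    ]
    # Rereduce groups of child reductions until a single root value remains.
    while len(level) > 1:
        level = [
            reduce_fn(None, level[i:i + fanout], True)
            for i in range(0, len(level), fanout)
        ]
    return level[0]


def update_index(docs: List[dict], flush_threshold: int = 4, fanout: int = 2) -> int:
    """Map docs one by one, flushing buffered KVs once the threshold is hit."""
    stored: List[KV] = []  # KVs already "on disk"
    buffer: List[KV] = []  # KVs held in memory since the last flush
    root = 0
    for doc in docs:
        buffer.extend(map_doc(doc))
        if len(buffer) >= flush_threshold:
            stored.extend(buffer)
            buffer.clear()
            root = build_tree(stored, fanout, reduce_sum)  # reduces run here
    if buffer:  # final flush for whatever is left over
        stored.extend(buffer)
        root = build_tree(stored, fanout, reduce_sum)
    return root
```

Calling `update_index` on a few small docs and then looking at `log` shows reduce and rereduce entries appearing before the last map entry, rather than a strict map -> reduce -> rereduce sequence. Delaying them until the end would mean flushing KVs without computing the inner-node reductions, which is the part that doesn't fit how the btree updates work today.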
