Your analysis is correct. Just to clarify, the btrees are *mostly* self 
balancing, but it omits the 2 expensive rotations on btree rebalancing, which 
are used when removing keys from the tree. It is possible for btrees to become 
unbalanced, but a simple compaction does fix it. However, it needs to recompute 
the reductions, and for views that means recalling out to JS. For database 
compaction, the reductions are very cheap CPU wise, so it's not a big there. 
But if the btrees were fully balanced, then there no need to recompute 
reductions during compaction, just copy the nodes and their existing reduction 
values.

-Damien


On Jan 12, 2010, at 4:49 AM, Robert Dionne wrote:

> I don't think so, perhaps I misspoke. What I meant was the comment, "a 
> balancing condition", implied that it was balanced and was merely missing a 
> particular case. It's not balanced in the textbook sense of having an order n 
> that determines lower and upper bounds on the number of nodes at each level.
> 
> The amount of branching is determined by a "chunkify" function that write out 
> chunks of a constant size and so will depend on the key size of kv nodes and 
> the reductions for the kp nodes. 
> 
> So when deletions occur it will be unbalanced in the sense that compaction 
> results in a different b-tree which requires the reductions to be computed 
> again. I believe the intent of #224 is that if the b-tree is balanced in the 
> text book sense that compaction can just be a copy of the nodes.
> 
> I haven't looked at this in some time, and it requires a very detailed 
> analysis to see if it might in fact be improved, particularly with respect to 
> views. I'd love to hear Damien's or Jan's or Paul's opinion on this. When I 
> looked at this last summer I noted there been considerable research in trees 
> used in databases in recent years.  
> 
> 
> 
> 
> On Jan 12, 2010, at 4:02 AM, Brian Candler wrote:
> 
>> On Mon, Jan 11, 2010 at 07:58:10AM -0500, Robert Dionne wrote:
>>> jira-224 makes an odd claim that it misses "a balancing condition", it
>>> doesn't balance at all so I'm not sure what this means.
>> 
>> If it's not balanced at all, does that mean there are degenerate cases which
>> could end up with having to do a linear search through the nodes?
>> 
>>   .
>>    ||||
>>        \
>>         ||||
>>             \
>>              ||||
>>                  \
>>                   ||||
>> 
>> Could inserting with strictly sequential ids cause that condition to occur?
>> (As would be the case with utc_random)
> 

Reply via email to