Doug may have had a key bounce in his input
> Let's do the math:
> 258687736 (Record Size)
> 192283300 (Key Size)
The key size is actually 19283300 in Chris' figures
Regarding 68,063 being less than the current modulus of 82,850: I think the
answer may lie in the splitting process.
As I understand it, the first time a split occurs, group 1 is split and its
contents are divided between new group 1 and new group 2. All the other groups
effectively get 1 added to their number. The next split divides group 3 (which
was group 2) into groups 3 and 4, and so forth. A pointer records where the
next split will take place, and also helps adjust the algorithm that
identifies which group matches a given key.
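That before/after-the-split-pointer logic can be sketched roughly as below. This is the textbook linear-hashing address calculation, not the actual U2 internals, and all names here are illustrative assumptions:

```python
def group_for_key(key_hash, base_modulus, split_pointer):
    """Map a hashed key to a group under (textbook) linear hashing.

    base_modulus:  the modulus at the start of the current doubling cycle
    split_pointer: how many groups have already been split this cycle

    Illustrative only -- the real U2 implementation may differ in detail
    (e.g. in how groups are numbered after a split).
    """
    g = key_hash % base_modulus
    # Groups below the split pointer have already been split, so their
    # keys are spread across twice the range.
    if g < split_pointer:
        g = key_hash % (2 * base_modulus)
    return g
```

The point is that a single pointer comparison decides whether a key is hashed with the old modulus or with twice the old modulus.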
Based on this, if you started with 1000 groups, then by the 500th split you
will have 1500 groups. The first 1000 will be relatively empty, and the last
500 will probably be overflowed, but not terribly badly. By the 1000th split
you will have 2000 groups, and they will, one hopes, be quite reasonably
spread with very little overflow.
So I expect the average access times would drift up and down in a cycle. The
cycle time would get longer as the file gets bigger, but the worst time would
be roughly the same in each cycle.
Given the power of two that the before/after-the-split-pointer test introduces
into the algorithm, I wonder whether there is really a need to start off with
a prime?
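One way to see the question: after each complete doubling cycle the modulus has been multiplied by 2, so only the starting modulus can be prime; every later one is that value times a power of two. A one-line illustration (the starting value 1009 is just an example prime, nothing from U2):

```python
def modulus_after_cycles(initial_modulus, k):
    """Modulus after k complete doubling cycles of linear hashing.
    Only the k == 0 value can possibly be prime."""
    return initial_modulus * 2 ** k
```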
PS I'm getting a bit Tony^H^H^H^Hverbose nowadays.
U2-Users mailing list