Greets, I've been trying to understand this comment regarding ArrayUtil.getNextSize():
* The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ... Maybe I'm missing something, but I can't see how the formula yields such a growth pattern: return (targetSize >> 3) + (targetSize < 9 ? 3 : 6) + targetSize; For input values of 9 or greater, all that formula does is multiply by 1.125 and add 6. (Values enumerated below my sig.) The comment appears to have originated with this Python commit: http://svn.python.org/view/python/trunk/Objects/listobject.c?r1=35279&r2=35280 I think it was wrong then, and it's wrong now. The primary purpose of getNextSize() is to minimize reallocations during dynamic array resizing by overallocating on certain actions. Exactly how much we overallocate by doesn't seem to matter that much. Python apparently adds an extra eighth or so. Ruby reportedly multiplies by 1.5. Theoretically, multipliers larger than the golden mean are supposed to be suboptimal because they tend to induce memory fragmentation: subsequent reallocations cannot reuse previously freed sections, because they never add up to the total required by the newly requested fragment. However, that assumes a reasonably closed memory usage pattern, and so long as the freed fragment can be reused by someone else somewhere, it won't go to waste. IMO, minimizing memory fragmentation is so dependent on the internal implementation of the system's memory allocator as to be not worth the trouble, but if we were to do it, I think the right approach is outlined in this comment documenting the intention of the Python resizing routine prior to the commit that introduced the new (broken?) algo: http://svn.python.org/view/python/trunk/Objects/listobject.c?revision=35125&view=markup /* Round up: * If n < 256, to a multiple of 8. * If n < 2048, to a multiple of 64. * If n < 16384, to a multiple of 512. * If n < 131072, to a multiple of 4096. * If n < 1048576, to a multiple of 32768. * If n < 8388608, to a multiple of 262144. * If n < 67108864, to a multiple of 2097152. * If n < 536870912, to a multiple of 16777216. * ... * If n < 2**(5+3*i), to a multiple of 2**(3*i). I can't really see the point of adding the small constant (6) for large values, as is done in the new algo. And if oversizing is important for small values (debatable, since there will always be lots of small memory chunks floating around in the allocation pool), then rounding up to 8 consistently makes more sense to me than the current behavior. IMO, just overallocating by some multiplier between 1.125 and 1.5 achieves our primary goal of avoiding pathological reallocation behavior, and that's enough. How about simplifying ArrayUtil.getNextSize() down to this? return targetSize + (targetSize / 8); Marvin Humphrey mar...@smokey:~ $ perl -le 'print "$_ => " . (($_ >> 3) + ($_ < 9 ? 3 : 6 ) + $_) for 0 .. 100' 0 => 3 1 => 4 2 => 5 3 => 6 4 => 7 5 => 8 6 => 9 7 => 10 8 => 12 9 => 16 10 => 17 11 => 18 12 => 19 13 => 20 14 => 21 15 => 22 16 => 24 17 => 25 18 => 26 19 => 27 20 => 28 21 => 29 22 => 30 23 => 31 24 => 33 25 => 34 26 => 35 27 => 36 28 => 37 29 => 38 30 => 39 31 => 40 32 => 42 33 => 43 34 => 44 35 => 45 36 => 46 37 => 47 38 => 48 39 => 49 40 => 51 41 => 52 42 => 53 43 => 54 44 => 55 45 => 56 46 => 57 47 => 58 48 => 60 49 => 61 50 => 62 51 => 63 52 => 64 53 => 65 54 => 66 55 => 67 56 => 69 57 => 70 58 => 71 59 => 72 60 => 73 61 => 74 62 => 75 63 => 76 64 => 78 65 => 79 66 => 80 67 => 81 68 => 82 69 => 83 70 => 84 71 => 85 72 => 87 73 => 88 74 => 89 75 => 90 76 => 91 77 => 92 78 => 93 79 => 94 80 => 96 81 => 97 82 => 98 83 => 99 84 => 100 85 => 101 86 => 102 87 => 103 88 => 105 89 => 106 90 => 107 91 => 108 92 => 109 93 => 110 94 => 111 95 => 112 96 => 114 97 => 115 98 => 116 99 => 117 100 => 118 --------------------------------------------------------------------- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org