On Wed, Jan 13, 2010 at 05:48:08AM -0500, Michael McCandless wrote: > Have you notified python-dev?
No, not yet. Is it kosher with the Python license to have copied-and-pasted that comment? It's not credited from what I can see. Small, but we should probably fix that. > Right, and also to strike a balance with not wasting too much > over-allocated but not-yet-used RAM (hence 1/8 growth, not 1/2 or 1). I agree, a smaller size is better. Say you start with an array which hold 800 elements but which has been over-allocated by one eighth (+100 elements = 900). Reallocating at 900 and again at 1000-something isn't that different from reallocating only once at 1000. So long as you avoid reallocating at every increment -- 801, 802, etc -- you have achieved your main goal. Both mild and aggressive over-allocation solve that problem, but aggressive over-allocation also involves significant RAM costs. Where the best balance lies depends on how bad the reallocation performance is in relation to the cost of insertion. An eighth seems pretty good. Doubling seems too high. 1.5, dunno, seems like it could work. According to this superficially credible-seeming comment at stackoverflow, that's what Ruby does: http://stackoverflow.com/questions/1100311/what-is-the-ideal-growth-rate-for-a-dynamically-allocated-array/1100449#1100449 > But, you have to do something different when targetSize < 8, else that > formula doesn't grow. That's right, it was by design. For small sizes, things get tricky. If it's a byte array, you definitely want to round up to the size of a pointer, and as we enter the era of ubiquitous 64-bit processors, rounding up to the nearest multiple of 8 seems proper. But what about arrays of objects? Would we really want every two-element array to reserve space for 8 objects? The way I've got this handled in a forthcoming patch to Lucy is to trust the user about the size they want in most cases and count on them to add the logic for small sizes (as was done in TermBufferImpl.java). The Grow() methods of CharBuf, ByteBuf, and VArray obey the exact amount -- if you want an overallocation, you'll invoke the over-estimator on the argument you supply to Grow(). It's just the incremental appends like VArray's Push(), or CharBuf's Cat_Char() that trigger the automatic over-allocation internally. > Also, I think for smallish sizes we want faster than 12.5% growth, > because the constant-time cost of the mechanics of doing a > reallocation are at that point relatively high (wrt the cost of > copying the bytes over to the new array). But is that important if in most cases where it will grow incrementally, you've already overallocated manually? Marvin Humphrey --------------------------------------------------------------------- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org