On 14 June 2012 19:28, Robert Haas <robertmh...@gmail.com> wrote: > I thought that doubling repeatedly would be overly aggressive in terms > of memory usage. Blowing the buffers out to 8kB because we hit a > string that's a bit over 4kB isn't so bad, but blowing them out to > 256MB because we hit a string that's a bit over 128MB seems a bit > excessive.
That's pretty much what all popular dynamic array implementations do, from C++'s std::vector to Python's list (it's a misnomer). Having to allocate 256MB for a buffer to contain a string a bit over 128MB may seem excessive, until you later get a 250MB string. Even if doubling is generally excessive, which I doubt, that's beside the point, which is that expanding the array by some constant proportion results in each insertion taking amortized constant time. > Suppose we expand the buffer a > kilobyte at a time from an initial size of 1kB all the way out to > 256MB. That's 256,000 palloc calls, so we must be sorting at least > 256,000 datums, at least 128,000 of which are longer than 128MB. I > think the cost of calling memcpy() and strcoll() repeatedly on all > those long datums - not to mention repeatedly detoasting them - is > going to bludgeon the palloc overhead into complete insignificance. I fail to understand how this sortsupport buffer fundamentally differs from a generic dynamic array abstraction built to contain chars. That being the case, I see no reason not to just do what everyone else does when expanding dynamic arrays, and no reason why we shouldn't make essentially the same time-space trade-off here as others do elsewhere. Another concern is that it seems fairly pointless to have two buffers. Wouldn't it be more sensible to have a single buffer that was partitioned to make two logical, equally-sized buffers, given that in general each buffer is expected to grow at exactly the same rate? -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers