On 3/9/15 12:28 PM, Alvaro Herrera wrote:
> Robert Haas wrote:
>> On Sat, Mar 7, 2015 at 5:49 PM, Andres Freund <and...@2ndquadrant.com> wrote:
>>> On 2015-03-05 15:28:12 -0600, Jim Nasby wrote:
>>>> I was thinking the simpler route of just repalloc'ing... the memcpy would
>>>> suck, but much less so than the extra index pass. 64M gets us 11M tuples,
>>>> which probably isn't very common.
>>> That has the chance of considerably increasing the peak memory usage
>>> though, as you obviously need both the old and new allocation during the
>>> copy. And in contrast to the unused memory at the tail of the array,
>>> which will usually not be actually allocated by the OS at all, this is
>>> memory that's actually read/written, respectively.
>> Yeah, I'm not sure why everybody wants to repalloc() that instead of
>> making several separate allocations as needed. That would avoid
>> increasing peak memory usage, and would avoid any risk of needing to
>> copy the whole array. Also, you could grow in smaller chunks, like
>> 64MB at a time instead of 1GB or more at a time. Doubling an
>> allocation that's already 1GB or more gets big in a hurry.
> Yeah, a chunk list rather than a single chunk seemed a good idea to me
That will be significantly more code than a simple repalloc, but as long
as people are OK with that I can go that route.
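For concreteness, the chunk-list shape could look roughly like this (a sketch only: the type and all names are made up, plain malloc stands in for palloc, and the chunk size is arbitrary):

```c
#include <stdlib.h>

/* Stand-in for ItemPointerData; the real struct lives in storage/itemptr.h. */
typedef struct { unsigned blkno; unsigned short offnum; } TidStub;

#define CHUNK_TIDS 4096         /* TIDs per chunk; the size is arbitrary */

typedef struct TidChunk
{
    struct TidChunk *next;
    int              ntids;
    TidStub          tids[CHUNK_TIDS];
} TidChunk;

typedef struct
{
    TidChunk *head;
    TidChunk *tail;
    long      total;
} TidList;

/*
 * Append one TID, allocating a new fixed-size chunk when the tail fills up.
 * Unlike repalloc-and-double, nothing is ever copied, so peak memory never
 * includes both an old and a new array at once.
 */
static void
tidlist_add(TidList *list, TidStub tid)
{
    if (list->tail == NULL || list->tail->ntids == CHUNK_TIDS)
    {
        TidChunk *chunk = malloc(sizeof(TidChunk));

        if (chunk == NULL)
            abort();            /* real code would use palloc and ereport */
        chunk->next = NULL;
        chunk->ntids = 0;
        if (list->tail != NULL)
            list->tail->next = chunk;
        else
            list->head = chunk;
        list->tail = chunk;
    }
    list->tail->tids[list->tail->ntids++] = tid;
    list->total++;
}
```

Most of the extra code would presumably be on the lookup side: the index-cleanup pass can no longer binary-search one flat array, so it would have to search per chunk or keep some ordering across chunks.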
> Also, I think the idea of starting with an allocation assuming some
> small number of dead tuples per page made sense -- and by the time that
> space has run out, you have a better estimate of actual density of dead
> tuples, so you can do the second allocation based on that new estimate
> (but perhaps clamp it at say 1 GB, just in case you just scanned a
> portion of the table with an unusually high percentage of dead tuples.)
I like the idea of using a fresh estimate of dead tuple density when we need
more space. We would also clamp this at maintenance_work_mem, not a
hard-coded 1 GB.
Speaking of which... people have referenced allowing > 1GB of dead
tuples, which means allowing maintenance_work_mem > MAX_KILOBYTES. The
comment for that says:
/* upper limit for GUC variables measured in kilobytes of memory */
/* note that various places assume the byte size fits in a "long" variable */
So I'm not sure how well that will work. I think that needs to be a
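For reference, the limit in question works out as follows (the macro mirrors guc.h; the conversion helper is just for illustration):

```c
#include <limits.h>

/*
 * Mirrors guc.h: upper limit for GUC variables measured in kilobytes,
 * chosen so the byte size still fits in a "long" even on platforms
 * where long is only 32 bits (e.g. 64-bit Windows).
 */
#define MAX_KILOBYTES (INT_MAX / 1024)

/* Convert a kilobyte-valued GUC to bytes; safe only because of the cap. */
static long
kb_to_bytes(int kilobytes)
{
    return (long) kilobytes * 1024L;
}
```

So as long as the byte size has to fit in a possibly-32-bit long, the setting is capped just under 2 GB, and raising it means auditing every place that makes that assumption.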
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com