On 2014-09-03 17:08:12 -0400, Robert Haas wrote:
> On Sat, Aug 30, 2014 at 2:04 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> > If the sort buffer is allocated when the checkpointer is started, not
> > everytime we sort, as you've done in your version of the patch I think
> > that risk is pretty manageable. If we really want to be sure nothing is
> > happening at runtime, even if the checkpointer was restarted, we can put
> > the sort array in shared memory.
>
> I don't think that allocating the array at checkpointer start time
> helps. If it works, then you're strictly worse off than if you
> allocate it at every checkpoint, because you're holding onto the
> memory all the time instead of only when it's being used. And if it
> fails, what then? Sure, you can have that copy of the checkpointer
> process exit, but that does nothing good. The postmaster will keep on
> restarting it and it will keep on dying for lack of memory, and no
> checkpoints will complete. Oops.

It's imo quite clearly better to keep it allocated. For one, once the
postmaster has started the checkpointer successfully, you don't need to
worry about later failures to allocate memory if you allocate it once
(unless the checkpointer FATALs out, which should be exceedingly rare -
we're catching ERRORs). The allocation is much, much more likely to
succeed initially. Secondly, it's not like there's really that much time
where no checkpointer is running.

> So it seems to me that the possibly-sensible approaches are:
>
> 1. Allocate an array when we need to sort, and if the allocation
> fails, have some kind of fallback strategy, like logging a WARNING and
> writing the buffers out without sorting them. If it succeeds, do the
> checkpoint and then free the memory until we need it again.

If we want to go that way, I vote for keeping the array allocated once
we have it, and for continuing to try to allocate it after allocation
failures. And, as you suggest, for falling back to a simple sequential
search through all buffers when the allocation fails.

> 2. Putting the array in shared_memory, so that once the server is
> started, we can be sure the memory is allocated and the sort will
> work.

But I prefer this approach. If we ever want to have more than one
process writing out data for checkpoints, we're going to need it anyway.
And that's not that far away for large setups imo, especially due to
checksums.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
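
For illustration, the variant of approach 1 described above (keep the array once it is allocated, retry the allocation at the next checkpoint after a failure, and fall back to writing in buffer order with a WARNING) could look roughly like the sketch below. This is not PostgreSQL code: `SortEntry`, `NBUFFERS`, `plan_checkpoint_writes` and `simulate_oom` are hypothetical names made up for the example, and plain `malloc` stands in for whatever allocator the checkpointer would really use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUFFERS 1024            /* hypothetical shared buffer count */

/* Hypothetical sort key: which file and block a dirty buffer belongs to. */
typedef struct
{
    int      buf_id;
    unsigned file_no;
    unsigned block_no;
} SortEntry;

static SortEntry *sort_array = NULL;   /* kept across checkpoints once allocated */
static bool simulate_oom = false;      /* demo-only switch to force the fallback */

/* Order entries by file, then by block within the file. */
static int
cmp_entry(const void *a, const void *b)
{
    const SortEntry *x = a;
    const SortEntry *y = b;

    if (x->file_no != y->file_no)
        return x->file_no < y->file_no ? -1 : 1;
    if (x->block_no != y->block_no)
        return x->block_no < y->block_no ? -1 : 1;
    return 0;
}

/*
 * Fill out_order with the buffer ids in the order they should be written.
 * Returns true if the order is sorted, false if we fell back to buffer order.
 */
bool
plan_checkpoint_writes(const SortEntry *dirty, size_t n, int *out_order)
{
    assert(dirty != NULL && out_order != NULL && n <= NBUFFERS);

    /* Allocate once; after a failure, try again at the next checkpoint. */
    if (sort_array == NULL && !simulate_oom)
        sort_array = malloc(NBUFFERS * sizeof(SortEntry));

    if (sort_array == NULL)
    {
        fprintf(stderr, "WARNING: could not allocate sort array, writing unsorted\n");
        for (size_t i = 0; i < n; i++)
            out_order[i] = dirty[i].buf_id;
        return false;
    }

    memcpy(sort_array, dirty, n * sizeof(SortEntry));
    qsort(sort_array, n, sizeof(SortEntry), cmp_entry);
    for (size_t i = 0; i < n; i++)
        out_order[i] = sort_array[i].buf_id;
    return true;                 /* sort_array is deliberately not freed */
}
```

With approach 2 the `malloc` call and the fallback branch would go away entirely, because the array would be reserved in shared memory at server start.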