On Tue, Dec 10, 2013 at 07:50:20PM +0200, Heikki Linnakangas wrote:
> On 12/10/2013 07:27 PM, Noah Misch wrote:
> >On Thu, Dec 05, 2013 at 06:12:48PM +0200, Heikki Linnakangas wrote:
> >>>On Wed, Nov 20, 2013 at 8:32 AM, Heikki Linnakangas
> >>><hlinnakan...@vmware.com> wrote:
> >>>>* As discussed in the "Something fishy happening on frogmouth" thread, I
> >>>>don't like the fact that the dynamic shared memory segments will be
> >>>>permanently leaked if you kill -9 postmaster and destroy the data
> >>>>directory.
> >>I really think we need to do something about it. To use your earlier
> >>example of parallel sort, it's not acceptable to permanently leak a 512
> >>GB segment on a system with 1 TB of RAM.
> >
> >I don't. Erasing your data directory after an unclean shutdown voids any
> >expectations for a thorough, automatic release of system resources. Don't do
> >that. The next time some new use of a persistent resource violates your hope
> >for this scenario, there may be no remedy.
>
> Well, the point of erasing the data directory is to release system
> resources. I would normally expect "killall -9 <process>; rm -rf
> <data dir>" to thoroughly get rid of the running program and all the
> resources. It's surprising enough that the regular shared memory
> segment is left behind

Your expectation is misplaced. Processes and files are simply not the only
persistent system resources of interest.

> but at least that one gets cleaned up when
> you start a new server (on same port).

In the most-typical case, yes. In rare cases involving multiple postmasters
starting and stopping, the successor to the erased data directory will not
clean up the sysv segment.

> Let's not add more cases like that, if we can avoid it.

Only if we can avoid it for a modicum of effort and feature compromise. You're
asking for PostgreSQL to reshape its use of persistent resources so you can
throw around "killall -9 postgres; rm -rf $PGDATA" without so much as a memory
leak. That use case, not PostgreSQL, has the defect here.

> BTW, what if the data directory is seriously borked, and the server
> won't start? Sure, don't do that, but it would be nice to have a way
> to recover if you do anyway. (docs?)

If something is corrupting your data directory in an open-ended manner, you
have bigger problems than a memory leak until reboot. Recovering DSM happens
before we read the control file, so the damage would need to fall among a short
list of files for this to happen (bugs excluded).
Nonetheless, I don't object to documenting the varieties of system resources
that PostgreSQL may reserve and referencing the OS facilities for inspecting
them.

Are you actually using PostgreSQL this way: frequent "killall -9 postgres; rm
-rf $PGDATA" after arbitrarily-bad $PGDATA corruption? Some automated fault
injection test rig, perhaps?

> >>One idea is to create the shared memory object with shm_open, and wait
> >>until all the worker processes that need it have attached to it. Then,
> >>shm_unlink() it, before using it for anything. That way the segment will
> >>be automatically released once all the processes close() it, or die. In
> >>particular, kill -9 will release it. (This is a variant of my earlier
> >>idea to create a small number of anonymous shared memory file
> >>descriptors in postmaster startup with shm_open(), and pass them down to
> >>child processes with fork()). I think you could use that approach with
> >>SysV shared memory as well, by destroying the segment with
> >>shmctl(IPC_RMID) immediately after all processes have attached to it.
> >
> >That leaves a window in which we still leak the segment,
>
> A small window is better than a large one.

Yes.

> Another refinement is to wait for all the processes to attach before
> setting the segment's size with ftruncate(). That way, when the
> window is open for leaking the segment, it's still 0-sized so
> leaking it is not a big deal.
>
> >and it is less
> >general: not every use of DSM is conducive to having all processes attach in
> >a short span of time.
>
> Let's cross that bridge when we get there. AFAICS it fits all the
> use cases discussed this far.

It does fit the use cases discussed thus far.

nm

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com
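
For concreteness, the ordering proposed in the quoted text (create with
shm_open(), let everyone attach, shm_unlink(), and only then ftruncate() to
the real size) might look roughly like the sketch below. This is not
PostgreSQL code. The function name dsm_unlink_early_demo and the
"/dsm_sketch_<pid>" object name are made up for illustration, and a second
descriptor in the same process stands in for a worker attaching; a real
implementation would need cross-process signalling to know when all workers
have attached. The SysV analogue would presumably call
shmctl(shmid, IPC_RMID, NULL) at the point where this sketch calls
shm_unlink().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical illustration only. Returns 0 if the segment stayed usable
 * through both descriptors after its name was removed, -1 otherwise.
 */
int dsm_unlink_early_demo(size_t size)
{
    char  name[32];
    void *creator_map = MAP_FAILED;
    void *worker_map = MAP_FAILED;
    int   creator_fd, worker_fd, probe_fd;
    int   rc = -1;

    assert(size >= sizeof("hello"));
    snprintf(name, sizeof(name), "/dsm_sketch_%ld", (long) getpid());

    /* 1. Create the object. It is still 0-sized, so a leak here is cheap. */
    creator_fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (creator_fd < 0)
        return -1;

    /* 2. A "worker" attaches by name (simulated by a second descriptor). */
    worker_fd = shm_open(name, O_RDWR, 0);

    /* 3. Remove the name. Open descriptors keep the object alive. */
    if (shm_unlink(name) != 0 || worker_fd < 0)
        goto done;

    /* 4. Only now give the segment its real size, then map it. */
    if (ftruncate(creator_fd, (off_t) size) != 0)
        goto done;
    creator_map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                       creator_fd, 0);
    worker_map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                      worker_fd, 0);
    if (creator_map == MAP_FAILED || worker_map == MAP_FAILED)
        goto done;

    /* 5. Data written through one mapping is visible through the other. */
    memcpy(creator_map, "hello", sizeof("hello"));

    /* 6. The name is gone, so nothing is left behind to clean up later. */
    probe_fd = shm_open(name, O_RDWR, 0);
    if (probe_fd >= 0)
        close(probe_fd);
    else if (errno == ENOENT && strcmp((char *) worker_map, "hello") == 0)
        rc = 0;

done:
    if (creator_map != MAP_FAILED)
        munmap(creator_map, size);
    if (worker_map != MAP_FAILED)
        munmap(worker_map, size);
    if (worker_fd >= 0)
        close(worker_fd);
    close(creator_fd);
    return rc;
}
```

Once the last descriptor is closed and the last mapping is gone (including by
kill -9), the kernel frees the memory. The remaining leak window is between
steps 1 and 3, during which the object has a name but no size. On older glibc
this needs -lrt at link time.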