On Wed, Apr 13, 2011 at 7:20 AM, A.M. <age...@themactionfaction.com> wrote:
> The goal of this patch is to eliminate SysV shared memory, not to implement
> NFS-capable locking which, as you point out, is virtually impossible.
>
> As far as I can tell, in the worst case, my patch does not change how
> postgresql handles the NFS case. SysV shared memory won't work across NFS, so
> that interlock won't catch, so postgresql is left with looking at a lock file
> with PID of process on another machine, so that won't catch either. This
> patch does not alter the lock file semantics, but merely augments the file
> with file locking.
>
> At least with this patch, there is a chance the lock might work across NFS.
> In the best case, it can allow for shared-storage postgresql failover, which
> is a new feature.
>
> Furthermore, there is an improvement in shared memory handling in that it is
> unlinked immediately after creation, so only the postmaster and its children
> have access to it (through file descriptor inheritance). This means shared
> memory cannot be stomped on by any other process.
>
> Considering that possibly working NFS locking is a side-effect of this patch
> and not its goal and, in the worst possible scenario, it doesn't change
> current behavior, I don't see how this can be a ding against this patch.
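
For reference, the unlink-after-creation idea in the quoted text can be sketched roughly as below. This is not the patch's code: it is a minimal illustration that assumes a POSIX system, uses an ordinary temporary file as a stand-in for the shared memory object, and uses made-up names (create_unlinked_segment, demo).

```python
import mmap
import os
import tempfile

SEGMENT_SIZE = 4096  # arbitrary size chosen for the illustration


def create_unlinked_segment(size):
    """Create a backing file, then remove its name right away.

    After the unlink, the storage is reachable only through the open
    descriptor, so only this process and the children that inherit the
    descriptor can map it.
    """
    fd, path = tempfile.mkstemp(prefix="shm_sketch_")
    os.unlink(path)          # the name is gone; the open descriptor still works
    os.ftruncate(fd, size)   # give the segment its size
    return fd, path


def demo():
    fd, path = create_unlinked_segment(SEGMENT_SIZE)
    name_gone = not os.path.exists(path)

    pid = os.fork()
    if pid == 0:
        # Child: map the inherited descriptor and write into the segment.
        try:
            with mmap.mmap(fd, SEGMENT_SIZE) as child_view:
                child_view[:5] = b"hello"
        finally:
            os._exit(0)

    os.waitpid(pid, 0)
    # Parent: map the same descriptor and read what the child wrote.
    with mmap.mmap(fd, SEGMENT_SIZE) as parent_view:
        data = bytes(parent_view[:5])
    os.close(fd)
    return name_gone, data


if __name__ == "__main__":
    print(demo())
```

A process that did not inherit the descriptor has no name to open, which is the isolation property the quoted paragraph describes.
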
I don't see why we need to get rid of SysV shared memory; needing less of it seems just as good.

In answer to your off-list question, one of the principal ways I've seen fcntl() locking fall over and die is when someone removes the lock file. You might think that this could be avoided by picking something important like pg_control as the lock file, but it turns out that doesn't really work:

http://0pointer.de/blog/projects/locking.html

Tom's point is valid too. Many storage appliances present themselves as an NFS server, so it's very plausible for the data directory to be on an NFS server, and there's no guarantee that flock() won't be broken there. If our current interlock were also known to be unreliable, maybe we wouldn't care very much, but AFAICT it's been extremely robust.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
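
The removed-lock-file failure mode mentioned above can be reproduced with a short sketch. It assumes a POSIX system and a local filesystem where fcntl() record locks behave normally; the file name and the helper names (other_process_can_lock, demo) are made up for the illustration and have nothing to do with PostgreSQL's actual lock file handling.

```python
import fcntl
import os
import tempfile


def other_process_can_lock(path):
    """Fork a child that tries to take an exclusive fcntl() lock on path.

    Returns True if the child got the lock, False if it was refused.
    """
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            status = 0
        except OSError:
            status = 1
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0


def demo():
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "sketch.lock")

    # The "first postmaster": create the lock file and hold the lock.
    holder_fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    fcntl.lockf(holder_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)

    # A second process is refused while the file is in place.
    locked_out_before = not other_process_can_lock(path)

    # Someone removes the lock file. The holder still has its lock, but
    # on an inode that no longer has a name.
    os.unlink(path)

    # The second process now creates a fresh file at the same path and
    # locks that one instead, so the interlock no longer protects anything.
    second_gets_lock = other_process_can_lock(path)

    os.close(holder_fd)
    os.unlink(path)
    os.rmdir(directory)
    return locked_out_before, second_gets_lock


if __name__ == "__main__":
    print(demo())
```

The lock is attached to the file itself rather than to its path, so once the name is removed a newcomer and the original holder are locking two different files.
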