Sean, the issue you are describing might also be solvable another way.
One option, perhaps a more portable one, is to share a connected
socketpair between the two communicating processes, so that you can do a
non-blocking read on one end from time to time and check whether it
returns EOF. That would be the case if whatever process holds the other
end is no longer there. So instead of a shared memory segment, you can
have a pool of descriptors, one for each worker that you care about.
Polling on those would be trivial with just regular poll(2). The only
issue might be that postgres forks a lot, so we would probably need to
implement FD_CLOFORK to avoid copying those extra fds into every child.
This is something akin to the solution I recently posted to work around
the problem that you cannot really waitpid() on a grandchild; see PG BUG
#14199 for details & patch.
But yes, it would be really nice to get rid of SYSV shared memory use in PG
completely at some point, one way or another.
On Thu, Jun 23, 2016 at 3:42 PM, Sean Chittenden <s...@chittenden.org> wrote:
> Small nit:
> PostgreSQL used SYSV because it allowed for the detection of dead
> processes. If you `kill -9`’ed a process, PostgreSQL can detect that and
> then shut down and perform an automatic recovery. In this regard, sysv is
> pretty clever. The move to POSIX shared mem was done for a host of
> reasons, but it means that you don’t have to adjust your SYSV limits. My
> understanding from a few years ago is that there is still a ~64KB SYSV
> memory segment that is still used to act as the latch to signal if a
> process was killed, but all of the shared buffers are stored in posix
> mmap’ed regions.
> At this point in time this could be replaced with kqueue(2) EVFILT_PROC,
> but no one has done that yet.
> Sean Chittenden
> > On Jun 22, 2016, at 07:26 , Maxim Sobolev <sobo...@freebsd.org> wrote:
> > Konstantin,
> > Not if you do sem_unlink() immediately, AFAIK, and that's what PG does.
> > The window of opportunity for the leakage is quite small, much smaller
> > than for SYSV primitives. Sorry for missing your status update message,
> > I overlooked it somehow.
> > ----
> >     mySem = sem_open(semname, O_CREAT | O_EXCL,
> >                      (mode_t) IPCProtection,
> >                      (unsigned) 1);
> > #ifdef SEM_FAILED
> >     if (mySem != (sem_t *) SEM_FAILED)
> >         break;
> > #else
> >     if (mySem != (sem_t *) (-1))
> >         break;
> > #endif
> >     /* Loop if error indicates a collision */
> >     if (errno == EEXIST || errno == EACCES || errno == EINTR)
> >         continue;
> >     /*
> >      * Else complain and abort
> >      */
> >     elog(FATAL, "sem_open(\"%s\") failed: %m", semname);
> > }
> > /*
> >  * Unlink the semaphore immediately, so it can't be accessed externally.
> >  * This also ensures that it will go away if we crash.
> >  */
> > sem_unlink(semname);
> > return mySem;
> > ----
> > -Max
> > On Wed, Jun 22, 2016 at 3:02 AM, Konstantin Belousov wrote:
> >> On Tue, Jun 21, 2016 at 12:48:00PM -0700, Maxim Sobolev wrote:
> >>> Thanks, Konstantin for the great work, we are definitely looking to
> >>> get all those improvements to be part of the default FreeBSD
> >>> Would be nice if you can post an update some day later as to what's
> >>> integrated and what's not.
> >> I did post the update several days earlier. Since you are replying to
> >> the thread, it would not be unreasonable to read the recent messages
> >> that were sent.
> >>> Just in case, I've opened #14206 with PG to switch us to using POSIX
> >>> semaphores by default. Apart from the mentioned performance benefits,
> >>> SYSV semaphores are PITA to deal with as they come in very limited
> >>> by default. Also they might stay around if PG dies/gets nuked and
> >>> prevent from starting again due to overflow. We've got some quite ugly
> >>> code to clean up those using ipcrm(1) in our build scripts to deal
> >>> with just that. I am happy that code could be retired now.
> >> Named semaphores also stuck around if processes are killed without
> > _______________________________________________
> > freebsd-performa...@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-performance