I've gotten a bit tired of seeing "could not create semaphores: No space left on device" failures in the buildfarm, so I looked into whether we should consider preferring unnamed POSIX semaphores over SysV semaphores.
We've had code for named and unnamed POSIX semaphores in our tree for a long time, but it's not actually used on any current platform AFAIK. There are good reasons to avoid the named-semaphore variant: typically that eats a file descriptor per sema per backend. However, that complaint doesn't necessarily apply to unnamed semaphores. Indeed, it seems that on Linux an unnamed POSIX semaphore is basically a futex, which eats zero kernel resources; all the state is in userspace.

Although in normal cases the semaphore code paths aren't very heavily exercised in our code, I was able to get a measurable performance difference by building with --disable-spinlocks, so that spinlocks are emulated with semaphores. On an 8-core RHEL6 machine, "pgbench -S -c 20 -j 20" seems to be about 4% faster with unnamed semaphores than SysV semaphores. It'd be good to replicate that test on some higher-end hardware, but provisionally I'd say unnamed semaphores are faster.

The data structure is bigger: Linux's type sem_t is 32 bytes on 64-bit machines (16 bytes on 32-bit) whereas we use 8 bytes for SysV semaphores. But there aren't normally a huge number of semaphores in a cluster, and anyway this comparison is cheating because it ignores the space taken for the kernel data structures backing the SysV semaphores.

There was some previous discussion about this in
https://www.postgresql.org/message-id/flat/20160621193412.5792.65085%40wrigleys.postgresql.org
but that thread tailed off without a resolution, partly because it wasn't the kind of change we'd consider making in late beta. One thing I expressed concern about there was whether there are any hidden kernel resources underlying an unnamed semaphore. So far as I can tell by strace'ing sem_init and sem_destroy, there are not, at least on Linux.
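
For anyone who wants to repeat the strace check or look at sizeof(sem_t) on their own platform, something like the following standalone sketch should do. This is not code from our tree: the function name unnamed_sem_roundtrip is made up for illustration, and an anonymous shared mmap stands in for the real shared memory segment. It creates one process-shared unnamed semaphore, has a forked child post it, waits for it in the parent, and destroys it.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std= modes */

#include <errno.h>
#include <semaphore.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not PostgreSQL code: round-trip one unnamed POSIX
 * semaphore between a parent and a forked child.
 * Returns 0 on success, -1 on any failure.
 */
int
unnamed_sem_roundtrip(void)
{
	sem_t	   *sem;
	pid_t		pid;
	int			status;
	int			w;
	int			rc = -1;

	printf("sizeof(sem_t) = %zu\n", sizeof(sem_t));

	/* Anonymous shared mapping stands in for the shared memory segment. */
	sem = mmap(NULL, sizeof(sem_t), PROT_READ | PROT_WRITE,
			   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (sem == MAP_FAILED)
	{
		perror("mmap");
		return -1;
	}

	/* pshared = 1 makes it usable across processes; initial value is 0. */
	if (sem_init(sem, 1, 0) < 0)
	{
		/* Fails here on platforms without working unnamed semaphores. */
		perror("sem_init");
		munmap(sem, sizeof(sem_t));
		return -1;
	}

	pid = fork();
	if (pid == 0)
	{
		/* Child: wake the parent, then exit immediately. */
		_exit(sem_post(sem) == 0 ? 0 : 1);
	}
	else if (pid > 0)
	{
		/* Parent: block until the child posts, retrying on EINTR. */
		do
		{
			w = sem_wait(sem);
		} while (w < 0 && errno == EINTR);

		/* Always reap the child; succeed only if both sides were clean. */
		if (waitpid(pid, &status, 0) == pid &&
			w == 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
			rc = 0;
	}
	else
		perror("fork");

	if (sem_destroy(sem) < 0)
	{
		perror("sem_destroy");
		rc = -1;
	}
	munmap(sem, sizeof(sem_t));
	return rc;
}
```

Call it from a trivial main() and run the result under strace to see which system calls, if any, sem_init and sem_destroy make. Depending on the libc, linking may need -pthread or -lrt.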
Another issue is raised in today's discussion
https://www.postgresql.org/message-id/flat/14947.1475690465%40sss.pgh.pa.us
where it appears that we might need to be more careful about putting memory barriers into the unnamed-semaphore code (probably because it might not enter the kernel). But if that's a bug, we'd want to fix it anyway, IMO.

So for Linux, I think probably we should switch.

macOS seems not to have unnamed POSIX semaphores, only named ones (the functions exist, but they always fail with ENOSYS). However, some googling suggests that other BSD derivatives do have these primitives, so somebody ought to do a similar comparison on them to see if switching is a win. (The first thread above asserts that it is for FreeBSD, but someone should recheck using a test case that stresses semaphores more.)

Dunno about other platforms. sem_init is nominally required by SUS v2, but it doesn't seem to actually exist everywhere, so I doubt we can drop SysV altogether. I'd be inclined to change the default on a platform-by-platform basis, not whole hog.

If anyone wants to test, the main thing you have to do to try this in the existing code is to add "USE_UNNAMED_POSIX_SEMAPHORES=1" and "--disable-spinlocks" to your configure arguments. On Linux you may need to add -lrt to the backend LIBS list, though on my machine configure is putting that in already.

regards, tom lane