On Sat, Aug 13, 2016 at 2:08 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> amul sul <sul_a...@yahoo.co.in> writes:
>> When I am calling dsm_create on Linux using the POSIX DSM implementation can
>> succeed, but result in SIGBUS when later try to access the memory. This
>> happens because of my system does not have enough shm space & current
>> allocation in dsm_impl_posix does not allocate disk blocks[1]. I wonder can
>> we use fallocate system call (i.e. Zero-fill the file) to ensure that all
>> the file space has really been allocated, so that we don't later seg fault
>> when accessing the memory mapping. But here we will endup by loop calling
>> ‘write’ squillions of times.
>
> Wouldn't that just result in a segfault during dsm_create?
>
> I think probably what you are describing here is kernel misbehavior
> akin to memory overcommit. Maybe it *is* memory overcommit and can
> be turned off the same way. If not, you have material for a kernel
> bug fix/enhancement request.

I think this may be different from overcommit. In dsm_impl_posix we
do shm_open, then ftruncate. That creates a file with a hole. Based
on an LKML discussion where someone tried to address this with a patch
that was rejected[1], I believe that Linux implements POSIX shmem as a
tmpfs file, and in this case the file has a hole, which is not the
same phenomenon as unallocated virtual memory pages resulting from
overcommit policy.

In dsm_impl_mmap it looks like we have code to deal with the same
problem: we do open, then ftruncate, and then we explicitly write a
bunch of zeros to the file, with this comment:

    /*
     * Zero-fill the file. We have to do this the hard way to ensure that
     * all the file space has really been allocated, so that we don't
     * later seg fault when accessing the memory mapping. This is pretty
     * pessimal.
     */

Maybe we didn't do that for dsm_impl_posix because maybe you can't
write to a fd created with shm_open like that, I don't know. But it
looks like if we used fallocate or posix_fallocate in the
dsm_impl_posix case we'd get a nice ENOSPC error, instead of
success-but-later-SIGBUS-on-access.

Whether there is *also* the possibility of overcommit biting you later
I don't know, but I suspect that's an independent problem. The OOM
killer kills you with SIGKILL, not SIGBUS.

[1] https://lkml.org/lkml/2013/7/31/64

--
Thomas Munro
http://www.enterprisedb.com
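
To make the posix_fallocate idea above concrete, here is a rough sketch
of what "shm_open, ftruncate, then reserve the space" could look like.
This is not the dsm_impl_posix code and has not been tried against
PostgreSQL; the function name create_shm_segment and its error handling
are made up for illustration, and it assumes Linux with glibc (older
glibc versions may need -lrt for shm_open).

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for illustration, not PostgreSQL
 * code): create a POSIX shared memory object of "size" bytes and ask the
 * kernel to reserve its backing space immediately.
 *
 * Returns an open file descriptor on success.  On failure, returns -1 with
 * errno set (for example ENOSPC if the shm filesystem is full) and removes
 * the partially created object.
 */
int
create_shm_segment(const char *name, size_t size)
{
    int fd;
    int rc;
    int save_errno;

    fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    /* ftruncate only sets the length; on tmpfs that leaves a hole. */
    if (ftruncate(fd, (off_t) size) != 0)
    {
        save_errno = errno;
        goto fail;
    }

    /*
     * Reserve the space now, so that running out of room is reported here
     * rather than as SIGBUS on first access.  posix_fallocate returns the
     * error number directly instead of setting errno.
     */
    do
    {
        rc = posix_fallocate(fd, 0, (off_t) size);
    } while (rc == EINTR);

    if (rc != 0)
    {
        save_errno = rc;
        goto fail;
    }

    return fd;

fail:
    close(fd);
    shm_unlink(name);
    errno = save_errno;
    return -1;
}
```

A caller would mmap the returned fd as usual. The only intended
difference from plain shm_open + ftruncate is that a full /dev/shm
should surface as an ENOSPC failure at creation time.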