Re: [HACKERS] munmap() failure due to sloppy handling of hugepage size

Andres Freund Wed, 12 Oct 2016 13:29:16 -0700


On October 12, 2016 1:25:54 PM PDT, Tom Lane <[email protected]> wrote:
>If any of you were following the thread at
>https://www.postgresql.org/message-id/flat/CAOan6TnQeSGcu_627NXQ2Z%2BWyhUzBjhERBm5RN9D0QFWmk7PoQ%40mail.gmail.com
>I spent quite a bit of time following a bogus theory, but the problem
>turns out to be very simple: on Linux, munmap() is pickier than mmap()
>about the length of a hugepage allocation.  The comments in sysv_shmem.c
>mention that on older kernels mmap() with MAP_HUGETLB will fail if given
>a length request that's not a multiple of the hugepage size.  Well, the
>behavior they replaced that with is little better: mmap() succeeds, but
>it gives you back a region that's been silently enlarged to the next
>hugepage boundary, and then munmap() will fail if you specify the region
>size you asked for rather than the region size you were given.
>
>Since AFAICS there is no way to inquire what region size you were given,
>this API is astonishingly brain-dead IMO.  But that seems to be what
>we've got.  Chris Richards reported it against a 3.16.7 kernel, and
>I can replicate the behavior on RHEL6 (2.6.32) by asking for an odd-size
>huge page region.
>
>We've mostly masked this by rounding up to a 2MB boundary, which is what
>the hugepage size typically is.  But that assumption is wrong on some
>hardware, and it's not likely to get less wrong as time passes.
>
>A little bit of research suggests that on Linux the thing to do would be
>to get the actual default hugepage size by reading /proc/meminfo and
>looking for a line like "Hugepagesize:       2048 kB".  I don't know
>of any more-portable API, so this does nothing for non-Linux kernels.
>But we have not heard of similar misbehavior on other platforms, even
>though IA64 and PPC64 can both have hugepages larger than 2MB, so it's
>reasonable to hope that other implementations of munmap() don't have
>the same gotcha.
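
For illustration only, here is a minimal sketch of the /proc/meminfo approach described in the quoted text: find the "Hugepagesize:" line, convert it to bytes, and round the request up to that size so the same length can be passed to both mmap() and munmap(). The function names (parse_hugepagesize_line, default_hugepage_size, round_up_to_multiple) are hypothetical and are not taken from sysv_shmem.c; overflow checking is omitted for brevity.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse one /proc/meminfo line of the form
 * "Hugepagesize:       2048 kB".  Returns the size in bytes, or 0 if
 * the line is not a Hugepagesize line.
 */
size_t
parse_hugepagesize_line(const char *line)
{
    unsigned long kb;
    char        unit;

    if (sscanf(line, "Hugepagesize: %lu %c", &kb, &unit) == 2 && unit == 'k')
        return (size_t) kb * 1024;
    return 0;
}

/*
 * Hypothetical helper: scan /proc/meminfo for the default hugepage size.
 * Returns 0 if the file cannot be read or has no such line (for
 * example on a non-Linux kernel).
 */
size_t
default_hugepage_size(void)
{
    FILE       *fp = fopen("/proc/meminfo", "r");
    char        buf[128];
    size_t      result = 0;

    if (fp == NULL)
        return 0;
    while (fgets(buf, sizeof(buf), fp) != NULL)
    {
        result = parse_hugepagesize_line(buf);
        if (result != 0)
            break;
    }
    fclose(fp);
    return result;
}

/*
 * Round len up to the next multiple of pagesize.  A pagesize of 0
 * (size unknown) leaves len unchanged.  No overflow check.
 */
size_t
round_up_to_multiple(size_t len, size_t pagesize)
{
    size_t      rem;

    if (pagesize == 0)
        return len;
    rem = len % pagesize;
    return (rem == 0) ? len : len + (pagesize - rem);
}
```

A caller would compute the rounded length once, before mmap(), and remember it for the later munmap() call, so the two lengths cannot disagree.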


We had that, but Heikki ripped it out when merging... I think you're supposed 
to use /sys to get the available size.
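
As a rough sketch of the /sys route, assuming the usual Linux layout where each supported size appears as a directory named like "hugepages-2048kB" under /sys/kernel/mm/hugepages: the helper names below (hugepage_dirname_to_bytes, list_hugepage_sizes) are made up for illustration and are not the code that was removed from the tree.

```c
#include <dirent.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: convert a directory name such as
 * "hugepages-2048kB" into a size in bytes.  Returns 0 for any name
 * that does not match that pattern.
 */
size_t
hugepage_dirname_to_bytes(const char *name)
{
    unsigned long kb;
    char        suffix[3];

    if (sscanf(name, "hugepages-%lu%2s", &kb, suffix) == 2 &&
        strcmp(suffix, "kB") == 0)
        return (size_t) kb * 1024;
    return 0;
}

/*
 * Hypothetical helper: collect up to max_sizes available hugepage
 * sizes (in bytes) from /sys/kernel/mm/hugepages.  Returns the number
 * of sizes stored, or 0 if the directory cannot be opened.
 */
int
list_hugepage_sizes(size_t *sizes, int max_sizes)
{
    DIR        *dir = opendir("/sys/kernel/mm/hugepages");
    struct dirent *de;
    int         n = 0;

    if (dir == NULL)
        return 0;
    while (n < max_sizes && (de = readdir(dir)) != NULL)
    {
        size_t      bytes = hugepage_dirname_to_bytes(de->d_name);

        if (bytes != 0)
            sizes[n++] = bytes;
    }
    closedir(dir);
    return n;
}
```

Note that this lists every size the kernel offers, not which one a plain MAP_HUGETLB mapping will use by default.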

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


