On October 12, 2016 1:25:54 PM PDT, Tom Lane <t...@sss.pgh.pa.us> wrote:
>If any of you were following the thread at
>I spent quite a bit of time following a bogus theory, but the problem
>turns out to be very simple: on Linux, munmap() is pickier than mmap()
>about the length of a hugepage allocation. The comments in
>mention that on older kernels mmap() with MAP_HUGETLB will fail if given
>a length request that's not a multiple of the hugepage size. Well, the
>behavior they replaced that with is little better: mmap() succeeds, but
>it gives you back a region that's been silently enlarged to the next
>hugepage boundary, and then munmap() will fail if you specify the
>size you asked for rather than the region size you were given.
>Since AFAICS there is no way to inquire what region size you were given,
>this API is astonishingly brain-dead IMO. But that seems to be what
>we've got. Chris Richards reported it against a 3.16.7 kernel, and
>I can replicate the behavior on RHEL6 (2.6.32) by asking for a
>huge page region whose size is not a multiple of the hugepage size.
>We've mostly masked this by rounding up to a 2MB boundary, which is what
>the hugepage size typically is. But that assumption is wrong on some
>hardware, and it's not likely to get less wrong as time passes.
>A little bit of research suggests that on Linux the thing to do would be
>to get the actual default hugepage size by reading /proc/meminfo and
>looking for a line like "Hugepagesize: 2048 kB". I don't know
>of any more-portable API, so this does nothing for non-Linux kernels.
>But we have not heard of similar misbehavior on other platforms, even
>though IA64 and PPC64 can both have hugepages larger than 2MB, so it's
>reasonable to hope that other implementations of munmap() don't have
>the same gotcha.
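
For reference, a minimal sketch in C of the /proc/meminfo approach described
above. The function names (parse_hugepagesize_line, read_default_hugepage_size,
round_up_to_hugepage) are hypothetical and are not taken from the PostgreSQL
source. The sketch assumes the "Hugepagesize: <n> kB" line format quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Parse one meminfo line. If it has the form "Hugepagesize: <n> kB",
 * store the size in bytes in *bytes and return 1; otherwise return 0.
 * (Hypothetical helper, for illustration only.)
 */
int
parse_hugepagesize_line(const char *line, size_t *bytes)
{
    unsigned long kb;

    if (sscanf(line, "Hugepagesize: %lu kB", &kb) == 1)
    {
        *bytes = (size_t) kb * 1024;
        return 1;
    }
    return 0;
}

/*
 * Return the default hugepage size in bytes as reported by /proc/meminfo,
 * or 0 if the file cannot be read or has no Hugepagesize line.
 */
size_t
read_default_hugepage_size(void)
{
    FILE   *fp = fopen("/proc/meminfo", "r");
    char    line[256];
    size_t  result = 0;

    if (fp == NULL)
        return 0;
    while (fgets(line, sizeof(line), fp) != NULL)
    {
        if (parse_hugepagesize_line(line, &result))
            break;
    }
    fclose(fp);
    return result;
}

/* Round len up to the next multiple of pagesize. */
size_t
round_up_to_hugepage(size_t len, size_t pagesize)
{
    size_t  rem;

    assert(pagesize != 0);
    rem = len % pagesize;
    return (rem == 0) ? len : len + (pagesize - rem);
}
```

The idea would be to round the request with round_up_to_hugepage() before
calling mmap(), and to pass that same rounded length to munmap() later, so
both calls agree on the region size.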
We had that, but Heikki ripped it out when merging... I think you're supposed
to use /sys to get the available size.
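
A rough sketch in C of what the /sys route could look like. It assumes the
kernel exposes one directory per supported size under
/sys/kernel/mm/hugepages, named like "hugepages-2048kB". The function names
are hypothetical and this is not the code that was removed.

```c
#include <dirent.h>
#include <stddef.h>
#include <stdio.h>

/*
 * If name looks like "hugepages-<n>kB", store the size in bytes in *bytes
 * and return 1; otherwise return 0. (Hypothetical helper.)
 */
int
parse_hugepages_dirname(const char *name, size_t *bytes)
{
    unsigned long kb;
    int     consumed = 0;

    /* %n is only stored if the trailing "kB" literal matched as well */
    if (sscanf(name, "hugepages-%lukB%n", &kb, &consumed) == 1 &&
        consumed > 0 && name[consumed] == '\0')
    {
        *bytes = (size_t) kb * 1024;
        return 1;
    }
    return 0;
}

/*
 * Collect up to max supported hugepage sizes (in bytes) into out.
 * Returns the number of sizes stored; 0 if the directory is unavailable.
 */
size_t
list_hugepage_sizes(size_t *out, size_t max)
{
    DIR    *dir = opendir("/sys/kernel/mm/hugepages");
    struct dirent *de;
    size_t  n = 0;

    if (dir == NULL)
        return 0;
    while (n < max && (de = readdir(dir)) != NULL)
    {
        size_t  bytes;

        if (parse_hugepages_dirname(de->d_name, &bytes))
            out[n++] = bytes;
    }
    closedir(dir);
    return n;
}
```

Unlike the single Hugepagesize line in /proc/meminfo, this lists every size
the kernel supports, not just the default one.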