On Wed, Oct 12, 2016 at 5:10 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvhe...@2ndquadrant.com> writes:
>> Tom Lane wrote:
>>> According to
>>> https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt
>>> looking into /proc/meminfo is the longer-standing API and thus is
>>> likely to work on more kernel versions. Also, if you look into
>>> /sys then you are going to see multiple possible values and it's
>>> not clear how to choose the right one.
>
>> I'm not sure that this is the best rationale. In my system there are
>> 2MB and 1GB huge page sizes; in systems with lots of memory (let's say 8
>> GB of shared memory is requested) it seems a clear winner to allocate 8
>> 1GB hugepages than 4096 2MB hugepages because the page table is so much
>> smaller. The /proc interface only shows the 2MB page size, so if we go
>> that route we'd not be getting the full benefit of the feature.
>
> And you'll tell mmap() which one to do how exactly? I haven't found
> anything explaining how applications get to choose which page size applies
> to their request. The kernel document says that /proc/meminfo reflects
> the "default" size, and I'd assume that that's what we'll get from mmap.

hm. for (recent) linux, I see:

    MAP_HUGE_2MB, MAP_HUGE_1GB (since Linux 3.8)
        Used in conjunction with MAP_HUGETLB to select alternative
        hugetlb page sizes (respectively, 2 MB and 1 GB) on systems
        that support multiple hugetlb page sizes.

        More generally, the desired huge page size can be configured
        by encoding the base-2 logarithm of the desired page size in
        the six bits at the offset MAP_HUGE_SHIFT. (A value of zero
        in this bit field provides the default huge page size; the
        default huge page size can be discovered via the Hugepagesize
        field exposed by /proc/meminfo.) Thus, the above two
        constants are defined as:

            #define MAP_HUGE_2MB    (21 << MAP_HUGE_SHIFT)
            #define MAP_HUGE_1GB    (30 << MAP_HUGE_SHIFT)

        The range of huge page sizes that are supported by the system
        can be discovered by listing the subdirectories in
        /sys/kernel/mm/hugepages.

via: http://man7.org/linux/man-pages/man2/mmap.2.html#NOTES

ISTM all this silliness is pretty much unique to linux anyways.
Instead of reading the filesystem, what about doing a test map and a
test unmap? I think we could zero in on the default page size by
probing the known possible values.

merlin
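
To make the test map / test unmap idea concrete, here is a rough sketch of what such a probe might look like. It is not taken from the PostgreSQL tree: huge_size_flag() and probe_huge_page() are hypothetical names, and the fallback #defines are assumptions for headers that lack the constants (the MAP_HUGETLB fallback is the x86 value).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallbacks for older headers; assumptions, not portable values. */
#ifndef MAP_HUGETLB
#define MAP_HUGETLB 0x40000     /* x86 value */
#endif
#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif

/*
 * Encode log2(page size) into the six-bit field at MAP_HUGE_SHIFT,
 * the same way MAP_HUGE_2MB and MAP_HUGE_1GB are built.
 */
static int
huge_size_flag(unsigned log2_size)
{
    assert(log2_size < 64);     /* the field is only six bits wide */
    return (int) (log2_size << MAP_HUGE_SHIFT);
}

/*
 * Hypothetical probe: map one huge page of 2^log2_size bytes and unmap
 * it again.  Returns 0 if the mapping succeeded, else the errno that
 * mmap() reported.
 */
static int
probe_huge_page(unsigned log2_size)
{
    size_t  len;
    int     flags;
    void   *p;

    assert(log2_size < sizeof(size_t) * 8);
    len = (size_t) 1 << log2_size;
    flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB |
            huge_size_flag(log2_size);

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    if (p == MAP_FAILED)
        return errno;

    munmap(p, len);
    return 0;
}
```

A caller could try probe_huge_page(30) first and fall back to probe_huge_page(21). One caveat with this approach: a failed probe does not by itself say whether the size is unsupported or the pool for that size is just empty. I believe the former tends to show up as EINVAL and the latter as ENOMEM, but that would need checking against the kernels we care about.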