While testing applications here at AMD, we discovered that
one of the programs that was linked with BDT hugepage mappings
failed to load on SLES11 but not on SLES10. The issue was
caused by having symbols in the dynamic symbol table that
are associated with very large addresses. This causes
prepare_segment() in elflink.c to create a very large
hugetlbfs file into which the data + BSS segments are
copied. Mapping that hugetlbfs file back at the original
address can then fail, because the flags argument to mmap()
contains MAP_PRIVATE instead of MAP_SHARED. (Newer kernels
reserve more huge pages to ensure that there will be pages
available for any COW operations.)
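
To put rough numbers on this, here is a small sketch of the size
calculation, using the Segment 1 bounds from the logs in this message.
The 2 MB huge page size is an assumption (it matches the
2000000-27a00000 range in the SLES11 error message), the macros are
re-declared here rather than taken from elflink.c, and
hugepages_for_segment() is a made-up helper name:

```c
/* Round down / up to a multiple of a (a must be a power of two).
 * Stand-ins written for this sketch, not copied from libhugetlbfs. */
#define ALIGN_DOWN(x, a) ((x) & ~((a) - 1))
#define ALIGN(x, a)      (((x) + (a) - 1) & ~((a) - 1))

/*
 * Number of huge pages needed to cover a segment that starts at
 * vaddr and spans memsz bytes, following the start/offset/mapsize
 * steps visible in the patch context of elflink.c.
 */
unsigned long hugepages_for_segment(unsigned long vaddr,
                                    unsigned long memsz,
                                    unsigned long hpage_size)
{
    unsigned long start = ALIGN_DOWN(vaddr, hpage_size);
    unsigned long offset = vaddr - start;
    unsigned long mapsize = ALIGN(offset + memsz, hpage_size);

    return mapsize / hpage_size;
}
```

For Segment 1 (vaddr 0x2000000, memsz 0x27800260 - 0x2000000 =
0x25800260) this gives 301 huge pages just to hold the copied data.
If a MAP_PRIVATE mapping of that file makes the newer kernel reserve a
second set of pages for COW, which is my guess and not something I
have confirmed, roughly 602 pages would be needed against the 450 in
nr_hugepages.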
The following is a typescript of a simple test that
demonstrates the problem:
SLES10:
$ cat /etc/issue
Welcome to SUSE Linux Enterprise Server 10 SP2 (x86_64) - Kernel \r (\l).
$ cat sl.c
extern int v[];
int *addr_of_v()
{
    return v;
}
$ cat smain.c
#include <stdio.h>
extern int *addr_of_v(void);
#define NN 600
int v[NN*1024*1024/sizeof(int)];
int main()
{
    printf("addr_of_v returned %p\n", addr_of_v());
    return 0;
}
$ gcc -shared -fPIC sl.c -o sl.so
$ gcc smain.c sl.so -Wl,--script=/home/dgilmore/libhugetlbfs-2.7/local/usr/local/share/libhugetlbfs/ldscripts/elf_x86_64.xBDT
$ HUGETLB_VERBOSE=99 LD_LIBRARY_PATH=.:/home/dgilmore/libhugetlbfs-2.7/local/usr/local/lib64 ./a.out
libhugetlbfs [pcetilapia03:16539]: INFO: Parsed kernel version: [2] . [6] . [16] [post-release: 60]
libhugetlbfs [pcetilapia03:16539]: INFO: HUGETLB_SHARE=0, sharing disabled
libhugetlbfs [pcetilapia03:16539]: INFO: Segment 0 (phdr 2): 0x1000000-0x100081c (filesz=0x81c) (prot = 0x5)
libhugetlbfs [pcetilapia03:16539]: INFO: Segment 1 (phdr 3): 0x2000000-0x27800260 (filesz=0x228) (prot = 0x7)
libhugetlbfs [pcetilapia03:16539]: INFO: libhugetlbfs version: 2.7
libhugetlbfs [pcetilapia03:16540]: INFO: Mapped hugeseg at 0x2aaaaac00000. Copying 0x81c bytes and 0 extra bytes from 0x1000000...done
libhugetlbfs [pcetilapia03:16539]: INFO: Prepare succeeded
libhugetlbfs [pcetilapia03:16541]: INFO: Mapped hugeseg at 0x2aaaaac00000. Copying 0x228 bytes and 0x25800038 extra bytes from 0x2000000...done
libhugetlbfs [pcetilapia03:16539]: INFO: Prepare succeeded
addr_of_v returned 0x2000260
$ cat /proc/sys/vm/nr_hugepages
450
$
SLES11:
$ cat /etc/issue
Welcome to SUSE Linux Enterprise Server 11 (x86_64) - Kernel \r (\l).
$ HUGETLB_VERBOSE=99 LD_LIBRARY_PATH=.:/home/dgilmore/libhugetlbfs-2.7/local/usr/local/lib64 ./a.out
libhugetlbfs [pcetilapia04:22891]: INFO: Parsed kernel version: [2] . [6] . [27] [post-release: 39]
libhugetlbfs [pcetilapia04:22891]: INFO: Feature private_reservations is present in this kernel
libhugetlbfs [pcetilapia04:22891]: INFO: Kernel has MAP_PRIVATE reservations. Disabling heap prefaulting.
libhugetlbfs [pcetilapia04:22891]: INFO: HUGETLB_SHARE=0, sharing disabled
libhugetlbfs [pcetilapia04:22891]: INFO: Segment 0 (phdr 2): 0x1000000-0x100081c (filesz=0x81c) (prot = 0x5)
libhugetlbfs [pcetilapia04:22891]: INFO: Segment 1 (phdr 3): 0x2000000-0x27800260 (filesz=0x228) (prot = 0x7)
libhugetlbfs [pcetilapia04:22891]: INFO: libhugetlbfs version: 2.7
libhugetlbfs [pcetilapia04:22895]: INFO: Mapped hugeseg at 0x2aaaaac00000. Copying 0x81c bytes and 0 extra bytes from 0x1000000...done
libhugetlbfs [pcetilapia04:22891]: INFO: Prepare succeeded
libhugetlbfs [pcetilapia04:22896]: INFO: Mapped hugeseg at 0x2aaaaac00000. Copying 0x228 bytes and 0x25800038 extra bytes from 0x2000000...done
libhugetlbfs [pcetilapia04:22891]: INFO: Prepare succeeded
Failed to map hugepage segment 1: 2000000-27a00000 (errno=12)
Aborted
$ cat /proc/sys/vm/nr_hugepages
450
$
However, on SLES11, if I rebuild libhugetlbfs with the following patch, the
mapping succeeds.
$ diff -u /home/dgilmore/libhugetlbfs-2.7/elflink.c{.1,}
--- /home/dgilmore/libhugetlbfs-2.7/elflink.c.1 2009-12-20 01:40:19.000000000 -0800
+++ /home/dgilmore/libhugetlbfs-2.7/elflink.c 2010-01-21 15:06:50.124600000 -0800
@@ -1120,7 +1120,7 @@
start = ALIGN_DOWN((unsigned long)seg[i].vaddr, hpage_size);
offset = (unsigned long)(seg[i].vaddr - start);
mapsize = ALIGN(offset + seg[i].memsz, hpage_size);
- mmap_flags = MAP_PRIVATE|MAP_FIXED;
+ mmap_flags = MAP_SHARED|MAP_FIXED;
/*
* If this is a read-only mapping whose contents are
$ HUGETLB_VERBOSE=99 LD_LIBRARY_PATH=.:/home/dgilmore/libhugetlbfs-2.7/usr/local/lib64 ./a.out
libhugetlbfs [pcetilapia04:23286]: INFO: Parsed kernel version: [2] . [6] . [27] [post-release: 39]
libhugetlbfs [pcetilapia04:23286]: INFO: Feature private_reservations is present in this kernel
libhugetlbfs [pcetilapia04:23286]: INFO: Kernel has MAP_PRIVATE reservations. Disabling heap prefaulting.
libhugetlbfs [pcetilapia04:23286]: INFO: HUGETLB_SHARE=0, sharing disabled
libhugetlbfs [pcetilapia04:23286]: INFO: Segment 0 (phdr 2): 0x1000000-0x10008b4 (filesz=0x8b4) (prot = 0x5)
libhugetlbfs [pcetilapia04:23286]: INFO: Segment 1 (phdr 3): 0x2000000-0x27800260 (filesz=0x230) (prot = 0x7)
libhugetlbfs [pcetilapia04:23286]: INFO: libhugetlbfs version: 2.7 (modified)
libhugetlbfs [pcetilapia04:23287]: INFO: Mapped hugeseg at 0x2aaaaac00000. Copying 0x8b4 bytes and 0 extra bytes from 0x1000000...done
libhugetlbfs [pcetilapia04:23286]: INFO: Prepare succeeded
libhugetlbfs [pcetilapia04:23288]: INFO: Mapped hugeseg at 0x2aaaaac00000. Copying 0x230 bytes and 0x25800030 extra bytes from 0x2000000...done
libhugetlbfs [pcetilapia04:23286]: INFO: Prepare succeeded
addr_of_v returned 0x2000260
$
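
One way to see where the pages go on each kernel would be to compare
the huge page counters in /proc/meminfo (HugePages_Total,
HugePages_Free, HugePages_Rsvd) before and during a run. A minimal
sketch of reading them follows. The helper names meminfo_field() and
read_meminfo() are made up for this example, and I have not checked
that HugePages_Rsvd is reported on the SLES10 kernel:

```c
#include <stdio.h>
#include <string.h>

/*
 * Find "key: <number>" at the start of a line in meminfo-style text.
 * Returns the number, or -1 if the key is absent or malformed.
 */
long meminfo_field(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long val;

            if (sscanf(line + klen + 1, "%ld", &val) == 1)
                return val;
            return -1;
        }
        line = strchr(line, '\n');   /* advance to the next line */
        if (line)
            line++;
    }
    return -1;
}

/* Read /proc/meminfo and return one field, or -1 on failure. */
long read_meminfo(const char *key)
{
    char buf[16384];
    size_t n;
    FILE *f = fopen("/proc/meminfo", "r");

    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_field(buf, key);
}
```

Calling read_meminfo("HugePages_Rsvd") and
read_meminfo("HugePages_Free") just before the final mmap() would show
whether the private reservation is what exhausts the pool.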
So far my testing hasn't uncovered any issues with this change. Does
anyone know why the final mmap() calls use MAP_PRIVATE instead of
MAP_SHARED?
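
For reference, this is the general difference between the two flags as
I understand it, shown on an ordinary temporary file rather than a
hugetlbfs file. It is only a sketch: store_reaches_file() and the
scratch path are made up for the example, and it says nothing about
how hugetlbfs reservations behave.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map a scratch file with the given flag (MAP_PRIVATE or MAP_SHARED),
 * store one byte through the mapping, then read the file back.
 * Returns 1 if the store reached the file, 0 if not, -1 on error.
 */
int store_reaches_file(int map_flag)
{
    char path[] = "/tmp/mapdemoXXXXXX";   /* hypothetical scratch file */
    long pagesz = sysconf(_SC_PAGESIZE);
    char byte = 0;
    char *p;
    int fd, result = -1;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                         /* file lives until fd is closed */

    if (ftruncate(fd, pagesz) == 0) {
        p = mmap(NULL, pagesz, PROT_READ | PROT_WRITE, map_flag, fd, 0);
        if (p != MAP_FAILED) {
            p[0] = 'X';                   /* store through the mapping */
            if (pread(fd, &byte, 1, 0) == 1)
                result = (byte == 'X');
            munmap(p, pagesz);
        }
    }
    close(fd);
    return result;
}
```

With MAP_SHARED the store lands in the file itself; with MAP_PRIVATE it
goes to a private COW copy and the file keeps its original contents.
So with the patch, writes to data/BSS would presumably modify the
hugetlbfs file directly, which might matter if that file were ever
shared between processes.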
Doug
_______________________________________________
Libhugetlbfs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel