Sorry, I botched Hugh's e-mail address; please make sure to reply to the correct one.
Thanks,
Nish

On 05.02.2007 [16:19:04 -0800], Nishanth Aravamudan wrote:
> Hi all,
>
> So, here's the current state of the hugepages portion of my
> /proc/meminfo (x86_64, 2.6.20-rc7; will test with 2.6.20 shortly, but
> AFAICS there haven't been many changes to the hugepage code between
> the two):
>
> HugePages_Total:   100
> HugePages_Free:    100
> HugePages_Rsvd:    18446744073709551615
> Hugepagesize:      2048 kB
>
> That's not good :)
>
> Context: I'm currently working on some patches for libhugetlbfs which
> should ultimately help us reduce our hugepage usage when remapping
> segments so they are backed by hugepages. The current algorithm maps in
> a hugepage file MAP_SHARED, copies over the segment data, then unmaps
> the file. It then unmaps the program's segments and maps in the same
> hugepage file MAP_PRIVATE, so that we take COW faults. The problem is
> that for writable segments (data) the COW fault instantiates a new
> hugepage, but the original MAP_SHARED hugepage stays resident in the
> page cache. So a program that could survive (after the initial
> remapping) with only 2 hugepages in use ends up using 3 hugepages
> instead.
>
> To work around this, I've modified the algorithm to prefault in the
> writable segment in the remapping code (via a one-byte read and write).
> Then I issue a posix_fadvise(segment_fd, 0, 0, POSIX_FADV_DONTNEED) to
> try to drop the shared hugepage from the page cache. With a small dummy
> relinked app (that just sleeps), this does reduce our run-time hugepage
> cost from 3 to 2. But I'm noticing that libhugetlbfs' `make func`
> target, which tests libhugetlbfs' functionality only, every so often
> leads to a lot of "VM killing process ..." messages. This only appears
> to happen with one particular testcase (xBDT.linkshare, which remaps
> the BSS, data and text segments and tries to share the text segments
> between 2 processes), but when it does, it happens for a while (that
> is, if I try to run that particular test manually, it keeps getting
> killed) and /proc/meminfo reports a garbage value for HugePages_Rsvd
> like the one I listed above. If I rerun `make func`, sometimes the
> problem goes away (and Rsvd returns to a sane value as well...).
>
> I've added Hugh & David to the Cc, because they discussed a similar
> problem a few months back. Maybe there is still a race somewhere?
>
> I'm willing to test any possible fixes, and I'll work on making this
> more easily reproducible (although it seems to happen pretty regularly
> here) with a simpler test.
>
> Thanks,
> Nish
>
> --
> Nishanth Aravamudan <[EMAIL PROTECTED]>
> IBM Linux Technology Center

--
Nishanth Aravamudan <[EMAIL PROTECTED]>
IBM Linux Technology Center
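For reference, here is a minimal sketch (not the actual libhugetlbfs code) of the remap flow described in the quoted mail, for a single writable segment; hpage_fd, seg_addr and seg_len are placeholder names. It just shows why the MAP_SHARED staging copy leaves an extra hugepage in the page cache once the MAP_PRIVATE mapping takes its first COW fault.

	/*
	 * Sketch of the current remap flow for one writable segment (length
	 * assumed hugepage-aligned).  hpage_fd is an open file on a
	 * hugetlbfs mount; names are illustrative only.
	 */
	#include <string.h>
	#include <sys/mman.h>

	static int remap_segment(int hpage_fd, void *seg_addr, size_t seg_len)
	{
		void *shared, *priv;

		/* Map the hugepage file MAP_SHARED and copy the segment data in. */
		shared = mmap(NULL, seg_len, PROT_READ | PROT_WRITE,
			      MAP_SHARED, hpage_fd, 0);
		if (shared == MAP_FAILED)
			return -1;
		memcpy(shared, seg_addr, seg_len);
		munmap(shared, seg_len);

		/* Drop the original segment and map the same hugepage file
		 * MAP_PRIVATE over it, so later writes take COW faults. */
		munmap(seg_addr, seg_len);
		priv = mmap(seg_addr, seg_len, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_FIXED, hpage_fd, 0);
		if (priv == MAP_FAILED)
			return -1;

		/* The first write to the segment now instantiates a fresh
		 * hugepage via COW, while the MAP_SHARED copy stays in the
		 * page cache: 2 hugepages for this segment instead of 1. */
		return 0;
	}

In the real library the copy and remap are of course done more carefully, from code and data that are not themselves being remapped; the sketch is only meant to show where the extra hugepage comes from.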
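And a sketch of the workaround, again with placeholder names: prefault the private mapping with a one-byte read and write so the COW copy is instantiated immediately, then ask the kernel to drop the now-unneeded shared hugepage with posix_fadvise(POSIX_FADV_DONTNEED).

	#define _XOPEN_SOURCE 600	/* for posix_fadvise() */
	#include <fcntl.h>

	/* Prefault the freshly remapped writable segment and try to drop the
	 * shared hugepage backing it from the page cache. */
	static void drop_shared_copy(int hpage_fd, void *priv_addr)
	{
		volatile char *p = priv_addr;

		/* One-byte read and write: forces the COW fault now rather
		 * than at first touch later in the program's run. */
		p[0] = p[0];

		/* Advise the kernel that the cached pages for this file are
		 * no longer needed, so the shared hugepage can be freed. */
		posix_fadvise(hpage_fd, 0, 0, POSIX_FADV_DONTNEED);
	}

Whether POSIX_FADV_DONTNEED reliably drops hugetlbfs page-cache pages while keeping the reservation accounting consistent is exactly what the failures above call into question; the sketch only shows the intended sequence. For what it's worth, the garbage HugePages_Rsvd value quoted above is (unsigned long)-1, which would be consistent with the reserve count having been decremented below zero somewhere in this path.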
