elflink: Fork creation of hugetlbfs mappings.

[PPC] On kernels prior to 2.6.22 (before memory slices were added), our
temporary hugetlbfs mappings are created in the segment below the stack. If
a 32-bit PPC app grows its stack larger than one segment (256MB), it will
segfault because the neighboring segment has been marked as "hugepage-only".

Providing a hint address for our temporary mappings is unreliable because,
prior to 2.6.22, if the hint address is invalid, get_unmapped_area() simply
starts at the top of the address space and searches down. This means the
kernel could pick an address that is higher than the hint address.

To get around this limitation, David Gibson suggested forking a child to
perform the temporary hugetlbfs mappings in its own address space to avoid
tainting segments in the parent's. I have also measured a small performance
improvement, which Adam believes may be due to SLBs.

Since there is no performance hit on x86 and x86_64, or for 64-bit binaries,
perform the forking regardless of architecture or bit-ness.

Signed-off-by: Steve Fox <[EMAIL PROTECTED]>
---
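For reviewers: a minimal standalone sketch of the fork-and-wait pattern this
patch relies on. It is not the libhugetlbfs code. The names run_in_child(),
map_temporarily() and always_fails() are hypothetical, and an anonymous
mmap() stands in for the temporary hugetlbfs mapping. The sketch inspects the
child's exit status with WIFEXITED()/WEXITSTATUS() so that a failure in the
child is reported to the caller.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for prepare_segment(): create a temporary mapping.
 * Here it is an anonymous mapping, not a hugetlbfs one. */
static int map_temporarily(void *arg)
{
	size_t len = *(size_t *)arg;
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -1;
	/* The mapping exists only in the calling process's address space;
	 * when run in a child, it disappears when the child exits. */
	return 0;
}

/* Hypothetical worker that always reports failure. */
static int always_fails(void *arg)
{
	(void)arg;
	return -1;
}

/* Run work(arg) in a forked child so that any mappings it creates never
 * appear in the caller's address space. Returns 0 only if the child
 * exited normally with status 0, and -1 otherwise. */
static int run_in_child(int (*work)(void *), void *arg)
{
	int status;
	pid_t pid, ret;

	assert(work != NULL);

	pid = fork();
	if (pid < 0) {
		perror("fork");
		return -1;
	}
	if (pid == 0)
		_exit(work(arg) < 0 ? 1 : 0);	/* child: report via exit code */

	/* Parent: wait for the child, retrying if a signal interrupts us. */
	do {
		ret = waitpid(pid, &status, 0);
	} while (ret == -1 && errno == EINTR);
	if (ret == -1) {
		perror("waitpid");
		return -1;
	}
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;
	return 0;
}
```

Calling run_in_child(map_temporarily, &len) with a page-sized len should
return 0, while run_in_child(always_fails, NULL) returns -1. The child uses
_exit() here so that stdio buffers and atexit handlers inherited from the
parent are not run twice; whether that matters for libhugetlbfs is an
assumption of this sketch.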

diff --git a/elflink.c b/elflink.c
index 4fce9b9..36679fc 100644
--- a/elflink.c
+++ b/elflink.c
@@ -32,6 +32,7 @@
 #include <sys/file.h>
 #include <linux/unistd.h>
 #include <sys/mman.h>
+#include <sys/wait.h>
 #include <errno.h>
 #include <limits.h>
 #include <elf.h>
@@ -893,7 +894,7 @@ static int find_or_prepare_shared_file(struct seg_info *htlb_seg_info)
 static int obtain_prepared_file(struct seg_info *htlb_seg_info)
 {
        int fd = -1;
-       int ret;
+       int ret, pid, status;
 
        /* Share only read-only segments */
        if (sharing && !(htlb_seg_info->prot & PROT_WRITE)) {
@@ -909,11 +910,33 @@ static int obtain_prepared_file(struct seg_info *htlb_seg_info)
                return -1;
        htlb_seg_info->fd = fd;
 
-       ret = prepare_segment(htlb_seg_info);
-       if (ret < 0) {
-               DEBUG("Failed to prepare segment\n");
+       /* [PPC] Prior to 2.6.22 (which added slices), our temporary hugepage
+        * mappings are placed in the segment before the stack. This 'taints' that
+        * segment to be hugepage-only for the lifetime of the process, resulting
+        * in a maximum stack size of 256MB. If we instead create our hugepage
+        * mappings in a child process, we can avoid this problem.
+        *
+        * This does not adversely affect non-PPC platforms so do it everywhere.
+        */
+       if ((pid = fork()) < 0) {
+               DEBUG("fork failed\n");
                return -1;
        }
+       if (pid == 0) {
+               ret = prepare_segment(htlb_seg_info);
+               if (ret < 0) {
+                       DEBUG("Failed to prepare segment\n");
+                       exit(1);
+               }
+               else
+                       exit(0);
+       }
+       ret = waitpid(pid, &status, 0);
+       if (ret == -1 || !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
+               DEBUG("waitpid failed or child could not prepare segment\n");
+               return -1;
+       }
+
        DEBUG("Prepare succeeded\n");
        return 0;
 }

-- 

Steve Fox
IBM Linux Technology Center
