I'm running into ENOMEM failures with the libhugetlbfs testsuite [1] on
a POWER8 LPAR system running 4.8 or latest git [2]. Repeated runs of
the suite trigger multiple OOMs that eventually kill the entire system;
it usually takes 3-5 runs:

 * Total System Memory......:  18024 MB
 * Shared Mem Max Mapping...:    320 MB
 * System Huge Page Size....:     16 MB
 * Available Huge Pages.....:     20
 * Total size of Huge Pages.:    320 MB
 * Remaining System Memory..:  17704 MB
 * Huge Page User Group.....:  hugepages (1001)

I see this only on ppc (BE/LE); x86_64 seems unaffected and ran the
tests successfully for ~12 hours.

Bisecting identified the following commit as the culprit:
  commit 67961f9db8c477026ea20ce05761bde6f8bf85b0
  Author: Mike Kravetz <mike.krav...@oracle.com>
  Date:   Wed Jun 8 15:33:42 2016 -0700
    mm/hugetlb: fix huge page reserve accounting for private mappings

The following patch (made with my limited insight), applied to
latest git [2], fixes the problem for me:

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ec49d9e..7261583 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1876,7 +1876,7 @@ static long __vma_reservation_common(struct hstate *h,
                 * return value of this routine is the opposite of the
                 * value returned from reserve map manipulation routines above.
                 */
-               if (ret)
+               if (ret >= 0)
                        return 0;
                else
                        return 1;
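
To make the effect of the one-line change easier to see, here is a minimal
standalone sketch of just the conditional touched by the patch. The function
names before_patch() and after_patch() are hypothetical and exist only for
this illustration; the surrounding kernel context (which values of ret can
actually reach this branch) is not modeled.

```c
/*
 * Illustration only: mirrors the two-way conditional from the hunk,
 * outside of any kernel context. Function names are hypothetical.
 */

/* Conditional as it stands before the patch: any nonzero ret gives 0. */
static long before_patch(long ret)
{
	if (ret)
		return 0;
	else
		return 1;
}

/* Conditional with the patch applied: any non-negative ret gives 0. */
static long after_patch(long ret)
{
	if (ret >= 0)
		return 0;
	else
		return 1;
}
```

Taken in isolation, the two versions agree for ret > 0 and differ for
ret == 0 (1 before, 0 after) and for ret < 0 (0 before, 1 after).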


[1] https://github.com/libhugetlbfs/libhugetlbfs
[2] v4.8-14230-gb67be92
