On 12.10.2006 [11:41:12 -0700], Nishanth Aravamudan wrote:
> On 12.10.2006 [11:12:53 -0700], Nishanth Aravamudan wrote:
> > On 11.10.2006 [11:29:05 +1000], David Gibson wrote:
> > > On Tue, Oct 10, 2006 at 11:28:55AM -0700, Nishanth Aravamudan wrote:
> > > > On 10.10.2006 [15:36:47 +1000], David Gibson wrote:
> 
> <snip>
> 
> > > > > Incidentally have people been running the testsuite routinely?  For me
> > > > > on current mainline it now produces many errors, and crashes the
> > > > > machine (POWER5 LPAR).
> > > > 
> > > > We have been running them as regularly as possible. Is this related to
> > > > your recent post to LKML? Or an independent one?
> > > 
> > > The crash is related to my lkml post, yes.  I'm also getting testcase
> > > failures on a bunch of the share cases, though, and nearly all the
> > > 32-bit versions of the elflink tests.
> > 
> > FWIW, everything passes on x86_64. So this would appear to be
> > ppc-specific breakage? I'm bringing up a G5 to do some testing and see
> > if I can't help track down the issues.
> 
> I just ran the testsuite on a ppc64 kernel on a 2-way 2.0GHz G5 with 200
> hugepages allocated and it passed just fine.
> 
> How many hugepages did you have allocated? If it's fewer than 10, then I
> know the problem and it's fixable.

After some further debugging, an "ah ha" moment -- the failed tests would
appear to be directly related to your other complaint about the excessive
hugepage requirements. At one point, while running `watch cat
/proc/meminfo` alongside `make func`, I noted that the number of hugepages
Rsvd reached 150 (which would be 15 for each of the 10 processes in the
linkshare testcase).
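
For anyone who wants to reproduce that arithmetic, here is a rough sketch of
how the per-process figure falls out of the meminfo counters. hpage_field is
a hypothetical helper written for this mail, not something in the testsuite,
and the sample values are simply the numbers quoted above rather than
captured output.

```shell
#!/bin/sh
# Hypothetical helper (not part of libhugetlbfs): print one HugePages_*
# counter from meminfo-formatted text supplied on stdin.
hpage_field() {
    awk -v key="HugePages_$1:" '$1 == key { print $2 }'
}

# Sample text built from the numbers mentioned in this thread. On a live
# system, read the real file instead: hpage_field Rsvd < /proc/meminfo
sample='HugePages_Total:   200
HugePages_Free:    200
HugePages_Rsvd:    150'

rsvd=$(printf '%s\n' "$sample" | hpage_field Rsvd)
nprocs=10   # the linkshare test was invoked with 10 sharings

# Integer division: 150 reserved pages over 10 processes.
echo "reserved per process: $((rsvd / nprocs))"
```

With the sample values this prints "reserved per process: 15", which is where
the divisor in the run_tests.sh change comes from.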

In the short term, while we fix the excessive reservation issue, I can
change the run_tests.sh script to request only as many processes as will
work given the number of free hugepages on the system. How does this
look?
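
As a standalone illustration of the calculation, here is a minimal sketch.
sharings_for is a hypothetical function, the default of 15 pages per process
is just the figure observed above, and the clamp to zero is an addition for
the sketch that the patch itself does not contain.

```shell
#!/bin/sh
# Hypothetical helper: how many linkshare processes to request for a given
# number of free hugepages. The per-process cost defaults to 15, the value
# observed in this thread; it is an assumption, not a library constant.
sharings_for() {
    free=$1
    per_proc=${2:-15}
    n=$((free / per_proc - 1))   # same formula as the run_tests.sh change
    [ "$n" -gt 0 ] || n=0        # sketch-only clamp: never return a negative count
    echo "$n"
}

for free in 0 20 150 200; do
    echo "$free free hugepages -> $(sharings_for "$free") processes"
done
```

A caller could skip the linkshare runs entirely when the result is 0 instead
of passing that count to the test, but that is a design choice left open
here.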

Description: Since we now pad the BSS of relinked binaries, we require a
larger number of hugepages than before, even if most of them are unused.
This leads to issues with the linkshare testcase, as it spawns a fixed
number of processes, all of which consume hugepages and eventually lead
to an ENOMEM (out of hugepages) condition. Modify the testcase invocation
to spawn a number of processes relative to the number of free hugepages
(even if the BSS padding is fixed differently, this is a reasonable thing
to do). Also modify the linkshare testcase to fail early if the number of
processes requested is not positive (which will now occur if fewer than
30 hugepages are free in the system).

Signed-off-by: Nishanth Aravamudan <[EMAIL PROTECTED]>

diff --git a/tests/linkshare.c b/tests/linkshare.c
index 227af08..f3fd50e 100644
--- a/tests/linkshare.c
+++ b/tests/linkshare.c
@@ -169,6 +169,9 @@ int main(int argc, char *argv[], char *e
                num_sharings = atoi(argv[1]);
                if (num_sharings > 99999)
                        FAIL("Too many sharings requested (max = 99999)");
+               if (num_sharings <= 0)
+                       FAIL("Number of sharings requested must be greater "
+                                                       "than 0");
 
                children = (pid_t *)malloc(num_sharings * sizeof(pid_t));
                if (!children)
diff --git a/tests/run_tests.sh b/tests/run_tests.sh
index b74b1e6..600d8aa 100755
--- a/tests/run_tests.sh
+++ b/tests/run_tests.sh
@@ -64,14 +64,15 @@ elfshare_test () {
     baseprog="${args[$N]}"
     unset args[$N]
     set -- "${args[@]}"
+    NUM_THREADS=$((`free_hpages` / 15 - 1))
     killall -HUP hugetlbd
-    run_test HUGETLB_SHARE=2 "$@" "xB.$baseprog" 10
+    run_test HUGETLB_SHARE=2 "$@" "xB.$baseprog" $NUM_THREADS
     killall -HUP hugetlbd
-    run_test HUGETLB_SHARE=1 "$@" "xB.$baseprog" 10
+    run_test HUGETLB_SHARE=1 "$@" "xB.$baseprog" $NUM_THREADS
     killall -HUP hugetlbd
-    run_test HUGETLB_SHARE=2 "$@" "xBDT.$baseprog" 10
+    run_test HUGETLB_SHARE=2 "$@" "xBDT.$baseprog" $NUM_THREADS
     killall -HUP hugetlbd
-    run_test HUGETLB_SHARE=1 "$@" "xBDT.$baseprog" 10
+    run_test HUGETLB_SHARE=1 "$@" "xBDT.$baseprog" $NUM_THREADS
 }
 
 setup_shm_sysctl() {

-- 
Nishanth Aravamudan <[EMAIL PROTECTED]>
IBM Linux Technology Center
