On Tue, Jan 23, 2007 at 10:56:23PM +0000, Adam Litke wrote:
> Sonny Rao discovered a bug which affects powerpc machines. If the stack
> is allowed to grow into a segment that has been converted over to huge
> pages, current kernels will try to instantiate normal pages into a huge
> page-only segment. This test case reproduces this behavior for 32bit
> and 64bit binaries.
>
> Since there is no upstream fix for this bug yet, I don't think it's wise
> to run it all the time and needlessly render machines unstable.
> However, we don't want to forget that it's not being run by default
> either. This patch also introduces a SKIP mode to run_tests.sh. It
> will cause the named test to be skipped and a message printed that
> indicates that a test was skipped.
Hrm. I'd prefer a separate 'skip_test' function for this.
Incidentally, something that would be a really good idea, but I've
never got around to would be some 'summary' support in the test code:
at the end of the run it would print the total number of tests passed,
failed, skipped (due to bad config or this new skip notation), etc.
> diff --git a/tests/Makefile b/tests/Makefile
> index 7cd823a..7ba7ba7 100644
> --- a/tests/Makefile
> +++ b/tests/Makefile
> @@ -6,7 +6,7 @@ LIB_TESTS = gethugepagesize test_root fi
> chunk-overcommit mprotect alloc-instantiate-race mlock \
> truncate_reserve_wraparound truncate_sigbus_versus_oom \
> map_high_truncate_2 truncate_above_4GB direct \
> - misaligned_offset brk_near_huge task-size-overrun
> + misaligned_offset brk_near_huge task-size-overrun stack_grow_into_huge
> LIB_TESTS_64 = straddle_4GB huge_at_4GB_normal_below \
> huge_below_4GB_normal_above
> NOLIB_TESTS = malloc malloc_manysmall dummy
> diff --git a/tests/run_tests.sh b/tests/run_tests.sh
> index 30ca82a..a914813 100755
> --- a/tests/run_tests.sh
> +++ b/tests/run_tests.sh
> @@ -35,7 +35,10 @@ run_test_bits () {
> BITS=$1
> shift
>
> - if [ -d obj$BITS ]; then
> + if [ "$1" == "SKIP" ]; then
> + shift
> + echo "$@ ($BITS): SKIPPED"
> + elif [ -d obj$BITS ]; then
> echo -n "$@ ($BITS): "
> PATH="obj$BITS:$PATH" LD_LIBRARY_PATH="$LD_LIBRARY_PATH:../obj$BITS"
> $ENV "$@"
> fi
> @@ -154,6 +157,7 @@ # Specific kernel bug tests
> run_test truncate_above_4GB
> run_test brk_near_huge
> run_test task-size-overrun
> + run_test SKIP stack_grow_into_huge
>
> # Tests requiring an active mount and hugepage COW
> run_test private
> diff --git a/tests/stack_grow_into_huge.c b/tests/stack_grow_into_huge.c
> new file mode 100644
> index 0000000..ff1058e
> --- /dev/null
> +++ b/tests/stack_grow_into_huge.c
> @@ -0,0 +1,89 @@
> +/*
> + * libhugetlbfs - Easy use of Linux hugepages
> + * Copyright (C) 2005-2006 David Gibson & Adam Litke, IBM Corporation.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public License
> + * as published by the Free Software Foundation; either version 2.1 of
> + * the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <sys/mman.h>
> +#include <sys/resource.h>
> +#include <sys/wait.h>
> +
> +#include <hugetlbfs.h>
> +#include "hugetests.h"
> +
> +/*
> + * Test rationale:
> + *
> + * On PowerPC, the address space is divided into segments. These segments
> can
> + * contain either huge pages or normal pages, but not both. All segments are
> + * initially set up to map normal pages. When a huge page mapping is created
> + * within a set of empty segments, they are "enabled" for huge pages at that
> + * time. Once enabled for huge pages, they can not be used again for normal
> + * pages for the remaining lifetime of the process.
> + *
> + * If the segment immediately preceeding the segment containing the stack is
> + * converted to huge pages and the stack is made to grow into the this
> + * preceeding segment, some kernels may attempt to map normal pages into the
> + * huge page-only segment -- resulting in bugs.
> + */
> +
> +int main(int argc, char *argv[])
> +{
> + int fd, i, pid, s;
> + struct rlimit r;
> + char *b;
> + unsigned long hpage_size = gethugepagesize();
> +
> + test_init(argc, argv);
> +
> + i = getrlimit(RLIMIT_STACK, &r);
> + if (i)
> + CONFIG("getrlimit failed");
> +
> + if (r.rlim_cur != RLIM_INFINITY)
> + CONFIG("Stack rlimit must be 'unlimited'");
I'm a bit torn here. We're now running the testsuite as root, so we
could do a setrlimit here, simplifying things. On the other hand I
rather miss the fact that the testsuite can't be run as non-root any
more, so I'm disinclined to make it harder to re-acquire that.
> + fd = hugetlbfs_unlinked_fd();
> + if (fd < 0)
> + CONFIG("Couldn't get hugepage fd");
> +
> + b = mmap(0, hpage_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
> + if (b != MAP_FAILED) {
> + munmap(b, hpage_size);
> + } else
> + FAIL("mmap");
Braces on neither or both clauses, please.
> + if ((pid = fork()) < 0)
> + FAIL("fork");
> +
> + if (pid == 0) {
> + while (1) { alloca(16*1024*1024); }
Ow, please don't skimp on proper formatting, extra lines don't cost.
In any case I'd be inclined to but the child logic into a separate
function for clarity.
> + exit (0);
And make use of the exit(), to de-indent the rest of the checks.
> + } else {
> + waitpid(pid, &s, 0);
Best to check waitpid()s return code, just in case.
> + if (WIFSIGNALED(s)) {
> + if (WTERMSIG(s) == SIGTRAP) {
> + FAIL("SIGTRAP received");
> + } else {
> + PASS();
> + }
This could do with some explaining comments. And probably an inverted
sense of the test: what signal *should* we get if we get a stack
collision (I'm guessing SEGV, but we'd have to check the kernel code).
> + }
> + }
> + FAIL("Child not signalled");
> +}
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
Libhugetlbfs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel