Thanks for replying! The entire program output is at the bottom. The error is:

Unable to mmap hugepage 4194304 bytes
Unable to mmap hugepage 4194304 bytes
This is surely MPICH-specific, as Open MPI works just fine. The application is very simple: a hello-world example with MPI_Init, send/receive, gather, and barrier. My valgrind version is from trunk; however, this also happens with the 3.9 release, which I'm guessing is the latest stable.

The OS is Cray's Compute Node Linux, 64-bit. uname -a gives:

Linux xxxx-ext1 2.6.32.59-0.7-default #1 SMP 2012-07-13 15:50:56 +0200 x86_64 x86_64 x86_64 GNU/Linux

Btw, I have run pretty large scientific codes with valgrind without a problem. The major issues are typically unhandled instructions, which can sometimes be avoided with compiler flags. Memory was never an issue.

=============
janjust@login8:~/janjust_proj/tmp$ aprun -n 2 -N 1 ../valgrind-trunk-build/bin/valgrind --tool=none ./a.out
==16936== Nulgrind, the minimal Valgrind tool
==20916== Nulgrind, the minimal Valgrind tool
==16936== Copyright (C) 2002-2013, and GNU GPL'd, by Nicholas Nethercote.
==20916== Copyright (C) 2002-2013, and GNU GPL'd, by Nicholas Nethercote.
==16936== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==16936== Command: ./a.out
==20916== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==20916== Command: ./a.out
==16936==
==20916==
Unable to mmap hugepage 4194304 bytes
Unable to mmap hugepage 4194304 bytes
For file /var/lib/hugetlbfs/global/pagesize-2097152/hugepagefile.MPICH.0.16937.kvs_4760754 err Invalid argument
For file /var/lib/hugetlbfs/global/pagesize-2097152/hugepagefile.MPICH.0.20917.kvs_4760754 err Invalid argument
Rank 1 [Mon Mar 31 15:45:07 2014] [c0-0c0s1n0] Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(449).............:
MPID_Init(234)....................: channel initialization failed
MPIDI_CH3_Init(83)................:
MPID_nem_init(325)................:
MPID_nem_gni_init(1695)...........:
MPID_nem_gni_dma_buffers_init(769): Out of memory
Rank 0 [Mon Mar 31 15:45:07 2014] [c0-0c0s1n3] Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(449).............:
MPID_Init(234)....................: channel initialization failed
MPIDI_CH3_Init(83)................:
MPID_nem_init(325)................:
MPID_nem_gni_init(1695)...........:
MPID_nem_gni_dma_buffers_init(769): Out of memory
==16937==
==20917==
_pmiu_daemon(SIGCHLD): [NID 00093] [c0-0c0s1n3] [Mon Mar 31 15:45:07 2014] PE RANK 0 exit signal Killed
_pmiu_daemon(SIGCHLD): [NID 00002] [c0-0c0s1n0] [Mon Mar 31 15:45:07 2014] PE RANK 1 exit signal Killed
==16936==
==20916==
[NID 00093] 2014-03-31 15:45:07 Apid 4760754: initiated application termination
Application 4760754 exit codes: 137
Application 4760754 resources: utime ~0s, stime ~0s, Rss ~28836, inblocks ~10526, outblocks ~54958
==============
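As a quick sanity check on the numbers in the output above (a sketch, not a diagnosis): the hugetlbfs path names a page size of 2097152 bytes, and the failing request is 4194304 bytes. The snippet below only does that arithmetic. The variable names are illustrative, and reading the page size out of the `pagesize-...` path component is an assumption about how that directory is named.

```python
import re

# Values copied verbatim from the program output above.
requested_bytes = 4194304
hugepage_file = (
    "/var/lib/hugetlbfs/global/pagesize-2097152/"
    "hugepagefile.MPICH.0.16937.kvs_4760754"
)

# Assumption: the "pagesize-N" path component is the huge page size in bytes.
match = re.search(r"pagesize-(\d+)", hugepage_file)
page_size = int(match.group(1))

# How many whole huge pages does the request cover, and is anything left over?
pages, remainder = divmod(requested_bytes, page_size)

print(f"page size: {page_size // 2**20} MiB")
print(f"request:   {requested_bytes // 2**20} MiB = {pages} pages, remainder {remainder}")
```

The request works out to exactly two 2 MiB pages with no remainder, so a size that is not a whole number of huge pages does not look like the explanation for the "Invalid argument". This says nothing about the actual cause.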