On Mon, 2014-03-31 at 14:43 -0700, janjust wrote:
> (hm my direct reply seems to be getting rejected)
> 
> Yes,
> The output is rather large so I attached 3 files that were the result of
> running it with 2 procs. 1 for stdout and the other two are from
> --log-file=valgrind.%p
> -Tommy
> 
> val.out <http://valgrind.10908.n7.nabble.com/file/n49155/val.out>  
> 
> valgrind.26200
> <http://valgrind.10908.n7.nabble.com/file/n49155/valgrind.26200>  
> 
> valgrind.26269
> <http://valgrind.10908.n7.nabble.com/file/n49155/valgrind.26269>  

Looking at the output, this seems to be the relevant trace:

SYSCALL[26201,1](  2) sys_open ( 0x5ec9cc(/proc/mounts), 0 ) --> [async] ... 
SYSCALL[26201,1](  2) ... [async] --> Success(0x0:0x11) 
SYSCALL[26201,1](  5) sys_newfstat ( 17, 0xffebf76e0 )[sync] --> 
Success(0x0:0x0) 
SYSCALL[26201,1](  9) sys_mmap ( 0x0, 4096, 3, 34, -1, 0 ) --> [pre-success] 
Success(0x0:0x4e71000) 
SYSCALL[26201,1](  0) sys_read ( 17, 0x4e71000, 1024 ) --> [async] ... 
SYSCALL[26201,1](  0) ... [async] --> Success(0x0:0x400) 
SYSCALL[26201,1](137) sys_statfs ( 
0xffebfaa85(/var/lib/hugetlbfs/global/pagesize-2097152), 0xffebf9a80 )[sync] 
--> Success(0x0:0x0) 
SYSCALL[26201,1](  3) sys_close ( 17 )[sync] --> Success(0x0:0x0) 
SYSCALL[26201,1]( 11) sys_munmap ( 0x4e71000, 4096 )[sync] --> Success(0x0:0x0) 
SYSCALL[26201,1](  2) sys_open ( 
0xffebf8a80(/var/lib/hugetlbfs/global/pagesize-2097152/hugepagefile.MPICH.0.26201.kvs_4761352),
 66, 493 ) --> [async] ... 
SYSCALL[26201,1](  2) ... [async] --> Success(0x0:0x11) 
SYSCALL[26201,1]( 87) sys_unlink ( 
0xffebf8a80(/var/lib/hugetlbfs/global/pagesize-2097152/hugepagefile.MPICH.0.26201.kvs_4761352)
 ) --> [async] ... 
SYSCALL[26201,1]( 87) ... [async] --> Success(0x0:0x0) 
SYSCALL[26201,1](  9) sys_mmap ( 0x0, 4194304, 3, 1, 17, 0 ) --> [pre-fail] 
Failure(0x16) 

I then tried to reproduce the problem with the small program below,
which makes the same syscalls with the same parameters, except for
the fd argument to mmap, which must be the result of the open.
The program works on my system.
You could run it (natively and under valgrind) and see whether it
fails.
If it does not fail, replace 4m.txt with a path name similar to the
one in the trace (assuming the path
/var/lib/hugetlbfs/global/pagesize-2097152/hugepagefile.MPICH.0.26201.kvs_4761352
is on a "strangely mounted" filesystem, cf. the open of /proc/mounts
just before it in the trace).
If the program (and the modified version) succeeds but the mpich run
fails, then I guess you will have to debug the valgrind code
itself, to see what exactly makes the syscall fail with EINVAL:
  is it the valgrind checks?
  or is it the real syscall failing?
Rather than debugging Valgrind, you might first check whether the
syscall itself fails by running valgrind under strace, e.g.
  strace -f valgrind mmap_huge
Philippe

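#include <fcntl.h>      /* open */
#include <stdio.h>
#include <sys/mman.h>   /* mmap */
#include <unistd.h>     /* unlink */

int main(void)
{
  int fd;
  char *m;
  /* 66 = O_RDWR|O_CREAT, 493 = 0755: same values as in the trace */
  fd = open("4m.txt", 66, 493);
  printf("open result : %d\n", fd);
  unlink("4m.txt");
  /* 3 = PROT_READ|PROT_WRITE, 1 = MAP_SHARED */
  m = (char *) mmap(0x0, 4194304, 3, 1, fd, 0);
  printf("mmap result %p\n", (void *) m);
  return 0;
}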


