On Mon, Apr 2, 2012 at 9:52 PM, Tabrez Ali <stali at geology.wisc.edu> wrote:
> Matt/Barry
>
> My intention was to make sure that the code is bug free and since PETSc
> was pre-installed on the cluster with various compilers it was easier to
> test quickly rather than build all combinations myself. Performance is of
> absolutely no concern.
>
> Things were working fine with 3.1 but recently the OS (Cray Linux Env) was
> upgraded and so was PETSc (to 3.2).
>
> Matt
>
> I am attaching entire output.
>
> Unable to start debugger in xterm: No such file or directory
> aborting job:

xterm is not in the path.

   Matt

> Tabrez
>
> ---
>
> stali at krakenpf2:~/meshes> which xterm
> /usr/bin/xterm
> stali at krakenpf2:~/meshes> aprun -n 1 ./defmod -f 2d_point_load_dyn_abc.inp -on_error_attach_debugger
> Reading input ...
> Reading mesh data ...
> Forming [K] ...
> Forming [M] & [M]^-1 ...
> Applying constraints ...
> Forming RHS ...
> Setting up solver ...
> Solving ...
> Time Step 0
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> [0]PETSC ERROR: to get more information on the crash.
> [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file
> [0]PETSC ERROR: PETSC: Attaching gdb to ./defmod of pid 26164 on display :0.0 on machine nid03538
> Unable to start debugger in xterm: No such file or directory
> aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0
> _pmii_daemon(SIGCHLD): [NID 03538] [c12-3c2s4n2] [Mon Apr 2 22:50:09 2012] PE 0 exit signal Aborted
> Application 134950 exit codes: 134
> Application 134950 resources: utime ~1s, stime ~0s
>
> On 04/02/2012 09:04 PM, Matthew Knepley wrote:
>
> On Mon, Apr 2, 2012 at 8:57 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>>
>> On Apr 2, 2012, at 8:10 PM, Tabrez Ali wrote:
>>
>> > Hello
>> >
>> > I am trying to debug a program using the switch
>> > '-on_error_attach_debugger' but the vendor/sysadmin built PETSc 3.2.00 is
>> > unable to start the debugger in xterm (see text below). But xterm is
>> > installed. What am I doing wrong?
>> >
>> > Btw the segfault happens during a call to MatMult but only with
>> > vendor/sysadmin supplied PETSc 3.2 with PGI and Intel compilers only and
>> > _not_ with CRAY or GNU compilers.
>>
>> My advice, blow off "the vendor/sysadmin supplied PETSc 3.2" and just
>> build it yourself so you can get real work done instead of trying to debug
>> their mess. I promise the vendor one is not like a billion times faster
>> or anything.
>
>
> If you want to justify this to anyone (like a funder), just run both on
> ex5 for a large size and look at the flops on MatMult. That
> is probably your dominant cost (or your PC).
>
> Matt
>
>
>>
>> Barry
>>
>> >
>> > I also don't get the segfault if I build PETSc 3.2-p7 myself with
>> > PGI/Intel compilers.
>> >
>> > Any ideas on how to diagnose the problem? Unfortunately I cannot seem
>> > to run valgrind on this particular machine.
>> >
>> > Thanks in advance.
>> >
>> > Tabrez
>> >
>> > ---
>> >
>> > stali at krakenpf1:~/meshes> which xterm
>> > /usr/bin/xterm
>> > stali at krakenpf1:~/meshes> aprun -n 1 ./defmod -f 2d_point_load_dyn_abc.inp -on_error_attach_debugger
>> > ...
>> > ...
>> > ...
>> > [0]PETSC ERROR: ------------------------------------------------------------------------
>> > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>> > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind
>> > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>> > [0]PETSC ERROR: to get more information on the crash.
>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file
>> > [0]PETSC ERROR: PETSC: Attaching gdb to ./defmod of pid 32384 on display localhost:20.0 on machine nid10649
>> > Unable to start debugger in xterm: No such file or directory
>> > aborting job:
>> > application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0
>> > _pmii_daemon(SIGCHLD): [NID 10649] [c23-3c0s6n1] [Mon Apr 2 13:06:48 2012] PE 0 exit signal Aborted
>> > Application 133198 exit codes: 134
>> > Application 133198 resources: utime ~1s, stime ~0s
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
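A note on the "xterm is not in the path" diagnosis: the `which xterm` commands in the thread were typed at the krakenpf login prompt, while the failing process ran on a compute node (nid03538, nid10649), so the two lookups may well see different environments. The following is a minimal sketch, not anything from PETSc, for checking what a PATH lookup resolves to from inside the job. It assumes a Python interpreter is available on the compute node, which the thread does not confirm, and `report_executable` is a hypothetical helper name.

```python
import os
import shutil


def report_executable(name, path=None):
    """Return a one-line report of where `name` resolves on `path`.

    `path` defaults to the PATH of the current process, which is what a
    child process launched from here would normally inherit.
    """
    search_path = os.environ.get("PATH", "") if path is None else path
    location = shutil.which(name, path=search_path)
    if location is None:
        return f"{name}: not found on PATH ({search_path!r})"
    return f"{name}: {location}"


if __name__ == "__main__":
    # Tools the debugger attach step in the log appears to rely on.
    for tool in ("xterm", "gdb"):
        print(report_executable(tool))
```

Running the script once at the login prompt and once through the job launcher (for example under `aprun -n 1`, assuming the interpreter can be started that way) would show whether the compute-node environment can resolve `xterm` at all.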
