On Tue, 3 Apr 2012, Anton Popov wrote:

> I support 100% what Barry said. Just get the work done. Cray and IBM Linux
> systems do not support ALL the system calls that PETSc uses. So it's always
> kind of a problem to manually purge petscconf.h in between "configure" and
> "make" on their machines. I wonder how you could install any PETSc without
> modifying petscconf.h.

You shouldn't have to manually modify petscconf.h on these machines. There
could still be some warnings at link time - but that shouldn't mean
breakages at runtime.

> If you just don't care, usually you get segfaults right
> at PetscInitialize() step.

If this is the case - then it should be verifiable with PETSc examples.

Satish

> Literally it means, there is no way you can debug
> anything, they should reinstall PETSc, keeping in mind the exact list of
> system calls they support, and PETSc requirements.
>
> By the way, the times when GNU compilers were "order of magnitude" slower than
> "vendor compilers" have passed long ago. Just give it a try, compile some
> simple computationally intensive code with gcc and something from "vendor"
> with aggressive optimization, and check execution time on a large data set.
> I'm sure you'll be surprised.
>
> Cheers,
> Anton
>
> On 4/3/12 3:57 AM, Barry Smith wrote:
> > On Apr 2, 2012, at 8:10 PM, Tabrez Ali wrote:
> >
> > > Hello
> > >
> > > I am trying to debug a program using the switch
> > > '-on_error_attach_debugger' but the vendor/sysadmin built PETSc 3.2.00 is
> > > unable to start the debugger in xterm (see text below). But xterm is
> > > installed. What am I doing wrong?
> > >
> > > Btw the segfault happens during a call to MatMult but only with
> > > vendor/sysadmin supplied PETSc 3.2 with PGI and Intel compilers only and
> > > _not_ with CRAY or GNU compilers.
> > My advice, blow off "the vendor/sysadmin supplied PETSc 3.2" and just
> > build it yourself so you can get real work done instead of trying to debug
> > their mess. I promise the vendor one is not like a billion times faster or
> > anything.
> >
> > Barry
> >
> > > I also don't get the segfault if I build PETSc 3.2-p7 myself with PGI/Intel
> > > compilers.
> > >
> > > Any ideas on how to diagnose the problem? Unfortunately I cannot seem to
> > > run valgrind on this particular machine.
> > >
> > > Thanks in advance.
> > >
> > > Tabrez
> > >
> > > ---
> > >
> > > stali@krakenpf1:~/meshes> which xterm
> > > /usr/bin/xterm
> > > stali@krakenpf1:~/meshes> aprun -n 1 ./defmod -f
> > > 2d_point_load_dyn_abc.inp -on_error_attach_debugger
> > > ...
> > > ...
> > > ...
> > > [0]PETSC ERROR:
> > > ------------------------------------------------------------------------
> > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> > > probably memory access out of range
> > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> > > [0]PETSC ERROR: or see
> > > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC
> > > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find
> > > memory corruption errors
> > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and
> > > run
> > > [0]PETSC ERROR: to get more information on the crash.
> > > [0]PETSC ERROR: User provided function() line 0 in unknown directory
> > > unknown file
> > > [0]PETSC ERROR: PETSC: Attaching gdb to ./defmod of pid 32384 on display
> > > localhost:20.0 on machine nid10649
> > > Unable to start debugger in xterm: No such file or directory
> > > aborting job:
> > > application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0
> > > _pmii_daemon(SIGCHLD): [NID 10649] [c23-3c0s6n1] [Mon Apr 2 13:06:48
> > > 2012] PE 0 exit signal Aborted
> > > Application 133198 exit codes: 134
> > > Application 133198 resources: utime ~1s, stime ~0s
> >
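
For Anton's suggestion of timing some simple computationally intensive code under gcc and a vendor compiler, here is a minimal sketch in C. It is not taken from PETSc or from defmod. The kernel (a CSR sparse matrix-vector product, picked only because the reported crash is in MatMult), the tridiagonal test matrix, and the names csr_matvec, fill_tridiag and time_matvec are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* y = A*x for an n-row matrix stored in compressed sparse row (CSR) form.
   row_ptr has n+1 entries; col_idx and val each have row_ptr[n] entries. */
void csr_matvec(size_t n, const size_t *row_ptr, const size_t *col_idx,
                const double *val, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++) {
        double sum = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; k++)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
}

/* Hypothetical test matrix: the n-by-n tridiagonal matrix with rows (-1, 2, -1).
   The caller provides row_ptr[n+1], col_idx[3n-2] and val[3n-2].
   Returns the number of stored nonzeros (3n-2). */
size_t fill_tridiag(size_t n, size_t *row_ptr, size_t *col_idx, double *val)
{
    assert(n >= 1);
    size_t nnz = 0;
    for (size_t i = 0; i < n; i++) {
        row_ptr[i] = nnz;
        if (i > 0) {
            col_idx[nnz] = i - 1;
            val[nnz] = -1.0;
            nnz++;
        }
        col_idx[nnz] = i;
        val[nnz] = 2.0;
        nnz++;
        if (i + 1 < n) {
            col_idx[nnz] = i + 1;
            val[nnz] = -1.0;
            nnz++;
        }
    }
    row_ptr[n] = nnz;
    return nnz;
}

/* Run the product `reps` times and return the CPU seconds used.
   A running sum of y[0] is stored in *checksum so the work has a visible result. */
double time_matvec(size_t reps, size_t n, const size_t *row_ptr,
                   const size_t *col_idx, const double *val,
                   const double *x, double *y, double *checksum)
{
    double sum = 0.0;
    clock_t t0 = clock();
    for (size_t r = 0; r < reps; r++) {
        csr_matvec(n, row_ptr, col_idx, val, x, y);
        sum += y[0];
    }
    clock_t t1 = clock();
    *checksum = sum;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

To use it, call fill_tridiag and time_matvec from a small main(), with n large enough that the arrays do not fit in cache. Build the same file once with gcc and once with the vendor compiler at comparable optimization levels, then compare the returned times. The checksum should be the same for both builds, and a mismatch would itself be worth looking into.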
