On Mon, Apr 14, 2014 at 10:01 AM, TAY wee-beng <[email protected]> wrote:
> On 14/4/2014 10:44 PM, Matthew Knepley wrote:
>
> On Mon, Apr 14, 2014 at 9:40 AM, TAY wee-beng <[email protected]> wrote:
>
>> Hi Barry,
>>
>> I'm not too sure how to do it. I'm running mpi. So I run:
>>
>> mpirun -n 4 ./a.out -start_in_debugger
>
> add -debugger_pause 10
>
> It seems that I need to use a value of 60. It gives a segmentation fault
> after the location I debugged earlier:
>
> #0  0x00002b672cdee78a in f90array3daccessscalar_ ()
>     from /home/wtay/Lib/petsc-3.4.4_shared_rel/lib/libpetsc.so
> #1  0x00002b672cdedcae in F90Array3dAccess ()
>     from /home/wtay/Lib/petsc-3.4.4_shared_rel/lib/libpetsc.so
> #2  0x00002b672d2ad044 in dmdavecrestorearrayf903_ ()
>     from /home/wtay/Lib/petsc-3.4.4_shared_rel/lib/libpetsc.so
> #3  0x00000000008a1d8d in fractional_initial_mp_initial_ ()
> #4  0x0000000000539289 in MAIN__ ()
> #5  0x000000000043c04c in main ()
>
> What's that supposed to mean?

It appears that you are restoring an array that you did not actually get.

   Matt

> Matt
>
>> I got the message below. Before the gdb windows appear (through X11), the
>> program aborts.
>>
>> I also tried running on another cluster, and it worked. I also tried the
>> current cluster in debug mode, and it worked too.
>>
>> mpirun -n 4 ./a.out -start_in_debugger
>>
>> --------------------------------------------------------------------------
>> An MPI process has executed an operation involving a call to the
>> "fork()" system call to create a child process. Open MPI is currently
>> operating in a condition that could result in memory corruption or
>> other system errors; your MPI job may hang, crash, or produce silent
>> data corruption. The use of fork() (or system() or other calls that
>> create child processes) is strongly discouraged.
>>
>> The process that invoked fork was:
>>
>>   Local host:          n12-76 (PID 20235)
>>   MPI_COMM_WORLD rank: 2
>>
>> If you are *absolutely sure* that your application will successfully
>> and correctly survive a call to fork(), you may disable this warning
>> by setting the mpi_warn_on_fork MCA parameter to 0.
>> --------------------------------------------------------------------------
>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display
>> localhost:50.0 on machine n12-76
>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display
>> localhost:50.0 on machine n12-76
>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display
>> localhost:50.0 on machine n12-76
>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display
>> localhost:50.0 on machine n12-76
>> [n12-76:20232] 3 more processes have sent help message
>> help-mpi-runtime.txt / mpi_init:warn-fork
>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see
>> all help / error messages
>>
>> ....
>>
>>  1
>> [1]PETSC ERROR: ------------------------------------------------------------------------
>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>> probably memory access out of range
>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and
>> Apple Mac OS X to find memory corruption errors
>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>> [1]PETSC ERROR: to get more information on the crash.
>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null)
>> [3]PETSC ERROR: ------------------------------------------------------------------------
>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>> probably memory access out of range
>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and
>> Apple Mac OS X to find memory corruption errors
>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>> [3]PETSC ERROR: to get more information on the crash.
>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null)
>>
>> ...
>>
>> Thank you.
>>
>> Yours sincerely,
>>
>> TAY wee-beng
>>
>> On 14/4/2014 9:05 PM, Barry Smith wrote:
>>
>> Because IO doesn’t always get flushed immediately, it may not be hanging at
>> this point. It is better to use the option -start_in_debugger, then type
>> cont in each debugger window, and then, when you think it is “hanging”, do a
>> control-C in each debugger window and type where to see where each process
>> is. You can also look around in the debugger at variables to see why it is
>> “hanging” at that point.
>>
>> Barry
>>
>> These routines don’t have any parallel communication in them, so they are
>> unlikely to hang.
>>
>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng <[email protected]> wrote:
>>
>> Hi,
>>
>> My code hangs, and I added mpi_barrier and print calls to catch the bug. I
>> found that it hangs after printing "7". Is it because I'm doing something
>> wrong? I need to access the u,v,w arrays, so I use DMDAVecGetArrayF90.
>> After access, I use DMDAVecRestoreArrayF90.
>>
>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3"
>> call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4"
>> call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5"
>> call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array)
>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6"
>> call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr) !must be in reverse order
>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7"
>> call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>> call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8"
>> call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>
>> --
>> Thank you.
>>
>> Yours sincerely,
>>
>> TAY wee-beng
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
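Matt's point, restated: one of the DMDAVecRestoreArrayF90 calls is apparently being issued for a DM/Vec/array-pointer combination that was never successfully passed to DMDAVecGetArrayF90, which would put the crash inside dmdavecrestorearrayf903_ exactly as in the backtrace above. A minimal sketch of the expected pairing, reusing the da_u, u_local, and u_array names from the snippet above (the declarations are assumptions, not taken from the original code):

    ! Each DMDAVecGetArrayF90 must be matched by a DMDAVecRestoreArrayF90
    ! on the same DM, Vec, and pointer; restoring a pointer that was never
    ! obtained, or was already restored, can crash inside the restore call.
    PetscScalar, pointer :: u_array(:,:,:)
    PetscErrorCode       :: ierr

    call DMDAVecGetArrayF90(da_u, u_local, u_array, ierr)
    ! ... read/write u_array(i,j,k) here ...
    call DMDAVecRestoreArrayF90(da_u, u_local, u_array, ierr)
    ! u_array must not be dereferenced after the restore

Checking ierr (for example with CHKERRQ(ierr)) after each get call would also make a failed get visible before the matching restore runs.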
