Try running under valgrind:
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
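For an MPI run, the FAQ suggests an invocation along these lines (a sketch; adjust the process count, executable name, and log-file location to your setup):

   mpirun -n 4 valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p ./a.out -malloc off

The -malloc off option makes PETSc use the system malloc so valgrind can track the allocations directly; check the per-process valgrind.log.* files for invalid reads/writes.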


On Apr 14, 2014, at 9:47 PM, TAY wee-beng <[email protected]> wrote:

> 
> Hi Barry,
> 
> As I mentioned earlier, the code works fine in PETSc debug mode but fails in 
> non-debug mode.
> 
> I have attached my code.
> 
> Thank you
> 
> Yours sincerely,
> 
> TAY wee-beng
> 
> On 15/4/2014 2:26 AM, Barry Smith wrote:
>>   Please send the code that creates da_w and the declarations of w_array
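>>   For a 3d DMDA with a single degree of freedom, DMDAVecGetArrayF90 expects the array argument to be declared as a Fortran pointer of matching rank, something like this (a sketch; your names and dof may differ):
>> 
>>      PetscScalar, pointer :: w_array(:,:,:)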
>> 
>>   Barry
>> 
>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng <[email protected]> wrote:
>> 
>> 
>>> Hi Barry,
>>> 
>>> I'm not too sure how to do it. I'm running under MPI, so I run:
>>> 
>>>  mpirun -n 4 ./a.out -start_in_debugger
>>> 
>>> I got the message below. Before the gdb windows appear (through X11), the program aborts.
>>> 
>>> I also tried running on another cluster and it worked. It also worked on the current cluster in debug mode.
>>> 
>>> mpirun -n 4 ./a.out -start_in_debugger
>>> --------------------------------------------------------------------------
>>> An MPI process has executed an operation involving a call to the
>>> "fork()" system call to create a child process.  Open MPI is currently
>>> operating in a condition that could result in memory corruption or
>>> other system errors; your MPI job may hang, crash, or produce silent
>>> data corruption.  The use of fork() (or system() or other calls that
>>> create child processes) is strongly discouraged.  
>>> 
>>> The process that invoked fork was:
>>> 
>>>   Local host:          n12-76 (PID 20235)
>>>   MPI_COMM_WORLD rank: 2
>>> 
>>> If you are *absolutely sure* that your application will successfully
>>> and correctly survive a call to fork(), you may disable this warning
>>> by setting the mpi_warn_on_fork MCA parameter to 0.
>>> --------------------------------------------------------------------------
>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76
>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76
>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76
>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76
>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork
>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>> 
>>> ....
>>> 
>>>  1
>>> [1]PETSC ERROR: ------------------------------------------------------------------------
>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>> [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>>> [1]PETSC ERROR: to get more information on the crash.
>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null)
>>> [3]PETSC ERROR: ------------------------------------------------------------------------
>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>>> [3]PETSC ERROR: to get more information on the crash.
>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null)
>>> 
>>> ...
>>> Thank you.
>>> 
>>> Yours sincerely,
>>> 
>>> TAY wee-beng
>>> 
>>> On 14/4/2014 9:05 PM, Barry Smith wrote:
>>> 
>>>>   Because IO doesn’t always get flushed immediately, it may not actually be hanging at that point. It is better to use the option -start_in_debugger, type cont in each debugger window, and then, when you think it is “hanging”, do a control-C in each debugger window and type where to see where each process is. You can also look around in the debugger at variables to see why it is “hanging” at that point.
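>>>> 
>>>> A typical sequence in each gdb window would look something like this (a sketch of standard gdb commands, not output from your run):
>>>> 
>>>>    (gdb) cont       <- let the process run
>>>>    ^C               <- interrupt it once the job appears to hang
>>>>    (gdb) where      <- print the stack to see where this process is stuck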
>>>> 
>>>>    Barry
>>>> 
>>>>   These routines don’t have any parallel communication in them, so they are unlikely to hang.
>>>> 
>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng <[email protected]> wrote:
>>>> 
>>>> 
>>>> 
>>>>> Hi,
>>>>> 
>>>>> My code hangs, and I added MPI_Barrier calls and print statements to catch the bug. I found that it hangs after printing "7". Am I doing something wrong? I need to access the u, v, w arrays, so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90.
>>>>> 
>>>>>         call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>         call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"3"
>>>>>         call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>         call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"4"
>>>>>         call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>         call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"5"
>>>>>         call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array)
>>>>>         call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"6"
>>>>>         call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)  ! must be in reverse order
>>>>>         call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"7"
>>>>>         call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>         call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"8"
>>>>>         call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>> -- 
>>>>> Thank you.
>>>>> 
>>>>> Yours sincerely,
>>>>> 
>>>>> TAY wee-beng
>>>>> 
>>>>> 
>>>>> 
> 
> 
> 
> <code.txt>
