Please send the code that creates da_w and the declarations of w_array
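
  For reference, the two pieces I'm asking about usually look something like the sketch below. This is only a sketch: size_x, size_y, size_z, the stencil width, and dof = 1 are placeholders rather than your values, the boundary-type names differ a little between PETSc versions, and the declarations assume the usual PETSc Fortran includes are already in place.

      DM da_w
      Vec w_local
      PetscScalar, pointer :: w_array(:,:,:)   ! rank 3 when dof = 1; a dof > 1 DMDA needs a rank-4 pointer
      PetscErrorCode ierr

      ! placeholder global sizes, stencil width 1, one degree of freedom per grid point
      call DMDACreate3d(PETSC_COMM_WORLD,                                  &
           DMDA_BOUNDARY_NONE, DMDA_BOUNDARY_NONE, DMDA_BOUNDARY_NONE,     &
           DMDA_STENCIL_STAR, size_x, size_y, size_z,                      &
           PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 1, 1,                 &
           PETSC_NULL_INTEGER, PETSC_NULL_INTEGER, PETSC_NULL_INTEGER,     &
           da_w, ierr)
      call DMCreateLocalVector(da_w, w_local, ierr)

  If the declared rank of w_array does not match the dimension and dof of da_w, indexing through the pointer can run off the end of the array and produce exactly the kind of segmentation fault reported below, which is why I need to see your actual code.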

  Barry

On Apr 14, 2014, at 9:40 AM, TAY wee-beng <[email protected]> wrote:

> Hi Barry,
> 
> I'm not too sure how to do it. I'm running MPI, so I run:
> 
>  mpirun -n 4 ./a.out -start_in_debugger
> 
> I got the message below. Before the gdb windows appear (through X11), the program aborts.
> 
> I also tried running on another cluster and it worked; it also worked on the current cluster in debug mode.
> 
> mpirun -n 4 ./a.out -start_in_debugger
> --------------------------------------------------------------------------
> An MPI process has executed an operation involving a call to the
> "fork()" system call to create a child process.  Open MPI is currently
> operating in a condition that could result in memory corruption or
> other system errors; your MPI job may hang, crash, or produce silent
> data corruption.  The use of fork() (or system() or other calls that
> create child processes) is strongly discouraged.  
> 
> The process that invoked fork was:
> 
>   Local host:          n12-76 (PID 20235)
>   MPI_COMM_WORLD rank: 2
> 
> If you are *absolutely sure* that your application will successfully
> and correctly survive a call to fork(), you may disable this warning
> by setting the mpi_warn_on_fork MCA parameter to 0.
> --------------------------------------------------------------------------
> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76
> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76
> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76
> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76
> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork
> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> 
> ....
> 
>  1
> [1]PETSC ERROR: ------------------------------------------------------------------------
> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> [1]PETSC ERROR: to get more information on the crash.
> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null)
> [3]PETSC ERROR: ------------------------------------------------------------------------
> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> [3]PETSC ERROR: to get more information on the crash.
> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null)
> 
> ...
> Thank you.
> 
> Yours sincerely,
> 
> TAY wee-beng
> 
> On 14/4/2014 9:05 PM, Barry Smith wrote:
>>   Because IO doesn’t always get flushed immediately, it may not be hanging at this point. It is better to use the option -start_in_debugger, then type cont in each debugger window, and then, when you think it is “hanging”, do a control C in each debugger window and type where to see where each process is. You can also look around in the debugger at variables to see why it is “hanging” at that point.
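>> 
>>   For example, the sequence in each debugger window is roughly
>> 
>>      (gdb) cont
>>      ... wait until it appears to hang, then hit control C ...
>>      (gdb) where
>> 
>>   and then compare where each process is in the resulting stack traces.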
>> 
>>    Barry
>> 
>>   These routines don’t have any parallel communication in them, so they are unlikely to hang.
>> 
>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng <[email protected]> wrote:
>> 
>> 
>>> Hi,
>>> 
>>> My code hangs, so I added MPI_Barrier and print statements to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u, v, w arrays, so I use DMDAVecGetArrayF90; after access, I use DMDAVecRestoreArrayF90.
>>> 
>>>         call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>         call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"3"
>>>         call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>         call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"4"
>>>         call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>         call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"5"
>>>         call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array)
>>> 
>>>         call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"6"
>>>         call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)  !must be in reverse order
>>>         call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"7"
>>>         call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>         call MPI_Barrier(MPI_COMM_WORLD,ierr);  if (myid==0) print *,"8"
>>>         call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>> -- 
>>> Thank you.
>>> 
>>> Yours sincerely,
>>> 
>>> TAY wee-beng
>>> 
>>> 
> 
