On Sun, May 18, 2014 at 8:18 PM, TAY wee-beng <[email protected]> wrote:
> Hi Barry,
>
> I am trying to sort out the details so that it's easier to pinpoint the error. However, I tried on GNU gfortran and it worked well. On Intel ifort, it stopped at one of the "DMDAVecGetArrayF90" calls. Does that definitely mean it's a bug in ifort? Do you work with both Intel and GNU?

Yes, it works with Intel. Is this using optimization?

   Matt

> Thank you
>
> Yours sincerely,
>
> TAY wee-beng
>
> On 14/5/2014 12:03 AM, Barry Smith wrote:
>
>>    Please send your current code so we may compile and run it.
>>
>>    Barry
>>
>> On May 12, 2014, at 9:52 PM, TAY wee-beng <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I sent the entire code a while ago. Is there any answer? I was also trying myself, but it worked with some Intel compilers and not others. I'm still not able to find the answer. The GNU compilers on most clusters are old versions, so they are not able to compile my code since I have allocatable structures.
>>>
>>> Thank you.
>>>
>>> Yours sincerely,
>>>
>>> TAY wee-beng
>>>
>>> On 21/4/2014 8:58 AM, Barry Smith wrote:
>>>
>>>>    Please send the entire code. If we can run it and reproduce the problem we can likely track down the issue much faster than through endless rounds of email.
>>>>
>>>>    Barry
>>>>
>>>> On Apr 20, 2014, at 7:49 PM, TAY wee-beng <[email protected]> wrote:
>>>>
>>>>> On 20/4/2014 8:39 AM, TAY wee-beng wrote:
>>>>>
>>>>>> On 20/4/2014 1:02 AM, Matthew Knepley wrote:
>>>>>>
>>>>>>> On Sat, Apr 19, 2014 at 10:49 AM, TAY wee-beng <[email protected]> wrote:
>>>>>>> On 19/4/2014 11:39 PM, Matthew Knepley wrote:
>>>>>>>
>>>>>>>> On Sat, Apr 19, 2014 at 10:16 AM, TAY wee-beng <[email protected]> wrote:
>>>>>>>> On 19/4/2014 10:55 PM, Matthew Knepley wrote:
>>>>>>>>
>>>>>>>>> On Sat, Apr 19, 2014 at 9:14 AM, TAY wee-beng <[email protected]> wrote:
>>>>>>>>> On 19/4/2014 6:48 PM, Matthew Knepley wrote:
>>>>>>>>>
>>>>>>>>>> On Sat, Apr 19, 2014 at 4:59 AM, TAY wee-beng <[email protected]> wrote:
>>>>>>>>>> On 19/4/2014 1:17 PM, Barry Smith wrote:
>>>>>>>>>> On Apr 19, 2014, at 12:11 AM, TAY wee-beng <[email protected]> wrote:
>>>>>>>>>> On 19/4/2014 12:10 PM, Barry Smith wrote:
>>>>>>>>>> On Apr 18, 2014, at 9:57 PM, TAY wee-beng <[email protected]> wrote:
>>>>>>>>>> On 19/4/2014 3:53 AM, Barry Smith wrote:
>>>>>>>>>>
>>>>>>>>>>    Hmm,
>>>>>>>>>>
>>>>>>>>>>    Interface DMDAVecGetArrayF90
>>>>>>>>>>      Subroutine DMDAVecGetArrayF903(da1, v, d1, ierr)
>>>>>>>>>>        USE_DM_HIDE
>>>>>>>>>>        DM_HIDE da1
>>>>>>>>>>        VEC_HIDE v
>>>>>>>>>>        PetscScalar, pointer :: d1(:,:,:)
>>>>>>>>>>        PetscErrorCode ierr
>>>>>>>>>>      End Subroutine
>>>>>>>>>>
>>>>>>>>>>    So d1 is an F90 POINTER. But your subroutine seems to be treating it as a "plain old Fortran array"?
>>>>>>>>>>
>>>>>>>>>>    real(8), intent(inout) :: u(:,:,:), v(:,:,:), w(:,:,:)
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> So d1 is a pointer, and it's different if I declare it as a "plain old Fortran array"? I declare it as a Fortran array and it works without any problem if I only call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u".
>>>>>>>>>>
>>>>>>>>>> But if I call DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90 with "u", "v" and "w", errors start to happen. I wonder why...
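For reference, a minimal sketch of the pointer-style declaration that the quoted interface implies, assuming the PETSc 3.4-era Fortran includes (the exact finclude headers needed may differ by version); the names da_u, u_local and u_array follow the snippets elsewhere in this thread:

#include <finclude/petscsys.h>
#include <finclude/petscdmda.h90>

      DM da_u
      Vec u_local
      PetscScalar, pointer :: u_array(:,:,:)
      PetscErrorCode ierr

      call DMDAVecGetArrayF90(da_u, u_local, u_array, ierr)
      ! sketch: u_array is an F90 pointer; its index bounds come from the DMDA
      ! (DMDAGetCorners / DMDAGetGhostCorners ranges), not from 1
      call DMDAVecRestoreArrayF90(da_u, u_local, u_array, ierr)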
>>>>>>>>>> Also, suppose I call:
>>>>>>>>>>
>>>>>>>>>>    call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>    call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>    call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>
>>>>>>>>>>    u_array ....
>>>>>>>>>>    v_array .... etc.
>>>>>>>>>>
>>>>>>>>>> Now, to restore the arrays, does the sequence in which they are restored matter?
>>>>>>>>>>
>>>>>>>>>>    No, it should not matter. If it matters, that is a sign that memory has been written to incorrectly earlier in the code.
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Hmm, I have been getting different results with different Intel compilers. I'm not sure if MPI plays a part, but I'm only using a single processor. In debug mode, things run without problem. In optimized mode, in some cases, the code aborts even when doing simple initialization:
>>>>>>>>>>
>>>>>>>>>>    call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>    call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>    call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>    call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr)
>>>>>>>>>>
>>>>>>>>>>    u_array = 0.d0
>>>>>>>>>>    v_array = 0.d0
>>>>>>>>>>    w_array = 0.d0
>>>>>>>>>>    p_array = 0.d0
>>>>>>>>>>
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr)
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>> The code aborts at "call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)", giving a segmentation error. But other versions of the Intel compiler pass through this part without error. Since the response differs among compilers, is this a PETSc bug or an Intel bug? Or mvapich or openmpi?
>>>>>>>>>>
>>>>>>>>>>    We do this in a bunch of examples. Can you reproduce this different behavior in src/dm/examples/tutorials/ex11f90.F?
>>>>>>>>>>
>>>>>>>>> Hi Matt,
>>>>>>>>>
>>>>>>>>> Do you mean putting the above lines into ex11f90.F and testing?
>>>>>>>>>
>>>>>>>>>    It already has DMDAVecGetArray(). Just run it.
>>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> It worked. The differences between my code and that example are the way the Fortran modules are defined, and that ex11f90 only uses global vectors. Does it make a difference whether global or local vectors are used? Because the way it accesses x1 only touches the local region.
>>>>>>>>
>>>>>>>>    No, the global/local difference should not matter.
>>>>>>>>
>>>>>>>> Also, before using DMDAVecGetArrayF90, DMGetGlobalVector must be used first, is that so? I can't find the equivalent for local vectors, though.
>>>>>>>>
>>>>>>>>    DMGetLocalVector()
>>>>>>>>
>>>>>>> Oops, I do not have DMGetLocalVector and DMRestoreLocalVector in my code. Does it matter?
>>>>>>>
>>>>>>> If so, when should I call them?
>>>>>>>
>>>>>>>    You just need a local vector from somewhere.
>>>>>>>
>>>>> Hi,
>>>>>
>>>>> Can anyone help with the questions below? I'm still trying to find out why my code doesn't work.
>>>>>
>>>>> Thanks.
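To make the "local vector from somewhere" remark concrete, here is a minimal sketch of the usual pattern, reusing the da_u / u_local names from the snippets above and assuming a global vector u_global created elsewhere (e.g. with DMCreateGlobalVector); error checking is omitted:

      Vec u_global, u_local
      PetscScalar, pointer :: u_array(:,:,:)
      PetscErrorCode ierr

      call DMGetLocalVector(da_u, u_local, ierr)
      ! fill the local (ghosted) vector from the global vector
      call DMGlobalToLocalBegin(da_u, u_global, INSERT_VALUES, u_local, ierr)
      call DMGlobalToLocalEnd(da_u, u_global, INSERT_VALUES, u_local, ierr)

      call DMDAVecGetArrayF90(da_u, u_local, u_array, ierr)
      ! ... use u_array over the DMDAGetGhostCorners() index range ...
      call DMDAVecRestoreArrayF90(da_u, u_local, u_array, ierr)

      call DMRestoreLocalVector(da_u, u_local, ierr)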
>>>>>> Hi,
>>>>>>
>>>>>> I inserted part of my error-region code into ex11f90:
>>>>>>
>>>>>>    call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>    call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>>    call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>>    call DMDAVecGetArrayF90(da_p,p_local,p_array,ierr)
>>>>>>
>>>>>>    u_array = 0.d0
>>>>>>    v_array = 0.d0
>>>>>>    w_array = 0.d0
>>>>>>    p_array = 0.d0
>>>>>>
>>>>>>    call DMDAVecRestoreArrayF90(da_p,p_local,p_array,ierr)
>>>>>>    call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>>    call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>    call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>
>>>>>> It worked without error. I'm going to change the way the modules are defined in my code.
>>>>>>
>>>>>> My code contains a main program and a number of module files with subroutines inside, e.g.
>>>>>>
>>>>>>    module solve          <- add include file?
>>>>>>      subroutine RRK      <- add include file?
>>>>>>      end subroutine RRK
>>>>>>    end module solve
>>>>>>
>>>>>> So where should the include files (#include <finclude/petscdmda.h90>) be placed? After the module statement or inside the subroutine?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>>    Matt
>>>>>>> Thanks.
>>>>>>>>    Matt
>>>>>>>> Thanks.
>>>>>>>>>    Matt
>>>>>>>>> Thanks
>>>>>>>>> Regards.
>>>>>>>>>>    Matt
>>>>>>>>>>
>>>>>>>>>> As in w, then v and u?
>>>>>>>>>>
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>> thanks
>>>>>>>>>>
>>>>>>>>>>    Note also that the beginning and end indices of u, v, w are different for each process; see for example http://www.mcs.anl.gov/petsc/petsc-3.4/src/dm/examples/tutorials/ex11f90.F (and they do not start at 1). This is how to get the loop bounds.
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> In my case, I fixed u, v, w such that their indices are the same. I also checked using DMDAGetCorners and DMDAGetGhostCorners. Now the problem lies in my subroutine treating them as "plain old Fortran arrays".
>>>>>>>>>>
>>>>>>>>>> If I declare them as pointers, their indices follow the C 0-start convention, is that so?
>>>>>>>>>>
>>>>>>>>>>    Not really. It is that on each process you need to access them with the indices indicated by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors. So really C or Fortran doesn't make any difference.
>>>>>>>>>>
>>>>>>>>>> So my problem now is that in my old MPI code, u(i,j,k) follows the Fortran 1-start convention. Is there some way to manipulate things so that I do not have to change my u(i,j,k) to u(i-1,j-1,k-1)?
>>>>>>>>>>
>>>>>>>>>>    If your code wishes to access them with indices plus one from the values returned by DMDAGetCorners() for global vectors and DMDAGetGhostCorners() for local vectors, then you need to manually subtract off the 1.
>>>>>>>>>>
>>>>>>>>>>    Barry
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>>    Barry
>>>>>>>>>>
>>>>>>>>>> On Apr 18, 2014, at 10:58 AM, TAY wee-beng <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I tried to pinpoint the problem.
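On the include-file question a few messages up: a common arrangement is to put the PETSc includes in the specification part of each program unit that makes PETSc calls. The sketch below only illustrates that placement, reusing the module and subroutine names from the example; the argument list for RRK and the exact set of finclude headers are assumptions and will need adapting to the actual code and PETSc version:

module solve
  implicit none
contains

  subroutine RRK(da_u, u_local)
    ! includes go in the specification part, before the declarations that use them;
    ! add whatever other finclude headers the routines you call require
#include <finclude/petscsys.h>
#include <finclude/petscdmda.h90>
    DM  da_u
    Vec u_local
    PetscScalar, pointer :: u_array(:,:,:)
    PetscErrorCode ierr

    call DMDAVecGetArrayF90(da_u, u_local, u_array, ierr)
    ! ... work on u_array ...
    call DMDAVecRestoreArrayF90(da_u, u_local, u_array, ierr)
  end subroutine RRK

end module solve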
>>>>>>>>>> I reduced my job size and hence I can run on one processor. I tried using valgrind, but perhaps because I'm using the optimized version, it didn't catch the error beyond saying "Segmentation fault (core dumped)".
>>>>>>>>>>
>>>>>>>>>> However, by re-writing my code, I found out a few things:
>>>>>>>>>>
>>>>>>>>>> 1. If I write my code this way:
>>>>>>>>>>
>>>>>>>>>>    call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>    call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>    call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>
>>>>>>>>>>    u_array = ....
>>>>>>>>>>    v_array = ....
>>>>>>>>>>    w_array = ....
>>>>>>>>>>
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>> the code runs fine.
>>>>>>>>>>
>>>>>>>>>> 2. If I write my code this way:
>>>>>>>>>>
>>>>>>>>>>    call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>    call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>    call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>
>>>>>>>>>>    call uvw_array_change(u_array,v_array,w_array)  -> this subroutine does the same modification as the above
>>>>>>>>>>
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)  -> error
>>>>>>>>>>
>>>>>>>>>> where the subroutine is:
>>>>>>>>>>
>>>>>>>>>>    subroutine uvw_array_change(u,v,w)
>>>>>>>>>>      real(8), intent(inout) :: u(:,:,:), v(:,:,:), w(:,:,:)
>>>>>>>>>>      u ...
>>>>>>>>>>      v ...
>>>>>>>>>>      w ...
>>>>>>>>>>    end subroutine uvw_array_change
>>>>>>>>>>
>>>>>>>>>> the above gives an error at:
>>>>>>>>>>
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>> 3. Same as above, except I change the order of the last 3 lines to:
>>>>>>>>>>
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>
>>>>>>>>>> so they are now in reversed order. Now it works.
>>>>>>>>>>
>>>>>>>>>> 4. Same as 2 or 3, except the subroutine is changed to:
>>>>>>>>>>
>>>>>>>>>>    subroutine uvw_array_change(u,v,w)
>>>>>>>>>>      real(8), intent(inout) :: u(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3))
>>>>>>>>>>      real(8), intent(inout) :: v(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3))
>>>>>>>>>>      real(8), intent(inout) :: w(start_indices(1):end_indices(1),start_indices(2):end_indices(2),start_indices(3):end_indices(3))
>>>>>>>>>>      u ...
>>>>>>>>>>      v ...
>>>>>>>>>>      w ...
>>>>>>>>>>    end subroutine uvw_array_change
>>>>>>>>>>
>>>>>>>>>> The start_indices and end_indices simply shift the 0-based indices of the C convention to the 1-based indices of the Fortran convention. This is necessary in my case because most of my code starts array counting at 1, hence the "trick".
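As an aside on keeping 1-based indexing: if the compilers in use support Fortran 2003 pointer bounds remapping, the lower bounds can be shifted on a second pointer instead of re-declaring the dummy arguments with explicit bounds. This is only a sketch of that idea under that assumption, reusing names from the snippets above; it is not code from this thread:

      PetscScalar, pointer :: u_array(:,:,:), u1(:,:,:)
      PetscErrorCode ierr

      call DMDAVecGetArrayF90(da_u, u_local, u_array, ierr)

      ! remap only the lower bounds: u1 aliases the same data as u_array,
      ! but u1(1,1,1) refers to the element at u_array's original lower
      ! bounds (given by DMDAGetGhostCorners() for a local vector)
      u1(1:, 1:, 1:) => u_array

      ! existing 1-based code can now index u1(i,j,k)
      u1 = 0.d0

      call DMDAVecRestoreArrayF90(da_u, u_local, u_array, ierr)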
>>>>>>>>>> However, now no matter which order the DMDAVecRestoreArrayF90 calls are in (as in 2 or 3), an error occurs at "call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)".
>>>>>>>>>>
>>>>>>>>>> So did I violate something and cause memory corruption with the trick above? But I can't think of any way other than the "trick" to continue using the 1-based index convention.
>>>>>>>>>>
>>>>>>>>>> Thank you.
>>>>>>>>>>
>>>>>>>>>> Yours sincerely,
>>>>>>>>>>
>>>>>>>>>> TAY wee-beng
>>>>>>>>>>
>>>>>>>>>> On 15/4/2014 8:00 PM, Barry Smith wrote:
>>>>>>>>>>
>>>>>>>>>>    Try running under valgrind: http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>>>>>>>
>>>>>>>>>> On Apr 14, 2014, at 9:47 PM, TAY wee-beng <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Barry,
>>>>>>>>>>
>>>>>>>>>> As I mentioned earlier, the code works fine in PETSc debug mode but fails in non-debug mode.
>>>>>>>>>>
>>>>>>>>>> I have attached my code.
>>>>>>>>>>
>>>>>>>>>> Thank you
>>>>>>>>>>
>>>>>>>>>> Yours sincerely,
>>>>>>>>>>
>>>>>>>>>> TAY wee-beng
>>>>>>>>>>
>>>>>>>>>> On 15/4/2014 2:26 AM, Barry Smith wrote:
>>>>>>>>>>
>>>>>>>>>>    Please send the code that creates da_w and the declarations of w_array.
>>>>>>>>>>
>>>>>>>>>>    Barry
>>>>>>>>>>
>>>>>>>>>> On Apr 14, 2014, at 9:40 AM, TAY wee-beng <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Barry,
>>>>>>>>>>
>>>>>>>>>> I'm not too sure how to do it. I'm running MPI, so I run:
>>>>>>>>>>
>>>>>>>>>>    mpirun -n 4 ./a.out -start_in_debugger
>>>>>>>>>>
>>>>>>>>>> I got the message below. Before the gdb windows appear (through X11), the program aborts.
>>>>>>>>>>
>>>>>>>>>> I also tried running on another cluster and it worked. I also tried the current cluster in debug mode and it worked too.
>>>>>>>>>>
>>>>>>>>>> mpirun -n 4 ./a.out -start_in_debugger
>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>> An MPI process has executed an operation involving a call to the
>>>>>>>>>> "fork()" system call to create a child process. Open MPI is currently
>>>>>>>>>> operating in a condition that could result in memory corruption or
>>>>>>>>>> other system errors; your MPI job may hang, crash, or produce silent
>>>>>>>>>> data corruption. The use of fork() (or system() or other calls that
>>>>>>>>>> create child processes) is strongly discouraged.
>>>>>>>>>>
>>>>>>>>>> The process that invoked fork was:
>>>>>>>>>>
>>>>>>>>>>   Local host:          n12-76 (PID 20235)
>>>>>>>>>>   MPI_COMM_WORLD rank: 2
>>>>>>>>>>
>>>>>>>>>> If you are *absolutely sure* that your application will successfully
>>>>>>>>>> and correctly survive a call to fork(), you may disable this warning
>>>>>>>>>> by setting the mpi_warn_on_fork MCA parameter to 0.
>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>> [2]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20235 on display localhost:50.0 on machine n12-76
>>>>>>>>>> [0]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20233 on display localhost:50.0 on machine n12-76
>>>>>>>>>> [1]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20234 on display localhost:50.0 on machine n12-76
>>>>>>>>>> [3]PETSC ERROR: PETSC: Attaching gdb to ./a.out of pid 20236 on display localhost:50.0 on machine n12-76
>>>>>>>>>> [n12-76:20232] 3 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork
>>>>>>>>>> [n12-76:20232] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>>>>>>>>>
>>>>>>>>>> ....
>>>>>>>>>>
>>>>>>>>>>  1
>>>>>>>>>> [1]PETSC ERROR: ------------------------------------------------------------------------
>>>>>>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>>>>>>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>>>>>>>> [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>>>>>>> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>>>>>>>>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>>>>>>>>>> [1]PETSC ERROR: to get more information on the crash.
>>>>>>>>>> [1]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null)
>>>>>>>>>> [3]PETSC ERROR: ------------------------------------------------------------------------
>>>>>>>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>>>>>>>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>>>>>>>> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>>>>>>> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>>>>>>>>> [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>>>>>>>>>> [3]PETSC ERROR: to get more information on the crash.
>>>>>>>>>> [3]PETSC ERROR: User provided function() line 0 in unknown directory unknown file (null)
>>>>>>>>>>
>>>>>>>>>> ...
>>>>>>>>>>
>>>>>>>>>> Thank you.
>>>>>>>>>>
>>>>>>>>>> Yours sincerely,
>>>>>>>>>>
>>>>>>>>>> TAY wee-beng
>>>>>>>>>>
>>>>>>>>>> On 14/4/2014 9:05 PM, Barry Smith wrote:
>>>>>>>>>>
>>>>>>>>>>    Because I/O doesn't always get flushed immediately, it may not actually be hanging at this point. It is better to use the option -start_in_debugger, then type cont in each debugger window, and then when you think it is "hanging" do a control-C in each debugger window and type where to see where each process is. You can also look around in the debugger at variables to see why it is "hanging" at that point.
>>>>>>>>>>
>>>>>>>>>>    Barry
>>>>>>>>>>
>>>>>>>>>>    These routines don't have any parallel communication in them, so they are unlikely to hang.
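A note on the valgrind suggestion that appears in the error text above: valgrind is far more informative against a debug (non-optimized) build. A typical sequence, following the PETSc FAQ link in the messages (the configure and valgrind options shown are only indicative, and <your other configure options> is a placeholder), is:

   # rebuild PETSc (and the application) with debugging enabled
   ./configure --with-debugging=yes <your other configure options>

   # run under valgrind; "-malloc off" disables PETSc's own malloc checking
   # so it does not obscure the errors valgrind is looking for
   mpirun -n 1 valgrind --tool=memcheck -q --num-callers=20 \
          --log-file=valgrind.log.%p ./a.out -malloc off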
>>>>>>>>>> On Apr 14, 2014, at 6:52 AM, TAY wee-beng <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> My code hangs, and I added MPI_Barrier and print statements to catch the bug. I found that it hangs after printing "7". Is it because I'm doing something wrong? I need to access the u, v, w arrays, so I use DMDAVecGetArrayF90. After access, I use DMDAVecRestoreArrayF90.
>>>>>>>>>>
>>>>>>>>>>    call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>    call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"3"
>>>>>>>>>>    call DMDAVecGetArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>    call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"4"
>>>>>>>>>>    call DMDAVecGetArrayF90(da_w,w_local,w_array,ierr)
>>>>>>>>>>    call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"5"
>>>>>>>>>>    call I_IIB_uv_initial_1st_dm(I_cell_no_u1,I_cell_no_v1,I_cell_no_w1,I_cell_u1,I_cell_v1,I_cell_w1,u_array,v_array,w_array)
>>>>>>>>>>    call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"6"
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_w,w_local,w_array,ierr)   !must be in reverse order
>>>>>>>>>>    call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"7"
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_v,v_local,v_array,ierr)
>>>>>>>>>>    call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *,"8"
>>>>>>>>>>    call DMDAVecRestoreArrayF90(da_u,u_local,u_array,ierr)
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Thank you.
>>>>>>>>>>
>>>>>>>>>> Yours sincerely,
>>>>>>>>>>
>>>>>>>>>> TAY wee-beng
>>>>>>>>>>
>>>>>>>>>> <code.txt>

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
