It is almost surely some subtle memory corruption somewhere; use valgrind as described at
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind to track it down.
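
For example, a typical invocation looks roughly like this (adjust the process
count, the path, and the model's own arguments to your run; "-malloc off" turns
off PETSc's own malloc wrapper so valgrind sees the raw allocations):

    mpirun -np 4 valgrind -q --tool=memcheck --num-callers=20 \
           --log-file=valgrind.log.%p ./shympi -malloc off

Then look through the per-process valgrind.log.* files for invalid reads/writes
and uses of uninitialised values.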




> On Oct 6, 2016, at 4:22 AM, Ivano Barletta <[email protected]> wrote:
> 
> Hello everyone
> 
> Recently I resumed the task of embedding Petsc into
> this FEM ocean model, for the solution of a linear system
> 
> I followed your suggestions and "almost" everything works.
> 
> The problem arose during a run with 4 CPUs, when I got this
> error 
> 
>    3:[3]PETSC ERROR: --------------------- Error Message 
> --------------------------------------------------------------
>    3:[3]PETSC ERROR: Petsc has generated inconsistent data
>    3:[3]PETSC ERROR: Negative MPI source!
>    3:[3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html 
> for trouble shooting.
>    3:[3]PETSC ERROR: Petsc Release Version 3.7.1, May, 15, 2016
>    3:[3]PETSC ERROR: /users/home/ib04116/shympi_last4/fem3d/shympi on a
> linux-gnu-intel named n243.cluster.net by ib04116 Thu Oct  6 10:37:01 2016
>    3:[3]PETSC ERROR: Configure options 
> CFLAGS=-I/users/home/opt/netcdf/netcdf-4.2.1.1/include 
> -I/users/home/opt/szip/szip-2.1/include 
> -I/users/home/opt/hdf5/hdf5-1.8.10-patch1/include -I/usr/include 
> -I/users/home/opt/netcdf/netcdf-4.3/include 
> -I/users/home/opt/hdf5/hdf5-1.8.11/include FFLAGS=-xHost -no-prec-div -O3 
> -I/users/home/opt/netcdf/netcdf-4.2.1.1/include 
> -I/users/home/opt/netcdf/netcdf-4.3/include 
> LDFLAGS=-L/users/home/opt/netcdf/netcdf-4.2.1.1/lib -lnetcdff 
> -L/users/home/opt/szip/szip-2.1/lib 
> -L/users/home/opt/hdf5/hdf5-1.8.10-patch1/lib 
> -L/users/home/opt/netcdf/netcdf-4.2.1.1/lib -L/usr/lib64/ -lz -lnetcdf 
> -lnetcdf -lgpfs -L/users/home/opt/netcdf/netcdf-4.3/lib 
> -L/users/home/opt/hdf5/hdf5-1.8.11/lib 
> -L/users/home/opt/netcdf/netcdf-4.3/lib -lcurl --PETSC_ARCH=linux-gnu-intel 
> --with-cc=mpiicc --with-fc=mpiifort --with-cxx=mpiicpc --with-mpiexec=mpirun 
> --with-blas-lapack-dir=/users/home/opt/intel/composer_xe_2013/mkl 
> --with-scalapack-lib="-L/users/home/opt/intel/composer_xe_2013/mkl//lib/intel64
>  -lmkl_scalapack_ilp64 -lmkl_blacs_intelmpi_ilp64" 
> --with-scalapack-include=/users/home/opt/intel/composer_xe_2013/mkl/include 
> --download-metis --download-parmetis --download-mumps --download-superlu
>    3:[3]PETSC ERROR: #1 MatStashScatterGetMesg_Ref() line 692 in 
> /users/home/sco116/petsc/petsc-3.7.1/src/mat/utils/matstash.c
>    3:[3]PETSC ERROR: #2 MatStashScatterGetMesg_Private() line 663 in 
> /users/home/sco116/petsc/petsc-3.7.1/src/mat/utils/matstash.c
>    3:[3]PETSC ERROR: #3 MatAssemblyEnd_MPIAIJ() line 713 in 
> /users/home/sco116/petsc/petsc-3.7.1/src/mat/impls/aij/mpi/mpiaij.c
>    3:[3]PETSC ERROR: #4 MatAssemblyEnd() line 5187 in 
> /users/home/sco116/petsc/petsc-3.7.1/src/mat/interface/matrix.c
> 
> The code is in Fortran and the Petsc version is 3.7.1
> 
> This error looks quite strange to me, because it doesn't always happen in the
> same situation. The model goes through several time steps, but the error is not
> always raised at the same one: it has happened at the fourth time step, for
> example, and at the fifth. What is even more odd is that once the run of the
> model (720 time steps) completed without any error. 
> 
> What I do to solve the linear system for each time step is the following:
> 
> call petsc_solve( ..arguments..)
> 
> subroutine petsc_solve(..args)
> call PetscInitialize(PETSC_NULL_CHARACTER)
> 
> call MatCreate
> ...
> ...
> call KSPSolve(...)
> 
> call XXXDestroy()
> call PetscFinalize
> end subroutine
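> 
> (Written out as compilable Fortran, the structure above looks roughly like the
> sketch below; it is only an illustration -- the global size n, the assembly
> loops and the solver options are placeholders, and the includes assume the
> PETSc 3.7 Fortran headers:)
> 
>       subroutine petsc_solve(n)
>       implicit none
> #include <petsc/finclude/petscsys.h>
> #include <petsc/finclude/petscvec.h>
> #include <petsc/finclude/petscmat.h>
> #include <petsc/finclude/petscksp.h>
>       PetscInt       n            ! global number of unknowns
>       Mat            A
>       Vec            b, x
>       KSP            ksp
>       PetscErrorCode ierr
> 
>       call PetscInitialize(PETSC_NULL_CHARACTER, ierr)
> 
>       ! matrix over MPI_COMM_WORLD; PETSc decides the row distribution
>       call MatCreate(PETSC_COMM_WORLD, A, ierr)
>       call MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n, ierr)
>       call MatSetFromOptions(A, ierr)
>       call MatSetUp(A, ierr)
>       ! ... MatSetValues(...) with the local element contributions ...
>       call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr)
>       call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr)
> 
>       ! right-hand side and solution with the same row layout as A
>       call MatCreateVecs(A, x, b, ierr)
>       ! ... VecSetValues(...) and VecAssemblyBegin/End for b ...
> 
>       call KSPCreate(PETSC_COMM_WORLD, ksp, ierr)
>       call KSPSetOperators(ksp, A, A, ierr)
>       call KSPSetFromOptions(ksp, ierr)
>       call KSPSolve(ksp, b, x, ierr)
> 
>       call KSPDestroy(ksp, ierr)
>       call VecDestroy(x, ierr)
>       call VecDestroy(b, ierr)
>       call MatDestroy(A, ierr)
>       call PetscFinalize(ierr)
>       end subroutine petsc_solve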
> 
> Do you think that calling PetscInitialize and PetscFinalize
> several times might cause problems? I guess Petsc uses
> the same communicator as the model, which is MPI_COMM_WORLD
> 
> I don't have hints to troubleshoot this, since it is not a 
> reproducible error and I don't know where to look to 
> sort it out.
> 
> Have you got any suggestion?
> 
> Thanks in advance
> 
> Ivano
> 
> 
> 
> 
> 2016-07-13 5:16 GMT+02:00 Barry Smith <[email protected]>:
> 
> > On Jul 12, 2016, at 4:13 AM, Matthew Knepley <[email protected]> wrote:
> >
> > On Tue, Jul 12, 2016 at 3:35 AM, Ivano Barletta <[email protected]> wrote:
> > Dear Petsc users
> >
> > my aim is to parallelize the solution of a linear
> > system in a finite element
> > ocean model.
> >
> > The model has been almost entirely parallelized, with
> > a partitioning of the domain made element-wise through
> > the use of Zoltan libraries, so the subdomains
> > share the nodes lying on the edges.
> >
> > The linear system includes node-to-node dependencies,
> > so my guess is that I need to create a halo surrounding
> > each subdomain, to allow connections of edge nodes with
> > those of neighbouring subdomains
> >
> > Apart from that, my question is whether Petsc accepts a
> > previously made partitioning (maybe taking the halo into account),
> > using the data structures coming out of it
> >
> > Has any of you ever faced a similar problem?
> >
> > If all you want to do is construct a PETSc Mat and Vec for the linear 
> > system,
> > just give PETSc the non-overlapping partition to create those objects. You
> > can input values on off-process partitions automatically using 
> > MatSetValues()
> > and VecSetValues().
> 
>   Note that by just using VecSetValues() and MatSetValues(), PETSc will manage 
> all the halo business needed by the linear system solver automatically. You 
> don't need to provide any halo information to PETSc. It is really 
> straightforward.
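> 
>   To illustrate, a minimal sketch of what such an assembly loop can look like
> for linear triangles (the element arrays, counts and connectivity names are
> placeholders; indices are global and 0-based as PETSc expects, and the element
> matrix is assumed symmetric, so its row/column ordering does not matter to
> MatSetValues):
> 
>       PetscInt       ie, idx(3)
>       PetscScalar    ke(3,3), fe(3)
>       PetscErrorCode ierr
> 
>       ! each process loops only over the elements it owns; entries that fall
>       ! in rows owned by a neighbouring process are stashed and communicated
>       ! by PETSc inside MatAssemblyBegin/End, so no explicit halo exchange
>       ! is needed
>       do ie = 1, nelem_local
>          idx = node_gid(:, ie) - 1         ! global node numbers, 0-based
>          ! ... fill ke and fe for element ie ...
>          call MatSetValues(A, 3, idx, 3, idx, ke, ADD_VALUES, ierr)
>          call VecSetValues(b, 3, idx, fe, ADD_VALUES, ierr)
>       end do
>       call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr)
>       call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr)
>       call VecAssemblyBegin(b, ierr)
>       call VecAssemblyEnd(b, ierr)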
> 
>   Barry
> 
> >
> >   Thanks,
> >
> >     Matt
> >
> > Thanks in advance
> > Ivano
> >
> >
> >
> >
> > --
> > What most experimenters take for granted before they begin their 
> > experiments is infinitely more interesting than any results to which their 
> > experiments lead.
> > -- Norbert Wiener
> 
> 
