Hi Francesco,
please don't drop petsc-users from the communication. This will likely
provide you with better and faster answers.
Since your current build is with debugging turned off, please
reconfigure with debugging turned on, as the error message says. Chances
are good that you will get much more precise information about what went
wrong.
Best regards,
Karli
On 04/19/2017 03:03 PM, Francesco Migliorini wrote:
Hi, thank you for your answer!
Unfortunately I cannot use Valgrind on the machine I am using, but I am
sure that the error is in using VecAssembly. Here's the error message
from PETSc:
[1]PETSC ERROR:
------------------------------------------------------------------------
[1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
probably memory access out of range
[1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[1]PETSC ERROR: or see
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS
X to find memory corruption errors
[1]PETSC ERROR: configure using --with-debugging=yes, recompile, link,
and run
[1]PETSC ERROR: to get more information on the crash.
[1]PETSC ERROR: --------------------- Error Message
--------------------------------------------------------------
[1]PETSC ERROR: Signal received
[1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
for trouble shooting.
[1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015
[1]PETSC ERROR: /u/migliorini/SPEED/SPEED on a arch-linux-opt named
idra116 by migliorini Wed Apr 19 10:20:48 2017
[1]PETSC ERROR: Configure options
--prefix=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/petsc/3.6.3
--with-petsc-arch=arch-linux-opt --with-fortran=1 --with-pic=1
--with-debugging=0 --with-x=0 --with-blas-lapack=1
--with-blas-lib=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/openblas/0.2.17/lib/libopenblas.so
--with-lapack-lib=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/openblas/0.2.17/lib/libopenblas.so
--with-boost=1
--with-boost-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/boost/1.60.0
--with-fftw=1
--with-fftw-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/fftw/3.3.4
--with-hdf5=1
--with-hdf5-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/hdf5/1.8.16
--with-hypre=1
--with-hypre-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/hypre/2.11.0
--with-metis=1
--with-metis-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/metis/5
--with-mumps=1
--with-mumps-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/mumps/5.0.1
--with-netcdf=1
--with-netcdf-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/netcdf/4.4.0
--with-p4est=1
--with-p4est-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/p4est/1.1
--with-parmetis=1
--with-parmetis-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/metis/5
--with-ptscotch=1
--with-ptscotch-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/scotch/6.0.4
--with-scalapack=1
--with-scalapack-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/scalapack/2.0.2
--with-suitesparse=1
--with-suitesparse-dir=/u/sw/pkgs/toolchains/gcc-glibc/5/pkgs/suitesparse/4.5.1
[1]PETSC ERROR: #1 User provided function() line 0 in unknown file
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
with errorcode 59.
do in=1,nnod_loc is a loop over the nodes contained in the local vector,
because the program reaches PETSc initialization with multiple processes
already running. I therefore assumed that PETSc was applied to each
process separately, so that the global dimensions of the system were the
local ones of each MPI process. Maybe it does not work that way...
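To illustrate the local-versus-global distinction raised here: in
VecSetSizes the second argument is the local size on the calling rank and
the third is the global size of the whole vector, and the indices passed
to VecSetValues are global, zero-based indices. The following is a minimal
sketch only, assuming PETSc 3.6-style Fortran includes. The names x, nloc,
low, high and the placeholder values are hypothetical and stand in for
feP, 3*nnod_loc*max_time_deg and fe(...) above. It is not a fix confirmed
anywhere in this thread.

```fortran
! Sketch only, assuming the PETSc 3.6-style Fortran interface.
! nloc and the inserted values are hypothetical placeholders.
program vec_assembly_sketch
  implicit none
#include <petsc/finclude/petscsys.h>
#include <petsc/finclude/petscvec.h>

  Vec            x
  PetscErrorCode ierr
  PetscInt       nloc, low, high, i, one
  PetscInt       idx(1)
  PetscScalar    val(1)

  call PetscInitialize(PETSC_NULL_CHARACTER, ierr)

  one  = 1
  nloc = 6      ! number of entries owned by THIS rank (placeholder)

  call VecCreate(PETSC_COMM_WORLD, x, ierr)
  ! Give the local size explicitly and let PETSc sum the local sizes
  ! over all ranks to obtain the global size.
  call VecSetSizes(x, nloc, PETSC_DETERMINE, ierr)
  call VecSetFromOptions(x, ierr)

  ! [low, high) is the range of global, zero-based indices owned here.
  call VecGetOwnershipRange(x, low, high, ierr)

  do i = 0, nloc - 1
     idx(1) = low + i      ! shift the local position to a global index
     val(1) = 1.0          ! placeholder value
     call VecSetValues(x, one, idx, val, INSERT_VALUES, ierr)
  end do

  ! Collective calls: every rank in the communicator must reach them.
  call VecAssemblyBegin(x, ierr)
  call VecAssemblyEnd(x, ierr)

  call VecDestroy(x, ierr)
  call PetscFinalize(ierr)
end program vec_assembly_sketch
```

When launched under MPI with two or more processes, each rank in this
sketch writes only to indices inside its own ownership range, so the
global size is the sum of the local sizes rather than the local size of
any single process.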
2017-04-19 13:25 GMT+02:00 Karl Rupp <[email protected]>:
Hi Francesco,
please consider the following:
a) run your code through valgrind to locate the segmentation fault.
Maybe there is already a memory access problem in the sequential
version.
b) send any error messages as well as the stack trace.
c) what is your intent with "do in = nnod_loc"? Isn't nnod_loc the
number of local elements?
Best regards,
Karli
On 04/19/2017 12:26 PM, Francesco Migliorini wrote:
Hello!
I have an MPI code in which a linear system is created and solved with
PETSc. It works in a sequential run, but when I use multiple cores,
VecAssemblyBegin/End gives a segmentation fault. Here's a sample of
my code:
call PetscInitialize(PETSC_NULL_CHARACTER,perr)
ind(1) = 3*nnod_loc*max_time_deg
call VecCreate(PETSC_COMM_WORLD,feP,perr)
call VecSetSizes(feP,PETSC_DECIDE,ind,perr)
call VecSetFromOptions(feP,perr)
do in = nnod_loc
   do jt = 1,mm
      ind(1) = 3*((in -1)*max_time_deg + (jt-1))
      fval(1) = fe(3*((in -1)*max_time_deg + (jt-1)) +1)
      call VecSetValues(feP,1,ind,fval(1),INSERT_VALUES,perr)
      ind(1) = 3*((in -1)*max_time_deg + (jt-1)) +1
      fval(1) = fe(3*((in -1)*max_time_deg + (jt-1)) +2)
      call VecSetValues(feP,1,ind,fval(1),INSERT_VALUES,perr)
      ind(1) = 3*((in -1)*max_time_deg + (jt-1)) +2
      fval(1) = fe(3*((in -1)*max_time_deg + (jt-1)) +3)
      call VecSetValues(feP,1,ind,fval(1),INSERT_VALUES,perr)
   enddo
enddo
enddo
call VecAssemblyBegin(feP,perr)
call VecAssemblyEnd(feP,perr)
The vector has roughly 640,000 elements, but I am running on a
high-performance computer, so there shouldn't be memory issues. Does
anyone know where the problem is and how I can fix it?
Thank you,
Francesco Migliorini