Re: [petsc-users] Strange Segmentation Violation error

TAY wee-beng Sat, 17 Jun 2017 08:49:59 -0700

Hi Lukasz,

Thanks for the tip.

I tried using valgrind. However, I got a lot of errors at a few locations. One complained of an uninitialized value at:


call PetscInitialize(PETSC_NULL_CHARACTER,ierr)

But I have already initialized "ierr". Are these errors valid, or can I hide them?

==
==17300== Conditional jump or move depends on uninitialised value(s)
==17300==    at 0x3C2A872849: _IO_file_fopen@@GLIBC_2.2.5 (in /lib64/libc-2.12.so)
==17300==    by 0x3C2A866D95: __fopen_internal (in /lib64/libc-2.12.so)
==17300==    by 0x3C2A8E2CB3: setmntent (in /lib64/libc-2.12.so)
==17300==    by 0xA726083: mca_mpool_hugepage_open (in /home/tsltaywb/lib/openmpi-2.1.1/lib/openmpi/mca_mpool_hugepage.so)
==17300==    by 0x65A83A1: mca_base_framework_components_open (in /home/tsltaywb/lib/openmpi-2.1.1/lib/libopen-pal.so.20.10.1)
==17300==    by 0x6614041: mca_mpool_base_open (in /home/tsltaywb/lib/openmpi-2.1.1/lib/libopen-pal.so.20.10.1)
==17300==    by 0x65B1EC0: mca_base_framework_open (in /home/tsltaywb/lib/openmpi-2.1.1/lib/libopen-pal.so.20.10.1)
==17300==    by 0x5E11123: ompi_mpi_init (in /home/tsltaywb/lib/openmpi-2.1.1/lib/libmpi.so.20.10.1)
==17300==    by 0x5E31032: PMPI_Init (in /home/tsltaywb/lib/openmpi-2.1.1/lib/libmpi.so.20.10.1)
==17300==    by 0x5978E87: PMPI_INIT (in /home/tsltaywb/lib/openmpi-2.1.1/lib/libmpi_mpifh.so.20.11.0)
==17300==    by 0xB29696: petscinitialize_ (zstart.c:316)
==17300==    by 0xA80D2B: MAIN__ (ibm3d_high_Re.F90:63)
==17300==  Uninitialised value was created by a stack allocation
==17300==    at 0x3C2A8E2C82: setmntent (in /lib64/libc-2.12.so)
==17300==
==17300== Conditional jump or move depends on uninitialised value(s)
==17300==    at 0x3C2A87284F: _IO_file_fopen@@GLIBC_2.2.5 (in /lib64/libc-2.12.so)
==17300==    by 0x3C2A866D95: __fopen_internal (in /lib64/libc-2.12.so)
==17300==    by 0x3C2A8E2CB3: setmntent (in /lib64/libc-2.12.so)

==17300==    by 0xB29696: petscinitialize_ (zstart.c:316)
==17300==    by 0xA80D2B: MAIN__ (ibm3d_high_Re.F90:63)





Thank you very much.

Yours sincerely,

================================================
TAY Wee-Beng (Zheng Weiming) 郑伟明
Personal research webpage: http://tayweebeng.wixsite.com/website
Youtube research showcase: https://www.youtube.com/channel/UC72ZHtvQNMpNs2uRTSToiLA
linkedin: www.linkedin.com/in/tay-weebeng
================================================

On 7/6/2017 3:22 PM, Lukasz Kaczmarczyk wrote:

On 7 Jun 2017, at 07:57, TAY wee-beng wrote:
Hi,
I have been using PETSc together with my CFD code. There seems to be a bug with the Intel compiler such that when I call some DM routines such as DMLocalToLocalBegin, a segmentation violation occurs if full optimization is used. I posted this question a while back. The current workaround is to use -O1 -ip instead of -O3 -ipo -ip for certain source files which use DMLocalToLocalBegin etc.
Recently, I made some changes to the code, mainly adding some things. However, depending on my options, some cases still go through the same program path.
Now when I try to run those same cases, I get a segmentation violation, which didn't happen before:
 IIB_I_cell_no_uvw_total2 14          10           6           3
           2           1
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[0]PETSC ERROR: to get more information on the crash.
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Signal received
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for troubleshooting.
[0]PETSC ERROR: Petsc Release Version 3.7.4, Oct, 02, 2016
[0]PETSC ERROR: ./a.out
I can't debug using VS since the code has been optimized. I tried to print messages (if (myid == 0) print "1") to pinpoint the error. Strangely, after adding these print messages, the error disappears.
 IIB_I_cell_no_uvw_total2 14          10           6           3
           2           1
 1
 2
 3
 4
 5
 1 0.26873613 0.12620288 0.12949340 1.11422363 0.43983516E-06 -0.59311066E-01 0.25546227E+04
 2 0.22236892 0.14528589 0.16939270 1.10459102 0.74556128E-02 -0.55168234E-01 0.25532419E+04
 3 0.20764796 0.14832689 0.18780489 1.08039569 0.80299767E-02 -0.46972411E-01 0.25523174E+04
Can anyone give a logical explanation of why this is happening? Moreover, if I remove the printing of 1 to 3 and only print 4 and 5, the segmentation violation appears again.
I am using Intel Fortran 2016.1.150. I wonder if it would help to post in the Intel Fortran forum.
I can provide more info if required.
You are very likely writing to memory you should not, for example by exceeding the size of an array. Depending on your compilation options, starting parameters, etc., you write in an uncontrolled way either to a part of memory which belongs to your process or to a part protected by the operating system. In the second case, you get a segmentation fault. You can have correct results for some runs, but your bug is there, hiding in the dark.
To shed light on it, you need Valgrind. Compile the code with debugging on and no optimisation, and start searching. You can also generate a core file and backtrace the error in gdb/lldb.
Lukasz
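
The kind of bug described above (a write past the end of an array that only sometimes crashes) can be illustrated with a minimal Fortran sketch. This is not taken from the poster's code: the program name bounds_demo, the array a, and the counter n_rejected are hypothetical, and the loop deliberately runs one step too far to mimic an off-by-one error. The guard uses the intrinsic lbound/ubound functions to reject the invalid write instead of letting it silently overwrite neighbouring memory.

```fortran
program bounds_demo
  implicit none
  integer, parameter :: n = 5      ! hypothetical array size
  real :: a(n)
  integer :: i, n_rejected

  a = 0.0
  n_rejected = 0

  ! The loop runs to n+1 on purpose, mimicking an off-by-one bug.
  ! Without the guard, a(n+1) would be written outside the array,
  ! which may corrupt other data or crash only in some builds.
  do i = 1, n + 1
     if (i < lbound(a, 1) .or. i > ubound(a, 1)) then
        n_rejected = n_rejected + 1
        print '(a,i0,a,i0,a,i0)', 'index ', i, ' outside bounds ', &
              lbound(a, 1), ':', ubound(a, 1)
     else
        a(i) = real(i)
     end if
  end do

  print '(a,i0)', 'rejected writes: ', n_rejected
end program bounds_demo
```

In a large code it is usually more practical to let the compiler insert such checks than to write them by hand. Assuming a typical toolchain, a debug build with gfortran would use -g -O0 -fcheck=bounds, and with Intel Fortran -g -O0 -check bounds -traceback; the unguarded version of the loop would then stop with a runtime error at the offending line instead of behaving differently depending on optimization level or added print statements.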
