On Sun, May 13, 2012 at 7:39 PM, Mohammad Mirzadeh <mirzadeh at gmail.com> wrote:

> on a somewhat related topic, I recently found I have the CHKERRXX option
> which throws exceptions. This seems to be a better option for me since in
> some functions with return values I cannot use CHKERRQ/V. If I use
> this, do I need to catch the exception myself or is it caught by petsc?
>

You catch the exception.

   Matt

> On Sun, May 13, 2012 at 4:33 PM, Mohammad Mirzadeh <mirzadeh at gmail.com> wrote:
>
>> On Sat, May 12, 2012 at 3:16 AM, Matthew Knepley <knepley at gmail.com> wrote:
>>
>>> On Sat, May 12, 2012 at 4:33 AM, Mohammad Mirzadeh <mirzadeh at gmail.com> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I'm having a really weird issue here! My code seg faults for certain
>>>> problem sizes and after using gdb I have been able to pinpoint the problem
>>>> to a VecGetArray call. Here's a series of things I have tried so far
>>>>
>>>> 1) -on_error_attach_debugger -----> unsuccessful; does not launch
>>>> debugger
>>>> 2) -start_in_debugger -------> unsuccessful; does not start debugger
>>>>
>>>
>>> Run with -log_summary. It will tell you what options the program got.
>>> Also, are there errors relating to X? Send
>>> all output to petsc-maint at mcs.anl.gov
>>>
>>
>> Matt, -log_summary also does not generate any output! I was eventually
>> able to start_in_debugger using xterm. Previously I was trying to start in
>> kdbg. Even with xterm, -on_error_attach_debugger does not start the
>> debugger. In either case, starting the debugger in xterm using
>> -start_in_debugger or attaching the debugger myself manually, I get a
>> segfault at VecGetArray and then the program terminates without any further
>> output.
>>
>>>> 3) attaching debugger myself -----> code runs in debugger and seg faults
>>>> when calling VecGetArray
>>>>
>>>
>>> Is this a debug build? What dereference is causing the SEGV? Is the Vec
>>> a valid object? It sounds like
>>> it has been corrupted.
>>>
>>
>> Yes; with the -g option. How can I check if Vec is "valid"?
>>
>>>> 4) using ierr=VecGetArray;CHKERRQ(ierr) ------> PETSc does not produce
>>>> error messages; the code simply seg faults and terminates
>>>> 5) checking the values of ierr inside the debugger ---------> They are
>>>> all 0 up until the code terminates; I think this means petsc does not
>>>> generate error?
>>>> 6) checking for memory leak with valgrind -----------> All I get are
>>>> leaks from OpenMPI and PetscInitialize and PetscFinalize; I think these are
>>>> just routine and safe?
>>>>
>>
>> Should I attach the whole valgrind output here or send it to petsc-maint?
>> These two just repeat a couple of times:
>>
>> ==4508== 320 (288 direct, 32 indirect) bytes in 1 blocks are definitely lost in loss record 2,644 of 2,665
>> ==4508==    at 0x4C2815C: malloc (vg_replace_malloc.c:236)
>> ==4508==    by 0x86417ED: ???
>> ==4508==    by 0x5D4D099: orte_rml_base_comm_start (in /usr/lib/openmpi/lib/libopen-rte.so.0.0.0)
>> ==4508==    by 0x8640AD1: ???
>> ==4508==    by 0x5D3AFE6: orte_ess_base_app_setup (in /usr/lib/openmpi/lib/libopen-rte.so.0.0.0)
>> ==4508==    by 0x8846E41: ???
>> ==4508==    by 0x5D23A52: orte_init (in /usr/lib/openmpi/lib/libopen-rte.so.0.0.0)
>> ==4508==    by 0x5A9E806: ??? (in /usr/lib/openmpi/lib/libmpi.so.0.0.1)
>> ==4508==    by 0x5ABFD7F: PMPI_Init (in /usr/lib/openmpi/lib/libmpi.so.0.0.1)
>> ==4508==    by 0x530A90: PetscInitialize(int*, char***, char const*, char const*) (pinit.c:668)
>> ==4508==    by 0x4A4955: PetscSession::PetscSession(int*, char***, char const*, char const*) (utilities.h:17)
>> ==4508==    by 0x4A1DA5: main (main_Test2.cpp:49)
>>
>> ==4508== 74 bytes in 1 blocks are definitely lost in loss record 2,411 of 2,665
>> ==4508==    at 0x4C2815C: malloc (vg_replace_malloc.c:236)
>> ==4508==    by 0x6F2DDA1: strdup (strdup.c:43)
>> ==4508==    by 0x5F85117: ??? (in /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)
>> ==4508==    by 0x5F85359: mca_base_param_lookup_string (in /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)
>> ==4508==    by 0xB301869: ???
>> ==4508==    by 0xB2F5126: ???
>> ==4508==    by 0x5F82E17: mca_base_components_open (in /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)
>> ==4508==    by 0x5ADA6BA: mca_btl_base_open (in /usr/lib/openmpi/lib/libmpi.so.0.0.1)
>> ==4508==    by 0xA6A9B93: ???
>> ==4508==    by 0x5F82E17: mca_base_components_open (in /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)
>> ==4508==    by 0x5AE3C88: mca_pml_base_open (in /usr/lib/openmpi/lib/libmpi.so.0.0.1)
>> ==4508==    by 0x5A9E9E0: ??? (in /usr/lib/openmpi/lib/libmpi.so.0.0.1)
>>
>> but eventually I get:
>>
>> ==4508== LEAK SUMMARY:
>> ==4508==    definitely lost: 5,949 bytes in 55 blocks
>> ==4508==    indirectly lost: 3,562 bytes in 32 blocks
>> ==4508==      possibly lost: 0 bytes in 0 blocks
>> ==4508==    still reachable: 181,516 bytes in 2,660 blocks
>> ==4508==         suppressed: 0 bytes in 0 blocks
>> ==4508== Reachable blocks (those to which a pointer was found) are not shown.
>> ==4508== To see them, rerun with: --leak-check=full --show-reachable=y
>>
>> which seems considerable!
>>
>>> How can we say anything without the valgrind output?
>>>
>>>    Matt
>>>
>>>> What else can I try to find the problem? Any recommendation is really
>>>> appreciated!
>>>>
>>>> Thanks,
>>>> Mohammad
>>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
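
For readers following the CHKERRXX exchange above, here is a minimal sketch of the general pattern: an integer error code is converted into a C++ exception inside a function that returns a value, and the caller catches it. This is not PETSc code. get_entry, check_throw, entry_or_throw and entry_or_default are hypothetical stand-ins invented for illustration, and std::runtime_error is an assumed exception type; what a real CHKERRXX throws depends on the PETSc version and build.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a C-style library routine: returns 0 on success
// and a nonzero error code on failure (the convention PETSc routines follow).
int get_entry(const std::vector<double>& v, std::size_t i, double* out) {
    if (out == nullptr) return 1;  // invalid output pointer
    if (i >= v.size()) return 2;   // index out of range
    *out = v[i];
    return 0;
}

// Hypothetical helper in the spirit of CHKERRXX: a nonzero code becomes an
// exception, so it can be used in functions that do not return an error code.
inline void check_throw(int code, const char* where) {
    if (code != 0) {
        throw std::runtime_error(std::string(where) +
                                 " failed with error code " +
                                 std::to_string(code));
    }
}

// A function with a non-integer return value; it cannot "return ierr",
// so it reports failure by throwing.
double entry_or_throw(const std::vector<double>& v, std::size_t i) {
    double value = 0.0;
    check_throw(get_entry(v, i, &value), "get_entry");
    assert(i < v.size());  // holds whenever get_entry reported success
    return value;
}

// The caller is the one who catches: nothing below it handles the exception.
double entry_or_default(const std::vector<double>& v, std::size_t i,
                        double fallback) {
    try {
        return entry_or_throw(v, i);
    } catch (const std::runtime_error&) {
        return fallback;
    }
}
```

Note that this only covers failures the library reports through its return code. A segmentation fault is a signal, not a C++ exception, so neither an error-code check nor a try/catch block will intercept it.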
