On Thu, 5 Nov 2015 16:20:26 +0100 Jan Blechta <[email protected]> wrote:
> Actually, a check of the return code is missing here:
> https://bitbucket.org/fenics-project/dolfin/src/dd945d70e9a7c8548b4cb88fe8bdb2abe2198b29/dolfin/nls/PETScTAOSolver.cpp?at=master&fileviewer=file-view-default#PETScTAOSolver.cpp-263

Filed here: https://bitbucket.org/fenics-project/dolfin/issues/602

Jan
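For illustration, the fix presumably amounts to checking the PetscErrorCode
returned by TaoSolve at PETScTAOSolver.cpp:263 before the result vector is
used again. A minimal sketch, assuming an error-reporting helper in the style
of DOLFIN's other PETSc wrappers (the helper name and the surrounding details
are assumptions, not taken from the actual source):

  // Sketch only: check the return code of TaoSolve instead of ignoring it.
  // "_tao" is the solver's Tao object; "petsc_error" stands in for whatever
  // error-reporting routine the surrounding code uses.
  PetscErrorCode ierr = TaoSolve(_tao);
  if (ierr != 0)
  {
    // Report the failing PETSc call and its error code rather than
    // continuing with vectors that may be in an inconsistent state.
    petsc_error(ierr, __FILE__, "TaoSolve");
  }

With such a check in place, a failing TaoSolve on one rank would surface as a
proper error instead of leaving the ranks in different collective calls,
which appears to be what the two backtraces below show.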
> 
> Jan
> 
> 
> On Thu, 5 Nov 2015 16:14:09 +0100
> Jan Blechta <[email protected]> wrote:
> 
> > I can reproduce it in step 6683 on 3 processes but I have no idea
> > why this happens. Unfortunately I don't currently have PETSc with
> > debugging, so it is hard to investigate.
> > 
> > Backtrace on one of the processes:
> > =================================================================================
> > Breakpoint 1, 0x00007fffecea4830 in PetscError ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > (gdb) bt
> > #0  0x00007fffecea4830 in PetscError ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #1  0x00007fffecfa0989 in VecAssemblyBegin_MPI ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #2  0x00007fffecf7b8f7 in VecAssemblyBegin ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #3  0x00007fffeedf9631 in dolfin::PETScVector::apply (
> >     this=this@entry=0x1e8e6a0, mode="insert")
> >     at ../../dolfin/la/PETScVector.cpp:319
> > #4  0x00007fffeedf92b3 in dolfin::PETScVector::zero (this=0x1e8e6a0)
> >     at ../../dolfin/la/PETScVector.cpp:342
> > #5  0x00007fffeec0de31 in dolfin::PETScTAOSolver::solve (
> >     this=this@entry=0x2173000, optimisation_problem=..., x=...,
> >     lb=..., ub=...) at ../../dolfin/nls/PETScTAOSolver.cpp:266
> > #6  0x00007fffeec0edae in dolfin::PETScTAOSolver::solve (
> >     this=this@entry=0x2173000, optimisation_problem=..., x=...)
> >     at ../../dolfin/nls/PETScTAOSolver.cpp:177
> > #7  0x00007fffd8d157f2 in _wrap_PETScTAOSolver_solve__SWIG_1 (
> >     swig_obj=0x7fffffffc210, nobjs=3) at modulePYTHON_wrap.cxx:41488
> > #8  _wrap_PETScTAOSolver_solve (self=<optimized out>, args=<optimized out>)
> >     at modulePYTHON_wrap.cxx:41521
> > #9  0x00000000004d2017 in PyEval_EvalFrameEx ()
> > #10 0x00000000004cb6b1 in PyEval_EvalCodeEx ()
> > =================================================================================
> > 
> > and the other processes:
> > =================================================================================
> > Program received signal SIGINT, Interrupt.
> > 0x00007ffff78dba77 in sched_yield ()
> >     at ../sysdeps/unix/syscall-template.S:81
> > 81  ../sysdeps/unix/syscall-template.S: No such file or directory.
> > (gdb) bt
> > #0  0x00007ffff78dba77 in sched_yield ()
> >     at ../sysdeps/unix/syscall-template.S:81
> > #1  0x00007fffecb1307d in opal_progress () from /usr/lib/libmpi.so.1
> > #2  0x00007fffeca58e44 in ompi_request_default_wait_all ()
> >     from /usr/lib/libmpi.so.1
> > #3  0x00007fffd797dab2 in ompi_coll_tuned_allreduce_intra_recursivedoubling ()
> >     from /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so
> > #4  0x00007fffeca6542b in PMPI_Allreduce () from /usr/lib/libmpi.so.1
> > #5  0x00007fffecf97f6a in VecDot_MPI ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #6  0x00007fffecf7e8b1 in VecDot ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #7  0x00007fffed70d9b1 in TaoSolve_CG ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #8  0x00007fffed6f0847 in TaoSolve ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #9  0x00007fffeec0de25 in dolfin::PETScTAOSolver::solve (
> >     this=this@entry=0x1fa1a40, optimisation_problem=..., x=...,
> >     lb=..., ub=...) at ../../dolfin/nls/PETScTAOSolver.cpp:263
> > #10 0x00007fffeec0edae in dolfin::PETScTAOSolver::solve (
> >     this=this@entry=0x1fa1a40, optimisation_problem=..., x=...)
> >     at ../../dolfin/nls/PETScTAOSolver.cpp:177
> > #11 0x00007fffd8d157f2 in _wrap_PETScTAOSolver_solve__SWIG_1 (
> > =================================================================================
> > 
> > Jan
> > 
> > 
> > On Thu, 5 Nov 2015 16:30:50 +0200
> > Giorgos Grekas <[email protected]> wrote:
> > 
> > > Hello again,
> > > 
> > > I would like to ask about the bug reported in this mail: is it
> > > scheduled to be fixed in the following months?
> > > 
> > > Thank you in advance, and for your great support.
> > > 
> > > On Mon, Oct 12, 2015 at 6:42 PM, Jan Blechta
> > > <[email protected]> wrote:
> > > 
> > > > On Mon, 12 Oct 2015 17:15:18 +0300
> > > > Giorgos Grekas <[email protected]> wrote:
> > > > 
> > > > > I provide a backtrace in the file bt.txt, along with my code.
> > > > > To run my code you need to run the file runMe.py.
> > > > 
> > > > This code fails with an assertion in mshr:
> > > > 
> > > > *** Error:  Unable to complete call to function add_simple_polygon().
> > > > *** Reason: Assertion !i.second failed.
> > > > *** Where:  This error was encountered inside ../src/CSGCGALDomain2D.cpp (line 488).
> > > > 
> > > > This seems like a trivial bug. Could you fix it, Benjamin?
> > > > 
> > > > Jan
> > > > 
> > > > > On Mon, Oct 12, 2015 at 4:40 PM, Jan Blechta
> > > > > <[email protected]> wrote:
> > > > > 
> > > > > > PETSc error code 1 does not seem to indicate an expected problem; see
> > > > > > http://www.mcs.anl.gov/petsc/petsc-dev/include/petscerror.h.html.
> > > > > > It looks like an error not handled by PETSc.
> > > > > > 
> > > > > > You could provide us with your code, or try investigating the
> > > > > > problem with a debugger:
> > > > > > 
> > > > > > $ mpirun -n 3 xterm -e gdb -ex 'set breakpoint pending on' \
> > > > > >     -ex 'break PetscError' -ex 'break dolfin::dolfin_error' \
> > > > > >     -ex r -args python your_script.py
> > > > > > ...
> > > > > > Break point hit...
> > > > > > (gdb) bt
> > > > > > 
> > > > > > and post a backtrace here.
> > > > > > 
> > > > > > Jan
> > > > > > 
> > > > > > On Mon, 12 Oct 2015 15:16:48 +0300
> > > > > > Giorgos Grekas <[email protected]> wrote:
> > > > > > 
> > > > > > > Hello,
> > > > > > > I am using the NCG solver from TAO, and I wanted to test my
> > > > > > > code's validity on a PC with 4 processors before running it
> > > > > > > on a cluster. When I run my code with 2 processes
> > > > > > > (mpirun -np 2) everything seems to work fine, but when I use
> > > > > > > 3 or more processes I get the following error:
> > > > > > > 
> > > > > > > *** Error:  Unable to successfully call PETSc function 'VecAssemblyBegin'.
> > > > > > > *** Reason: PETSc error code is: 1.
> > > > > > > *** Where:  This error was encountered inside
> > > > > > >             /home/ggrekas/.hashdist/tmp/dolfin-wphma2jn5fuw/dolfin/la/PETScVector.cpp.
> > > > > > > *** Process: 3
> > > > > > > ***
> > > > > > > *** DOLFIN version: 1.7.0dev
> > > > > > > *** Git changeset:  3fbd47ec249a3e4bd9d055f8a01b28287c5bcf6a
> > > > > > > -------------------------------------------------------------------------
> > > > > > > 
> > > > > > > ===================================================================================
> > > > > > > =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> > > > > > > =   EXIT CODE: 134
> > > > > > > =   CLEANING UP REMAINING PROCESSES
> > > > > > > =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> > > > > > > ===================================================================================
> > > > > > > YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)
> > > > > > > This typically refers to a problem with your application.
> > > > > > > Please see the FAQ page for debugging suggestions.
> > > > > > > 
> > > > > > > So, is this an issue that I should report to the TAO team?
> > > > > > > 
> > > > > > > Thank you in advance.

_______________________________________________
fenics-support mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics-support
