On Thu, 5 Nov 2015 16:20:26 +0100
Jan Blechta <[email protected]> wrote:

> Actually, a check of the return code is missing here:
> https://bitbucket.org/fenics-project/dolfin/src/dd945d70e9a7c8548b4cb88fe8bdb2abe2198b29/dolfin/nls/PETScTAOSolver.cpp?at=master&fileviewer=file-view-default#PETScTAOSolver.cpp-263

Filed here https://bitbucket.org/fenics-project/dolfin/issues/602.
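
For reference, the backtrace below shows that line 263 is the TaoSolve()
call, whose return code is currently discarded. A minimal sketch of the
kind of check that is missing (the member and helper names here are my
assumptions, not the actual patch; see the issue for the real fix):

  // Hypothetical: propagate a TaoSolve failure instead of ignoring it
  PetscErrorCode ierr = TaoSolve(_tao);
  if (ierr != 0)
    petsc_error(ierr, "PETScTAOSolver.cpp", "TaoSolve");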

Jan


> 
> Jan
> 
> 
> On Thu, 5 Nov 2015 16:14:09 +0100
> Jan Blechta <[email protected]> wrote:
> 
> > I can reproduce it at step 6683 on 3 processes, but I have no idea
> > why this happens. Unfortunately, I don't currently have a PETSc build
> > with debugging, so it is hard to investigate.
> > 
> > Backtrace on one of processes:
> > =================================================================================
> > Breakpoint 1, 0x00007fffecea4830 in PetscError ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > (gdb) bt
> > #0  0x00007fffecea4830 in PetscError ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #1  0x00007fffecfa0989 in VecAssemblyBegin_MPI ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #2  0x00007fffecf7b8f7 in VecAssemblyBegin ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #3  0x00007fffeedf9631 in dolfin::PETScVector::apply (
> >     this=this@entry=0x1e8e6a0, mode="insert")
> >     at ../../dolfin/la/PETScVector.cpp:319
> > #4  0x00007fffeedf92b3 in dolfin::PETScVector::zero (this=0x1e8e6a0)
> >     at ../../dolfin/la/PETScVector.cpp:342
> > #5  0x00007fffeec0de31 in dolfin::PETScTAOSolver::solve (
> >     this=this@entry=0x2173000, optimisation_problem=..., x=...,
> >     lb=..., ub=...) at ../../dolfin/nls/PETScTAOSolver.cpp:266
> > #6  0x00007fffeec0edae in dolfin::PETScTAOSolver::solve (
> >     this=this@entry=0x2173000, optimisation_problem=..., x=...)
> >     at ../../dolfin/nls/PETScTAOSolver.cpp:177
> > #7  0x00007fffd8d157f2 in _wrap_PETScTAOSolver_solve__SWIG_1 (
> >     swig_obj=0x7fffffffc210, nobjs=3) at modulePYTHON_wrap.cxx:41488
> > #8  _wrap_PETScTAOSolver_solve (self=<optimized out>,
> >     args=<optimized out>) at modulePYTHON_wrap.cxx:41521
> > #9  0x00000000004d2017 in PyEval_EvalFrameEx ()
> > #10 0x00000000004cb6b1 in PyEval_EvalCodeEx ()
> > =================================================================================
> > 
> > and other processes:
> > =================================================================================
> > Program received signal SIGINT, Interrupt.
> > 0x00007ffff78dba77 in sched_yield ()
> >     at ../sysdeps/unix/syscall-template.S:81
> > 81      ../sysdeps/unix/syscall-template.S: No such file or directory.
> > (gdb) bt
> > #0  0x00007ffff78dba77 in sched_yield ()
> >     at ../sysdeps/unix/syscall-template.S:81
> > #1  0x00007fffecb1307d in opal_progress () from /usr/lib/libmpi.so.1
> > #2  0x00007fffeca58e44 in ompi_request_default_wait_all ()
> >    from /usr/lib/libmpi.so.1
> > #3  0x00007fffd797dab2 in ompi_coll_tuned_allreduce_intra_recursivedoubling ()
> >    from /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so
> > #4  0x00007fffeca6542b in PMPI_Allreduce () from /usr/lib/libmpi.so.1
> > #5  0x00007fffecf97f6a in VecDot_MPI ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #6  0x00007fffecf7e8b1 in VecDot ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #7  0x00007fffed70d9b1 in TaoSolve_CG ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #8  0x00007fffed6f0847 in TaoSolve ()
> >    from /home/jan/dev/hashstack/fenics-deps.host-debian/lib/libpetsc.so.3.6
> > #9  0x00007fffeec0de25 in dolfin::PETScTAOSolver::solve (
> >     this=this@entry=0x1fa1a40, optimisation_problem=..., x=...,
> >     lb=..., ub=...) at ../../dolfin/nls/PETScTAOSolver.cpp:263
> > #10 0x00007fffeec0edae in dolfin::PETScTAOSolver::solve (
> >     this=this@entry=0x1fa1a40, optimisation_problem=..., x=...)
> >     at ../../dolfin/nls/PETScTAOSolver.cpp:177
> > #11 0x00007fffd8d157f2 in _wrap_PETScTAOSolver_solve__SWIG_1 (
> > =================================================================================
> > 
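> > One reading of the two backtraces (my interpretation only, not
> > verified): TaoSolve has already returned on the first rank, which
> > then moved on to the next collective operation (VecAssemblyBegin via
> > PETScVector::zero at PETScTAOSolver.cpp:266), while the other ranks
> > are still blocked in the MPI_Allreduce inside TaoSolve. A minimal,
> > self-contained sketch of that failure mode (assumed illustration,
> > not DOLFIN code):
> > 
> >   // One rank skips a collective that the others enter; the others
> >   // then hang, like the VecDot/PMPI_Allreduce frames above.
> >   #include <mpi.h>
> > 
> >   int main(int argc, char** argv)
> >   {
> >     MPI_Init(&argc, &argv);
> >     int rank;
> >     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >     double local = 1.0, global = 0.0;
> >     if (rank != 0)  // pretend rank 0 hit an error earlier and bailed out
> >       MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
> >                     MPI_COMM_WORLD);  // ranks != 0 block here forever
> >     MPI_Finalize();
> >     return 0;
> >   }
> > 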
> > Jan
> > 
> > 
> > On Thu, 5 Nov 2015 16:30:50 +0200
> > Giorgos Grekas <[email protected]> wrote:
> > 
> > > Hello again,
> > > 
> > > I would like to ask about the bug reported in this mail: is it
> > > scheduled to be fixed in the coming months?
> > > 
> > > Thank you in advance, and thank you for your great support.
> > > 
> > > On Mon, Oct 12, 2015 at 6:42 PM, Jan Blechta
> > > <[email protected]> wrote:
> > > 
> > > > On Mon, 12 Oct 2015 17:15:18 +0300
> > > > Giorgos Grekas <[email protected]> wrote:
> > > >
> > > > > I have attached a backtrace in the file bt.txt, together with my
> > > > > code. To run my code, execute the file runMe.py.
> > > >
> > > > This code fails with an assertion in mshr:
> > > >
> > > > *** Error:   Unable to complete call to function add_simple_polygon().
> > > > *** Reason:  Assertion !i.second failed.
> > > > *** Where:   This error was encountered inside ../src/CSGCGALDomain2D.cpp (line 488).
> > > >
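> > > > The failed assertion looks like the usual std::map/std::set insert
> > > > idiom, where .second of the returned pair says whether the key was
> > > > newly inserted. A minimal assumed illustration of that idiom only
> > > > (hypothetical names, not the mshr code):
> > > >
> > > >   #include <cassert>
> > > >   #include <map>
> > > >
> > > >   int main()
> > > >   {
> > > >     std::map<int, int> seen;            // hypothetical container
> > > >     auto i = seen.insert({42, 0});      // i.second == true: new key
> > > >     // asserting !i.second expects the key to be present already,
> > > >     // so a fresh insertion trips the assertion, as in the report:
> > > >     assert(!i.second);
> > > >     return 0;
> > > >   }
> > > >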
> > > > This seems like a trivial bug. Could you fix it, Benjamin?
> > > >
> > > > Jan
> > > >
> > > > >
> > > > >
> > > > > On Mon, Oct 12, 2015 at 4:40 PM, Jan Blechta
> > > > > <[email protected]> wrote:
> > > > >
> > > > > > PETSc error code 1 does not seem to indicate a known, expected
> > > > > > failure mode; see
> > > > > > http://www.mcs.anl.gov/petsc/petsc-dev/include/petscerror.h.html.
> > > > > > It looks like an error not handled by PETSc itself.
> > > > > >
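> > > > > > For orientation: every PETSc call returns a PetscErrorCode, and
> > > > > > the CHKERRQ macro passes any nonzero code up the call stack,
> > > > > > which is how an otherwise unnamed code like 1 can surface at
> > > > > > the DOLFIN level. A minimal sketch of that convention (assumed
> > > > > > illustration, not DOLFIN code):
> > > > > >
> > > > > >   #include <petscvec.h>
> > > > > >
> > > > > >   PetscErrorCode assemble(Vec x)
> > > > > >   {
> > > > > >     PetscErrorCode ierr;
> > > > > >     ierr = VecAssemblyBegin(x); CHKERRQ(ierr);  // pass failures up
> > > > > >     ierr = VecAssemblyEnd(x);   CHKERRQ(ierr);
> > > > > >     return 0;
> > > > > >   }
> > > > > >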
> > > > > > You could provide us with your code, or try investigating the
> > > > > > problem with a debugger:
> > > > > >
> > > > > >   $ mpirun -n 3 xterm -e gdb \
> > > > > >       -ex 'set breakpoint pending on' \
> > > > > >       -ex 'break PetscError' \
> > > > > >       -ex 'break dolfin::dolfin_error' \
> > > > > >       -ex r -args python your_script.py
> > > > > >   ...
> > > > > >   Breakpoint hit...
> > > > > >   (gdb) bt
> > > > > >
> > > > > > and post a backtrace here.
> > > > > >
> > > > > > Jan
> > > > > >
> > > > > >
> > > > > > On Mon, 12 Oct 2015 15:16:48 +0300
> > > > > > Giorgos Grekas <[email protected]> wrote:
> > > > > >
> > > > > > > Hello,
> > > > > > > I am using NCG from the TAO solver, and I wanted to test the
> > > > > > > validity of my code on a PC with 4 processors before running
> > > > > > > it on a cluster. When I run my code with 2 processes
> > > > > > > (mpirun -np 2) everything seems to work fine, but when I use
> > > > > > > 3 or more processes I get the following error:
> > > > > > >
> > > > > > >
> > > > > > > *** Error:   Unable to successfully call PETSc function 'VecAssemblyBegin'.
> > > > > > > *** Reason:  PETSc error code is: 1.
> > > > > > > *** Where:   This error was encountered inside /home/ggrekas/.hashdist/tmp/dolfin-wphma2jn5fuw/dolfin/la/PETScVector.cpp.
> > > > > > > *** Process: 3
> > > > > > > ***
> > > > > > > *** DOLFIN version: 1.7.0dev
> > > > > > > *** Git changeset:  3fbd47ec249a3e4bd9d055f8a01b28287c5bcf6a
> > > > > > > ***
> > > > > > > -------------------------------------------------------------------------
> > > > > > > 
> > > > > > > ===================================================================================
> > > > > > > =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> > > > > > > =   EXIT CODE: 134
> > > > > > > =   CLEANING UP REMAINING PROCESSES
> > > > > > > =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> > > > > > > ===================================================================================
> > > > > > > YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)
> > > > > > > This typically refers to a problem with your application.
> > > > > > > Please see the FAQ page for debugging suggestions
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > So, is this an issue that I should report to the TAO team?
> > > > > > >
> > > > > > > Thank you in advance.
> > > > > >
> > > > > >
> > > >
> > > >
> > 

_______________________________________________
fenics-support mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics-support
