Public bug reported:
Hi,
I was checking a build failure in Ubuntu on armhf:
https://launchpad.net/ubuntu/+source/netgen/6.2.2006+really6.2.1905+dfsg-2/+build/20717107
It worked fine for the actual build, but then crashes in the self tests:
$ export PYTHONPATH="$PYTHONPATH:/root/netgen-6.2.2006+really6.2.1905+dfsg/debian/tmp/usr/lib/python3/dist-packages"
$ apt install python3-tk python3-numpy
$ cd ~/netgen-6.2.2006+really6.2.1905+dfsg/tests/pytest
$ LD_LIBRARY_PATH=/root/netgen-6.2.2006+really6.2.1905+dfsg/debian/tmp/usr/lib/$DEB_HOST_MULTIARCH python3 -m pytest -k test_pickling -s
...
test_pickling.py Bus error (core dumped)
This seems to be 100% reproducible if one follows the steps that the Debian
package build does.
The other tests pass:
test_pickling.py::test_pickle_stl PASSED
test_pickling.py::test_pickle_occ PASSED
test_pickling.py::test_pickle_geom2d PASSED
test_pickling.py::test_pickle_mesh PASSED
Only test_pickle_csg fails.
In this test the failing line is: geo_dump = pickle.dumps(geo)
with geo being <netgen.libngpy._csg.CSGeometry object at 0xf6da99b0>
Running the test under python3-dbg and loading the core file in gdb shows the
crash deep inside netgen's own archive code (which is better than a generic
pickling issue, I guess):
#0 0xf659c99e in ngcore::BinaryOutArchive::Write<double> (x=10000000000,
this=0xffa90cc4) at ./libsrc/stlgeom/../general/../core/archive.hpp:732
#1 ngcore::BinaryOutArchive::operator& (this=0xffa90cc4, d=@0x26aa6d8:
10000000000) at ./libsrc/stlgeom/../general/../core/archive.hpp:681
#2 0xf641d4de in netgen::Surface::DoArchive (archive=..., this=0x26aa6d0) at
./libsrc/csg/surface.hpp:68
#3 netgen::OneSurfacePrimitive::DoArchive (archive=..., this=0x26aa6d0) at
./libsrc/csg/surface.hpp:344
#4 netgen::QuadraticSurface::DoArchive (this=0x26aa6d0, ar=...) at
./libsrc/csg/algprim.hpp:52
#5 0xf641dc00 in netgen::Sphere::DoArchive (this=0x26aa6d0, ar=...) at
./libsrc/csg/algprim.hpp:151
#6 0xf6434c28 in ngcore::Archive::operator&<netgen::Surface, void> (val=...,
this=0xffa90cc4) at ./libsrc/csg/../general/../core/archive.hpp:307
#7 ngcore::Archive::operator&<netgen::Surface> (this=this@entry=0xffa90cc4,
p=@0x2727718: 0x26aa6d0) at ./libsrc/csg/../general/../core/archive.hpp:490
#8 0xf6430dca in ngcore::Archive::Do<netgen::Surface*, void> (n=<optimized
out>, data=<optimized out>, this=0xffa90cc4) at
./libsrc/csg/../general/../core/archive.hpp:280
#9 ngcore::Archive::operator&<netgen::Surface*> (v=std::vector of length 32,
capacity 32 = {...}, this=0xffa90cc4) at
./libsrc/csg/../general/../core/archive.hpp:209
#10 ngcore::SymbolTable<netgen::Surface*>::DoArchive<netgen::Surface*> (ar=...,
this=0x2843c64) at ./libsrc/csg/../general/../core/symboltable.hpp:44
#11 ngcore::Archive::operator&<ngcore::SymbolTable<netgen::Surface*>, void>
(val=..., this=0xffa90cc4) at ./libsrc/csg/../general/../core/archive.hpp:307
#12 netgen::CSGeometry::DoArchive (this=0x2843c60, archive=...) at
./libsrc/csg/csgeom.cpp:329
#13 0xf648a958 in ngcore::Archive::operator&<netgen::CSGeometry, void>
(val=..., this=0xffa90cc4) at ./libsrc/csg/../general/../core/archive.hpp:305
#14 ngcore::Archive::operator&<netgen::CSGeometry> (this=this@entry=0xffa90cc4,
p=@0xffa90ba4: 0x2843c60) at ./libsrc/csg/../general/../core/archive.hpp:518
#15 0xf64a4218 in ngcore::NGSPickle<netgen::CSGeometry,
ngcore::BinaryOutArchive,
ngcore::BinaryInArchive>()::{lambda(netgen::CSGeometry*)#1}::operator()(netgen::CSGeometry*)
const (
self=<optimized out>, this=<optimized out>) at
/usr/include/pybind11/pytypes.h:199
....
That is:
./libsrc/stlgeom/../general/../core/archive.hpp:732
721 private:
722 template <typename T>
723 Archive & Write (T x)
724 {
725 if (unlikely(ptr > BUFFERSIZE-sizeof(T)))
726 {
727 stream->write(&buffer[0], ptr);
728 *reinterpret_cast<T*>(&buffer[0]) = x; // NOLINT
729 ptr = sizeof(T);
730 return *this;
731 }
732 *reinterpret_cast<T*>(&buffer[ptr]) = x; // NOLINT
733 ptr += sizeof(T);
734 return *this;
735 }
736 };
With the variables in the crash file being:
(gdb) p &buffer
$5 = (std::array<char, 1024> *) 0xffa90d40
(gdb) p ptr
$3 = 1
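To make the arithmetic on these two values explicit, here is a small sketch. The helper names (is_aligned, write_address) are hypothetical and not part of netgen, and it assumes that buffer[ptr] is a plain byte offset from the start of the array, which is how gdb treats it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of netgen): true if addr is a multiple of alignment.
inline bool is_aligned(std::uint64_t addr, std::uint64_t alignment) {
    assert(alignment != 0);
    return addr % alignment == 0;
}

// Values copied from the gdb session above.
constexpr std::uint64_t kBufferAddr = 0xffa90d40;  // p &buffer
constexpr std::uint64_t kPtr = 1;                  // p ptr

// Address that &buffer[ptr] evaluates to when the index is a byte offset.
inline std::uint64_t write_address() {
    return kBufferAddr + kPtr;
}
```

With these values the buffer itself starts on an 8-byte boundary, but write_address() is odd, so it is neither 4-byte nor 8-byte aligned.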
Depending on how the real code (not gdb on the crash file) computes
this address, that might explain the SIGBUS, since a bus error usually
reflects an unaligned access: if &buffer[ptr] adds up to just
"0xffa90d41" (which is what happens in gdb), then the double is stored
to an odd address and the write fails.
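For illustration only, one common way to avoid this kind of unaligned store is to copy the bytes with std::memcpy instead of writing through a reinterpret_cast pointer. The sketch below is not netgen's code and not a proposed upstream patch: SketchOutBuffer, Flush and Sink are made-up names, and a std::string stands in for the output stream. It only assumes the same layout as the excerpt above (a 1024-byte char buffer plus a byte offset ptr).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a buffered binary writer; not netgen's actual class.
class SketchOutBuffer {
public:
    static constexpr std::size_t BUFFERSIZE = 1024;

    template <typename T>
    SketchOutBuffer& Write(T x) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only trivially copyable values can be copied bytewise");
        static_assert(sizeof(T) <= BUFFERSIZE, "value must fit into the buffer");
        assert(ptr <= BUFFERSIZE);
        if (ptr > BUFFERSIZE - sizeof(T)) {
            Flush();  // not enough room left: empty the buffer first
        }
        // memcpy makes no alignment assumption about &buffer[ptr],
        // so an odd offset such as ptr == 1 is fine.
        std::memcpy(&buffer[ptr], &x, sizeof(T));
        ptr += sizeof(T);
        return *this;
    }

    // Move the buffered bytes to the sink and reset the offset.
    void Flush() {
        sink.append(buffer.data(), ptr);
        ptr = 0;
    }

    const std::string& Sink() const { return sink; }

private:
    std::array<char, BUFFERSIZE> buffer{};
    std::size_t ptr = 0;
    std::string sink;  // stands in for the real output stream
};
```

Writing one char followed by a double reproduces the ptr == 1 situation from the core file; with the bytewise copy the double lands at offset 1 without any aligned-store requirement. Whether upstream would prefer this or a different fix is an open question.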
I'm a bit lost, as .hpp backends to serialize/pickle Python files really aren't
my home turf :-/
Therefore I wanted to reach out to you as the experts on netgen to ask if this
makes sense to you.
I can keep the repro systems around for a while, so if you have debug questions
or small modifications to try, I should be able to test them.
P.S. The reason this didn't show up in the past is that the tests were
not correctly run at build time before; the last Debian upload fixed
that, and since then it is an FTBFS. But it seems not to trigger in all
environments, e.g. in the Debian builds it did not crash the same way.
FYI: I'm not entirely sure it is related, but there is also this recent
bug about unaligned access. The logs linked there didn't look to be
"the same", though. Still, as FYI:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=984439
Note: I've reported the very same bug upstream and will link it; this LP
bug is meant as a tracker to be found via the update-excuse tag.
** Affects: netgen
Importance: Unknown
Status: Unknown
** Affects: netgen (Ubuntu)
Importance: Undecided
Status: New
** Affects: opencascade (Ubuntu)
Importance: Undecided
Status: New
** Affects: netgen (Debian)
Importance: Unknown
Status: Unknown
** Tags: update-excuse
** Bug watch added: github.com/NGSolve/netgen/issues #89
https://github.com/NGSolve/netgen/issues/89
** Also affects: netgen via
https://github.com/NGSolve/netgen/issues/89
Importance: Unknown
Status: Unknown
** Bug watch added: Debian Bug tracker #984439
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=984439
** Also affects: netgen (Debian) via
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=984439
Importance: Unknown
Status: Unknown
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1919335
Title:
FTBFS (test fail with sigbus) on armhf in Hirsute