On 11/03/16 16:18, Stephan Bergmann wrote:
> On 11/01/2016 05:29 PM, Alex Peshkoff wrote:
>> On 11/01/16 19:02, Stephan Bergmann wrote:
>>> On 10/27/2016 07:25 PM, Alex Peshkoff wrote:
>>>> On 10/27/16 10:57, Stephan Bergmann wrote:
>>>>> When building Firebird 3.0 as part of LibreOffice (on Linux), it
>>>>> occasionally happens that the build fails with
>>>>>
>>>>> ...
>>>>>> rm -f ../../gen/examples/employee.fdb
>>>>>> ./empbuild ../../gen/examples/employee.fdb
>>>>>> creating database ../../gen/examples/employee.fdb
>>>>>> Turning forced writes off
>>>>>> Couldn't turn forced writes off (139)
>>>>>> Makefile.examples:125: recipe for target
>>>>>> '../../gen/examples/employee.fdb' failed
>>>>> (I've patched examples/empbuild/empbuild.e to print the failed process's
>>>>> exit status, 139 i.e. SIGSEGV,
>>>>> <https://cgit.freedesktop.org/libreoffice/core/commit/?id=128e7ce3ffa50b11b2d5ff9777a27b095a84e5d7>
>>>>> "external/firebird: Try track down 'Couldn't turn forced writes off'
>>>>> failure").
>>>>>
>>>>> Looking at the core file of the SIGSEGV'ed 'gfix -write async
>>>>> ../../gen/examples/employee.fdb' (see below), it smells like thread 1 is
>>>>> executing code in some .so while another thread called dlclose on that .so
>>>>> (there's a call to dlclose in ~DlfcnModule in
>>>>> src/common/os/posix/mod_loader.cpp). Is this a known problem/could my
>>>>> assumption be right?
>>>>>
>>>> Your assumption seems to be right. I've seen such stacks before
>>>> but could not reproduce it locally.
>>>> How reliably does the bug reproduce for you? If fairly reliably, can you
>>>> uncomment the debugging macro
>>>> //#define DEBUG_PLUGINS
>>>> near lines 50-55 in src/yvalve/PluginManager.cpp. This should print
>>>> (among others) the line:
>>>>
>>>> resetCleanup() of module <name of unloaded .so>
>>>>
>>>> and (I hope) help me reproduce the bug locally and finally fix it.
>>> I've repeatedly tried building with that macro uncommented, but haven't
>>> managed to run into the failure since (and ran into it occasionally
>>> before, but not too often, say 1-in-10). It's as if the slightly
>>> changed timing is enough to hide the issue for me. But I'll keep trying...
>> With races, adding even a little debugging often makes a bug
>> non-reproducible :(
>> Maybe the following can work better - comment out that macro again, but in
>> the same file (PluginManager) remove the #ifdef/#endif near line 365:
>>
>> #ifdef DEBUG_PLUGINS
>> fprintf(stderr, "resetCleanup() of module %s\n",
>> name.c_str());
>> #endif
>>
>> That will make the amount of output much smaller (as small as really needed
>> to see something).
> With that, I managed to see it fail once with
>
> ...
>> rm -f ../../gen/examples/employee.fdb
>> ./empbuild ../../gen/examples/employee.fdb
>> resetCleanup() of module ../../gen/Debug/firebird/plugins/fbtrace
>> resetCleanup() of module ../../gen/Debug/firebird/plugins/fbtrace
Understood. Nice catch.

Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel