On 11/03/16 16:18, Stephan Bergmann wrote:
> On 11/01/2016 05:29 PM, Alex Peshkoff wrote:
>> On 11/01/16 19:02, Stephan Bergmann wrote:
>>> On 10/27/2016 07:25 PM, Alex Peshkoff wrote:
>>>> On 10/27/16 10:57, Stephan Bergmann wrote:
>>>>> When building Firebird 3.0 as part of LibreOffice (on Linux), it
>>>>> occasionally happens that the build fails with
>>>>>
>>>>> ...
>>>>>> rm -f ../../gen/examples/employee.fdb
>>>>>> ./empbuild ../../gen/examples/employee.fdb
>>>>>> creating database ../../gen/examples/employee.fdb
>>>>>> Turning forced writes off
>>>>>> Couldn't turn forced writes off (139)
>>>>>> Makefile.examples:125: recipe for target '../../gen/examples/employee.fdb' failed
>>>>> (I've patched examples/empbuild/empbuild.e to print the failed process's
>>>>> exit status, 139 i.e. SIGSEGV,
>>>>> <https://cgit.freedesktop.org/libreoffice/core/commit/?id=128e7ce3ffa50b11b2d5ff9777a27b095a84e5d7>
>>>>> "external/firebird: Try track down 'Couldn't turn forced writes off'
>>>>> failure").
>>>>>
>>>>> Looking at the core file of the SIGSEGV'ed 'gfix -write async
>>>>> ../../gen/examples/employee.fdb' (see below), it smells like thread 1 is
>>>>> executing code in some .so while another thread called dlclose on that .so
>>>>> (there's a call to dlclose in ~DlfcnModule in
>>>>> src/common/os/posix/mod_loader.cpp).  Is this a known problem/could my
>>>>> assumption be right?
>>>>>
>>>> Your assumption seems to be right. I have seen such stacks before
>>>> but could not reproduce the problem locally.
>>>> How reliably does the bug reproduce for you? If fairly reliably, can you
>>>> uncomment the debugging macro
>>>> //#define DEBUG_PLUGINS
>>>> near lines 50-55 in src/yvalve/PluginManager.cpp? This should print
>>>> (among others) the line:
>>>>
>>>> resetCleanup() of module <name of unloaded .so>
>>>>
>>>> and (I hope) help me reproduce the bug locally and finally fix it.
>>> I've repeatedly tried building with that macro uncommented, but haven't
>>> managed to run into the failure since (and ran into it occasionally
>>> before, but not too often, say 1-in-10).  It's as if the slightly
>>> changed timing is enough to hide the issue for me.  But I'll keep trying...
>> With races, adding even a little debugging often makes a bug
>> non-reproducible :(
>> Maybe the following will work better: comment out that macro again, but in
>> the same file (PluginManager) remove the #ifdef/#endif near line 365:
>>
>> #ifdef DEBUG_PLUGINS
>>     fprintf(stderr, "resetCleanup() of module %s\n", name.c_str());
>> #endif
>>
>> That will make the amount of output much smaller (only as much as is
>> really needed to see something).
> With that, I managed to see it fail once with
>
> ...
>> rm -f ../../gen/examples/employee.fdb
>> ./empbuild ../../gen/examples/employee.fdb
>> resetCleanup() of module ../../gen/Debug/firebird/plugins/fbtrace
>> resetCleanup() of module ../../gen/Debug/firebird/plugins/fbtrace

Understood. Nice catch.
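
For reference, the suspected failure mode in this thread (one thread still executing code from a .so while another thread calls dlclose on it) is usually avoided by deferring the unload until the last user is gone. Below is a minimal sketch of that idea, not the actual Firebird fix. The names ModuleHandle and holdModule are hypothetical, and the unload step is passed in as a callback so the sketch does not depend on dlfcn. In real code the handle would come from dlopen() and the callback would call dlclose().

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical type, not part of Firebird: a shared, reference-counted
// handle to a loaded module.
using ModuleHandle = std::shared_ptr<void>;

// Hypothetical helper: wraps a raw module handle so that `unload` runs
// exactly once, when the last copy of the returned ModuleHandle is
// destroyed. Every thread that may execute code from the module keeps
// its own copy for as long as it is inside that code.
inline ModuleHandle holdModule(void* handle, std::function<void(void*)> unload)
{
    assert(unload && "an unload callback is required");
    // shared_ptr stores the callback as its deleter and invokes it with
    // `handle` once the reference count drops to zero.
    return ModuleHandle(handle, std::move(unload));
}
```

A thread that takes a copy of the ModuleHandle before calling into the module, and drops it only after returning, cannot have the code unmapped underneath it. The reference count in std::shared_ptr is updated atomically, so copies can be released from different threads.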


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel