https://bugs.documentfoundation.org/show_bug.cgi?id=162526

Geoff Kuenning <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |UNCONFIRMED
     Ever confirmed|1                           |0

--- Comment #2 from Geoff Kuenning <[email protected]> ---
Well, I guess I should have started with gdb.  The session took about 6 hours
before I found the underlying cause, but at least it was more focused than
looking through strace output with over 100K lines or randomly updating
shared libraries.

The immediate cause of the crash is a segfault.  That's because in
ScDocShell::InitNew, in sc/source/ui/docshell/docsh2.cxx, there is this
innocent-looking bit of code:

        ScOrcusFilters* pOrcus = ScFormatFilter::Get().GetOrcusFilters();

The problem is that ScFormatFilter::Get() returns a reference to a dynamically
loaded object, and in my case that reference is a null pointer.  So C++ merrily
follows that null pointer, trying to call the virtual function GetOrcusFilters,
and crashes.

So: bug #1 is that ScFormatFilter::Get has no protection against a failed
dynamic load.  I'd guess that the same thing happens in other LibreOffice code,
although I'm certainly not the one to go hunting.

Bug #2 is a derivative: when there's a failed dynamic load, dlopen generates a
nice error message (which I'll get to in a moment), but that message is
unavailable to the user.  Instead there's just a confusing and very generic "I
crashed" message caused by the segfault.  The code for Get() in
sc/source/ui/docshell/impex.cxx contains an assert(false), but that's hardly
better (and I guess in my distro assertions must have been disabled, boo). 
I'll agree that most users wouldn't be able to understand a failed-dlopen
message, but at least they'd have something they could put into a bug report or
read off to an expert.
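
To illustrate what surfacing the loader's message could look like, here is a
minimal sketch using plain dlopen/dlerror.  The helper name loadLibraryOrThrow
is hypothetical, and whatever wrapper LibreOffice actually uses around dynamic
loading is not shown here; the only point is that the dlerror() text gets
carried along instead of being dropped.

```cpp
#include <dlfcn.h>
#include <stdexcept>
#include <string>

// Hypothetical helper, not LibreOffice code: load a shared library and turn
// a failed load into an exception that carries the dlerror() text.
void* loadLibraryOrThrow(const std::string& path)
{
    dlerror();  // clear any stale error state before the call
    // RTLD_NOW resolves all symbols immediately, so an undefined symbol is
    // reported here rather than at the first call into the library.
    void* handle = dlopen(path.c_str(), RTLD_NOW);
    if (handle == nullptr)
    {
        const char* msg = dlerror();
        throw std::runtime_error(
            "cannot load " + path + ": " + (msg ? msg : "unknown error"));
    }
    return handle;
}
```

A caller could catch the exception and put e.what() into the crash dialog or
a log file, which would give the user something concrete to quote.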

Now, as to the error message: decoding the error structure from dlopen, the
offending file is /usr/lib64/libreoffice/program/libscfiltlo.so and the problem
is an undefined symbol,
_ZN5orcus13create_filterENS_8format_tEPNS_11spreadsheet5iface14import_factoryE,
which demangles to orcus::create_filter(orcus::format_t,
orcus::spreadsheet::iface::import_factory*).  That symbol doesn't exist in any
version of liborcus that I have installed on the failing machines (16, 17, and
18--don't ask me why I have three variants because I don't know).  But sure
enough, the scratch VM that works has it defined in
/usr/lib64/liborcus-0.18.so.0.0.0.
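
For anyone who wants to reproduce the demangling step without c++filt, the
following sketch uses abi::__cxa_demangle from <cxxabi.h>, which GCC and Clang
provide.  The wrapper function demangle is my own name, not something from the
LibreOffice or orcus sources.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Demangle an Itanium-ABI symbol name.  Returns the input unchanged if the
// name cannot be demangled.
std::string demangle(const std::string& mangled)
{
    int status = 0;
    // With null buffer arguments, __cxa_demangle allocates the result with
    // malloc, so it must be released with free.
    char* raw = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr)
        return mangled;
    std::string result(raw);
    std::free(raw);
    return result;
}
```

Passing the undefined symbol quoted above to demangle() should yield the
orcus::create_filter signature given in this comment.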

And armed with that knowledge, I quickly figured out that a simple update was
needed and oocalc works again!

But: it's really not OK for C++ functions that return references to return null
pointers.  That's an absolute no-no.  So please get rid of this code in
ScFormatFilter::Get():

        return static_cast<ScFormatFilterPlugin*>(nullptr);

and replace it with proper error handling.
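
As a rough idea of what I mean, here is a sketch with stand-in types.  None of
these names (FilterPlugin, PluginFactory, getPlugin, and the two factories)
come from the LibreOffice sources, and I don't know whether exceptions are the
preferred mechanism in that part of the code; the point is only that the
function either returns a valid reference or reports the failure.

```cpp
#include <stdexcept>

// Stand-in for the real plugin interface; only the shape matters here.
struct FilterPlugin
{
    virtual ~FilterPlugin() = default;
    virtual int GetOrcusFilters() = 0;
};

// Hypothetical concrete plugin used to demonstrate the success path.
struct WorkingPlugin : FilterPlugin
{
    int GetOrcusFilters() override { return 1; }
};

// A factory returning nullptr models a failed dynamic load.
using PluginFactory = FilterPlugin* (*)();

FilterPlugin* workingFactory()
{
    static WorkingPlugin plugin;
    return &plugin;
}

FilterPlugin* failingFactory()
{
    return nullptr;
}

// Returns a valid reference or throws; never binds a reference to null.
FilterPlugin& getPlugin(PluginFactory factory)
{
    FilterPlugin* plugin = factory ? factory() : nullptr;
    if (plugin == nullptr)
        throw std::runtime_error("filter plugin could not be loaded");
    return *plugin;
}
```

With this shape, getPlugin(&failingFactory) throws at the point of failure
instead of handing the caller a reference that crashes on first use.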
