https://bugs.documentfoundation.org/show_bug.cgi?id=162526
Geoff Kuenning <[email protected]> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEEDINFO |UNCONFIRMED Ever confirmed|1 |0 --- Comment #2 from Geoff Kuenning <[email protected]> --- Well, I guess I should have started with gdb...although the session took about 6 hours before I found the underlying cause, but at least it was more focused than looking through strace output with over 100K lines or randomly updating shared libraries. The immediate cause of the crash is a segfault. That's because in ScDocShell::InitNew, in sc/source/ui/docshell/docsh2.cxx, there is this innocent-looking bit of code: ScOrcusFilters* pOrcus = ScFormatFilter::Get().GetOrcusFilters(); The problem is that ScFormatFilter::Get() returns a reference to a dynamically loaded object, and in my case that reference is a null pointer. So C++ merrily follows that null pointer, trying to call the virtual function GetOrcusFilters, and crashes. So: bug #1 is that ScFormatFilter::Get has no protection against a failed dynamic load. I'd guess that the same thing happens in other Libreoffice code, although I'm certainly not the one to go hunting. Bug #2 is a derivative: when there's a failed dynamic load, dlopen generates a nice error message (which I'll get to in a moment), but that message is unavailable to the user. Instead there's just a confusing and very generic "I crashed" message caused by the segfault. The code for Get() in sc/source/ui/docshell/impex.cxx contains an assert(false), but that's hardly better (and I guess in my distro assertions must have been disabled, boo). I'll agree that most users wouldn't be able to understand a failed-dlopen message, but at least they'd have something they could put into a bug report or read off to an expert. Now, as to the error message: decoding the error structure from dlopen, the offending file is /usr/lib64/libreoffice/program/libscfiltlo.so and the problem is an undefined symbol, _ZN5orcus13create_filterENS_8format_tEPNS_11spreadsheet5iface14import_factoryE, which demangles to orcus::create_filter(orcus::format_t, orcus::spreadsheet::iface::import_factory*). That symbol doesn't exist in any version of liborcus that I have installed on the failing machines (16, 17, and 18--don't ask me why I have three variants because I don't know). But sure enough, the scratch VM that works has it defined in /usr/lib64/liborcus-0.18.so.0.0.0. And armed with that knowledge, I quickly figured out that a simple update was needed and oocalc works again! But: it's really not OK for C++ functions that return references to return null pointers. That's an absolute no-no. So please get rid of this code in ScFormatFilter::Get(): return static_cast<ScFormatFilterPlugin*>(nullptr); and replace it with proper error handling. -- You are receiving this mail because: You are the assignee for the bug.
