https://bugs.documentfoundation.org/show_bug.cgi?id=162526
--- Comment #3 from Mike Kaganski <[email protected]> --- (In reply to Geoff Kuenning from comment #2) > Well, I guess I should have started with gdb...although the session took > about 6 hours before I found the underlying cause, but at least it was more > focused than looking through strace output with over 100K lines or randomly > updating shared libraries. > > The immediate cause of the crash is a segfault. That's because in > ScDocShell::InitNew, in sc/source/ui/docshell/docsh2.cxx, there is this > innocent-looking bit of code: > > ScOrcusFilters* pOrcus = ScFormatFilter::Get().GetOrcusFilters(); > > The problem is that ScFormatFilter::Get() returns a reference to a > dynamically loaded object, and in my case that reference is a null pointer. > So C++ merrily follows that null pointer, trying to call the virtual > function GetOrcusFilters, and crashes. Thank you! That is very useful. > So: bug #1 is that ScFormatFilter::Get has no protection against a failed > dynamic load. I'd guess that the same thing happens in other Libreoffice > code, although I'm certainly not the one to go hunting. > > Bug #2 is a derivative: when there's a failed dynamic load, dlopen generates > a nice error message (which I'll get to in a moment), but that message is > unavailable to the user. Instead there's just a confusing and very generic > "I crashed" message caused by the segfault. The code for Get() in > sc/source/ui/docshell/impex.cxx contains an assert(false), but that's hardly > better (and I guess in my distro assertions must have been disabled, boo). > I'll agree that most users wouldn't be able to understand a failed-dlopen > message, but at least they'd have something they could put into a bug report > or read off to an expert. > > Now, as to the error message: decoding the error structure from dlopen, the > offending file is /usr/lib64/libreoffice/program/libscfiltlo.so and the > problem is an undefined symbol, > _ZN5orcus13create_filterENS_8format_tEPNS_11spreadsheet5iface14import_factory > E, which demangles to orcus::create_filter(orcus::format_t, > orcus::spreadsheet::iface::import_factory*). That symbol doesn't exist in > any version of liborcus that I have installed on the failing machines (16, > 17, and 18--don't ask me why I have three variants because I don't know). > But sure enough, the scratch VM that works has it defined in > /usr/lib64/liborcus-0.18.so.0.0.0. > > And armed with that knowledge, I quickly figured out that a simple update > was needed and oocalc works again! > > But: it's really not OK for C++ functions that return references to return > null pointers. That's an absolute no-no. So please get rid of this code in > ScFormatFilter::Get(): > > return static_cast<ScFormatFilterPlugin*>(nullptr); > > and replace it with proper error handling. Please avoid teaching here. We assert the validity of the pointer; the assert(false) there in the code is for a reason. We consider that situation a hard failure and a *programming* error, to the extent that a crash is an appropriate outcome. We need to find out why the pointer is null, not to handle that. -- You are receiving this mail because: You are the assignee for the bug.
