http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51323

--- Comment #7 from David Kastrup <dak at gnu dot org> 2011-11-29 09:43:44 UTC ---
I agree that the real fix is to force an upgrade of the compiler to a fixed
version.  However, Ubuntu 11.10 has been released and is in circulation, so we
can't reasonably implement that solution until the buggy compilers have had a
reasonable chance to be replaced everywhere.

I have reported this bug to Ubuntu.  If you are right that it can't be found in
4.6 proper, they will have acquired it via distribution specific patches.  What
that means for stability and security of the entire current Ubuntu code base,
one can only guess.

Regarding Lilypond, we have chosen to use -fno-optimize-sibling-calls based on
the gcc version number instead of an actual test, without consideration of the
architecture.  Tracking this bug down cost us several weeks of developer
time and brought down our build infrastructure for a while, until the first
workaround, -fkeep-inline-functions, was discovered by chance.  Lilypond
is a C++ application with considerable parts written in Guile, so segfaults
usually are a problem of forgetting garbage collection protection measures.  As
far as I know, I am the only active programmer with a system programming
background.  When the bug manifests itself in a segfault, the responsible
function is no longer visible in the stack backtrace.  This makes finding the
culprit extremely unfunny.  In our case, the problem was exacerbated because
the last visible caller in the stack backtrace made its call via a function
pointer table.  This table was a C++ vector, and accessing the vector in gdb
was not possible because operator[] had been inlined.  Specifying
-fkeep-inline-functions, which according to its documentation is supposed to
_only_ additionally emit (unused) inline function instantiations that could
have been used for accessing that table in the debugger, made the bug
disappear.
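
For illustration only, the version-based choice described above could be
sketched roughly as follows.  This is not Lilypond's actual build logic: the
names GccVersion, needs_sibling_call_workaround and extra_cxxflags are
hypothetical, and treating every 4.6.x compiler as affected is an assumption
made for the sketch, not something established in this report.

```cpp
#include <string>

// Hypothetical helper types and names; not taken from Lilypond's build system.
struct GccVersion {
  int major_version;
  int minor_version;
};

// ASSUMPTION: every 4.6.x compiler is treated as affected.  The decision
// looks only at the version number and deliberately ignores the target
// architecture, mirroring the approach described in the comment.
constexpr bool needs_sibling_call_workaround(GccVersion v) {
  return v.major_version == 4 && v.minor_version == 6;
}

// Returns the extra flag to pass to the compiler, or an empty string.
inline std::string extra_cxxflags(GccVersion v) {
  return needs_sibling_call_workaround(v) ? "-fno-optimize-sibling-calls" : "";
}
```

A version-only check like this errs on the side of disabling the optimization
on unaffected builds, which is the trade-off accepted above in exchange for
not needing a reliable runtime test.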

There is no sane reason that -fkeep-inline-functions turns off sibling call
optimization, but while sabotaging the debugging of this problem, it at least
gave us a workaround.

So we simply can't afford to deal with this kind of situation more than once.
We don't have the skill sets.  In contrast, the positive results of this
optimization are negligible for us since we don't employ systematic call
chaining (like a P code interpreter using function pointer tables likely
would).
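
To make "systematic call chaining" concrete, here is a minimal sketch of the
kind of code that does profit from sibling call optimization: a toy P-code
interpreter whose handlers each end by calling the dispatcher, which in turn
calls the next handler through a function pointer table held in a C++ vector.
All names (Machine, dispatch, handler_table, run, the opcodes) are invented
for this sketch and appear neither in Lilypond nor in GCC.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical miniature P-code machine, for illustration only.
struct Machine {
  std::vector<int> code;  // opcodes with their inline operands
  std::size_t pc = 0;     // index of the next opcode
  long acc = 0;           // single accumulator register
};

using Handler = long (*)(Machine &);

long dispatch(Machine &m);

// Each handler ends with a call in tail position (a sibling call).  With
// sibling call optimization enabled the compiler may turn that call into a
// jump, so the handler's frame no longer shows up in a later backtrace.
long op_load(Machine &m) { m.acc = m.code.at(m.pc++); return dispatch(m); }
long op_add(Machine &m) { m.acc += m.code.at(m.pc++); return dispatch(m); }
long op_mul(Machine &m) { m.acc *= m.code.at(m.pc++); return dispatch(m); }
long op_halt(Machine &m) { return m.acc; }

enum Op { LOAD, ADD, MUL, HALT };

// Function pointer table stored in a vector, indexed by opcode.
const std::vector<Handler> handler_table = {op_load, op_add, op_mul, op_halt};

long dispatch(Machine &m) {
  assert(m.pc < m.code.size());
  const int op = m.code[m.pc++];
  assert(op >= 0 && static_cast<std::size_t>(op) < handler_table.size());
  return handler_table[op](m);  // indirect sibling call through the table
}

// Runs a program until HALT and returns the accumulator.
long run(std::vector<int> program) {
  Machine m;
  m.code = std::move(program);
  return dispatch(m);
}
```

Without the optimization, a long program in this style grows the stack by two
frames per instruction; with it, the chain can run in constant stack space.
Code that is not structured this way gains little from it.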
