https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406

--- Comment #14 from Dominik Vogt <vogt at linux dot vnet.ibm.com> ---
> I'm not really happy with Dominik's patch because 1) it doesn't work when
> configuring with --enable-sjlj-exceptions;

Why is that important?

> 2) the current code almost always works, even on S/390, but the patch
> forces us to do a lookup in the FDE table every time we call recover.

The current code works unreliably as s390 uses memcopy to copy call arguments
to the stack.  The control flow introduced by the function call triggers basic
block reordering that may result in anything.

> I'd like to propose a different patch, that keeps the current code
> that almost always works, does a quick exit when there is no defer in
> the call stack, and does an unwind to check for a specific function
> in other cases.

I've not tested the patch yet, but see some potential problems:

* On systems that "use a leading underscore on symbol names", the test for
functions beginning with "__go_" or "_go_" would yield "true" from user
functions named "_go_..." (because the system adds one '_' and the patch strips
it).

* Wouldn't the new patch re-introduce the bug that

  func foo(n int) {
    if (n == 0) { recover(); } else { foo(0); }
  }
  func main() {
    defer foo(1)
    panic("...")
  }

  would recover although it should not?

* The code is even more expensive than the approach I had chosen because now it
needs to fetch a two level backtrace instead of just one level (and probably
each level is more expensive than the one _Unwind_FindEnclosingFunc()).

----


Digging deeper at the issue that causes Lynn's problems on Power surfaces more
problems with the current implementation of __go_can_recover() and the thunk,
also with the patch I've posted.  Here's a summary of all the problems I am
aware of (some new, some known, skipping the bugs introduced by the patch):

Problems with the current implementation only:
----------------------------------------------

1) Calculation of the label address in the thunk does not work if the basic
block reordering is done (that's the issue why this bug report was created)

2) The current checks for "return address + 16" may point into a different
function, allowing recover() in weird situations.

Problems with the current implementation and the proposed patch:
----------------------------------------------------------------

3) Quote from http://golang.org/ref/spec:

"The return value of recover is nil if any of the following conditions holds:
 ...
 *recover was not called directly by a deferred function."

According to the spec, the following code should recover the panic but does
not:

  func main() { defer foo(); panic("..."); }
  func foo() { defer bar(); }
  func bar() { recover(); }

Note that this is also also "broken" in Golang (well, at least in the old
version that comes with Ubuntu).  This may be an effect of imprecise wording of
the spec.

4) __go_can_recover assumes that any call through libffi is allowed to recover.
 Quote from the sources:

  "/* If the function calling recover was created by reflect.MakeFunc,
      then RETADDR will be somewhere in libffi.  Our caller is
      permitted to recover if it was called from libffi.  */"

This violates the specification.  An example that recovers the panic in a
nested function but should not:

--snip--
package main
import "reflect"

func main() {
  // generate foo() and bar()
  fn := reflect.ValueOf(interface{}(&foo)).Elem()
  val := reflect.MakeFunc(fn.Type(), recover_reflect1)
  fn.Set(val)
  fn = reflect.ValueOf(interface{}(&bar)).Elem()
  val = reflect.MakeFunc(fn.Type(), recover_reflect2)
  fn.Set(val)

  defer foo()
  panic("...")
}

var foo func()
func recover_reflect1(args []reflect.Value) (results []reflect.Value) {
  bar()
  return results
}

var bar func()
func recover_reflect2(args []reflect.Value) (results []reflect.Value) {
  recover()
  return results
}
--snip--

Actually, I think this may also happen if bar is not a function created through
reflection but any foreign language function

 * from a stripped object file
   (no name in the backtrace is guessed as "called by libffi"),
 * if the name begins with "[_]ffi_"

(But perhaps it's impossible to construct such an example.)

Reply via email to