On 09/11/2019 03:05, v...@apple.com wrote:
On Nov 8, 2019, at 1:17 AM, Pavel Labath <pa...@labath.sk> wrote:
On 08/11/2019 01:33, via lldb-commits wrote:
Hey JF, Pavel,
We're still seeing crashes due to SIGPIPE on some lldb bots. This workaround in
the lldb driver is insufficient, because liblldb.dylib may install its own a
fresh set of llvm signal handlers (the `NumRegisteredSignals` global is not
shared by all images loaded in a process). Here is a trace that illustrates the
issue: https://gist.github.com/vedantk/2d0cc1df9bea9f0fa74ee101d240b82c.
I think we should fix this by changing llvm's default behavior: let's have it
ignore SIGPIPE. Driver programs (like clang, dwarfdump, etc.) can then opt-in
to exiting when SIGPIPE is received. Wdyt?
Ah, yes.. I should've realized that wasn't enough.
I agree (and I've alluded to this in the past) that changing the llvm behavior
is the best option, though ideally, I'd take this a step further, and have llvm
avoid installing *any* signal handlers by default.
This kind of race is not just limited to SIGPIPE. Other signals suffer from the
same issues too, it's just that the manifestation of that is more subtle.
lldb and liblldb are still racing in who sets the SIGSEGV (etc.) handlers last.
Since the main effect of those handlers is to produce a backtrace and exit, you
won't notice this most of the time. But another effect of these handlers is to
run the RemoveFileOnSignal actions, and so the set of files removed might
differ since the two lists of files to delete are independent too.
(BTW, the fact that this is two copies of llvm racing with each other here is
unfortunate but irrelevant for this discussion -- the same kind of problem
would occur if we had llvm and some other entity both trying to handle the same
signals.)
One wrinkle is that the two copies of llvm will track different sets of files
to remove on a signal. In practice, this doesn't seem to be an issue. But the
current situation is that if the lldb process gets a SIGINT, any
files-to-be-removed registered by liblldb get removed, but any
files-to-be-removed registered by the lldb driver stick around.
I think the right behavior for the llvm *library* should be to not install
*any* signal handlers by default. Instead it should expose some API like
`runAbnormalExitActions()`, which it's *users* can invoke from a signal
handler, if they choose so.
Then, the client programs can opt into installing the signal handlers and
calling this function. This can be done as a part of InitLLVM, and it can be
the default behavior of InitLLVM. The lldb driver can then pass some flag to
InitLLVM to disable this handling, or it can just continue to override it after
the fact, like it does now.
One advantage of this approach is that existing llvm tools without custom
signal handling wouldn't have to change.
To make this work, the dylib would need to install an interrupt handler that 1)
runs llvm::runAbnormalExitActions() to clean up files, then 2) locates & runs
the lldb driver's handler (to do `GetDebugger().DispatchInputInterrupt()`).
And we couldn't use a static initializer to do it, because *then*
liblldb.dylib's handler would get installed before the lldb driver's. I think
this approach necessitates a special entry point into the dylib that just
installs handlers.
That's *almost* what I was thinking. The main difference is that I
didn't want liblldb to be installing the signal handlers (as it is also
a library, and I don't think *any* library should be doing that by default).
Instead I'd expose an SB api to run the cleanup actions
(SBHostOS::RunAbnormalExitActions() ?), and leave the signal
installation to the lldb driver, whose handler can then call the SB
function.
However, all of this is only needed if we _really_ care that the cleanup
actions inside liblldb get actually run. I'm not sure if that's the case
right now.
Some alternatives include:
- Add a static initializer to liblldb.dylib that copies the workaround in the
driver. This should work fine, but it's a duct tape on top of a bandaid.
- Have the dynamic linker coalesce all copies of `NumRegisteredSignals` in a
process. This would incur an app launch time hit on iOS (where libllvm is hot
code), and it doesn't seem very portable.
A third option might be to link liblldb and the lldb driver to libllvm
dynamically. That would solve this problem by not having two copies of llvm
around, but it would also come with some runtime and binary size costs...
Yeah, that might be too big a hammer.
I just talked to Jonas about it offline and both of us think we should just
make llvm's current SIGPIPE handling opt-in, as (imo) it's a lot simpler than
the alternatives we've mentioned so far. @Pavel it sounded like you were ok
with this in principle -- anyone have any objections?
I think it's "acceptable", but I'm not really thrilled with it, as it
still seems fairly bandaid-y. I'm pretty sure we'll run into other
signal handler race problems down the line, the first candidate being
SIGINT. It is nonetheless a step in the (imo) right direction, as you'll
need to set up some mechanism for opting this one signal, which can
later hopefully be extended to other signals.
That said, how much do we actually care about running these abnormal
exit actions? Could we just make a change that makes the installation of
_all_ signal handers optional, and leave the business of hooking up the
two cleanup lists to someone who actually needs that?
pl
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits