On 09/11/2019 03:05, v...@apple.com wrote:


On Nov 8, 2019, at 1:17 AM, Pavel Labath <pa...@labath.sk> wrote:

On 08/11/2019 01:33, via lldb-commits wrote:
Hey JF, Pavel,
We're still seeing crashes due to SIGPIPE on some lldb bots. This workaround in 
the lldb driver is insufficient, because liblldb.dylib may install its own a 
fresh set of llvm signal handlers (the `NumRegisteredSignals` global is not 
shared by all images loaded in a process). Here is a trace that illustrates the 
issue: https://gist.github.com/vedantk/2d0cc1df9bea9f0fa74ee101d240b82c.
I think we should fix this by changing llvm's default behavior: let's have it 
ignore SIGPIPE. Driver programs (like clang, dwarfdump, etc.) can then opt-in 
to exiting when SIGPIPE is received. Wdyt?

Ah, yes.. I should've realized that wasn't enough.

I agree (and I've alluded to this in the past) that changing the llvm behavior 
is the best option, though ideally, I'd take this a step further, and have llvm 
avoid installing *any* signal handlers by default.

This kind of race is not just limited to SIGPIPE. Other signals suffer from the 
same issues too, it's just that the manifestation of that is more subtle.

lldb and liblldb are still racing in who sets the SIGSEGV (etc.) handlers last. 
Since the main effect of those handlers is to produce a backtrace and exit, you 
won't notice this most of the time. But another effect of these handlers is to 
run the RemoveFileOnSignal actions, and so the set of files removed might 
differ since the two lists of files to delete are independent too.

(BTW, the fact that this is two copies of llvm racing with each other here is 
unfortunate but irrelevant for this discussion -- the same kind of problem 
would occur if we had llvm and some other entity both trying to handle the same 
signals.)

One wrinkle is that the two copies of llvm will track different sets of files 
to remove on a signal. In practice, this doesn't seem to be an issue. But the 
current situation is that if the lldb process gets a SIGINT, any 
files-to-be-removed registered by liblldb get removed, but any 
files-to-be-removed registered by the lldb driver stick around.


I think the right behavior for the llvm *library* should be to not install 
*any* signal handlers by default. Instead it should expose some API like 
`runAbnormalExitActions()`, which it's *users* can invoke from a signal 
handler, if they choose so.

Then, the client programs can opt into installing the signal handlers and 
calling this function. This can be done as a part of InitLLVM, and it can be 
the default behavior of InitLLVM. The lldb driver can then pass some flag to 
InitLLVM to disable this handling, or it can just continue to override it after 
the fact, like it does now.

One advantage of this approach is that existing llvm tools without custom 
signal handling wouldn't have to change.

To make this work, the dylib would need to install an interrupt handler that 1) 
runs llvm::runAbnormalExitActions() to clean up files, then 2) locates & runs 
the lldb driver's handler (to do `GetDebugger().DispatchInputInterrupt()`).

And we couldn't use a static initializer to do it, because *then* 
liblldb.dylib's handler would get installed before the lldb driver's. I think 
this approach necessitates a special entry point into the dylib that just 
installs handlers.

That's *almost* what I was thinking. The main difference is that I didn't want liblldb to be installing the signal handlers (as it is also a library, and I don't think *any* library should be doing that by default). Instead I'd expose an SB api to run the cleanup actions (SBHostOS::RunAbnormalExitActions() ?), and leave the signal installation to the lldb driver, whose handler can then call the SB function. However, all of this is only needed if we _really_ care that the cleanup actions inside liblldb get actually run. I'm not sure if that's the case right now.



Some alternatives include:
- Add a static initializer to liblldb.dylib that copies the workaround in the 
driver. This should work fine, but it's a duct tape on top of a bandaid.
- Have the dynamic linker coalesce all copies of `NumRegisteredSignals` in a 
process. This would incur an app launch time hit on iOS (where libllvm is hot 
code), and it doesn't seem very portable.

A third option might be to link liblldb and the lldb driver to libllvm 
dynamically. That would solve this problem by not having two copies of llvm 
around, but it would also come with some runtime and binary size costs...

Yeah, that might be too big a hammer.

I just talked to Jonas about it offline and both of us think we should just 
make llvm's current SIGPIPE handling opt-in, as (imo) it's a lot simpler than 
the alternatives we've mentioned so far. @Pavel it sounded like you were ok 
with this in principle -- anyone have any objections?

I think it's "acceptable", but I'm not really thrilled with it, as it still seems fairly bandaid-y. I'm pretty sure we'll run into other signal handler race problems down the line, the first candidate being SIGINT. It is nonetheless a step in the (imo) right direction, as you'll need to set up some mechanism for opting this one signal, which can later hopefully be extended to other signals.

That said, how much do we actually care about running these abnormal exit actions? Could we just make a change that makes the installation of _all_ signal handers optional, and leave the business of hooking up the two cleanup lists to someone who actually needs that?

pl
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to