sammccall added a comment.

Thanks for doing this, I think this is incredibly useful for debugging.
But also subtle, so please forgive a bunch of comments!

I don't think we're strictly in defined-behavior territory in much of what 
these signal handlers are doing, but neither is clang's existing crash handlers 
and they seem to work well (main difference is the thread-local access)

Also discussed this a bunch with @kadircet offline.



================
Comment at: clang-tools-extra/clangd/JSONTransport.cpp:114
       if (readRawMessage(JSON)) {
+        ThreadSignalHandler ScopedHandler([JSON]() {
+          auto &OS = llvm::errs();
----------------
this is capturing by value, I don't think it needs to


================
Comment at: clang-tools-extra/clangd/JSONTransport.cpp:117
+          OS << "Signalled while processing message:\n";
+          OS << JSON << "\r\n";
+        });
----------------
why \r?


================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:596
+  /// Called by thread local signal handler.
+  void printRequestContextOnSignal() const;
 
----------------
naming: "context" is a bit vague (and overloaded in clangd), and we often use 
"dump" for things in a debugging context.

Maybe `dumpCurrentRequest()`?


================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:1380
+void ASTWorker::printRequestContextOnSignal() const {
+  auto &OS = llvm::errs();
+  OS << "Signalled during AST action:\n";
----------------
i'd probably pass this in as a param, just to move the "we're printing to 
stderr" and "we're running in a signal handler" decisions closer together.


================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:1381
+  auto &OS = llvm::errs();
+  OS << "Signalled during AST action:\n";
+  auto &Command = FileInputs.CompileCommand;
----------------
the name of the action is available as CurrentRequest->Action, which I think is 
safe to either pass into this function or read directly here.


================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:1383
+  auto &Command = FileInputs.CompileCommand;
+  OS << "Filename: " << Command.Filename << "\n";
+  OS << "Directory: " << Command.Directory << "\n";
----------------
nit: please indent the lines after the first to set this apart from other crash 
output a bit


================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:1392
+  OS << "Contents:\n";
+  OS << FileInputs.Contents << "\n";
+}
----------------
hmm, this is potentially really useful (could gather full reproducers) but also 
is going to dominate the output and make it annoying to see the stack 
traces/other details. I'm worried this may make this feature net-negative for a 
majority of users, who won't find the right parts of the output to report...

How critical do you think this is for debugging crashes in practice? (I suspect 
fairly important!)
Since we're taking liberties with async-signal-safety anyway, it might be 
possible to get this sent to a tempfile, which might be more useful. In any 
case, we might want full file dumps to be off by default outside of 
environments where anyone is likely to actually try to get the reproducers.

I'd probably suggest leaving this out of the initial patch entirely so we can 
focus on getting the rest of the mechanism right.


================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:1392
+  OS << "Contents:\n";
+  OS << FileInputs.Contents << "\n";
+}
----------------
sammccall wrote:
> hmm, this is potentially really useful (could gather full reproducers) but 
> also is going to dominate the output and make it annoying to see the stack 
> traces/other details. I'm worried this may make this feature net-negative for 
> a majority of users, who won't find the right parts of the output to report...
> 
> How critical do you think this is for debugging crashes in practice? (I 
> suspect fairly important!)
> Since we're taking liberties with async-signal-safety anyway, it might be 
> possible to get this sent to a tempfile, which might be more useful. In any 
> case, we might want full file dumps to be off by default outside of 
> environments where anyone is likely to actually try to get the reproducers.
> 
> I'd probably suggest leaving this out of the initial patch entirely so we can 
> focus on getting the rest of the mechanism right.
another thought along the lines of reproducers (but nothing for this patch)...

If we're working on a file and have a preamble, then the headers that make up 
the preamble are relevant. In practice it's hard to reproduce bug reports we 
get because we need all their headers. If we're using a preamble then just 
dumping all the filenames it includes would make it possible for some tool to 
take the crash report and zip up a reproducer.

(This probably needs the ability to run all of the handlers for the current 
thread, not just the top of the stack, since we acquire the relevant preamble 
later)


================
Comment at: clang-tools-extra/clangd/support/ThreadSignalHandler.cpp:13
+
+using namespace clang::clangd;
+
----------------
style nit: we tend do wrap the code in `namespace clang { namespace clangd {` 
instead, at least within clangd


================
Comment at: clang-tools-extra/clangd/support/ThreadSignalHandler.cpp:15
+
+static llvm::sys::ThreadLocal<ThreadSignalHandlerCallback> CurrentCallback;
+
----------------
we assume `thread_local` in clangd (see support/Context.ccp)

It's possible that llvm::sys::ThreadLocal happens to play nicer with signal 
handlers in practice (none of these are completely safe!), but unless we've 
seen some concrete evidence I'd rather just have `static thread_local 
ThreadSignalHandlerCallback*` here


================
Comment at: clang-tools-extra/clangd/support/ThreadSignalHandler.cpp:15
+
+static llvm::sys::ThreadLocal<ThreadSignalHandlerCallback> CurrentCallback;
+
----------------
sammccall wrote:
> we assume `thread_local` in clangd (see support/Context.ccp)
> 
> It's possible that llvm::sys::ThreadLocal happens to play nicer with signal 
> handlers in practice (none of these are completely safe!), but unless we've 
> seen some concrete evidence I'd rather just have `static thread_local 
> ThreadSignalHandlerCallback*` here
we can quite easily support a stack of handlers, such that we dump *all* living 
handlers on the crashing thread.

just give each ThreadSignalHandler a `ThreadSignalHandler* Next = nullptr` 
member, and both the constructor and destructor `swap(Next, CurrentCallback)`. 
Then CurrentCallback points to the head of a linked list of handlers.

Not sure if you want to have that in this patch, but it's pretty simple so I 
wouldn't object.


================
Comment at: clang-tools-extra/clangd/support/ThreadSignalHandler.cpp:17
+
+static void registerSignalHandlerIfNeeded() {
+  static llvm::once_flag RegisterOnceFlag;
----------------
A library may create a ThreadSignalHandler to dump context which may in turn 
install a global signal handler, which can be a surprising side-effect.
In particular, we do have users that embed ClangdServer in different 
environments (not via ClangdMain) and don't use the LLVM crash-handlers.

I think the cleanest thing would be to have ThreadSignalHandler.cpp expose a 
function `RunThreadCrashHandlers()` or so, which is basically PrintInputs, and 
have ClangdMain.cpp call llvm::sys::AddSignalHandler near where it calls 
`PrintStackTraceOnErrorSignal` today


================
Comment at: clang-tools-extra/clangd/support/ThreadSignalHandler.cpp:22
+      ThreadSignalHandlerCallback *Callback = CurrentCallback.get();
+      // TODO: ignore non thread local signals like SIGTERM
+      if (Callback) {
----------------
I think we should do this already, probably just starting with an allowlist of 
`SIGBUS, SIGFPE, SIGILL, SIGSEGV` (these are signals that are delivered to the 
thread that triggered them per the linux docs).

I'm also wondering how portable this assumption/mechanism is, e.g. whether we 
should enable it for certain platforms only. I guess it's probably true for 
unixy things, and apparently mostly true for windows too[1]? I'm a little 
concerned *something* may go wrong on windows, though.

[1] 
https://stackoverflow.com/questions/15879342/which-thread-are-signal-handlers-e-g-signalsigint-crtl-c-called-on


================
Comment at: clang-tools-extra/clangd/support/ThreadSignalHandler.cpp:36
+  registerSignalHandlerIfNeeded();
+  CurrentCallback.set(&Callback);
+}
----------------
we could consider `atomic_signal_fence` around these mutations, or making 
CurrentCallback atomic.

It doesn't get us out of all possible write-reordering problems (we could see 
inconsistent states of the data referred to by the handler) but may avoid some 
catastrophic failures


================
Comment at: clang-tools-extra/clangd/support/ThreadSignalHandler.h:7
+//
+//===----------------------------------------------------------------------===//
+
----------------
Either here in the file comment or on the ThreadSignalHandler class we should 
definitely have an overview of what this is for/how it's used.


================
Comment at: clang-tools-extra/clangd/support/ThreadSignalHandler.h:16
+
+class ThreadSignalHandler {
+public:
----------------
I understand framing this as general signal handler machinery, but I think it 
might be easier to reason about if we give it a name more specific to its 
purpose. Like maybe `ThreadCrashReporter`?

The main reason is that in the general case signal handling calls to mind a 
bunch of concerns like async safety that we're able to be a bit cavalier about 
in the specific case where the signal is (probably) handled synchronously, 
we're crashing anyway, etc. We also don't really want to bother exposing info 
like *which* signal is being handled, but a "signal handler" interface looks 
strange without it.


================
Comment at: clang-tools-extra/clangd/support/ThreadSignalHandler.h:21
+
+  ThreadSignalHandler(ThreadSignalHandler &&);
+  ThreadSignalHandler(const ThreadSignalHandler &) = delete;
----------------
I'm not quite clear on why we need to support move-construction (and the 
move-constructor's interaction with static state!)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109506/new/

https://reviews.llvm.org/D109506

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to