Hi Pavel, Eugene, here is a testcase that demonstrates the problem. The code that sets up sigalrm in main() is the same as the program I have been debugging except that I’ve reduced the interval in this testcase. In the original, as per the comment, the interval is .8 seconds.
When I run this under macOS lldb it runs to the end and exits normally without stopping. If I run it under Linux it stops 5 or 6 times with "stop reason = trace” I compile with clang test.c -ldl on both platforms. cheers, --steve #include <dlfcn.h> #include <signal.h> #include <stdio.h> #include <sys/time.h> static int alarms=0; void statistics_sigalarm_handler (int x) { alarms++; } void loadso(char *path) { void *handlep; handlep = dlopen(path, RTLD_LAZY | RTLD_LOCAL); if (handlep == NULL) { printf("dlopen failed %s\n", dlerror()); } } int main() { signal(SIGALRM, statistics_sigalarm_handler); struct itimerval itimerval; struct timeval timeval; timeval.tv_sec = 0; timeval.tv_usec = 10000; // 8 tenths of a second itimerval.it_interval = timeval; itimerval.it_value = timeval; int r = setitimer(ITIMER_REAL, &itimerval, 0); if (r) perror("setitimer"); for (int i=0; i<=100; i++) { #ifdef __APPLE__ loadso(“some_macos_library.dylib"); loadso(“some_other_macos_library.dylib"); #else loadso(“some_linux_library.so"); loadso(“some_other_linux_library.so"); #endif } signal(SIGALRM, SIG_DFL); printf("Got %d signals\n", alarms); } > On Apr 18, 2018, at 6:14 AM, Pavel Labath <lab...@google.com> wrote: > > +Eugene, who wrote the signal-skipping code > > Cool, I'm glad we got to the bottom of this. > > At this point, I *think* we should be able to reproduce the issue > ourselves, but a self-contained reproducer would definitely make it easier > for us and would be appreciated, particularly as there are still a few > things I don't understand here (e.g., I think this should only matter only > after an instruction-step, but your last log seems to show it happens after > a "continue"). > > In any case, I do have a couple of follow-up questions: > - How are you sending the SIGALRM and with what frequency? > - Are you using SIGALRM on macOS as well (implying we handle it correctly > there)? > > Also, you may know this already, but I feel obligated to point out that > setting "--pass false" will change the behavior of your program. > > cheers, > pavel > On Tue, 17 Apr 2018 at 20:59, Steve Ravet <steve.ra...@apple.com> wrote: > >> Ding! That’s it. This program does use SIGALRM. I wouldn’t have > thought it would be enabled at this point, but apparently it is because I > got lots of sigalarm halts during the .so loading. > >> Further, if I run in lldb with > >> process handle -n false -p false -s false SIGALRM > >> then the debugger seems to run fine without stopping during dlopen(). > The SIGALRM itself isn’t important to the operation of the program. > >> With this knowledge I can probably create a simple testcase. Would this > be considered an lldb bug? If so should I submit a testcase in some way? > >> thanks! >> --steve > >> The lover of nature is he whose inward and outward senses are still > truly adjusted to each other; who has retained the spirit of infancy even > into the era of manhood. > > > >> On Apr 17, 2018, at 1:12 PM, Pavel Labath <lab...@google.com> wrote: > >> It's a bit of a wild guess, but is it possible that you (or one of the >> libraries you use) are doing anything with signals (SIGALRM or such?). I >> think I remember looking at the code handling the server-side ignored >> signal handling and thinking that it could go wrong if you get a signal >> while doing a instruction-step. I am not sure it fully applies here as the >> last command that lldb client did was a "continue", but i think it has to >> have something to do with signals, as you end up stopped in a signal >> handler. > >> Could you try the following sequence of commands? >> (lldb) process launch --stop-at-entry-point >> (lldb) process handle --notify true --stop true #Stop on all signals >> (lldb) continue > >> and let us know if you see any extra stops due to signals. If that doesn't >> find anything then I think we'll have start pulling logs from the >> lldb-server side, as there doesn't seem to be anything wrong with the >> client. The easiest way to achieve that is to do a >> export LLDB_SERVER_LOG_CHANNELS="posix all:gdb-remote packets" >> export LLDB_DEBUGSERVER_LOG_FILE=/tmp/server.log >> before launching lldb. >> On Tue, 17 Apr 2018 at 18:28, Steve Ravet <steve.ra...@apple.com> wrote: > >> Pavel asked for a dump of gdb-remote commands. I got that and ran it > >> through the gdbremote decoder, and trimmed to include what looks like the >> last successful continue after breakpoint and then the halt on dlopen. >> Both cases stop on signal 5. > >> After the stop message the debugger issues two binary reads and then > >> apparently makes the decision that it should stop rather than continue. >> The stopping case is missing the equivalent of "Element 1: Single stepping >> past breakpoint site 2 at 0x2aaaaaab9eb0” which is in the continuing case. >> I’ve attached the file here: > > >> thanks, >> --steve > > > >> From the ashes of disaster grow the roses of success. > > > > >> On Apr 17, 2018, at 11:27 AM, Jim Ingham <jing...@apple.com> wrote: > > >> It is interesting that the stop reason on the thread that stopped is > >> "trace". That's what you would expect returning from the single-step to >> step over the breakpoint. But it looks like we got a signal while >> single-stepping, but the stop reason was misreported by somebody. > >> Jim > > > >> On Apr 17, 2018, at 6:00 AM, Pavel Labath via lldb-dev < > >> lldb-dev@lists.llvm.org> wrote: > >> [+lldb-dev] > > >> Hello Steve, > > >> thanks for the report. > > >> The fact that you see the rendezvous breakpoint being hit many times is > >> not > >> surprising. We get those every time the library is loaded (we need that to >> load relevant debug info and set potential breakpoints). However, they >> should generally not be surfaced to the user (unless you have the >> stop-on-sharedlibrary-events setting set, which you don't). > > >> The part that is suspicious to me is that __restore_rt shows up on the top >> of the backtrace. This is a trampoline used to return from signal > >> handlers, > >> and it would seem to indicate that you got some sort of a signal while >> loading the libraries. I don't know why this would happen, but it could be >> that this is confusing lldb's auto-resume logic. > > >> The interesting part to see here is what lldb thinks are the stop reasons >> for individual threads in the process (is the process multi-threaded?) for >> the last couple of stops. The "lldb step" and "gdb-remote packets" log >> categories are the most interesting to observe here. If you are able to >> send me the log traces, I can help you interpret them. > > >> regards, >> pavel > > > > > > >> On Tue, 17 Apr 2018 at 02:27, Steve Ravet via llvm-dev < >> llvm-...@lists.llvm.org> wrote: > > >> Hello lldb developers, I am running into a problem with lldb on Linux. I > > >> am currently running llvm 6.0.0. > > >> I have an executable that dynamically loads a large number of shared > > >> libraries at runtime. These are explicitly loaded via dlopen (they are >> specified in a configuration file), and after loading a few (typically a >> dozen or so, but the number varies) lldb will halt during dlopen. If I >> continue, it will load a few more then halt again, which makes debugging >> from startup impractical since there are so many libraries to be loaded >> (more than a hundred of them). > > >> When I build and debug this same C++ on macOS, the debugger works fine. > > >> I have verified that target.process.stop-on-sharedlibrary-events is false. >> I turned on dyld logging and I see lots of log messages about >> RendezvousBreakpoint being hit, but I don’t see anything that sheds light >> on why some libraries load without stopping but others don’t. > > >> I have tried to recreate this in a trivial program that calls dlopen in a > > >> loop, but haven’t been able to reproduce. > > >> Can your offer any suggestions for further debugging this? More > > >> supporting evidence follows. > > >> Here is the message when the debugger stops: > > > >> Process 120004 stopped >> * thread #1, name = ‘xxxxxxxx', stop reason = trace >> frame #0: 0x00002aaaacfca6a0 libc.so.6`__restore_rt >> libc.so.6`__restore_rt: >> -> 0x2aaaacfca6a0 <+0>: movq $0xf, %rax >> 0x2aaaacfca6a7 <+7>: syscall >> 0x2aaaacfca6a9 <+9>: nopl (%rax) > > > >> libc.so.6`__libc_sigaction: >> 0x2aaaacfca6b0 <+0>: subq $0xd0, %rsp > > > >> I do not have the stop on shared library events setting enabled: > > > >> (lldb) settings show target.process.stop-on-sharedlibrary-events >> target.process.stop-on-sharedlibrary-events (boolean) = false > > > > > >> The backtrace goes back to dlopen: > > > >> (lldb) bt >> * thread #1, name = ‘xxxxx', stop reason = trace >> * frame #0: 0x00002aaaacfca6a0 libc.so.6`__restore_rt >> frame #1: 0x00002aaaaaab9eb0 ld-linux-x86-64.so.2 >> frame #2: 0x00002aaaaaabdc53 ld-linux-x86-64.so.2`dl_open_worker + 499 >> frame #3: 0x00002aaaaaab9286 ld-linux-x86-64.so.2`_dl_catch_error + > > >> 102 > > >> frame #4: 0x00002aaaaaabd63a ld-linux-x86-64.so.2`_dl_open + 186 >> frame #5: 0x00002aaaac39df66 libdl.so.2`dlopen_doit + 102 >> frame #6: 0x00002aaaaaab9286 ld-linux-x86-64.so.2`_dl_catch_error + > > >> 102 > > >> frame #7: 0x00002aaaac39e29c libdl.so.2`_dlerror_run + 124 >> frame #8: 0x00002aaaac39dee1 libdl.so.2`__dlopen_check + 49 > > > >> the dyld debug log has a lot of this: >> 209 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit pid > > >> 153501 stop_when_images_change=false > > >> 210 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit > > >> called for pid 153501 > > >> 211 intern-state DYLDRendezvous::Resolve address size: 8, padding 4 >> 212 intern-state DYLDRendezvous::Resolve cursor = 0x2aaaaaccc160 >> 213 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit pid > > >> 153501 stop_when_images_change=false > > >> 214 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit > > >> called for pid 153501 > > >> 215 intern-state DYLDRendezvous::Resolve address size: 8, padding 4 >> 216 intern-state DYLDRendezvous::Resolve cursor = 0x2aaaaaccc160 > > > > > >> thanks, >> --steve > > > > >> In the woods too, a man casts off his years, as the snake his slough, > > >> and at what period soever of life, is always a child. > > > > >> _______________________________________________ >> LLVM Developers mailing list >> llvm-...@lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > >> _______________________________________________ >> lldb-dev mailing list >> lldb-dev@lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev _______________________________________________ lldb-dev mailing list lldb-dev@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev