Re: [lldb-dev] [llvm-dev] lldb stops on every call to dlopen

2018-04-18 Thread Steve Ravet via lldb-dev
Hi Pavel, Eugene, here is a testcase that demonstrates the problem.  The code 
that sets up sigalrm in main() is the same as the program I have been debugging 
except that I’ve reduced the interval in this testcase.  In the original, as 
per the comment, the interval is .8 seconds.

When I run this under macOS lldb it runs to the end and exits normally without 
stopping.  If I run it under Linux it stops 5 or 6 times with "stop reason = 
trace".

I compile with

clang test.c -ldl

on both platforms.

cheers,
--steve

#include <dlfcn.h>
#include <signal.h>
#include <stdio.h>
#include <sys/time.h>

// Written from the signal handler, so use an async-signal-safe type.
static volatile sig_atomic_t alarms=0;

void statistics_sigalarm_handler (int x)
{
  (void)x;
  alarms++;
}

void loadso(const char *path)
{
  void *handlep;
  handlep = dlopen(path, RTLD_LAZY | RTLD_LOCAL);
  if (handlep == NULL) {
    printf("dlopen failed %s\n", dlerror());
  }
}

int main()
{
  signal(SIGALRM, statistics_sigalarm_handler);

  struct itimerval itimerval;
  struct timeval timeval;
  timeval.tv_sec = 0;
  timeval.tv_usec = 1;  // reduced for this testcase; the original uses 8 tenths of a second

  itimerval.it_interval = timeval;
  itimerval.it_value = timeval;
  int r = setitimer(ITIMER_REAL, &itimerval, 0);
  if (r) perror("setitimer");

  for (int i=0; i<=100; i++) {
#ifdef __APPLE__
    loadso("some_macos_library.dylib");
    loadso("some_other_macos_library.dylib");
#else
    loadso("some_linux_library.so");
    loadso("some_other_linux_library.so");
#endif
  }
  signal(SIGALRM, SIG_DFL);
  printf("Got %d signals\n", (int)alarms);
}




> On Apr 18, 2018, at 6:14 AM, Pavel Labath  wrote:
> 
> +Eugene, who wrote the signal-skipping code
> 
> Cool, I'm glad we got to the bottom of this.
> 
> At this point, I *think* we should be able to reproduce the issue
> ourselves, but a self-contained reproducer would definitely make it easier
> for us and would be appreciated, particularly as there are still a few
> things I don't understand here (e.g., I think this should matter only
> after an instruction-step, but your last log seems to show it happens after
> a "continue").
> 
> In any case, I do have a couple of follow-up questions:
> - How are you sending the SIGALRM and with what frequency?
> - Are you using SIGALRM on macOS as well (implying we handle it correctly
> there)?
> 
> Also, you may know this already, but I feel obligated to point out that
> setting "--pass false" will change the behavior of your program.
> 
> cheers,
> pavel

Re: [lldb-dev] [llvm-dev] lldb stops on every call to dlopen

2018-04-18 Thread Pavel Labath via lldb-dev
+Eugene, who wrote the signal-skipping code

Cool, I'm glad we got to the bottom of this.

At this point, I *think* we should be able to reproduce the issue
ourselves, but a self-contained reproducer would definitely make it easier
for us and would be appreciated, particularly as there are still a few
things I don't understand here (e.g., I think this should matter only
after an instruction-step, but your last log seems to show it happens after
a "continue").

In any case, I do have a couple of follow-up questions:
- How are you sending the SIGALRM and with what frequency?
- Are you using SIGALRM on macOS as well (implying we handle it correctly
there)?

Also, you may know this already, but I feel obligated to point out that
setting "--pass false" will change the behavior of your program.

cheers,
pavel
On Tue, 17 Apr 2018 at 20:59, Steve Ravet wrote:

> Ding!  That’s it.  This program does use SIGALRM.  I wouldn’t have
> thought it would be enabled at this point, but apparently it is because I
> got lots of sigalarm halts during the .so loading.
>
> Further, if I run in lldb with
>
> process handle -n false -p false -s false SIGALRM
>
> then the debugger seems to run fine without stopping during dlopen().
> The SIGALRM itself isn’t important to the operation of the program.
>
> With this knowledge I can probably create a simple testcase.  Would this
> be considered an lldb bug?  If so should I submit a testcase in some way?
>
> thanks!
> --steve
>
>  The lover of nature is he whose inward and outward senses are still
> truly adjusted to each other; who has retained the spirit of infancy even
> into the era of manhood.




Re: [lldb-dev] [llvm-dev] lldb stops on every call to dlopen

2018-04-17 Thread Steve Ravet via lldb-dev
Ding!  That’s it.  This program does use SIGALRM.  I wouldn’t have thought it 
would be enabled at this point, but apparently it is because I got lots of 
sigalarm halts during the .so loading.

Further, if I run in lldb with 

process handle -n false -p false -s false SIGALRM

then the debugger seems to run fine without stopping during dlopen().  The 
SIGALRM itself isn’t important to the operation of the program.

With this knowledge I can probably create a simple testcase.  Would this be 
considered an lldb bug?  If so should I submit a testcase in some way?

thanks!
--steve

 The lover of nature is he whose inward and outward senses are still truly 
adjusted to each other; who has retained the spirit of infancy even into the 
era of manhood.



> On Apr 17, 2018, at 1:12 PM, Pavel Labath  wrote:
> 
> It's a bit of a wild guess, but is it possible that you (or one of the
> libraries you use) are doing anything with signals (SIGALRM or such?). I
> think I remember looking at the code handling the server-side ignored
> signal handling and thinking that it could go wrong if you get a signal
> while doing an instruction-step. I am not sure it fully applies here as the
> last command that lldb client did was a "continue", but I think it has to
> have something to do with signals, as you end up stopped in a signal
> handler.
> 
> Could you try the following sequence of commands?
> (lldb) process launch --stop-at-entry
> (lldb) process handle --notify true --stop true #Stop on all signals
> (lldb) continue
> 
> and let us know if you see any extra stops due to signals. If that doesn't
> find anything then I think we'll have to start pulling logs from the
> lldb-server side, as there doesn't seem to be anything wrong with the
> client. The easiest way to achieve that is to do a
> export LLDB_SERVER_LOG_CHANNELS="posix all:gdb-remote packets"
> export LLDB_DEBUGSERVER_LOG_FILE=/tmp/server.log
> before launching lldb.

Re: [lldb-dev] [llvm-dev] lldb stops on every call to dlopen

2018-04-17 Thread Pavel Labath via lldb-dev
It's a bit of a wild guess, but is it possible that you (or one of the
libraries you use) are doing anything with signals (SIGALRM or such?). I
think I remember looking at the code handling the server-side ignored
signal handling and thinking that it could go wrong if you get a signal
while doing an instruction-step. I am not sure it fully applies here as the
last command that lldb client did was a "continue", but I think it has to
have something to do with signals, as you end up stopped in a signal
handler.

Could you try the following sequence of commands?
(lldb) process launch --stop-at-entry
(lldb) process handle --notify true --stop true #Stop on all signals
(lldb) continue

and let us know if you see any extra stops due to signals. If that doesn't
find anything then I think we'll have to start pulling logs from the
lldb-server side, as there doesn't seem to be anything wrong with the
client. The easiest way to achieve that is to do a
export LLDB_SERVER_LOG_CHANNELS="posix all:gdb-remote packets"
export LLDB_DEBUGSERVER_LOG_FILE=/tmp/server.log
before launching lldb.

On Tue, 17 Apr 2018 at 18:28, Steve Ravet wrote:

> Pavel asked for a dump of gdb-remote commands.  I got that and ran it
> through the gdbremote decoder, and trimmed to include what looks like the
> last successful continue after breakpoint and then the halt on dlopen.
> Both cases stop on signal 5.
>
> After the stop message the debugger issues two binary reads and then
> apparently makes the decision that it should stop rather than continue.
> The stopping case is missing the equivalent of "Element 1: Single stepping
> past breakpoint site 2 at 0x2aab9eb0" which is in the continuing case.
> I’ve attached the file here:
>
> thanks,
> --steve
>
>  From the ashes of disaster grow the roses of success.




Re: [lldb-dev] [llvm-dev] lldb stops on every call to dlopen

2018-04-17 Thread Steve Ravet via lldb-dev
Pavel asked for a dump of gdb-remote commands.  I got that and ran it through
the gdbremote decoder, and trimmed to include what looks like the last
successful continue after breakpoint and then the halt on dlopen.  Both cases
stop on signal 5.

After the stop message the debugger issues two binary reads and then
apparently makes the decision that it should stop rather than continue.  The
stopping case is missing the equivalent of "Element 1: Single stepping past
breakpoint site 2 at 0x2aab9eb0" which is in the continuing case.  I’ve
attached the file here:

out
Description: Binary data

thanks,
--steve

 From the ashes of disaster grow the roses of success.

> On Apr 17, 2018, at 11:27 AM, Jim Ingham wrote:
>
> It is interesting that the stop reason on the thread that stopped is
> "trace".  That's what you would expect returning from the single-step to
> step over the breakpoint.  But it looks like we got a signal while
> single-stepping, but the stop reason was misreported by somebody.
>
> Jim

Re: [lldb-dev] [llvm-dev] lldb stops on every call to dlopen

2018-04-17 Thread Jim Ingham via lldb-dev
It is interesting that the stop reason on the thread that stopped is "trace".  
That's what you would expect returning from the single-step to step over the 
breakpoint.  But it looks like we got a signal while single-stepping, but the 
stop reason was misreported by somebody.

Jim


> On Apr 17, 2018, at 6:00 AM, Pavel Labath via lldb-dev wrote:
> 
> [+lldb-dev]
> 
> Hello Steve,
> 
> thanks for the report.
> 
> The fact that you see the rendezvous breakpoint being hit many times is not
> surprising. We get those every time the library is loaded (we need that to
> load relevant debug info and set potential breakpoints). However, they
> should generally not be surfaced to the user (unless you have the
> stop-on-sharedlibrary-events setting set, which you don't).
> 
> The part that is suspicious to me is that __restore_rt shows up on the top
> of the backtrace. This is a trampoline used to return from signal handlers,
> and it would seem to indicate that you got some sort of a signal while
> loading the libraries. I don't know why this would happen, but it could be
> that this is confusing lldb's auto-resume logic.
> 
> The interesting part to see here is what lldb thinks are the stop reasons
> for individual threads in the process (is the process multi-threaded?) for
> the last couple of stops. The "lldb step" and "gdb-remote packets" log
> categories are the most interesting to observe here. If you are able to
> send me the log traces, I can help you interpret them.
> 
> regards,
> pavel

Re: [lldb-dev] [llvm-dev] lldb stops on every call to dlopen

2018-04-17 Thread Pavel Labath via lldb-dev
[+lldb-dev]

Hello Steve,

thanks for the report.

The fact that you see the rendezvous breakpoint being hit many times is not
surprising. We get those every time the library is loaded (we need that to
load relevant debug info and set potential breakpoints). However, they
should generally not be surfaced to the user (unless you have the
stop-on-sharedlibrary-events setting set, which you don't).

The part that is suspicious to me is that __restore_rt shows up on the top
of the backtrace. This is a trampoline used to return from signal handlers,
and it would seem to indicate that you got some sort of a signal while
loading the libraries. I don't know why this would happen, but it could be
that this is confusing lldb's auto-resume logic.
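
The disassembly in the original report fits this reading: __restore_rt loads
0xf into %rax and executes syscall, and on x86-64 Linux system call 15 is
rt_sigreturn, the call made when a signal handler returns. A small sketch for
checking that number on a given machine follows; the function name is made up
for illustration, and the check only applies to x86-64 Linux:

```c
#if defined(__linux__)
#include <sys/syscall.h>
#endif

/* Hypothetical check: returns 1 if rt_sigreturn is system call 0xf here
   (the value moved into %rax in the quoted __restore_rt disassembly),
   0 if the number differs, and -1 if the platform is not x86-64 Linux. */
int restore_rt_matches_sigreturn(void)
{
#if defined(__linux__) && defined(__x86_64__) && !defined(__ILP32__)
  return SYS_rt_sigreturn == 0xf ? 1 : 0;
#else
  return -1;
#endif
}
```

On other architectures the trampoline and the system call number differ, so a
result of -1 says nothing either way.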

The interesting part to see here is what lldb thinks are the stop reasons
for individual threads in the process (is the process multi-threaded?) for
the last couple of stops. The "lldb step" and "gdb-remote packets" log
categories are the most interesting to observe here. If you are able to
send me the log traces, I can help you interpret them.

regards,
pavel





On Tue, 17 Apr 2018 at 02:27, Steve Ravet via llvm-dev <
llvm-...@lists.llvm.org> wrote:

> Hello lldb developers, I am running into a problem with lldb on Linux.  I
> am currently running llvm 6.0.0.
>
> I have an executable that dynamically loads a large number of shared
> libraries at runtime.  These are explicitly loaded via dlopen (they are
> specified in a configuration file), and after loading a few (typically a
> dozen or so, but the number varies) lldb will halt during dlopen.  If I
> continue, it will load a few more then halt again, which makes debugging
> from startup impractical since there are so many libraries to be loaded
> (more than a hundred of them).
>
> When I build and debug this same C++ on macOS, the debugger works fine.
> I have verified that target.process.stop-on-sharedlibrary-events is false.
> I turned on dyld logging and I see lots of log messages about
> RendezvousBreakpoint being hit, but I don’t see anything that sheds light
> on why some libraries load without stopping but others don’t.
>
> I have tried to recreate this in a trivial program that calls dlopen in a
> loop, but haven’t been able to reproduce.
>
> Can you offer any suggestions for further debugging this?  More
> supporting evidence follows.
>
> Here is the message when the debugger stops:
>
> Process 120004 stopped
> * thread #1, name = '', stop reason = trace
>     frame #0: 0x2cfca6a0 libc.so.6`__restore_rt
> libc.so.6`__restore_rt:
> ->  0x2cfca6a0 <+0>: movq   $0xf, %rax
>     0x2cfca6a7 <+7>: syscall
>     0x2cfca6a9 <+9>: nopl   (%rax)
>
> libc.so.6`__libc_sigaction:
>     0x2cfca6b0 <+0>: subq   $0xd0, %rsp
>
> I do not have the stop on shared library events setting enabled:
>
> (lldb) settings show target.process.stop-on-sharedlibrary-events
> target.process.stop-on-sharedlibrary-events (boolean) = false
>
> The backtrace goes back to dlopen:
>
> (lldb) bt
> * thread #1, name = 'x', stop reason = trace
>   * frame #0: 0x2cfca6a0 libc.so.6`__restore_rt
>     frame #1: 0x2aab9eb0 ld-linux-x86-64.so.2
>     frame #2: 0x2aabdc53 ld-linux-x86-64.so.2`dl_open_worker + 499
>     frame #3: 0x2aab9286 ld-linux-x86-64.so.2`_dl_catch_error + 102
>     frame #4: 0x2aabd63a ld-linux-x86-64.so.2`_dl_open + 186
>     frame #5: 0x2c39df66 libdl.so.2`dlopen_doit + 102
>     frame #6: 0x2aab9286 ld-linux-x86-64.so.2`_dl_catch_error + 102
>     frame #7: 0x2c39e29c libdl.so.2`_dlerror_run + 124
>     frame #8: 0x2c39dee1 libdl.so.2`__dlopen_check + 49
>
> The dyld debug log has a lot of this:
>
> 209 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit pid 153501 stop_when_images_change=false
> 210 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit called for pid 153501
> 211 intern-state DYLDRendezvous::Resolve address size: 8, padding 4
> 212 intern-state DYLDRendezvous::Resolve cursor = 0x2accc160
> 213 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit pid 153501 stop_when_images_change=false
> 214 intern-state DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit called for pid 153501
> 215 intern-state DYLDRendezvous::Resolve address size: 8, padding 4
> 216 intern-state DYLDRendezvous::Resolve cursor = 0x2accc160
>
> thanks,
> --steve
>
>  In the woods too, a man casts off his years, as the snake his slough,
> and at what period soever of life, is always a child.



> ___
> LLVM Developers mailing list
> llvm-...@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
___
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev