Re: svn commit: r330338 - head/sys/amd64/amd64

2018-03-12 Thread Andriy Gapon
On 09/03/2018 21:22, John Baldwin wrote:
> On Saturday, March 03, 2018 03:10:37 PM Andriy Gapon wrote:
>> Author: avg
>> Date: Sat Mar  3 15:10:37 2018
>> New Revision: 330338
>> URL: https://svnweb.freebsd.org/changeset/base/330338
>>
>> Log:
>>   db_nextframe/amd64: catch up with r328083 to recognize fast_syscall_common
>>   
>>   Since that change the system call stack traces look like this:
>> ...
>> sys___sysctl() at sys___sysctl+0x5f/frame 0xfe0028e13ac0
>> amd64_syscall() at amd64_syscall+0x79b/frame 0xfe0028e13bf0
>> fast_syscall_common() at fast_syscall_common+0x101/frame 
>> 0xfe0028e13bf0
>>   So, db_nextframe() stopped recognizing the system call frame.
>>   This commit should fix that.
>>   
>>   Reviewed by:   kib
>>   MFC after: 4 days
>>
>> Modified:
>>   head/sys/amd64/amd64/db_trace.c
>>
>> Modified: head/sys/amd64/amd64/db_trace.c
>> ==
>> --- head/sys/amd64/amd64/db_trace.c  Sat Mar  3 13:20:44 2018
>> (r330337)
>> +++ head/sys/amd64/amd64/db_trace.c  Sat Mar  3 15:10:37 2018
>> (r330338)
>> @@ -212,7 +212,9 @@ db_nextframe(struct amd64_frame **fp, db_addr_t *ip, s
>>  strcmp(name, "Xcpususpend") == 0 ||
>>  strcmp(name, "Xrendezvous") == 0)
>>  frame_type = INTERRUPT;
>> -else if (strcmp(name, "Xfast_syscall") == 0)
>> +else if (strcmp(name, "Xfast_syscall") == 0 ||
>> +strcmp(name, "Xfast_syscall_pti") == 0 ||
>> +strcmp(name, "fast_syscall_common") == 0)
>>  frame_type = SYSCALL;
> 
> I think you actually just want to replace Xfast_syscall with
> fast_syscall_common.  Neither Xfast_syscall nor Xfast_syscall_pti call any
> functions before jumping to the common label, so when unwinding from a system
> call you should always get the common label.  (That is, I think we should
> remove Xfast_syscall and Xfast_syscall_pti here.  Any stack trace that
> happens to find those symbols during unwinding won't have a valid SYSCALL
> frame to unwind.)
> 

I kept / added those to, sort of, decouple db_nextframe from the current
implementation details.  I hope that the extra code does not create too much
overhead.

-- 
Andriy Gapon
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r330338 - head/sys/amd64/amd64

2018-03-09 Thread Bruce Evans

On Fri, 9 Mar 2018, John Baldwin wrote:


On Saturday, March 10, 2018 07:41:30 AM Bruce Evans wrote:

On Fri, 9 Mar 2018, John Baldwin wrote:


On Saturday, March 03, 2018 03:10:37 PM Andriy Gapon wrote:

Author: avg
Date: Sat Mar  3 15:10:37 2018
New Revision: 330338
URL: https://svnweb.freebsd.org/changeset/base/330338

Log:
  db_nextframe/amd64: catch up with r328083 to recognize fast_syscall_common

  Since that change the system call stack traces look like this:
...
sys___sysctl() at sys___sysctl+0x5f/frame 0xfe0028e13ac0
amd64_syscall() at amd64_syscall+0x79b/frame 0xfe0028e13bf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe0028e13bf0
  So, db_nextframe() stopped recognizing the system call frame.
  This commit should fix that.

  Reviewed by:  kib
  MFC after:4 days

Modified:
  head/sys/amd64/amd64/db_trace.c

Modified: head/sys/amd64/amd64/db_trace.c
==
--- head/sys/amd64/amd64/db_trace.c Sat Mar  3 13:20:44 2018
(r330337)
+++ head/sys/amd64/amd64/db_trace.c Sat Mar  3 15:10:37 2018
(r330338)
@@ -212,7 +212,9 @@ db_nextframe(struct amd64_frame **fp, db_addr_t *ip, s
strcmp(name, "Xcpususpend") == 0 ||
strcmp(name, "Xrendezvous") == 0)
frame_type = INTERRUPT;
-   else if (strcmp(name, "Xfast_syscall") == 0)
+   else if (strcmp(name, "Xfast_syscall") == 0 ||
+   strcmp(name, "Xfast_syscall_pti") == 0 ||
+   strcmp(name, "fast_syscall_common") == 0)
frame_type = SYSCALL;


I think you actually just want to replace Xfast_syscall with
fast_syscall_common.  Neither Xfast_syscall nor Xfast_syscall_pti call any
functions before jumping to the common label, so when unwinding from a system
call you should always get the common label.  (That is, I think we should
remove Xfast_syscall and Xfast_syscall_pti here.  Any stack trace that
happens to find those symbols during unwinding won't have a valid SYSCALL
frame to unwind.)


No, it needs these symbols to decode the frame after reaching a point where
the frame is actually set up.


My point is that during the instructions from Xfast_syscall to 
fast_syscall_common
there isn't a valid frame.


I know.


Xfast_syscall has two instructions and hasn't yet
decremented %rsp to create room for the trapframe for example.  If you wanted to
handle the special case of stepping through those functions you'd have to create
a new type of frame that used register values from the saved frame for the
debug trap.  You can't assume that there's a 'struct trapframe' at 'rbp + 16'.


In my version (only written for i386 so far), the strcmp()s are not even
reached for syscalls since t is known and discovered that there is no
trapframe at [er]bp + 16 (since that is not in the kernel stack).  For
nested exceptions, [er]bp + 16 is on the kernel stack and the trap printing
for the reentry is bogus before the frame is set up, but ebp is now a
usually-good frame for the interrupted context and the backtrace continues
(as in -current) back to the top frame with no further problems except for
a wrong name at the top level caused by cross-jumping.


Also, in uncommitted fixes I add some decoding of the non-frame between
the entry point and when the frame is set up.  Then the frame register
is still the user's for the syscall case, so should not even be accessed.
The i386 output looks like this:

current:
XX12: Breakpoint at   Xint0x80_syscall:   pushl   $0x2
XX12: db> t
XX12: Tracing pid 1 tid 11 td 0xc6fad000
XX12: Xint0x80_syscall(33,282,bfbfee0c,3b,0,...) at Xint0x80_syscall/frame 
0xd1d05bd8
XX12: kernload(2,bfbfeec4,bfbfeed0,804812b,80a3d64,...) at 0x805188f/frame 
0xbfbfee7c

fixed:
XX12F: Breakpoint at   Xint0x80_syscall:   pushl   $0x2
XX12F: db> t
XX12F: Tracing pid 1 tid 11 td 0xd4c23000
XX12F: Xint0x80_syscall(2,bfbfeec4,bfbfeed0,804812b,80a3d64,...) at 
Xint0x80_syscall/frame 0xbfbfee7c
XX12F: --- unknown trap, ebp = 0xbfbfee7c, sp = 0xd3399bdc, ks = 0xd3398000, 
kse = 0xd339a000 ---

where most of the values on the last line are for debugging (ebp is the user
stack pointer which will become inaccessible, sp is the local variable sp
and [ks, kse - 1] is the thread's kernel stack range (sp must be in this
else the backtrace stops).


Yes, both of these symbols would only be found instructions for this type of
special frame.  Using the 'SYSCALL' frame type for the Xfast_syscall* symbols
would always be wrong.  Until such time as we have the new frame type we should
just ignore them.


It's to have to have a new frame type.  Setting up the frame takes 20-50
or more instructions and there is a different frame type at every
instruction, possibly multiplied by different setups for different entries
(similarly for doreti except it clearly has only one exit).  (i386 is still

Re: svn commit: r330338 - head/sys/amd64/amd64

2018-03-09 Thread John Baldwin
On Saturday, March 10, 2018 07:41:30 AM Bruce Evans wrote:
> On Fri, 9 Mar 2018, John Baldwin wrote:
> 
> > On Saturday, March 03, 2018 03:10:37 PM Andriy Gapon wrote:
> >> Author: avg
> >> Date: Sat Mar  3 15:10:37 2018
> >> New Revision: 330338
> >> URL: https://svnweb.freebsd.org/changeset/base/330338
> >>
> >> Log:
> >>   db_nextframe/amd64: catch up with r328083 to recognize 
> >> fast_syscall_common
> >>
> >>   Since that change the system call stack traces look like this:
> >> ...
> >> sys___sysctl() at sys___sysctl+0x5f/frame 0xfe0028e13ac0
> >> amd64_syscall() at amd64_syscall+0x79b/frame 0xfe0028e13bf0
> >> fast_syscall_common() at fast_syscall_common+0x101/frame 
> >> 0xfe0028e13bf0
> >>   So, db_nextframe() stopped recognizing the system call frame.
> >>   This commit should fix that.
> >>
> >>   Reviewed by: kib
> >>   MFC after:   4 days
> >>
> >> Modified:
> >>   head/sys/amd64/amd64/db_trace.c
> >>
> >> Modified: head/sys/amd64/amd64/db_trace.c
> >> ==
> >> --- head/sys/amd64/amd64/db_trace.cSat Mar  3 13:20:44 2018
> >> (r330337)
> >> +++ head/sys/amd64/amd64/db_trace.cSat Mar  3 15:10:37 2018
> >> (r330338)
> >> @@ -212,7 +212,9 @@ db_nextframe(struct amd64_frame **fp, db_addr_t *ip, s
> >>strcmp(name, "Xcpususpend") == 0 ||
> >>strcmp(name, "Xrendezvous") == 0)
> >>frame_type = INTERRUPT;
> >> -  else if (strcmp(name, "Xfast_syscall") == 0)
> >> +  else if (strcmp(name, "Xfast_syscall") == 0 ||
> >> +  strcmp(name, "Xfast_syscall_pti") == 0 ||
> >> +  strcmp(name, "fast_syscall_common") == 0)
> >>frame_type = SYSCALL;
> >
> > I think you actually just want to replace Xfast_syscall with
> > fast_syscall_common.  Neither Xfast_syscall nor Xfast_syscall_pti call any
> > functions before jumping to the common label, so when unwinding from a 
> > system
> > call you should always get the common label.  (That is, I think we should
> > remove Xfast_syscall and Xfast_syscall_pti here.  Any stack trace that
> > happens to find those symbols during unwinding won't have a valid SYSCALL
> > frame to unwind.)
> 
> No, it needs these symbols to decode the frame after reaching a point where
> the frame is actually set up.

My point is that during the instructions from Xfast_syscall to 
fast_syscall_common
there isn't a valid frame.  Xfast_syscall has two instructions and hasn't yet
decremented %rsp to create room for the trapframe for example.  If you wanted to
handle the special case of stepping through those functions you'd have to create
a new type of frame that used register values from the saved frame for the
debug trap.  You can't assume that there's a 'struct trapframe' at 'rbp + 16'.

> Also, in uncommitted fixes I add some decoding of the non-frame between
> the entry point and when the frame is set up.  Then the frame register
> is still the user's for the syscall case, so should not even be accessed.
> The i386 output looks like this:
> 
> current:
> XX12: Breakpoint at   Xint0x80_syscall:   pushl   $0x2
> XX12: db> t
> XX12: Tracing pid 1 tid 11 td 0xc6fad000
> XX12: Xint0x80_syscall(33,282,bfbfee0c,3b,0,...) at Xint0x80_syscall/frame 
> 0xd1d05bd8
> XX12: kernload(2,bfbfeec4,bfbfeed0,804812b,80a3d64,...) at 0x805188f/frame 
> 0xbfbfee7c
> 
> fixed:
> XX12F: Breakpoint at   Xint0x80_syscall:   pushl   $0x2
> XX12F: db> t
> XX12F: Tracing pid 1 tid 11 td 0xd4c23000
> XX12F: Xint0x80_syscall(2,bfbfeec4,bfbfeed0,804812b,80a3d64,...) at 
> Xint0x80_syscall/frame 0xbfbfee7c
> XX12F: --- unknown trap, ebp = 0xbfbfee7c, sp = 0xd3399bdc, ks = 0xd3398000, 
> kse = 0xd339a000 ---
> 
> where most of the values on the last line are for debugging (ebp is the user
> stack pointer which will become inaccessible, sp is the local variable sp
> and [ks, kse - 1] is the thread's kernel stack range (sp must be in this
> else the backtrace stops).

Yes, both of these symbols would only be found instructions for this type of
special frame.  Using the 'SYSCALL' frame type for the Xfast_syscall* symbols
would always be wrong.  Until such time as we have the new frame type we should
just ignore them.

> Jumps and labels with names inside functions complicate things.  I think
> fast_syscall_common needs to be in the list too, and the many alltraps
> labels should have been there.  This will be more useful with my fix.
> The label calltrap has always been in the list.  This works right since
> the frame has been set up then -- IIRC it is the first place where the
> frame has been set up, and label it more for gdb than for ddb, and decode
> the frame for ddb (presumably gdb decodes the frame too).

gdb does depend on the names, and I was looking at this commit again to see if
I needed to update gdb.  I thought I didn't, but now I see that gdb was 

Re: svn commit: r330338 - head/sys/amd64/amd64

2018-03-09 Thread Bruce Evans

On Sat, 10 Mar 2018, Bruce Evans wrote:


On Fri, 9 Mar 2018, John Baldwin wrote:


I think you actually just want to replace Xfast_syscall with
fast_syscall_common.  Neither Xfast_syscall nor Xfast_syscall_pti call any
functions before jumping to the common label, so when unwinding from a 
system

call you should always get the common label.  (That is, I think we should
remove Xfast_syscall and Xfast_syscall_pti here.  Any stack trace that
happens to find those symbols during unwinding won't have a valid SYSCALL
frame to unwind.)


No, it needs these symbols to decode the frame after reaching a point where
the frame is actually set up.

Also, in uncommitted fixes I add some decoding of the non-frame between
the entry point and when the frame is set up.  Then the frame register
...



Jumps and labels with names inside functions complicate things.  I think
fast_syscall_common needs to be in the list too, and the many alltraps
labels should have been there.  This will be more useful with my fix.
The label calltrap has always been in the list.  This works right since
the frame has been set up then -- IIRC it is the first place where the
frame has been set up, and label it more for gdb than for ddb, and decode
the frame for ddb (presumably gdb decodes the frame too).


Actually, internal labels like alltraps just break finding the entry
point.  Cross-jumping to such labels also breaks finding the entry
point.  The calltrap label is a work around for this problem -- it is
too hard to trace back to the entry point (e.g., Xpage) or even to
alltraps, so the single fake entry point calltrap is used.  Syscalls
and interrupts are more important, or at least easier to handle, so
the problem was avoided for them by not using cross jumps or too many
internal labels (though I don't like unnamed labels).  This is quite
broken for amd64 now.  Cross-jumps also complicate mcounting.  (The
non-traps are easier to handle since they are not multiplexed through
trap().  The multiplexing can be thought of as making all traps enter
at trap() with a trap code giving the trap number, so no label for the
entry point is needed.  The trap code is an arg, so it is printed in
the stack trace on i386.)

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r330338 - head/sys/amd64/amd64

2018-03-09 Thread Bruce Evans

On Fri, 9 Mar 2018, John Baldwin wrote:


On Saturday, March 03, 2018 03:10:37 PM Andriy Gapon wrote:

Author: avg
Date: Sat Mar  3 15:10:37 2018
New Revision: 330338
URL: https://svnweb.freebsd.org/changeset/base/330338

Log:
  db_nextframe/amd64: catch up with r328083 to recognize fast_syscall_common

  Since that change the system call stack traces look like this:
...
sys___sysctl() at sys___sysctl+0x5f/frame 0xfe0028e13ac0
amd64_syscall() at amd64_syscall+0x79b/frame 0xfe0028e13bf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe0028e13bf0
  So, db_nextframe() stopped recognizing the system call frame.
  This commit should fix that.

  Reviewed by:  kib
  MFC after:4 days

Modified:
  head/sys/amd64/amd64/db_trace.c

Modified: head/sys/amd64/amd64/db_trace.c
==
--- head/sys/amd64/amd64/db_trace.c Sat Mar  3 13:20:44 2018
(r330337)
+++ head/sys/amd64/amd64/db_trace.c Sat Mar  3 15:10:37 2018
(r330338)
@@ -212,7 +212,9 @@ db_nextframe(struct amd64_frame **fp, db_addr_t *ip, s
strcmp(name, "Xcpususpend") == 0 ||
strcmp(name, "Xrendezvous") == 0)
frame_type = INTERRUPT;
-   else if (strcmp(name, "Xfast_syscall") == 0)
+   else if (strcmp(name, "Xfast_syscall") == 0 ||
+   strcmp(name, "Xfast_syscall_pti") == 0 ||
+   strcmp(name, "fast_syscall_common") == 0)
frame_type = SYSCALL;


I think you actually just want to replace Xfast_syscall with
fast_syscall_common.  Neither Xfast_syscall nor Xfast_syscall_pti call any
functions before jumping to the common label, so when unwinding from a system
call you should always get the common label.  (That is, I think we should
remove Xfast_syscall and Xfast_syscall_pti here.  Any stack trace that
happens to find those symbols during unwinding won't have a valid SYSCALL
frame to unwind.)


No, it needs these symbols to decode the frame after reaching a point where
the frame is actually set up.

Also, in uncommitted fixes I add some decoding of the non-frame between
the entry point and when the frame is set up.  Then the frame register
is still the user's for the syscall case, so should not even be accessed.
The i386 output looks like this:

current:
XX12: Breakpoint at   Xint0x80_syscall:   pushl   $0x2
XX12: db> t
XX12: Tracing pid 1 tid 11 td 0xc6fad000
XX12: Xint0x80_syscall(33,282,bfbfee0c,3b,0,...) at Xint0x80_syscall/frame 
0xd1d05bd8
XX12: kernload(2,bfbfeec4,bfbfeed0,804812b,80a3d64,...) at 0x805188f/frame 
0xbfbfee7c

fixed:
XX12F: Breakpoint at   Xint0x80_syscall:   pushl   $0x2
XX12F: db> t
XX12F: Tracing pid 1 tid 11 td 0xd4c23000
XX12F: Xint0x80_syscall(2,bfbfeec4,bfbfeed0,804812b,80a3d64,...) at 
Xint0x80_syscall/frame 0xbfbfee7c
XX12F: --- unknown trap, ebp = 0xbfbfee7c, sp = 0xd3399bdc, ks = 0xd3398000, 
kse = 0xd339a000 ---

where most of the values on the last line are for debugging (ebp is the user
stack pointer which will become inaccessible, sp is the local variable sp
and [ks, kse - 1] is the thread's kernel stack range (sp must be in this
else the backtrace stops).

Jumps and labels with names inside functions complicate things.  I think
fast_syscall_common needs to be in the list too, and the many alltraps
labels should have been there.  This will be more useful with my fix.
The label calltrap has always been in the list.  This works right since
the frame has been set up then -- IIRC it is the first place where the
frame has been set up, and label it more for gdb than for ddb, and decode
the frame for ddb (presumably gdb decodes the frame too).

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r330338 - head/sys/amd64/amd64

2018-03-09 Thread John Baldwin
On Saturday, March 03, 2018 03:10:37 PM Andriy Gapon wrote:
> Author: avg
> Date: Sat Mar  3 15:10:37 2018
> New Revision: 330338
> URL: https://svnweb.freebsd.org/changeset/base/330338
> 
> Log:
>   db_nextframe/amd64: catch up with r328083 to recognize fast_syscall_common
>   
>   Since that change the system call stack traces look like this:
> ...
> sys___sysctl() at sys___sysctl+0x5f/frame 0xfe0028e13ac0
> amd64_syscall() at amd64_syscall+0x79b/frame 0xfe0028e13bf0
> fast_syscall_common() at fast_syscall_common+0x101/frame 
> 0xfe0028e13bf0
>   So, db_nextframe() stopped recognizing the system call frame.
>   This commit should fix that.
>   
>   Reviewed by:kib
>   MFC after:  4 days
> 
> Modified:
>   head/sys/amd64/amd64/db_trace.c
> 
> Modified: head/sys/amd64/amd64/db_trace.c
> ==
> --- head/sys/amd64/amd64/db_trace.c   Sat Mar  3 13:20:44 2018
> (r330337)
> +++ head/sys/amd64/amd64/db_trace.c   Sat Mar  3 15:10:37 2018
> (r330338)
> @@ -212,7 +212,9 @@ db_nextframe(struct amd64_frame **fp, db_addr_t *ip, s
>   strcmp(name, "Xcpususpend") == 0 ||
>   strcmp(name, "Xrendezvous") == 0)
>   frame_type = INTERRUPT;
> - else if (strcmp(name, "Xfast_syscall") == 0)
> + else if (strcmp(name, "Xfast_syscall") == 0 ||
> + strcmp(name, "Xfast_syscall_pti") == 0 ||
> + strcmp(name, "fast_syscall_common") == 0)
>   frame_type = SYSCALL;

I think you actually just want to replace Xfast_syscall with
fast_syscall_common.  Neither Xfast_syscall nor Xfast_syscall_pti call any
functions before jumping to the common label, so when unwinding from a system
call you should always get the common label.  (That is, I think we should
remove Xfast_syscall and Xfast_syscall_pti here.  Any stack trace that
happens to find those symbols during unwinding won't have a valid SYSCALL
frame to unwind.)

-- 
John Baldwin
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"