[I trace code associated with bl <00001322.plt_call.getenv>
in the two contexts and extend the range over which things
appear to match: up to some point after the branch
b <__glink_PLTresolve> .]

On 2018-Nov-6, at 19:12, Mark Millard <marklmi26-f...@yahoo.com> wrote:

> [I've presented a little information about the longer-existing
> failure's odd backtrace for /libexec/ld-elf.so.1 /bin/ls
> --but on powerpc64 FreeBSD instead of 32-bit powerpc FreeBSD.]
> 
> On 2018-Nov-2, at 11:50, Konstantin Belousov <kostikbel at gmail.com> wrote:
> 
>> On Fri, Nov 02, 2018 at 10:38:08AM -0700, Mark Millard wrote:
>>> On 2018-Nov-2, at 8:52 AM, Konstantin Belousov <kostikbel at gmail.com> 
>>> wrote:
>>> 
>>>> . . .
>>> 
>>> That seems better. But it crashes during /bin/ls execution
>>> ( 0x0180???? addresses ), apparently in a library routine
>>> ( 0x41?????? addresses ):
>>> 
>>> Program received signal SIGSEGV, Segmentation fault.
>>> 0x411220b4 in ?? ()
>>> (gdb) bt
>>> #0  0x411220b4 in ?? ()
>>> #1  0x4112200c in ?? ()
>>> #2  0x01803c84 in ?? ()
>>> #3  0x018023b4 in ?? ()
>>> #4  0x010121a0 in .rtld_start () at 
>>> /usr/src/libexec/rtld-elf/powerpc/rtld_start.S:112
>>> 
>>> Using a normal gdb run of /bin/ls suggests:
>>> 
>>> #2  0x01803c84 in ?? () should be in main and seems to be: bl 0x1818914 
>>> <getopt_long@plt>
>>> #3  0x018023b4 in ?? () should be in _start
>>> 
>>> Looking in the test context:
>>> 
>>>  0x1803c80: bl      0x1818914
>>>  0x1803c84: cmpwi   cr7,r3,-1
>>> 
>>> and:
>>> 
>>>  0x1818914: li      r11,59
>>>  0x1818918: b       0x18186f4
>>> 
>>> and:
>>> 
>>>  0x18186f4: rlwinm  r11,r11,2,0,29
>>>  0x18186f8: addis   r11,r11,386
>>>  0x18186fc: lwz     r11,-30316(r11)
>>>  0x1818700: mtctr   r11
>>>  0x1818704: bctr
>>> 
>>> Breaking at the bctr and using info reg:
>>> 
>>> r11            0x4125ffa0   1093009312
>>> 
>>> It looks like there is some amount of
>>> activity before the traceback addresses
>>> show up.
>>> 
>>> I've not found a good way to fill in the "in ??()"
>>> (or analogous) information. The addresses 0x411220??
>>> do not match up with a normal run of /bin/ls from
>>> gdb: the addresses can not be accessed.
>>> 
>>> 
>>> 
>>> It does appear that the code is in /lib/libc.so.7 in the
>>> test context:
>>> 
>>> Breakpoint 2, reloc_non_plt (obj=0x41041600, obj_rtld=0x41104b57, flags=4, 
>>> lockstate=0x0) at /usr/src/libexec/rtld-elf/powerpc/reloc.c:338
>>> . . .
>>> 
>> There seems to be an issue with the direct execution mode on ppc.
>> Even otherwise working ld-elf.so.1 segfaults if I try to use it as
>> standalone binary.
>> 
>> But if I specify patched ld-elf.so.1 as the interpreter for some program,
>> using 'cc -Wl,-I,<path>/ld-elf.so.1' it works.  So I see there two bugs,
>> one is regression due to textsize calculation, which should be fixed by
>> my patch.  Another is the direct exec problem.
> 
> I've got a little more information about the odd backtrace
> from the /libexec/ld-elf.so.1 /bin/ls failure that the
> prior patch allowed getting to, although for a powerpc64
> example context.
> 
> The information only identifies where the code was
> in /bin/ls and /lib/libc.so.7 in the backtrace. For
> libc.so.7 I found the same code sequences in a gdb of
> /bin/ls directly: I matched one first, used its addresses
> vs. those in the /libexec/ld-elf.so.1 /bin/ls process to
> find the offset for going back and forth, and then used
> that to find the material for the 2nd backtrace address.
> 
> Overall it suggests to me that (in somewhat 
> symbolic terms):
> 
> bl     <00001322.plt_call.getenv>
> 
> eventually led to executing the wrong code.
> 
> 
> The supporting detail is as follows.
> 
> For the /libexec/ld-elf.so.1 part of the backtrace it
> was easy to find where the code was:
> 
> (gdb) run /bin/ls
> Starting program: /libexec/ld-elf.so.1 /bin/ls
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000080118d81c in ?? ()
> (gdb) bt
> #0  0x000000080118d81c in ?? ()
> #1  0x000000080118d920 in ?? ()
> #2  0x0000000010002558 in ?? ()
> #3  0x00000000100037b0 in ?? ()
> #4  0x0000000001018450 in ._rtld_start () at 
> /usr/src/libexec/rtld-elf/powerpc64/rtld_start.S:104
> Backtrace stopped: frame did not save the PC
> 
> (gdb) 
> 101           ld      %r7,128(%r1)    /* exit proc */
> 102           ld      %r8,136(%r1)    /* ps_strings */
> 103   
> 104           blrl    /* _start(argc, argv, envp, obj, cleanup, ps_strings) */
> 105   
> 106           li      %r0,1           /* _exit() */
> 107           sc
> 
> 
> For the /bin/ls part of the backtrace it was easy to
> find where the code was:
> 
> (gdb) symbol-file /bin/ls
> Load new symbol table from "/bin/ls"? (y or n) y
> Reading symbols from /bin/ls...Reading symbols from 
> /usr/lib/debug//bin/ls.debug...done.
> done.
> (gdb) bt
> #0  0x000000080118d81c in ?? ()
> #1  0x000000080118d920 in ?? ()
> #2  0x0000000010002558 in main (argc=<optimized out>, argv=0x80134bdb0) at 
> /usr/src/bin/ls/ls.c:268
> #3  0x00000000100037b0 in _start (argc=<optimized out>, 
> argv=0x3fffffffffffdb70, env=0x3fffffffffffdb88, obj=<optimized out>, 
> cleanup=<optimized out>, ps_strings=<optimized out>)
>    at /usr/src/lib/csu/powerpc64/crt1.c:96
> #4  0x0000000001018450 in ?? ()
> #5  0x0000000000000000 in ?? ()
> 
> (gdb) fr 3 
> #3  0x00000000100037b0 in _start (argc=<optimized out>, 
> argv=0x3fffffffffffdb70, env=0x3fffffffffffdb88, obj=<optimized out>, 
> cleanup=<optimized out>, ps_strings=<optimized out>)
>    at /usr/src/lib/csu/powerpc64/crt1.c:96
> 96            exit(main(argc, argv, env));
> (gdb) down
> #2  0x0000000010002558 in main (argc=<optimized out>, argv=0x80134bdb0) at 
> /usr/src/bin/ls/ls.c:268
> 268           while ((ch = getopt_long(argc, argv,
> 
> 
> 
> For the messy /lib/libc.so.7 part of the backtrace both
> addresses are in getopt_internal. I show extractions from
> the gdb /bin/ls output because it has helpful symbolic
> information displayed. But that means that the addresses
> are offset from those in the bt for the failure process.
> 
> For #1  0x000000080118d920 in ?? () I end up with:
> 
> (gdb) x/32i 0x81019b6c0+0xad0-0x880
>   0x81019b910 <getopt_internal+592>:  stw     r9,0(r18)
>   0x81019b914 <getopt_internal+596>:  addis   r3,r2,-5
>   0x81019b918 <getopt_internal+600>:  addi    r3,r3,30120
>   0x81019b91c <getopt_internal+604>:  bl      0x81018dfe0 
> <00001322.plt_call.getenv>
>   0x81019b920 <getopt_internal+608>:  ld      r2,40(r1)
> 
> (The surrounding machine code all matches that around
> 0x000000080118d920 in the failure context.)
> 
> The getenv call in the source is the 2nd line of:
> 
>        if (posixly_correct == -1 || optreset)
>                posixly_correct = (getenv("POSIXLY_CORRECT") != NULL);
> 
> For #0  0x000000080118d81c in ?? () I end up with:
> 
> (gdb) x/32i 0x81019b6c0+0xad0-0x880-0x110
>   0x81019b800 <getopt_internal+320>:  bne     cr7,0x81019b868 
> <getopt_internal+424>
>   0x81019b804 <getopt_internal+324>:  lwa     r5,0(r29)
>   0x81019b808 <getopt_internal+328>:  stw     r17,0(r18)
>   0x81019b80c <getopt_internal+332>:  cmpw    cr7,r5,r19
>   0x81019b810 <getopt_internal+336>:  bge     cr7,0x81019ba60 
> <getopt_internal+928>
>   0x81019b814 <getopt_internal+340>:  rldicr  r9,r5,3,60
>   0x81019b818 <getopt_internal+344>:  ldx     r10,r20,r9
>   0x81019b81c <getopt_internal+348>:  lbz     r9,0(r10)
> 
> with the failure being that r10 is zero in that last
> line above. Again the surrounding code matches.
> 
> The source code line is reported to be:
> 
>                if (*(place = nargv[optind]) != '-' ||
> 
> I got the line number information from breakpoints 3 and 4
> below (from the gdb /bin/ls process):
> 
> (gdb) info br
> Num     Type           Disp Enb Address            What
> 1       breakpoint     keep y   0x0000000010002360 in main at 
> /usr/src/bin/ls/ls.c:231
>       breakpoint already hit 1 time
> 3       breakpoint     keep y   0x000000081019b81c in getopt_internal at 
> /usr/src/lib/libc/stdlib/getopt_long.c:411
> 4       breakpoint     keep y   0x000000081019b91c in getopt_internal at 
> /usr/src/lib/libc/stdlib/getopt_long.c:379
> 
> Line 379 has the getenv call, matching the machine code showing
> the call.
> 
> (I set the breakpoints just as a way of using "info br" to list
> the information later.)
> 
> Overall this seems to suggest that:
> 
> bl     <00001322.plt_call.getenv>
> 
> led to something odd happening and got to the wrong
> code.
> 
> That is all the additional information that I have
> at this point. I hope it is of some use.
> 

I'll note that the normal case's execution does the
getenv call but does not execute the lbz r9,0(r10)
related code.

I'll also note that for the libc.so.7 code
the gdb /bin/ls code addresses are bigger
than the /libexec/ld-elf.so.1 /bin/ls
addresses by:

0x81019b920 - 0x80118d920 = 0xF00E000

I use this to go back and forth, checking for matching
code as I go.
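
The back-and-forth translation can be sketched in a few lines of
Python. This is only an illustration, not something used in the gdb
sessions: the names DIRECT_ADDR, RTLD_ADDR, to_rtld and to_direct
are hypothetical, and only the addresses come from the output shown
in this message.

```python
# Hypothetical helper: translate libc.so.7 code addresses between the
# two gdb sessions. Only the addresses are taken from the gdb output.
DIRECT_ADDR = 0x81019b920  # getopt_internal+608 in the gdb /bin/ls process
RTLD_ADDR = 0x80118d920    # frame #1 in the /libexec/ld-elf.so.1 /bin/ls process

OFFSET = DIRECT_ADDR - RTLD_ADDR


def to_rtld(direct_addr):
    """Map a gdb /bin/ls address to the ld-elf.so.1 /bin/ls process."""
    return direct_addr - OFFSET


def to_direct(rtld_addr):
    """Map an ld-elf.so.1 /bin/ls address to the gdb /bin/ls process."""
    return rtld_addr + OFFSET


print(hex(OFFSET))                  # 0xf00e000
print(hex(to_rtld(0x81018dfe0)))    # 0x80117ffe0 (plt_call.getenv stub)
print(hex(to_direct(0x80118d81c)))  # 0x81019b81c (the faulting lbz)
```

This assumes that the whole of libc.so.7 is shifted by one constant
between the two processes, which is what the matching code suggests.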

Presenting the normal gdb /bin/ls context first, for
up to b <__glink_PLTresolve> :

I'd already shown:

  0x81019b91c <getopt_internal+604>:    bl      0x81018dfe0 
<00001322.plt_call.getenv>

Looking:

   0x81018dfe0 <00001322.plt_call.getenv>:      std     r2,40(r1)
   0x81018dfe4 <00001322.plt_call.getenv+4>:    ld      r12,480(r2)
   0x81018dfe8 <00001322.plt_call.getenv+8>:    mtctr   r12
   0x81018dfec <00001322.plt_call.getenv+12>:   ld      r11,496(r2)
   0x81018dff0 <00001322.plt_call.getenv+16>:   ld      r2,488(r2)
   0x81018dff4 <00001322.plt_call.getenv+20>:   cmpldi  r2,0
   0x81018dff8 <00001322.plt_call.getenv+24>:   bnectr+ 
   0x81018dffc <00001322.plt_call.getenv+28>:   b       0x81030f3dc <getenv@plt>

As for getenv@plt :

   0x81030f3dc <getenv@plt>:    li      r0,925
   0x81030f3e0 <getenv@plt+4>:  b       0x81030d6c8 <__glink_PLTresolve>


Note that 0x81018dfe0 - 0xF00E000 = 0x80117ffe0 .

Back in the /libexec/ld-elf.so.1 /bin/ls context:

(gdb) bt
#0  0x000000080118d81c in ?? ()
#1  0x000000080118d920 in ?? () [Just after the bl <00001322.plt_call.getenv> .]
#2  0x0000000010002558 in ?? ()
#3  0x00000000100037b0 in ?? ()
#4  0x0000000001018450 in ?? ()
#5  0x0000000000000000 in ?? ()

(gdb) x/i 0x000000080118d920-0x4
   0x80118d91c: bl      0x80117ffe0

So matching what was calculated earlier.
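
As a cross-check, here is a sketch only (the variable names are
hypothetical). The bl shown is a PC-relative branch, so if
libc.so.7 is simply loaded at a different base, the same
instruction should have the same displacement in both contexts:

```python
# Addresses of the bl instruction and its target, as shown by gdb.
direct_bl, direct_target = 0x81019b91c, 0x81018dfe0  # gdb /bin/ls
rtld_bl, rtld_target = 0x80118d91c, 0x80117ffe0      # ld-elf.so.1 /bin/ls

# PC-relative displacement of the branch in each context.
direct_disp = direct_target - direct_bl
rtld_disp = rtld_target - rtld_bl

print(hex(direct_disp), hex(rtld_disp))  # -0xd93c -0xd93c
print(direct_disp == rtld_disp)          # True
```

Equal displacements are what one would expect from identical
instruction bytes at the two load addresses.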

(gdb) x/32i 0x81018dfe0-0xf00e000 
   0x80117ffe0: std     r2,40(r1)
   0x80117ffe4: ld      r12,480(r2)
   0x80117ffe8: mtctr   r12
   0x80117ffec: ld      r11,496(r2)
   0x80117fff0: ld      r2,488(r2)
   0x80117fff4: cmpldi  r2,0
   0x80117fff8: bnectr+ 
   0x80117fffc: b       0x8013013dc

(gdb) x/2i 0x8013013dc
   0x8013013dc: li      r0,925
   0x8013013e0: b       0x8012ff6c8

0x81030d6c8 - 0x8012ff6c8 = 0xF00E000

Still matching up to this point.

So the two contexts seem to match up to
some point after: b <__glink_PLTresolve> .
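
The address pairs compared so far can be checked together with a
small Python sketch. The PAIRS table and its labels are my own
hypothetical arrangement, and only the addresses come from the gdb
output above:

```python
# (label, address in gdb /bin/ls, address in ld-elf.so.1 /bin/ls)
PAIRS = [
    ("faulting lbz in getopt_internal", 0x81019b81c, 0x80118d81c),
    ("return address after bl getenv stub", 0x81019b920, 0x80118d920),
    ("00001322.plt_call.getenv", 0x81018dfe0, 0x80117ffe0),
    ("getenv@plt", 0x81030f3dc, 0x8013013dc),
    ("__glink_PLTresolve", 0x81030d6c8, 0x8012ff6c8),
]

# Every pair should differ by the same constant if the two contexts
# really are running the same libc.so.7 code at a shifted base.
deltas = {direct - rtld for _, direct, rtld in PAIRS}

for label, direct, rtld in PAIRS:
    print(f"{label}: {hex(direct - rtld)}")

print(deltas == {0xF00E000})  # True
```

A single delta for all pairs only says the code locations line up.
It says nothing about what data the PLT and TOC entries hold at
run time.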

I've not looked beyond this.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)

_______________________________________________
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"
