That's such a good idea I had it too. :)

Gabe
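A minimal sketch of the timing-insensitive tracediff suggested in the message quoted below, assuming a hypothetical trace layout of `tick: pc: text` per line (the real M5 trace format may differ, and the function names here are illustrative only). It drops the tick field, keeps only records whose PC lies in the x86-64 user half of the address space so that interrupt handlers running at different times do not disturb the comparison, and reports the first point where the two user-level streams disagree. It also assumes a single user process, as Geoff's question implies.

```python
from itertools import zip_longest

# Lower (user) half of the x86-64 canonical address space ends here.
USER_SPACE_LIMIT = 0x0000_8000_0000_0000


def user_stream(lines):
    """Yield (pc, text) for user-mode records, ignoring the tick field.

    Assumes each record looks like "tick: pc: text" (hypothetical layout).
    """
    for line in lines:
        parts = [p.strip() for p in line.split(":", 2)]
        if len(parts) != 3:
            continue  # not a record in the assumed layout
        _tick, pc_text, text = parts
        try:
            pc = int(pc_text, 16)
        except ValueError:
            continue
        if pc < USER_SPACE_LIMIT:  # drop kernel/interrupt instructions
            yield pc, text


def first_divergence(trace_a, trace_b):
    """Return (index, record_a, record_b) at the first mismatch, or None."""
    pairs = zip_longest(user_stream(trace_a), user_stream(trace_b))
    for index, (rec_a, rec_b) in enumerate(pairs):
        if rec_a != rec_b:
            return index, rec_a, rec_b
    return None
```

Calling `first_divergence(open("a.trace"), open("b.trace"))` (file names are placeholders) would give the index into the filtered user-level stream, which can then be mapped back to the full traces.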
Quoting Geoffrey Blake <[email protected]>:

> Gabe,
>
> Why not do a tracediff, but do so by just comparing the user-code
> execution stream (these are single threaded, right?) and by not looking
> at the timing information? The user program should follow the same
> execution path and values, I would think. That way you can at least
> track down where the pointer demangling values diverge and cause the
> segfault.
>
> Geoff
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf
> Of Gabe Black
> Sent: Tuesday, January 27, 2009 5:24 AM
> To: M5 Developer List
> Subject: Re: [m5-dev] debugging user level code in FS
>
> I looked at this some more, and it appears that the thing reading the
> pointers is getting the values from the same physical page as some heap
> data structures. It looks like a big chunk of contiguous linked list
> nodes, so when a random piece of data was accessed there was a good
> chance it was a pointer. As sort of a test I turned down the number of
> TLB entries to 1 and the segfault seems to have gone away. I'd like to
> be able to do some sort of tracediff to see what changed, but
> unfortunately the timing will be different and interrupts will come at
> different times. Does anyone have a suggestion for how to handle this?
>
> Gabe
>
> Gabe Black wrote:
> > That was what I had thought at one point, but it doesn't randomize the
> > address, it makes it totally invalid. The end result is actually in a
> > class of illegal addresses as defined by the architecture. I found a
> > comment having to do with it where they said they "encrypted" (their
> > quote marks) the pointers since they can't write protect them.
> >
> > I spent about 12 hours digging around with traces and some C and I've
> > figured out the following.
> > 1. The pointer is run through PTR_DEMANGLE in the function _IO_link_in
> > as part of establishing a "cleanup" region.
> > 2. The pointer is part of a structure of pthread function pointers
> > which should be initialized by a function called by a function called
> > by... _init, when glibc is compiled in the right way.
> > 3. The path of the code which gets to _IO_link_in goes through a
> > function _IO_file_init or something like that, but I wasn't able to
> > identify the code it was called from. The PCs look like they're from
> > where lib-linux.so is mapped, but I can find no similar body of code in
> > that, libc.so or /sbin/init. It's saving a context onto the stack and
> > then branching to a function pointer, so I don't think it's junk code
> > though.
> > 4. Figuring out a raw stream of assembly from a similarly versioned C
> > file buried in macros with no symbols and ambiguous (dynamic) addressing
> > is like shoving pencils in your eyes.
> >
> > At this point I'm basically giving up trying to figure out what's going
> > on like this, since it's so painful to try to piece together and I think
> > I've gone about as far as I can with it. I need to either recompile
> > those binaries with symbols, which may change the result, or set up a
> > golden model to compare against, like legion for SPARC. A golden model
> > is probably a very useful long term investment, so I think that's what
> > I'm favoring.
> >
> > Gabe
> >
> > Clint Smullen wrote:
> >
> >> Is it part of ASR (address-space randomization)? If so, then it
> >> couldn't be used in static linking, which would explain the
> >> discrepancy.
> >>
> >> On Jan 25, 2009, at 4:21 AM, Gabe Black wrote:
> >>
> >>> I figured out what's going on. This is a glibc security measure where
> >>> they scramble pointers people might try to maliciously tamper with.
> >>> If you change the pointer, you (supposedly) can't easily tell what
> >>> you're changing it to. The problem here is that the pointer it's
> >>> unscrambling is just a normal pointer, so it's really scrambling it.
> >>> A likely reason I've never seen this before is that this mechanism
> >>> isn't used for static binaries for some reason.
> >>>
> >>> Gabe
> >>>
> >>> Gabe Black wrote:
> >>>
> >>>> I figured part of this out (objdump is my new best friend), but I
> >>>> ran into another mystery. The problem I'm trying to diagnose is that
> >>>> the init process gets a segfault at some point as it comes up
> >>>> because it jumps to a totally bizarre and meaningless pc. The
> >>>> dynamic linker has finished setting up the executable, and
> >>>> /lib/libc.so.6 is doing something for a while. Eventually, it loads
> >>>> a value from memory, rotates it right by 17 bits, xors it against
> >>>> something it got out of TLS, and then jumps to the (totally wrong)
> >>>> address. I have no idea how this is supposed to do anything but
> >>>> break horribly, so I was wondering if anyone else has ever seen
> >>>> something like this before and knows what the heck libc is trying to
> >>>> do. I've verified that this piece of code really does exist in libc
> >>>> (below), but unfortunately there's no debug information so I can't
> >>>> easily trace it to the originating C. The fact that the REX prefix
> >>>> (0x48) exists in there an unusually large number of times makes me
> >>>> confident this is actually code.
> >>>>
> >>>> Gabe
> >>>>
> >>>> 6a383: 48 8b 05 e6 3a 2d 00    mov    0x2d3ae6(%rip),%rax    # 33de70 <argp_program_version_hook+0x1a0>
> >>>> 6a38a: 48 c1 c8 11             ror    $0x11,%rax
> >>>> 6a38e: 64 48 33 04 25 30 00    xor    %fs:0x30,%rax
> >>>> 6a395: 00 00
> >>>> 6a397: ff d0                   callq  *%rax
> >>>>
> >>>> Gabe Black wrote:
> >>>>
> >>>>> Recently I've had the good fortune of discovering what a pain it is
> >>>>> to try to debug user level code in FS with no symbols. I'm going to
> >>>>> think about ways to make the process easier, and I was wondering if
> >>>>> there are any tricks other people know of that might help. Also,
> >>>>> does anyone know of a good way to use gdb on the dynamic linker
> >>>>> binary? I know /sbin/init is dynamically linked and I know where
> >>>>> the linker gets stuck in memory, but I haven't been able to get gdb
> >>>>> to tell me anything useful about it
> >>>>> (/lib64/ld-linux-x86-64.so.2). For instance, I'd like to
> >>>>> disassemble its entry point, but gdb insists it can't read it for
> >>>>> some reason. The entry point according to the ELF headers is 0xb10,
> >>>>> and from tracing m5 it appears that that's being relocated to
> >>>>> 0x2b8ff0abfb10. I'm sure I can get the bytes I need with, for
> >>>>> instance, hexedit, but that's a little too masochistic I think.
> >>>>>
> >>>>> Gabe
> >>>>> _______________________________________________
> >>>>> m5-dev mailing list
> >>>>> [email protected]
> >>>>> http://m5sim.org/mailman/listinfo/m5-dev
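For reference, the rotate-and-xor sequence in the disassembly quoted above matches the shape of glibc's x86-64 pointer demangling (rotate right by 17, then xor with a per-thread guard read from %fs:0x30). The sketch below models that arithmetic in Python; the guard value is made up for illustration, the helper names are hypothetical, and the sample address is just the relocated entry point mentioned in the thread. It shows why demangling a pointer that was never mangled tends to produce an address that is not even canonical on x86-64, consistent with the "class of illegal addresses" observation.

```python
MASK64 = (1 << 64) - 1


def rol64(value, count):
    """Rotate a 64-bit value left by count bits."""
    return ((value << count) | (value >> (64 - count))) & MASK64


def ror64(value, count):
    """Rotate a 64-bit value right by count bits."""
    return ((value >> count) | (value << (64 - count))) & MASK64


def ptr_mangle(ptr, guard):
    """Model of the mangle step: xor with the guard, then rotate left 17."""
    return rol64(ptr ^ guard, 17)


def ptr_demangle(value, guard):
    """Model of the quoted code: rotate right 17, then xor with the guard."""
    return ror64(value, 17) ^ guard


def is_canonical(addr):
    """True if bits 63..47 are all equal (48-bit x86-64 canonical form)."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


# Hypothetical guard; the real one lives at %fs:0x30 in the running process.
guard = 0x1234_5678_9ABC_DEF0
plain = 0x2B8F_F0AB_FB10  # example user-space address taken from the thread

round_trip = ptr_demangle(ptr_mangle(plain, guard), guard)  # recovers plain
scrambled = ptr_demangle(plain, guard)  # demangling an unmangled pointer
```

With these inputs `round_trip` equals `plain`, while `is_canonical(scrambled)` is false, so a `callq *%rax` on the scrambled value would fault.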
