In addition to what Jon Haslam already said, "instrumentation" in DTrace covers more than one mechanism. Not all DTrace providers enable probes by putting jump or trap instructions into application/kernel code at the probe points. The syscall provider is one that doesn't. Neither application nor kernel code is "instrumented" when you enable a syscall probe. Instead, as Jon showed, the kernel's system call dispatch table is modified: the slots for the syscalls you're probing are replaced with a bounce into the DTrace syscall provider.

Syscalls in Solaris are dispatched through a function call table (sysent). An application making a system call ends up executing a trap instruction (ta 0x8 on SPARC; int/sysenter/syscall/lcall on x86, depending on CPU type) with one CPU register containing the system call number (the list of which you find in <sys/syscall.h>). The trap handler simply checks that number for validity (within the defined range) and then gets the function pointer to call by indexing that table. This is how syscalls get from userland into the kernel: they cause a trap, which is a privilege-switching event.

Why are you not seeing those trap instructions in your app's code? Because they're in libc only. The app is not allowed to care how exactly a system call is done. It calls the libc function via the procedure linkage table (PLT), which ld.so fills in when loading/linking the app.

Try the following to see this:

1. Compile and link your test program.
2. Load it into mdb, but do not run it yet.
3. Disassemble main() and find the PLT:... entries.
4. Put a breakpoint at main (main::bp does it in mdb).
5. Run the program.
6. When it hits the breakpoint, disassemble it again.

You'll find that the PLT:... entries have been replaced by the actual libc function entry points. That's the linker's work. With the default lazy binding, an entry is only resolved the first time it is called, so if PLT:write still looks unresolved at the breakpoint, step past the call to write() or run with LD_BIND_NOW=1 set. If you disassemble those libc funcs, you'll then find the actual 'syscall' instruction (on amd64 it's indeed 'syscall').

Bye,
FrankH.

On Thu, 22 Feb 2007, Peter Boros wrote:

> Hi!
>
> I want to see how the syscall instrumentation works at the assembly
> level, similar to this:
>
>> ufs_write::dis -n 3
> ufs_write: save %sp, -0x110, %sp
> ufs_write+4: stx %i4, [%sp + 0x8bf]
> ufs_write+8: mov %i0, %i5
> ufs_write+0xc: ldx [%i0 + 0x10], %i4
>
>> ufs_write::dis -n 3
> ufs_write: ba,a +0x19814c <0x14c95dc>
> ufs_write+4: stx %i4, [%sp + 0x8bf]
> ufs_write+8: mov %i0, %i5
> ufs_write+0xc: ldx [%i0 + 0x10], %i4
>
>> ufs_write+0x19814c::dis
> 0x14c95b4: sethi %hi(0x1331000), %g1
> 0x14c95b8: call +0x79ebc0e8 <dtrace_probe>
> 0x14c95bc: or %g1, 0xc8, %o7
> 0x14c95c0: sethi %hi(0x4000), %o0
> 0x14c95c4: or %o0, 0x98, %o0
> 0x14c95c8: mov 0x300, %o1
> 0x14c95cc: call +0x79ebc0d4 <dtrace_probe>
> 0x14c95d0: mov %i0, %o2
> 0x14c95d4: ret
> 0x14c95d8: restore
> ---
> 0x14c95dc: save %sp, -0x110, %sp
> 0x14c95e0: sethi %hi(0x4000), %o0
> 0x14c95e4: or %o0, 0x99, %o0
> 0x14c95e8: mov %i0, %o1
> 0x14c95ec: mov %i1, %o2
> 0x14c95f0: mov %i2, %o3
> 0x14c95f4: mov %i3, %o4
> 0x14c95f8: mov %i4, %o5
> 0x14c95fc: sethi %hi(0x1331400), %g1
> 0x14c9600: call +0x79ebc0a0 <dtrace_probe>
> 0x14c9604: or %g1, 0x8c, %o7
>
> So, to examine this, I wrote a program which makes a system call:
> #include <unistd.h>
> int main(int argc, char *argv[]) {
>     write(0,"helloworld\n",11);
>     return 0;
> }
>
> So, I start to examine it with mdb:
> mdb ./syscall
>> main:b
>> :r
> mdb: stop at main
> mdb: target stopped at:
> main: save %sp, -0x68, %sp
>> .::dis
> main: save %sp, -0x68, %sp
> main+4: st %i0, [%fp + 0x44]
> main+8: st %i1, [%fp + 0x48]
> main+0xc: sethi %hi(0x10c00), %o1
> main+0x10: or %o1, 0x90, %o1
> main+0x14: clr %o0
> main+0x18: call +0x100ac <PLT:write>
> main+0x1c: mov 0xb, %o2
> main+0x20: clr [%fp - 0x4]
> main+0x24: clr %i0
> main+0x28: ret
> main+0x2c: restore
> main+0x30: clr %i0
> main+0x34: ret
> main+0x38: restore
>
> Okay, the syscall is there, and dtrace instruments it if I turn on the
> syscall::write:entry probe.
>
> When I try to examine write itself, I get the same results in the
> instrumented and non-instrumented cases (I followed the branches, and it
> is the same after that too):
>> main+0x100ac::dis
> PLT:exit: sethi %hi(0xf000), %g1
> PLT:exit: ba,a -0x40 <PLT:>
> PLT:exit: nop
> PLT:_exit: sethi %hi(0x12000), %g1
> PLT:_exit: ba,a -0x4c <PLT:>
> PLT:_exit: nop
> PLT:write: sethi %hi(0x15000), %g1
> PLT:write: ba,a -0x58 <PLT:>
> PLT:write: nop
> PLT:_get_exit_frame_monitor: sethi %hi(0x18000), %g1
> PLT:_get_exit_frame_monitor: ba,a -0x64 <PLT:>
>
> I tried to ::step the program through the instrumentation, but when the
> probe is on, it consequently crashes at one instruction (with this, at
> some point I should run into dtrace_probe).
>
> How can I see the effect of system call instrumentation at the assembly
> level? Maybe it would be easier if I could compile a static binary. I am
> using nevada build 56 on sparc.
>
> Peter
>
> _______________________________________________
> mdb-discuss mailing list
> mdb-discuss at opensolaris.org