On Jun 25, 2013, at 10:19 AM, Jeff Rizzo <r...@tastylime.net> wrote:
> On 6/25/13 10:06 AM, Christos Zoulas wrote:
>> On Jun 25, 9:32am, m...@3am-software.com (Matt Thomas) wrote:
>> -- Subject: Re: DTrace syscall provider - please test/comment
>>
>> |
>> | On Jun 25, 2013, at 5:25 AM, chris...@zoulas.com (Christos Zoulas) wrote:
>> |
>> | > On Jun 24, 6:12pm, m...@3am-software.com (Matt Thomas) wrote:
>> | > -- Subject: Re: DTrace syscall provider - please test/comment
>> | >
>> | > |
>> | > | On Jun 24, 2013, at 6:01 PM, Christos Zoulas <chris...@astron.com> wrote:
>> | > |
>> | > | > Can't this be done as an addition/enhancement to the trace_enter()/
>> | > | > trace_exit() facility instead of having to enter each syscall entry?
>> | > |
>> | > | that only gets called if p->p_trace_enabled is set. So now you need
>> | > | a hook to set that on every lwp switch if the provider is tracing.
>> | >
>> | > Right, and it (dtrace) can set a different (or the same flag) to enable
>> | > it.
>> |
>> |
>> | How does it set the same flag since that's per-proc and will need to changed
>> | on context switch.
>> |
>> | A different flag is more overhead per syscall.
>>
>> I am trying to balance that against adding of two more conditionals per
>> syscall per architecture and touching dozens of source files adding the
>> same code in each one. Perhaps the syscall_plain/syscall_fancy idea
>> was not that bad after all :-( Perhaps a different bit on the same flag.
>> If any of them is set, you call trace enter, and you clear/move the
>> bit on context switch.
>>
>> christos
>
> I am by no means an expert on this part of the kernel, but during the course
> of this project, I noticed that FreeBSD seems to have made a
> kern/subr_syscall.c which has an MI place where they put their entry/exit.
> It seemed a lot cleaner than what we have; I can't speak to performance. Can
> someone with more design clue than I comment on this setup?

I added an inline to <sys/syscall.h>.

int sy_invoke(const struct sysent *, struct lwp *, const void *, register_t *, int code);

which does the trace_enter/trace_exit dance, that can be modified to do the dtrace dance as well.
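
For anyone who wants to see the shape of the idea, here is a rough userland sketch of what such an inline might look like. It is not the committed code. The struct layouts, the sk_-prefixed names, the two flag bits (following the "different bit on the same flag" suggestion quoted above) and the simplified trace_enter()/trace_exit() signatures are all assumptions made up for illustration; the real kernel definitions differ.

```c
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; NOT the real NetBSD layouts. */
typedef long sk_register_t;

struct lwp;
typedef int (*sk_sy_call_t)(struct lwp *, const void *, sk_register_t *);

struct sysent { sk_sy_call_t sy_call; };

/* Assumed bits sharing one per-process flag word. */
#define SK_TRACE_KTRACE 0x1u	/* existing trace_enter/trace_exit users */
#define SK_TRACE_DTRACE 0x2u	/* dtrace syscall provider is enabled */

struct proc { unsigned p_trace_enabled; };
struct lwp { struct proc *l_proc; };

/* Counters so the stub hooks below leave something observable. */
static int sk_enter_calls, sk_exit_calls, sk_dtrace_calls;

/* Simplified stubs; the real functions take more arguments. */
static int
trace_enter(int code, const void *uap)
{
	(void)code; (void)uap;
	sk_enter_calls++;
	return 0;
}

static void
trace_exit(int code, sk_register_t *rval, int error)
{
	(void)code; (void)rval; (void)error;
	sk_exit_calls++;
}

/* Hypothetical dtrace hook: entry != 0 on entry, 0 on return. */
static void
sk_dtrace_probe(int code, int entry)
{
	(void)code; (void)entry;
	sk_dtrace_calls++;
}

/* A toy syscall body used only to exercise the wrapper. */
static int
sk_sys_answer(struct lwp *l, const void *uap, sk_register_t *rval)
{
	(void)l; (void)uap;
	rval[0] = 42;
	return 0;
}

static inline int
sy_invoke(const struct sysent *sy, struct lwp *l, const void *uap,
    sk_register_t *rval, int code)
{
	const unsigned flags = l->l_proc->p_trace_enabled;
	int error;

	/* Fast path: a single test covers every tracing consumer. */
	if (flags == 0)
		return sy->sy_call(l, uap, rval);

	if (flags & SK_TRACE_DTRACE)
		sk_dtrace_probe(code, 1);
	error = (flags & SK_TRACE_KTRACE) ? trace_enter(code, uap) : 0;
	if (error == 0)
		error = sy->sy_call(l, uap, rval);
	if (flags & SK_TRACE_KTRACE)
		trace_exit(code, rval, error);
	if (flags & SK_TRACE_DTRACE)
		sk_dtrace_probe(code, 0);
	return error;
}
```

The point of the sketch is only that the per-architecture syscall code would call sy_invoke() instead of sy->sy_call directly, so the untraced path pays for one flag test and the tracing logic lives in one MI place. How the dtrace bit gets set or cleared on context switch is left out entirely.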