Re: [networking-discuss] Re: [dtrace-discuss] Re: DTrace Network Provider

Roch Tue, 26 Sep 2006 08:48:16 -0700


I think that streams/packet tracing makes the scope of the
Network provider overwhelming (it would to me at least).


So, I would like to see us make progress on tracing the individual
modules and defer packet tracing to a follow-up.


-r


[EMAIL PROTECTED] writes:
 > Brendan Gregg - Sun Microsystems wrote:
 > 
 > >G'Day Sangeeta,
 > >
 > >On Thu, Sep 21, 2006 at 11:43:55AM -0700, Sangeeta Misra wrote:
 > >[...]
 > >
 > >>>But don't assume that customers care most about seeing code path,
 > >>>or have read the 50,000+ lines that make up tcp.c and ip.c.
 > >>>
 > >>At this point, Sun engineers who work in the Solaris kernel are customers who 
 > >>would want to use the DTrace Network Provider to track the above code paths 
 > >>(to analyse performance bottlenecks and latencies). There will be 
 > >>Solaris kernel developers in the OpenSolaris community in future who will need 
 > >>the same.
 > >>
 > >>Anyway, I am done trying to justify whether the feature I requested is 
 > >>important or not, and how many customers want it. You seem to have some 
 > >>quite strong opinions about what Solaris networking engineers need.
 > >>
 > >
 > >Ok, so you still don't understand what is happening.
 > >
 > >When I'm talking about customers, I'm talking about Solaris users - the
 > >system administrators, developers and staff who use Solaris servers and
 > >would like easy observability tools.
 > >
 > >The world isn't made up of kernel network engineers or customers who care
 > >about kernel code path latencies. Instead there are Solaris users who
 > >run prstat or top from time to time, and would use a similar network summary
 > >tool if we provided one. I can't see the average customer running top in
 > >one window and a kernel code path latency tool in another.
 > >
 > >
 > >I've already said that I want to measure code path latencies too, and that
 > >I think it should be part of a provider. However, since kernel network
 > >engineers understand the kernel network code, they can get their own
 > >statistics from fbt and some sdt, and write their own private provider
 > >for their own pet needs. Are they also "customers" of DTrace? Sure, but
 > >customers who have the skills to satisfy their own private needs.
 > >
 > 
 > I think you're underestimating the complexities involved (or maybe
 > I'm underestimating what dtrace can do?)
 > 
 > What I'd like to see dtrace be able to do is follow a packet through
 > the kernel, not a thread of execution.  fbt is for following execution.
 > sdt is up to how it is used, but most often, it is just random hooks
 > in the code.
 > 
 > The difference between using fbt and being able to follow a packet
 > is that when a packet is put on a queue, the fbt path is terminated,
 > even though the packet still has places to go and I/O ports to see.
 > 
 > If the networking dtrace provider doesn't deliver the capability
 > to trace a packet through the kernel then it is not delivering a
 > key feature that is needed by many people.
 > 
 > Why don't we just hack up our own dtrace provider?  Because it
 > doesn't help in solving problems with installed systems in the
 > field.  In addition, it would take a lot of messing about to do
 > this with fbt or sdt and that doesn't help anyone.
 > 
 > To say that kernel engineers understand the networking code and
 > therefore do not need help in generating stats is not a very
 > ingeneous statement to make and ignores the fact that there are
 > varying levels of capability throughout the organisation, not
 > all of whom are experts.
 > 
 > In addition to the examples you've presented on the web page,
 > I think it would be of benefit if the following questions could
 > also be answered:
 > 
 > - how many PPS is an application responsible for being sent out?
 > - how many PPS is an application responsible for on the receive side?
 > The above two questions repeated for bytes per second, not PPS...or
 > take all of your "questions answered" and repeat for PID.
 > 
 > ...and I think this is the biggest problem with the dtrace networking
 > provider as scoped so far - nothing in it relates to a process.  I'm
 > aware that this can be challenging for networking (especially on the
 > receive side) but there are worthwhile questions here to answer from
 > the generic customer perspective.
 > 
 > I'd dispute that "Are hackers/crackers port scanning my server? (TCP
 > flag matching by IP address)" is actually a question worth asking.
 > If it is directly connected to the Internet, the answer is "yes"
 > (but maybe not right now.)  If there are any other boxes (routers,
 > firewalls) along the way in from the Internet, then it's not a
 > question you should be using dtrace to find an answer for.  If
 > you're trying to come up with a way to justify that particular
 > dtrace probe, I'd recommend looking for a better example question.
 > 
 > Some other probes that might be useful:
 > - RTT calculations from TCP timestamp measurements
 > - TCP window size changes
 > - when a packet is dropped because it is "bad" (checksum, etc)
 > - look through "netstat -s" output as there is quite a large
 >   number of stats there that are worthy of being identified
 >   with a probe.
 > 
 > Darren
 > 
 > P.S. The HREF for dtrace-discuss at the bottom of
 > http://www.opensolaris.org/os/community/dtrace/NetworkProvider/
 > is wrong/broken.
 > 
