I've been trying to work out why we see frequent stalls during packet
transmission in our GPFS cluster. They show up with the bnx2 driver as well
as with other NICs/drivers, but I've reached the limit of my current
knowledge.  I used the perf netdev events (as described in
http://lwn.net/Articles/397654/) to measure the tx times, and I see spikes
such as the last two rows below:

    dev   len            Qdisc        netdevice                  free
    em2    98 807740.878085sec        0.002msec             0.061msec
    em2    98 807740.878119sec        0.002msec             0.029msec
    em2    98 807741.140600sec        0.005msec             0.092msec
    em2 65226 807742.763833sec        0.007msec             0.436msec
    em2    66 807727.081712sec        0.001msec         16246.072msec
    em2    66 807740.882741sec        0.001msec          3457.625msec


Based on the source for netdev-times.py, the "free" column is the time
elapsed between trace_net_dev_xmit() and trace_kfree_skb() in
net/core/dev.c, but I'm not sure how to dig any deeper.  Are there any
common causes for this behavior?  What's the best way to further break down
the time between the xmit and kfree trace points?
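
For reference, here is a minimal sketch of how I understand the "free"
value to be derived. It is not the actual netdev-times.py code. The event
tuples, the function names and the 100 msec threshold are my own
assumptions for illustration, and the sample values are made up. The idea
is to pair each xmit event with the later free event for the same skb
address and report the gap in msec:

```python
# Sketch only: hypothetical event tuples of
# (timestamp in seconds, event name, skb address).
# This is NOT the real netdev-times.py data model.

def free_latencies(events):
    """Pair each xmit event with the next free event for the same skb.

    Returns a list of (skb_address, latency_msec) in the order the
    skbs were freed.
    """
    xmit_time = {}   # skb address -> timestamp of its xmit event
    results = []
    for ts, name, skbaddr in events:
        if name == "net_dev_xmit":
            xmit_time[skbaddr] = ts
        elif name in ("kfree_skb", "consume_skb") and skbaddr in xmit_time:
            # pop() so a reused skb address is not matched twice
            start = xmit_time.pop(skbaddr)
            results.append((skbaddr, (ts - start) * 1000.0))
    return results


def spikes(latencies, threshold_msec=100.0):
    """Keep only entries slower than the (arbitrary) threshold."""
    return [(skb, ms) for skb, ms in latencies if ms > threshold_msec]


# Made-up sample events, for illustration only.
sample_events = [
    (10.000000, "net_dev_xmit", 0xA),
    (10.000050, "kfree_skb", 0xA),
    (11.0, "net_dev_xmit", 0xB),
    (14.5, "consume_skb", 0xB),
    (15.0, "kfree_skb", 0xC),   # no matching xmit, so it is ignored
]

sample_latencies = free_latencies(sample_events)
sample_spikes = spikes(sample_latencies)
```

With something like this, only the skbs whose free came long after their
xmit are left, which is the set I'd like to investigate further.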
_______________________________________________
Kernelnewbies mailing list
[email protected]
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies