On Fri, Feb 2, 2024 at 12:21 PM Jakub Kicinski <k...@kernel.org> wrote:
>
> On Wed, 31 Jan 2024 10:54:33 +0800 Jason Xing wrote:
> > > [danielj@sw-mtx-051 upstream]$ ethtool -S ens2f1np1 | grep 'stop\|wake'
> > >      tx_queue_stopped: 0
> > >      tx_queue_wake: 0
> > >      tx0_stopped: 0
> > >      tx0_wake: 0
> > >      ....
> >
> > Yes, that's it! What I know is that only mlx drivers have those two
> > counters, but they are very useful when debugging some issues or
> > tracking some historical changes if we want to.
>
> Can you say more? I'm curious what's your use case.

I'm not working at Nvidia, so my point of view may differ from theirs.
From what I can tell, those two counters help me narrow down the scope
when I have to diagnose/debug an issue.
1) I sometimes notice that if an irq is held for too long (one simple
case: printk output being printed to the console), those two counters
reflect the issue.
2) Similarly, in virtio-net, I recently traced such counters (which the
current kernel does not have), and it turned out that one of the output
queues in the backend was behaving badly.
...

Stop/wake queue counters may not directly show the root cause of the
issue, but they help us 'guess' to some extent.
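
A minimal sketch of how such counters could be watched over time,
assuming the driver exposes them through `ethtool -S` with "stop" or
"wake" in the name, as in the mlx output quoted above. The helper names
(parse_stop_wake, delta, sample) are made up for illustration and are
not part of any existing tool:

```python
import re
import subprocess

# Matches lines such as "     tx0_stopped: 0" from `ethtool -S` output.
# Assumption: the counter name contains "stop" or "wake".
COUNTER_RE = re.compile(r"^\s*(\S*(?:stop|wake)\S*):\s*(\d+)\s*$")


def parse_stop_wake(text):
    """Return {name: value} for counters whose name contains stop/wake."""
    counters = {}
    for line in text.splitlines():
        m = COUNTER_RE.match(line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters


def delta(before, after):
    """Return only the counters that changed, with how much they changed."""
    return {
        name: value - before.get(name, 0)
        for name, value in after.items()
        if value != before.get(name, 0)
    }


def sample(ifname):
    """Run `ethtool -S <ifname>` and parse its stop/wake counters."""
    out = subprocess.run(
        ["ethtool", "-S", ifname],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_stop_wake(out)
```

Calling sample() twice a few seconds apart and passing both results to
delta() shows which queues were stopped or woken in between, which is
the kind of hint that helps narrow things down.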

Thanks,
Jason
