Hello Jakub,

On Thu, Jul 10, 2025 at 06:45:35PM -0700, Jakub Kicinski wrote:
> > +MAX_WRITES: int = 40
> 
> FWIW the test takes 25sec on our debug-heavy VMs right now.
> I think we can crank the writes quite a bit.. ?

Sure. I will increase it to 40. On my VMs I get more than 30 hits every
single time:

        2025-07-11 08:30:08,904 - DEBUG - BPFtrace output: {'hits': 30}
        2025-07-11 08:30:08,904 - DEBUG - MAPS coming from bpftrace = {'hits': 
30}

> > +def ethtool_read_rx_tx_queue(interface_name: str) -> tuple[int, int]:
> > +    """
> > +    Read the number of RX and TX queues using ethtool. This will be used
> > +    to restore it after the test
> > +    """
> > +    rx_queue = 0
> > +    tx_queue = 0
> > +
> > +    try:
> > +        ethtool_result = ethtool(f"-g {interface_name}").stdout
> 
> json=True please and you'll get a dict, on CLI you can try:
> 
> ethtool --json -g eth0

Sure. I was parsing manually because some options do not have --json
format. 

        ethtool --json -l eth0
        ethtool: bad command line argument(s)

I haven't checked upstream, but, if this feature is upstream, is it worth
implementing it?

> > +        ethtool(f"-G {interface_name} rx {rx_val} tx {tx_val}")
> 
> This is setting _ring size_ not queue count.
> I suppose we want both, this and queue count to 1 (with ethtool -l / -L)
> The ring size of 1 is unlikely to work on real devices.
> I'd try setting it to 128 and 256 and if neither sticks just carry on
> with whatever was there.

Thanks. I creating a function to do it. Something as:

        def configure_network(ifname: str) -> None:
        """Configure ring size and queue numbers"""

        # Set defined queues to 1 to force congestion
        prev_queues = ethtool_get_queues_cnt(ifname)
        logging.debug("RX/TX/combined queues: %s", prev_queues)
        # Only set the queues to 1 if they exists in the device. I.e, they are 
> 0
        ethtool_set_queues_cnt(ifname, tuple(1 if x > 0 else x for x in 
prev_queues))
        defer(ethtool_set_queues_cnt, ifname, prev_queues)

        # Try to set the ring size to some low value.
        # Do not fail if the hardware do not accepted desired values
        prev_ring_size = ethtool_get_ringsize(ifname)
        for size in [(1, 1), (128, 128), (256, 256)]:
                if ethtool_set_ringsize(ifname, size):
                # hardware accepted the desired ringsize
                logging.debug("Set RX/TX ringsize to: %s from %s", size, 
prev_ring_size)
                break
        defer(ethtool_set_ringsize, ifname, prev_ring_size)

> > +        "remote_ip": (
> > +            cfg.remote_addr_v["4"] if cfg.addr_ipver == "4" else 
> > cfg.remote_addr_v["6"]
> > +        ),
> 
> this is already done for you
> cfg.addr is either v4 or v6 depending on what was provided in the env

Ack!

> > +# toggle the interface up and down, to cause some congestion
> 
> Let's not do this, you're missing disruptive annotation and for many
> drivers NAPI is stopped before queues
> https://github.com/linux-netdev/nipa/wiki/Guidance-for-test-authors#ksft_disruptive

Sure. This is not needed to reproduce the issue.
I just put it in there in order to create more entropy. Anyway, removing
it.

> > +def main() -> None:
> > +    """Main function to run the test"""
> > +    netcons_load_module()
> > +    test_check_dependencies()
> > +    with NetDrvEpEnv(__file__, nsim_test=True) as cfg:
> 
> I think nsim_test=True will make the test run _only_ on netdevsim.
> But there's nothing netdevsim specific here right?
> You can remove the argument and let's have this run against real
> drivers, too?

Sure. that is our goal, by the end of the day. Being able to run it on
real hardware.


Thanks of the review. I will send an updated version soon, and we can
continue the discusion over there.
--breno

Reply via email to