On Tue, Apr 29, 2025 at 11:44:55AM -0700, Jacob Keller wrote:
> Yes. That's the trouble with the current approach. The VF interface has
> to work well when the VF driver is running different operating systems
> or versions, and if we change the behavior with a new opcode or similar
> that would be difficult.
>
> The reset logic is likely a haphazard mess of different "solutions" to
> various issues we've had. It grew more or less organically out of i40evf
> code from years ago.
>
> Agreed. Obviously, our own testing never caught this. :(

Yes, you need to actually run with promisc on, not just use tcpdump
once in a while.  So someone using the interface connected to a virtual
bridge, who wants promisc so that all traffic is received, and who then
hits a tx hang would see it, but that is probably about the only time
you would have hit it.  Fortunately, tx hangs don't seem to be nearly as
common as they were back in the igbe and ixgbe days.

In my particular case it was enabling promisc mode and then changing the
mtu that very often resulted in losing promisc mode.

> We might be able to get away with improving the PF to stop losing as
> much data, but I worry that could lead to a similar sort of race
> condition as this but in reverse, where VF thinks that it was cleared. I
> guess the VF would send a new config and that would either be a no-op or
> just restore config.
>
> That makes me think this fix to the VF is required regardless of what or
> how we modify the PF.

It seems better to make the VF driver handle it, since you don't know
what kernel version the host is running and hence what it is going to do
when you do a reset (unless you up the API version of course, which
seems excessive just for this, and you would still have to handle the
case when the host is older).

Of course it seems that if the driver wasn't caching the current
settings for promisc and multicast, and simply sent the config every
time any config changed, it would be working, but it would also be
wasteful.  I don't remember when the cache was introduced, but I think
it was done as part of no longer sending one message for promisc and a
separate one for multicast, since that sometimes resulted in the wrong
setting in the end.  But the cache has not been around for the entire
life of the iavf/i40evf driver, so it may in fact have worked in the
past and been accidentally broken as part of fixing the other issue.

-- 
Len Sorensen
