On Wed, Mar 15, 2023 at 09:33:09AM +0100, Simon Horman wrote: > On Tue, Mar 14, 2023 at 06:49:25PM +0100, Ilya Maximets wrote: > > On 3/14/23 12:15, Simon Horman wrote: > > > On Mon, Mar 13, 2023 at 09:34:30PM +0100, Ilya Maximets wrote: > > >> On 3/13/23 11:27, Roi Dayan wrote: > > >>> From: Oz Shlomo <[email protected]> > > >>> > > >>> Currently jumping over a output-to-port action is translated to tc > > >>> mirror action and stolen control action. > > >>> However, the tc control action is not propagated to the hw offload > > >>> action, > > >>> thus the hardware action will mirror the packet and continue to the next > > >>> action. > > >> > > >> This sounds like a bug in the hardware offloading in the kernel/driver. > > >> Shouldn't this be fixed in the kernel? > > > > > > Hi Ilya, > > > > > > I think what we are seeing is a mismatch between how > > > OVS translates flows to TC, and how TC actually works (since forever). > > > > > > It seems to me that fixing the translation, as this patch does, > > > is the right approach. But I could be missing something (often the case). > > > > I'm not a TC expert, but the description sounded like the driver > > or some other code in the kernel is ignoring the TC_ACT_STOLEN. > > If that's correct, that must be a bug in the kernel.
That's correct, but it's more like a tc feature that hasn't been implemented in the HWOL side of it and that shouldn't have been used here. Or a gap, or so, but still. It's not a small piece of code. Tc is super flexible and that flexibility is not there in the driver sometimes. This is only becoming a problem now because with check_pkt_larger() ovs will generate action lists that don't execute entirely. > > > > We could still change the way OVS translates flows for TC in order > > to avoid the kernel bug as long as we're not introducing anything > > nasty in the flow translation logic. But it still remains a kernel > > bug that needs to be fixed regardless. So, I was just making sure > > that the issue is worked on from the kernel side. And if it can > > be fixed in the kernel in a reasonable time frame, then maybe we > > don't need to change OVS after all. This jumping code is fairly > > complex already. > > Thanks Ilya, > > I understand your point. And I agree that the kernel treatment of > TC_ACT_STOLEN needs to be understood/investigated. Nvidia folks please feel free to chime in here, but it seems to me that it that it would be nice to avoid having to implement this in the driver. Perhaps some safety checks, to reject offloading stuff that it knows that it will be broken, and that's it. Well, as Ilya said, as long as it's not nasty from ovs side, of course. As I was saying above, I think the motivation for this patch is check_pkt_larger, on which it could have: actions:check_pkt_larger(value,action1,action2) and we want to stop the execution after action1. Tc allows more than one way of doing it, and from tc perspective, the new way this patch is proposing is fine. But then, it could also be: actions:check_pkt_larger(value,action1,action2),action3 and then we don't want to stop the execution after action1, but continue on action3. I don't know the details on the translation here and optimizations that are done, but there will be a moment that 2 jumps are needed (j1: action1->action3, j2: match->action2), and then the control action that is getting ignored here now, won't even be there (well, it will be, but it will be a PIPE one). Hopefully this long text (sorry) makes sense? Marcelo _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
