Sun Jian <[email protected]> writes: > Generic XDP devmap multi redirect uses skb_clone() for the > intermediate destinations and sends the last destination with the > original skb. This can leave multiple destinations sharing the same > packet data. > > This becomes visible when a devmap egress program mutates packet data. > One destination can observe changes made for another destination. The > last-destination path has the same problem: the last destination runs on > the original skb, so its egress program can modify packet data still > shared with earlier cloned skbs. > > Native XDP broadcast redirect does not have this issue because > xdpf_clone() copies the frame data for each destination. Generic XDP > should provide the same per-destination isolation before running a > devmap egress program. > > Fix this by making cloned skbs private in dev_map_generic_redirect() > before running the devmap egress program. Use skb_copy() instead of > skb_unshare() so that allocation failure does not consume the skb and > the existing caller error paths keep their ownership semantics. > > Add a selftest that covers the last-destination case where earlier > destinations do not have a devmap egress program, while the final > destination does. > > Tested with: > ./test_progs -t xdp_veth_egress > ./test_progs -t xdp_veth > ./test_progs -t xdp > > Fixes: e624d4ed4aa8 ("xdp: Extend xdp_redirect_map with broadcast support") > Suggested-by: Jiayuan Chen <[email protected]> > Signed-off-by: Sun Jian <[email protected]>
With a few nits (see below): Reviewed-by: Toke Høiland-Jørgensen <[email protected]> > --- > v1: > https://lore.kernel.org/bpf/CABFUUZFimdrZdq=NWi+N-0sJZWvMwY=f4iF6-3TVMS8=m07...@mail.gmail.com/ > > Changes in v2: > - Move the private-copy step into dev_map_generic_redirect() so the > last-destination path is covered as well. > - Use skb_copy() instead of skb_unshare() to keep caller ownership > unchanged on allocation failure. > - Add a generic XDP last-destination selftest case. > > kernel/bpf/devmap.c | 10 ++ > .../selftests/bpf/prog_tests/test_xdp_veth.c | 151 +++++++++++++++++- > 2 files changed, 158 insertions(+), 3 deletions(-) > > diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c > index cc0a43ebab6b..59f267685bc6 100644 > --- a/kernel/bpf/devmap.c > +++ b/kernel/bpf/devmap.c > @@ -700,12 +700,22 @@ int dev_map_enqueue_multi(struct xdp_frame *xdpf, > struct net_device *dev_rx, > int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff > *skb, > const struct bpf_prog *xdp_prog) > { > + struct sk_buff *nskb; nit: this definition could go inside the if statement block below, to make it obvious that nskb is not used outside that branch. > int err; > > err = xdp_ok_fwd_dev(dst->dev, skb->len); > if (unlikely(err)) > return err; > > + if (dst->xdp_prog && skb_cloned(skb)) { > + nskb = skb_copy(skb, GFP_ATOMIC); > + if (!nskb) > + return -ENOMEM; > + > + consume_skb(skb); > + skb = nskb; > + } > + > /* Redirect has already succeeded semantically at this point, so we just > * return 0 even if packet is dropped. Helper below takes care of > * freeing skb. > diff --git a/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c > b/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c > index 3e98a1665936..1f0b9ade12fe 100644 > --- a/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c > +++ b/tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c It's customary to split changes to the selftests into their own commits. -Toke

