Avinash Duduskar <[email protected]> writes:

> BPF_FIB_LOOKUP_VLAN resolves a VLAN egress. The reverse is also
> useful: an XDP program receiving a VLAN-tagged frame on a physical
> device wants the lookup to behave as if the packet had arrived on the
> corresponding VLAN subinterface, so iif-based policy routing and VRF
> table selection use the right ingress.
>
> Add BPF_FIB_LOOKUP_VLAN_INPUT. When set, params->h_vlan_proto and
> params->h_vlan_TCI are read as an input VLAN tag and the matching VLAN
> device of params->ifindex is resolved with __vlan_find_dev_deep_rcu().
> The device must be up and in the same network namespace as
> params->ifindex (a VLAN device can be moved to another netns while
> registered on its parent; receive would deliver into that other
> namespace, which a lookup here cannot represent). If params->ifindex
> is itself a VLAN device, its inner (QinQ) subinterface is matched.
> For a bond or team, a tag on a port matches no device and returns
> NOT_FWDED; pass the master's ifindex.
> The lookup then runs with the resolved device as the ingress;
> params->ifindex itself is not modified on the input side. When the
> resolved device is enslaved to a VRF, both the full lookup (via the
> l3mdev rule) and BPF_FIB_LOOKUP_DIRECT (via l3mdev_fib_table_rcu())
> select the VRF's table from the resolved ingress. That follows from
> feeding the resolved device to the flow as the ingress
> (fl4.flowi4_iif = dev->ifindex), which is what makes l3mdev resolve
> the VRF master from the subinterface rather than from
> params->ifindex.
>
> The two failure classes get different treatment on purpose. A
> h_vlan_proto other than 802.1Q/802.1ad is API misuse and returns
> -EINVAL, since it would otherwise reach the WARN in vlan_proto_idx()
> with a program-controlled value. An unmatched VID, a device that is
> down, or one in another namespace is a data outcome and returns
> BPF_FIB_LKUP_RET_NOT_FWDED, matching the DIRECT path when
> fib_get_table() finds no table and mirroring real ingress, where the
> receive path drops such frames. A VID of 0 (a priority tag) is looked
> up literally and normally fails the same way; receive instead
> processes such frames untagged, so callers should not set the flag for
> priority tags. Proceeding on the physical device for any of these
> would be fail-open for the policy-routing cases above.
>
> The h_vlan fields share a union with tbid, so the flag cannot be
> combined with BPF_FIB_LOOKUP_TBID. It describes ingress, so it also
> cannot be combined with BPF_FIB_LOOKUP_OUTPUT. Both combinations
> return -EINVAL; restricting now keeps a later relaxation backward
> compatible. Combining with BPF_FIB_LOOKUP_VLAN is allowed: the tag is
> consumed on the ingress side and the egress tag is written on
> success.
>
> Under !CONFIG_VLAN_8021Q the __vlan_find_dev_deep_rcu() stub returns
> NULL, so every lookup with the flag returns NOT_FWDED, which is
> correct since no VLAN device can exist.
>
> Suggested-by: Toke Høiland-Jørgensen <[email protected]>
> Signed-off-by: Avinash Duduskar <[email protected]>

Reviewed-by: Toke Høiland-Jørgensen <[email protected]>

