> >> +static int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
> >>  {
> >>  #ifdef CONFIG_XPS
> >>  	struct xps_dev_maps *dev_maps;
> >> -	struct xps_map *map;
> >> +	struct sock *sk = skb->sk;
> >>  	int queue_index = -1;
> >>
> >>  	if (!static_key_false(&xps_needed))
> >>  		return -1;
> >>
> >>  	rcu_read_lock();
> >> -	dev_maps = rcu_dereference(dev->xps_cpus_map);
> >> +	if (!static_key_false(&xps_rxqs_needed))
> >> +		goto get_cpus_map;
> >> +
> >> +	dev_maps = rcu_dereference(dev->xps_rxqs_map);
> >>  	if (dev_maps) {
> >> -		unsigned int tci = skb->sender_cpu - 1;
> >> +		int tci = sk_rx_queue_get(sk);
> >
> > What if the rx device differs from the tx device?
> >
> I think I have 3 options here:
> 1. Cache the ifindex in sock_common which will introduce a new
> additional field in sock_common.
> 2. Use dev_get_by_napi_id to get the device id. This could be expensive,
> if the rxqs_map is set, this will be done on every packet and involves
> walking through the hashlist for napi_id lookup.

The tx queue mapping is cached in the sk for connected sockets, but
indeed this would be expensive for many workloads.

> 3. Remove validating device id, similar to how it is in skb_tx_hash
> where rx_queue recorded is used and if not, fall through to flow hash
> calculation.
> What do you think is suitable here?

Alternatively, just accept the misprediction in this rare case. But do
make the caveat explicit in the documentation.
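
To illustrate what accepting the misprediction might look like, here is a
rough userspace sketch, not kernel code. The names toy_xps_map,
toy_map_lookup and toy_get_xps_queue are made up for this example, and the
flat array is only a stand-in for the real per-device maps. The recorded rx
queue is used as an index without checking which device it was recorded on;
an out-of-range or unset entry simply falls through to the CPU map.

```c
#define NO_QUEUE (-1)

/* Hypothetical stand-in for a per-device XPS map: entry i holds the tx
 * queue chosen for rx queue (or CPU) i, or NO_QUEUE when unset. */
struct toy_xps_map {
	int nr_entries;
	const int *txq;
};

/* Look up idx in map; a missing map, an out-of-range index, or an unset
 * entry all yield NO_QUEUE. */
static int toy_map_lookup(const struct toy_xps_map *map, int idx)
{
	if (!map || idx < 0 || idx >= map->nr_entries)
		return NO_QUEUE;
	return map->txq[idx];
}

/* Try the rx-queue map first, trusting the rx queue recorded on the
 * socket without validating the device it came from. Fall back to the
 * CPU map when that lookup gives nothing. */
static int toy_get_xps_queue(const struct toy_xps_map *rxqs_map,
			     const struct toy_xps_map *cpus_map,
			     int sk_rx_queue, int sender_cpu)
{
	int q = toy_map_lookup(rxqs_map, sk_rx_queue);

	if (q == NO_QUEUE)
		q = toy_map_lookup(cpus_map, sender_cpu);
	return q;
}
```

If the rx device differs from the tx device, the index may still land on a
valid entry and pick a suboptimal tx queue. That is the rare misprediction
the documentation should call out.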