> -----Original Message-----
> From: Intel-wired-lan <[email protected]> On Behalf Of
> Krishna Kumar
> Sent: 20 May 2025 22:37
> To: [email protected]
> Cc: [email protected]; [email protected]; Nguyen, Anthony L
> <[email protected]>; Kitszel, Przemyslaw
> <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Samudrala, Sridhar <[email protected]>; Zaki,
> Ahmed <[email protected]>; Kumar, Krishna <[email protected]>
> Subject: [Intel-wired-lan] [PATCH v2 net] net: ice: Perform accurate aRFS
> flow match
>
> This patch fixes an issue seen in a large-scale deployment under heavy
> incoming pkts where the aRFS flow wrongly matches a flow and reprograms the
> NIC with wrong settings. That mis-steering causes RX-path latency spikes and
> noisy neighbor effects when many connections collide on the same hash (some
> of our production servers have 20-30K connections).
>
> set_rps_cpu() calls ndo_rx_flow_steer() with flow_id that is calculated by
> hashing the skb sized by the per rx-queue table size. This results in
> multiple connections (even across different rx-queues) getting the same hash
> value. The driver steer function modifies the wrong flow to use this
> rx-queue, e.g.: Flow#1 is first added:
>     Flow#1:  <ip1, port1, ip2, port2>, Hash 'h', q#10
>
> Later when a new flow needs to be added:
>     Flow#2:  <ip3, port3, ip4, port4>, Hash 'h', q#20
>
> The driver finds the hash 'h' from Flow#1 and updates it to use q#20. This
> results in both flows getting un-optimized: packets for Flow#1 go to q#20,
> the flow is then reprogrammed back to q#10 later, and so on; Flow#2 is
> never programmed because Flow#1 is matched first for all misses. Many flows
> may wrongly share the same hash and reprogram the rules of the original
> flow, each with their own q#.
>
> Tested on two 144-core servers with 16K netperf sessions for 180s. Netperf
> clients are pinned to cores 0-71 sequentially (so that wrong packets on q#s
> 72-143 can be measured). IRQs are set 1:1 for queues -> CPUs; XPS and aRFS
> are enabled (global value is 144 * rps_flow_cnt).
>
> Test notes about results from ice_rx_flow_steer():
> ---------------------------------------------------
> 1. "Skip:" counter increments here:
>     if (fltr_info->q_index == rxq_idx ||
>         arfs_entry->fltr_state != ICE_ARFS_ACTIVE)
>             goto out;
> 2. "Add:" counter increments here:
>     ret = arfs_entry->fltr_info.fltr_id;
>     INIT_HLIST_NODE(&arfs_entry->list_entry);
> 3. "Update:" counter increments here:
>     /* update the queue to forward to on an already existing flow */
>
> Runtime comparison: original code vs with the patch for different
> rps_flow_cnt values.
>
> +-------------------------------+--------------+--------------+
> | rps_flow_cnt                  | 512          | 2048         |
> +-------------------------------+--------------+--------------+
> | Ratio of Pkts on Good:Bad q's | 214 vs 822K  | 1.1M vs 980K |
> | Avoid wrong aRFS programming  | 0 vs 310K    | 0 vs 30K     |
> | CPU User                      | 216 vs 183   | 216 vs 206   |
> | CPU System                    | 1441 vs 1171 | 1447 vs 1320 |
> | CPU Softirq                   | 1245 vs 920  | 1238 vs 961  |
> | CPU Total                     | 29 vs 22.7   | 29 vs 24.9   |
> | aRFS Update                   | 533K vs 59   | 521K vs 32   |
> | aRFS Skip                     | 82M vs 77M   | 7.2M vs 4.5M |
> +-------------------------------+--------------+--------------+
>
> Separate TCP_STREAM and TCP_RR runs with 1,4,8,16,64,128,256,512
> connections showed no performance degradation.
>
> Some points on the patch/aRFS behavior:
> 1. Enabling full tuple matching ensures flows are always correctly matched,
>    even with smaller hash sizes.
> 2. 5-6% drop in CPU utilization as the packets arrive at the correct CPUs
>    and there are fewer calls to the driver for programming on misses.
> 3. Larger hash tables reduce mis-steering due to more unique flow hashes,
>    but still have clashes. However, with larger per-device rps_flow_cnt, old
>    flows take more time to expire and new aRFS flows cannot be added if h/w
>    limits are reached (rps_may_expire_flow() succeeds when 10*rps_flow_cnt
>    pkts have been processed by this cpu that are not part of the flow).
>
> Changes since v1:
>   - Added "Fixes:" tag and documented return values.
>   - Added @ for function parameters.
>   - Updated subject line to denote target tree (net)
>
> Fixes: 28bf26724fdb0 ("ice: Implement aRFS")
> Signed-off-by: Krishna Kumar <[email protected]>
> ---
>  drivers/net/ethernet/intel/ice/ice_arfs.c | 49 +++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
>
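
For anyone following the Flow#1/Flow#2 collision described in the quoted text, the difference between hash-only and full-tuple matching can be sketched in user space roughly as below. This is only an illustration under stated assumptions: the names flow_tuple, arfs_entry, steer() and TABLE_SIZE are hypothetical and are not taken from ice_arfs.c, and a flat array stands in for whatever list structure the driver really uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8 /* hypothetical; small so that collisions are easy to see */

/* Hypothetical 4-tuple identifying a connection. */
struct flow_tuple {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

/* Hypothetical filter entry: the tuple, its (possibly shared) hash, and the queue. */
struct arfs_entry {
	struct flow_tuple tuple;
	uint32_t flow_id;
	uint16_t rxq;
	bool in_use;
};

enum steer_result { STEER_SKIP, STEER_ADD, STEER_UPDATE, STEER_FULL };

static bool tuple_equal(const struct flow_tuple *a, const struct flow_tuple *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/*
 * Steer flow 't' (hash 'flow_id') to queue 'rxq'.
 * With full_match == false an entry matches on flow_id alone, so a colliding
 * flow overwrites the queue of an unrelated entry. With full_match == true
 * the tuple must also match, so a colliding flow gets its own entry.
 */
static enum steer_result steer(struct arfs_entry *tbl,
			       const struct flow_tuple *t,
			       uint32_t flow_id, uint16_t rxq, bool full_match)
{
	size_t i;

	for (i = 0; i < TABLE_SIZE; i++) {
		struct arfs_entry *e = &tbl[i];

		if (!e->in_use || e->flow_id != flow_id)
			continue;
		if (full_match && !tuple_equal(&e->tuple, t))
			continue;
		if (e->rxq == rxq)
			return STEER_SKIP;	/* already on the right queue */
		e->rxq = rxq;			/* "Update" case */
		return STEER_UPDATE;
	}

	/* No matching entry: take the first free slot ("Add" case). */
	for (i = 0; i < TABLE_SIZE; i++) {
		if (!tbl[i].in_use) {
			tbl[i].tuple = *t;
			tbl[i].flow_id = flow_id;
			tbl[i].rxq = rxq;
			tbl[i].in_use = true;
			return STEER_ADD;
		}
	}
	return STEER_FULL;
}
```

With two different tuples that share one flow_id, the hash-only mode reports an update and moves the first flow's entry to the second flow's queue, while the full-tuple mode adds a second entry and leaves the first untouched. That mirrors the drop in the "aRFS Update" row of the table above, although the sketch says nothing about the real driver's locking, expiry or hardware programming.
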
Tested-by: Rinitha S <[email protected]> (A Contingent worker at Intel)
