From: Antonio Fischetti <[email protected]>

When OVS is configured as a firewall with thousands of active
concurrent connections, the EMC quickly becomes saturated and may come
under heavy thrashing, because original and recirculated packets keep
overwriting active EMC entries due to its limited size (8k).

This thrashing makes the EMC less efficient than the dpcls in terms of
lookups and insertions.

This patch makes more efficient use of the EMC by allowing only the
'original' packets to hit the EMC. All recirculated packets are sent to
the classifier directly. An empirical threshold (EMC_FULL_THRESHOLD, set
at 50%) on EMC occupancy triggers this logic.

When EMC utilization exceeds EMC_FULL_THRESHOLD:
 - EMC insertions are allowed only for original packets. EMC insertion
   and lookup are skipped for recirculated packets.
 - Recirculated packets are sent to the classifier.

This patch depends on the previous one in this series. It's based on
patch "dpif-netdev: add EMC entry count and %full figure to
pmd-stats-show" at:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327570.html

Signed-off-by: Antonio Fischetti <[email protected]>
Signed-off-by: Bhanuprakash Bodireddy <[email protected]>
Co-authored-by: Bhanuprakash Bodireddy <[email protected]>
---
In our Connection Tracker testbench set up with

 table=0, priority=1 actions=drop
 table=0, priority=10,arp actions=NORMAL
 table=0, priority=100,ct_state=-trk,ip actions=ct(table=1)
 table=1, ct_state=+new+trk,ip,in_port=1 actions=ct(commit),output:2
 table=1, ct_state=+est+trk,ip,in_port=1 actions=output:2
 table=1, ct_state=+new+trk,ip,in_port=2 actions=drop
 table=1, ct_state=+est+trk,ip,in_port=2 actions=output:1

we saw the following performance improvement.

Measured packet Rx rate (regardless of packet loss). Bidirectional test
with 64B UDP packets. Each row is a test with a different number of
traffic streams. The traffic generator is set so that each stream
establishes one UDP connection. The Mpps columns report the Rx rates on
the 2 sides.

 Traffic |    Orig    |     Orig      |  +changes  |   +changes
 Streams |   [Mpps]   | [EMC entries] |   [Mpps]   | [EMC entries]
---------+------------+---------------+------------+---------------
      10 |  3.4, 3.4  |            20 |  3.4, 3.4  |            20
     100 |  2.6, 2.7  |           200 |  2.6, 2.7  |           201
   1,000 |  2.4, 2.4  |          2009 |  2.4, 2.4  |          1994
   2,000 |  2.2, 2.2  |          3903 |  2.2, 2.2  |          3900
   3,000 |  2.1, 2.1  |          5473 |  2.2, 2.2  |          4798
   4,000 |  2.0, 2.0  |          6478 |  2.2, 2.2  |          5663
  10,000 |  1.8, 1.9  |          8070 |  2.0, 2.0  |          7347
 100,000 |  1.7, 1.7  |          8192 |  1.8, 1.8  |          8192

 lib/dpif-netdev.c | 46 ++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 40 insertions(+), 6 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index fd2ed52..64a3cd4 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -4538,6 +4538,8 @@ dp_netdev_queue_batches(struct dp_packet *pkt,
     packet_batch_per_flow_update(batch, pkt, mf);
 }
 
+#define EMC_FULL_THRESHOLD 0x0000F000
+
 /* Try to process all ('cnt') the 'packets' using only the exact match cache
  * 'pmd->flow_cache'. If a flow is not found for a packet 'packets[i]', the
  * miniflow is copied into 'keys' and the packet pointer is moved at the
@@ -4582,6 +4584,19 @@ emc_processing(struct dp_netdev_pmd_thread *pmd,
             pkt_metadata_prefetch_init(&packets[i+1]->md);
         }
 
+        /*
+         * EMC lookup is skipped when one or both of the following
+         * two cases occurs:
+         *
+         *   - EMC is disabled. This is detected from cur_min.
+         *
+         *   - The EMC occupancy exceeds EMC_FULL_THRESHOLD and the
+         *     packet to be classified is being recirculated. When this
+         *     happens also EMC insertions are skipped for recirculated
+         *     packets. So that EMC is used just to store entries which
+         *     are hit from the 'original' packets. This way the EMC
+         *     thrashing is mitigated with a benefit on performance.
+         */
         if (!md_is_valid) {
             pkt_metadata_init(&packet->md, port_no);
             miniflow_extract(packet, &key->mf);
@@ -4603,11 +4618,18 @@ emc_processing(struct dp_netdev_pmd_thread *pmd,
         } else {
             /* Recirculated packets. */
             miniflow_extract(packet, &key->mf);
-            if (OVS_LIKELY(cur_min)) {
-                key->hash = dpif_netdev_packet_get_rss_hash(packet, &key->mf);
-                flow = emc_lookup(flow_cache, key);
-            } else {
+            if (flow_cache->n_entries & EMC_FULL_THRESHOLD) {
+                /* EMC occupancy is over the threshold. We skip EMC
+                 * lookup for recirculated packets. */
                 flow = NULL;
+            } else {
+                if (OVS_LIKELY(cur_min)) {
+                    key->hash = dpif_netdev_packet_get_rss_hash(packet,
+                                                                &key->mf);
+                    flow = emc_lookup(flow_cache, key);
+                } else {
+                    flow = NULL;
+                }
             }
         }
         key->len = 0; /* Not computed yet. */
@@ -4695,7 +4717,13 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
                                              add_actions->size);
         }
         ovs_mutex_unlock(&pmd->flow_mutex);
-        emc_probabilistic_insert(pmd, key, netdev_flow);
+        /* When EMC occupancy goes over a threshold we avoid inserting new
+         * entries for recirculated packets. */
+        if (!packet->md.recirc_id) {
+            emc_probabilistic_insert(pmd, key, netdev_flow);
+        } else if (!(pmd->flow_cache.n_entries & EMC_FULL_THRESHOLD)) {
+            emc_probabilistic_insert(pmd, key, netdev_flow);
+        }
     }
 }
 
@@ -4788,7 +4816,13 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
 
         flow = dp_netdev_flow_cast(rules[i]);
 
-        emc_probabilistic_insert(pmd, &keys[i], flow);
+        /* When EMC occupancy goes over a threshold we avoid inserting new
+         * entries for recirculated packets. */
+        if (!packet->md.recirc_id) {
+            emc_probabilistic_insert(pmd, &keys[i], flow);
+        } else if (!(pmd->flow_cache.n_entries & EMC_FULL_THRESHOLD)) {
+            emc_probabilistic_insert(pmd, &keys[i], flow);
+        }
         dp_netdev_queue_batches(packet, flow, &keys[i].mf, batches, n_batches);
     }
 
-- 
2.4.11
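
Note on the threshold test: the commit message describes EMC_FULL_THRESHOLD
as 50% occupancy, while the code tests it as a bit mask against n_entries.
The standalone sketch below is not part of the patch. It assumes an EMC
capacity of 8192 entries (the "8k" mentioned above), and the names
EMC_ENTRIES_ASSUMED, emc_over_threshold(), emc_use_allowed() and
emc_threshold_selfcheck() are hypothetical, introduced only for
illustration. Under that assumption the mask test appears to be equivalent
to "n_entries >= 4096", so the logic triggers at exactly 50% and above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EMC_FULL_THRESHOLD  0x0000F000  /* Value taken from the patch. */
#define EMC_ENTRIES_ASSUMED 8192u       /* Assumption: "8k" EMC capacity. */

/* Mirrors the mask test in the patch: true once any of bits 12..15 of
 * 'n_entries' is set, i.e. once the count reaches 4096. */
static bool
emc_over_threshold(uint32_t n_entries)
{
    return (n_entries & EMC_FULL_THRESHOLD) != 0;
}

/* Hypothetical model of the insertion policy: original packets
 * (recirc_id == 0) may always use the EMC, recirculated packets only
 * while occupancy is under the threshold.  The cur_min (EMC disabled)
 * case is deliberately not modelled here. */
static bool
emc_use_allowed(uint32_t n_entries, uint32_t recirc_id)
{
    return recirc_id == 0 || !emc_over_threshold(n_entries);
}

/* Checks, for every possible entry count under the assumed capacity,
 * that the mask test equals a plain ">= half capacity" comparison. */
static void
emc_threshold_selfcheck(void)
{
    for (uint32_t n = 0; n <= EMC_ENTRIES_ASSUMED; n++) {
        assert(emc_over_threshold(n) == (n >= EMC_ENTRIES_ASSUMED / 2));
    }
}
```

Calling emc_threshold_selfcheck() from a small test program exercises the
whole 0..8192 range. The equivalence holds only for the assumed capacity:
with a different EMC size the same mask would correspond to a different
percentage.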
