Paolo Valerio <pvale...@redhat.com> writes: > From: Gaetan Rivet <gr...@u256.net> > > This patch aims to replace the expiration lists as, due to the way > they are used, besides being a source of contention, they have a known > issue when used with non-default policies for different zones that > could lead to retaining expired connections potentially for a long > time. > > This patch replaces them with an array of rculist used to distribute > all the newly created connections in order to, during the sweeping > phase, scan them without locking, and evict the expired connections > only locking during the actual removal. This allows to reduce the > contention introduced by the pushback performed at every packet > update, also solving the issue related to zones and timeout policies. > > Signed-off-by: Gaetan Rivet <gr...@u256.net> > Co-authored-by: Paolo Valerio <pvale...@redhat.com> > Signed-off-by: Paolo Valerio <pvale...@redhat.com> > --- > v6: > - minor function renaming > - removed conn->lock in conn_clean() as this was unneeded. > - minor commit message rephrase > ---
I have a small incremental diff that changes minor things: - changes/remove some comments - next_wake changed from 30 to 20 - removes MAX() as next_wake is always used I'm planning to send it by the end of Monday so that if any feedback comes can be folded into a single re-spin. diff --git a/lib/conntrack.c b/lib/conntrack.c index 819c356c1..66a44da2e 100644 --- a/lib/conntrack.c +++ b/lib/conntrack.c @@ -1552,11 +1552,7 @@ set_label(struct dp_packet *pkt, struct conn *conn, } -/* Delete the expired connections from 'ctb', up to 'limit'. Returns the - * earliest expiration time among the remaining connections in 'ctb'. Returns - * LLONG_MAX if 'ctb' is empty. The return value might be smaller than 'now', - * if 'limit' is reached */ -static long long +static size_t ct_sweep(struct conntrack *ct, struct rculist *list, long long now) OVS_NO_THREAD_SAFETY_ANALYSIS { @@ -1574,16 +1570,15 @@ ct_sweep(struct conntrack *ct, struct rculist *list, long long now) return count; } -/* Cleans up old connection entries from 'ct'. Returns the time when the - * next expiration might happen. The return value might be smaller than - * 'now', meaning that an internal limit has been reached, and some expired - * connections have not been deleted. */ +/* Cleans up old connection entries from 'ct'. Returns the time + * when the next wake will happen. The return value might be zero, + * meaning that an internal limit has been reached. */ static long long conntrack_clean(struct conntrack *ct, long long now) { - long long next_wakeup = now + 30 * 1000; - unsigned int n_conn_limit, i, count = 0; - size_t clean_end; + long long next_wakeup = now + 20 * 1000; + unsigned int n_conn_limit, i; + size_t clean_end, count = 0; atomic_read_relaxed(&ct->n_conn_limit, &n_conn_limit); clean_end = n_conn_limit / 64; @@ -1599,7 +1594,7 @@ conntrack_clean(struct conntrack *ct, long long now) ct->next_sweep = (i < EXP_LISTS) ? i : 0; - VLOG_DBG("conntrack cleanup %"PRIu32" entries in %lld msec", count, + VLOG_DBG("conntrack cleanup %"PRIuSIZE" entries in %lld msec", count, time_msec() - now); return next_wakeup; @@ -1608,24 +1603,8 @@ conntrack_clean(struct conntrack *ct, long long now) /* Cleanup: * * We must call conntrack_clean() periodically. conntrack_clean() return - * value gives an hint on when the next cleanup must be done (either because - * there is an actual connection that expires, or because a new connection - * might be created with the minimum timeout). - * - * The logic below has two goals: - * - * - We want to reduce the number of wakeups and batch connection cleanup - * when the load is not very high. CT_CLEAN_INTERVAL ensures that if we - * are coping with the current cleanup tasks, then we wait at least - * 5 seconds to do further cleanup. - * - * - We don't want to keep the map locked too long, as we might prevent - * traffic from flowing. CT_CLEAN_MIN_INTERVAL ensures that if cleanup is - * behind, there is at least some 200ms blocks of time when the map will be - * left alone, so the datapath can operate unhindered. - */ -#define CT_CLEAN_INTERVAL 5000 /* 5 seconds */ -#define CT_CLEAN_MIN_INTERVAL 200 /* 0.2 seconds */ + * value gives an hint on when the next cleanup must be done. */ +#define CT_CLEAN_MIN_INTERVAL_MS 200 static void * clean_thread_main(void *f_) @@ -1639,9 +1618,9 @@ clean_thread_main(void *f_) next_wake = conntrack_clean(ct, now); if (next_wake < now) { - poll_timer_wait_until(now + CT_CLEAN_MIN_INTERVAL); + poll_timer_wait_until(now + CT_CLEAN_MIN_INTERVAL_MS); } else { - poll_timer_wait_until(MAX(next_wake, now + CT_CLEAN_INTERVAL)); + poll_timer_wait_until(next_wake); } latch_wait(&ct->clean_thread_exit); poll_block(); _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev