On 25 Jul 2022, at 18:20, John Baldwin wrote:
On 7/22/22 4:19 PM, Kristof Provost wrote:
The branch main has been updated by kp:
URL:
https://cgit.FreeBSD.org/src/commit/?id=151abc80cde778bc18b91c334d07fbd52bbb38fb
commit 151abc80cde778bc18b91c334d07fbd52bbb38fb
Author: Kristof Provost <k...@freebsd.org>
AuthorDate: 2022-07-22 17:17:04 +0000
Commit: Kristof Provost <k...@freebsd.org>
CommitDate: 2022-07-22 17:18:41 +0000
if_vlan: avoid hash table thrashing when adding and removing
entries
vlan_remhash() uses incorrect value for b.
When using the default value for VLAN_DEF_HWIDTH (4), the
VLAN hash-list table
expands from 16 chains to 32 chains as the 129th entry is added.
trunk->hwidth
becomes 5. Say a few more entries are added and there are now
135 entries.
trunk-hwidth will still be 5. If an entry is removed,
vlan_remhash() will
calculate a value of 32 for b. refcnt will be decremented to
134. The if
comparison at line 473 will return true and vlan_growhash() will
be called. The
VLAN hash-list table will be compressed from 32 chains wide to
16 chains wide.
hwidth will become 4. This is an error, and it can be seen when
a new VLAN is
added. The table will again be expanded. If an entry is then
removed, again
the table is contracted.
If the number of VLANS stays in the range of 128-512, each
time an insert
follows a remove, the table will expand. Each time a remove
follows an
insert, the table will be contracted.
The fix is simple. The line 473 should test that the number
of entries has
decreased such that the table should be contracted using what
would be the new
value of hwidth. line 467 should be:
b = 1 << (trunk->hwidth - 1);
PR: 265382
Reviewed by: kp
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
This does mean that there are some odd edge cases (e.g. if you remove
the 513th entry
only to turn around and re-add it) that can still thrash even with
this fixed. Not
sure if those are worth worrying about though? If they were, perhaps
you could
imagine some sort of "dampener" where you have to remove some number N
of entries
to actually rehash to a smaller table. But that is probably rather
complex to
implement and probably not worth it?
Good point, although I think I agree it’s probably not worth worrying
too much about this.
There’s a tangentially related issue I also want to look at one of
these days, namely that the current `trunk->refcnt > (b * b) / 2` check
results in far too many entries in each hash row. For example, until 128
vlan interfaces we stick with width = 4, so end up with an average of 8
entries in each row.
That means that for every inbound packet we scan through a linked list
of 8 entries to find the correct vlan interface. We should tweak the
check to reduce this, as all it’ll cost is a tiny bit of memory. (Or
we should just default VLAN_ARRAY on, paying a bit more memory for no
linked-list scanning).
Kristof