On 25 Jul 2022, at 18:20, John Baldwin wrote:
On 7/22/22 4:19 PM, Kristof Provost wrote:
The branch main has been updated by kp:

URL: https://cgit.FreeBSD.org/src/commit/?id=151abc80cde778bc18b91c334d07fbd52bbb38fb

commit 151abc80cde778bc18b91c334d07fbd52bbb38fb
Author:     Kristof Provost <k...@freebsd.org>
AuthorDate: 2022-07-22 17:17:04 +0000
Commit:     Kristof Provost <k...@freebsd.org>
CommitDate: 2022-07-22 17:18:41 +0000

if_vlan: avoid hash table thrashing when adding and removing entries
         vlan_remhash() uses incorrect value for b.
When using the default value for VLAN_DEF_HWIDTH (4), the VLAN hash-list table expands from 16 chains to 32 chains as the 129th entry is added. trunk->hwidth becomes 5. Say a few more entries are added and there are now 135 entries. trunk-hwidth will still be 5. If an entry is removed, vlan_remhash() will calculate a value of 32 for b. refcnt will be decremented to 134. The if comparison at line 473 will return true and vlan_growhash() will be called. The VLAN hash-list table will be compressed from 32 chains wide to 16 chains wide. hwidth will become 4. This is an error, and it can be seen when a new VLAN is added. The table will again be expanded. If an entry is then removed, again
     the table is contracted.
If the number of VLANS stays in the range of 128-512, each time an insert follows a remove, the table will expand. Each time a remove follows an
     insert, the table will be contracted.
The fix is simple. The line 473 should test that the number of entries has decreased such that the table should be contracted using what would be the new
     value of hwidth. line 467 should be:
                 b = 1 << (trunk->hwidth - 1);
         PR:             265382
     Reviewed by:    kp
     MFC after:      2 weeks
     Sponsored by:   NetApp, Inc.

This does mean that there are some odd edge cases (e.g. if you remove the 513th entry only to turn around and re-add it) that can still thrash even with this fixed. Not sure if those are worth worrying about though? If they were, perhaps you could imagine some sort of "dampener" where you have to remove some number N of entries to actually rehash to a smaller table. But that is probably rather complex to
implement and probably not worth it?

Good point, although I think I agree it’s probably not worth worrying too much about this.

There’s a tangentially related issue I also want to look at one of these days, namely that the current `trunk->refcnt > (b * b) / 2` check results in far too many entries in each hash row. For example, until 128 vlan interfaces we stick with width = 4, so end up with an average of 8 entries in each row. That means that for every inbound packet we scan through a linked list of 8 entries to find the correct vlan interface. We should tweak the check to reduce this, as all it’ll cost is a tiny bit of memory. (Or we should just default VLAN_ARRAY on, paying a bit more memory for no linked-list scanning).

Kristof

Reply via email to