CVSROOT: /cvs
Module name: src
Changes by: [email protected] 2025/10/29 04:32:38
Modified files:
usr.sbin/bgpd : Makefile
Added files:
usr.sbin/bgpd : chash.c chash.h
Log message:
Introduce chash or CH - a cache friendly hash table
The CH hash table is split into an extendible hashing part and fixed
size sub tables. The sub tables are split into CH_H2_SIZE groups holding
7 elements each. The sub table size is therefore 512 * 9 = 3584 entries.
The sub tables are open addressed hash tables similar to swiss tables
but the groups are only using 7 fields and not 16. Also the meta data
is part of each group. On 64bit archs a group uses 64 bytes.
The 64bit hash is split in three parts:
o H1: variable size hash used by the extendible hashing table
o H2: fixed size hash to select a group in a sub table.
o H3: 1 byte hash to select element in a group.
Unlike other unordered sets this hash table does not allow a key to
be inserted more than once.
CH is built similar to the RBT implementation in sys/tree.h (see RBT_INIT(9)).
It uses internal common code instead of crazy big macros.
In many cases bgpd uses RB trees for data that does not need to be ordered.
The overhead to keep the tree ordered is substantial, this is why it makes
sense to support an unordered hash table implementation.
CH_PROTOTYPE() and CH_GENERATE() are used to generate the hash table
implementation for a specific type.
CH_INSERT(), CH_REMOVE() and CH_FIND() are used to add, remove and look up
elements. Additionally there is CH_LOCATE() which is a look up function that
does not need a needle struct to be set up for the lookup.
It is also possible to traverse the hash table with CH_FOREACH, CH_FIRST,
CH_NEXT but remember this is an unordered set so there is no order.
CH_INSERT() will automatically grow the hash table and can therefor fail
in out of memory conditions. This is different to RB trees where each
element contains a RBT_ENTRY member to hold the tree meta data. There is
no RBT_ENTRY equivalent in CH tables.
CH_REMOVE() will automatically compact the table if it is possible.
This code can be further improved and optimised but it is working well
and work should happen in tree instead.
OK tb@