On 2007-Jun-29 18:20:32 +1000, Peter Jeremy <[EMAIL PROTECTED]> wrote:
>I might try to analyse the behaviour of NAT_HASH_FN() with my traffic
>mix and see how skewed the output is.

I've captured some NAT records via 'ipnat -o N' and simulated the hashing
in nat_insert().  In total I have:

5720388 NAT entries
 709174 unique src/dst/NAT entries
 499023 unique hv1 values ignoring second modulo (max 7 clashes)
 630031 unique hv2 values ignoring second modulo (max 7 clashes)
    133 unique 'internal' addresses
    103 unique NAT'd addresses
    162 unique external addresses

I've experimented with a variety of hash moduli and 2047 is
particularly bad: there's a roughly 3.5:1 ratio between the fullest
and emptiest hash buckets.  Using 16383 (the default 'LARGE_NAT'
value) is even worse, with a 9:1 ratio in bucket entries.  This
is probably because both moduli are composite with fairly small
factors (2047 = 23 x 89, 16383 = 3 x 43 x 127).
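
The factor claim is easy to check, and a toy input shows one way a
composite modulus can concentrate entries.  The following is only a
minimal sketch: factorize() and buckets_used() are hypothetical helper
names, and the strided input is synthetic, not the captured NAT data.
Whether this mechanism is what skews the real traffic is not
established here.

```python
def factorize(n):
    """Return the prime factors of n in ascending order (trial division)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


def buckets_used(values, modulus):
    """Count how many distinct buckets the values occupy for this modulus."""
    return len({v % modulus for v in values})


# Synthetic assumption: pre-modulo hash values that all share a stride of 23.
strided = range(0, 23 * 100000, 23)

for m in (2047, 16383, 2039, 2063):
    print(m, factorize(m), buckets_used(strided, m))
```

When the input shares a factor with the modulus, only a fraction of the
buckets can ever be reached; a prime modulus avoids that for any stride
that is not a multiple of the prime itself.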

Of the primes near 2047, 2039 and 2063 appear to give the flattest
distributions (a ratio of about 1.3:1 in bucket sizes for both),
though this is likely to be data dependent.
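
A comparison of this kind could be sketched as follows, assuming the
pre-modulo hash values have already been extracted as integers.  The
names bucket_spread() and rank_moduli() are hypothetical, and the
sample data is a stand-in; this is not the nat_insert() code and does
not reproduce the figures above.

```python
from collections import Counter


def bucket_spread(hash_values, modulus):
    """Return (fullest, emptiest) bucket occupancy for the given modulus."""
    counts = Counter(hv % modulus for hv in hash_values)
    if not counts:
        return 0, 0
    fullest = max(counts.values())
    # A bucket that never shows up in counts is empty.
    emptiest = min(counts.values()) if len(counts) == modulus else 0
    return fullest, emptiest


def rank_moduli(hash_values, candidates):
    """Sort candidate moduli by fullest/emptiest ratio, flattest first."""
    def ratio(m):
        fullest, emptiest = bucket_spread(hash_values, m)
        return float("inf") if emptiest == 0 else fullest / emptiest
    return sorted(candidates, key=ratio)


# Stand-in data only (an assumption); substitute real pre-modulo hash values.
sample = [i * i for i in range(50000)]

for m in rank_moduli(sample, [2039, 2047, 2063, 16383]):
    print(m, bucket_spread(sample, m))
```

Ranking by the fullest/emptiest ratio matches the bucket-size ratios
quoted above, but it is sensitive to a single outlying bucket, so the
ordering should be re-checked against each new traffic sample.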

Is there anything particularly special about your choice of 2^N-1
values for the moduli or can it be safely changed to a suitable prime
number?

-- 
Peter Jeremy
