We have encountered an issue resulting from commit 2724680bceee ("neigh: Keep
neighbour cache entries if number of them is small enough."), which allows
stale entries to remain in the neigh table indefinitely if the total number of
entries is less than gc_thresh1.
This issue arises if:
- a stale entry has existed for a long time, so it has a sufficiently old
neigh->confirmed value
- the neighbour itself has since changed its MAC address
- we then try to ping the neighbour
When we ping the neighbour, the entry moves into NUD_DELAY as expected. But
then, within neigh_timer_handler(), a jiffies comparison gives the wrong
result: neigh->confirmed is so old that the signed difference wraps, so
time_before_eq(now, neigh->confirmed + NEIGH_VAR(neigh->parms,
DELAY_PROBE_TIME)) returns true and the entry is erroneously moved to
NUD_REACHABLE. The entry becomes stuck in this state, even though it is not
actually reachable, as the neighbour has since changed MAC address.
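
To illustrate the comparison, here is a minimal userspace sketch. It is not
the kernel code. It assumes a 32-bit long, HZ=250 and the default
delay_first_probe_time of 5 seconds. The names time_before_eq32() and
looks_recently_confirmed() are hypothetical helpers that only model the
signed-difference arithmetic behind time_before_eq().

```c
#include <stdint.h>

/* Assumptions: HZ=250 as in this report, and a 5 s delay_first_probe_time. */
#define HZ 250u
#define DELAY_PROBE_TICKS (5u * HZ)

/*
 * Userspace model of time_before_eq(a, b) for a 32-bit long: true when the
 * signed difference (b - a) is non-negative. The unsigned subtraction wraps
 * modulo 2^32; the cast to int32_t assumes two's complement conversion.
 */
static int time_before_eq32(uint32_t a, uint32_t b)
{
    return (int32_t)(uint32_t)(b - a) >= 0;
}

/* Models the check quoted above: is "now" within the probe delay of "confirmed"? */
static int looks_recently_confirmed(uint32_t now, uint32_t confirmed)
{
    return time_before_eq32(now, confirmed + DELAY_PROBE_TICKS);
}
```

With now = 3000000000, a confirmed value 1000 ticks old passes the check and
one 100000 ticks old fails it, as intended. A confirmed value more than 2^31
ticks old passes it again, which is the misclassification described above.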
The necessary age of neigh->confirmed for this to occur depends on the
platform. It occurs after approximately 100 days on a 32-bit platform with
HZ=250.
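
The 100-day figure is consistent with the point at which a 32-bit jiffies
difference exceeds the signed range of 2^31 ticks. A small sketch of that
arithmetic follows; wrap_half_range_days() is a hypothetical helper and
ignores the few extra seconds contributed by DELAY_PROBE_TIME.

```c
/* Days until a 32-bit jiffies difference reaches 2^31 ticks at a given HZ. */
static double wrap_half_range_days(unsigned int hz)
{
    const double half_range_ticks = 2147483648.0; /* 2^31 */
    const double seconds_per_day = 86400.0;

    return half_range_ticks / (double)hz / seconds_per_day;
}
```

For HZ=250 this gives roughly 99.4 days. The threshold scales inversely with
HZ, so it is shorter on kernels built with a higher HZ.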
We have resolved this by setting gc_thresh1 = 0, which effectively undoes
commit 2724680bceee.
I would like to know if anyone else has observed this or has an alternative
solution.
Kind regards,
Ash