On Tue, Aug 23, 2022 at 10:15:22AM +0200, Stefan Sperling wrote:
> I found one of my amd64 systems running -current, built on 12th of
> August, has crashed as follows.
Is there any chance that the kernel sources are between these commits?
August 12th does not fit exactly. Do you remember when you did the
checkout? Or is it a snapshot kernel?
----------------------------
revision 1.246
date: 2022/08/09 21:10:03; author: kn; state: Exp; lines: +10 -10;
commitid: 7dnmtpMeiy7k6IOQ;
Backout "Call getuptime() just once per function"
This caused stuck ndp cache entries as found by naddy, sorry.
----------------------------
revision 1.244
date: 2022/08/08 15:56:35; author: kn; state: Exp; lines: +10 -10;
commitid: ILY0HdurUXzwu2qJ;
Call getuptime() just once per function
IPv6 pendant to bluhm's sys/netinet/if_ether.c r1.249:
Instead of calling getuptime() all the time in ARP code, do it only
once per function. This gives a more consistent time value.
OK claudio@ miod@ mvs@
OK bluhm
----------------------------
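For readers who do not have the diff at hand, here is a rough userland
sketch of the pattern that r1.244 describes. It is not the nd6 code:
fake_getuptime(), struct entry and both count_expired_*() functions are
made-up names, and the fake clock advances on every call only to make
the difference visible.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical stand-in for the kernel's getuptime(): it advances by
 * one second on every call so that repeated reads visibly differ.
 */
static time_t fake_clock;

static time_t
fake_getuptime(void)
{
	return fake_clock++;
}

/* Hypothetical cache entry with an absolute expiry time. */
struct entry {
	time_t expire;
};

/* Repeated reads: each comparison may see a different "now". */
static int
count_expired_multi(const struct entry *e, int n)
{
	int i, expired = 0;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		if (e[i].expire <= fake_getuptime())
			expired++;
	return expired;
}

/* Single read: every comparison in the function uses the same value. */
static int
count_expired_once(const struct entry *e, int n)
{
	time_t uptime = fake_getuptime();
	int i, expired = 0;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		if (e[i].expire <= uptime)
			expired++;
	return expired;
}
```

The single-read variant gives one consistent time per call, but any code
path that relied on a fresh value later in the function changes
behavior. The r1.246 entry above shows that the change was backed out.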
> I am not sure if this is still relevant; please excuse the noise if
> this has already been found and fixed.
I am not aware of a fix in this area. nd6 is not MP safe, so we
have a big kernel lock around it. I have asked kn@ to look at nd6
locking.
The interaction between rtable SRP locking and MP access to routing
table leaves like nd6 is less than optimal. I expect bugs there.
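To illustrate what "a big lock around it" means in general, here is a
minimal userland sketch. It assumes a pthread mutex as a stand-in for
the kernel lock. big_lock, struct nbr, nbr_insert() and nbr_expire() are
made-up names and this is not how nd6 is written.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel lock, not the real one. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical neighbor entry on a singly linked list. */
struct nbr {
	struct nbr *next;
	int expired;
};

/* Shared list; the code touching it is not safe to run concurrently. */
static struct nbr *nbr_list;

/* Unlink all expired entries. The caller must hold big_lock. */
static int
purge_expired_locked(void)
{
	struct nbr **pp = &nbr_list;
	int removed = 0;

	while (*pp != NULL) {
		if ((*pp)->expired) {
			*pp = (*pp)->next;	/* storage stays with the caller */
			removed++;
		} else
			pp = &(*pp)->next;
	}
	return removed;
}

/* Every entry point takes the one big lock before touching the list. */
static void
nbr_insert(struct nbr *n)
{
	assert(n != NULL);
	pthread_mutex_lock(&big_lock);
	n->next = nbr_list;
	nbr_list = n;
	pthread_mutex_unlock(&big_lock);
}

static int
nbr_expire(void)
{
	int removed;

	pthread_mutex_lock(&big_lock);
	removed = purge_expired_locked();
	pthread_mutex_unlock(&big_lock);
	return removed;
}
```

This only protects code that actually goes through the locked entry
points. Anything that reaches the same data through another path without
the lock can still see entries that are being removed.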
> kernel: protection fault trap, code=0
> Stopped at rt_ifa_del+0x39: movb 0x1be(%rax),%bl
> ddb{2}> bt
> rt_ifa_del(ffff8000004e9400,800100,deaf0009deafbead,0) at rt_ifa_del+0x39
> in6_unlink_ifa(ffff8000004e9400,ffff8000000da2a8) at in6_unlink_ifa+0xae
> in6_purgeaddr(ffff8000004e9400) at in6_purgeaddr+0x127
> nd6_expire(0) at nd6_expire+0x96
> taskq_thread(ffff80000002c080) at taskq_thread+0x100
> end trace frame: 0x0, count: -5