Hi,
* Alexander Bluhm wrote:
> On Sun, Jun 20, 2021 at 07:24:14PM +0200, Matthias Schmidt wrote:
> > kernel: double fault trap, code=0
> > Stopped at m_copydata+0x17: pushq %r14
> > m_copydata(fffffd807cfbb100,14,14,ffff800022e5d1d4) at m_copydata+0x17
> > pf_pull_hdr(fffffd807cfbb100,14,ffff800022e5d1d4,14,0,ffff800022e5d22e) at
> > pf_pull_hdr+0xa9
> > pf_setup_pdesc(ffff800022e5d130,2,2,ffff8000006bd600,fffffd807cfbb100,ffff800022e5d22e)
> > at pf_setup_pdesc+0x213
> > pf_test(2,2,ffff80000018800,ffff800022e5d320) at pf_test+0x172
> > ip_output(fffffd807cfbb100,0,fffffd8259008d80,800,0,fffffd8259008d10) at
> > ip_output+0x7b6
> > tcp_output(ffff8000013ab000) at tcp_output+0x1a10
> > tcp_output(ffff8000013ab000) at tcp_output+0x1a10
> > tcp_output(ffff8000013ab000) at tcp_output+0x1a10
> > tcp_output(ffff8000013ab000) at tcp_output+0x1a10
> > tcp_output(ffff8000013ab000) at tcp_output+0x1a10
> > tcp_output(ffff8000013ab000) at tcp_output+0x1a10
> > [...]
>
> Debugging with tobhe@ revealed that this endless recursion is
> triggered by using the enc0 interface to configure the local IP
> address. The workaround is easy: follow the FAQ and use lo1.
>
> But the kernel should not crash anyway.
>
> Something like this may happen:
> - PMTU discovery does not work properly at a certain time
> - after 10 seconds TCP marks the route MTU as bad
> - IP output clears DF flag and is sending fragments
> - interface MTU for enc0 is 0, fragmentation fails
> - the EMSGSIZE error triggers PMTU TCP resend
> - loop to IP output
>
> This diff resends the packet only if the MTU flag appears at
> the route and was not there before. At least this should prevent
> the recursion.
>
> Please test TCP in IPsec and also TCP in strange MTU environments.
I have had the patch running since yesterday in a roadwarrior setup
and have generated a lot of artificial network load. So far,
everything is stable. Note that I use lo1 and have not deliberately
tried enc0.
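
For anyone following along, here is a rough userland sketch of the
guard as I understand it from the description above: resend only when
the MTU flag goes from unset to set on the route. This is not the
actual diff. TOY_RTF_MTU, struct toy_route, toy_should_resend() and
toy_send_attempts() are made-up names for illustration, and the loop
only models a send that always fails with EMSGSIZE.

```c
#include <stdbool.h>

/* Hypothetical flag bit standing in for the "MTU flag" on the route. */
#define TOY_RTF_MTU	0x1u

/* Hypothetical, heavily reduced stand-in for a route entry. */
struct toy_route {
	unsigned int flags;
};

/* Resend only if the MTU flag is set now and was not set before. */
static bool
toy_should_resend(unsigned int flags_before, unsigned int flags_after)
{
	return (flags_before & TOY_RTF_MTU) == 0 &&
	    (flags_after & TOY_RTF_MTU) != 0;
}

/*
 * Toy model of a send path where every attempt fails with EMSGSIZE.
 * Each failure marks the route; a resend happens only if that marking
 * actually changed the flag. Returns the number of send attempts,
 * capped at limit as a safety net.
 */
static int
toy_send_attempts(struct toy_route *rt, int limit)
{
	int attempts = 0;

	for (;;) {
		unsigned int before = rt->flags;

		attempts++;			/* one failed output attempt */
		rt->flags |= TOY_RTF_MTU;	/* error path marks the route */

		if (attempts >= limit ||
		    !toy_should_resend(before, rt->flags))
			break;
	}
	return attempts;
}
```

In this model a route without the flag gets one resend and then
stops, and a route that already carries the flag gets none, so the
attempts stay bounded even if the send keeps failing.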
Cheers
Matthias