On Tuesday, 11 September 2018, 12:33:34, Steffen Klassert wrote:
> On Mon, Sep 10, 2018 at 10:18:47AM +0200, Kristian Evensen wrote:
> > Hi,
> >
> > Thanks everyone for all the effort in debugging this issue.
> >
> > On Mon, Sep 10, 2018 at 8:39 AM Steffen Klassert <steffen.klass...@secunet.com> wrote:
> > > The easy fix that could be backported to stable would be
> > > to check skb->dst for NULL and drop the packet in that case.
> >
> > Thought I should just chime in and say that we deployed this
> > work-around when we started observing the error back in June. Since
> > then we have not seen any crashes. Also, we have instrumented some of
> > our kernels to count the number of times the error is hit (overall +
> > consecutive). Compared to the overall number of packets, the error
> > happens very rarely. With our workloads, we on average see the error
> > once every couple of days.
>
> Thanks for letting us know!
>
> I plan to fix this in the ipsec tree with:
>
> Subject: [PATCH RFC] xfrm: Fix NULL pointer dereference when skb_dst_force
> clears the dst_entry.
>
> Since commit 222d7dbd258d ("net: prevent dst uses after free")
> skb_dst_force() might clear the dst_entry attached to the skb.
> The xfrm code doesn't expect this to happen, so we crash with
> a NULL pointer dereference in this case. Fix it by checking
> skb_dst(skb) for NULL after skb_dst_force() and drop the packet
> in case the dst_entry was cleared.
>
> Fixes: 222d7dbd258d ("net: prevent dst uses after free")
> Reported-by: Tobias Hommel <netdev-l...@genoetigt.de>
> Reported-by: Kristian Evensen <kristian.even...@gmail.com>
> Reported-by: Wolfgang Walter <li...@stwm.de>
> Signed-off-by: Steffen Klassert <steffen.klass...@secunet.com>
> ---
>  net/xfrm/xfrm_output.c | 4 ++++
>  net/xfrm/xfrm_policy.c | 4 ++++
>  2 files changed, 8 insertions(+)
>
> diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
> index 89b178a78dc7..36d15a38ce5e 100644
> --- a/net/xfrm/xfrm_output.c
> +++ b/net/xfrm/xfrm_output.c
> @@ -101,6 +101,10 @@ static int xfrm_output_one(struct sk_buff *skb, int err)
>  		spin_unlock_bh(&x->lock);
>
>  		skb_dst_force(skb);
> +		if (!skb_dst(skb)) {
> +			XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
> +			goto error_nolock;
> +		}
>
>  		if (xfrm_offload(skb)) {
>  			x->type_offload->encap(x, skb);
> diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
> index 7c5e8978aeaa..626e0f4d1749 100644
> --- a/net/xfrm/xfrm_policy.c
> +++ b/net/xfrm/xfrm_policy.c
> @@ -2548,6 +2548,10 @@ int __xfrm_route_forward(struct sk_buff *skb, unsigned short family)
>  	}
>
>  	skb_dst_force(skb);
> +	if (!skb_dst(skb)) {
> +		XFRM_INC_STATS(net, LINUX_MIB_XFRMFWDHDRERROR);
> +		return 0;
> +	}
>
>  	dst = xfrm_lookup(net, skb_dst(skb), &fl, NULL, XFRM_LOOKUP_QUEUE);
>  	if (IS_ERR(dst)) {
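
The force-then-check pattern used in both hunks can be modelled in plain userspace C. This is only a rough sketch and not kernel code: every toy_* name is hypothetical, the refcount is a plain int rather than an atomic, and the counter argument merely stands in for XFRM_INC_STATS(). It assumes, as the commit message describes, that forcing a reference may clear the attached dst when the entry is already being torn down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel structures. */
struct toy_dst {
	int refcnt;		/* 0 means the entry is already being torn down */
};

struct toy_skb {
	struct toy_dst *dst;	/* attached route, may become NULL */
	bool dst_noref;		/* true while no reference is held on dst */
};

/* Take a reference only if the entry is still alive. */
static bool toy_dst_hold_safe(struct toy_dst *dst)
{
	if (dst->refcnt == 0)
		return false;
	dst->refcnt++;
	return true;
}

/* Models the behaviour described in the commit message: may clear dst. */
static void toy_skb_dst_force(struct toy_skb *skb)
{
	if (!skb->dst_noref || !skb->dst)
		return;
	if (!toy_dst_hold_safe(skb->dst))
		skb->dst = NULL;
	skb->dst_noref = false;
}

/* Caller-side pattern from the patch: force, then check, then drop. */
static int toy_output_one(struct toy_skb *skb, unsigned long *drop_counter)
{
	toy_skb_dst_force(skb);
	if (!skb->dst) {
		(*drop_counter)++;	/* stands in for XFRM_INC_STATS() */
		return -1;		/* drop instead of dereferencing NULL */
	}
	assert(skb->dst->refcnt > 0);	/* safe to use the dst from here on */
	return 0;
}
```

Calling toy_output_one() on a packet whose dst has a refcount of 0 bumps the drop counter and returns -1, which corresponds to the rare drops counted by the patch instead of a crash.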
This patch fixes the problem here.

XfrmFwdHdrError rises to around 80 at the very beginning and then stays
there. This probably happens while some routes are being changed or set at
that point.

Regards and thanks,
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
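
For anyone who wants to watch the counter mentioned above, here is a minimal sketch of a lookup helper. It assumes the counters are exposed in /proc/net/xfrm_stat (which requires a kernel built with CONFIG_XFRM_STATISTICS) as one "Name value" pair per line. The helper name xfrm_stat_lookup is hypothetical and not part of any existing tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up one counter in a stream holding one "Name<whitespace>value"
 * pair per line. Returns 0 and stores the value on success, or -1 if
 * the name is not found (or parsing stops at a malformed line).
 * Hypothetical helper, written for illustration only.
 */
static int xfrm_stat_lookup(FILE *fp, const char *name, unsigned long *value)
{
	char key[64];
	unsigned long v;

	assert(fp && name && value);
	while (fscanf(fp, "%63s %lu", key, &v) == 2) {
		if (strcmp(key, name) == 0) {
			*value = v;
			return 0;
		}
	}
	return -1;
}
```

A caller would open the file with fopen("/proc/net/xfrm_stat", "r") and pass "XfrmFwdHdrError" or "XfrmOutError" as the name; sampling the value periodically shows whether the count keeps growing or stays flat after startup.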