On Tuesday, 11 September 2018, 12:33:34, Steffen Klassert wrote:
> On Mon, Sep 10, 2018 at 10:18:47AM +0200, Kristian Evensen wrote:
> > Hi,
> > 
> > Thanks everyone for all the effort in debugging this issue.
> > 
> > On Mon, Sep 10, 2018 at 8:39 AM Steffen Klassert
> > <steffen.klass...@secunet.com> wrote:
> > > The easy fix that could be backported to stable would be
> > > to check skb->dst for NULL and drop the packet in that case.
> > 
> > Thought I should just chime in and say that we deployed this
> > work-around when we started observing the error back in June. Since
> > then we have not seen any crashes. Also, we have instrumented some of
> > our kernels to count the number of times the error is hit (overall +
> > consecutive). Compared to the overall number of packets, the error
> > happens very rarely. With our workloads, we on average see the error
> > once every couple of days.
> 
> Thanks for letting us know!
> 
> I plan to fix this in the ipsec tree with:
> 
> Subject: [PATCH RFC] xfrm: Fix NULL pointer dereference when skb_dst_force
> clears the dst_entry.
> 
> Since commit 222d7dbd258d ("net: prevent dst uses after free")
> skb_dst_force() might clear the dst_entry attached to the skb.
> The xfrm code doesn't expect this to happen, so we crash with
> a NULL pointer dereference in this case. Fix it by checking
> skb_dst(skb) for NULL after skb_dst_force() and drop the packet
> in case the dst_entry was cleared.
> 
> Fixes: 222d7dbd258d ("net: prevent dst uses after free")
> Reported-by: Tobias Hommel <netdev-l...@genoetigt.de>
> Reported-by: Kristian Evensen <kristian.even...@gmail.com>
> Reported-by: Wolfgang Walter <li...@stwm.de>
> Signed-off-by: Steffen Klassert <steffen.klass...@secunet.com>
> ---
>  net/xfrm/xfrm_output.c | 4 ++++
>  net/xfrm/xfrm_policy.c | 4 ++++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
> index 89b178a78dc7..36d15a38ce5e 100644
> --- a/net/xfrm/xfrm_output.c
> +++ b/net/xfrm/xfrm_output.c
> @@ -101,6 +101,10 @@ static int xfrm_output_one(struct sk_buff *skb, int err)
>               spin_unlock_bh(&x->lock);
> 
>               skb_dst_force(skb);
> +             if (!skb_dst(skb)) {
> +                     XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
> +                     goto error_nolock;
> +             }
> 
>               if (xfrm_offload(skb)) {
>                       x->type_offload->encap(x, skb);
> diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
> index 7c5e8978aeaa..626e0f4d1749 100644
> --- a/net/xfrm/xfrm_policy.c
> +++ b/net/xfrm/xfrm_policy.c
> @@ -2548,6 +2548,10 @@ int __xfrm_route_forward(struct sk_buff *skb, unsigned short family)
>       }
> 
>       skb_dst_force(skb);
> +     if (!skb_dst(skb)) {
> +             XFRM_INC_STATS(net, LINUX_MIB_XFRMFWDHDRERROR);
> +             return 0;
> +     }
> 
>       dst = xfrm_lookup(net, skb_dst(skb), &fl, NULL, XFRM_LOOKUP_QUEUE);
>       if (IS_ERR(dst)) {

This patch fixes the problem here.
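
For anyone following along, here is a rough userspace model of the pattern the
patch applies: force a reference on the dst, then check for NULL before using
it. This is only a sketch. Every model_* name is hypothetical and the structs
are not the kernel's real layouts; it assumes, as the commit message says, that
skb_dst_force() clears the dst when it cannot take a reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct model_dst {
	int refcnt;		/* 0 models an entry that is already being freed */
};

struct model_skb {
	struct model_dst *dst;	/* attached dst, or NULL */
	bool dst_refcounted;	/* true once the skb holds its own reference */
};

/* Take a reference only if the entry is still alive. */
bool model_dst_hold_safe(struct model_dst *dst)
{
	if (dst->refcnt == 0)
		return false;
	dst->refcnt++;
	return true;
}

/* Assumed behaviour: if no reference can be taken, the dst is cleared. */
void model_skb_dst_force(struct model_skb *skb)
{
	if (skb->dst && !skb->dst_refcounted) {
		if (model_dst_hold_safe(skb->dst))
			skb->dst_refcounted = true;
		else
			skb->dst = NULL;
	}
}

/*
 * Caller pattern from the patch: force, then check before any use.
 * Returns 0 if the packet may proceed, -1 if it has to be dropped.
 * drop_counter stands in for the XFRM_INC_STATS() call.
 */
int model_output_one(struct model_skb *skb, unsigned long *drop_counter)
{
	assert(skb != NULL && drop_counter != NULL);

	model_skb_dst_force(skb);
	if (!skb->dst) {
		(*drop_counter)++;
		return -1;
	}
	return 0;
}
```

Calling model_output_one() on an skb whose dst has refcnt 0 takes the drop
path and bumps the counter instead of dereferencing a NULL dst later on.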

XfrmFwdHdrError climbs to around 80 at the very beginning and then stays
there. This probably happens while some routes are being changed or set.
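
In case it helps others watch the same counter: a small sketch for pulling one
value out of text in the /proc/net/xfrm_stat format (one name and value per
line). The function names are made up for illustration, and the path is an
assumption that only holds on kernels built with CONFIG_XFRM_STATISTICS.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up one counter in text formatted like /proc/net/xfrm_stat,
 * i.e. one "Name <whitespace> value" pair per line.
 * Returns 0 and stores the value on success, -1 if the name is absent.
 */
int xfrm_stat_lookup(const char *text, const char *name, long *value)
{
	const char *line = text;

	assert(text != NULL && name != NULL && value != NULL);

	while (line && *line) {
		char key[64];
		long val;

		if (sscanf(line, "%63s %ld", key, &val) == 2 &&
		    strcmp(key, name) == 0) {
			*value = val;
			return 0;
		}
		/* Advance to the start of the next line, if there is one. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Hypothetical helper: read up to len - 1 bytes of a file into buf and
 * NUL-terminate it. Returns 0 on success, -1 if the file cannot be opened.
 */
int read_text_file(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");
	size_t n;

	if (!f)
		return -1;
	n = fread(buf, 1, len - 1, f);
	buf[n] = '\0';
	fclose(f);
	return 0;
}
```

Reading /proc/net/xfrm_stat with read_text_file() and passing the buffer to
xfrm_stat_lookup() with "XfrmFwdHdrError" gives the value mentioned above;
sampling it periodically shows whether it keeps growing after startup.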

Regards and thanks,
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
