Hello Steffen,

In one of your emails to Thomas you wrote:
> xfrm_lookup+0x2a is at the very beginning of xfrm_lookup(), here we
> find:
> 
> u16 family = dst_orig->ops->family;
> 
> ops has an offset of 32 bytes (20 hex) in dst_orig, so looks like
> dst_orig is NULL.
> 
> In the forwarding case, we get dst_orig from the skb and dst_orig
> can't be NULL here unless the skb itself is already fishy.

Is this really true?

If xfrm_lookup is called from 

__xfrm_route_forward():

int __xfrm_route_forward(struct sk_buff *skb, unsigned short family)
{
        struct net *net = dev_net(skb->dev);
        struct flowi fl;
        struct dst_entry *dst;
        int res = 1;

        if (xfrm_decode_session(skb, &fl, family) < 0) {
                XFRM_INC_STATS(net, LINUX_MIB_XFRMFWDHDRERROR);
                return 0;
        }

        skb_dst_force(skb);

        dst = xfrm_lookup(net, skb_dst(skb), &fl, NULL, XFRM_LOOKUP_QUEUE);
        if (IS_ERR(dst)) {
                res = 0;
                dst = NULL;
        }
        skb_dst_set(skb, dst);
        return res;
}

Couldn't skb_dst_force(skb) actually set the dst to NULL if it cannot safely
take a reference on it? And if it is certain that skb_dst_force() can never
set the dst to NULL, I wonder why it is called at all.


Here is skb_dst_force():

static inline void skb_dst_force(struct sk_buff *skb)
{
        if (skb_dst_is_noref(skb)) {
                struct dst_entry *dst = skb_dst(skb);

                WARN_ON(!rcu_read_lock_held());
                if (!dst_hold_safe(dst))
                        dst = NULL;

                skb->_skb_refdst = (unsigned long)dst;
        }
}

and dst_hold_safe() is

static inline bool dst_hold_safe(struct dst_entry *dst)
{
        return atomic_inc_not_zero(&dst->__refcnt);
}
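
To make the concern concrete, here is a minimal user-space sketch (C11
atomics, not kernel code) of the two helpers above. The struct layouts, the
noref flag, and the names inc_not_zero(), dst_hold_safe_model(),
skb_dst_force_model() and dst_survives_force() are hypothetical stand-ins for
illustration only. It assumes atomic_inc_not_zero() behaves as its name says:
it takes a reference only if the count is not already zero.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel types. */
struct dst_entry {
        atomic_int refcnt;      /* models dst->__refcnt */
};

struct sk_buff {
        struct dst_entry *dst;  /* models skb->_skb_refdst without the flag bit */
        bool noref;             /* models skb_dst_is_noref() */
};

/* Models atomic_inc_not_zero(): increment only if the count is not zero. */
static bool inc_not_zero(atomic_int *v)
{
        int old = atomic_load(v);

        while (old != 0) {
                /* On failure, old is updated to the current value; retry. */
                if (atomic_compare_exchange_weak(v, &old, old + 1))
                        return true;
        }
        return false;
}

static bool dst_hold_safe_model(struct dst_entry *dst)
{
        return inc_not_zero(&dst->refcnt);
}

/* Models skb_dst_force(): take a real reference, or drop the dst. */
static void skb_dst_force_model(struct sk_buff *skb)
{
        if (skb->noref) {
                struct dst_entry *dst = skb->dst;

                if (!dst_hold_safe_model(dst))
                        dst = NULL;

                skb->dst = dst;
                skb->noref = false;
        }
}

/* Returns true if a noref dst with the given refcount survives the force. */
bool dst_survives_force(int refcnt)
{
        struct dst_entry dst;
        struct sk_buff skb = { .dst = &dst, .noref = true };

        atomic_init(&dst.refcnt, refcnt);
        skb_dst_force_model(&skb);
        return skb.dst != NULL;
}
```

In this model dst_survives_force(1) is true, while dst_survives_force(0) is
false: a noref dst whose refcount has already dropped to zero leaves the skb
with a NULL dst, which a caller would then pass on as dst_orig.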



On Friday, 7 September 2018, 22:22:39, Wolfgang Walter wrote:
> On Friday, 31 August 2018, 08:50:24, Steffen Klassert wrote:
> > On Thu, Aug 30, 2018 at 08:53:50PM +0200, Wolfgang Walter wrote:
> > > Hello,
> > > 
> > > Kernels > 4.12 do not work on one of our main routers. They crash as
> > > soon as ipsec-tunnels are configured and ipsec-traffic actually flows.
> > 
> > Can you please send the backtrace of this crash?
> 
> I booted b838d5e1c5b6e57b10ec8af2268824041e3ea911 several times but I
> could not record the complete trace. I think I have to log to the serial
> console but I can't do that before next week.
> 
> 
> What I could record is:
> 
> There is always an
>       <IRQ> ... </IRQ>
> section in the call trace.
> 
> This is the part I could see:
> 
> 
> irq_exit+0x71/0x80
> do_IRQ+0x4d/0xd0
> common_interrupt+0x7a/0x7a
> </IRQ>
> RIP: 0010:cpuidle_enter_state+0x11d/0x200
> RSP: 0018:ffffc9000321bee0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffffc4
> RAX: ffff88085efde450 RBX: 0000000000000004 RCX: 00000003c9e63c13
> RDX: 00000003c9e63c13 RSI: ffb03103fe35ac43 RDI: 0000000000000000
> RBP: ffffe8ffff7cf600 R08: 000000000000000c R09: 0000000000000004
> R10: 0000000000000400 R11: 00000003c99e56fc R12: 00000003c9e63c13
> R13: 00000003c9da9567 R14: 0000000000000004 R15: ffffffff822763e0
> do_idle+0xd3/0x160
> cpu_startup_entry+0x14/0x20
> secondary_startup_64+0xa5/0xb0
> Code: 00 0f b7 83 c0 00 00 00 80 7c 02 08 01 0f 86 d3 02 00 00 41
> 8b 8c 24 3c 10 00 00 48 8b 6b 58 85 c9 0f 84 2f 01 00 00 48 83 e5 fe <f6> 45
> 60 02 0f 84 4e 01 00 00 f6 43 38 01 74 0d 80 00 bd ab 00 00
> RIP: ip_forward+0xd4/0x470 RSP: ffff88085efc3cb0
> CR2: 0000000000000060
> ----[ end trace 7205b53c25b7b35a ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> Kernel Offset: disabled
> Rebooting in 60 seconds..
> 
> 
> I got an email from Tobias Hommel and I think it is the same problem.
> 
> It is very clear that it is the difference from
> 
>       ipv4: call dst_hold_safe() properly
> 
> to
> 
>       ipv4: mark DST_NOGC and remove the operation of dst_free()
> 
> which triggers this bug.
> 
> Regards,

Regards
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
