On 2018/12/04 11:52, Nicolas Belouin wrote:
> On 03/12 07:59, Eric Dumazet wrote:
> >
> >
> > On 12/03/2018 07:20 AM, Nicolas Belouin wrote:
> > > Hi,
> > > > I ran into a panic while adding an interface to a bridge that already
> > > > has a vxlan interface attached to it. It seems to be related to mtu=9000.
> > > >
> > > > I get the following panic info:
> > >
> > > [ 2482.419893] br100: port 2(vif1.1) entered blocking state
> > > [ 2482.425427] br100: port 2(vif1.1) entered forwarding state
> > > [ 2482.431797] skbuff: skb_over_panic: text:816e4f78 len:40
> > > put:40 head:880146449000 data:880146458fd0 tail:0xfff8 end:0xec0
> > > dev:vif1.1
> > > > [ 2482.442891] ------------[ cut here ]------------
> > > [ 2482.448254] kernel BUG at
> > > /srv/jenkins/workspace/workspace/hosting-xen-dom0-kernel/build/src/linux-4.9/net/core/skbuff.c:105!
> > > [ 2482.459009] invalid opcode: [#1] PREEMPT SMP
> > > [ 2482.464371] Modules linked in:
> > > [ 2482.469682] CPU: 19 PID: 1317 Comm: kworker/19:1 Not tainted
> > > 4.9.135-dom0-e9d15b2-x86_64-iaas #2
> > > [ 2482.480362] Hardware name: Dell Inc. PowerEdge C8220/09N44V, BIOS
> > > 2.7.1 03/04/2015
> > > [ 2482.491008] Workqueue: ipv6_addrconf addrconf_dad_work
> > > [ 2482.496380] task: 88017eef1a00 task.stack: c90001fcc000
> > > [ 2482.501785] RIP: e030:[] []
> > > skb_panic+0x5f/0x70
> > > [ 2482.512450] RSP: e02b:c90001fcfba8 EFLAGS: 00010296
> > > [ 2482.517817] RAX: 0088 RBX: 880117fb0800 RCX:
> > >
> > > [ 2482.528447] RDX: 0088 RSI: 880184cd03c8 RDI:
> > > 880184cd03c8
> > > [ 2482.539085] RBP: c90001fcfc00 R08: 06a8 R09:
> > > 81ea7359
> > > [ 2482.549717] R10: 880180406f80 R11: 06a8 R12:
> > > 880147258cc0
> > > [ 2482.560350] R13: c90001fcfc20 R14: 81d13440 R15:
> > >
> > > [ 2482.570993] FS: () GS:880184cc()
> > > knlGS:
> > > [ 2482.581646] CS: e033 DS: ES: CR0: 80050033
> > > [ 2482.587039] CR2: 7f5b17f032b0 CR3: 01c08000 CR4:
> > > 00042660
> > > [ 2482.597675] Stack:
> > > [ 2482.602958] 880146458fd0 fff8 0ec0
> > > 88017f3f
> > > [ 2482.613619] 815efa62 816e4f78 880117fb0800
> > > c90001fcfc20
> > > [ 2482.624288] 880147258cc0 88017f3f 880146502000
> > > c90001fcfc68
> > > [ 2482.634955] Call Trace:
> > > [ 2482.640254] [] ? skb_put+0x42/0x50
> > > [ 2482.645633] [] ? ip6_mc_hdr.constprop.36+0x58/0xd0
> > > [ 2482.651045] [] ? mld_newpack+0x12a/0x1e0
> > > [ 2482.656421] [] ? add_grhead.isra.28+0x87/0xa0
> > > [ 2482.661825] [] ? add_grec+0x446/0x4c0
> > > [ 2482.667198] [] ? __local_bh_enable_ip+0x1b/0xb0
> > > [ 2482.672609] [] ?
> > > mld_send_initial_cr.part.29+0x58/0xa0
> > > [ 2482.678022] [] ? ipv6_mc_dad_complete+0x26/0x60
> > > [ 2482.683441] [] ? addrconf_dad_completed+0x29f/0x2c0
> > > [ 2482.688850] [] ? ipv6_dev_mc_inc+0x194/0x2c0
> > > [ 2482.694249] [] ? addrconf_dad_work+0xfe/0x3d0
> > > [ 2482.699650] [] ? _raw_spin_unlock_irq+0xd/0x20
> > > [ 2482.705052] [] ? process_one_work+0x142/0x3e0
> > > [ 2482.710453] [] ? worker_thread+0x62/0x480
> > > [ 2482.715848] [] ? process_one_work+0x3e0/0x3e0
> > > [ 2482.721256] [] ? kthread+0xe2/0x100
> > > [ 2482.726621] [] ? __switch_to+0x261/0x6b0
> > > [ 2482.732006] [] ? kthread_park+0x60/0x60
> > > [ 2482.737379] [] ? ret_from_fork+0x57/0x70
> > > [ 2482.742761] Code: 00 00 48 89 44 24 10 8b 87 b0 00 00 00 48 89 44 24
> > > 08 48 8b 87 c0 00 00 00 48 c7 c7 50 8e a2 81 48 89 04 24 31 c0 e8 b5 07
> > > b6 ff <0f> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
> > > [ 2482.759199] RIP [] skb_panic+0x5f/0x70
> > > [ 2482.764672] RSP
> > > [ 2482.771186] ---[ end trace 6d0fe52ed049d841 ]---
> > > [ 2482.776641] Kernel panic - not syncing: Fatal exception in interrupt
> > > [ 2482.861621] Kernel Offset: disabled
> > >
> > > I circumvented the bug by applying this patch:
> > > diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> > > index 21f6deb2aec9..2762c3dcc883 100644
> > > --- a/net/ipv6/mcast.c
> > > +++ b/net/ipv6/mcast.c
> > > @@ -1605,8 +1605,6 @@ static struct sk_buff *mld_newpack(struct inet6_dev
> > > *idev, unsigned int mtu)
> > > IPV6_TLV_PADN, 0 };
> > >
> > > /* we assume size > sizeof(ra) here */
> > > - /* limit our allocations to order-0 page */
> > > - size = min_t(int, size, SKB_MAX_ORDER(0, 0));
> > > > skb = sock_alloc_send_skb(sk, size, 1, &err);
> > >
> > > if (!skb)
> > >
> > > > These lines were introduced by commit
> > > > 72e09ad107e78d69ff4d3b97a69f0aad2b77280f,
> > > > which states that "order-2 GFP_ATOMIC allocations are very unreliable".
> > > > I wonder whether this statement is still relevant, or whether such a
> > > > patch would be acceptable to you.
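
For readers following the numbers, here is a small sketch that decodes the
skb_over_panic line from the report. It only does arithmetic on the values
printed in the log. The head and data addresses are used as printed (their
upper bits are missing), which does not affect the difference between them.
The function names are made up for illustration, and the 4096-byte page size
and 320-byte shared-info overhead are assumptions for x86_64, not values taken
from the report.

```python
# Values copied from the skb_over_panic line in the report.
HEAD = 0x880146449000   # skb->head as printed in the log
DATA = 0x880146458fd0   # skb->data as printed in the log
TAIL = 0xfff8           # skb->tail, offset from head
END = 0xec0             # skb->end, offset from head
PUT = 40                # bytes skb_put() tried to append

# Assumptions (not from the report): 4 KiB pages, 320 bytes of aligned
# struct skb_shared_info overhead on x86_64.
ASSUMED_PAGE_SIZE = 4096
ASSUMED_SHINFO_OVERHEAD = 320


def headroom(head: int, data: int) -> int:
    """Bytes reserved in front of the payload (data - head)."""
    return data - head


def overruns(tail: int, end: int) -> bool:
    """True when the write would go past the end of the buffer."""
    return tail > end


if __name__ == "__main__":
    room = headroom(HEAD, DATA)
    print(f"headroom       = {room} bytes ({room:#x})")
    print(f"headroom + put = {room + PUT} bytes, tail = {TAIL} bytes")
    print(f"end            = {END} bytes")
    print(f"order-0 limit  = {ASSUMED_PAGE_SIZE - ASSUMED_SHINFO_OVERHEAD} bytes (assumed)")
    print(f"overrun        = {overruns(TAIL, END)}")
```

Read this way, the reserved headroom alone is already far larger than the
buffer, which is what one would expect if the allocation size is clamped to an
order-0 page while the reserved headroom is not.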
> >
> >