Re: BUG: unable to handle kernel paging request in fib6_node_lookup_1

2018-09-05 Thread Song Liu



> On Sep 5, 2018, at 10:32 AM, David Ahern  wrote:
> 
> On 9/5/18 12:11 AM, Song Liu wrote:
>> We are debugging an issue with fib6_node_lookup_1(). 
>> 
>> We use a 4.16 based kernel, and we have back ported most upstream
>> patches in ip6_fib.{c.h}. The only major differences I can spot are
>> 
> 
> Did you backport all patches in each set that included a change to those
> files, or just the patches to ip6_fib.* and any dependencies?

I believe we always try back port all patches in each set. But we have back
ported hundreds of patches to our 4.16 tree, so it is also likely we missed
some useful patches. 

Thanks,
Song



Re: BUG: unable to handle kernel paging request in fib6_node_lookup_1

2018-09-05 Thread Song Liu



> On Sep 5, 2018, at 10:09 AM, Wei Wang  wrote:
> 
> On Tue, Sep 4, 2018 at 11:11 PM Song Liu  wrote:
>> 
>> We are debugging an issue with fib6_node_lookup_1().
>> 
>> We use a 4.16 based kernel, and we have back ported most upstream
>> patches in ip6_fib.{c.h}. The only major differences I can spot are
>> 
>> 8b7f2731bd68d83940714ce92381d1a72596407c
>> c350637229fccbffee2475400fcd689d5738
>> 
>> I guess the issue is not related to these two fixes.
>> 
>> After staring at the call trace and disassembly code (attached below)
>> I guess this is a use-after-free issue in (or right after) the lookup
>> loop:
>> 
>>for (;;) {
>>struct fib6_node *next;
>> 
>>dir = addr_bit_set(args->addr, fn->fn_bit);
>> 
>>next = dir ? rcu_dereference(fn->right) :
>> rcu_dereference(fn->left);
>> 
>>if (next) {
>>fn = next;
>>continue;
>>}
>>break;
>>}
>> 
>> I guess this probably also happens to latest upstream. I haven't
>> tested this with upstream kernel (or net tree) yet, because we
>> can only trigger this about once a week on 100 servers.
>> 
>> Does this look familiar? Any comments and/or suggestions are highly
>> appreciated.
>> 
> By glancing at the commit logs, I don't think any changes were made
> regarding the core logic of fib6_node handling recently.
> (There were a couple of fixes regarding fib6_info but I don't think it
> is the cause here... But it is still good to check if you have commit
> 9b0a8da8c4c6, e873e4b9cc7e, e70a3aad44cc in your build.)

Looks like we don't have e70a3aad44cc. I think it fixes a memory leak 
(instead of a use-after-free)? Let me add it and run some tests anyway. 
Thanks a lot for this information. 

> 
> I also went through the call path and did not find anything obviously wrong...
> I think it's the best for you to reproduce it and we can debug further.
> One question is, do you have "CONFIG_IPV6_SUBTREE" enabled and specify
> src IP in the routing table?

We do have CONFIG_IPV6_SUBTREE enabled. But we usually do not specify
src IP in the routing table. 

Let me try to reproduce it. 

Thanks again,
Song




Re: BUG: unable to handle kernel paging request in fib6_node_lookup_1

2018-09-05 Thread David Ahern
On 9/5/18 12:11 AM, Song Liu wrote:
> We are debugging an issue with fib6_node_lookup_1(). 
> 
> We use a 4.16 based kernel, and we have back ported most upstream
> patches in ip6_fib.{c.h}. The only major differences I can spot are
> 

Did you backport all patches in each set that included a change to those
files, or just the patches to ip6_fib.* and any dependencies?


Re: BUG: unable to handle kernel paging request in fib6_node_lookup_1

2018-09-05 Thread Wei Wang
On Tue, Sep 4, 2018 at 11:11 PM Song Liu  wrote:
>
> We are debugging an issue with fib6_node_lookup_1().
>
> We use a 4.16 based kernel, and we have back ported most upstream
> patches in ip6_fib.{c.h}. The only major differences I can spot are
>
> 8b7f2731bd68d83940714ce92381d1a72596407c
> c350637229fccbffee2475400fcd689d5738
>
> I guess the issue is not related to these two fixes.
>
> After staring at the call trace and disassembly code (attached below)
> I guess this is a use-after-free issue in (or right after) the lookup
> loop:
>
> for (;;) {
> struct fib6_node *next;
>
> dir = addr_bit_set(args->addr, fn->fn_bit);
>
> next = dir ? rcu_dereference(fn->right) :
>  rcu_dereference(fn->left);
>
> if (next) {
> fn = next;
> continue;
> }
> break;
> }
>
> I guess this probably also happens to latest upstream. I haven't
> tested this with upstream kernel (or net tree) yet, because we
> can only trigger this about once a week on 100 servers.
>
> Does this look familiar? Any comments and/or suggestions are highly
> appreciated.
>
By glancing at the commit logs, I don't think any changes were made
regarding the core logic of fib6_node handling recently.
(There were a couple of fixes regarding fib6_info but I don't think it
is the cause here... But it is still good to check if you have commit
9b0a8da8c4c6, e873e4b9cc7e, e70a3aad44cc in your build.)

I also went through the call path and did not find anything obviously wrong...
I think it's the best for you to reproduce it and we can debug further.
One question is, do you have "CONFIG_IPV6_SUBTREE" enabled and specify
src IP in the routing table?

Thanks.
Wei

> Thanks,
> Song
>
>
> Bug stack trace:
>
> [354764.457916] BUG: unable to handle kernel
> [354764.466125] paging request
> [354764.471720]  at f60fc318
> [354764.478360] IP: fib6_node_lookup_1+0x29/0x130
> [354764.487249] PGD 80010f725067
> [354764.494062] P4D 80010f725067
> [354764.500878] PUD 0
> [354764.505087] Oops:  [#1] SMP PTI
> [354764.512245] Modules linked in:
> [354764.518536]  udp_diag
> [354764.523266]  act_gact
> [354764.527997]  cls_bpf
> [354764.532557]  tcp_diag
> [354764.537291]  inet_diag
> [354764.542200]  nfsv3
> [354764.546409]  nfs
> [354764.550273]  fscache
> [354764.554834]  ip6table_raw
> [354764.560260]  ip6table_filter
> [354764.566208]  xt_DSCP
> [354764.570765]  iptable_raw
> [354764.576020]  iptable_filter
> [354764.581790]  ip6table_mangle
> [354764.587738]  iptable_mangle
> [354764.593505]  sb_edac
> [354764.598058]  x86_pkg_temp_thermal
> [354764.604872]  intel_powerclamp
> [354764.610992]  coretemp
> [354764.615723]  kvm_intel
> [354764.620628]  kvm
> [354764.624494]  irqbypass
> [354764.629399]  iTCO_wdt
> [354764.634132]  iTCO_vendor_support
> [354764.640772]  i2c_i801
> [354764.645507]  lpc_ich
> [354764.650064]  efivars
> [354764.654619]  mfd_core
> [354764.659353]  ipmi_si
> [354764.663911]  ipmi_devintf
> [354764.669341]  ipmi_msghandler
> [354764.675281]  acpi_cpufreq
> [354764.680711]  button
> [354764.685096]  sch_fq_codel
> [354764.690520]  nfsd
> [354764.694557]  nfs_acl
> [354764.699118]  lockd
> [354764.703330]  auth_rpcgss
> [354764.708588]  oid_registry
> [354764.714006]  grace
> [354764.718213]  sunrpc
> [354764.722590]  fuse
> [354764.726626]  loop
> [354764.730661]  efivarfs
> [354764.735395]  autofs4
> [354764.739957] CPU: 5 PID: 3460038 Comm: java Not tainted 
> 4.16.0-14_fbk2_1455_g6bcb99c57db6 #14
> [354764.756996] Hardware name: Wiwynn Leopard-Orv2/Leopard-DDR BW, BIOS LBM03 
>   06/02/2016
> [354764.773001] RIP: 0010:fib6_node_lookup_1+0x29/0x130
> [354764.782929] RSP: 0018:c9003f0bb730 EFLAGS: 00010206
> [354764.793557] RAX: 883fc131a000 RBX: f60fc300 RCX: 
> ffe4
> [354764.807999] RDX: 0010 RSI: 0001 RDI: 
> c9003f0bb8f0
> [354764.822436] RBP: c9003f0bb750 R08: 0002 R09: 
> 0004
> [354764.836877] R10: c9003f0bb7a8 R11: 883ff7795780 R12: 
> 82305080
> [354764.851317] R13: 0002 R14:  R15: 
> 
> [354764.865765] FS:  7f8defcfc700() GS:881fff94() 
> knlGS:
> [354764.882119] CS:  0010 DS:  ES:  CR0: 80050033
> [354764.893800] CR2: f60fc318 CR3: 000f68cae006 CR4: 
> 003606e0
> [354764.908235] DR0:  DR1:  DR2: 
> 
> [354764.922671] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [354764.937109] Call Trace:
> [354764.942195]  fib6_node_lookup+0x67/0x90
> [354764.950042]  ? fib6_table_lookup+0x43/0x2f0
> [354764.958587]  fib6_table_lookup+0x43/0x2f0
> [354764.966794]  ip6_pol_route+0x43/0x360
> [354764.974294]  ? ip6_pol_route_input+0x20/0x20
> [354764.983016]  

BUG: unable to handle kernel paging request in fib6_node_lookup_1

2018-09-05 Thread Song Liu
We are debugging an issue with fib6_node_lookup_1(). 

We use a 4.16 based kernel, and we have back ported most upstream
patches in ip6_fib.{c.h}. The only major differences I can spot are

8b7f2731bd68d83940714ce92381d1a72596407c
c350637229fccbffee2475400fcd689d5738

I guess the issue is not related to these two fixes. 

After staring at the call trace and disassembly code (attached below) 
I guess this is a use-after-free issue in (or right after) the lookup 
loop:

for (;;) {
struct fib6_node *next;

dir = addr_bit_set(args->addr, fn->fn_bit);

next = dir ? rcu_dereference(fn->right) :
 rcu_dereference(fn->left);

if (next) {
fn = next;
continue;
}
break;
}

I guess this probably also happens to latest upstream. I haven't 
tested this with upstream kernel (or net tree) yet, because we 
can only trigger this about once a week on 100 servers. 

Does this look familiar? Any comments and/or suggestions are highly
appreciated. 

Thanks, 
Song


Bug stack trace:

[354764.457916] BUG: unable to handle kernel
[354764.466125] paging request
[354764.471720]  at f60fc318
[354764.478360] IP: fib6_node_lookup_1+0x29/0x130
[354764.487249] PGD 80010f725067
[354764.494062] P4D 80010f725067
[354764.500878] PUD 0
[354764.505087] Oops:  [#1] SMP PTI
[354764.512245] Modules linked in:
[354764.518536]  udp_diag
[354764.523266]  act_gact
[354764.527997]  cls_bpf
[354764.532557]  tcp_diag
[354764.537291]  inet_diag
[354764.542200]  nfsv3
[354764.546409]  nfs
[354764.550273]  fscache
[354764.554834]  ip6table_raw
[354764.560260]  ip6table_filter
[354764.566208]  xt_DSCP
[354764.570765]  iptable_raw
[354764.576020]  iptable_filter
[354764.581790]  ip6table_mangle
[354764.587738]  iptable_mangle
[354764.593505]  sb_edac
[354764.598058]  x86_pkg_temp_thermal
[354764.604872]  intel_powerclamp
[354764.610992]  coretemp
[354764.615723]  kvm_intel
[354764.620628]  kvm
[354764.624494]  irqbypass
[354764.629399]  iTCO_wdt
[354764.634132]  iTCO_vendor_support
[354764.640772]  i2c_i801
[354764.645507]  lpc_ich
[354764.650064]  efivars
[354764.654619]  mfd_core
[354764.659353]  ipmi_si
[354764.663911]  ipmi_devintf
[354764.669341]  ipmi_msghandler
[354764.675281]  acpi_cpufreq
[354764.680711]  button
[354764.685096]  sch_fq_codel
[354764.690520]  nfsd
[354764.694557]  nfs_acl
[354764.699118]  lockd
[354764.703330]  auth_rpcgss
[354764.708588]  oid_registry
[354764.714006]  grace
[354764.718213]  sunrpc
[354764.722590]  fuse
[354764.726626]  loop
[354764.730661]  efivarfs
[354764.735395]  autofs4
[354764.739957] CPU: 5 PID: 3460038 Comm: java Not tainted 
4.16.0-14_fbk2_1455_g6bcb99c57db6 #14
[354764.756996] Hardware name: Wiwynn Leopard-Orv2/Leopard-DDR BW, BIOS LBM03   
06/02/2016
[354764.773001] RIP: 0010:fib6_node_lookup_1+0x29/0x130
[354764.782929] RSP: 0018:c9003f0bb730 EFLAGS: 00010206
[354764.793557] RAX: 883fc131a000 RBX: f60fc300 RCX: 
ffe4
[354764.807999] RDX: 0010 RSI: 0001 RDI: 
c9003f0bb8f0
[354764.822436] RBP: c9003f0bb750 R08: 0002 R09: 
0004
[354764.836877] R10: c9003f0bb7a8 R11: 883ff7795780 R12: 
82305080
[354764.851317] R13: 0002 R14:  R15: 

[354764.865765] FS:  7f8defcfc700() GS:881fff94() 
knlGS:
[354764.882119] CS:  0010 DS:  ES:  CR0: 80050033
[354764.893800] CR2: f60fc318 CR3: 000f68cae006 CR4: 
003606e0
[354764.908235] DR0:  DR1:  DR2: 

[354764.922671] DR3:  DR6: fffe0ff0 DR7: 
0400
[354764.937109] Call Trace:
[354764.942195]  fib6_node_lookup+0x67/0x90
[354764.950042]  ? fib6_table_lookup+0x43/0x2f0
[354764.958587]  fib6_table_lookup+0x43/0x2f0
[354764.966794]  ip6_pol_route+0x43/0x360
[354764.974294]  ? ip6_pol_route_input+0x20/0x20
[354764.983016]  fib6_rule_lookup+0x85/0x140
[354764.991050]  ? ip6t_do_table+0x331/0x6b0
[354764.999089]  ? ip6_route_output_flags+0xa3/0xc0
[354765.008342]  ip6_route_me_harder+0xab/0x280
[354765.016889]  ip6table_mangle_hook+0xd4/0x110 [ip6table_mangle]
[354765.028754]  ? nf_hook_slow+0x43/0xc0
[354765.036269]  nf_hook_slow+0x43/0xc0
[354765.043445]  nf_hook+0x6e/0xc0
[354765.049731]  ? ac6_proc_exit+0x20/0x20
[354765.057412]  ip6_xmit+0x28a/0x500
[354765.064225]  ? ac6_proc_exit+0x20/0x20
[354765.071902]  ? inet6_csk_route_socket+0x10f/0x1c0
[354765.081495]  ? update_group_capacity+0x23/0x1e0
[354765.090749]  inet6_csk_xmit+0x82/0xd0
[354765.098277]  tcp_transmit_skb+0x51d/0x9d0
[354765.106495]  tcp_write_xmit+0x1bd/0xf40
[354765.114359]  ? _copy_from_iter_full+0x9c/0x240
[354765.123444]  tcp_sendmsg_locked+0x2c2/0xdd0
[354765.131991]  tcp_sendmsg+0x27/0x40
[354765.138991]