Re: 4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect

2018-12-21 Thread Willem de Bruijn
On Fri, Dec 21, 2018 at 1:45 AM Christian Borntraeger
 wrote:
>
>
>
> On 20.12.2018 18:23, Willem de Bruijn wrote:
> > On Thu, Dec 20, 2018 at 11:17 AM Ido Schimmel  wrote:
> >>
> >> On Thu, Dec 20, 2018 at 03:09:22PM +0100, Christian Borntraeger wrote:
> >>> On 20.12.2018 10:12, Ido Schimmel wrote:
>  +Willem
> 
>  On Thu, Dec 20, 2018 at 08:45:40AM +0100, Christian Borntraeger wrote:
> > Folks,
> >
> > I got this warning today. I cant tell when and why this happened, so I 
> > do not know yet how to reproduce.
> > Maybe someone has a quick idea.
> >
> > [85109.572032] WARNING: CPU: 30 PID: 197360 at 
> > net/core/flow_dissector.c:764 __skb_flow_dissect+0x1f0/0x1318
> 
>  I managed to trigger this warning as well the other day, but from a
>  different call path:
> >>>
> >>> FWIW, it also seems to happen on 4.20-rc1. 4.19.0 seems fine. bisect seem 
> >>> to have failed so
> >>> my reproducer is not reliable.
> >>
> >> Yes, it is caused by commit d0e13a1488ad ("flow_dissector: lookup netns
> >> by skb->sk if skb->dev is NULL")
> >>
> >> $ git tag --contains d0e13a1488ad
> >> v4.20-rc1
> >> v4.20-rc2
> >> v4.20-rc3
> >> v4.20-rc4
> >> v4.20-rc5
> >> v4.20-rc6
> >
> > That tap_get_user_xdp path is also new for 4.20-rc1:
> >
> > commit 0efac27791ee068075d80f07c55a229b1335ce12
> > tap: accept an array of XDP buffs through sendmsg()
> >
> > $ git describe --contains 0efac27791ee
> > v4.20-rc1~14^2~382^2~1
> >
> > In v4.19 and before all packets went through tap_get_user.
>
> Hmmm, so maybe my bisect wasnt broken at all? It pointed to
>
> commit 105bc1306e9b29c2aa2783b9524f7aec9b5a5b1f
> Merge: 3475372ff60e4 d0e13a1488ad3
> Author: David S. Miller 
> AuthorDate: Tue Sep 25 20:29:38 2018 -0700
> Commit: David S. Miller 
> CommitDate: Tue Sep 25 20:29:38 2018 -0700
>
> Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Yes, that's the right commit. The flow dissector change went in
through bpf-next.


Re: 4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect

2018-12-20 Thread Christian Borntraeger



On 20.12.2018 18:23, Willem de Bruijn wrote:
> On Thu, Dec 20, 2018 at 11:17 AM Ido Schimmel  wrote:
>>
>> On Thu, Dec 20, 2018 at 03:09:22PM +0100, Christian Borntraeger wrote:
>>> On 20.12.2018 10:12, Ido Schimmel wrote:
 +Willem

 On Thu, Dec 20, 2018 at 08:45:40AM +0100, Christian Borntraeger wrote:
> Folks,
>
> I got this warning today. I cant tell when and why this happened, so I do 
> not know yet how to reproduce.
> Maybe someone has a quick idea.
>
> [85109.572032] WARNING: CPU: 30 PID: 197360 at 
> net/core/flow_dissector.c:764 __skb_flow_dissect+0x1f0/0x1318

 I managed to trigger this warning as well the other day, but from a
 different call path:
>>>
>>> FWIW, it also seems to happen on 4.20-rc1. 4.19.0 seems fine. bisect seem 
>>> to have failed so
>>> my reproducer is not reliable.
>>
>> Yes, it is caused by commit d0e13a1488ad ("flow_dissector: lookup netns
>> by skb->sk if skb->dev is NULL")
>>
>> $ git tag --contains d0e13a1488ad
>> v4.20-rc1
>> v4.20-rc2
>> v4.20-rc3
>> v4.20-rc4
>> v4.20-rc5
>> v4.20-rc6
> 
> That tap_get_user_xdp path is also new for 4.20-rc1:
> 
> commit 0efac27791ee068075d80f07c55a229b1335ce12
> tap: accept an array of XDP buffs through sendmsg()
> 
> $ git describe --contains 0efac27791ee
> v4.20-rc1~14^2~382^2~1
> 
> In v4.19 and before all packets went through tap_get_user.

Hmmm, so maybe my bisect wasnt broken at all? It pointed to

commit 105bc1306e9b29c2aa2783b9524f7aec9b5a5b1f
Merge: 3475372ff60e4 d0e13a1488ad3
Author: David S. Miller 
AuthorDate: Tue Sep 25 20:29:38 2018 -0700
Commit: David S. Miller 
CommitDate: Tue Sep 25 20:29:38 2018 -0700

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next



Re: 4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect

2018-12-20 Thread Willem de Bruijn
On Thu, Dec 20, 2018 at 11:17 AM Ido Schimmel  wrote:
>
> On Thu, Dec 20, 2018 at 03:09:22PM +0100, Christian Borntraeger wrote:
> > On 20.12.2018 10:12, Ido Schimmel wrote:
> > > +Willem
> > >
> > > On Thu, Dec 20, 2018 at 08:45:40AM +0100, Christian Borntraeger wrote:
> > >> Folks,
> > >>
> > >> I got this warning today. I cant tell when and why this happened, so I 
> > >> do not know yet how to reproduce.
> > >> Maybe someone has a quick idea.
> > >>
> > >> [85109.572032] WARNING: CPU: 30 PID: 197360 at 
> > >> net/core/flow_dissector.c:764 __skb_flow_dissect+0x1f0/0x1318
> > >
> > > I managed to trigger this warning as well the other day, but from a
> > > different call path:
> >
> > FWIW, it also seems to happen on 4.20-rc1. 4.19.0 seems fine. bisect seem 
> > to have failed so
> > my reproducer is not reliable.
>
> Yes, it is caused by commit d0e13a1488ad ("flow_dissector: lookup netns
> by skb->sk if skb->dev is NULL")
>
> $ git tag --contains d0e13a1488ad
> v4.20-rc1
> v4.20-rc2
> v4.20-rc3
> v4.20-rc4
> v4.20-rc5
> v4.20-rc6

That tap_get_user_xdp path is also new for 4.20-rc1:

commit 0efac27791ee068075d80f07c55a229b1335ce12
tap: accept an array of XDP buffs through sendmsg()

$ git describe --contains 0efac27791ee
v4.20-rc1~14^2~382^2~1

In v4.19 and before all packets went through tap_get_user.


Re: 4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect

2018-12-20 Thread Willem de Bruijn
On Thu, Dec 20, 2018 at 9:34 AM Willem de Bruijn
 wrote:
>
> On Thu, Dec 20, 2018 at 9:16 AM Ido Schimmel  wrote:
> >
> > On Thu, Dec 20, 2018 at 09:04:25AM -0500, Willem de Bruijn wrote:
> > > On Thu, Dec 20, 2018 at 6:15 AM Ido Schimmel  wrote:
> > > >
> > > > +Willem
> > > >
> > > > On Thu, Dec 20, 2018 at 08:45:40AM +0100, Christian Borntraeger wrote:
> > > > > Folks,
> > > > >
> > > > > I got this warning today. I cant tell when and why this happened, so 
> > > > > I do not know yet how to reproduce.
> > > > > Maybe someone has a quick idea.
> > > > >
> > > > > [85109.572032] WARNING: CPU: 30 PID: 197360 at 
> > > > > net/core/flow_dissector.c:764 __skb_flow_dissect+0x1f0/0x1318
> > > >
> > > > I managed to trigger this warning as well the other day, but from a
> > > > different call path:
> > > >
> > > > [280155.348610]  fib_multipath_hash+0x28c/0x2d0
> > > > [280155.348613]  ? fib_multipath_hash+0x28c/0x2d0
> > > > [280155.348619]  fib_select_path+0x241/0x32f
> > > > [280155.348622]  ? __fib_lookup+0x6a/0xb0
> > > > [280155.348626]  ip_route_output_key_hash_rcu+0x650/0xa30
> > > > [280155.348631]  ? __alloc_skb+0x9b/0x1d0
> > > > [280155.348634]  inet_rtm_getroute+0x3f7/0xb80
> > >
> > > inet_rtm_getroute builds a new packet with inet_rtm_getroute_build_skb
> > > here without dev or sk.
> >
> > Ack
> >
> > >
> > > > Problem is the synthesized skb for output route resolution does not have
> > > > skb->dev or skb->sk set. When a multipath route is hit and
> > > > net.ipv4.fib_multipath_hash_policy is set the flow dissector is called
> > > > with this skb and the warning is triggered.
> > > >
> > > > I plan to fix it by setting skb->dev to net->loopback_dev.
> > >
> > > The device can be chosen based on iif in inet_rtm_getroute? A first
> > > thought, I don't know this code very well.
> >
> > Yes, but iif is for input routes. I'm talking about output routes.
> >
> > > Let me know if you want me to take a stab at that patch. IPv6 probably
> > > will need the same.
> >
> > Yes, I'll try it now and post later today if everything is OK. IPv6 is
> > using flow info and not an skb, so no problem there. I also checked
> > other getroute implementations and none of them call into the flow
> > dissector with an skb, so I think we're fine.
> >
> > >
> > > > I assume we
> > > > want to keep this warning to prevent call paths which will otherwise
> > > > silently fallback to standard flow dissector instead of the BPF one.
> > >
> > > Indeed, the warning is there to sniff out paths that do not follow
> > > what I thought was an invariant. If there are too many exceptions, I
> > > may have to revisit that assumption. But for now, let's see if we can
> > > address these edge cases.
> >
> > Ack
> >
> > >
> > > > I'm not familiar with tap code, so someone else will need to patch this
> > > > case, but it looks like:
> > > >
> > > > tap_sendmsg()
> > > > tap_get_user()
> > > > skb_probe_transport_header()
> > > > skb_flow_dissect_flow_keys_basic()
> > > > __skb_flow_dissect()
> > > >
> > > > skb->dev is only set later in the code.
> > >
> > > tap_get_user uses sock_alloc_send_pskb (through tap_alloc_skb) to
> > > allocate the skb. So skb->sk should be set at the time of
> > > skb_probe_transport_header. I'm not sure how this path triggers the
> > > warning.
> >
> > Maybe it's:
> >
> > tap_sendmsg()
> > tap_get_user_xdp()
> > build_skb()
> > skb_probe_transport_header()
> > skb_flow_dissect_flow_keys_basic()
> > __skb_flow_dissect()
>
> Oh, indeed. I completely overlooked that path.
>
> I will call skb_set_owner_w there and will audit the other users of build_skb.

Uhm, no, that may not be the right solution if these packets may
be injected into the receive path. This also affects the tun device
through tun_xdp_one, which calls netif_receive_skb.

I'll need to take a closer look. Other approach is to move the
assignment skb->dev = tap->dev earlier.


Re: 4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect

2018-12-20 Thread Willem de Bruijn
On Thu, Dec 20, 2018 at 9:16 AM Ido Schimmel  wrote:
>
> On Thu, Dec 20, 2018 at 09:04:25AM -0500, Willem de Bruijn wrote:
> > On Thu, Dec 20, 2018 at 6:15 AM Ido Schimmel  wrote:
> > >
> > > +Willem
> > >
> > > On Thu, Dec 20, 2018 at 08:45:40AM +0100, Christian Borntraeger wrote:
> > > > Folks,
> > > >
> > > > I got this warning today. I cant tell when and why this happened, so I 
> > > > do not know yet how to reproduce.
> > > > Maybe someone has a quick idea.
> > > >
> > > > [85109.572032] WARNING: CPU: 30 PID: 197360 at 
> > > > net/core/flow_dissector.c:764 __skb_flow_dissect+0x1f0/0x1318
> > >
> > > I managed to trigger this warning as well the other day, but from a
> > > different call path:
> > >
> > > [280155.348610]  fib_multipath_hash+0x28c/0x2d0
> > > [280155.348613]  ? fib_multipath_hash+0x28c/0x2d0
> > > [280155.348619]  fib_select_path+0x241/0x32f
> > > [280155.348622]  ? __fib_lookup+0x6a/0xb0
> > > [280155.348626]  ip_route_output_key_hash_rcu+0x650/0xa30
> > > [280155.348631]  ? __alloc_skb+0x9b/0x1d0
> > > [280155.348634]  inet_rtm_getroute+0x3f7/0xb80
> >
> > inet_rtm_getroute builds a new packet with inet_rtm_getroute_build_skb
> > here without dev or sk.
>
> Ack
>
> >
> > > Problem is the synthesized skb for output route resolution does not have
> > > skb->dev or skb->sk set. When a multipath route is hit and
> > > net.ipv4.fib_multipath_hash_policy is set the flow dissector is called
> > > with this skb and the warning is triggered.
> > >
> > > I plan to fix it by setting skb->dev to net->loopback_dev.
> >
> > The device can be chosen based on iif in inet_rtm_getroute? A first
> > thought, I don't know this code very well.
>
> Yes, but iif is for input routes. I'm talking about output routes.
>
> > Let me know if you want me to take a stab at that patch. IPv6 probably
> > will need the same.
>
> Yes, I'll try it now and post later today if everything is OK. IPv6 is
> using flow info and not an skb, so no problem there. I also checked
> other getroute implementations and none of them call into the flow
> dissector with an skb, so I think we're fine.
>
> >
> > > I assume we
> > > want to keep this warning to prevent call paths which will otherwise
> > > silently fallback to standard flow dissector instead of the BPF one.
> >
> > Indeed, the warning is there to sniff out paths that do not follow
> > what I thought was an invariant. If there are too many exceptions, I
> > may have to revisit that assumption. But for now, let's see if we can
> > address these edge cases.
>
> Ack
>
> >
> > > I'm not familiar with tap code, so someone else will need to patch this
> > > case, but it looks like:
> > >
> > > tap_sendmsg()
> > > tap_get_user()
> > > skb_probe_transport_header()
> > > skb_flow_dissect_flow_keys_basic()
> > > __skb_flow_dissect()
> > >
> > > skb->dev is only set later in the code.
> >
> > tap_get_user uses sock_alloc_send_pskb (through tap_alloc_skb) to
> > allocate the skb. So skb->sk should be set at the time of
> > skb_probe_transport_header. I'm not sure how this path triggers the
> > warning.
>
> Maybe it's:
>
> tap_sendmsg()
> tap_get_user_xdp()
> build_skb()
> skb_probe_transport_header()
> skb_flow_dissect_flow_keys_basic()
> __skb_flow_dissect()

Oh, indeed. I completely overlooked that path.

I will call skb_set_owner_w there and will audit the other users of build_skb.


Re: 4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect

2018-12-20 Thread Ido Schimmel
On Thu, Dec 20, 2018 at 03:09:22PM +0100, Christian Borntraeger wrote:
> On 20.12.2018 10:12, Ido Schimmel wrote:
> > +Willem
> > 
> > On Thu, Dec 20, 2018 at 08:45:40AM +0100, Christian Borntraeger wrote:
> >> Folks,
> >>
> >> I got this warning today. I cant tell when and why this happened, so I do 
> >> not know yet how to reproduce.
> >> Maybe someone has a quick idea.
> >>
> >> [85109.572032] WARNING: CPU: 30 PID: 197360 at 
> >> net/core/flow_dissector.c:764 __skb_flow_dissect+0x1f0/0x1318
> > 
> > I managed to trigger this warning as well the other day, but from a
> > different call path:
> 
> FWIW, it also seems to happen on 4.20-rc1. 4.19.0 seems fine. bisect seem to 
> have failed so
> my reproducer is not reliable.

Yes, it is caused by commit d0e13a1488ad ("flow_dissector: lookup netns
by skb->sk if skb->dev is NULL")

$ git tag --contains d0e13a1488ad
v4.20-rc1
v4.20-rc2
v4.20-rc3
v4.20-rc4
v4.20-rc5
v4.20-rc6


Re: 4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect

2018-12-20 Thread Ido Schimmel
On Thu, Dec 20, 2018 at 09:04:25AM -0500, Willem de Bruijn wrote:
> On Thu, Dec 20, 2018 at 6:15 AM Ido Schimmel  wrote:
> >
> > +Willem
> >
> > On Thu, Dec 20, 2018 at 08:45:40AM +0100, Christian Borntraeger wrote:
> > > Folks,
> > >
> > > I got this warning today. I cant tell when and why this happened, so I do 
> > > not know yet how to reproduce.
> > > Maybe someone has a quick idea.
> > >
> > > [85109.572032] WARNING: CPU: 30 PID: 197360 at 
> > > net/core/flow_dissector.c:764 __skb_flow_dissect+0x1f0/0x1318
> >
> > I managed to trigger this warning as well the other day, but from a
> > different call path:
> >
> > [280155.348610]  fib_multipath_hash+0x28c/0x2d0
> > [280155.348613]  ? fib_multipath_hash+0x28c/0x2d0
> > [280155.348619]  fib_select_path+0x241/0x32f
> > [280155.348622]  ? __fib_lookup+0x6a/0xb0
> > [280155.348626]  ip_route_output_key_hash_rcu+0x650/0xa30
> > [280155.348631]  ? __alloc_skb+0x9b/0x1d0
> > [280155.348634]  inet_rtm_getroute+0x3f7/0xb80
> 
> inet_rtm_getroute builds a new packet with inet_rtm_getroute_build_skb
> here without dev or sk.

Ack

> 
> > Problem is the synthesized skb for output route resolution does not have
> > skb->dev or skb->sk set. When a multipath route is hit and
> > net.ipv4.fib_multipath_hash_policy is set the flow dissector is called
> > with this skb and the warning is triggered.
> >
> > I plan to fix it by setting skb->dev to net->loopback_dev.
> 
> The device can be chosen based on iif in inet_rtm_getroute? A first
> thought, I don't know this code very well.

Yes, but iif is for input routes. I'm talking about output routes.

> Let me know if you want me to take a stab at that patch. IPv6 probably
> will need the same.

Yes, I'll try it now and post later today if everything is OK. IPv6 is
using flow info and not an skb, so no problem there. I also checked
other getroute implementations and none of them call into the flow
dissector with an skb, so I think we're fine.

> 
> > I assume we
> > want to keep this warning to prevent call paths which will otherwise
> > silently fallback to standard flow dissector instead of the BPF one.
> 
> Indeed, the warning is there to sniff out paths that do not follow
> what I thought was an invariant. If there are too many exceptions, I
> may have to revisit that assumption. But for now, let's see if we can
> address these edge cases.

Ack

> 
> > I'm not familiar with tap code, so someone else will need to patch this
> > case, but it looks like:
> >
> > tap_sendmsg()
> > tap_get_user()
> > skb_probe_transport_header()
> > skb_flow_dissect_flow_keys_basic()
> > __skb_flow_dissect()
> >
> > skb->dev is only set later in the code.
> 
> tap_get_user uses sock_alloc_send_pskb (through tap_alloc_skb) to
> allocate the skb. So skb->sk should be set at the time of
> skb_probe_transport_header. I'm not sure how this path triggers the
> warning.

Maybe it's:

tap_sendmsg()
tap_get_user_xdp()
build_skb()
skb_probe_transport_header()
skb_flow_dissect_flow_keys_basic()
__skb_flow_dissect()



Re: 4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect

2018-12-20 Thread Christian Borntraeger
On 20.12.2018 10:12, Ido Schimmel wrote:
> +Willem
> 
> On Thu, Dec 20, 2018 at 08:45:40AM +0100, Christian Borntraeger wrote:
>> Folks,
>>
>> I got this warning today. I cant tell when and why this happened, so I do 
>> not know yet how to reproduce.
>> Maybe someone has a quick idea.
>>
>> [85109.572032] WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 
>> __skb_flow_dissect+0x1f0/0x1318
> 
> I managed to trigger this warning as well the other day, but from a
> different call path:

FWIW, it also seems to happen on 4.20-rc1. 4.19.0 seems fine. bisect seem to 
have failed so
my reproducer is not reliable.

> 
> [280155.348526] WARNING: CPU: 1 PID: 24819 at net/core/flow_dissector.c:764 
> __skb_flow_dissect+0x314/0x16b0
> [280155.348529] Modules linked in: dummy vrf intel_powerclamp coretemp 
> kvm_intel kvm irqbypass crct10dif_pclmul leds_mlxreg i2c_mux_reg i2c_mlxcpld 
> crc32_pclmul mlxreg_hotplug mlxreg_io i2c_mux ghash_clmulni_intel iTCO_wdt 
> gpio_ich iTCO_vendor_support mlx_platform aesni_intel aes_x86_64 crypto_simd 
> cryptd glue_helper intel_cstate mac_hid lpc_ich ip_tables x_tables autofs4 
> mlxsw_spectrum mlxfw vxlan ip6_udp_tunnel udp_tunnel ip6_tunnel tunnel6 
> objagg psample parman bridge stp llc mlxsw_pci igb ahci mlxsw_core dca 
> i2c_algo_bit libahci devlink i2c_ismt
> [280155.348570] CPU: 1 PID: 24819 Comm: ip Not tainted 4.20.0-rc6-nn181213 #1
> [280155.348572] Hardware name: Mellanox Technologies Ltd. MSN2100/SA001390, 
> BIOS 5.6.5 06/07/2016
> [280155.348576] RIP: 0010:__skb_flow_dissect+0x314/0x16b0
> [280155.348579] Code: 85 19 0e 00 00 45 0f b7 6c 24 04 41 0f b7 44 24 06 4d 
> 01 fd 48 85 db 4d 8d 14 07 74 0f 48 8b 43 18 48 85 c0 0f 85 e5 02 00 00 <0f> 
> 0b 41 f6 04 24 40 0f 85 a4 02 00 00 c7 85 30 ff ff ff 00 00 00
> [280155.348581] RSP: 0018:a0df41fdf650 EFLAGS: 00010246
> [280155.348584] RAX:  RBX: 8bcded232000 RCX: 
> 
> [280155.348586] RDX: a0df41fdf7e0 RSI: 98e415a0 RDI: 
> 8bcded232000
> [280155.348588] RBP: a0df41fdf760 R08:  R09: 
> 
> [280155.348590] R10: a0df41fdf7e8 R11: 8bcdf27a3000 R12: 
> 98e415a0
> [280155.348591] R13: a0df41fdf7e0 R14: 98dd2980 R15: 
> a0df41fdf7e0
> [280155.348594] FS:  7f46f6897680() GS:8bcdf7a8() 
> knlGS:
> [280155.348596] CS:  0010 DS:  ES:  CR0: 80050033
> [280155.348598] CR2: 55933e95f9a0 CR3: 00021e636000 CR4: 
> 001006e0
> [280155.348600] Call Trace:
> [280155.348610]  fib_multipath_hash+0x28c/0x2d0
> [280155.348613]  ? fib_multipath_hash+0x28c/0x2d0
> [280155.348619]  fib_select_path+0x241/0x32f
> [280155.348622]  ? __fib_lookup+0x6a/0xb0
> [280155.348626]  ip_route_output_key_hash_rcu+0x650/0xa30
> [280155.348631]  ? __alloc_skb+0x9b/0x1d0
> [280155.348634]  inet_rtm_getroute+0x3f7/0xb80
> [280155.348640]  ? __alloc_pages_nodemask+0x11c/0x2c0
> [280155.348646]  rtnetlink_rcv_msg+0x1d9/0x2f0
> [280155.348650]  ? rtnl_calcit.isra.24+0x120/0x120
> [280155.348654]  netlink_rcv_skb+0x54/0x130
> [280155.348657]  rtnetlink_rcv+0x15/0x20
> [280155.348661]  netlink_unicast+0x20a/0x2c0
> [280155.348664]  netlink_sendmsg+0x2d1/0x3d0
> [280155.348669]  sock_sendmsg+0x39/0x50
> [280155.348672]  ___sys_sendmsg+0x2a0/0x2f0
> [280155.348677]  ? filemap_map_pages+0x16b/0x360
> [280155.348682]  ? __handle_mm_fault+0x108e/0x13d0
> [280155.348686]  __sys_sendmsg+0x63/0xa0
> [280155.348688]  ? __sys_sendmsg+0x63/0xa0
> [280155.348692]  __x64_sys_sendmsg+0x1f/0x30
> [280155.348697]  do_syscall_64+0x5a/0x120
> [280155.348701]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [280155.348704] RIP: 0033:0x7f46f5b80d04
> [280155.348707] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 
> 00 00 00 00 48 8d 05 01 dc 2c 00 8b 00 85 c0 75 13 b8 2e 00 00 00 0f 05 <48> 
> 3d 00 f0 ff ff 77 54 f3 c3 66 90 41 54 55 41 89 d4 53 48 89 f5
> [280155.348709] RSP: 002b:7fff82d62778 EFLAGS: 0246 ORIG_RAX: 
> 002e
> [280155.348712] RAX: ffda RBX: 5c1900ae RCX: 
> 7f46f5b80d04
> [280155.348713] RDX:  RSI: 7fff82d627e0 RDI: 
> 0003
> [280155.348715] RBP: 7fff82d628d8 R08: 0001 R09: 
> 
> [280155.348717] R10: 7f46f5bfccc0 R11: 0246 R12: 
> 0001
> [280155.348718] R13: 55933eb90020 R14:  R15: 
> 7fff82d63030
> [280155.348722] ---[ end trace e14023d76a175374 ]---
> 
> Problem is the synthesized skb for output route resolution does not have
> skb->dev or skb->sk set. When a multipath route is hit and
> net.ipv4.fib_multipath_hash_policy is set the flow dissector is called
> with this skb and the warning is triggered.
> 
> I plan to fix it by setting skb->dev to net->loopback_dev. I assume we
> want to keep this warning to prevent call paths which will otherwise
> silently fallback to standard flow dissector instead 

Re: 4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect

2018-12-20 Thread Willem de Bruijn
On Thu, Dec 20, 2018 at 6:15 AM Ido Schimmel  wrote:
>
> +Willem
>
> On Thu, Dec 20, 2018 at 08:45:40AM +0100, Christian Borntraeger wrote:
> > Folks,
> >
> > I got this warning today. I cant tell when and why this happened, so I do 
> > not know yet how to reproduce.
> > Maybe someone has a quick idea.
> >
> > [85109.572032] WARNING: CPU: 30 PID: 197360 at 
> > net/core/flow_dissector.c:764 __skb_flow_dissect+0x1f0/0x1318
>
> I managed to trigger this warning as well the other day, but from a
> different call path:
>
> [280155.348610]  fib_multipath_hash+0x28c/0x2d0
> [280155.348613]  ? fib_multipath_hash+0x28c/0x2d0
> [280155.348619]  fib_select_path+0x241/0x32f
> [280155.348622]  ? __fib_lookup+0x6a/0xb0
> [280155.348626]  ip_route_output_key_hash_rcu+0x650/0xa30
> [280155.348631]  ? __alloc_skb+0x9b/0x1d0
> [280155.348634]  inet_rtm_getroute+0x3f7/0xb80

inet_rtm_getroute builds a new packet with inet_rtm_getroute_build_skb
here without dev or sk.

> Problem is the synthesized skb for output route resolution does not have
> skb->dev or skb->sk set. When a multipath route is hit and
> net.ipv4.fib_multipath_hash_policy is set the flow dissector is called
> with this skb and the warning is triggered.
>
> I plan to fix it by setting skb->dev to net->loopback_dev.

The device can be chosen based on iif in inet_rtm_getroute? A first
thought, I don't know this code very well.

Let me know if you want me to take a stab at that patch. IPv6 probably
will need the same.

> I assume we
> want to keep this warning to prevent call paths which will otherwise
> silently fallback to standard flow dissector instead of the BPF one.

Indeed, the warning is there to sniff out paths that do not follow
what I thought was an invariant. If there are too many exceptions, I
may have to revisit that assumption. But for now, let's see if we can
address these edge cases.

> I'm not familiar with tap code, so someone else will need to patch this
> case, but it looks like:
>
> tap_sendmsg()
> tap_get_user()
> skb_probe_transport_header()
> skb_flow_dissect_flow_keys_basic()
> __skb_flow_dissect()
>
> skb->dev is only set later in the code.

tap_get_user uses sock_alloc_send_pskb (through tap_alloc_skb) to
allocate the skb. So skb->sk should be set at the time of
skb_probe_transport_header. I'm not sure how this path triggers the
warning.


Re: 4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect

2018-12-20 Thread Ido Schimmel
+Willem

On Thu, Dec 20, 2018 at 08:45:40AM +0100, Christian Borntraeger wrote:
> Folks,
> 
> I got this warning today. I cant tell when and why this happened, so I do not 
> know yet how to reproduce.
> Maybe someone has a quick idea.
> 
> [85109.572032] WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 
> __skb_flow_dissect+0x1f0/0x1318

I managed to trigger this warning as well the other day, but from a
different call path:

[280155.348526] WARNING: CPU: 1 PID: 24819 at net/core/flow_dissector.c:764 
__skb_flow_dissect+0x314/0x16b0
[280155.348529] Modules linked in: dummy vrf intel_powerclamp coretemp 
kvm_intel kvm irqbypass crct10dif_pclmul leds_mlxreg i2c_mux_reg i2c_mlxcpld 
crc32_pclmul mlxreg_hotplug mlxreg_io i2c_mux ghash_clmulni_intel iTCO_wdt 
gpio_ich iTCO_vendor_support mlx_platform aesni_intel aes_x86_64 crypto_simd 
cryptd glue_helper intel_cstate mac_hid lpc_ich ip_tables x_tables autofs4 
mlxsw_spectrum mlxfw vxlan ip6_udp_tunnel udp_tunnel ip6_tunnel tunnel6 objagg 
psample parman bridge stp llc mlxsw_pci igb ahci mlxsw_core dca i2c_algo_bit 
libahci devlink i2c_ismt
[280155.348570] CPU: 1 PID: 24819 Comm: ip Not tainted 4.20.0-rc6-nn181213 #1
[280155.348572] Hardware name: Mellanox Technologies Ltd. MSN2100/SA001390, 
BIOS 5.6.5 06/07/2016
[280155.348576] RIP: 0010:__skb_flow_dissect+0x314/0x16b0
[280155.348579] Code: 85 19 0e 00 00 45 0f b7 6c 24 04 41 0f b7 44 24 06 4d 01 
fd 48 85 db 4d 8d 14 07 74 0f 48 8b 43 18 48 85 c0 0f 85 e5 02 00 00 <0f> 0b 41 
f6 04 24 40 0f 85 a4 02 00 00 c7 85 30 ff ff ff 00 00 00
[280155.348581] RSP: 0018:a0df41fdf650 EFLAGS: 00010246
[280155.348584] RAX:  RBX: 8bcded232000 RCX: 

[280155.348586] RDX: a0df41fdf7e0 RSI: 98e415a0 RDI: 
8bcded232000
[280155.348588] RBP: a0df41fdf760 R08:  R09: 

[280155.348590] R10: a0df41fdf7e8 R11: 8bcdf27a3000 R12: 
98e415a0
[280155.348591] R13: a0df41fdf7e0 R14: 98dd2980 R15: 
a0df41fdf7e0
[280155.348594] FS:  7f46f6897680() GS:8bcdf7a8() 
knlGS:
[280155.348596] CS:  0010 DS:  ES:  CR0: 80050033
[280155.348598] CR2: 55933e95f9a0 CR3: 00021e636000 CR4: 
001006e0
[280155.348600] Call Trace:
[280155.348610]  fib_multipath_hash+0x28c/0x2d0
[280155.348613]  ? fib_multipath_hash+0x28c/0x2d0
[280155.348619]  fib_select_path+0x241/0x32f
[280155.348622]  ? __fib_lookup+0x6a/0xb0
[280155.348626]  ip_route_output_key_hash_rcu+0x650/0xa30
[280155.348631]  ? __alloc_skb+0x9b/0x1d0
[280155.348634]  inet_rtm_getroute+0x3f7/0xb80
[280155.348640]  ? __alloc_pages_nodemask+0x11c/0x2c0
[280155.348646]  rtnetlink_rcv_msg+0x1d9/0x2f0
[280155.348650]  ? rtnl_calcit.isra.24+0x120/0x120
[280155.348654]  netlink_rcv_skb+0x54/0x130
[280155.348657]  rtnetlink_rcv+0x15/0x20
[280155.348661]  netlink_unicast+0x20a/0x2c0
[280155.348664]  netlink_sendmsg+0x2d1/0x3d0
[280155.348669]  sock_sendmsg+0x39/0x50
[280155.348672]  ___sys_sendmsg+0x2a0/0x2f0
[280155.348677]  ? filemap_map_pages+0x16b/0x360
[280155.348682]  ? __handle_mm_fault+0x108e/0x13d0
[280155.348686]  __sys_sendmsg+0x63/0xa0
[280155.348688]  ? __sys_sendmsg+0x63/0xa0
[280155.348692]  __x64_sys_sendmsg+0x1f/0x30
[280155.348697]  do_syscall_64+0x5a/0x120
[280155.348701]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[280155.348704] RIP: 0033:0x7f46f5b80d04
[280155.348707] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 
00 00 00 48 8d 05 01 dc 2c 00 8b 00 85 c0 75 13 b8 2e 00 00 00 0f 05 <48> 3d 00 
f0 ff ff 77 54 f3 c3 66 90 41 54 55 41 89 d4 53 48 89 f5
[280155.348709] RSP: 002b:7fff82d62778 EFLAGS: 0246 ORIG_RAX: 
002e
[280155.348712] RAX: ffda RBX: 5c1900ae RCX: 
7f46f5b80d04
[280155.348713] RDX:  RSI: 7fff82d627e0 RDI: 
0003
[280155.348715] RBP: 7fff82d628d8 R08: 0001 R09: 

[280155.348717] R10: 7f46f5bfccc0 R11: 0246 R12: 
0001
[280155.348718] R13: 55933eb90020 R14:  R15: 
7fff82d63030
[280155.348722] ---[ end trace e14023d76a175374 ]---

Problem is the synthesized skb for output route resolution does not have
skb->dev or skb->sk set. When a multipath route is hit and
net.ipv4.fib_multipath_hash_policy is set the flow dissector is called
with this skb and the warning is triggered.

I plan to fix it by setting skb->dev to net->loopback_dev. I assume we
want to keep this warning to prevent call paths which will otherwise
silently fallback to standard flow dissector instead of the BPF one.

> [85109.572036] Modules linked in: vhost_net vhost macvtap macvlan tap vfio_ap 
> vfio_mdev mdev vfio_iommu_type1 vfio kvm xt_CHECKSUM ipt_MASQUERADE tun 
> bridge stp llc xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT 
> nf_reject_ipv4 xt_conntrack ip6table_nat nf_nat_ipv6 ip6table_mangle 
> ip6table_raw 

4.20-rc6: WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 __skb_flow_dissect

2018-12-19 Thread Christian Borntraeger
Folks,

I got this warning today. I cant tell when and why this happened, so I do not 
know yet how to reproduce.
Maybe someone has a quick idea.

[85109.572032] WARNING: CPU: 30 PID: 197360 at net/core/flow_dissector.c:764 
__skb_flow_dissect+0x1f0/0x1318
[85109.572036] Modules linked in: vhost_net vhost macvtap macvlan tap vfio_ap 
vfio_mdev mdev vfio_iommu_type1 vfio kvm xt_CHECKSUM ipt_MASQUERADE tun bridge 
stp llc xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT 
nf_reject_ipv4 xt_conntrack ip6table_nat nf_nat_ipv6 ip6table_mangle 
ip6table_raw ip6table_security iptable_nat nf_nat_ipv4 nf_nat iptable_mangle 
iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set 
nfnetlink ip6table_filter ip6_tables iptable_filter dm_service_time rpcrdma 
sunrpc rdma_ucm rdma_cm configfs iw_cm ib_cm mlx4_ib ib_uverbs ib_core pkey 
ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha1_s390 zcrypt_cex4 
zcrypt rng_core eadm_sch sch_fq_codel ip_tables x_tables mlx4_en mlx4_core 
sha256_s390 sha_common dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua 
dm_mirror dm_region_hash dm_log dm_mod autofs4
[85109.572072] CPU: 30 PID: 197360 Comm: vhost-197330 Not tainted 
4.20.0-20181213.rc6.git0.407d079170c1.300.fc29.s390x #1
[85109.572074] Hardware name: IBM 2964 NC9 712 (LPAR)
[85109.572075] Krnl PSW : 0704c0018000 0092e320 
(__skb_flow_dissect+0x1f0/0x1318)
[85109.572078]R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 
RI:0 EA:3
[85109.572080] Krnl GPRS: 002a  03e0385bfc84 
00d91e30
[85109.572081]03e0385bfc84   
00d91e30
[85109.572083]03e0385bfc84 000e 007e3eb88100 
007ff3561e00
[85109.572084]0806 00b4f288 03e0385bfbb8 
03e0385bfab0
[85109.572115] Krnl Code: 0092e312: e310b0180002ltg 
%r1,24(%r11)
  0092e318: a7740271brc 7,92e7fa
 #0092e31c: a7f40001brc 
15,92e31e
 >0092e320: 91407003tm  
3(%r7),64
  0092e324: a7740257brc 7,92e7d2
  0092e328: 5810f0b4l   
%r1,180(%r15)
  0092e32c: e54cf0c8mvhi
200(%r15),0
  0092e332: c01b0008nilf%r1,8
[85109.572129] Call Trace:
[85109.572130] ([<>]   (null))
[85109.572134]  [<03ff800c81e4>] tap_sendmsg+0x384/0x430 [tap] 
[85109.572137]  [<03ff801acdee>] vhost_tx_batch.isra.10+0x66/0xe0 
[vhost_net] 
[85109.572138]  [<03ff801ad61c>] handle_tx_copy+0x18c/0x568 [vhost_net] 
[85109.572140]  [<03ff801adab4>] handle_tx+0xbc/0x100 [vhost_net] 
[85109.572145]  [<03ff8019bbe8>] vhost_worker+0xc8/0x128 [vhost] 
[85109.572148]  [<001690b8>] kthread+0x140/0x160 
[85109.572152]  [<00a84266>] kernel_thread_starter+0x6/0x10 
[85109.572154]  [<00a84260>] kernel_thread_starter+0x0/0x10 
[85109.572155] Last Breaking-Event-Address:
[85109.572156]  [<0092e31c>] __skb_flow_dissect+0x1ec/0x1318
[85109.572158] ---[ end trace 97c040a6691bc000 ]---