Hello,

( As discussed, I am CCing a wider audience, including
[email protected] )

TL;DR / Summary:
* On LTS kernel v4.19.0, I was able to send certain VMs into kernel panic
remotely, simply by spamming them with modest traffic from user land:
  - First off, the rpm for LTS kernel 4.19.0 came via centos/elrepo as-is; so
this was a stock kernel, likely affecting many users. I am surprised it passed
QA by so many teams.
  - The bug-inducing one-liner:
     * `echo 1 2 4 8 16 32 64 128|xargs -n1 iperf3 -c <my_iperf_server> -P` ## 
`iperf3 -s` server was running on the crashing side
  - the effect was 100% reproducible with the above, falling apart at about 8
or 16 parallel streams; see [2] below for the stack traces
  - the configuration involved just a couple of k8s pods (i.e. containers)
over my OpenStack provider; FYI, I don't own the hardware resources, they
belonged to my upstream provider.
  - I established that it needed at least 2 nodes (VMs) _and_ that Open
vSwitch was part of the prerequisites for the crash (i.e. I could not trigger
the kernel panic with the iperf server outside of the container)
  - perhaps a crucial piece of information would have been the OpenStack
hypervisor's kernel & QEMU versions; alas, I don't have that, nor has it been
readily available to me.
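For completeness, the reproduction steps above could be sketched roughly as
follows; this is a hypothetical reconstruction, with the server address as a
placeholder, and it assumes the two iperf3 endpoints run in k8s pods on
distinct nodes with traffic crossing the OVS datapath:

```shell
# Pod A (the side that ended up crashing): run the iperf3 server.
iperf3 -s &

# Pod B, on a *different* node: ramp up the number of parallel streams;
# the panic reportedly appeared at around -P 8..16.
for p in 1 2 4 8 16 32 64 128; do
    iperf3 -c <my_iperf_server> -P "$p" -t 10
done
```

This is equivalent to the xargs one-liner below, just spelled out step by step.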

I may have difficulty reproducing it today, but I think the simplicity of the
test makes it very attractive for LTP, so I am sending this to see what kind of
feedback we might collect.

cheers and thanks for reading so far,
Fotis

p.s. @Richard, a bit of feedback follows inline, thanks for the follow-up.

On 11 Mar 2021, at 12:08, Richard Palethorpe
<[email protected]> wrote:

Hello Fotis,

Fotis Georgatos <[email protected]> writes:

Dear Jaime, Richard,

I am motivated by Richard's recent FOSDEM'21 talk [1] to reach out to
both of you.

That's great to know!


Back in 2019 I had to deal with an obscure and esoteric Linux 4.19.0 kernel
bug, described in [2], whereby any 2 k8s pods could crash each other given the
right network pressure and conditions; that was reproducible.
Then, Linux kernel 4.19.1 made the bug go away, presumably/possibly
due to the bug fix in [3], but that is just my personal guess and I
never got around to cornering it.

I guess this is not specific to k8s, but any OVS setup. A GPF caused by
packet processing could be serious. Especially if the bug is not
actually fixed, but the commit mentioned just made it more difficult to
reproduce.

My thoughts as well. Even if it's fixed, the generality of the test would make
it very attractive to adopt under LTP; it would cover several more bug types.


Have you tried recompiling 4.19 with KASAN and lockdep enabled then
reproducing the bug? It may fail earlier giving a more accurate picture
of what is happening.

No, I was discouraged, since I didn't have full system ownership down to the
metal to corner all the factors.



Given that the testing command was very simplistic, being just a one-liner, I
wonder if there is more juice to extract out of this, to the benefit of LTP
itself and future kernels' stability:
* `iperf3 -s & echo 1 2 4 8 16 32 64 128 256|xargs -n1 iperf3 -c
my_iperf_server -P` ## this would crash `my_iperf_server` reproducibly, beyond
parallelism values of ~8 or ~16

Would we like to try to make a test case out of it? Is it worthwhile to you, or
do you know whether this is merely an instance of another bug report?
(i.e. my hunt there has been fruitless; possibly I am not searching for
the right keywords)

Possibly it could be reproduced in LTP by creating two processes in
different network namespaces, then linking them with OVS, then
recreating something similar to what iperf is doing. This is a pure
guess though.

Yes, it's a reasonable guess; I'd add that the containers need to be on
distinct nodes; the bug was irreproducible within 1 node.
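To make that suggestion concrete, a minimal single-host approximation could
look like the sketch below; note this is a hypothetical starting point, and
per the above it may well be insufficient, since the crash needed two distinct
nodes. It assumes root, plus openvswitch and iperf3 installed:

```shell
# Two network namespaces bridged by OVS, then iperf3 traffic between them.
ovs-vsctl add-br br0
ip netns add ns1
ip netns add ns2
# veth pairs: one end inside each namespace, the other attached to the bridge
ip link add veth1 type veth peer name ovs1
ip link add veth2 type veth peer name ovs2
ip link set veth1 netns ns1
ip link set veth2 netns ns2
ovs-vsctl add-port br0 ovs1
ovs-vsctl add-port br0 ovs2
ip link set ovs1 up
ip link set ovs2 up
ip netns exec ns1 ip addr add 10.0.0.1/24 dev veth1
ip netns exec ns2 ip addr add 10.0.0.2/24 dev veth2
ip netns exec ns1 ip link set veth1 up
ip netns exec ns2 ip link set veth2 up
# server in ns1, parallel clients from ns2, ramping up -P as in my one-liner
ip netns exec ns1 iperf3 -s -D
for p in 8 16 32; do
    ip netns exec ns2 iperf3 -c 10.0.0.1 -P "$p" -t 5
done
```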


It is probably worth reporting to the OVS maintainers first, e.g:

$ scripts/get_maintainer.pl net/openvswitch/
Pravin B Shelar <[email protected]> (maintainer:OPENVSWITCH)
"David S. Miller" <[email protected]> (maintainer:NETWORKING [GENERAL])
Jakub Kicinski <[email protected]> (maintainer:NETWORKING [GENERAL])
[email protected] (open list:OPENVSWITCH)
[email protected] (open list:OPENVSWITCH)
[email protected] (open list)

Then once we know exactly what causes it we can create a minimal
reproducer in LTP or the OVS test suite if there is too much setup
involved for LTP. (I'm not sure what test suite OVS has, but IIRC there
is one).



All the best and thanks to either of you,
Fotis

P.S. Needless to say, I very much believe this bug could be food for the teeth
of `fzsync` itself [4]!

[1] https://fosdem.org/2021/schedule/event/reproducing_kernel_data_races/
[2] https://github.com/weaveworks/weave/issues/3684 ## reproducible kernel
panic with 4.19.0 & parallel iperf streams P>8, weave/2.5.*; disclaimer:
`weave` is merely accelerating the effect.
[3] 
https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/
[4] https://gitlab.com/Palethorpe/fuzzy-sync

—
Eur Ing Fotis Georgatos
Senior Systems Engineer
__________________________
Swiss Data Science Center, EPFL SDSC-GE, INN 218 (Bâtiment INN) Station 14, 
CH-1015 Lausanne
Email: [email protected]
Tel: +41 21 69 34067


--
Thank you,
Richard.




_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
