On 10/27/23 23:16, Stéphane Graber via dev wrote:
> Hello,
>
> I'm currently working on re-enabling our daily OVN tests in Incus (the
> LXD fork).
>
> Unfortunately I'm not having much luck getting our testsuite to go all
> the way through as it's triggering kernel panics.
>
> Here is the stack trace I'm getting:
> ```

<snip>

> ```
>
> That kernel build is effectively a clean 6.5.9 kernel.
> The action immediately preceding the kernel panic is the instance
> being forcefully stopped, making the last command to run prior to
> panic be `ip link del vethXYZ`.

Hi, Stéphane.  Thanks for the report!

This is interesting.  It looks like, for some reason, the revalidator is
generating a datapath flow with 60+ nested actions, which is unusual and
should not really happen in a normal setup.

>
> I can reproduce this panic very consistently, though can't easily
> isolate the particular configuration needed in order for this to
> trigger.
> For example, once the machine is rebooted after the panic, I can
> start/stop those instances at will, without any kernel panic.

It would be really helpful if you could somehow intercept the netlink
message the revalidator is sending before the kernel dies.  I understand,
though, that it might be challenging.

Debug logs in dpif_operate() are printed after the operation, so we
can't actually use them, unless you modify the sources and move
log_flow_put_message() before the dpif->dpif_class->operate() call.

One thing that happens before the operation is executed is the USDT
probe.  If you have USDT probes enabled during the build, you should be
able to capture the dpif_netlink_operate__:op_flow_put request arguments
this way before the request goes to the kernel.

Some info about USDT probes:
  https://docs.openvswitch.org/en/latest/topics/usdt-probes/

On the other hand, the kernel should likely have a nesting limit to
avoid crashing on user requests. :)  We have a MAX_ODP_NESTED limit for
user-provided datapath flows, which is equal to 32.  So, it might be a
sane value to use for the kernel action parsing as well.

But we should still figure out why OVS generates such a flow in the
first place, as it doesn't sound right.

Best regards, Ilya Maximets.
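
To illustrate the nesting limit suggested above, here is a minimal
userspace sketch, not kernel or OVS code.  It walks a buffer of
netlink-style attributes (16-bit length, 16-bit type, payload padded to
4 bytes) and rejects anything nested deeper than 32 levels.  The
attribute type numbers, the helper names and the exact boundary
condition (depth 32 accepted, 33 rejected) are assumptions made for the
example; only the value 32 comes from MAX_ODP_NESTED as mentioned above.

```python
import struct

MAX_NESTED = 32        # same value as MAX_ODP_NESTED mentioned above
NLA_HDRLEN = 4         # 16-bit length + 16-bit type

# Hypothetical type numbers, NOT real OVS_ACTION_ATTR_* values.
CONTAINER_TYPE = 1     # attribute whose payload is a nested action list
LEAF_TYPE = 2          # attribute with an opaque payload


def nla_align(length):
    """Round a length up to the 4-byte netlink attribute alignment."""
    return (length + 3) & ~3


def put_attr(attr_type, payload=b""):
    """Encode one attribute: header, payload, zero padding."""
    length = NLA_HDRLEN + len(payload)
    pad = nla_align(length) - length
    return struct.pack("=HH", length, attr_type) + payload + b"\0" * pad


def check_depth(buf, depth=0):
    """Return the deepest nesting level found in buf.

    Raises ValueError if the nesting exceeds MAX_NESTED or if an
    attribute header is malformed.
    """
    if depth > MAX_NESTED:
        raise ValueError("actions nested deeper than %d" % MAX_NESTED)
    deepest = depth
    offset = 0
    while offset + NLA_HDRLEN <= len(buf):
        length, attr_type = struct.unpack_from("=HH", buf, offset)
        if length < NLA_HDRLEN or offset + length > len(buf):
            raise ValueError("malformed attribute at offset %d" % offset)
        if attr_type == CONTAINER_TYPE:
            payload = buf[offset + NLA_HDRLEN:offset + length]
            deepest = max(deepest, check_depth(payload, depth + 1))
        offset += nla_align(length)
    return deepest


def nest(levels):
    """Build a leaf attribute wrapped in `levels` container attributes."""
    buf = put_attr(LEAF_TYPE, struct.pack("=I", 1))
    for _ in range(levels):
        buf = put_attr(CONTAINER_TYPE, buf)
    return buf


if __name__ == "__main__":
    print(check_depth(nest(3)))
    try:
        check_depth(nest(64))
    except ValueError as err:
        print("rejected:", err)
```

Because the check happens on entry to each level, the recursion itself
never goes deeper than the limit plus one, so a flow with 60+ nested
actions is refused with an error instead of being walked all the way
down.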
