On Thu, Dec 07, 2017 at 09:26:14AM -0800, Kevin Lin wrote:
> Hi,
> 
> I work on Kelda (http://kelda.io/) with Ethan Jackson. We run a 
> containerized, distributed version of OVN. The master branch of 
> openvswitch/ovs (commit 07754b23ee5027508d64804d445e617b017cc2d1) fails with 
> the following assertion in ovs-vswitchd:
> 
> ovs-vswitchd(handler2): ofproto/ofproto-dpif-xlate.c:3704: assertion 
> !truncate failed in compose_output_action__()
> 
> whenever we try to use the OVN network. A little background on our setup:
> We’re a container orchestrator that uses OVN for the container network.
> One machine in our cluster runs ovn-northd and ovsdb-server. The network is 
> mostly configured from here (creating the logical ports, creating ACLs etc).
> Another machine runs ovn-controller, ovs-vswitchd, and ovsdb-server. We 
> install some container-specific OpenFlow rules by connecting directly to 
> ovs-vswitchd, and of course ovs-vswitchd also receives rules from OVN.
> ovs-vswitchd does not crash immediately after the rules are installed. But it 
> crashes as soon as the network is used (e.g. a ping from one container to 
> another).
> 
> The commit before the commit that introduced the assertion works for us 
> (https://github.com/openvswitch/ovs/commit/48f704f4768d13f85252bac4f93c8d45d8ab3eea).
> 
> I’ve attached the ovs-vswitchd logs. I’m not sure how helpful the output of 
> ovs-bugtool will be given our containerized setup, but I’ve also attached the 
> output of running that from within the ovs-vswitchd container from before and 
> after the crash. Note, because the ovs-vswitchd container crashed, the 
> “after” tarball was generated after restarting the container, so I’m not sure 
> if any of the commands it ran actually succeeded.
> 
> The crash is trivial for me to reproduce, so please let me know if there’s 
> any more information I can give you.

Thank you for the report.

I don't see a good reason that this should be a condition that kills
ovs-vswitchd.  I think that it will be both easier to debug and less
inherently harmful if we change it to an error message.  I sent out a
patch that does that:
        https://patchwork.ozlabs.org/patch/845845/

It would be great if you could apply the patch and then try to track
down the activity that triggers the error.  "ofproto/trace" is the best
way to do that, if you can find the right packet or flow, because it
will give us all the details on how the problem gets triggered.
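
As a rough illustration of what such a trace request might look like, here
is a minimal sketch that assembles an "ovs-appctl ofproto/trace BRIDGE FLOW"
command line.  The helper name build_trace_command, the bridge name br-int,
and every flow field value below are assumptions chosen for illustration;
they are not taken from the reported setup and would need to be replaced
with the bridge and packet that actually trigger the error.

```python
import shlex


def build_trace_command(bridge, flow_fields, generate=False):
    """Return the argv list for `ovs-appctl ofproto/trace BRIDGE FLOW`.

    flow_fields maps flow field names to values; a value of None emits the
    bare keyword (for example the protocol shorthand "icmp").
    """
    flow = ",".join(
        key if value is None else f"{key}={value}"
        for key, value in flow_fields.items()
    )
    argv = ["ovs-appctl", "ofproto/trace", bridge, flow]
    if generate:
        # -generate asks ovs-vswitchd to also apply side effects of the
        # trace (such as MAC learning) instead of only reporting them.
        argv.append("-generate")
    return argv


# Hypothetical example: an ICMP packet between two containers.
# The port number and addresses are placeholders, not real values.
example_argv = build_trace_command(
    "br-int",
    {"in_port": 1, "icmp": None, "nw_src": "10.0.0.1", "nw_dst": "10.0.0.2"},
)
print(shlex.join(example_argv))
```

Running the printed command inside the ovs-vswitchd container, with the
fields adjusted to match one of the failing pings, should show the sequence
of tables and actions the packet traverses.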

Thanks,

Ben.