IMO, branch-2.7 has been pretty much broken for a while now. I can't get geneve tunnels to work reliably across 2 machines; they fail in a flaky way. I am not sure whether this is the cause of the problem, but it looks like others are seeing similar issues.

On 14 July 2017 at 06:34, Huanle Han <[email protected]> wrote:

> Hi, Ben
> I see some netdevs that are not added to ovs being constructed and
> destructed frequently because of route_table_handle_msg(), and there may
> be some side effects every time the netdev is reopened, such as
> netdev_change_seq_changed.
>
> Is it necessary to open a netdev that is not in ovs?
> How about not opening the netdev, or returning failure for it?
>
> Thanks
> Huanle
>
>
> On Fri, Jul 14, 2017 at 2:33 AM, Ben Pfaff <[email protected]> wrote:
>
> > On Wed, Jul 05, 2017 at 05:03:06PM +0200, Eelco Chaudron wrote:
> > > On 23/06/17 10:37, Eelco Chaudron wrote:
> > > >On 06/23/2017 02:12 AM, Ben Pfaff wrote:
> > > >>On Thu, Jun 22, 2017 at 04:04:44PM -0300, Flavio Leitner wrote:
> > > >>>On Thu, Jun 22, 2017 at 01:04:59AM +0800, Huanle Han wrote:
> > > >>>>Hi, all
> > > >>>>
> > > >>>>I get this problem with the latest (dbd8112) branch-2.7 code on my
> > > >>>>Ubuntu.
> > > >>>>root@ubuntu:/var/log/# ovs-vsctl show
> > > >>>>adf2ea99-0c53-4180-914f-7dadaa71302b
> > > >>>>    Bridge test
> > > >>>>        Port test
> > > >>>>            Interface test
> > > >>>>                type: internal
> > > >>>>    Bridge "manage"
> > > >>>>        Port "manage"
> > > >>>>            Interface "manage"
> > > >>>>                type: internal
> > > >>>>                error: "could not open network device manage (File exists)"
> > > >>>>        Port "veth0"
> > > >>>>            Interface "veth0"
> > > >>>>        Port "eth0"
> > > >>>>            Interface "eth0"
> > > >>>>    ovs_version: "2.7.0"
> > > >>>>
> > > >>>>How to reproduce:
> > > >>>>1. add bridge "manage", up and add ip on it
> > > >>>>2. restart ovs-vswitchd
> > > >>>>3. "ovs-vsctl show" displays error message.
> > > >>>>
> > > >>>>The reason:
> > > >>>>In the following "netdev_open" call on ovs-vswitchd start, input
> > > >>>>"type" is NULL and "manage" is incorrectly opened as a "system"
> > > >>>>netdev_class iface.
> > > >>>>
> > > >>>>#0  netdev_open (name=0x7fffffffe2bc "manage", type=0x0,
> > > >>>>netdevp=0x7fffffffc3b0) at ../lib/netdev.c:396
> > > >>>>#1  0x000000000052c492 in get_src_addr (ip6_dst=0x7fffffffe2ac,
> > > >>>>output_bridge=0x7fffffffe2bc "manage", psrc=0x8f3490) at
> > > >>>>../lib/ovs-router.c:141
> > > >>>>#2  0x000000000052c85d in ovs_router_insert__ (priority=104 'h',
> > > >>>>ip6_dst=0x7fffffffe29c, plen=104 'h', output_bridge=0x7fffffffe2bc
> > > >>>>"manage", gw=0x7fffffffe2ac) at ../lib/ovs-router.c:202
> > > >>>>#3  0x000000000052c980 in ovs_router_insert (ip_dst=0x7fffffffe29c,
> > > >>>>plen=104 'h', output_bridge=0x7fffffffe2bc "manage",
> > > >>>>gw=0x7fffffffe2ac) at
> > > >>>>../lib/ovs-router.c:228
> > > >>>>#4  0x000000000058f63a in route_table_handle_msg
> > > >>>>(change=0x7fffffffe290) at
> > > >>>>../lib/route-table.c:295
> > > >>>>#5  0x000000000058f1da in route_table_reset () at
> > > >>>>../lib/route-table.c:174
> > > >>>>#6  0x000000000058f034 in route_table_init () at
> > > >>>>../lib/route-table.c:110
> > > >>>>#7  0x0000000000495838 in dp_initialize () at ../lib/dpif.c:126
> > > >>>>#8  0x0000000000495b40 in dp_enumerate_types (types=0x7fffffffe3a0) at
> > > >>>>../lib/dpif.c:244
> > > >>>>#9  0x000000000042eb1c in enumerate_types (types=0x7fffffffe3a0) at
> > > >>>>../ofproto/ofproto-dpif.c:267
> > > >>>>#10 0x000000000041b81c in ofproto_enumerate_types
> > > >>>>(types=0x7fffffffe3a0) at
> > > >>>>../ofproto/ofproto.c:432
> > > >>>>#11 0x000000000040df1e in bridge_run__ () at ../vswitchd/bridge.c:3020
> > > >>>>#12 0x000000000040e196 in bridge_run () at ../vswitchd/bridge.c:3082
> > > >>>>#13 0x00000000004138ef in main (argc=1, argv=0x7fffffffe578) at
> > > >>>>../vswitchd/ovs-vswitchd.c:119
> > > >>>>
> > > >>>>After that, ovs fails to netdev_open "manage" with type == "internal".
> > > >>>>"File exists" error is reported.
> > > >>>>I think commit d3b8f50 (netdev: Fix netdev_open() to adhere to class
> > > >>>>type if given) introduces this problem. It needs to be improved.
> > > >>>Your analysis is correct. One solution is to wait for vswitchd to
> > > >>>configure first and only then enable ovs-route. The problem is that
> > > >>>more modules might use netdev_open() and it doesn't sound like a good
> > > >>>idea to have a control per module.
> > > >>>
> > > >>>Another option is to map the device's info to a class instead of using
> > > >>>"system", so we could use internal classes for vports, for instance.
> > > >>>However, that doesn't guarantee it will match with what is configured
> > > >>>in the DB.
> > > >>This sounds like the kind of problem that I expected the following
> > > >>commit might cause.
> > > >>
> > > >>    commit d3b8f5052292b3ba9084ffed097e90b87f2950f5
> > > >>    Author: Eelco Chaudron <[email protected]>
> > > >>    Date:   Thu Jun 1 14:38:09 2017 +0200
> > > >>
> > > >>        netdev: Fix netdev_open() to adhere to class type if given
> > > >>
> > > >>If we can't fix it somehow, we might need to revert.
> > > >
> > > >Reverting the above patch does not solve the real issue here. If we
> > > >revert we do not get the error, but the wrong class is used, hence the
> > > >wrong callbacks get called.
> > > >
> > > >The main issue is with netdev_open() being called with type = NULL before
> > > >the interface is actually configured in the system. We could track these
> > > >"auto" generated interfaces, and once netdev_open() gets called with a
> > > >valid type reconfigure (re-create) it. netdev_remove() could work here
> > > >but I am not sure if it's safe due to reference counting.
> > > >
> > > Some background info: I see a lot of netdev_open() calls with a NULL
> > > type, but they just grep some data and close it again.
> > > I found that the tunnel code is keeping a reference to the netdev (for
> > > the IP assigned) and hence it is not going away before the "real"
> > > interface opens it.
> > >
> > > The change below is based on tracking the "classless" opened devices, and
> > > once they get opened with a "real" class, re-create them. I did some basic
> > > testing and it seems to work fine; I also went over related code and could
> > > not find any corner case.
> > >
> > > So please take a peek, and if it all makes sense I can send out an
> > > official patch.
> >
> > It's not exactly pleasing, but it's the best solution I've seen so far.
> >
> > I wonder whether there's a way to better figure out the class type in
> > advance.
> >
> > Let's try the solution you're proposing. You want to send it out?
> >
> > Thanks,
> >
> > Ben.
> >
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
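
To make the failure and the proposed fix in the quoted thread easier to follow, here is a rough toy model in C. It is only a sketch under my own assumptions, not OVS code and not Eelco's patch: struct toy_netdev, toy_netdev_open(), toy_reset() and the track_auto flag are all hypothetical names invented for illustration. The model assumes what the thread describes: an open with type == NULL falls back to "system", a later open with a different explicit type fails with EEXIST, and the proposed change would instead re-create a device whose type was only guessed. The real change would have to deal with outstanding references properly; the toy simply retypes the entry in place.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_DEVS 8

/* Toy stand-in for a netdev. Hypothetical; this is NOT struct netdev from
 * lib/netdev.c. Names and types longer than 15 bytes are truncated. */
struct toy_netdev {
    char name[16];
    char type[16];
    bool auto_type;     /* Opened with type == NULL, so "system" was assumed. */
    int ref_cnt;
    bool in_use;
};

static struct toy_netdev toy_devs[TOY_MAX_DEVS];

/* Forgets every device, e.g. to model an ovs-vswitchd restart. */
void
toy_reset(void)
{
    memset(toy_devs, 0, sizeof toy_devs);
}

static struct toy_netdev *
toy_lookup(const char *name)
{
    for (size_t i = 0; i < TOY_MAX_DEVS; i++) {
        if (toy_devs[i].in_use && !strcmp(toy_devs[i].name, name)) {
            return &toy_devs[i];
        }
    }
    return NULL;
}

/* Opens 'name'. Returns 0 on success, otherwise a positive errno value.
 *
 * With 'track_auto' false, a type mismatch on an existing device always
 * yields EEXIST (the behavior reported in the thread). With 'track_auto'
 * true, a device that was opened earlier without a type is retyped when
 * the first explicit type arrives (the idea proposed in the thread). */
int
toy_netdev_open(const char *name, const char *type, bool track_auto,
                struct toy_netdev **devp)
{
    assert(name && devp);
    *devp = NULL;

    struct toy_netdev *dev = toy_lookup(name);
    if (dev) {
        if (type) {
            if (strcmp(dev->type, type)) {
                if (!(track_auto && dev->auto_type)) {
                    return EEXIST;
                }
                /* Simplification: retype in place instead of destroying
                 * and constructing a new device. */
                snprintf(dev->type, sizeof dev->type, "%s", type);
            }
            dev->auto_type = false;     /* The type is now explicit. */
        }
        dev->ref_cnt++;
        *devp = dev;
        return 0;
    }

    for (size_t i = 0; i < TOY_MAX_DEVS; i++) {
        if (!toy_devs[i].in_use) {
            dev = &toy_devs[i];
            snprintf(dev->name, sizeof dev->name, "%s", name);
            snprintf(dev->type, sizeof dev->type, "%s",
                     type ? type : "system");
            dev->auto_type = !type;
            dev->ref_cnt = 1;
            dev->in_use = true;
            *devp = dev;
            return 0;
        }
    }
    return ENOMEM;      /* Toy table is full. */
}
```

In this model, opening "manage" with a NULL type and then with "internal" returns EEXIST when track_auto is false, which mirrors the "File exists" error in the report. With track_auto set to true, the second open succeeds and the entry ends up typed "internal" while the earlier reference (the one the tunnel code would be holding) stays counted.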
