On Wed, Jul 05, 2017 at 05:03:06PM +0200, Eelco Chaudron wrote: > On 23/06/17 10:37, Eelco Chaudron wrote: > >On 06/23/2017 02:12 AM, Ben Pfaff wrote: > >>On Thu, Jun 22, 2017 at 04:04:44PM -0300, Flavio Leitner wrote: > >>>On Thu, Jun 22, 2017 at 01:04:59AM +0800, Huanle Han wrote: > >>>>Hi,all > >>>> > >>>>I get this problem with latest(dbd8112) branch-2.7 code on my Ubuntu. > >>>>root@ubuntu:/var/log/# ovs-vsctl show > >>>>adf2ea99-0c53-4180-914f-7dadaa71302b > >>>> Bridge test > >>>> Port test > >>>> Interface test > >>>> type: internal > >>>> Bridge "manage" > >>>> Port "manage" > >>>> Interface "manage" > >>>> type: internal > >>>> error: "could not open network device manage (File > >>>>exists)" > >>>> Port "veth0" > >>>> Interface "veth0" > >>>> Port "eth0" > >>>> Interface "eth0" > >>>> ovs_version: "2.7.0" > >>>> > >>>>How to reproduce: > >>>>1. add bridge "manage", up and add ip on it > >>>>2. restart ovs-vswitchd > >>>>3. "ovs-vsctl show" displays error message. > >>>> > >>>>The reason: > >>>>In following "netdev_open" call on ovs-vswitchd start, input "type" > >>>>is NULL > >>>>and "manage" is opened as a "system" netdev_class iface incorrectly. > >>>> > >>>>#0 netdev_open (name=0x7fffffffe2bc "manage", type=0x0, > >>>>netdevp=0x7fffffffc3b0) at ../lib/netdev.c:396 > >>>>#1 0x000000000052c492 in get_src_addr (ip6_dst=0x7fffffffe2ac, > >>>>output_bridge=0x7fffffffe2bc "manage", psrc=0x8f3490) at > >>>>../lib/ovs-router.c:141 > >>>>#2 0x000000000052c85d in ovs_router_insert__ (priority=104 'h', > >>>>ip6_dst=0x7fffffffe29c, plen=104 'h', output_bridge=0x7fffffffe2bc > >>>>"manage", gw=0x7fffffffe2ac) at ../lib/ovs-router.c:202 > >>>>#3 0x000000000052c980 in ovs_router_insert (ip_dst=0x7fffffffe29c, > >>>>plen=104 'h', output_bridge=0x7fffffffe2bc "manage", > >>>>gw=0x7fffffffe2ac) at > >>>>../lib/ovs-router.c:228 > >>>>#4 0x000000000058f63a in route_table_handle_msg > >>>>(change=0x7fffffffe290) at > >>>>../lib/route-table.c:295 > >>>>#5 0x000000000058f1da in route_table_reset () at > >>>>../lib/route-table.c:174 > >>>>#6 0x000000000058f034 in route_table_init () at > >>>>../lib/route-table.c:110 > >>>>#7 0x0000000000495838 in dp_initialize () at ../lib/dpif.c:126 > >>>>#8 0x0000000000495b40 in dp_enumerate_types (types=0x7fffffffe3a0) at > >>>>../lib/dpif.c:244 > >>>>#9 0x000000000042eb1c in enumerate_types (types=0x7fffffffe3a0) at > >>>>../ofproto/ofproto-dpif.c:267 > >>>>#10 0x000000000041b81c in ofproto_enumerate_types > >>>>(types=0x7fffffffe3a0) at > >>>>../ofproto/ofproto.c:432 > >>>>#11 0x000000000040df1e in bridge_run__ () at ../vswitchd/bridge.c:3020 > >>>>#12 0x000000000040e196 in bridge_run () at ../vswitchd/bridge.c:3082 > >>>>#13 0x00000000004138ef in main (argc=1, argv=0x7fffffffe578) at > >>>>../vswitchd/ovs-vswitchd.c:119 > >>>> > >>>>After then, ovs fails to netdev_open "manage" with type == "internal". > >>>>"File exists" error is reported. > >>>>I think commit d3b8f50(netdev: Fix netdev_open() to adhere to class > >>>>type if > >>>>given) introduces this problem. It need be improved. > >>>Your analysis is correct. One solution is to wait vswitchd to > >>>configure first and only then enable ovs-route. The problem is that > >>>more modules might use netdev_open() and it doesn't sound like a good > >>>idea to have a control per module. > >>> > >>>Another option is to map the device's info to a class instead of using > >>>"system", so we could use internal classes for vports, for instance. > >>>However, that doesn't guarantee it will match with what is configured > >>>in the DB. > >>This sounds like the kind of problem that I expected the following > >>commit might cause. > >> > >> commit d3b8f5052292b3ba9084ffed097e90b87f2950f5 > >> Author: Eelco Chaudron <[email protected]> > >> Date: Thu Jun 1 14:38:09 2017 +0200 > >> > >> netdev: Fix netdev_open() to adhere to class type if given > >> > >>If we can't fix it somehow, we might need to revert. > > > >Reverting the above patch does not solve the real issue here. If we revert > >we > >donot get the error, but the wrong class is used, hence the wrong > >callbacks > >get called. > > > >The main issue is with netdev_open() being called with type = NULL before > >the interface is actually configured in the system. We could track these > >"auto" generated interfaces, and once netdev_open() gets called with a > >valid type reconfigure (re-create) it. netdev_remove() could work here > >but not sure if its safe due to reference counting. > > > Some background info; I see a lot of netdev_open() with a NULL type, but > they just grep some data and closed it again. I found that the tunnel code > is keeping a reference to the netdev (for the IP assigned) and hence it is > not going away before the "real" interface opens it. > > The change below is based on tracking the "classless" opened devices, and > once they get opened with a "real" class, re-create them. I did some basic > testing and it seem to work fine, also went over related code and could not > find any corner case. > > So please take a peek, and if it all makes sense I can send out an official > patch.
It's not exactly pleasing, but it's the best solution I've seen so far. I wonder whether there's a way to better figure out the class type in advance. Let's try the solution you're proposing. You want to send it out? Thanks, Ben. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
