We can avoid the deadlock via  removing the route_table_reset() from
the route_table_init()

the call trace is below
dp_initialize (ovsthread_once)
    route_table_init
        route_table_reset
             route_table_handle_msg
                 ovs_router_insert__

ovs_router_insert__
       get_src_addr
            get_netdev_type
                   dp_enumerate_types
                         dp_initialize (ovsthread_once)


After removing the route_table_reset() from the route_table_init(),
the ovs-router works well. Because we run the route_table_run()
periodically, and route_table_valid is inited as false (we will call
the route_table_reset in this case). So it’s also unnecessary to
immediately  reset route table in the route_table_init().

On Fri, Aug 4, 2017 at 2:00 AM, Ben Pfaff <b...@ovn.org> wrote:
> On Tue, Jul 18, 2017 at 08:44:15PM -0700, Tonghao Zhang wrote:
>> ovs-router module uses the netdev_open() to get routes.
>> But this module always calls the netdev_open() with type
>> which is NULL. This module may open the eth0, vethx,
>> vxlan_sys_4789, br0 if these network devices exist. And
>> these device will be opened as 'system' type.
>>
>> When debugging, somewhere netdev_ref it.  After reverting
>> "netdev: Fix netdev_open() to adhere to class type if given",
>> and when we call the 'ovs-appctl dpctl/show' or 'ovs-dpctl show',
>> the info is shown as below. the vxlan_sys_4789 is up
>> (eg. ifconfig vxlan_sys_4789 up).
>>
>> $ ovs-dpctl show
>> system@ovs-system:
>>       lookups: hit:4053 missed:118 lost:3
>>       flows: 0
>>       masks: hit:4154 total:1 hit/pkt:1.00
>>       port 0: ovs-system (internal)
>>       port 1: user-ovs-vm (internal)
>>       port 2: vxlan_sys_4789 (vxlan)
>>
>> But the info should be as below.
>> $ ovs-dpctl show
>> system@ovs-system:
>>       lookups: hit:4053 missed:118 lost:3
>>       flows: 0
>>       masks: hit:4154 total:1 hit/pkt:1.00
>>       port 0: ovs-system (internal)
>>       port 1: user-ovs-vm (internal)
>>       port 2: vxlan_sys_4789 (vxlan: packet_type=ptap)
>>
>> Because the netdev-class of 'system' type does not have the
>> 'get_config', and tunnel vports have 'get_config', then we can
>> get the config info(eg. packet_type=ptap) of tunnel vports.
>>
>> If we only revert the patch, there is a bug all the time. The
>> patch which Eelco support is fine to me. That patch avoid issue.
>> URL: https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335560.html
>> But without it, the patch I support also avoid the problem.
>>
>> However we should check the type in the ovs-router module, this
>> patch works well with the patch Eelco support.
>>
>> Signed-off-by: Tonghao Zhang <xiangxia.m....@gmail.com>
>
> Thanks for the bug fix.
>
> This patch seems fine but can you help me understand the change to
> route_table_init()?  It isn't obviously connected to the rest of the
> patch.
>
> Thanks,
>
> Ben.
>
>> diff --git a/lib/route-table.c b/lib/route-table.c
>> index 67fda31..fc6845f 100644
>> --- a/lib/route-table.c
>> +++ b/lib/route-table.c
>> @@ -112,7 +112,6 @@ route_table_init(void)
>>          nln_notifier_create(nln, RTNLGRP_IPV6_ROUTE,
>>                              (nln_notify_func *) route_table_change, NULL);
>>
>> -    route_table_reset();
>>      name_table_init();
>>
>>      ovs_mutex_unlock(&route_table_mutex);
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to