Hi all,
We recently found an ofproto tables_version problem in a VXLAN network.
We created the OVS network as shown below; VM1 uses port tap11 as its NIC. We
use the default NORMAL flow on br-int, br-tun and br-physnet.
[Image: network topology diagram; only the label "br-physnet" is legible]
ovs-vsctl add-br br-int -- set bridge br-int datapath_type=dpdk \
    other_config:fwd_mode=openflow
ovs-vsctl add-br br-tun -- set bridge br-tun datapath_type=dpdk \
    other_config:fwd_mode=openflow
ovs-vsctl add-br br-physnet -- set bridge br-physnet datapath_type=dpdk \
    other_config:fwd_mode=openflow
ovs-vsctl add-port br-int patch-int-tun -- set interface patch-int-tun \
    type=patch options:peer=patch-tun-int
ovs-vsctl add-port br-tun patch-tun-int -- set interface patch-tun-int \
    type=patch options:peer=patch-int-tun
ovs-vsctl add-port br-physnet eth3 -- set interface eth3 type=dpdkphy
ovs-vsctl add-port br-physnet tunnel_bearing -- set interface tunnel_bearing \
    type=internal
ovs-vsctl add-port br-tun vxlan1 -- set interface vxlan1 type=vxlan \
    options:df_default="true" options:local_ip=80.0.0.1 options:in_key=flow \
    options:out_key=flow options:remote_ip=80.0.0.2
ovs-vsctl add-port br-int tap11 -- set interface tap11 type=virtio
ovs-vsctl set port br-tun tag=4095
ovs-vsctl set port br-physnet tag=4095
ifconfig tunnel_bearing 80.0.0.1/16
When we ping from the VM in the normal case, OVS receives the packet on tap11
and first sends an ARP request with dst_ip=80.0.0.2. If 80.0.0.2 returns an ARP
reply, the command "ovs-appctl tnl/arp/show" dumps the MAC address of
80.0.0.2.
However, if we execute these 3 steps:
1. Flush the ARP table: ovs-appctl tnl/arp/flush
2. Refresh the flows on br-physnet: ovs-ofctl del-flows br-physnet && ovs-ofctl
add-flow br-physnet "table=0,priority=0 actions=NORMAL"
3. Ping from the VM.
We then look up the tunnel ARP table but get nothing. We found that the ARP
request packet is dropped because the versions do not match when rules are
looked up on br-physnet.
The call stack for sending the ARP request is as follows:
tnl_send_arp_request()->compose_table_xlate()->ofproto_dpif_execute_actions__()->……
Before refreshing the br-physnet flows, the versions match:
Breakpoint 2, versions_visible_in_version (versions=0x55bb9eaf02e0, version=6)
at lib/versions.h:50
50 {
(gdb) bt
#0 versions_visible_in_version (versions=0x55bb9eaf02e0, version=6) at
lib/versions.h:50
#1 0x000055bb9c499e00 in cls_match_visible_in_version (rule=0x55bb9eaf02c0,
version=6) at lib/classifier-private.h:109
#2 0x000055bb9c49df5e in find_match (subtable=0x55bb9eaebf00, version=6,
flow=0x7f6b21627e60, hash=0) at lib/classifier.c:1658
#3 0x000055bb9c49e257 in find_match_wc (subtable=0x55bb9eaebf00, version=6,
flow=0x7f6b21627e60, trie_ctx=0x7f6b21626000, n_tries=2, wc=0x7f6b21626ae0) at
lib/classifier.c:1712
#4 0x000055bb9c49c6ee in classifier_lookup__ (cls=0x55bba0dae598, version=6,
flow=0x7f6b21627e60, wc=0x7f6b21626ae0, allow_conjunctive_matches=true) at
lib/classifier.c:972
#5 0x000055bb9c49ce2a in classifier_lookup (cls=0x55bba0dae598, version=6,
flow=0x7f6b21627e60, wc=0x7f6b21626ae0) at lib/classifier.c:1166
#6 0x000055bb9c456cf4 in rule_dpif_lookup_in_table (ofproto=0x55bba0da4fd0,
version=6, table_id=0 '\000', flow=0x7f6b21627e60, wc=0x7f6b21626ae0) at
ofproto/ofproto-dpif.c:4195
#7 0x000055bb9c45708e in rule_dpif_lookup_from_table (ofproto=0x55bba0da4fd0,
version=6, flow=0x7f6b21627e60, wc=0x7f6b21626ae0, stats=0x7f6b21627dc0,
table_id=0x7f6b2162702a "", in_port=2, may_packet_in=true,
honor_table_miss=true,
xcache=0x0) at ofproto/ofproto-dpif.c:4309
#8 0x000055bb9c479cb0 in xlate_table_action (ctx=0x7f6b21626d40, in_port=2,
table_id=0 '\000', may_packet_in=true, honor_table_miss=true) at
ofproto/ofproto-dpif-xlate.c:3913
#9 0x000055bb9c47ba60 in xlate_output_action (ctx=0x7f6b21626d40, port=65529,
max_len=0, may_packet_in=true) at ofproto/ofproto-dpif-xlate.c:4593
#10 0x000055bb9c47e2d6 in do_xlate_actions (ofpacts=0x7f6b216285a0,
ofpacts_len=12, ctx=0x7f6b21626d40) at ofproto/ofproto-dpif-xlate.c:5587
#11 0x000055bb9c480bc2 in xlate_actions (xin=0x7f6b21627e50,
xout=0x7f6b21627da0) at ofproto/ofproto-dpif-xlate.c:6481
#12 0x000055bb9c4568ad in ofproto_dpif_execute_actions__
(ofproto=0x55bba0da4fd0, version=6, flow=0x7f6b216285b0, rule=0x0,
ofpacts=0x7f6b216285a0, ofpacts_len=12, depth=1, resubmits=1,
packet=0x7f6b21628880) at ofproto/ofproto-dpif.c:4077
#13 0x000055bb9c477ce7 in compose_table_xlate (ctx=0x7f6b2162bdf0,
out_dev=0x55bba0d8a500, packet=0x7f6b21628880) at
ofproto/ofproto-dpif-xlate.c:3294
#14 0x000055bb9c477eb8 in tnl_send_arp_request (ctx=0x7f6b2162bdf0,
out_dev=0x55bba0d8a500, eth_src=..., ip_src=16777296, ip_dst=33554512) at
ofproto/ofproto-dpif-xlate.c:3326
#15 0x000055bb9c4781d0 in build_tunnel_send (ctx=0x7f6b2162bdf0,
xport=0x55bb9eae9110, flow=0x7f6b2162ce50, tunnel_odp_port=1) at
ofproto/ofproto-dpif-xlate.c:3387
#16 0x000055bb9c47961a in compose_output_action__ (ctx=0x7f6b2162bdf0,
ofp_port=2, xr=0x0, check_stp=true) at ofproto/ofproto-dpif-xlate.c:3755
#17 0x000055bb9c47992c in compose_output_action (ctx=0x7f6b2162bdf0,
ofp_port=2, xr=0x0) at ofproto/ofproto-dpif-xlate.c:3847
#18 0x000055bb9c4752b0 in output_normal (ctx=0x7f6b2162bdf0,
out_xbundle=0x55bba0e04840, xvlan=0x7f6b2162a230) at
ofproto/ofproto-dpif-xlate.c:2290
#19 0x000055bb9c4765c0 in xlate_normal_flood (ctx=0x7f6b2162bdf0,
in_xbundle=0x55bb9eaf53f0, xvlan=0x7f6b2162a230) at
ofproto/ofproto-dpif-xlate.c:2769
#20 0x000055bb9c47724e in xlate_normal (ctx=0x7f6b2162bdf0) at
ofproto/ofproto-dpif-xlate.c:2999
#21 0x000055bb9c47ba71 in xlate_output_action (ctx=0x7f6b2162bdf0, port=65530,
max_len=0, may_packet_in=true) at ofproto/ofproto-dpif-xlate.c:4597
#22 0x000055bb9c47e2d6 in do_xlate_actions (ofpacts=0x55bba0ea40f8,
ofpacts_len=16, ctx=0x7f6b2162bdf0) at ofproto/ofproto-dpif-xlate.c:5587
#23 0x000055bb9c479a14 in xlate_recursively (ctx=0x7f6b2162bdf0,
rule=0x55bb9eaebcc0, deepens=true) at ofproto/ofproto-dpif-xlate.c:3867
#24 0x000055bb9c479d8d in xlate_table_action (ctx=0x7f6b2162bdf0, in_port=1,
table_id=0 '\000', may_packet_in=true, honor_table_miss=true) at
ofproto/ofproto-dpif-xlate.c:3937
#25 0x000055bb9c478da3 in compose_output_action__ (ctx=0x7f6b2162bdf0,
ofp_port=1, xr=0x0, check_stp=true) at ofproto/ofproto-dpif-xlate.c:3595
#26 0x000055bb9c47992c in compose_output_action (ctx=0x7f6b2162bdf0,
ofp_port=1, xr=0x0) at ofproto/ofproto-dpif-xlate.c:3847
#27 0x000055bb9c4752b0 in output_normal (ctx=0x7f6b2162bdf0,
out_xbundle=0x55bba0e04740, xvlan=0x7f6b2162b700) at
ofproto/ofproto-dpif-xlate.c:2290
#28 0x000055bb9c4765c0 in xlate_normal_flood (ctx=0x7f6b2162bdf0,
in_xbundle=0x55bba0e1df00, xvlan=0x7f6b2162b700) at
ofproto/ofproto-dpif-xlate.c:2769
#29 0x000055bb9c47724e in xlate_normal (ctx=0x7f6b2162bdf0) at
ofproto/ofproto-dpif-xlate.c:2999
#30 0x000055bb9c47ba71 in xlate_output_action (ctx=0x7f6b2162bdf0, port=65530,
max_len=0, may_packet_in=true) at ofproto/ofproto-dpif-xlate.c:4597
#31 0x000055bb9c47e2d6 in do_xlate_actions (ofpacts=0x55bba0da4db8,
ofpacts_len=16, ctx=0x7f6b2162bdf0) at ofproto/ofproto-dpif-xlate.c:5587
#32 0x000055bb9c480bc2 in xlate_actions (xin=0x7f6b2162ce40,
xout=0x7f6b2164dc90) at ofproto/ofproto-dpif-xlate.c:6481
#33 0x000055bb9c469a75 in upcall_xlate (udpif=0x55bba0d81060,
upcall=0x7f6b2164dc30, odp_actions=0x7f6b2164dca8, wc=0x7f6b2164dce8) at
ofproto/ofproto-dpif-upcall.c:1312
#34 0x000055bb9c46a0cd in process_upcall (udpif=0x55bba0d81060,
upcall=0x7f6b2164dc30, odp_actions=0x7f6b2164dca8, wc=0x7f6b2164dce8) at
ofproto/ofproto-dpif-upcall.c:1449
#35 0x000055bb9c468dbd in recv_upcalls (handler=0x55bba0eaf268) at
ofproto/ofproto-dpif-upcall.c:974
#36 0x000055bb9c4689a8 in udpif_upcall_handler (arg=0x55bba0eaf268) at
ofproto/ofproto-dpif-upcall.c:894
#37 0x000055bb9c57e87b in ovsthread_wrapper (aux_=0x55bba0e1e350) at
lib/ovs-thread.c:708
#38 0x00007f6b6cf4edf5 in start_thread () from /usr/lib64/libpthread.so.0
#39 0x00007f6b6b0e548d in clone () from /usr/lib64/libc.so.6
(gdb) n
54 atomic_read_relaxed(&CONST_CAST(struct versions *,
(gdb) n
58 return versions->add_version <= version && version <
remove_version;
(gdb) p versions->add_version
$1 = 6
(gdb) p version
$2 = 6
(gdb) c
Continuing.
After refreshing the br-physnet flows, the versions do not match:
Breakpoint 3, versions_visible_in_version (versions=0x55bb9eaee620, version=6)
at lib/versions.h:50
50 {
(gdb) bt
#0 versions_visible_in_version (versions=0x55bb9eaee620, version=6) at
lib/versions.h:54
#1 0x000055bb9c499e00 in cls_match_visible_in_version (rule=0x55bb9eaee600,
version=6) at lib/classifier-private.h:109
#2 0x000055bb9c49df5e in find_match (subtable=0x55bba0eae8b0, version=6,
flow=0x7f6b21627e60, hash=0) at lib/classifier.c:1658
#3 0x000055bb9c49e257 in find_match_wc (subtable=0x55bba0eae8b0, version=6,
flow=0x7f6b21627e60, trie_ctx=0x7f6b21626000, n_tries=2, wc=0x7f6b21626ae0) at
lib/classifier.c:1712
#4 0x000055bb9c49c6ee in classifier_lookup__ (cls=0x55bba0dae598, version=6,
flow=0x7f6b21627e60, wc=0x7f6b21626ae0, allow_conjunctive_matches=true) at
lib/classifier.c:972
#5 0x000055bb9c49ce2a in classifier_lookup (cls=0x55bba0dae598, version=6,
flow=0x7f6b21627e60, wc=0x7f6b21626ae0) at lib/classifier.c:1166
#6 0x000055bb9c456cf4 in rule_dpif_lookup_in_table (ofproto=0x55bba0da4fd0,
version=6, table_id=0 '\000', flow=0x7f6b21627e60, wc=0x7f6b21626ae0) at
ofproto/ofproto-dpif.c:4195
#7 0x000055bb9c45708e in rule_dpif_lookup_from_table (ofproto=0x55bba0da4fd0,
version=6, flow=0x7f6b21627e60, wc=0x7f6b21626ae0, stats=0x7f6b21627dc0,
table_id=0x7f6b2162702a "", in_port=2, may_packet_in=true,
honor_table_miss=true,
xcache=0x0) at ofproto/ofproto-dpif.c:4309
#8 0x000055bb9c479cb0 in xlate_table_action (ctx=0x7f6b21626d40, in_port=2,
table_id=0 '\000', may_packet_in=true, honor_table_miss=true) at
ofproto/ofproto-dpif-xlate.c:3913
#9 0x000055bb9c47ba60 in xlate_output_action (ctx=0x7f6b21626d40, port=65529,
max_len=0, may_packet_in=true) at ofproto/ofproto-dpif-xlate.c:4593
#10 0x000055bb9c47e2d6 in do_xlate_actions (ofpacts=0x7f6b216285a0,
ofpacts_len=12, ctx=0x7f6b21626d40) at ofproto/ofproto-dpif-xlate.c:5587
#11 0x000055bb9c480bc2 in xlate_actions (xin=0x7f6b21627e50,
xout=0x7f6b21627da0) at ofproto/ofproto-dpif-xlate.c:6481
#12 0x000055bb9c4568ad in ofproto_dpif_execute_actions__
(ofproto=0x55bba0da4fd0, version=6, flow=0x7f6b216285b0, rule=0x0,
ofpacts=0x7f6b216285a0, ofpacts_len=12, depth=1, resubmits=1,
packet=0x7f6b21628880) at ofproto/ofproto-dpif.c:4077
#13 0x000055bb9c477ce7 in compose_table_xlate (ctx=0x7f6b2162bdf0,
out_dev=0x55bb9eaeb040, packet=0x7f6b21628880) at
ofproto/ofproto-dpif-xlate.c:3294
#14 0x000055bb9c477eb8 in tnl_send_arp_request (ctx=0x7f6b2162bdf0,
out_dev=0x55bb9eaeb040, eth_src=..., ip_src=16777296, ip_dst=33554512) at
ofproto/ofproto-dpif-xlate.c:3326
#15 0x000055bb9c4781d0 in build_tunnel_send (ctx=0x7f6b2162bdf0,
xport=0x55bba0e56430, flow=0x7f6b2162ce50, tunnel_odp_port=1) at
ofproto/ofproto-dpif-xlate.c:3387
#16 0x000055bb9c47961a in compose_output_action__ (ctx=0x7f6b2162bdf0,
ofp_port=2, xr=0x0, check_stp=true) at ofproto/ofproto-dpif-xlate.c:3755
#17 0x000055bb9c47992c in compose_output_action (ctx=0x7f6b2162bdf0,
ofp_port=2, xr=0x0) at ofproto/ofproto-dpif-xlate.c:3847
#18 0x000055bb9c4752b0 in output_normal (ctx=0x7f6b2162bdf0,
out_xbundle=0x55bba0d89b00, xvlan=0x7f6b2162a230) at
ofproto/ofproto-dpif-xlate.c:2290
#19 0x000055bb9c4765c0 in xlate_normal_flood (ctx=0x7f6b2162bdf0,
in_xbundle=0x55bba0d8a500, xvlan=0x7f6b2162a230) at
ofproto/ofproto-dpif-xlate.c:2769
#20 0x000055bb9c47724e in xlate_normal (ctx=0x7f6b2162bdf0) at
ofproto/ofproto-dpif-xlate.c:2999
#21 0x000055bb9c47ba71 in xlate_output_action (ctx=0x7f6b2162bdf0, port=65530,
max_len=0, may_packet_in=true) at ofproto/ofproto-dpif-xlate.c:4597
#22 0x000055bb9c47e2d6 in do_xlate_actions (ofpacts=0x55bba0ea40f8,
ofpacts_len=16, ctx=0x7f6b2162bdf0) at ofproto/ofproto-dpif-xlate.c:5587
#23 0x000055bb9c479a14 in xlate_recursively (ctx=0x7f6b2162bdf0,
rule=0x55bb9eaebcc0, deepens=true) at ofproto/ofproto-dpif-xlate.c:3867
#24 0x000055bb9c479d8d in xlate_table_action (ctx=0x7f6b2162bdf0, in_port=1,
table_id=0 '\000', may_packet_in=true, honor_table_miss=true) at
ofproto/ofproto-dpif-xlate.c:3937
#25 0x000055bb9c478da3 in compose_output_action__ (ctx=0x7f6b2162bdf0,
ofp_port=1, xr=0x0, check_stp=true) at ofproto/ofproto-dpif-xlate.c:3595
#26 0x000055bb9c47992c in compose_output_action (ctx=0x7f6b2162bdf0,
ofp_port=1, xr=0x0) at ofproto/ofproto-dpif-xlate.c:3847
#27 0x000055bb9c4752b0 in output_normal (ctx=0x7f6b2162bdf0,
out_xbundle=0x55bba0d8f880, xvlan=0x7f6b2162b700) at
ofproto/ofproto-dpif-xlate.c:2290
#28 0x000055bb9c4765c0 in xlate_normal_flood (ctx=0x7f6b2162bdf0,
in_xbundle=0x55bba0eaf5d0, xvlan=0x7f6b2162b700) at
ofproto/ofproto-dpif-xlate.c:2769
#29 0x000055bb9c47724e in xlate_normal (ctx=0x7f6b2162bdf0) at
ofproto/ofproto-dpif-xlate.c:2999
#30 0x000055bb9c47ba71 in xlate_output_action (ctx=0x7f6b2162bdf0, port=65530,
max_len=0, may_packet_in=true) at ofproto/ofproto-dpif-xlate.c:4597
#31 0x000055bb9c47e2d6 in do_xlate_actions (ofpacts=0x55bba0da4db8,
ofpacts_len=16, ctx=0x7f6b2162bdf0) at ofproto/ofproto-dpif-xlate.c:5587
#32 0x000055bb9c480bc2 in xlate_actions (xin=0x7f6b2162ce40,
xout=0x7f6b2164dc90) at ofproto/ofproto-dpif-xlate.c:6481
#33 0x000055bb9c469a75 in upcall_xlate (udpif=0x55bba0d81060,
upcall=0x7f6b2164dc30, odp_actions=0x7f6b2164dca8, wc=0x7f6b2164dce8) at
ofproto/ofproto-dpif-upcall.c:1312
#34 0x000055bb9c46a0cd in process_upcall (udpif=0x55bba0d81060,
upcall=0x7f6b2164dc30, odp_actions=0x7f6b2164dca8, wc=0x7f6b2164dce8) at
ofproto/ofproto-dpif-upcall.c:1449
#35 0x000055bb9c468dbd in recv_upcalls (handler=0x55bba0eaf268) at
ofproto/ofproto-dpif-upcall.c:974
#36 0x000055bb9c4689a8 in udpif_upcall_handler (arg=0x55bba0eaf268) at
ofproto/ofproto-dpif-upcall.c:894
#37 0x000055bb9c57e87b in ovsthread_wrapper (aux_=0x55bba0e1e350) at
lib/ovs-thread.c:708
#38 0x00007f6b6cf4edf5 in start_thread () from /usr/lib64/libpthread.so.0
#39 0x00007f6b6b0e548d in clone () from /usr/lib64/libc.so.6
(gdb) n
54 atomic_read_relaxed(&CONST_CAST(struct versions *,
(gdb) n
58 return versions->add_version <= version && version <
remove_version;
(gdb) p versions->add_version
$3 = 8
(gdb) p version
$4 = 6
(gdb) c
Continuing.
In the beginning, the versions of br-tun and br-physnet are both 6. Function
versions_visible_in_version() returns true, so the ARP request can be sent out
through br-physnet.
However, after I execute "ovs-ofctl del-flows br-physnet; ovs-ofctl add-flow
br-physnet "table=0,priority=0 actions=NORMAL"" to refresh the flows on
br-physnet, the version of br-physnet increments to 8, while the version of
br-tun is still 6. versions_visible_in_version() then returns false, and the
ARP request is dropped by br-physnet.
Through analysis, we finally found that in function compose_table_xlate(), the
variable "out_dev" is tunnel_bearing and "xbridge" is br-physnet. Since ctx is
the context of br-tun, ctx->xin->tables_version is the version of br-tun, which
equals 6.
In ofproto_dpif_execute_actions__(), OVS looks up flows in br-physnet and
checks that each rule's add_version is less than or equal to the "version"
parameter. In versions_visible_in_version(), 8 is not less than or equal to 6,
so the rule is not visible, the lookup returns no match, and the ARP request
packet is dropped.
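The visibility test behind this behaviour can be sketched in isolation. The following is a minimal model, not the OVS source: the names version_t, VERSION_NOT_REMOVED, struct rule_versions and rule_visible_in_version() are hypothetical stand-ins, and only the comparison itself is taken from the line shown in the gdb session above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the OVS types; not the real definitions. */
typedef uint64_t version_t;
#define VERSION_NOT_REMOVED UINT64_MAX

/* A rule is visible in the half-open range [add_version, remove_version). */
struct rule_versions {
    version_t add_version;     /* First version in which the rule is visible. */
    version_t remove_version;  /* First version in which it is hidden again. */
};

/* Same comparison as the one stepped through in gdb:
 * add_version <= version && version < remove_version. */
static bool
rule_visible_in_version(const struct rule_versions *v, version_t version)
{
    assert(v != NULL);
    return v->add_version <= version && version < v->remove_version;
}
```

With add_version = 6 and a lookup version of 6 the rule is visible; with add_version = 8 and a lookup version of 6 it is not, which corresponds to the two gdb sessions.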
When we replace the parameter "ctx->xin->tables_version" with
"ofproto_dpif_get_tables_version(xbridge->ofproto)", the version of br-physnet
is used for the version check, and the problem is solved.
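The effect of the replacement described above can be illustrated with a toy model. This is only a sketch under stated assumptions: struct toy_bridge, struct toy_rule and the three functions are hypothetical, each bridge holds a single rule, and nothing here is the actual OVS implementation. It contrasts a lookup that reuses the originating bridge's version with one that uses the target bridge's own version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy types; not the OVS data structures. */
typedef uint64_t version_t;
#define VERSION_NOT_REMOVED UINT64_MAX

struct toy_rule {
    version_t add_version;
    version_t remove_version;
};

/* Toy bridge: its current tables version plus a single rule. */
struct toy_bridge {
    const char *name;
    version_t tables_version;
    struct toy_rule rule;
};

/* Returns the bridge's rule if it is visible in 'version', else NULL. */
static const struct toy_rule *
toy_lookup(const struct toy_bridge *br, version_t version)
{
    assert(br != NULL);
    const struct toy_rule *r = &br->rule;
    return (r->add_version <= version && version < r->remove_version)
           ? r : NULL;
}

/* Reported behaviour: the target bridge is searched with the version of
 * the bridge that originated the translation. */
static const struct toy_rule *
lookup_with_caller_version(const struct toy_bridge *caller,
                           const struct toy_bridge *target)
{
    return toy_lookup(target, caller->tables_version);
}

/* Proposed behaviour: the target bridge is searched with its own version. */
static const struct toy_rule *
lookup_with_target_version(const struct toy_bridge *target)
{
    return toy_lookup(target, target->tables_version);
}
```

With br-tun at version 6 and br-physnet at version 8 holding a rule added in version 8, the first lookup finds nothing and the second finds the rule.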
We would like to confirm: is this a real problem, and is our solution reasonable?
[Image: screenshot of compose_table_xlate() in ofproto/ofproto-dpif-xlate.c]
    static int
    compose_table_xlate(struct xlate_ctx *ctx, const struct xport *out_dev,
                        struct dp_packet *packet)
    {
        struct xbridge *xbridge = out_dev->xbridge;
        struct ofpact_output output;
        struct flow flow;

        ofpact_init(&output.ofpact, OFPACT_OUTPUT, sizeof output);
        flow_extract(packet, &flow);
        flow.in_port.ofp_port = out_dev->ofp_port;
        output.port = OFPP_TABLE;
        output.max_len = 0;

        return ofproto_dpif_execute_actions__(xbridge->ofproto,
                                              ctx->xin->tables_version, &flow,
                                              NULL, &output.ofpact,
                                              sizeof output, ctx->depth,
                                              ctx->resubmits, packet);
    }
许琛 (chenxu)
Cloud Infrastructure Service Product Dept., Cloud BU, HUAWEI
Mobile: 18867525645 Tel: 0755- 12345678
China - Hangzhou - HUAWEI Hangzhou R&D Center Z6 [4-B15R-015S]
HUAWEI R&D Center, Jiangshu Str., Binjiang District, Hangzhou 310051, P.R.China
E-mail: [email protected]<mailto:[email protected]>
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss