> -----Original Message-----
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: Wednesday, June 3, 2020 8:35 AM
> To: Yanqin Wei <yanqin....@arm.com>
> Cc: Linhaifeng <haifeng....@huawei.com>; d...@openvswitch.org; nd
> <n...@arm.com>; Lilijun (Jerry) <jerry.lili...@huawei.com>; chenchanghu
> <chenchan...@huawei.com>; Lichunhe <lichu...@huawei.com>
> Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first
> 
> This is not how RCU works in OVS.  Every thread is by default considered
> active.  They rarely quiesce except implicitly inside poll_block().
> Please read the large comment at the top of ovs-rcu.h.
> 
> Is your patch based on actual bugs that you have found, or is it just some
> kind of precaution?  If it is the latter, then it is not needed.
> 
It is an actual bug in an old version, but the same fix also applies to the
other code in OVS.

Here is the debug info:
linux-mNuKFc:/Images/linhf/830/Euler_compile_env # gdb -p `pidof ovs-vswitchd`
GNU gdb (GDB) Red Hat Enterprise Linux 8.2-3.h2
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-Huawei-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 102706
[New LWP 109133]
[New LWP 109134]
[New LWP 109297]
[New LWP 109298]
[New LWP 109299]
[New LWP 109300]
[New LWP 109303]
[New LWP 109304]
[New LWP 109308]
[New LWP 109309]
[New LWP 109310]
[New LWP 109311]
[New LWP 109522]
[New LWP 109523]
[New LWP 109603]
[New LWP 109615]
[New LWP 109619]
[New LWP 109655]
[New LWP 109673]
[New LWP 109794]
[New LWP 109795]
[New LWP 113953]
[New LWP 114362]
[New LWP 114364]
[New LWP 114368]
[New LWP 114370]
[New LWP 114373]
[New LWP 114377]
[New LWP 115594]
[New LWP 115595]
[New LWP 115596]
[New LWP 115597]
[New LWP 115598]
[New LWP 115600]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x0000ffff879981ac in poll () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glib2-2.54.2-2.h1.aarch64 
glibc-2.28-9.h17.aarch64 keyutils-libs-1.5.8-3.aarch64 
krb5-libs-1.15.1-34.h2.aarch64 libcgroup-0.41-15.h3.aarch64 
libcom_err-1.44.3-1.h4.aarch64 libgcc-7.3.0-20190804.h18.aarch64 
libselinux-2.5-12.aarch64 numactl-libs-2.0.9-7.h1.aarch64 
openssl-libs-1.0.2k-16.h6.aarch64 pcre-8.32-17.h9.aarch64 
uvpkmc-1.0.1-807.aarch64 zlib-1.2.7-17.aarch64
(gdb) b dpcls_destroy_subtable
Breakpoint 1 at 0x508bcc: file lib/dpif-netdev.c, line 6919.
(gdb) b ovsrcu_call_postponed
Breakpoint 2 at 0x5b7d34: file lib/ovs-rcu.c, line 336.
(gdb) c
Continuing.
[Switching to Thread 0xffff83b97860 (LWP 109304)]

Thread 9 "urcu2" hit Breakpoint 2, ovsrcu_call_postponed () at lib/ovs-rcu.c:336
warning: Source file is more recent than executable.
336     {
(gdb) n
339         int wait_del = 0;
(gdb) 
340         while(wait_del);
(gdb) set wait_del = 1
(gdb) c
Continuing.
[Switching to Thread 0xffff51748860 (LWP 115598)]

Thread 34 "revalidator19" hit Breakpoint 1, dpcls_destroy_subtable 
(cls=0xffff1c00a420, subtable=0xffff3c009250) at lib/dpif-netdev.c:6919
6919        int wait_get = 0;
(gdb) n
6920        VLOG_DBG("Destroying subtable %p for in_port %d", subtable, 
cls->in_port);
(gdb) 
6921        pvector_remove(&cls->subtables, subtable);
(gdb) 
6922        cmap_remove(&cls->subtables_map, &subtable->cmap_node,
(gdb) set wait_get = 1
(gdb) n
6924        cmap_destroy(&subtable->rules);
(gdb) p subtable->rules
$1 = {impl = {p = 0xffff30008940}}
(gdb) s
cmap_destroy (cmap=0xffff3c009258) at lib/cmap.c:288
288         if (cmap) {
(gdb) n
289             struct cmap_impl *impl = cmap_get_impl(cmap);
(gdb) 
290             if (impl != &empty_cmap) {
(gdb) 
291                 ovsrcu_postpone(free_cacheline, impl);
(gdb) s
ovsrcu_postpone__ (function=0x6029b0 <free_cacheline>, aux=0xffff30008940) at 
lib/ovs-rcu.c:315
315         struct ovsrcu_perthread *perthread = ovsrcu_perthread_get();
(gdb) n
318         int size = ARRAY_SIZE(cbset->cbs);
(gdb) 
319         cbset = perthread->cbset;
(gdb) 
320         if (!cbset) {
(gdb) 
325         cb = &cbset->cbs[cbset->n_cbs++];
(gdb) 
326         cb->function = function;
(gdb) 
327         cb->aux = aux;
(gdb) 
329         if (cbset->n_cbs >= size) {
(gdb) set size = cbset->n_cbs
(gdb) n
330             ovsrcu_flush_cbset(perthread);
(gdb) s
ovsrcu_flush_cbset (perthread=0xffff30001210) at lib/ovs-rcu.c:397
397         ovsrcu_flush_cbset__(perthread, false);
(gdb) s
ovsrcu_flush_cbset__ (perthread=0xffff30001210, protected=false) at 
lib/ovs-rcu.c:380
380         struct ovsrcu_cbset *cbset = perthread->cbset;
(gdb) n
382         if (cbset) {
(gdb) 
383             guarded_list_push_back(&flushed_cbsets, &cbset->list_node, 
SIZE_MAX);
(gdb) 
384             perthread->cbset = NULL;
(gdb) 
386             if (protected) {
(gdb) 
389                 seq_change(flushed_cbsets_seq);
(gdb) 
392     }
(gdb) 
ovsrcu_flush_cbset (perthread=0xffff30001210) at lib/ovs-rcu.c:398
398     }
(gdb) 
ovsrcu_postpone__ (function=0x6029b0 <free_cacheline>, aux=0xffff30008940) at 
lib/ovs-rcu.c:332
332     }
(gdb) 
cmap_destroy (cmap=0xffff3c009258) at lib/cmap.c:294
294     }
(gdb) 
dpcls_destroy_subtable (cls=0xffff1c00a420, subtable=0xffff3c009250) at 
lib/dpif-netdev.c:6925
6925        while(wait_get);
(gdb) p wait_get
$2 = 1
(gdb) thir 9
Undefined command: "thir".  Try "help".
(gdb) thr 9
[Switching to thread 9 (Thread 0xffff83b97860 (LWP 109304))]
#0  ovsrcu_call_postponed () at lib/ovs-rcu.c:340
340         while(wait_del);
(gdb) set wait_del = 0
(gdb) n
342         guarded_list_pop_all(&flushed_cbsets, &cbsets);
(gdb) 
343         if (ovs_list_is_empty(&cbsets)) {
(gdb) pcbsets cbsets
$173 = {list_node = {prev = 0xfffefc041660, next = 0xffff83b96f98}, cbs = 
{{function = 0x6029b0 <free_cacheline>, aux = 0xffff3c0094c0}, {function = 
0x6029b0 <free_cacheline>, aux = 0xffff30008940}, {function = 0x508b9c 
<subtable_free>, aux = 0xffff3c009c10}, {function = 0xffff87947448 <free>, aux 
= 0xffff30001390}, {function = 0x4f9b74 <dp_netdev_flow_free>, aux = 
0xffff34009b80}, {function = 0x0, aux = 0x0}, {function = 0x0, aux = 0x0}, 
{function = 0x90, aux = 0x74}, {function = 0x0, aux = 0xabc7c8 
<ovsrcu_threads>}, {function = 0x0, aux = 0x0}, {function = 0xffffffff, aux = 
0x0}, {function = 0x0, aux = 0x0}, {function = 0x0, aux = 0x2263e3}, {function 
= 0x0, aux = 0x3275637275}, {function = 0x0, aux = 0x21}, {function = 
0xffff3000ea30, aux = 0xffff300013f0}}, n_cbs = 2}
(gdb) prcus ovsrcu_threads
$174 = {list_node = {prev = 0xabc7c8 <ovsrcu_threads>, next = 0xffff30001390}, 
mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
__kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, 
__align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3301106, cbset = 0x0, 
name = "pmd14\000\000\000\000\000\000\000\000\000\000"}
$175 = {list_node = {prev = 0xffff38003970, next = 0xffff30001210}, mutex = 
{lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 
2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, 
__align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3297445, cbset = 0x0, 
name = "urcu2\000\000\000\000\000\000\000\000\000\000"}
$176 = {list_node = {prev = 0xffff30001390, next = 0xabc7c8 <ovsrcu_threads>}, 
mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
__kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, 
__align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3300573, cbset = 0x0, 
name = "revalidator19\000\000"}
(gdb) n
347         ovsrcu_synchronize();
(gdb) s
ovsrcu_synchronize () at lib/ovs-rcu.c:225
225     {
(gdb) n
226         unsigned int warning_threshold = INIT_WARNING_THRESHOLD_MS;
(gdb) 
227         unsigned int block_report_frequent = BLOCK_REPORT_FREQUENT_MS;
(gdb) 
230         int wait_round = 0;
(gdb) 
232         if (single_threaded()) {
(gdb) 
236         target_seqno = seq_read(global_seqno);
(gdb) 
237         ovsrcu_quiesce_start();
(gdb) p target_seqno
$177 = 3301107
(gdb) thr 23    // switch to the pmd thread and keep using next until ovsrcu_try_quiesce is executed
(gdb) 
4975                    if (!ovsrcu_try_quiesce()) {
(gdb) 
4976                        emc_cache_slow_sweep(&pmd->flow_cache);
(gdb) 
4980                    if (OVS_UNLIKELY(pmd->pmd_xinfo.flow_table_on_dp != 
flow_table_on_dp)) {
(gdb) prcus ovsrcu_threads
$194 = {list_node = {prev = 0xabc7c8 <ovsrcu_threads>, next = 0xffff30001210}, 
mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
__kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, 
__align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3301109, cbset = 0x0, 
name = "pmd14\000\000\000\000\000\000\000\000\000\000"}
$195 = {list_node = {prev = 0xffff38003970, next = 0xabc7c8 <ovsrcu_threads>}, 
mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
__kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
      __size = '\000' <repeats 16 times>, "\002", '\000' <repeats 30 times>, 
__align = 0}, where = 0x840bc0 "<unlocked>"}, seqno = 3300573, cbset = 0x0, 
name = "revalidator19\000\000"}
(gdb) b util.c:261 if (p == 0xffff30008940)
Breakpoint 3 at 0x6029bc: file lib/util.c, line 261.
(gdb) b dpif-netdev.c:7129 if (subtable->rules->impl->p == 0xffff30008940)
Breakpoint 4 at 0x509444: file lib/dpif-netdev.c, line 7129.
(gdb) thr 34
[Switching to thread 34 (Thread 0xffff51748860 (LWP 115598))]
(gdb) set wait_get = 0
(gdb) c
Continuing.
[Switching to Thread 0xffff43ffe860 (LWP 113953)]

Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, 
keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) 
at lib/dpif-netdev.c:7129
7129            while(wait_free);
(gdb) p wait_free
$200 = 0
(gdb) set wait_free = 1
(gdb) c
Continuing.
[Switching to Thread 0xffff43ffe860 (LWP 113953)]

Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, 
keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) 
at lib/dpif-netdev.c:7129
7129            while(wait_free);
(gdb) c
Continuing.

Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, 
keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) 
at lib/dpif-netdev.c:7129
7129            while(wait_free);
(gdb) c
Continuing.

Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, 
keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) 
at lib/dpif-netdev.c:7129
7129            while(wait_free);
(gdb) c
Continuing.
[Switching to Thread 0xffff83b97860 (LWP 109304)]

Thread 9 "urcu2" hit Breakpoint 3, free_cacheline (p=0xffff30008940) at 
lib/util.c:261
261         free(p);
(gdb) c
(gdb) c
Continuing.
[Switching to Thread 0xffff43ffe860 (LWP 113953)]

Thread 23 "pmd14" hit Breakpoint 4, dpcls_lookup (cls=0xffff1c00a420, 
keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) 
at lib/dpif-netdev.c:7129
7129            while(wait_free);
(gdb) set wait_free = 0
(gdb) c
Continuing.

Thread 23 "pmd14" received signal SIGSEGV, Segmentation fault.
0x00000000004e1e50 in read_counter (bucket_=0xffff3003bfc0) at lib/cmap.c:333
333         atomic_read_explicit(&bucket->counter, &counter, 
memory_order_acquire);
(gdb) bt
#0  0x00000000004e1e50 in read_counter (bucket_=0xffff3003bfc0) at 
lib/cmap.c:333
#1  0x00000000004e1e9c in read_even_counter (bucket=0xffff3003bfc0) at 
lib/cmap.c:344
#2  0x00000000004e2270 in cmap_find_batch (cmap=0xffff3c009258, map=1, 
hashes=0xffff43ffcb18, nodes=0xffff43ffcb98) at lib/cmap.c:459
#3  0x0000000000509468 in dpcls_lookup (cls=0xffff1c00a420, 
keys=0xffff43ffd3c0, rules=0xffff43ffcce0, cnt=1, num_lookups_p=0xffff43ffcd84) 
at lib/dpif-netdev.c:7131
#4  0x00000000004fabdc in dp_netdev_lookup_flow (classifiers=0xffff5237b0c8, 
key=0xffff43ffd3c0, lookup_num_p=0xffff43ffcd84) at lib/dpif-netdev.c:2111
#5  0x000000000050748c in fast_path_processing (pmd=0xffff51f4b000, 
packets_=0xffff43ffddb8, keys=0xffff43ffd3c0, batches=0xffff43ffd270, 
n_batches=0xffff43ffd6c0, in_port=11, now=862863019) at lib/dpif-netdev.c:6027
#6  0x0000000000507bb0 in dp_netdev_input__ (pmd=0xffff51f4b000, 
packets=0xffff43ffddb8, md_is_valid=false, port_no=11) at lib/dpif-netdev.c:6138
#7  0x0000000000507fa0 in dp_netdev_input (pmd=0xffff51f4b000, 
packets=0xffff43ffddb8, port_no=11) at lib/dpif-netdev.c:6222
#8  0x000000000067d130 in dp_uevs_forward (evs_ctx=0xffff43ffdd60, port_no=11, 
md_is_valid=false) at lib/evs/evs-dpdk.c:2717
#9  0x00000000005035e4 in evs_netdev_process_rxq_port (enable_blc=0, 
port_args=0xffff43ffdd40, pmd_args=0xffff43ffdd88) at 
lib/evs/evs-dpdk-inline.h:162
#10 evs_forward_loop (lacp_only_mode=false, ts=0xffff43ffdd30, 
port_args=0xffff43ffdd40, pmd_args=0xffff43ffdd88, poll_list=0xffff38009030, 
poll_cnt=24) at lib/dpif-netdev.c:4788
#11 pmd_thread_main (f_=0xffff51f4b000) at lib/dpif-netdev.c:4905
#12 0x00000000005bb0fc in ovsthread_wrapper (aux_=0x282d2280) at 
lib/ovs-thread.c:715
#13 0x0000ffff87f5c8bc in start_thread () from /lib64/libpthread.so.0
#14 0x0000ffff879a1e7c in thread_start () from /lib64/libc.so.6
> On Tue, Jun 02, 2020 at 11:22:57PM +0000, Yanqin Wei wrote:
> > Hi Ben,
> >
> > If my understanding is correct, the writer could not be a rcu thread
> > because it does not need report holding or not holding pointers.
> > So old memory will be freed after all rcu thread report quiesce.
> >
> > Best Regards,
> > Wei Yanqin
> >
> > > -----Original Message-----
> > > From: Ben Pfaff <b...@ovn.org>
> > > Sent: Wednesday, June 3, 2020 1:28 AM
> > > To: Linhaifeng <haifeng....@huawei.com>
> > > Cc: Yanqin Wei <yanqin....@arm.com>; d...@openvswitch.org; nd
> > > <n...@arm.com>; Lilijun (Jerry) <jerry.lili...@huawei.com>;
> > > chenchanghu <chenchan...@huawei.com>; Lichunhe <lichu...@huawei.com>
> > > Subject: Re: [ovs-dev] [PATCH v2] ovs rcu: update rcu pointer first
> > >
> > > On Tue, Jun 02, 2020 at 07:27:59AM +0000, Linhaifeng wrote:
> > > > We should update rcu pointer first then use ovsrcu_postpone to
> > > > free otherwise maybe cause use-after-free.
> > > > e.g.,reader indicates momentary quiescent and access old pointer
> > > > after writer postpone free old pointer and before setting new pointer.
> > > >
> > > > Signed-off-by: Linhaifeng <haifeng....@huawei.com>
> > >
> > > I don't see how that's possible, since the writer hasn't quiesced.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
