On April 1, 2019 2:56:13 PM GMT+02:00, Ilya Maximets <[email protected]> wrote:
>Hi.
>
>Thanks for the report.
>
>It looks like an RCU misuse inside the pvector implementation.
>The following sequence will lead to the use-after-free:
>
>        Thread#1                             Thread#2
>
>  1. pvector_init(pvec);
>  2. pvector_insert(pvec, elem1);
>  3. pvector_publish(pvec);
>
>     /* Remove from 'temp' impl. */
>  4. pvector_remove(pvec, elem1);
>     /* Postpone the free() */
>  5. ovsrcu_postpone(free, elem1);
>                                       6. cursor__ = pvector_cursor_init(PVECTOR, 0, 0);
>                                          /* Get pointer from main impl. */
>                                       7. elem1 = pvector_cursor_next(&cursor__, INT_MIN, 0, 0);
>
>                                       8. ovsrcu_quiesce();
>
>                                       9. cursor__ = pvector_cursor_init(PVECTOR, 0, 0);
>                                          /* Get pointer from main impl. */
>                                      10. elem1 = pvector_cursor_next(&cursor__, INT_MIN, 0, 0);
>
>     /* Swap 'temp' with the main impl. */
>     /* Postpone the free() of the old impl. */
> 11. pvector_publish(pvec);
>     /* Postponed free() frees elem1 because both
>      * threads quiesced once since the postpone(). */
> 12. ovsrcu_quiesce();
>
>                                          /* Try to use the obtained pointer. Use after free. */
>                                      13. elem1->use();
>                                          /* Postponed free() frees the old 'impl'. */
>                                      14. ovsrcu_quiesce();
>
>Thread#2 assumes that it's safe to use a pvector element within its grace
>period, i.e. between steps 8 and 14, but the element was already freed at
>step 12.
>
>So, in the current implementation it's not allowed for other threads to
>quiesce while one thread modifies the pvector.
>
>The main issue is that ovsrcu_get(struct pvector_impl *, &pvec->impl)
>doesn't return the most recent implementation.  It continues to return the
>old one until the changes are published.  But it can't return the new one,
>because it's not sorted.  So, I'm not sure how to fix this properly.  In
>fact, we need to postpone freeing the elements until after freeing the
>pvector->impl.  This probably could be achieved by double postponing the
>freeing of the pvector elements.  I'm not sure how to implement this
>correctly.  Another option is to force callers to call
>ovsrcu_postpone(free, elem1) only after pvector_publish().
>
>Ben, what do you think?
>
>Best regards, Ilya Maximets.
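
To make the ordering above concrete, a minimal writer-side sketch in C. It only restates the sequence Ilya describes, using the pvector/ovs-rcu calls named in it (pvector_remove(), ovsrcu_postpone(), pvector_publish()); "struct elem", both helper functions, and the include paths are hypothetical.

    /* Sketch only: 'struct elem' and both helpers are hypothetical. */
    #include <stdlib.h>
    #include "ovs-rcu.h"
    #include "pvector.h"

    struct elem {
        int id;
    };

    /* Problematic ordering (steps 4-5 and 11 above): the free of 'e' is
     * postponed while 'e' is still reachable through the published impl,
     * so a reader that quiesces once and then re-reads the pvector can
     * pick up 'e' after its own grace period for it has already ended. */
    static void
    remove_elem_buggy(struct pvector *pvec, struct elem *e)
    {
        pvector_remove(pvec, e);     /* Removes only from the 'temp' impl. */
        ovsrcu_postpone(free, e);    /* Grace period starts too early.     */
        pvector_publish(pvec);       /* Old impl is swapped out only here. */
    }

    /* The second option mentioned above: publish first, so 'e' is no
     * longer reachable from the pvector before its free is postponed. */
    static void
    remove_elem_publish_first(struct pvector *pvec, struct elem *e)
    {
        pvector_remove(pvec, e);
        pvector_publish(pvec);       /* Readers now see an impl without 'e'.  */
        ovsrcu_postpone(free, e);    /* Freed only after a grace period that
                                      * starts after the publish.             */
    }
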
>
>> Hi all,
>>
>> We found a core dump in our environment with OVS+DPDK; the following is the call stack:
>>
>> #0  0x00007f354d6eb237 in raise () from /lib64/libc.so.6
>> #1  0x00007f354d6ec928 in abort () from /lib64/libc.so.6
>> #2  0x00000000006a6d99 in PAT_abort ()
>> #3  0x00000000006a3edd in patchIllInsHandler ()
>> #4  <signal handler called>
>> #5  0x00000000004c2275 in rehash (hash=1661256324, impl=0x0) at lib/ccmap.c:135
>> #6  ccmap_find (ccmap=ccmap@entry=0x26b654b8, hash=1661256324) at lib/ccmap.c:229
>> #7  0x00000000004bf73b in find_match_wc (wc=0x7f350ce67370, n_tries=0, trie_ctx=0x7f350ce63e20, flow=0x7f350ce66dc0, version=18446744073709551614, subtable=<optimized out>) at lib/classifier.c:1676
>> #8  classifier_lookup__ (cls=cls@entry=0xee2360 <cls>, version=version@entry=18446744073709551614, flow=flow@entry=0x7f350ce66dc0, wc=wc@entry=0x7f350ce67370, allow_conjunctive_matches=allow_conjunctive_matches@entry=true) at lib/classifier.c:972
>> #9  0x00000000004c193b in classifier_lookup (cls=cls@entry=0xee2360 <cls>, version=version@entry=18446744073709551614, flow=flow@entry=0x7f350ce66dc0, wc=wc@entry=0x7f350ce67370) at lib/classifier.c:1166
>> #10 0x0000000000575acb in tnl_port_map_lookup (flow=flow@entry=0x7f350ce66dc0, wc=wc@entry=0x7f350ce67370) at lib/tnl_ports.c:287
>> #11 0x00000000004a7049 in compose_output_action__ (ctx=ctx@entry=0x7f350ce65d60, ofp_port=1, xr=0x0, check_stp=check_stp@entry=true) at ofproto/ofproto_dpif_xlate.c:3753
>> #12 0x00000000004a7967 in compose_output_action (xr=<optimized out>, ofp_port=<optimized out>, ctx=0x7f350ce65d60) at ofproto/ofproto_dpif_xlate.c:3832
>> #13 output_normal (ctx=ctx@entry=0x7f350ce65d60, out_xbundle=out_xbundle@entry=0x2cfe8120, xvlan=xvlan@entry=0x7f350ce655e0) at ofproto/ofproto_dpif_xlate.c:2288
>> #14 0x00000000004a7df5 in xlate_normal_flood (ctx=ctx@entry=0x7f350ce65d60, in_xbundle=in_xbundle@entry=0x9b547c0, xvlan=xvlan@entry=0x7f350ce655e0) at ofproto/ofproto_dpif_xlate.c:2757
>> #15 0x00000000004aaf39 in xlate_normal (ctx=0x7f350ce65d60) at ofproto/ofproto_dpif_xlate.c:2987
>> #16 xlate_output_action (ctx=ctx@entry=0x7f350ce65d60, port=<optimized out>, max_len=<optimized out>, may_packet_in=may_packet_in@entry=true) at ofproto/ofproto_dpif_xlate.c:4577
>> #17 0x00000000004a920e in do_xlate_actions (ofpacts=<optimized out>, ofpacts_len=<optimized out>, ctx=ctx@entry=0x7f350ce65d60) at ofproto/ofproto_dpif_xlate.c:5565
>> #18 0x00000000004ac25f in xlate_actions (xin=xin@entry=0x7f350ce66db0, xout=xout@entry=0x7f350ce675f0) at ofproto/ofproto_dpif_xlate.c:6456
>> #19 0x000000000049d4b7 in xlate_key (key=<optimized out>, len=<optimized out>, push=push@entry=0x7f350ce670f0, ctx=ctx@entry=0x7f350ce675d0, udpif=<optimized out>) at ofproto/ofproto_dpif_upcall.c:1974
>> #20 0x000000000049da48 in xlate_ukey (ukey=0x7f34ec032a20, ukey=0x7f34ec032a20, ctx=0x7f350ce675d0, tcp_flags=<optimized out>, udpif=0x6b6c550) at ofproto/ofproto_dpif_upcall.c:1986
>> #21 revalidate_ukey__ (udpif=udpif@entry=0x6b6c550, ukey=ukey@entry=0x7f34ec032a20, tcp_flags=<optimized out>, odp_actions=0x7f350ce679e0, recircs=recircs@entry=0x7f350ce679d0, xcache=<optimized out>) at ofproto/ofproto_dpif_upcall.c:2032
>> #22 0x000000000049dc8d in revalidate_ukey (udpif=udpif@entry=0x6b6c550, ukey=ukey@entry=0x7f34ec032a20, stats=stats@entry=0x7f350ce688b8, odp_actions=odp_actions@entry=0x7f350ce679e0, reval_seq=reval_seq@entry=268144954, recircs=recircs@entry=0x7f350ce679d0) at ofproto/ofproto_dpif_upcall.c:2128
>> #23 0x00000000004a0853 in revalidate (revalidator=0x75715d0) at ofproto/ofproto_dpif_upcall.c:2488
>> #24 0x00000000004a0a4b in udpif_revalidator (arg=0x75715d0) at ofproto/ofproto_dpif_upcall.c:958
>> #25 0x000000000054ca48 in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs_thread.c:682
>> #26 0x00007f354f303dd5 in start_thread () from /lib64/libpthread.so.0
>> #27 0x00007f354d7b359d in clone () from /lib64/libc.so.6
>>
>> It is caused by a wrong use of ovsrcu_postpone() when deleting a subtable from classifier->subtables (a pvector). The following are the triggering steps:
>>
>> 1. Modify a hnic port's IP (by ifconfig hnic x.x.x.x/24).
>> 2. The OVS route reset flushes all entries in the tnl_ports classifier (cls).
>> 3. The OVS main thread calls destroy_subtable(), which calls ovsrcu_postpone() to free the subtable; this happens to trigger ovsrcu_flush_cbset(). But cls->subtables->impl still stores the deleted subtable's pointer; it is only removed by pvector_publish() (in step 7).
>> 4. The RCU thread starts ovsrcu_synchronize() before freeing the subtable in the flush list (suppose the target_seqno is 2 and the main thread's seqno is smaller than 2, say 1).
>> 5. The PMD thread is still working; it bumps the global seqno to 3.
>> 6. The revalidator thread wakes up from poll_block() with seqno 3. It then acquires the subtable's pointer from cls->subtables, which should no longer be used.
>> 7. The OVS main thread calls pvector_publish() and goes to poll_block(), moving itself out of ovsrcu_threads.
>> 8. ovsrcu_synchronize() from step 4 finds that every thread's seqno is larger than 2 and frees the subtable.
>> 9. The revalidator is still using the subtable's pointer (already freed), then core dump.
>>
>> Is there any patch to fix this bug? Can we call the postpone (to free the subtable) after pvector_publish()?
>>
>> Looking forward to your reply.
>>
>> Thanks.
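
For the reader side (the revalidator walking cls->subtables in steps 6 and 9), the pattern is roughly the following. This is only a hedged sketch, not code from the tree: PVECTOR_FOR_EACH and ovsrcu_quiesce() are the interfaces referenced in this thread, while reader_loop() and the "struct elem" with its use() callback are invented for illustration.

    /* Sketch of the reader pattern: each pass over the pvector is assumed
     * to be safe within one grace period, but if the writer postponed the
     * element's free before pvector_publish(), a pointer fetched here may
     * already be scheduled for freeing. */
    #include <limits.h>
    #include "ovs-rcu.h"
    #include "pvector.h"

    struct elem {
        void (*use)(struct elem *);   /* Hypothetical, mirrors 'elem1->use()'. */
    };

    static void
    reader_loop(struct pvector *pvec)
    {
        for (;;) {
            struct elem *e;

            PVECTOR_FOR_EACH (e, pvec) {   /* May still walk the old impl.   */
                e->use(e);                 /* Potential use after free.      */
            }
            ovsrcu_quiesce();              /* Ends this thread's grace period;
                                            * postponed callbacks may run.   */
        }
    }
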
Hey, I'm out at a conference this week, so I probably won't be able to take a look at this soon.
