On 2/9/22 16:23, Dumitru Ceara wrote:
On 1/24/22 11:34, Adrian Moreno wrote:
When running builds with UBSan, some undefined behavior was detected in the 
iteration of common data structures in OVS.
Coincidentally, a bug was reported [1] whose root cause was another, this time 
undetected, undefined behavior in the iteration macros.

From both cases, we conclude that the way we're currently iterating the data 
structures is prone to errors and UB. This series is an attempt to rewrite 
those macros in a UB-safe manner.

Hi Adrian,

Thanks for this!  The patchset needs a small rebase, nothing major
though.  That's also why 0-day robot failed to build.


The core problem is that the OBJECT_CONTAINING(POINTER, OBJECT, MEMBER) macro is 
being used on invalid POINTER values. In some cases we use NULL to compute the 
end-of-loop condition. In others, we allow it to point to non-contained objects (e.g: a 
non-contained stack allocated "struct ovs_list" as in [1]).

In order to systematically solve this in all cases this series introduces a new set of 
macros that implement a multi-variable loop iteration. They declare a hidden iterator 
variable inside the loop, use it to iterate and evaluate the loop condition and only compute 
its OBJECT_CONTAINING if it satisfies the loop condition. One consequence of this safety 
guard is that the pointer provided by the user is set to NULL after the loop (if not 
exited via "break;").
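
As a rough illustration of the multi-variable idea described above, here is a simplified sketch. It is not the macro set from the series: "struct node", "struct item", ITEM_CONTAINING and ITEM_FOR_EACH are made-up names, the list is NULL-terminated rather than circular like "struct ovs_list", and the element type is hard-coded to keep the example in plain C99.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal list, NOT the real struct ovs_list: NULL-terminated. */
struct node {
    struct node *next;
};

struct item {
    int value;
    struct node link;   /* Embedded list node. */
};

/* container_of-style helper.  Only meaningful if NODE really is the MEMBER
 * field of a TYPE object. */
#define ITEM_CONTAINING(NODE, TYPE, MEMBER) \
    ((TYPE *) ((char *) (NODE) - offsetof(TYPE, MEMBER)))

/* Multi-variable loop: the hidden iterator 'iter__<VAR>' walks the nodes and
 * decides when to stop.  VAR is computed only from a valid node and is set
 * to NULL once the iterator runs out. */
#define ITEM_FOR_EACH(VAR, HEAD)                                          \
    for (struct node *iter__##VAR = (HEAD);                               \
         (iter__##VAR                                                     \
          ? ((VAR) = ITEM_CONTAINING(iter__##VAR, struct item, link), 1)  \
          : ((VAR) = NULL, 0));                                           \
         iter__##VAR = iter__##VAR->next)

/* Sums three items and checks the post-loop state of the user pointer. */
static int
sum_three_items(void)
{
    struct item c = { 3, { NULL } };
    struct item b = { 2, { &c.link } };
    struct item a = { 1, { &b.link } };
    struct item *it;
    int sum = 0;

    ITEM_FOR_EACH (it, &a.link) {
        sum += it->value;
    }
    assert(it == NULL);   /* Loop ran to completion, so 'it' is NULL. */
    return sum;
}
```

The point of the sketch is that the end-of-loop test is done on the hidden node pointer, so the container computation never sees NULL or a node that is not embedded in an item.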


It seems that sparse is not too happy about these changes.

In the documentation we recommend sparse version 0.5.1 or later.
However, sparse complains about various things when run with any version
older than 0.6.2 (more specifically after [0]).  I think we should just
recommend a newer version of sparse and that's it.

[0] 
https://git.kernel.org/pub/scm/devel/sparse/sparse.git/commit/?id=ffb24e18c9b83e5878ee9ca4513deb5de235e15c

However, that's not the only issue with sparse.  It seems on specific
distributions (e.g., on my Fedora 34 test machine) sparse fails to use
the right headers.  I made it work with this change, although I'm not
sure this is the best way of doing things:

diff --git a/acinclude.m4 b/acinclude.m4
index 0c360fd1ef73..f704bf36cdfe 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -1424,7 +1424,7 @@ AC_DEFUN([OVS_ENABLE_SPARSE],
     : ${SPARSE=sparse}
     AC_SUBST([SPARSE])
     AC_CONFIG_COMMANDS_PRE(
-     [CC='$(if $(C:0=),env REAL_CC="'"$CC"'" CHECK="$(SPARSE) $(SPARSE_WERROR) -I $(top_srcdir)/include/sparse $(SPARSEFLAGS) $(SPARSE_EXTRA_INCLUDES) " cgcc $(CGCCFLAGS),'"$CC"')'])
+     [CC='$(if $(C:0=),env REAL_CC="'"$CC"'" CHECK="$(SPARSE) $(SPARSE_WERROR) -I $(top_srcdir)/include/sparse -I $(top_srcdir)/include $(SPARSEFLAGS) $(SPARSE_EXTRA_INCLUDES) " cgcc $(CGCCFLAGS),'"$CC"')'])
AC_ARG_ENABLE(
       [sparse],
---


Thanks for bisecting sparse and pinpointing the problematic patch.
I will add your patch and the recommendation to use sparse >= 0.6.2 in the next version of the series.


Apart from normal iteration, many OVS data structures have a _SAFE version of 
the loop which requires the user to declare an extra variable to hold the next 
value of the iterator. The _SAFE versions of the multi-variable iterators have 
the extra benefit of not requiring such an extra variable.
On relevant data structures, an initial patch rewrites the macros in a 
backwards compatible manner and a second patch modifies all the callers to 
remove the unneeded variable. The first patch would be easy to backport and the 
second would make code cleaner for the master branch.
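
To illustrate how a _SAFE loop can drop the caller-supplied "next" variable, here is a simplified sketch that keeps both the current and the next node in hidden variables. Again, this is not the code from the series: "struct node", "struct item", ITEM_CONTAINING and ITEM_FOR_EACH_SAFE are made-up names, and the list is assumed to be NULL-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal list, NOT the real struct ovs_list: NULL-terminated. */
struct node {
    struct node *next;
};

struct item {
    int value;
    struct node link;   /* Embedded list node. */
};

#define ITEM_CONTAINING(NODE, TYPE, MEMBER) \
    ((TYPE *) ((char *) (NODE) - offsetof(TYPE, MEMBER)))

/* Safe multi-variable loop: 'next__<VAR>' is read before the body runs, so
 * the body may free VAR.  The caller declares no "next" variable. */
#define ITEM_FOR_EACH_SAFE(VAR, HEAD)                                      \
    for (struct node *iter__##VAR = (HEAD),                                \
                     *next__##VAR = iter__##VAR ? iter__##VAR->next : NULL; \
         (iter__##VAR                                                      \
          ? ((VAR) = ITEM_CONTAINING(iter__##VAR, struct item, link), 1)   \
          : ((VAR) = NULL, 0));                                            \
         iter__##VAR = next__##VAR,                                        \
         next__##VAR = iter__##VAR ? iter__##VAR->next : NULL)

/* Builds a list of N heap-allocated items, frees every item from inside the
 * loop body and returns how many items were freed. */
static int
build_and_free(int n)
{
    struct node *head = NULL;

    for (int i = 0; i < n; i++) {
        struct item *new_item = malloc(sizeof *new_item);

        if (!new_item) {
            abort();
        }
        new_item->value = i;
        new_item->link.next = head;
        head = &new_item->link;
    }

    struct item *it;
    int freed = 0;

    ITEM_FOR_EACH_SAFE (it, head) {
        free(it);         /* Safe: the next node was saved beforehand. */
        freed++;
    }
    assert(it == NULL);   /* Nothing dangling is left in 'it'. */
    return freed;
}
```

A caller that needs explicit access to the "next" element, as in the OVN case mentioned in (a) below, would not be served by this form, since the saved pointer is hidden inside the macro.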

Although this sounds nice, I'm afraid it creates some issues:
a. OVN relies in some places on the fact that it can explicitly access
    the "next" of the iterator, e.g.:
    https://github.com/ovn-org/ovn/blob/main/lib/expr.c#L1958

    We can fix it but it's not too nice.

b. There's currently some inconsistency in the way safe container
    iterators can be used:
    - SSET_FOR_EACH_SAFE() still expects the user to pass a "next"
      variable.
    - the IDL safe iterator helpers in ovsdb-idlc still require the user
      to provide a "next" variable:
      https://github.com/openvswitch/ovs/blob/master/ovsdb/ovsdb-idlc.in#L254


Good catch! I'll take care of those two iterators as well and make it homogeneous in the next version.

c. The changes in this OVS patchset seem to allow easier backport but
    that won't really be the case for OVN.

I wonder if it makes sense to keep both versions:
- the old *_SAFE(.., next, ..)
- a new *_SAFE_V2 (or a better name) that drops the need for explicitly
   supplied "next".

What do you think?

How about *_SAFE_OLD(...) for the old one and *_SAFE(...) for the new one? :-D
I'm just thinking that it would be a way to promote the use of the new and cleaner iterator in new code.

On the other hand, moving most of the users to the new macro should set a good example while we keep backwards compatibility. But even if we keep the old name for the old version of the macro, it should get a new implementation that makes sure the "next" variable is not left pointing to something invalid, so I guess backwards compatibility is broken anyway. What do you think?


Testing notes:
In order to verify this series removes all the loop-related UB, I've tested it 
on top of Dumitru's series [2] (without patch 1/11, which can hide some still 
invalid use of OBJECT_CONTAINING).
I've also verified no extra errors are reported through clang-analyzer.

Limitations:
The proposed approach benefits code readability, therefore the name of the 
iterator variable is derived from the name of the object pointer given by the 
caller.
This means that in an unlikely but still possible case in which a caller wants 
to nest two loops with the same iterating pointer object, the inner loop 
iterator variable will hide the one declared in the outer loop. This limitation 
is easy to spot (the compiler will warn) and easy to work around (just 
declaring another object pointer variable). I found no such code in the ovs 
tree.

Credits:
The idea was discussed in [3] and proposed by Jakub Jelinek.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2014942
[2] https://patchwork.ozlabs.org/project/openvswitch/list/?series=277900
[3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103964

Adrian Moreno (13):
   util: add multi-variable loop iterator macros
   util: add safe multi-variable iterators
   list: use multi-variable helpers for list loops
   list: ensure iterator is NULL after pop loop
   list: remove the next variable in safe loops
   hmap: use multi-variable helpers for hmap loops
   hmap: implement UB-safe hmap pop iterator
   hmap: remove the next variable in safe loops
   cmap: use multi-variable iterators
   hindex: use multi-variable iterators
   hindex: remove the next variable in safe loops
   rculist: use multi-variable helpers for loop macros
   vtep: use _SAFE iterator if freing the iterator

I did try this patchset out (after quickly translating the OVN callers
too) and it seems to work fine.  There was one place in OVN where we
relied on a specific value of the LIST iterator after a complete
iteration but I *think* that was the only case:

https://github.com/ovn-org/ovn/blob/main/controller/ofctrl.c#L894

Regards,
Dumitru


--
Adrián Moreno
