On 2/9/22 16:23, Dumitru Ceara wrote:
On 1/24/22 11:34, Adrian Moreno wrote:
When running builds with UBSan, some undefined behavior was detected in the
iteration of common data structures in OVS.
Coincidentally, a bug was reported [1] whose root cause was another, this time
undetected, undefined behavior in the iteration macros.
From both cases, we conclude that the way we're currently iterating the data
structures is prone to errors and UB. This series is an attempt to rewrite
those macros in a UB-safe manner.
Hi Adrian,
Thanks for this! The patchset needs a small rebase, nothing major
though. That's also why 0-day robot failed to build.
The core problem is that the OBJECT_CONTAINING(POINTER, OBJECT, MEMBER) macro is
being used on invalid POINTER values. In some cases we use NULL to compute the
end-of-loop condition. In others, we allow it to point to non-contained objects (e.g., a
non-contained stack allocated "struct ovs_list" as in [1]).
In order to systematically solve this in all cases, this series introduces a new set of
macros that implement a multi-variable loop iteration. They declare a hidden iterator
variable inside the loop, use it to iterate and evaluate the loop condition, and only compute
its OBJECT_CONTAINING if it satisfies the loop condition. One consequence of this safety
guard is that the pointer provided by the user is set to NULL after the loop (if not
exited via "break;").
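
To make the idea concrete, here is a minimal sketch of such a multi-variable loop. It is not the code from the series: struct node, struct item, ITEM_FROM_NODE, LIST_FOR_EACH_SKETCH, sum_items and find_item are all made-up names, and the list is a simplified stand-in for "struct ovs_list". Only the technique is taken from the description above: a hidden iterator declared in the for-init, with the containing object computed only when the loop condition holds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular doubly-linked list node (illustration only). */
struct node {
    struct node *prev, *next;
};

struct item {
    int value;
    struct node link;
};

/* Container-of; only valid if NODE really is embedded in a struct item. */
#define ITEM_FROM_NODE(NODE) \
    ((struct item *) (void *) ((char *) (NODE) - offsetof(struct item, link)))

/* Sketch of a multi-variable loop.  The hidden "struct node *" drives the
 * iteration; ITER is only computed from a node that is not the list head and
 * is set to NULL once the end of the list is reached. */
#define LIST_FOR_EACH_SKETCH(ITER, HEAD)                          \
    for (struct node *ITER##_iter_ = (HEAD)->next;                \
         (ITER##_iter_ != (HEAD)                                  \
          ? ((ITER) = ITEM_FROM_NODE(ITER##_iter_), 1)            \
          : ((ITER) = NULL, 0));                                  \
         ITER##_iter_ = ITER##_iter_->next)

static void
list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void
list_push_back(struct node *head, struct node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Sums all values; the user pointer is NULL after a complete iteration. */
static int
sum_items(struct node *head)
{
    struct item *it;
    int sum = 0;

    LIST_FOR_EACH_SKETCH (it, head) {
        sum += it->value;
    }
    assert(it == NULL);
    return sum;
}

/* Returns the first item holding 'value', or NULL if the loop ran to the
 * end without hitting "break;". */
static struct item *
find_item(struct node *head, int value)
{
    struct item *it;

    LIST_FOR_EACH_SKETCH (it, head) {
        if (it->value == value) {
            break;
        }
    }
    return it;
}
```

In this sketch the list head (which may live on the stack and is not embedded in any struct item) is never passed to ITEM_FROM_NODE, and a caller such as find_item can rely on the pointer being NULL when nothing was found.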
It seems that sparse is not too happy about these changes.
In the documentation we recommend sparse version 0.5.1 or later.
However, sparse complains about various things when run with any version
older than 0.6.2 (more specifically after [0]). I think we should just
recommend a newer version of sparse and that's it.
[0]
https://git.kernel.org/pub/scm/devel/sparse/sparse.git/commit/?id=ffb24e18c9b83e5878ee9ca4513deb5de235e15c
However, that's not the only issue with sparse. It seems on specific
distributions (e.g., on my Fedora 34 test machine) sparse fails to use
the right headers. I made it work with this change, although I'm not
sure this is the best way of doing things:
diff --git a/acinclude.m4 b/acinclude.m4
index 0c360fd1ef73..f704bf36cdfe 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -1424,7 +1424,7 @@ AC_DEFUN([OVS_ENABLE_SPARSE],
: ${SPARSE=sparse}
AC_SUBST([SPARSE])
AC_CONFIG_COMMANDS_PRE(
- [CC='$(if $(C:0=),env REAL_CC="'"$CC"'" CHECK="$(SPARSE) $(SPARSE_WERROR) -I $(top_srcdir)/include/sparse $(SPARSEFLAGS) $(SPARSE_EXTRA_INCLUDES) " cgcc $(CGCCFLAGS),'"$CC"')'])
+ [CC='$(if $(C:0=),env REAL_CC="'"$CC"'" CHECK="$(SPARSE) $(SPARSE_WERROR) -I $(top_srcdir)/include/sparse -I $(top_srcdir)/include $(SPARSEFLAGS) $(SPARSE_EXTRA_INCLUDES) " cgcc $(CGCCFLAGS),'"$CC"')'])
AC_ARG_ENABLE(
[sparse],
---
Thanks for bisecting sparse and pinpointing the problematic patch.
I will add your patch and the recommendation to use sparse >= 0.6.2 in the next
version of the series.
Apart from normal iteration, many OVS data structures have a _SAFE version of
the loop which requires the user to declare an extra variable to hold the next
value of the iterator. The _SAFE version of the multi-variable iterators has
the extra benefit of not requiring such an extra variable.
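
As a rough illustration of why the extra variable becomes unnecessary (again a sketch with made-up names, not the macros from the series): the "next" pointer can live in a second hidden variable declared by the loop itself, so the body may unlink and free the current element without the caller declaring anything. LIST_FOR_EACH_SAFE_SKETCH, item_push and drop_odd below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical circular doubly-linked list node (illustration only). */
struct node {
    struct node *prev, *next;
};

struct item {
    int value;
    struct node link;
};

#define ITEM_FROM_NODE(NODE) \
    ((struct item *) (void *) ((char *) (NODE) - offsetof(struct item, link)))

/* Safe-iteration sketch: two hidden variables hold the current node and its
 * successor, so the body may remove and free the current element. */
#define LIST_FOR_EACH_SAFE_SKETCH(ITER, HEAD)                         \
    for (struct node *ITER##_cur_ = (HEAD)->next,                     \
                     *ITER##_next_ = ITER##_cur_->next;               \
         (ITER##_cur_ != (HEAD)                                       \
          ? ((ITER) = ITEM_FROM_NODE(ITER##_cur_), 1)                 \
          : ((ITER) = NULL, 0));                                      \
         ITER##_cur_ = ITER##_next_, ITER##_next_ = ITER##_cur_->next)

static void
list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void
list_remove(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Allocates an item holding 'value' and appends it to the list. */
static void
item_push(struct node *head, int value)
{
    struct item *it = malloc(sizeof *it);

    assert(it != NULL);
    it->value = value;
    it->link.prev = head->prev;
    it->link.next = head;
    head->prev->next = &it->link;
    head->prev = &it->link;
}

/* Frees every item with an odd value; returns the number of items left.
 * Note that no "next" variable is declared by the caller. */
static size_t
drop_odd(struct node *head)
{
    struct item *it;
    size_t left = 0;

    LIST_FOR_EACH_SAFE_SKETCH (it, head) {
        if (it->value % 2) {
            list_remove(&it->link);
            free(it);
        } else {
            left++;
        }
    }
    return left;
}
```

The trade-off is visible here too: since the successor is hidden, code that wants to read or modify "next" from inside the loop body has no name to refer to.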
On relevant data structures, an initial patch rewrites the macros in a
backwards compatible manner and a second patch modifies all the callers to
remove the unneeded variable. The first patch would be easy to backport and the
second would make code cleaner for the master branch.
Although this sounds nice, I'm afraid it creates some issues:
a. OVN relies in some places on the fact that it can explicitly access
the "next" of the iterator, e.g.:
https://github.com/ovn-org/ovn/blob/main/lib/expr.c#L1958
We can fix it but it's not too nice.
b. There's currently some inconsistency in the way safe container
iterators can be used:
- SSET_FOR_EACH_SAFE() still expects the user to pass a "next"
variable.
- the IDL safe iterator helpers in ovsdb-idlc still require the user
to provide a "next" variable:
https://github.com/openvswitch/ovs/blob/master/ovsdb/ovsdb-idlc.in#L254
Good catch! I'll take care of those two iterators as well and make it
homogeneous in the next version.
c. The changes in this OVS patchset seem to allow easier backport but
that won't really be the case for OVN.
I wonder if it makes sense to keep both versions:
- the old *_SAFE(.., next, ..)
- a new *_SAFE_V2 (or a better name) that drops the need for explicitly
supplied "next".
What do you think?
How about *_SAFE_OLD(...) for the old one and *_SAFE(...) for the new one? :-D
I'm just thinking that it would be a way to promote the use of the new and
cleaner iterator in new code.
On the other hand, moving most of the users to the new macro should be a good
example while we keep backwards compatibility. But even if we do keep the old
name of the old version of the macro, it should have a new implementation that
makes sure the "next" variable is not left pointing to something invalid, so I
guess the backwards compatibility is broken anyway. What do you think?
Testing notes:
In order to verify this series removes all the loop-related UB, I've tested it
on top of Dumitru's series [2] (without patch 1/11, which can hide some still
invalid use of OBJECT_CONTAINING).
I've also verified no extra errors are reported through clang-analyzer.
Limitations:
The proposed approach favors code readability: the name of the hidden
iterator variable is derived from the name of the object pointer given by the
caller.
This means that in an unlikely but still possible case in which a caller wants
to nest two loops with the same iterating pointer object, the inner loop
iterator variable will hide the one declared in the outer loop. This limitation
is easy to spot (the compiler will warn) and easy to work around (just
declaring another object pointer variable). I found no such code in the ovs
tree.
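
A small sketch of the workaround, using a hypothetical LIST_FOR_EACH_SKETCH macro that derives its hidden iterator name from the pointer name (none of these names come from the series): giving each nesting level its own pointer variable gives each loop its own hidden iterator, so nothing is shadowed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular doubly-linked list node (illustration only). */
struct node {
    struct node *prev, *next;
};

struct item {
    int value;
    struct node link;
};

#define ITEM_FROM_NODE(NODE) \
    ((struct item *) (void *) ((char *) (NODE) - offsetof(struct item, link)))

/* The hidden iterator is named after ITER: "outer" yields outer_iter_,
 * "inner" yields inner_iter_. */
#define LIST_FOR_EACH_SKETCH(ITER, HEAD)                          \
    for (struct node *ITER##_iter_ = (HEAD)->next;                \
         (ITER##_iter_ != (HEAD)                                  \
          ? ((ITER) = ITEM_FROM_NODE(ITER##_iter_), 1)            \
          : ((ITER) = NULL, 0));                                  \
         ITER##_iter_ = ITER##_iter_->next)

static void
list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void
list_push_back(struct node *head, struct node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Counts ordered pairs (a, b) of list items with a->value < b->value. */
static int
count_increasing_pairs(struct node *head)
{
    struct item *outer, *inner;   /* Distinct names, distinct hidden iterators. */
    int pairs = 0;

    LIST_FOR_EACH_SKETCH (outer, head) {
        LIST_FOR_EACH_SKETCH (inner, head) {
            if (outer->value < inner->value) {
                pairs++;
            }
        }
        /* The inner pointer is NULL here, the outer one is still valid. */
        assert(inner == NULL && outer != NULL);
    }
    return pairs;
}
```

Had both loops used the same pointer, the inner loop would also have left that pointer set to NULL for the remainder of the outer loop body, which is another reason to use a separate variable.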
Credits:
The idea was discussed in [3] and proposed by Jakub Jelinek <[email protected]>.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=2014942
[2] https://patchwork.ozlabs.org/project/openvswitch/list/?series=277900
[3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103964
Adrian Moreno (13):
util: add multi-variable loop iterator macros
util: add safe multi-variable iterators
list: use multi-variable helpers for list loops
list: ensure iterator is NULL after pop loop
list: remove the next variable in safe loops
hmap: use multi-variable helpers for hmap loops
hmap: implement UB-safe hmap pop iterator
hmap: remove the next variable in safe loops
cmap: use multi-variable iterators
hindex: use multi-variable iterators
hindex: remove the next variable in safe loops
rculist: use multi-variable helpers for loop macros
vtep: use _SAFE iterator if freeing the iterator
I did try this patchset out (after quickly translating the OVN callers
too) and it seems to work fine. There was one place in OVN where we
relied on a specific value of the LIST iterator after a complete
iteration but I *think* that was the only case:
https://github.com/ovn-org/ovn/blob/main/controller/ofctrl.c#L894
Regards,
Dumitru
--
Adrián Moreno
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev