[llvm-bugs] [Bug 78829] Ranged-for loop with _LIBCPP_ABI_BOUNDED_ITERATORS does not optimize runtime checks

LLVM Bugs via llvm-bugs Fri, 19 Jan 2024 19:17:25 -0800

Issue	78829
Summary	Ranged-for loop with _LIBCPP_ABI_BOUNDED_ITERATORS does not optimize runtime checks
Labels	new issue
Assignees
Reporter	davidben

    When enabling bounded iterators, it seems Clang cannot optimize them out of a ranged-for loop. Here's a simple function that just sums over a span with a ranged-for loop, `sum1`, as well as a corresponding hand-written iterator loop, `sum2`.
https://godbolt.org/z/WqsPKdEoa


The left output, without `_LIBCPP_ABI_BOUNDED_ITERATORS` has no runtime checks, and Clang is even able to vectorize the loop. The right output, with `_LIBCPP_ABI_BOUNDED_ITERATORS` contains runtime checks with, in turn, impede the optimization.

**Cause**

I teased apart what `__bounded_iter` was doing. This is the assertion in question:
```
  // Return whether the given iterator is in the bounds of this __bounded_iter.
 _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR bool __in_bounds(_Iterator const& __iter) const {
    return __iter >= __begin_ && __iter < __end_;
 }
```
I've extracted that into a minimal `sum3` function, and that seems to reveal the problem. While Clang can prove that `iter >= begin`, it can't prove that `iter < end`. I suspect this is because iterator loops look like `iter != end`, not `iter < end`. In order to prove that `iter < end`, it must also know that `begin <= end`.

Indeed, if I add that assertion to the top, in `sum4`, Clang pays for a *different* undesirable runtime check outside the loop, but then figures it out! So fixing https://github.com/llvm/llvm-project/issues/78784 may have further benefits, if the analysis is extended further.

**Alternative fix**

Another way Clang could have figured it out is if we used `iter != end` instead of `iter < end` for the safety assertion. That's actually what Chromium's checked iterator does. However, libc++ cannot do this because it doesn't check `operator+`, etc., only `operator*`. It is possible for `__current_` to overflow well past `__end_` before we do the check.

I think libc++ should bounds-check `operator+` too. First, once `__current_` has advanced past `__end_`, we may have already hit undefined behavior because pointer arithmetic is locked to the allocation. So the check in `operator*` may be too late anyway. Second, #78771 describes a place where libc++'s `memmove` specialization breaks its own hardening checks! Since `memmove` specialization is pretty valuable, I think it's best to fix #78771 by bounds-checking `operator+` and making sure the specialization takes triggers it early enough.

Now, once `operator+` is bounds-checked, `__bounded_iter` can rely on `__begin_ <= __current_ <= __end_`. That means the safety check need only be `__current_ != __end_`, which should be easily optimized out.

(CC @danakj)

_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs

[llvm-bugs] [Bug 78829] Ranged-for loop with _LIBCPP_ABI_BOUNDED_ITERATORS does not optimize runtime checks

Reply via email to