On Mon, May 5, 2025, 11:48 AM Cassio Neri <cassio.n...@gmail.com> wrote:
> Use a better algorithm to calculate n % 7, taking into account that the > divisor > is a Mersenne number. This is explained in this two-part lightning talk at > CppCon 2024 [1, 2]. > > Roughly speaking, the algorithm is used for n = sys_days but it has a > precondition that doesn't hold for all values of sys_days. However, it > does hold > for every value coming out of year_month_day::operator sys_days(). > > Nevertheless, there's an issue because the conversion is done in two steps: > year_month_day -> sys_days -> weekday and in the second step the > information > that the input sys_days was obtained from the first step (so the > precondition > holds) is lost. In the talk a single function performs the two steps and > uses > the optimization. Unless the standard adds a similar function (e.g. a > weekday > constructor) or this is added as an extension (not done here), this is not > an > option for libstdc++. > > The issue is adressed in the following way. At the end of the fist step, > the > code informs the compiler, through an [[assume]] attribute that the > precondition > holds. Then, at the beginning of the second step, the code checks, through > __builtin_constant_p, if the precondition holds, in which case, it uses the > better algorithm or, otherwise, it falls back to n % 7. > I suspect we want to disable this for -Os . And plus i am not 100% convinced it is best for all micro-architures. Especially on say aarch64. Can you do more benchmarking and peocide which exaxt core is being used? And mention the size difference too? Plus gcc knows how to do %7 via multiplication is that being used or is it due to generic x86 tuning it is using the div instruction? Thanks, Andrew > The idea is illustrated in [3] which compares code generation (with and > without > the optimization) and provides an exhaustive test of correctness. A > benchmark is > shown in [4]. > > References: > > [1] https://youtu.be/64mTEXnSnZs > [2] https://youtu.be/bnVkWEjRNeI > [3] https://godbolt.org/z/E14asPK8r > [4] https://quick-bench.com/q/vzxlrLaEccner3h-ahIrbB0BZvE > > libstdc++-v3/ChangeLog: > > * include/std/chrono: Add postcondition to > year_month_date::operator > sys_days and check this condition in weekday::_S_from_days. > --- > The approach might be somewhat controversial and might need to be > discussed. The problem is that passing information from [[assume]] to > __builtin_constant_p is > quite fragile in 3 different ways. > > 1) The distance in source code between [[assume]] and __builtin_constant_p > makes > hard to understand why they are there in the first place. A reader who > doesn't > know the full picture might feel tempted to remove either of these lines. > > 2) Logically, the post-condition expressed by [[assume]] only needs to > imply the > condition tested by __builtin_constant_p. However, even whey they are > obviously > equivalent for a human, this might not be the case for the compiler. Even > slight cosmetic changes might break the compiler's chain of thought and > thus, > the optimization. > > 3) Similarly to 2, changes in the compiler code that handles [[assume]] or > __builtin_constant_p might break the optimization. > > If any of the issues above happens, the result will still be right but a > performance regression will appear. Comments can mitigate the issues but a > performance test or a test for code generation would be more effective to > prevent performance regressions. Unfortunately, I don't know how either of > these > can be done. > > libstdc++-v3/include/std/chrono | 27 +++++++++++++++++++++++++-- > 1 file changed, 25 insertions(+), 2 deletions(-) > > diff --git a/libstdc++-v3/include/std/chrono > b/libstdc++-v3/include/std/chrono > index 8eb9fd9baac..e762dd46cc4 100644 > --- a/libstdc++-v3/include/std/chrono > +++ b/libstdc++-v3/include/std/chrono > @@ -984,7 +984,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > static constexpr weekday > _S_from_days(const days& __d) > { > - return weekday{__detail::__add_modulo<7>(4, __d.count())}; > + const auto __days = __d.count(); > + > + const auto __n = static_cast<unsigned>(__days) + 12'687'434u; > + const auto __use_fast_mod7 = __n < 178'956'973u; > + // This condition is true when __d = __dp.time_since_epoch() where > __dp > + // comes from year_month_day::operator sys_days(). > + if (__builtin_constant_p(__use_fast_mod7)) > + { > + const auto __wd = 613'566'757u * __n >> 29; > + [[assume(__wd != 7)]]; > + return weekday{__wd}; > + } > + > + return weekday{__detail::__add_modulo<7>(4, __days)}; > } > > public: > @@ -1652,7 +1665,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > > constexpr > operator sys_days() const noexcept > - { return sys_days{_M_days_since_epoch()}; } > + { > + const auto __days = _M_days_since_epoch(); > + > + // The assume attribute below allows weekday::_S_from_days to use a > + // faster algorithm for modulus 7. > + const auto __n = static_cast<unsigned>(__days.count()) + > 12'687'434u; > + const auto __use_fast_mod7 = __n < 178'956'973u; > + [[assume(__use_fast_mod7)]]; > + > + return sys_days{__days}; > + } > > explicit constexpr > operator local_days() const noexcept > -- > 2.49.0 > >