Re: [PATCHv5] libstdc++: Add generate_canonical impl (P0952, LWG2524) [PR119739]

Tomasz Kaminski Wed, 13 Aug 2025 00:39:19 -0700

On Wed, Aug 13, 2025 at 12:06 AM Nathan Myers <n...@cantrip.org> wrote:


> Changes in v5:
>  * Static-assert movable RNG object correctly.
>  * Add a more comprehensive test gencanon.cc
>
> Changes in v4:
>  * Static-assert that arg is floating-point, coercible from bigger
> unsigned.
>  * Static-assert that arg satisfies uniform_random_bit_generator, movable.
>  * #include uniform_int_dist.h for concept uniform_random_bit_generator
>  * Coerce floating consts from unsigned literals, matching other usage.
>  * Remove redundant type decorations static and -> .
>  * Use __builtin_floor for broader applicability than std::floor.
>  * Use __builtin_pow in place of immediate lambda.
>  * Rename, rearrange local objects for clarity.
>  * Add redundant "if constexpr (k > 1)" for clarity.
>  * #if-guard new impl against Clang, which lacks constexpr floor() etc.
>
> Changes in v3:
>  * Implement exactly as specified in the WP, for C++26 and up.
>
> Implement P0952R2 "A new specification for std::generate_canonical".
> It has us start over if the naive generation of a float in 0..1
> comes up exactly equal to 1, which occurs too rarely for the change
> to measurably affect performance.
>
> A test for the case already appears in 64351.cc. This change amounts
> to a different resolution for PR64351 and LWG 2524. A new test,
> gencanon.cc, is more thorough.
>
> libstdc++-v3/ChangeLog:
>         PR libstdc++/119739
>         * include/bits/random.tcc: Add generate_canonical impl for C++26.
>         * include/std/random: Include uniform_int_dist.h for concept.
>         *
> testsuite/26_numerics/random/uniform_real_distribution/operators/64351.cc:
>         Adapt test for both pre- and post- C++26.
>         *
> testsuite/26_numerics/random/uniform_real_distribution/operators/gencanon.cc:
>         Test for float and double from 32-bit RNG, including 1.0 do-over.
> ---
>  libstdc++-v3/include/bits/random.tcc          | 48 +++++++++++
>  libstdc++-v3/include/std/random               |  1 +
>  .../operators/64351.cc                        |  8 +-
>  .../operators/gencanon.cc                     | 85 +++++++++++++++++++
>  4 files changed, 140 insertions(+), 2 deletions(-)
>  create mode 100644
> libstdc++-v3/testsuite/26_numerics/random/uniform_real_distribution/operators/gencanon.cc
>
> diff --git a/libstdc++-v3/include/bits/random.tcc
> b/libstdc++-v3/include/bits/random.tcc
> index 53ccacb2e38..65f457a6659 100644
> --- a/libstdc++-v3/include/bits/random.tcc
> +++ b/libstdc++-v3/include/bits/random.tcc
> @@ -3349,6 +3349,9 @@ namespace __detail
>         }
>      }
>
> +#if __cplusplus < 202400 || defined(__clang__)
>
I do not think that not implementing this issue for clang is really a good
option.
The function is small enough that you could test it on the compiler
explorer,
and you may use that to check how we behave for constant folding.
I would instead replace the constexpr variable with `const` and see if with
optimization GCC does constant fold calls to floor, pow etc.
Of as alternative, define a helper macro like
_GLIBCXX_GENCANONICAL_CONSTEXPR,
(and undefine it at end of the file),\ that would be constexpr for gcc and
empty for clang,
 and then define the variables as:
_GLIBCXX_GENCANONICAL_CONSTEXPR
const _RealT __Rk = __builtin_pow(__R, __k)

This would allow use to make them constexpr if later version of clang
supports that.

This is why I have also suggested adjusting the computation of __k, so we
can use __builting_log to compute the value faster in most common case.

> +  // Clang lacks constexpr __builtin_floor and __builtin_pow.
> +
>    template<typename _RealType, size_t __bits,
>            typename _UniformRandomNumberGenerator>
>      _RealType
> @@ -3385,6 +3388,51 @@ namespace __detail
>         }
>        return __ret;
>      }
> +#else
> +  template<typename _RealT, size_t __digits, typename _URBG>
> +    _RealT
> +    generate_canonical(_URBG& __urng)
> +    {
> +      static_assert(requires {
> +       [](uniform_random_bit_generator auto) {}(
> +           static_cast<std::remove_cv_t<_URBG>&&>(__urng)); },
> +       "argument must meet uniform_random_bit_generator requirements");
> +      static_assert(
> +       std::is_floating_point_v<_RealT> && requires {
> _RealT(_URBG::max()); },
> +       "template argument must be floating point, coercible from
> unsigned");
> +
> +      // Names are chosen to match the description in the Standard.
> +      const size_t __r = std::numeric_limits<_RealT>::radix;
> +      const auto __sample = [](auto __high) constexpr
> +       { return _RealT(__high - _URBG::min()); };
> +      constexpr _RealT __R = _RealT(1) + __sample(_URBG::max());
> +      const size_t __d_max = std::numeric_limits<_RealT>::digits;
> +      const size_t __d = __digits < __d_max ? __digits : __d_max;
> +      constexpr _RealT __rd = __builtin_pow(_RealT(__r), __d);
> +      const unsigned __k = [](_RealT __R, _RealT __rd) consteval
> +       {
> +         unsigned __i = 1;
> +         for (auto __Ri = __R; __Ri < __rd; __Ri *= __R)
> +           ++__i;
> +         return __i;
> +       } (__R, __rd);
> +      constexpr _RealT __Rk = __builtin_pow(__R, __k);
> +      constexpr _RealT __x = __builtin_floor(__Rk / __rd);
> +      constexpr _RealT __xrd = __x * __rd;
> +
> +      _RealT __sum;
> +      do
> +       {
> +         _RealT __Ri = _RealT(1);
> +         __sum = __sample(__urng());
> +         if constexpr (__k > 1)
> +           for (int __i = 1; __i != __k; ++__i)
> +             __Ri *= __R, __sum += __sample(__urng()) * __Ri;
> +       } while (__builtin_expect(__sum >= __xrd, false));
> +      _RealT __ret = __builtin_floor(__sum / __x) / __rd;
> +      return __ret;
> +    }
> +#endif
>
>  _GLIBCXX_END_NAMESPACE_VERSION
>  } // namespace
> diff --git a/libstdc++-v3/include/std/random
> b/libstdc++-v3/include/std/random
> index 0e058a04bd9..86d4f0a8380 100644
> --- a/libstdc++-v3/include/std/random
> +++ b/libstdc++-v3/include/std/random
> @@ -49,6 +49,7 @@
>  #include <type_traits>
>  #include <bits/random.h>
>  #include <bits/opt_random.h>
> +#include <bits/uniform_int_dist.h>
>  #include <bits/random.tcc>
>
>  #endif // C++11
> diff --git
> a/libstdc++-v3/testsuite/26_numerics/random/uniform_real_distribution/operators/64351.cc
> b/libstdc++-v3/testsuite/26_numerics/random/uniform_real_distribution/operators/64351.cc
> index 3c0cb8bbd49..6212a32a364 100644
> ---
> a/libstdc++-v3/testsuite/26_numerics/random/uniform_real_distribution/operators/64351.cc
> +++
> b/libstdc++-v3/testsuite/26_numerics/random/uniform_real_distribution/operators/64351.cc
> @@ -45,16 +45,20 @@ test02()
>    rng.seed(sequence);
>    rng.discard(12 * 629143);
>    std::mt19937 rng2{rng};
> +  int mismatch = 0;
>    for (int i = 0; i < 20; ++i)
>    {
>      float n =
>        std::generate_canonical<float,
> std::numeric_limits<float>::digits>(rng);
>      VERIFY( n != 1.f );
>
> -    // PR libstdc++/80137
>      rng2.discard(1);
> -    VERIFY( rng == rng2 );
>    }
> +  // PR libstdc++/80137
> +#if __cplusplus >= 202400
> +  rng2.discard(1);  // account for a 1.0 generated and discarded.
> +#endif
> +  VERIFY( rng == rng2 );
>  }
>
>  int
> diff --git
> a/libstdc++-v3/testsuite/26_numerics/random/uniform_real_distribution/operators/gencanon.cc
> b/libstdc++-v3/testsuite/26_numerics/random/uniform_real_distribution/operators/gencanon.cc
> new file mode 100644
> index 00000000000..ba781315e0d
> --- /dev/null
> +++
> b/libstdc++-v3/testsuite/26_numerics/random/uniform_real_distribution/operators/gencanon.cc
> @@ -0,0 +1,85 @@
> +// { dg-do run { target { c++26 && { ! simulator } } } }
> +
> +#include <random>
> +#include <limits>
> +#include <type_traits>
> +#include <cmath>
> +#include <testsuite_hooks.h>
> +
> +// Verify P0952R9 implementation requiring a second round-trip
> +// if first yields exactly 1. In this test, the RNG delivering
> +// 32 bits per call is seeded such that this occurs once on the
> +// sixth iteration for float, and not at all for double.
> +// However, each double iteration requires two calls to the RNG.
> +
> +template <typename T, typename RNG>
> +void
> +test01(RNG& rng, int& deviation, int& max, int& rms, int& zero, int&
> skips)
> +{
> +  const auto size = 1'000'000, buckets = 100;
> +  std::array<number, buckets> histo{};
> +  int zero = 0, skips = 0;
> +  for (auto i = 0; i != size; ++i) {
> +    T sample = std::generate_canonical<T, 100>(rng);
> +    VERIFY(sample >= 0.0);
> +    VERIFY(sample < 1.0)
> +    if (sample == 0.0) {
> +      ++zero;
> +    }
> +    auto bucket = static_cast<int>(std::floor(sample * buckets));
> +    ++histo[bucket];
> +    rng2.discard(1);
> +    if (rng != rng2) {
> +      ++skips;
> +      rng2.discard(1);
> +      VERIFY(rng == rng2)
> +    }
> +  }
> +  int devsquare = 0;
> +  for (int i = 0; i < buckets; ++i) {
> +    const auto expected = size / buckets;
> +    auto count = histo[i];
> +    auto diff = count - expected;
> +    if (diff < 0) diff = -diff;
> +    deviation += diff;
> +    devsquare += diff * diff;
> +    if (diff > max) max = diff;
> +  }
> +  rms = std::sqrt(devsquare);
> +}
> +
> +int
> +main()
> +{
> +  std::mt19937 rng(8890);
> +  std::seed_seq sequence{0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
> +  rng.seed(sequence);
> +  rng.discard(12 * 629143);
> +
> +  {
> +    int deviation{}, max{}, rms{}, zero{}, skips{};
> +    auto rng2{rng};
> +    test01<float>(rng2, deviation, max, rms, zero, skips);
> +
> +    if (std::numeric_limits<float>::is_iec559) {
> +      VERIFY(deviation == 7032);
> +      VERIFY(max == 276);
> +      VERIFY(rms == 906);
> +      VERIFY(zero == 0);
> +    }
> +    VERIFY(skips == 1);
> +  }
> +  {
> +    auto rng3{rng}
> +    test01<double>(rng2, deviation, max, rms, zero, skips);
> +
> +    if (std::numeric_limits<double>::is_iec559) {
> +      VERIFY(deviation == 7650);
> +      VERIFY(max == 259);
> +      VERIFY(rms == 975);
> +      VERIFY(zero == 0);
> +    }
> +    VERIFY(skips == 1'000'000);
> +  }
> +  return 0;
> +}
> --
> 2.50.0
>
>

Re: [PATCHv5] libstdc++: Add generate_canonical impl (P0952, LWG2524) [PR119739]

Reply via email to