Erratum. The JMH output numbers previously listed are for throughput
(operations/time) and not average time, so a higher score is faster. All
the conclusions were therefore incorrect and must be reversed. This is not
the typical output report value used in other RNG JMH benchmarks and I did
not notice. The figures make more sense when interpreted correctly.
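
To make the reversal concrete, here is a minimal sketch that converts two of
the quoted throughput scores to an approximate time per call. It assumes the
benchmark ran single-threaded, so that average time is simply the inverse of
throughput; the class and method names are made up for illustration.

```java
public class ThroughputToAverageTime {
    /**
     * Converts a throughput score in ops/us to an approximate average time
     * in ns/op. Assumes a single-threaded benchmark (an assumption, not
     * something stated in the results).
     */
    static double nanosPerOp(double opsPerMicrosecond) {
        // 1 us = 1000 ns, so ns/op = 1000 / (ops/us)
        return 1000.0 / opsPerMicrosecond;
    }

    public static void main(String[] args) {
        // Scores copied from the results quoted in this thread.
        double rejection = 1285.009;   // nextOpenDoubleUsingRejection
        double multiply53 = 976.256;   // nextOpenDoubleUsingMultiply53bits

        System.out.printf("rejection:  %.3f ns/op%n", nanosPerOp(rejection));
        System.out.printf("multiply53: %.3f ns/op%n", nanosPerOp(multiply53));
    }
}
```

With these figures rejection works out at roughly 0.78 ns per call against
roughly 1.02 ns for the 53-bit multiplication.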

As listed, the generation of an open double in (0, 1) is fastest using
rejection. I will add performance figures from several machines to the
ticket RNG-190 [1].

[1] https://issues.apache.org/jira/browse/RNG-190
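
For readers without the benchmark source to hand, the variants being compared
can be sketched roughly as below. This is an illustration, not the benchmark
code: the method names mirror the benchmark labels, a LongSupplier stands in
for the source of random 64-bit values, and the expressions are the ones
quoted in this thread.

```java
import java.util.SplittableRandom;
import java.util.function.LongSupplier;

public class OpenIntervalSketch {
    /** Standard double in [0, 1) from the upper 53 bits. */
    static double nextDouble(LongSupplier source) {
        return (source.getAsLong() >>> 11) * 0x1.0p-53;
    }

    /** Open (0, 1) by rejection: repeat until the value is non-zero. */
    static double nextOpenDoubleUsingRejection(LongSupplier source) {
        double x;
        do {
            x = nextDouble(source);
        } while (x == 0);
        return x;
    }

    /** Branchless: 52 random bits with a trailing 1-bit, scaled by 2^-53. */
    static double nextOpenDoubleUsingMultiply53bits(LongSupplier source) {
        return ((source.getAsLong() >>> 11) | 1) * 0x1.0p-53;
    }

    /** Branchless: 52 random bits plus one half, scaled by 2^-52. */
    static double nextOpenDoubleUsingMultiply52bits(LongSupplier source) {
        return ((source.getAsLong() >>> 12) + 0.5) * 0x1.0p-52;
    }

    /** Branchless: build a double in (1, 2) from raw bits and subtract 1. */
    static double nextOpenDoubleUsingBitsToDouble(LongSupplier source) {
        long bits = (source.getAsLong() >>> 12) | 0x3ff0000000000001L;
        return Double.longBitsToDouble(bits) - 1.0;
    }

    public static void main(String[] args) {
        // SplittableRandom is only a convenient stand-in source of 64-bit values.
        SplittableRandom rng = new SplittableRandom(42);
        LongSupplier source = rng::nextLong;
        System.out.println(nextOpenDoubleUsingRejection(source));
        System.out.println(nextOpenDoubleUsingMultiply53bits(source));
        System.out.println(nextOpenDoubleUsingMultiply52bits(source));
        System.out.println(nextOpenDoubleUsingBitsToDouble(source));
    }
}
```

Note the rejection version returns multiples of 2^-53 excluding zero, the two
multiplication versions return the same odd multiples of 2^-53 for the same
input bits, and the bits-to-double version returns odd multiples of 2^-52, so
the outputs are not interchangeable bit for bit.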


On Fri, 13 Feb 2026 at 12:06, Alex Herbert <[email protected]> wrote:

>
> I added the proposed methods to the existing
> FloatingPointGenerationBenchmark in the RNG JMH testing project. The
> nextDouble methods already present match the 3 variants proposed without
> the addition of a single trailing 1-bit. I also added two methods that will
> reject the value of zero either using a while loop or recursively calling
> the method until a non-zero is generated. The advantage of recursion is
> that an infinite loop cannot occur: if the source RNG is broken (always
> outputs 0) the recursion ends with a stack overflow.
>
> Here are the results using JDK 21 on a Mac M2 Pro.
>
> nextDoubleUsingBitsToDouble        thrpt   10  1036.995 ± 10.052  ops/us
> nextDoubleUsingMultiply52bits      thrpt   10  1041.926 ± 13.284  ops/us
> nextDoubleUsingMultiply53bits      thrpt   10  1060.602 ± 20.994  ops/us
>
> nextOpenDoubleUsingBitsToDouble    thrpt   10  1207.123 ± 17.814  ops/us
> nextOpenDoubleUsingMultiply52bits  thrpt   10   991.166 ± 17.743  ops/us
> nextOpenDoubleUsingMultiply53bits  thrpt   10   976.256 ± 13.821  ops/us
> nextOpenDoubleUsingRecursion       thrpt   10  1151.243 ± 10.216  ops/us
> nextOpenDoubleUsingRejection       thrpt   10  1285.009 ± 13.680  ops/us
>
> Using rejection or recursion is slower than the branchless versions with
> multiplication.
>
> The multiplication version is faster if there is a trailing 1-bit. This
> could possibly be related to floating-point multiplication with a
> guaranteed non-zero result. The 52 and 53 bit multiplication are close
> enough to be within error.
>
> The method with the conversion of long bits to double was strange. It does
> well for the standard case of the [0, 1) interval. IIRC that was not the
> case on older processors used for this benchmark in the past. It is slower
> for the open interval.
>
> The two methods are:
>
> Double.longBitsToDouble((source.nextLong() >>> 12) | (0x3ffL << 52)) - 1.0;
> Double.longBitsToDouble((source.nextLong() >>> 12) | 0x3ff0000000000001L)
> - 1.0
>
> If the former is changed to the following it is slower:
>
> Double.longBitsToDouble((source.nextLong() >>> 12) | 0x3ff0000000000000L)
> - 1.0
>
> nextDoubleUsingBitsToDouble  thrpt   10  1200.327 ± 24.800  ops/us
>
> Loading the long constant is slower than generating it using a shift.
>
> These results are for a single JVM and processor. However it does put a
> case forward for a branchless version of the ContinuousUniformSampler when
> the open interval is requested to be in (0, 1). I can raise a ticket for
> this in Jira to record the benchmark results and document the potential
> change.
>
> Alex
>
>
> On Thu, 12 Feb 2026 at 16:47, Alex Herbert <[email protected]>
> wrote:
>
>> Hi,
>>
>> The code in Commons RNG provides a general interface for generating
>> primitive values in UniformRandomProvider [1]. This closely matches the
>> JDK's own interface in RandomGenerator (Java 17+) [2]. Although it is
>> possible to add more methods to UniformRandomProvider this risks
>> cluttering the interface with specialist methods that may not be
>> commonly used. If it's not in the JDK's interface then typically we would
>> not support it.
>>
>> The interfaces are mostly the same. Differences are:
>>
>> UniformRandomProvider:
>> void nextBytes(byte[] bytes, int start, int len)
>>
>> RandomGenerator:
>> double nextExponential()
>> double nextGaussian()
>>
>> If you wish to sample from an exponential or Gaussian then we have
>> samplers in the sampling module. These include the same sampling method
>> used in the JDK which is based on McFarland's modification of a ziggurat
>> algorithm by Marsaglia.
>>
>> If you wish to sample from an open interval then we have
>> the ContinuousUniformSampler [3] that samples within [lo, hi) by default
>> but can be changed to an open interval of (lo, hi) with a constructor
>> argument. Since the range can use any double values this requires some
>> floating-point computations to map a generated [0, 1) to the interval [lo,
>> hi), or (lo, hi). Since rounding can occur you can see values at the bounds
>> even when the original double was non-zero. So a rejection algorithm is
>> used: that is if sample == lo or sample == hi then repeat. Rejection
>> frequency is small unless the range between lo and hi does not contain many
>> floating-point values. Thus this rejection is efficiently ignored due to
>> branch prediction.
>>
>> Note that the constructor for this sampler validates there are values
>> between lo and hi. Otherwise you can have an infinite loop. Thus it
>> supports generation of open intervals with bounds 2 ULP or more apart, or 3
>> if the bounds span zero to account for -0.0.
>>
>> If you specifically require a value in (0, 1) we could add a specialised
>> version to this sampler to use a faster computation. But the user must be
>> warned that multiplication of (0, 1) by a floating point range can result
>> in a semi-open interval result due to rounding. For example the smallest
>> dyadic rational in 0-1 is 2^-53. Use this to sample from the range (2, 4):
>>
>> jshell
>> |  Welcome to JShell -- Version 21.0.9
>> |  For an introduction type: /help intro
>>
>> jshell> 0x1.0p-53 * (4 - 2) + 2
>> $1 ==> 2.0
>>
>> This can be avoided using:
>>
>> UniformRandomProvider rng = ...
>> double x = ContinuousUniformSampler.of(rng, 2, 4, true).sample();
>>
>> If the ultimate requirement is float values in the range (0, 1) then a
>> faster algorithm is possible. But this is not always what the user wants
>> and we should document the possible pitfalls as described above.
>>
>> Alex
>>
>> [1]
>> https://commons.apache.org/proper/commons-rng/commons-rng-docs/apidocs/org/apache/commons/rng/UniformRandomProvider.html
>> [2]
>> https://docs.oracle.com/en/java/javase/25/docs/api/java.base/java/util/random/RandomGenerator.html
>> [3]
>> https://commons.apache.org/proper/commons-rng/commons-rng-docs/apidocs/org/apache/commons/rng/sampling/distribution/ContinuousUniformSampler.html
>>
>>
>> On Thu, 12 Feb 2026 at 13:33, Gilles Sadowski <[email protected]>
>> wrote:
>>
>>> Hello.
>>>
>>> On Thu, 12 Feb 2026 at 13:54, Jherek Healy
>>> <[email protected]> wrote:
>>> >
>>> > Dear Commons RNG Team,
>>> >
>>> > I am proposing to introduce a new method in IntProvider and
>>> UniformRandomProvider which computes a random double number in the open
>>> interval (0, 1).
>>> >
>>> > Right now, nextDouble() computes a random double in the semi-closed
>>> interval [0,1). This can be problematic when the random number is to be
>>> used in an inverse distribution function, to provide random numbers
>>> according to a specific distribution, as the inverse distribution function
>>> is only defined on the open interval.
>>>
>>> Is this the sole use-case?
>>> If so, wouldn't it be better (design-wise) to implement the functionality
>>> in the "o.a.c.rng.sampling.distribution" package?
>>>
>>> Regards,
>>> Gilles
>>>
>>> [1]
>>> https://commons.apache.org/proper/commons-rng/commons-rng-sampling/index.html
>>>
>>> > The idea is to match the implementation of
>>> https://www.math.sci.hiroshima-u.ac.jp/m-mat/MT/VERSIONS/C-LANG/mt19937-64.c genrand64_real3
>>> >
>>> > double genrand64_real3(void)
>>> > {
>>> >     return ((genrand64_int64() >> 12) + 0.5) *
>>> (1.0/4503599627370496.0);
>>> > }
>>> >
>>> > There are two possible Java implementations:
>>> > ((nextLong() >>> 12) + 0.5) * 0x1.0p-52;
>>> > or equivalently (reusing the constant used for the semi-closed
>>> interval)
>>> > ((nextLong() >>> 11) | 1) * 0x1.0p-53;
>>> >
>>> > Yet another alternative (which produces different numbers (last
>>> digit)) is the union trick:
>>> > long bits = (random64 >>> 12) | 0x3FF0000000000001L;
>>> > return Double.longBitsToDouble(bits) - 1.0;
>>> >
>>> > I don't have a strong preference in either of the choices.
>>> >
>>> > Jherek
>>>
>>>
>>>
