[
https://issues.apache.org/jira/browse/RNG-146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17369080#comment-17369080
]
Alex Herbert commented on RNG-146:
----------------------------------
{quote}Won't issues still exist for a mean and standard deviation of very
different magnitudes?
{quote}
E.g. mean=1e300, std.dev=1
Yes. This will fail to detect that the sampler will not output a Gaussian
deviate in such cases. Such a case would be a user error.
We could detect that the range of +/- 3 std.dev. has some minimum number of
doubles within it. But what threshold do you set? If the std.dev. is 2^53 times
smaller in magnitude than the mean then 68% of samples (i.e. those within 1 SD)
would not change the value from the mean. So that is a conservative threshold.
Realistically, to get a Gaussian output the SD should be no more than about 2^27
times smaller than the mean, so that at least half of the mantissa of the double
is available to represent the majority of samples in one direction (towards
infinity). In the other direction (towards zero) the doubles become twice as
dense at each reduction by a power of 2, so this direction will not be affected
as much, depending on where the mean lies in the current power of 2 interval.
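The loss of precision described above can be observed directly. The following is a minimal sketch, not code from Commons RNG: it assumes the sampler output is computed as mean + sd * z, uses java.util.Random as a stand-in source of normalized deviates, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

/** Illustrative only: counts the distinct doubles produced by mean + sd * z. */
final class GaussianPrecisionDemo {
    /**
     * Draws n normalized deviates and returns how many distinct values
     * mean + sd * z takes. The affine transform is an assumption about
     * how the sampler scales its output.
     */
    static int distinctSamples(double mean, double sd, int n, long seed) {
        Random rng = new Random(seed);
        Set<Double> seen = new HashSet<>();
        for (int i = 0; i < n; i++) {
            seen.add(mean + sd * rng.nextGaussian());
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Spacing between adjacent doubles at 1e300; far larger than a std.dev. of 1.
        System.out.println("ulp(1e300) = " + Math.ulp(1e300));
        // mean / sd = 1e300: every sample rounds back to the mean.
        System.out.println(distinctSamples(1e300, 1, 1000, 42L));
        // mean / sd = 2^53: only a handful of distinct values survive.
        System.out.println(distinctSamples(0x1.0p53, 1, 1000, 42L));
        // mean / sd = 2^27: samples remain well resolved.
        System.out.println(distinctSamples(0x1.0p27, 1, 1000, 42L));
    }
}
```

With mean=1e300 and std.dev=1 the count collapses to a single value, since the spacing between doubles at that magnitude is many orders larger than any deviate.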
The original bug is that you can output NaN, which is arguably worse than
samples that are not really Gaussian. But to get that you have to use an
infinite std.dev. and so would not have a Gaussian sampler anyway. It would be
a function outputting from one of -inf, NaN, +inf, with NaN a very rare case.
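The three possible outputs follow from IEEE 754 arithmetic on the scaling step. A minimal sketch, assuming the sample is computed as mean + sd * z (the helper name is hypothetical and is not part of the library):

```java
/** Illustrative only: the affine transform applied with an infinite std.dev. */
final class InfiniteStdDevDemo {
    /** Assumed form of the scaling step: mean + sd * z. */
    static double transform(double mean, double sd, double z) {
        return mean + sd * z;
    }

    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY;
        // Any positive deviate is scaled to +infinity.
        System.out.println(transform(0, inf, 0.5));
        // Any negative deviate is scaled to -infinity.
        System.out.println(transform(0, inf, -0.5));
        // A deviate of exactly zero gives infinity * 0, which is NaN.
        System.out.println(transform(0, inf, 0.0));
    }
}
```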
I think requiring that mean +/- 10 std.dev. does not reach infinity is a start.
Then either one of:
# Updating the javadoc to state that cases where the magnitude of the mean is
far greater than the std.dev. will result in a sampler that does not output a
Gaussian distribution due to lack of precision in a double.
# Checking the std.dev. is within a set scale of the magnitude of the mean.
I would opt for the user-beware documentation. I've not seen this documented in
other libraries, let alone an exception thrown for not being able to compute a
valid sample. The end user should be able to discover why their simulation did
not work as they expected with such extreme parameterisation.
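For comparison, the two kinds of check discussed in this comment could look roughly like the following. This is only a sketch under stated assumptions: the class, the method names, and the constants (a range of 10 std.dev. and a scale limit of 2^27) are hypothetical illustrations of the ideas above, not a proposed API.

```java
/** Illustrative only: candidate parameter checks for a Gaussian sampler. */
final class GaussianParameterCheck {
    /** Hypothetical: number of std.dev. either side of the mean that must stay finite. */
    static final double RANGE = 10;
    /** Hypothetical: largest allowed ratio |mean| / sd, here 2^27. */
    static final double MAX_SCALE = 0x1.0p27;

    /** True if sd is strictly positive and mean +/- RANGE * sd is finite. */
    static boolean rangeIsFinite(double mean, double sd) {
        return sd > 0
            && Double.isFinite(mean - RANGE * sd)
            && Double.isFinite(mean + RANGE * sd);
    }

    /** True if the mean is at most MAX_SCALE times larger in magnitude than sd. */
    static boolean scaleIsReasonable(double mean, double sd) {
        return Math.abs(mean) <= MAX_SCALE * sd;
    }

    public static void main(String[] args) {
        // Infinite std.dev. is rejected by the range check.
        System.out.println(rangeIsFinite(0, Double.POSITIVE_INFINITY));
        // mean=1e300, std.dev=1 passes the range check but fails the scale check.
        System.out.println(rangeIsFinite(1e300, 1));
        System.out.println(scaleIsReasonable(1e300, 1));
    }
}
```

The range check alone covers the NaN bug; the scale check is the optional second step, and its threshold is the hard part to justify.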
> GaussianSampler should not allow infinite standard deviation
> ------------------------------------------------------------
>
> Key: RNG-146
> URL: https://issues.apache.org/jira/browse/RNG-146
> Project: Commons RNG
> Issue Type: Bug
> Components: sampling
> Affects Versions: 1.3
> Reporter: Alex Herbert
> Priority: Trivial
>
> The GaussianSampler requires the standard deviation is strictly positive. It
> allows an infinite value. This will produce a NaN output if the
> NormalizedGaussianSampler returns 0:
> {code:java}
> @Test
> public void testInfiniteStdDev() {
>     NormalizedGaussianSampler gauss = new NormalizedGaussianSampler() {
>         @Override
>         public double sample() {
>             return 0;
>         }
>     };
>     GaussianSampler s = new GaussianSampler(gauss, 0,
>                                             Double.POSITIVE_INFINITY);
>     Assert.assertEquals(Double.NaN, s.sample(), 0.0);
> }
> {code}
> A fix is to require the standard deviation is finite.