[jira] [Commented] (RNG-146) GaussianSampler should not allow infinite standard deviation

Alex Herbert (Jira) Thu, 24 Jun 2021 14:03:07 -0700


    [ 
https://issues.apache.org/jira/browse/RNG-146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17369080#comment-17369080
 ]


Alex Herbert commented on RNG-146:
----------------------------------

{quote}Won't issues still exist for a mean and standard deviation of very 
different magnitudes?
{quote}
E.g. mean=1e300, std.dev=1

Yes. The check will fail to detect that the sampler will not output a Gaussian 
deviate in such cases. Such a case would be a user error.

We could detect that the range of +/- 3 std.dev. has some number of doubles 
within it. But what threshold do you set? If the std.dev. is 2^53 times smaller 
in magnitude than the mean then 68% of samples (i.e. <=1 SD) would not change 
the value from the mean. So that is a conservative threshold. Realistically, to 
get a Gaussian output the SD should be no more than about 2^27 times smaller 
than the mean in magnitude, so that you get at least half of the mantissa of 
the double to represent the majority of samples in one direction (towards 
infinity). In the other direction (towards zero) there are increasingly more 
doubles at each reduction by a power of 2, so this direction will not be 
affected as much, depending on where the mean is in the current power of 2 
interval.
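
As a rough illustration of the precision argument, the sketch below checks 
whether adding an offset of one std.dev. changes the mean at all in double 
arithmetic. The class and method names (UlpDemo, isAbsorbed) are made up for 
this example and are not part of Commons RNG.

```java
final class UlpDemo {
    /** Returns true if mean + offset rounds back to mean in double arithmetic. */
    static boolean isAbsorbed(double mean, double offset) {
        return mean + offset == mean;
    }

    public static void main(String[] args) {
        // The example from the question: mean=1e300, std.dev=1
        double mean = 1e300;
        System.out.println("ulp(mean) = " + Math.ulp(mean)); // spacing of doubles at the mean
        System.out.println(isAbsorbed(mean, 1.0));           // true: 1 SD is lost entirely

        // A std.dev. 2^53 times smaller than a mean of 1.0
        double sd = Math.scalb(1.0, -53);
        System.out.println(isAbsorbed(1.0, sd));   // true: towards infinity the step is lost
        System.out.println(isAbsorbed(1.0, -sd));  // false: towards zero the doubles are denser

        // A std.dev. 2^27 times smaller than the mean is still resolved
        System.out.println(isAbsorbed(1.0, Math.scalb(1.0, -27))); // false
    }
}
```

The mean of 1.0 sits at the bottom of its power of 2 interval, so it shows the 
largest difference between the two directions.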

The original bug is that you can output NaN, which is arguably worse than 
samples that are not really Gaussian. But to get that you have to use an 
infinite std.dev. and so would not have a Gaussian sampler anyway. It would be 
a function outputting from one of -inf, NaN, +inf, with NaN a very rare case.
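
A minimal sketch of the three possible outputs, assuming the sampler computes 
mean + stdDev * z from a normalized deviate z (the helper name scale is 
hypothetical):

```java
final class InfiniteStdDevDemo {
    /** Assumed form of the transform applied to a normalized Gaussian deviate z. */
    static double scale(double mean, double stdDev, double z) {
        return mean + stdDev * z;
    }

    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY;
        System.out.println(scale(0, inf, -1.5)); // -Infinity
        System.out.println(scale(0, inf, 0.0));  // NaN, since infinity * 0 is NaN
        System.out.println(scale(0, inf, 1.5));  // Infinity
    }
}
```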

I think setting the limit of +/-10 std.dev to not touch infinity is a start. 
Then either one of:
 # Updating the javadoc to state that cases where the magnitude of the mean is 
far greater than the std.dev. will result in a sampler that does not output a 
Gaussian distribution due to lack of precision in a double.
 # Checking the std.dev. is within a set scale of the magnitude of the mean.
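
For illustration only, the two checks could be sketched as below. This reads 
the +/-10 std.dev. limit as requiring mean +/- 10 * stdDev to be finite, which 
is an assumption, and the class name, method names and the bits parameter are 
all hypothetical rather than anything in Commons RNG.

```java
final class GaussianParameterCheck {
    /** Assumed reading of the limit: mean +/- 10 std.dev. must not reach infinity. */
    static boolean isRangeFinite(double mean, double stdDev) {
        final double lo = mean - 10 * stdDev;
        final double hi = mean + 10 * stdDev;
        // stdDev > 0 is false for NaN, so NaN is rejected as well
        return stdDev > 0 && Double.isFinite(lo) && Double.isFinite(hi);
    }

    /** Option 2: the std.dev. is at most 2^bits times smaller than |mean|. */
    static boolean isWithinScale(double mean, double stdDev, int bits) {
        return Math.scalb(stdDev, bits) >= Math.abs(mean);
    }

    public static void main(String[] args) {
        System.out.println(isRangeFinite(0, Double.POSITIVE_INFINITY)); // false
        System.out.println(isRangeFinite(1e300, 1));                    // true
        System.out.println(isWithinScale(1e300, 1, 27));                // false
    }
}
```

The mean=1e300, std.dev=1 case passes the first check and is only caught by 
the second, which is where the choice of threshold becomes a judgement call.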

I would opt for the user-beware documentation. I've not seen this documented in 
other libraries, let alone an exception being thrown for not being able to 
compute a valid sample. The end user should be able to discover why their 
simulation did not work as they expected with such extreme parameterisation.

 

> GaussianSampler should not allow infinite standard deviation
> ------------------------------------------------------------
>
>                 Key: RNG-146
>                 URL: https://issues.apache.org/jira/browse/RNG-146
>             Project: Commons RNG
>          Issue Type: Bug
>          Components: sampling
>    Affects Versions: 1.3
>            Reporter: Alex Herbert
>            Priority: Trivial
>
> The GaussianSampler requires the standard deviation is strictly positive. It 
> allows an infinite value. This will produce a NaN output if the 
> NormalizedGaussianSampler returns 0:
> {code:java}
> @Test
> public void testInfiniteStdDev() {
>     NormalizedGaussianSampler gauss = new NormalizedGaussianSampler() {
>         @Override
>         public double sample() {
>             return 0;
>         }
>     };
>     GaussianSampler s = new GaussianSampler(gauss, 0, Double.POSITIVE_INFINITY);
>     Assert.assertEquals(Double.NaN, s.sample(), 0.0);
> }
> {code}
> A fix is to require the standard deviation is finite.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
