[ 
https://issues.apache.org/jira/browse/SOLR-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422029#comment-13422029
 ] 

Greg Bowyer commented on SOLR-3673:
-----------------------------------

{quote}
This is where my total ignorance of these random generators and how they are used 
comes in: it looked to me like these generators in your patch just took in a 
java.util.Random as input – is there a particular reason why this Mrs. Twister 
random needs to be used? what does that give us that java.util.Random doesn't?
{quote}

They can take anything that extends java.util.Random. The only issues with the 
built-in one are that its period is comparatively short (so it repeats itself 
much sooner), the numbers it generates have some statistically poor 
properties, and it's slightly slower.

I don't lay claim to being an expert on this stuff; I am going on what I have 
been told. The use of MT is a side benefit of cheating on the distributions 
and using the ones that come out of the box in uncommons-math: since I had a 
better RNG available, I used it.
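
To illustrate the "anything that extends java.util.Random" point, here is a 
minimal sketch using only the JDK. GaussianVariate is a hypothetical name for 
illustration, not the class from the patch and not the uncommons-math API; it 
only shows that the source of randomness can be swapped without touching the 
distribution code.

```java
import java.security.SecureRandom;
import java.util.Random;

/** Hypothetical sampler for illustration; not the class from the SOLR-3673 patch. */
final class GaussianVariate {
    private final Random rng;
    private final double mean;
    private final double stddev;

    /** Accepts any java.util.Random subclass as the source of randomness. */
    GaussianVariate(Random rng, double mean, double stddev) {
        if (rng == null) {
            throw new IllegalArgumentException("rng must not be null");
        }
        if (stddev < 0) {
            throw new IllegalArgumentException("stddev must be >= 0");
        }
        this.rng = rng;
        this.mean = mean;
        this.stddev = stddev;
    }

    /** Scales and shifts a standard normal draw from the supplied generator. */
    double next() {
        return mean + stddev * rng.nextGaussian();
    }

    public static void main(String[] args) {
        // The built-in generator, seeded so the sequence is repeatable.
        GaussianVariate builtIn = new GaussianVariate(new Random(42L), 10.0, 2.0);
        // Any other subclass drops in unchanged; SecureRandom is just a JDK example.
        GaussianVariate other = new GaussianVariate(new SecureRandom(), 10.0, 2.0);
        System.out.println(builtIn.next() + " " + other.next());
    }
}
```

A Mersenne Twister implementation that extends java.util.Random could be 
passed to the same constructor.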

{quote}
FWIW: 128bits isn't that much if you let the seed argument to the function be 
an arbitrary String - even if you ignore the high bits the user just needs to 
give you 16 chars (less if we include stuff like the index version)
{quote}

Yeah, it's not a lot and it's manageable; I was thinking more about avoiding 
it being too configurable.
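
For what it's worth, one way an arbitrary String seed could be turned into 
128 bits is sketched below. This is an assumption for illustration, not what 
the patch does: SeedFromString and seed128 are hypothetical names, and MD5 is 
used here purely as a convenient 128-bit mixing function, not for security.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for illustration; not part of the SOLR-3673 patch. */
final class SeedFromString {

    /** Mixes a user-supplied seed string and an index version into 16 bytes. */
    static byte[] seed128(String userSeed, long indexVersion) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update(userSeed.getBytes(StandardCharsets.UTF_8));
            // Separator byte so ("ab", 1) and ("ab1", ...) cannot collide trivially.
            md5.update((byte) 0);
            md5.update(Long.toString(indexVersion).getBytes(StandardCharsets.UTF_8));
            return md5.digest(); // MD5 always yields 16 bytes = 128 bits
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the digests every Java platform must provide.
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }

    public static void main(String[] args) {
        byte[] seed = seed128("my-seed", 12345L);
        System.out.println(seed.length + " bytes");
    }
}
```

The resulting byte array could then be handed to any generator that accepts a 
16-byte seed.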

{quote}
This is kind of where my "use case" question comes into play as well ... if the 
goal is just to use these generators to get a "biased" shuffling of the docs 
(ie: maybe you use certain random distribution and then frange filter on it to 
get a set of documents with a roughly predictable size) then it's not that bad if 
the seeds aren't very complex – throw in the SolrCore start time to get a few 
more bits, etc.... But if there is some sort of cryptography goal then 
obviously having a "good" random seed that is unpredictable is a lot more 
important.
{quote}

The first use case, plus use cases that involve bending things towards 
distributions to act as cheap models.
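
As a rough sketch of that first use case (assign each document a repeatable 
random value, then keep the ones that fall in a range), something like the 
following would do. RangeFilterSketch, valueFor and select are hypothetical 
names, the seed mixing is an arbitrary choice of mine, and this is plain Java 
rather than an actual Solr function query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration of "random value per doc, then range filter". */
final class RangeFilterSketch {

    /** Repeatable value in [0, 1) derived from a seed and a document id. */
    static double valueFor(long seed, int docId) {
        // Spread the doc id over the seed bits; the multiplier is an arbitrary odd constant.
        long mixed = seed ^ (docId * 0x9E3779B97F4A7C15L);
        return new Random(mixed).nextDouble();
    }

    /** Returns the doc ids in [0, maxDoc) whose value lies in [lower, upper). */
    static List<Integer> select(long seed, int maxDoc, double lower, double upper) {
        List<Integer> kept = new ArrayList<>();
        for (int docId = 0; docId < maxDoc; docId++) {
            double v = valueFor(seed, docId);
            if (v >= lower && v < upper) {
                kept.add(docId);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // With uniform values, a range of width 0.1 should keep roughly a tenth of the docs.
        List<Integer> sample = select(2012L, 10_000, 0.0, 0.1);
        System.out.println("kept " + sample.size() + " of 10000");
    }
}
```

Swapping nextDouble() for a draw from another distribution is what bends the 
selection towards that distribution.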

This stuff is useless as it stands for crypto anyhow, since these RNGs are 
fairly predictable.
                
> Random variate functions
> ------------------------
>
>                 Key: SOLR-3673
>                 URL: https://issues.apache.org/jira/browse/SOLR-3673
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 4.0, 5.0
>            Reporter: Greg Bowyer
>            Assignee: Greg Bowyer
>         Attachments: SOLR-3673.patch
>
>
> Hi all
> At my $DAYJOB I have been asked to build a few random variate functions that 
> return random numbers bound to a distribution.
> I think these can be added to Solr.
> I have a hesitation in that the code as written uses / needs uncommons math 
> (because we want a far better RNG than Java's and because I am lazy and did 
> not want to write distributions).
> uncommons math is under the Apache License, so we are good on that front.
> Anyone have any thoughts on this?
> For reference the functions are:
> rgaussian(mean, stddev) -> Random value aligned to gaussian distribution
> rpoisson(mean) -> Random value aligned to poisson distribution
> rbinomial(n, prob) -> Random value aligned to binomial distribution
> rcontinous(min, max) -> Random continuous value between min and max
> rdiscrete(min, max) -> Random discrete value between min and max
> rexponential(rate) -> Random value from the exponential distribution
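
For readers unfamiliar with random variates, here is a minimal sketch of how 
two of the functions listed above could be computed from a java.util.Random. 
These are textbook methods (inverse transform sampling for the exponential, 
Knuth's multiplication method for the Poisson), not the uncommons-math 
implementations the patch relies on, and VariateSketch is a hypothetical name.

```java
import java.util.Random;

/** Hypothetical textbook samplers; the patch itself delegates to uncommons-math. */
final class VariateSketch {

    /** Exponential variate via inverse transform: -ln(1 - u) / rate. */
    static double rexponential(Random rng, double rate) {
        if (rate <= 0) {
            throw new IllegalArgumentException("rate must be > 0");
        }
        double u = rng.nextDouble();        // u is in [0, 1), so 1 - u is in (0, 1]
        return -Math.log(1.0 - u) / rate;
    }

    /** Poisson variate via Knuth's method; only practical for small means. */
    static int rpoisson(Random rng, double mean) {
        if (mean < 0) {
            throw new IllegalArgumentException("mean must be >= 0");
        }
        double limit = Math.exp(-mean);
        double product = 1.0;
        int count = 0;
        do {
            count++;
            product *= rng.nextDouble();    // multiply uniforms until the product drops to the limit
        } while (product > limit);
        return count - 1;
    }

    public static void main(String[] args) {
        Random rng = new Random(3673L);
        System.out.println(rexponential(rng, 2.0) + " " + rpoisson(rng, 4.0));
    }
}
```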

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


