[ 
https://issues.apache.org/jira/browse/SOLR-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422023#comment-13422023
 ] 

Hoss Man commented on SOLR-3673:
--------------------------------

{quote}
bq. I think at a minimum we should probably add a "seed" argument to all of 
these functions 
...
That mostly makes sense, I am not sure what to do if an RNG is used that needs 
more seed data than the end user provides, at the moment I am using the 
Mersenne Twister which requires 128-bits of seed data, I am nervous about 
exposing the particulars of the underlying RNG, or its seeding.
{quote}

This is where my total ignorance of these random generators and how they are 
used comes in: it looked to me like the generators in your patch just took in a 
java.util.Random as input -- is there a particular reason why the Mersenne 
Twister needs to be used? What does that give us that java.util.Random doesn't?

FWIW: 128 bits isn't that much if you let the seed argument to the function be 
an arbitrary String -- even if you keep only the low 8 bits of each char, the 
user just needs to give you 16 chars (fewer if we mix in stuff like the index 
version).
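
To make the String-seed idea concrete, here is a minimal sketch. It is not from the patch: the class name SeedFromString, the helper seedBytes, and the choice of MD5 (picked only because its digest happens to be exactly 128 bits) are assumptions made for illustration. Hashing is swapped in for "take the low bits of 16 chars" so that short or long strings both work.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class SeedFromString {

    /**
     * Hypothetical helper: turn an arbitrary user-supplied seed string plus an
     * index version into exactly 16 bytes (128 bits) of seed data.
     */
    static byte[] seedBytes(String userSeed, long indexVersion) {
        try {
            // MD5 is used here only as a convenient 128-bit mixer, not for security.
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update(userSeed.getBytes(StandardCharsets.UTF_8));
            // Mix in the index version so the same string yields a new seed after a commit.
            md5.update(ByteBuffer.allocate(Long.BYTES).putLong(indexVersion).array());
            return md5.digest();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] seed = seedBytes("my-seed", 42L);
        System.out.println("seed length in bytes: " + seed.length);
    }
}
```

The resulting byte array could then be handed to whatever RNG wants 128 bits of seed, without the end user ever seeing how the generator is seeded.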

This is kind of where my "use case" question comes into play as well ... if the 
goal is just to use these generators to get a "biased" shuffling of the docs 
(ie: maybe you use a certain random distribution and then frange filter on it 
to get a set of documents with a roughly predictable size) then it's not that 
bad if the seeds aren't very complex -- throw in the SolrCore start time to get 
a few more bits, etc....  But if there is some sort of cryptography goal then 
obviously having a "good" random seed that is unpredictable is a lot more 
important.
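
A rough sketch of the "biased shuffling" use case, using nothing but java.util.Random: assign each doc a gaussian value from a seeded generator and keep the docs whose value falls in a range, which is roughly what an frange filter over rgaussian would do. The class name SeededSample, the method sample, and the assumption that docs are visited in a fixed order are all hypothetical and only serve the illustration; this is plain Java, not Solr code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SeededSample {

    /**
     * Hypothetical helper: draw one standard-gaussian value per doc from a
     * seeded generator and keep the doc ids whose value lies in [lower, upper].
     * Assumes docs are always visited in the same order (0 .. numDocs-1).
     */
    static List<Integer> sample(long seed, int numDocs, double lower, double upper) {
        Random rng = new Random(seed); // same seed -> same sequence -> same sample
        List<Integer> kept = new ArrayList<>();
        for (int docId = 0; docId < numDocs; docId++) {
            double value = rng.nextGaussian(); // mean 0, stddev 1
            if (value >= lower && value <= upper) {
                kept.add(docId);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // About 68% of a standard gaussian lies within one stddev of the mean,
        // so the sample size is roughly predictable without being exact.
        List<Integer> kept = sample(12345L, 10000, -1.0, 1.0);
        System.out.println("kept " + kept.size() + " of 10000 docs");
    }
}
```

For this kind of use a simple long seed is enough to make the sample repeatable; nothing here is suitable if the seed has to be unpredictable.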


                
> Random variate functions
> ------------------------
>
>                 Key: SOLR-3673
>                 URL: https://issues.apache.org/jira/browse/SOLR-3673
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 4.0, 5.0
>            Reporter: Greg Bowyer
>            Assignee: Greg Bowyer
>         Attachments: SOLR-3673.patch
>
>
> Hi all
> At my $DAYJOB I have been asked to build a few random variate functions that 
> return random numbers bound to a distribution.
> I think these can be added to solr.
> I have a hesitation in that the code as written uses / needs uncommons math 
> (because we want a far better RNG than java's and because I am lazy and did 
> not want to write distributions)
> uncommons math is apache license so we are good on that front
> anyone have any thoughts on this ?
> For reference the functions are:
> rgaussian(mean, stddev) -> Random value aligned to gaussian distribution
> rpoisson(mean) -> Random value aligned to poisson distribution
> rbinomial(n, prob) -> Random value aligned to binomial distribution
> rcontinous(min, max) -> Random continuous value between min and max
> rdiscrete(min, max) -> Random discrete value between min and max
> rexponential(rate) -> Random value from the exponential distribution


        
