Re: Advice on managing random number generators across threads

Mars0i Fri, 06 Jun 2014 20:33:25 -0700

Wow.  Linus, thanks for the very detailed answer.

It sounds as your view is that there's no problem with the logic of my 
proposed solution, but it seems wasteful to create 100 RNGs, when I don't 
actually need to have so many of them.  Your solutions provide ways to get 
the same result without maintaining so many RNG objects in a simulation 
run.  Is that roughly correct?

On Friday, June 6, 2014 2:26:25 PM UTC-5, Linus Ericsson wrote:
>
> (I think your stepwise problem is better to be solved without agents, see 
> below.)
>
> I realized that the *rng* actually is conflated, it keeps both the state 
> for the random number generator AND the algorithm used for the random 
> number generation. This is usually not a problem, since the state-size and 
> usage is dependent on the particular random number generator, but it 
> becomes a problem when we need to separate and multithread things 
>

Interesting point.  OK.  But seem my comment below about sharing of code 
between RNG objects.

> One solution would be to either re-implement the algorithm (by Knuth) used 
> in the original Sun Java random implementation, or use the experimental 
> library Four [1] by Gary Fredericks that does exactly this.
>

Four looks very nice.  Its stateless RNGs return a new RNG, rather than an 
RNG state, btw:
(class (rs/java-random 1001))
;==>  four.stateless.JavaRandom
(map class (rs/next-long (rs/java-random 1001)))
;==> (java.lang.Long four.stateless.JavaRandom)
 I can see how this might be useful, though.

What you would like to do is to be able to feed the rng-algorithm with the 
> state (it is just a 64 bit long), and simply take the random-result 
> generated from it, and store the state in the agent together with the rest 
> of the simulation state.
>
> Dubious Speculations:
> This makes it necessary to know that the agents are called in the same 
> order (if the influence on each other). 
>

> Given the non-guarantee of ordering in the nature of Java Threads and how 
> agents rely on them I don't think you can expect your result to be 
> reproducible. 
>

Right.  I can guarantee the order in which the random numbers are used when 
I use map to cause each person to do its work, but I assume that I can't 
guarantee this order when I use pmap.

> Here I assume that the incoming calls to agents aren't guaranteed to 
> strictly ordered among different agents, even though the agents reside in 
> one of two thread-pools (the send- vs. send-off pools). If two agents sent 
> messages that arrived a third agent in different order, the rng-state would 
> differ between the runs). Please correct me if I'm wrong on this one. 
>

No, there's no issue about the order in which messages are received, I 
believe.  Communication is effectively simultaneous, because all sending 
happens in one step, and all receiving happens in another step.

(Here are the details, if you're interested:  Each Person random chooses 
other Persons to send a message to, and then randomly chooses a message to 
send.  The pmap over the Persons ends, and all of the messages are 
collected into a single hashmap.  Then a separate pmap goes through each 
Person, applying the messages in the hashmap that are supposed to be sent 
to that Person.  Thus, although Persons perform their work in some order, 
the sending and the receiving of the messages are separated, so that 
message sending and receipt occurs as if all messages were sent 
simultaneously.  In the next timestep, the messages received by a Person 
affect the probabilities that various messages will be chosen to send to 
another Person.)

> A possible clever way around this could be to send the rng state together 
> with the message arriving at the agent, to make the agent at least react to 
> the state in a similar way. 
>

Interesting idea.  I don't think I'll need this, though.

> If you have some kind of continous magnitudes in the simulation - like 
> pheromones - the simulation would maybe converge to similar results since 
> the interaction effects would be reproducible but this is indeed a very 
> far-fetched hypothesis.
>

The simulation converges, but the same parameters can allow convergence to 
different outcomes.  I usually have to run the simulation many times with 
the same parameters so that I can get a frequency distribution over 
outcomes.

> So only if your Person transform functions are commutative (ie gives the 
> same result whichever order they was called in every timestep) the 
> simulation could be reproducible by storing the local random generator seed 
> in each Person.
>

I see.  I think I understand now.  If I store the state and not the RNG 
code, but I am able to pass the state into the RNG code, that's all that I 
need to store in each Person.  I think the idea is that this would do the 
same work as in my proposal, but with less storage.

But wouldn't the RNG code be shared between RNG objects anyway?  For 
example, don't all java.util.Random objects share the same code, differing 
only in that they store different state?  

> If it's just speed you are after, do use either the java 1.7 
> ThreadLocalRandom (at least try out if this really gives any performance 
> boost).
>

You're right.  I should try it.  (There are some special steps necessary to 
install 1.7 on my version of OS X, but it can be done.)

>
> A naive ThreadLocalImplementation implementation of clojure.core/rand 
> would be:
>
> (defn rand
>   "Returns a random floating point number between 0 (inclusive) and
>   n (default 1) (exclusive)."
>   {:added "1.0"
>    :static true}
> *  ([] (.nextDouble (java.util.concurrent.ThreadLocalRandom/current)))*
>   ([n] (* n (rand))))
>

Thanks--this is very helpful given that much of my Java knowledge fallen 
away through lack of use.

> but as you can see in the source code, random integers are generated using 
> a call to (rand), multiplied with the integer and floored (when converting 
> to int), so there are ways to get an limited range int more efficiently 
> (check the source code for java.util.Random, which makes various tricks for 
> getting just enough random bits out.
>

OK, good.

> As always, measure performance (Hugo Duncans Criterium, 
> criterium.core/bench is a really good start). 
>

Yes, I always use Criterium.

> Numerical double operations are cheap, memory accesses - in this case to 
> resulting data structures or other agents and likely context switches - 
> could as well be the more expensive part.
>
> BTW: It looks like the problem you want to solve would benefit much from 
> being formulated as a two-step problem, one where you applied all the 
> incoming calls to each Person, and another where you sorted the messages to 
> each Person in some reproducible order, like
>

Thanks.  I think that what you have in mind is roughly what I've already 
implemented, described above.  (Your example was fun.)

> If you makes the message passing explicit instead of 
> Java-thread-call-uncertain-implicit (which IS possible and parallelizeable 
> in a stepwise simulation!) you can still parallelize it in many ways (for 
> example with reducers) and at the same time make simulations reproducible, 
> and you'll avoid a lot (N²) of potentially expensive context switching.
>

I may use reducers in the future.  Still learning about them. 

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Advice on managing random number generators across threads

Reply via email to