On Thu, Apr 24, 2014 at 9:28 AM, Lars Buitinck <larsm...@gmail.com> wrote:
> 2014-04-23 21:00 GMT+02:00 Sturla Molden <sturla.mol...@gmail.com>:
>> - If you provide each thread with its own PRNG, you must make sure the
>> sequences don't overlap. Just using a different seed for each thread is not
>> safe either.
>
> I'm not sure what you mean by that; my rand(3) manpage says "In order
> to get reproducible behavior in a threaded application, this state
> must be made explicit; this can be done using the reentrant function
> rand_r()." But you're saying that's not enough?

It's enough to be reproducible if the threads don't interact with each other. If they can interact, then timing differences will affect reproducibility, since the order of interactions is indeterminate and different sequences of PRNG draws can occur. But that's expected with interacting threads even without a PRNG involved.

Explicit per-thread state is also not enough to be sure that the different streams are statistically independent of one another. For many non-statistical applications, that doesn't matter. For statistical applications, it might, but it also might not. Drawing pseudorandom numbers for a Monte Carlo algorithm definitely requires independence. Using a PRNG to shuffle data to try to avoid worst-case performance in some algorithms typically doesn't require independence, even if the result of that algorithm is a component in a larger statistical application.

It looks like liblinear just uses rand() to do Fisher-Yates shuffles in some of its coordinate descent solvers. I *suspect* that independence is not a strictly necessary property here, but reproducibility is.

--
Robert Kern

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
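
To make the reproducibility point concrete, here is a minimal sketch of a Fisher-Yates shuffle driven by explicit, caller-owned state via rand_r(). This is not liblinear's code: the names shuffle_r and shuffled_range are hypothetical, and the sketch assumes a POSIX system where rand_r() is available. It only addresses reproducibility. Giving each thread its own seed this way says nothing about whether the streams are statistically independent.

```c
#define _POSIX_C_SOURCE 200112L  /* assumption: needed for rand_r() under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fisher-Yates shuffle that draws from caller-owned PRNG state instead of
 * the hidden global state behind rand(). Each thread would pass its own
 * state variable, so no other thread can perturb the sequence of draws. */
void shuffle_r(int *a, size_t n, unsigned int *state)
{
    assert(state != NULL);
    for (size_t i = n; i > 1; i--) {
        /* Pick j in [0, i). The modulo bias is ignored in this sketch. */
        size_t j = (size_t)rand_r(state) % i;
        int tmp = a[i - 1];
        a[i - 1] = a[j];
        a[j] = tmp;
    }
}

/* Hypothetical helper: fill a[] with 0..n-1, then shuffle it using a
 * private state initialised from seed. */
void shuffled_range(int *a, size_t n, unsigned int seed)
{
    unsigned int state = seed;  /* private to this call, hence to this thread */
    for (size_t i = 0; i < n; i++)
        a[i] = (int)i;
    shuffle_r(a, n, &state);
}
```

On a given platform, calling shuffled_range() twice with the same seed should produce the same permutation no matter what other threads are doing, as long as they do not touch the state variable. The exact permutation depends on the C library's rand_r() implementation, so results are not expected to match across platforms.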