On Thu, Apr 24, 2014 at 9:28 AM, Lars Buitinck <larsm...@gmail.com> wrote:
> 2014-04-23 21:00 GMT+02:00 Sturla Molden <sturla.mol...@gmail.com>:
>> - If you provide each thread with its own PRNG, you must make sure the
>> sequences don't overlap. Just using a different seed for each thread is not
>> safe either.
>
> I'm not sure what you mean by that; my rand(3) manpage says "In order
> to get reproducible behavior in a threaded application, this state
> must be made explicit; this can be done using the reentrant function
> rand_r()." But you're saying that's not enough?

It's enough to be reproducible if the threads don't interact with each
other. If they can interact with each other, then timing differences
will affect reproducibility since the order of interactions is
indeterminate, and different sequences of PRNG draws can occur. But
that's expected with interacting threads even without a PRNG involved.
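
To illustrate the non-interacting case, here is a minimal sketch in
Python rather than C. Each thread owns an explicit generator object,
which plays the role of the state you would pass to rand_r(). The
names worker() and run() are made up for this example.

```python
import random
import threading

def worker(seed, n, out, idx):
    # Explicit per-thread state: no other thread touches this generator.
    rng = random.Random(seed)
    out[idx] = [rng.random() for _ in range(n)]

def run(seeds, n=5):
    # One thread per seed; each thread writes only to its own slot.
    out = [None] * len(seeds)
    threads = [
        threading.Thread(target=worker, args=(seed, n, out, i))
        for i, seed in enumerate(seeds)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out

# Scheduling may differ between runs, but the draws do not.
first = run([1, 2, 3])
second = run([1, 2, 3])
```

Because the threads never share a generator or exchange data, first
and second hold the same numbers no matter how the threads were
scheduled.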

Per-thread state is also not enough to guarantee that the different
streams are statistically independent of one another. For many
non-statistical applications, that doesn't matter. For statistical
applications, it may or may not matter. Drawing pseudorandom numbers
for a Monte Carlo algorithm definitely requires independence. Using a
PRNG to shuffle data to avoid worst-case performance in some
algorithms typically doesn't require independence, even if the result
of that algorithm is a component in a larger statistical application.
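
A toy sketch of why different seeds alone don't buy independence: the
generator below is a small full-period LCG that I picked purely for
illustration (it is not rand_r(), and the constants and function names
are assumptions). With a full-period generator every seed is just a
position on the same cycle, so two "different" streams are shifted
copies of each other.

```python
M = 2 ** 16
A, C = 25173, 13849  # toy LCG constants, chosen only for illustration

def step(x):
    # Advance the generator state by one draw.
    return (A * x + C) % M

def stream(seed, n):
    # The first n draws produced from the given seed.
    out, x = [], seed
    for _ in range(n):
        x = step(x)
        out.append(x)
    return out

def offset(seed_a, seed_b):
    # Number of steps until the state started at seed_a equals seed_b.
    x = seed_a
    for k in range(M):
        if x == seed_b:
            return k
        x = step(x)
    return None  # the two seeds lie on different cycles

k = offset(1, 2)
overlap = stream(1, k + 10)[k:] == stream(2, 10)
```

Here the stream seeded with 2 reappears verbatim inside the stream
seeded with 1, starting k draws in. Whether that overlap matters
depends on how many draws each thread consumes and on what the numbers
are used for.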

It looks like liblinear just uses rand() to do Fisher-Yates shuffles
in some of its coordinate descent solvers. I *suspect* that
independence is not a strictly necessary property here, but
reproducibility is.
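
For concreteness, a Fisher-Yates shuffle driven by an explicit,
seeded generator might look like the sketch below. This is not
liblinear's code, just an illustration in Python; fisher_yates() is a
name I made up.

```python
import random

def fisher_yates(items, rng):
    # Shuffle a copy of items using only draws from the supplied rng.
    a = list(items)
    for i in range(len(a) - 1, 0, -1):
        j = rng.randint(0, i)  # uniform over 0..i, both ends included
        a[i], a[j] = a[j], a[i]
    return a

# The same seed gives the same permutation on every run.
perm_a = fisher_yates(range(10), random.Random(0))
perm_b = fisher_yates(range(10), random.Random(0))
```

Since the permutation depends only on the generator passed in, two
calls with identically seeded generators agree, which is the
reproducibility property that matters here.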

-- 
Robert Kern

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
