On Thu, Nov 6, 2008 at 16:58, Barry Wark <[EMAIL PROTECTED]> wrote:
> On Thu, Nov 6, 2008 at 1:55 PM, Robert Kern <[EMAIL PROTECTED]> wrote:
>> On Thu, Nov 6, 2008 at 15:12, Barry Wark <[EMAIL PROTECTED]> wrote:
>>> On Thu, Nov 6, 2008 at 12:09 PM, Robert Kern <[EMAIL PROTECTED]> wrote:
>>>> On Thu, Nov 6, 2008 at 14:05, Barry Wark <[EMAIL PROTECTED]> wrote:
>>>>> I'm just about to embark on a long-term research project and was
>>>>> planning to use numpy.random to generate stimuli for our experiments.
>>>>> We plan to store only the parameters and RandomState seed for each
>>>>> stimulus and I'm concerned about stability of the API in the long
>>>>> term: will the parameters and random seed we store now work with
>>>>> future versions of numpy.random?
>>>>
>>>> It should. But just in case, make sure you explicitly instantiate
>>>> RandomState objects instead of using the functions in numpy.random.
>>>> That way, should we need to fix some bug that might change the
>>>> results, you can always pull out the current mtrand code and use it
>>>> independently.
>>>
>>> That is our working plan, as well as to record the numpy.__version__
>>> which was used to generate the original stimulus. Thanks for the
>>> confirmation.
>>>
>>> On a side note, this seems like a potentially big issue for many
>>> scientific users. Perhaps making a policy of keeping incompatible
>>> revisions to RandomState noted in its documentation (if they ever
>>> come up) would be useful. Even better, a module function or class
>>> method that returns an instance of RandomState as it was at a
>>> particular numpy version:
>>>
>>> r = numpy.random.RandomState.from_version(my_numpy_version, seed=None)
>>>
>>> Hmm. Sounds like a bit of work. I'll give it a go, if you think this
>>> is a valuable approach.
>>>
>>>>
>>>>> I think I recall that there was a
>>>>> change in the random seed format some time around numpy 1.0.
>>>>
>>>> I don't think I changed it after 1.0. Before 1.0, we explicitly warned
>>>> people about API instability.
>>>
>>> I believe you. We've been developing this app since before numpy 1.0,
>>> so I'm sure the issue cropped up from data generated pre-1.0.
>>
>> Okay. Actually, now that I think about it, there have been changes
>> that would affect results using the nonuniform distributions. These
>> should only have arisen from fixing bugs (i.e. the previous results
>> were wrong, not just different). Do you have any thoughts on how you
>> would want us to handle that case?
>
> In our usage (neural physiology), we've recorded the physiological
> response to a given stimulus. So being able to recover the _exact_
> original stimulus that produced the recorded data is critical. This is
> why I suggested an API which would let us get an instance of the
> RandomState as it was at a particular revision (including bugs) so
> that we could regenerate the exact original sequence. Obviously, we're
> happy to have the bug fixes in, and continue to use the current
> RandomState for new experiments.

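For reference, the working plan quoted above (an explicitly instantiated
RandomState plus a record of numpy.__version__) might look roughly like
the sketch below. The seed, the parameter names, the make_stimulus helper
and the choice of a normal distribution are all illustrative assumptions,
not anything taken from the thread.

```python
import numpy as np

# Hypothetical stimulus parameters; the seed and sizes are illustrative only.
params = {"seed": 12345, "n_samples": 1000, "mean": 0.0, "std": 1.0}


def make_stimulus(params):
    """Generate a stimulus from stored parameters (hypothetical helper)."""
    # Use a dedicated RandomState instance rather than the module-level
    # numpy.random functions, so the stream depends only on the stored seed.
    rng = np.random.RandomState(params["seed"])
    return rng.normal(params["mean"], params["std"], size=params["n_samples"])


stimulus = make_stimulus(params)

# Record the parameters together with the numpy version that produced them.
record = dict(params, numpy_version=np.__version__)

# Within one installation, the same parameters give the same sequence.
regenerated = make_stimulus(params)
```

This only shows repeatability within a single numpy build on a single
machine; it says nothing about other versions or platforms.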
How big are these stimuli? I'd just store them in an HDF file. Some of
those bugs were 32-bit/64-bit differences. Just because your future
self can get the same-versioned source code doesn't mean that the
results you get will be identical. I think that the versioned API would
be a lot of work for a false sense of security.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
_______________________________________________
Numpy-discussion mailing list
[email protected]
http://projects.scipy.org/mailman/listinfo/numpy-discussion
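Storing the generated stimulus itself, as suggested in the reply above,
could look roughly like the following sketch. To keep it self-contained,
numpy.save stands in for the HDF file; the file name, the seed, the gamma
distribution and the SHA-256 checksum are assumptions added for
illustration only.

```python
import hashlib
import os
import tempfile

import numpy as np

rng = np.random.RandomState(12345)  # illustrative seed
stimulus = rng.gamma(2.0, 1.0, size=1000)  # a nonuniform distribution

# Checksum of the raw bytes, kept next to the data so that a later
# regeneration attempt can be compared against the stored original.
digest = hashlib.sha256(stimulus.tobytes()).hexdigest()

# Hypothetical file name inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "stimulus_0001.npy")
np.save(path, stimulus)

# Later: load the stored array instead of regenerating it from the seed.
loaded = np.load(path)
```

With the array on disk, the seed and version record become a
cross-check rather than the only way back to the original stimulus.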
