On Sun, Jun 3, 2018 at 5:46 PM <josef.p...@gmail.com> wrote:

> On Sun, Jun 3, 2018 at 8:21 PM, Robert Kern <robert.k...@gmail.com> wrote:
>
>>> The list of ``StableRandom`` methods should be chosen to support unit
>>> tests:
>>>
>>> * ``.randint()``
>>> * ``.uniform()``
>>> * ``.normal()``
>>> * ``.standard_normal()``
>>> * ``.choice()``
>>> * ``.shuffle()``
>>> * ``.permutation()``
>>
>> https://github.com/numpy/numpy/pull/11229#discussion_r192604311
>> @bashtage writes:
>> > standard_gamma and standard_exponential are important enough to be
>> > included here IMO.
>>
>> "Importance" was not my criterion, only whether they are used in unit
>> test suites. This list was just off the top of my head for methods that I
>> think were actually used in test suites, so I'd be happy to be shown live
>> tests that use other methods. I'd like to be a *little* conservative about
>> what methods we stick in here, but we don't have to be *too* conservative,
>> since we are explicitly never going to be modifying these.
>
> That's one area where I thought the selection is too narrow.
> We should be able to get a stable stream from the uniform for some
> distributions.
>
> However, according to the Wikipedia description Poisson doesn't look easy.
> I just wrote a unit test for statsmodels using Poisson random numbers with
> hard coded numbers for the regression tests.
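The hard-coded approach described in that last paragraph might look
roughly like the following. This is only a minimal sketch: the counts,
`poisson_mle`, and `poisson_loglike` are made up for illustration and are
not taken from the statsmodels test suite.

```python
import math

# Hypothetical counts standing in for numbers that were drawn once from a
# Poisson sampler and then pasted into the test file. Because they are
# literals, they cannot change across library versions or platforms.
COUNTS = [3, 1, 4, 2, 0, 5, 2, 3, 1, 4]


def poisson_mle(counts):
    """Maximum-likelihood estimate of the Poisson rate (the sample mean)."""
    return sum(counts) / len(counts)


def poisson_loglike(counts, rate):
    """Poisson log-likelihood of `counts` at the given rate."""
    return sum(k * math.log(rate) - rate - math.lgamma(k + 1) for k in counts)


def test_poisson_mle():
    rate = poisson_mle(COUNTS)
    # Expected value worked out by hand: 25 / 10.
    assert math.isclose(rate, 2.5)
    # The estimate should not be beaten by nearby rates.
    assert poisson_loglike(COUNTS, rate) >= poisson_loglike(COUNTS, 2.4)
    assert poisson_loglike(COUNTS, rate) >= poisson_loglike(COUNTS, 2.6)


test_poisson_mle()
```

The test never touches a random number generator, so it cannot be affected
by changes to any sampling algorithm.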
I'd really rather people do this than use StableRandom; this is best
practice, as I see it, if your tests involve making precise comparisons to
expected results. StableRandom is intended as a crutch so that the pain of
moving existing unit tests away from the deprecated RandomState is less
onerous. I'd really rather people write better unit tests!

In particular, I do not want to add any of the integer-domain distributions
(aside from shuffle/permutation/choice) as these are the ones that have the
platform-dependency issues with respect to 32/64-bit `long` integers.
They'd be unreliable for unit tests even if we kept them stable over time.

> I'm not sure which other distributions are common enough and not easily
> reproducible by transformation. E.g. negative binomial can be reproduced by
> a gamma-Poisson mixture.
>
> On the other hand normal can be easily recreated from standard_normal.

I was mostly motivated by making it a bit easier to mechanically replace
uses of randn(), which is probably even more common than normal() and
standard_normal() in unit tests.

> Would it be difficult to keep this list large, given that it should be
> frozen, low-maintenance code?

I admit that I had in mind non-statistical unit tests. That is, tests that
didn't depend on the precise distribution of the inputs.

--
Robert Kern
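
P.S. For concreteness, the "reproducible by transformation" idea from the
quoted text can be sketched as below. This is an illustration under
assumptions, not part of the proposal: the helper names are hypothetical,
and the standard library's `random.Random` stands in for whatever stable
source of uniform and standard normal draws is available.

```python
import math
import random


def exponential_from_uniform(u, scale=1.0):
    """Inverse-CDF transform: map u in [0, 1) to an exponential variate."""
    return -scale * math.log1p(-u)


def normal_from_standard_normal(z, loc=0.0, scale=1.0):
    """Shift and scale a standard normal draw."""
    return loc + scale * z


def transformed_stream(seed, n):
    """Draw n (exponential, normal) pairs from a seeded stand-in generator."""
    rng = random.Random(seed)  # stand-in for a stable uniform/normal source
    pairs = []
    for _ in range(n):
        u = rng.random()           # uniform on [0, 1)
        z = rng.gauss(0.0, 1.0)    # standard normal
        pairs.append((exponential_from_uniform(u, scale=2.0),
                      normal_from_standard_normal(z, loc=10.0, scale=3.0)))
    return pairs


# The same seed gives the same transformed stream, because the transforms
# themselves are deterministic functions of the underlying draws.
assert transformed_stream(12345, 5) == transformed_stream(12345, 5)
```

Only the underlying uniform and standard normal streams need to stay
fixed for the derived values to stay fixed.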
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion