On Sat, Aug 28, 2021 at 5:56 AM Stig Korsnes <stigkors...@gmail.com> wrote:

> Thank you again Robert.
> I am using NamedTuple for my keys, which are also keys in a dictionary.
> Each key will be unique (a tuple of a distinct int and an enum), so I am
> thinking that maybe there is no risk of producing a duplicate hash, but I
> could, as always, be wrong :)
>

Present, but possibly ignorably small. 128-bit spaces give enough breathing
room for me to be comfortable; 64-bit spaces, like the one hash() uses for
its results, make me just a little claustrophobic.

If the structure of the keys is pretty fixed, just these two integers
(counting the enum as an integer), then I might just use both in the
seeding material.

def get_key_seed(key: ComponentId, root_seed: int):
    return np.random.SeedSequence(
        [key.the_int, int(key.the_enum), root_seed]
    )
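
A self-contained sketch of that idea follows. The Kind enum, the field
values, and the root seed 12345 are made up for illustration; only the
names ComponentId, the_int, and the_enum come from the snippet above.
It assumes the enum is an IntEnum so that int() works on it.

```python
from enum import IntEnum
from typing import NamedTuple

import numpy as np


class Kind(IntEnum):
    # Hypothetical enum, only here to make the example runnable.
    PRICE = 1
    VOLUME = 2


class ComponentId(NamedTuple):
    the_int: int
    the_enum: Kind


def get_key_seed(key: ComponentId, root_seed: int) -> np.random.SeedSequence:
    # Both key fields and the root seed go into the entropy directly,
    # so no 64-bit hash() value is involved.
    return np.random.SeedSequence([key.the_int, int(key.the_enum), root_seed])


ROOT_SEED = 12345  # arbitrary illustrative value

rng_a = np.random.default_rng(get_key_seed(ComponentId(7, Kind.PRICE), ROOT_SEED))
rng_b = np.random.default_rng(get_key_seed(ComponentId(7, Kind.PRICE), ROOT_SEED))
rng_c = np.random.default_rng(get_key_seed(ComponentId(7, Kind.VOLUME), ROOT_SEED))

draw_a = rng_a.standard_normal(4)
draw_b = rng_b.standard_normal(4)  # same key and root seed: same stream as draw_a
draw_c = rng_c.standard_normal(4)  # different key: a different stream
```

The same key with the same root seed reproduces the same stream, which is
what makes it possible to regenerate a sample instead of storing it.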


> For positive ints I followed this tip
> https://stackoverflow.com/questions/18766535/positive-integer-from-python-hash-function
> , and did:
>
> def stronghash(key:ComponentId):
>     return ctypes.c_size_t(hash(key)).value
>

np.uint64(possibly_negative_integer) will also work for this purpose
(somewhat more reliably).
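
If you would rather not depend on how a particular NumPy release converts
negative Python integers (that behaviour has varied between versions), a
plain-Python mask does the same two's-complement mapping into the unsigned
64-bit range. This is only a sketch; the function names are hypothetical.

```python
MASK_64 = (1 << 64) - 1  # 0xFFFFFFFFFFFFFFFF


def to_unsigned_64(value: int) -> int:
    # Reinterpret a (possibly negative) integer as an unsigned 64-bit value.
    return value & MASK_64


def unsigned_hash(key) -> int:
    # hash() may be negative; masking maps it into [0, 2**64).
    return to_unsigned_64(hash(key))


example = unsigned_hash((7, 2))
```

This does not reduce the collision risk discussed above; it only makes the
value non-negative.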

> Since I will be using each process/random sample several times, and keeping
> all of them in memory at once is not feasible (dimensionality), I did the
> following:
>
>         self._rng = default_rng(cs)
>         self._state = dict(self._rng.bit_generator.state)
>
>     def scenarios(self) -> npt.NDArray[np.float64]:
>         self._rng.bit_generator.state = self._state
>         ....
>         return ....
>
> Would you consider this bad practice, or an ok solution?
>

It's what that property is there for. No need to copy; `.state` creates a
new dict each time.
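
A minimal sketch of that save-and-restore pattern, without the extra copy.
The class name Process, the seed 2021, and the use of standard_normal() as
the sampling step are assumptions made for illustration; the real
scenarios() body was elided in the question.

```python
import numpy as np


class Process:
    # Hypothetical stand-in for the class in the question.
    def __init__(self, seed) -> None:
        self._rng = np.random.default_rng(seed)
        # .state returns a fresh dict on each access, so no dict(...) copy is needed.
        self._state = self._rng.bit_generator.state

    def scenarios(self, n: int) -> np.ndarray:
        # Rewind the generator to its initial state before every draw.
        self._rng.bit_generator.state = self._state
        return self._rng.standard_normal(n)


proc = Process(np.random.SeedSequence(2021))
first = proc.scenarios(5)
second = proc.scenarios(5)  # regenerated from the saved state, not kept in memory
```

Each call to scenarios() regenerates the same sample, so nothing but the
state dict has to be kept around between uses.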

In a quick test, I measured a process holding 1 million Generator instances
at ~1.5 GiB and one holding 1 million state dicts at ~1.0 GiB (including all
of the other overhead of Python and numpy; not a scientific test). Storing just
the BitGenerator is half-way in between. That's something, but not a huge
win. If that is really crossing the border from feasible to infeasible, you
may be about to run into your limits anyways for other reasons. So balance
that out with the complications of swapping state in and out of a single
instance.

> In Norway we have a saying which translates directly as: "He asked for the
> finger... and took the whole arm".
>

Well, when I craft an overly-complicated system, I feel responsible to help
shepherd people along in using it well. :-)

-- 
Robert Kern