On Fri, Nov 17, 2023, at 11:07, Robert Kern wrote: > If the arrays you are drawing indices for are real in-memory arrays for > present-day 64-bit computers, this should be adequate. If it's a notional > array that is larger, then you'll need actual arbitrary-sized integer > sampling. The builtin `random.randrange()` will do arbitrary-sized integers > and is quite reasonable for this task. If you want it to use our > BitGenerators underneath for clean PRNG state management, this is quite > doable with a simple subclass of `random.Random`: > https://github.com/numpy/numpy/issues/24458#issuecomment-1685022258
Thanks, Robert. The use case is for randomly populating a hypothetical N-dimensional sparse array object. In practice, int64 will probably suffice. I can see how to generate arbitrarily large integers using the pattern above, but still unsure how to sample without replacement. Aaron mentioned that conflicts are extremely unlikely, so perhaps fine to assume they won't happen. Checking for conflicts is expensive. Attached is a script that implements this solution. Stéfan
import random import functools import itertools import operator import numpy as np def cumulative_prod(arr): return list(itertools.accumulate(arr, func=operator.mul)) def unravel_index(x, dims): dim_prod = cumulative_prod([1] + list(dims)[:0:-1])[::-1] return [list((x[j] // dim_prod[i]) % dims[i] for i in range(len(dims))) for j in range(len(x))] # From Robert Kern's comment at # https://github.com/numpy/numpy/issues/24458#issuecomment-1685022258 class PythonRandomInterface(random.Random): def __init__(self, rng): self._rng = rng def getrandbits(self, k): """getrandbits(k) -> x. Generates an int with k random bits.""" if k < 0: raise ValueError('number of bits must be non-negative') numbytes = (k + 7) // 8 # bits / 8 and rounded up x = int.from_bytes(self._rng.bytes(numbytes), 'big') return x >> (numbytes * 8 - k) # trim excess bits def indices(self, shape, size=1): D = functools.reduce(lambda x, y: x * y, dims) indices = [pri.randint(0, D) for i in range(size)] return unravel_index(indices, shape) rng = np.random.default_rng() pri = PythonRandomInterface(rng) dims = (500, 400, 30, 15, 20, 800, 900, 2000, 800) k = 5 print(pri.indices(dims, size=k))
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com