On Fri, Nov 17, 2023, at 11:07, Robert Kern wrote:
> If the arrays you are drawing indices for are real in-memory arrays for 
> present-day 64-bit computers, this should be adequate. If it's a notional 
> array that is larger, then you'll need actual arbitrary-sized integer 
> sampling. The builtin `random.randrange()` will do arbitrary-sized integers 
> and is quite reasonable for this task. If you want it to use our 
> BitGenerators underneath for clean PRNG state management, this is quite 
> doable with a simple subclass of `random.Random`: 
> https://github.com/numpy/numpy/issues/24458#issuecomment-1685022258

Thanks, Robert. The use case is for randomly populating a hypothetical 
N-dimensional sparse array object.
In practice, int64 will probably suffice. 

I can see how to generate arbitrarily large integers using the pattern above, 
but still unsure how to sample without replacement. Aaron mentioned that 
conflicts are extremely unlikely, so perhaps fine to assume they won't happen. 
Checking for conflicts is expensive.

Attached is a script that implements this solution.

Stéfan
import random
import functools
import itertools
import operator

import numpy as np


def cumulative_prod(arr):
    return list(itertools.accumulate(arr, func=operator.mul))


def unravel_index(x, dims):
    dim_prod = cumulative_prod([1] + list(dims)[:0:-1])[::-1]
    return [list((x[j] // dim_prod[i]) % dims[i] for i in range(len(dims))) for j in range(len(x))]


# From Robert Kern's comment at
# https://github.com/numpy/numpy/issues/24458#issuecomment-1685022258
class PythonRandomInterface(random.Random):
    def __init__(self, rng):
        self._rng = rng

    def getrandbits(self, k):
        """getrandbits(k) -> x.  Generates an int with k random bits."""
        if k < 0:
            raise ValueError('number of bits must be non-negative')
        numbytes = (k + 7) // 8                       # bits / 8 and rounded up
        x = int.from_bytes(self._rng.bytes(numbytes), 'big')
        return x >> (numbytes * 8 - k)                # trim excess bits

    def indices(self, shape, size=1):
        D = functools.reduce(lambda x, y: x * y, dims)
        indices = [pri.randint(0, D) for i in range(size)]
        return unravel_index(indices, shape)


rng = np.random.default_rng()
pri = PythonRandomInterface(rng)

dims = (500, 400, 30, 15, 20, 800, 900, 2000, 800)
k = 5

print(pri.indices(dims, size=k))
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com

Reply via email to