[Numpy-discussion] How to sample unique vectors

Stefan van der Walt via NumPy-Discussion Fri, 17 Nov 2023 10:55:09 -0800

Hi all,

I am trying to sample k N-dimensional vectors from a uniform distribution 
without replacement.
It seems like this should be straightforward, but I can't seem to pin it down.


Specifically, I am trying to get random indices into a d0 x d1 x ... x dN-1 
array.

I thought about sneaking in a structured dtype into `rng.integers`, but of 
course that doesn't work.

If we had a string sampler, I could sample k unique words (consisting of 
digits), and convert them to indices.

I could over-sample and filter out the non-unique indices. Or iteratively draw 
blocks of samples until I've built up my k unique indices.
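
The iterative block-drawing idea above could be sketched roughly as follows. 
This is only an illustration under stated assumptions: the function name 
`sample_unique_indices` and the `oversample` parameter are hypothetical, and 
each individual dimension is assumed to fit in int64 (their product need not).

```python
import numpy as np


def sample_unique_indices(shape, k, rng=None, oversample=1.2):
    """Draw k distinct index vectors into an array of the given shape.

    Hypothetical helper: draws blocks of candidates per axis and
    de-duplicates until at least k unique rows have been collected.
    """
    rng = np.random.default_rng() if rng is None else rng
    shape = np.asarray(shape, dtype=np.int64)

    # Use Python ints for the product so that it cannot overflow.
    total = 1
    for d in shape.tolist():
        total *= d
    if k > total:
        raise ValueError("k exceeds the number of distinct indices")

    found = np.empty((0, shape.size), dtype=np.int64)
    while len(found) < k:
        need = k - len(found)
        m = int(need * oversample) + 1
        # One column per axis; `high` broadcasts across the rows.
        block = rng.integers(0, shape, size=(m, shape.size))
        found = np.unique(np.concatenate([found, block]), axis=0)

    # np.unique returns sorted rows, so shuffle before truncating to k.
    return rng.permutation(found)[:k]
```

Because each axis is sampled separately, the flattened index is never formed, 
so very large arrays are not a problem. The loop slows down when k approaches 
the total number of cells, since most new draws are then duplicates.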

The most straightforward solution would be to flatten indices, and to sample 
from those. The integers get large quickly, though. The rng.integers docstring 
suggests that it can handle object arrays for very large integers:

> When using broadcasting with uint64 dtypes, the maximum value (2**64)
> cannot be represented as a standard integer type.
> The high array (or low if high is None) must have object dtype, e.g., 
> array([2**64]).

But that doesn't work:

In [35]: rng.integers(np.array([2**64], dtype=object))
ValueError: high is out of bounds for int64
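
For reference, when the total number of cells still fits in int64, the 
flattening approach can be sketched as below. The helper name 
`sample_flat_indices` is made up for illustration, and the sketch assumes a 
64-bit platform; it does not address the larger-than-int64 case.

```python
import math

import numpy as np


def sample_flat_indices(shape, k, rng=None):
    """Sample k distinct flat indices, then unravel them into index vectors.

    Hypothetical helper; only valid while prod(shape) fits in int64.
    """
    rng = np.random.default_rng() if rng is None else rng
    total = math.prod(shape)  # exact Python-int product
    if total > np.iinfo(np.int64).max:
        raise ValueError("flattened index space does not fit in int64")

    # Distinct flat indices drawn uniformly from range(total).
    flat = rng.choice(total, size=k, replace=False)

    # Convert each flat index back to one coordinate per axis.
    idx = np.stack(np.unravel_index(flat, shape), axis=-1)
    return flat, idx
```

For example, `sample_flat_indices((1000, 2000, 3000), 5)` returns five flat 
indices and the matching array of shape (5, 3).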

Is there an elegant way to handle this problem?

Best regards,
Stéfan
