On Monday, 3 June 2013 at 21:24:50 UTC, Joseph Rushton Wakeling wrote:
> On 06/03/2013 08:28 PM, Diggory wrote:
>> I'd guess that the heavy use of floating point arithmetic to
>> calculate the step sizes means that algorithm has a fairly large
>> constant overhead even though the complexity is smaller.
>
> Yes, I agree. There might be some optimizations that could be done there.

Well, I've come up with this new algorithm:


import std.random : uniform;

uint[] randomSample(uint N, uint M) {
        assert(N <= M);
        uint[] result = new uint[N];

        // Maps a position (key) to the value currently stored there.
        // index holds value+1 so that index == 0 marks an empty slot.
        struct HashPair {
                uint key;
                uint index;
        }

        size_t tableSize = N * 4;
        if (tableSize > M)
                tableSize = M;

        auto table = new HashPair[tableSize];

        // Current value at position x: x itself, unless an earlier
        // swap stored something else there (linear probing).
        uint lookup(uint x) {
                size_t h = x % tableSize;
                while (table[h].index) {
                        if (table[h].key == x)
                                return table[h].index - 1;
                        h = (h + 1) % tableSize;
                }
                return x;
        }

        foreach (uint i; 0 .. N) {
                // uniform avoids the modulo bias of rndGen.front % (M-i).
                immutable uint v = uniform(i, M);

                // Output the value at position v, then complete the
                // Fisher-Yates swap: position v takes position i's
                // current value. (Storing i itself is not enough, since
                // position i may have been swapped into earlier, which
                // would let an already-output value appear twice.)
                result[i] = lookup(v);
                immutable uint vi = lookup(i);

                size_t h = v % tableSize;
                while (table[h].index && table[h].key != v)
                        h = (h + 1) % tableSize;
                table[h].key = v;
                table[h].index = vi + 1;
        }

        return result;
}


It's O(N) rather than O(N²). Conceptually it works by randomly shuffling the first N items of an array of size M, which takes O(N) time but would require an array of size O(M). To avoid that allocation it uses a simple hash table to store just the positions which have been swapped. Since only O(N) items are shuffled there can only be O(N) swaps, so the hash table can be O(N) in size while still offering constant-time look-up.
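For comparison, here is a sketch of the dense version the hash table is standing in for: a partial Fisher-Yates shuffle over an explicit array of all M values (the name randomSampleDense is just for illustration). It does the same O(N) swaps but pays O(M) time and memory to build the array up front:

import std.algorithm : swap;
import std.array : array;
import std.random : uniform;
import std.range : iota;

uint[] randomSampleDense(uint N, uint M) {
        // Materialise the whole identity array 0 .. M-1: this is the
        // O(M) cost that the hash table version avoids.
        auto a = iota(0u, M).array;

        foreach (uint i; 0 .. N) {
                immutable uint v = uniform(i, M);
                swap(a[i], a[v]);       // a[i] becomes the i-th sample
        }
        return a[0 .. N];
}

Since both versions draw one uniform(i, M) per step, seeding rndGen identically before each call should make their outputs match, which is a handy sanity check for the hash-table book-keeping.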
