> Thank you for the pseudocode, I see what you're saying now, but it still
> doesn't make sense to me. ISTM the algorithm implies m < k < n, because if
> m > k, this will never terminate, and if k > n then shuffling is
> deterministic and more efficient in both space and time.
>
> However, if k < n, then there exists at least one pair of unsigned
> integers a < n and b < n such that a^x (mod k) = b^x (mod k), so this
> algorithm will never produce a set with both a and b in it - not truly
> random.
This is what I was talking about with respect to hash "buckets" for collisions. Basically, if you replace the array in the set implementation with an array of binary trees, then you can have exactly the case where both a and b are in the set, even though hash(a) == hash(b). This would use space O(k + k*log m)[1] and a running time that is a pain in the ass to bound reasonably (I was trying to use a negative binomial distribution, with p = m/n, but that ends up being a very loose estimate).

> Plus, what about the overhead of picking a k and x? IIUC finding large
> primes is non-trivial.

Picking large primes is easy, plus this is a one-time cost that is paid at compile time (any x and k pair will work for all m's and n's).

> There's one thing about this algorithm I really like though - O(m) space.
> The best thing I've thought of so far that works in O(m) space looks like:

- Rob

[1] Using a '+' in big O notation means "whichever is bigger"... it's probably obvious, but I forget at what level of algorithms they start describing that.
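For what it's worth, the bucketed-set idea above can be sketched roughly like this (Python). The prime K, the exponent X, and the use of a plain list where the post proposes a binary tree are all illustrative assumptions on my part, not the actual implementation being discussed:

```python
import random

# Assumed parameters: any prime k and exponent x work for all m and n
# (a one-time, "compile time" choice per the post). These values are
# placeholders for illustration only.
K = 1009   # prime number of buckets
X = 3      # hash exponent

def h(a):
    # The discussed hash: a^x mod k
    return pow(a, X, K)

def sample_distinct(m, n, rng=None):
    """Draw m distinct integers uniformly from [0, n) in O(m) space.

    Collisions go into per-bucket containers, so two values a != b
    with h(a) == h(b) can both end up in the sampled set. A real
    binary tree per bucket would give O(log m) membership tests;
    a list is used here purely to keep the sketch short.
    """
    assert m <= n
    buckets = {}   # bucket index -> list (stand-in for a binary tree)
    result = []
    while len(result) < m:
        a = rng.randrange(n) if rng else random.randrange(n)
        bucket = buckets.setdefault(h(a), [])
        if a not in bucket:          # reject duplicates, retry
            bucket.append(a)
            result.append(a)
    return result
```

Because membership is checked inside the bucket rather than by hash value alone, the k < n collision objection from the quoted message no longer loses elements; the cost is the extra per-bucket structure and the retry loop whose expected length depends on m/n.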
