From: "Bill Fairchild" <[email protected]>
Sent: Tuesday, 24 August 2010 4:04 AM

Your random input entry is a 32-bit number.
 Two to the 32nd power is 4GB.  Let the 32-bit number represent
the number of a bit in a contiguous string of bits that is 4 billion bits long.
The number of bytes to contain this array is 537 MB,
which can easily be acquired above the bar or in a data space.
 Find the byte address and bit offset within that byte that
corresponds to the 32-bit input number.  Test the bit.
If off, you have a new entry, so turn the bit on and
process the new entry.  If on, you have a duplicate entry,
so ignore the entry.

That is unworkable.
It's also the reason that I asked how large the list is.

Until that is known, it's pointless suggesting
strategies for solving this kind of problem.

Now that we know that the list is "less than a few hundred entries",
it would seem that hashing would speed this up considerably
(compared with binary search with insertion).
Insertion takes n/2 moves on average.
Searching takes log n (base 2) steps.

Hashing takes one (or a few steps) to look up,
and insertion takes one (or a few) steps.

Make the table large enough (for, say, 10 x 200 entries)
with provision for expanding the table.

Reply via email to