(For very large k case only)

Keep as many elements in the hash_map as you can. Once the map capacity is
exhausted, sort all the elements in the hash_map and write them to disk and
build a bloom filter. Keep this bloom
filter<http://en.wikipedia.org/wiki/Bloom_filter>in memory, empty the
hash_map and now for every new number in the stream,
check in the bloom filter, if it says does not exist, that means this is a
new number, insert in the map. If it says the number does exist, then this
can be a false positive, then first check in the map, if not found in it,
it could be on disk. Use a binary search on the file (this will take
O(logn) disk seeks) and verify if its present in the file or not. If
present, ignore, else add in the map. Keep repeating the process of
flushing the map to disk whenever the map is full. To flush, you could
either use any external
sorting<http://en.wikipedia.org/wiki/External_sorting>method, or just
write out a new file each time. If separate files, you will
need to binary search in all of them in case of false positive.

Programmers should realize their critical importance and responsibility in
a world gone digital. They are in many ways similar to the priests and
monks of Europe's Dark Ages; they are the only ones with the training and
insight to read and interpret the "scripture" of this age.


On Sat, Jun 1, 2013 at 10:19 PM, MAC <[email protected]> wrote:

> Given an infinite stream of integers with only k distinct integers, find
> those distinct integers. How would you modify your solution if k is also
> very large and cannot use hash table.
>
>
>
>
>
> thanks
> --mac
>
> --
> You received this message because you are subscribed to the Google Groups
> "Algorithm Geeks" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Algorithm Geeks" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].


Reply via email to