If you have 1G memory, you should be able to store 64-256M IPs without using hashtables, considering that you use a 16, 32, or 64 bit number to store the visit count.
If you have a new PC, you can possibly store the whole IPv4 address space in memory, so multiple scans isn't necessary. Anyway, I don't see how a heap help with the implementation, because inserting things into a heap is essentially half a heap sort. BTW, for the sorting: if you don't need a deterministic approach, consider doing a sampling (with a hash) to estimate a cutoff value. Then pull out IPs with visits above the cutoff value and do the sorting. However, you still need to do full counting to obtain the counts. This reduce the sorting part from O(n log n) (which is much more expensive than counting that takes O(n)) to O(n + k log k). And if k is small, then insertion sort can also be the algorithm of choice as the complexity is O(nk). (You need only to store the top k results with insertion sort.) -- You received this message because you are subscribed to the Google Groups "Google Code Jam" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/google-code/dc02901e-ea5a-47e6-85ae-a2bdd96d711f%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
