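If the IPs are mostly distinct (so a hash table of counts would not fit in memory), you can do it in two steps:
- external-sort the log file, and
- scan the sorted file sequentially, counting each run of identical IPs and keeping the K largest counts.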
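
A rough sketch of those two steps in Python, in case it helps. This is only one way to do it, not a tuned implementation. It assumes one IP per line; the names top_k_ips and chunk_size are made up for the example, and chunk_size should be set to whatever fits comfortably in your memory.

```python
import heapq
import itertools
import os
import tempfile


def _write_sorted_run(chunk):
    """Sort one in-memory chunk and spill it to a temporary file."""
    chunk.sort()
    f = tempfile.NamedTemporaryFile("w", delete=False, suffix=".run")
    with f:
        f.writelines(ip + "\n" for ip in chunk)
    return f.name


def top_k_ips(ips, k, chunk_size=1_000_000):
    """Return the k most frequent IPs as (count, ip) pairs, largest first.

    `ips` is any iterable of lines (for example an open log file) with
    one IP per line; this layout is an assumption of the sketch.
    """
    # Step 1: external sort. Split the input into sorted runs on disk.
    run_paths = []
    chunk = []
    for line in ips:
        ip = line.strip()
        if not ip:
            continue
        chunk.append(ip)
        if len(chunk) >= chunk_size:
            run_paths.append(_write_sorted_run(chunk))
            chunk = []
    if chunk:
        run_paths.append(_write_sorted_run(chunk))

    files = [open(path) for path in run_paths]
    try:
        # Lazily merge the sorted runs; only one line per run is in memory.
        merged = heapq.merge(*(map(str.strip, f) for f in files))
        # Step 2: identical IPs are now adjacent, so count each run of them.
        counts = ((sum(1 for _ in group), ip)
                  for ip, group in itertools.groupby(merged))
        # nlargest keeps at most k candidates in memory at a time.
        return heapq.nlargest(k, counts)
    finally:
        for f in files:
            f.close()
        for path in run_paths:
            os.remove(path)
```

You would call it as top_k_ips(open("access.log"), 10), where access.log is a placeholder name. Note that the merge keeps every run file open at once, so with a very large number of runs you would need to merge in several passes.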


On Thu, Sep 4, 2008 at 2:12 AM, Huabin Zheng <[EMAIL PROTECTED]> wrote:

> Hi all,
>     I have encountered a problem; it looks like this:
>
>     There is a log file which records all the IPs that visited a certain
> web site. The log file may be several GB, but the computer used
> to analyze it has limited memory, about 1 GB. I am asked to figure out
> the top K IPs which visited the web site most frequently.
> Is a hash table adequate to solve it?
>
> Any other suggestions? Or are there classic algorithms to cope with
> it?
>
> thanks
>
> Regards,
> Huabin
>
> --
> Huabin Zheng
> Sensor Networks and Application Research Center, GUCAS
>

