On Sep 4, 2:12 pm, "Huabin Zheng" <[EMAIL PROTECTED]> wrote:
> Hi all,
> I have run into a problem that looks like this:
>
> There is a log file which records all the IPs that visited a certain web
> site. The log file may be several GB in size, but the computer used to
> analyze it has limited memory, about 1 GB. I am asked to figure out the
> Top K IPs which visited the web site most frequently.
> Is a hash table suitable for solving it?
>
> Any other suggestions? Or are there classic algorithms for dealing with
> it?
>
> thanks
>
> Regards,
> Huabin
>
> --
> Huabin Zheng
> Sensor Networks and Application Research Center, GUCAS
Yes, it's possible. First, split your log file into chunks whose size is
proportional to your memory size. Once it is split, count the number of
times each IP address appears in each chunk. This is similar to the
external sort principle. After that, retrieve the frequently visiting IPs
from each chunk and store them in a separate log file. That gives you the
number of visits per IP address for each chunk you have processed; from
those, take the most-visited ones and sort again to find the overall most
frequent IP addresses.
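
Here is a rough Python sketch of that idea, not a definitive implementation.
It assumes the log has one IP address per line, and the names top_k_ips and
num_buckets are made up for illustration. One deliberate change from the
description above: instead of cutting the file by size, it assigns each IP to
a bucket file by a hash of the address. That way every occurrence of a given
IP lands in the same bucket, so the per-bucket counts are already the final
counts and the last step only has to pick the K largest candidates.

```python
import heapq
import os
import tempfile
import zlib
from collections import Counter


def top_k_ips(log_path, k, num_buckets=64):
    """Return the k most frequent IPs in log_path as (ip, count) pairs.

    Assumes one IP per line; num_buckets should be large enough that the
    distinct IPs of a single bucket fit in memory.
    """
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"bucket_{i}.txt") for i in range(num_buckets)]
        buckets = [open(p, "w") for p in paths]
        try:
            # Pass 1: stream the log and route each IP to a bucket by hash,
            # so all occurrences of one IP end up in the same bucket file.
            with open(log_path) as log:
                for line in log:
                    ip = line.strip()
                    if ip:
                        index = zlib.crc32(ip.encode()) % num_buckets
                        buckets[index].write(ip + "\n")
        finally:
            for bucket in buckets:
                bucket.close()

        # Pass 2: count one bucket at a time and keep only its k best entries.
        candidates = []
        for path in paths:
            with open(path) as f:
                counts = Counter(line.strip() for line in f)
            candidates.extend(
                heapq.nlargest(k, counts.items(), key=lambda kv: kv[1])
            )

    # Final step: choose the overall top k among the per-bucket candidates.
    return heapq.nlargest(k, candidates, key=lambda kv: kv[1])
```

You would call it as top_k_ips("access.log", 10), where "access.log" is a
placeholder path. If you split purely by size instead, the same IP can show
up in several chunks, so you would have to keep the full per-chunk counts and
add them together before picking the top K.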