Apache log munging

2008-10-08 Thread Joe Python
I have written a generator for an apache log which returns two types of information, hostname and the filename requested. The 'log' generator can be 'consumed' like this: for r in log: print r['host'], r['filename'] I want to find the top '100' hosts (sorted in descending order of total

Re: Apache log munging

2008-10-08 Thread Joe Riopel
On Wed, Oct 8, 2008 at 1:55 PM, Joe Python [EMAIL PROTECTED] wrote: I want to find the top '100' hosts (sorted in descending order of total requests) like follows: Is there a fast way to do this without scanning the log file many times? As you encounter a new host add it to a dict (or another
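
The single-pass dict idea suggested in this reply might look roughly like the sketch below. It is written for Python 3 rather than the Python 2 used in the thread. The `top_hosts` helper and the sample records are hypothetical; only the `r['host']` key comes from the original post.

```python
import heapq

def top_hosts(log, n=100):
    """Count requests per host in one pass and return the n busiest hosts."""
    counts = {}
    for r in log:
        host = r['host']
        # First sighting of a host starts at 0, then every request adds 1
        counts[host] = counts.get(host, 0) + 1
    # nlargest picks the top n without fully sorting every host
    return heapq.nlargest(n, counts.items(), key=lambda item: item[1])

# Made-up records standing in for the real 'log' generator
sample_log = [
    {'host': 'a.example.com', 'filename': '/index.html'},
    {'host': 'b.example.com', 'filename': '/index.html'},
    {'host': 'a.example.com', 'filename': '/about.html'},
    {'host': 'c.example.com', 'filename': '/index.html'},
    {'host': 'a.example.com', 'filename': '/index.html'},
    {'host': 'b.example.com', 'filename': '/about.html'},
]

for host, total in top_hosts(sample_log, n=2):
    print(host, total)
```

Because the counts are accumulated while the generator is consumed, the log is read only once, and memory grows with the number of distinct hosts rather than the number of log lines.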

Re: Apache log munging

2008-10-08 Thread Joe Python
I am currently using the following technique to get the info above: all = defaultdict(int) hosts = defaultdict(int) filename = defaultdict(int) for r in log: all[r['host'],r['file']] += 1 hosts[r['host']] += 1 filename[r['file']] = 1 for host in sorted(hosts,key=hosts.get,
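
One way to gather per-host totals and a per-file breakdown in a single pass is sketched below. This is not the poster's code, which is cut off above. It assumes the goal is the top hosts by total requests, each with its per-file counts, which the truncated post does not confirm. The `summarize` helper and the sample data are hypothetical, the record key `'file'` follows the snippet above (the first post uses `'filename'`), and the code targets Python 3.

```python
from collections import defaultdict

def summarize(log, n=100):
    """Return [(host, total, {file: count}), ...] for the n busiest hosts."""
    hosts = defaultdict(int)  # total requests per host
    per_host_files = defaultdict(lambda: defaultdict(int))  # host -> file -> count
    for r in log:
        hosts[r['host']] += 1
        per_host_files[r['host']][r['file']] += 1
    # Sort hosts by total requests, largest first, and keep the first n
    top = sorted(hosts, key=hosts.get, reverse=True)[:n]
    return [(h, hosts[h], dict(per_host_files[h])) for h in top]

# Made-up records standing in for the real 'log' generator
sample_log = [
    {'host': 'a.example.com', 'file': '/index.html'},
    {'host': 'b.example.com', 'file': '/index.html'},
    {'host': 'a.example.com', 'file': '/about.html'},
    {'host': 'a.example.com', 'file': '/index.html'},
]

for host, total, files in summarize(sample_log, n=2):
    print(host, total, files)
```

Nesting the per-file counts under each host means the final loop only touches the files of the hosts that made the cut, instead of scanning every (host, file) pair again.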