Wow, thanks again =)
--
http://mail.python.org/mailman/listinfo/python-list
MRAB writes:
> Federico Moreira wrote:
>> Great, 2min 34 secs with the open method =)
>>
>> but why?
>>
>> ip, sep, rest = line.partition(' ')
>> match_counter[ip] += 1
>>
>> instead of
>>
>> match_counter[line.strip()[0]] += 1
>>
>> strip really takes more time than partition?
>>
>> I'm h
Yep, I meant split, sorry.
Thanks for the answer!
Federico Moreira wrote:
Great, 2min 34 secs with the open method =)
but why?
ip, sep, rest = line.partition(' ')
match_counter[ip] += 1
instead of
match_counter[line.strip()[0]] += 1
Does strip really take more time than partition?
I'm having the same results with both of them right now.
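A small sketch of what each of the three calls actually returns may help here. The sample log line is hypothetical (it is not taken from the thread), and the names `ip_from_split` and `first_char` are only for illustration. Note that `strip()` does not split anything, which is why it was later corrected to `split`:

```python
# Hypothetical sample line in Apache common log format (an assumption,
# not data from the thread).
line = '10.0.0.1 - - [16/Dec/2008:12:07:14 -0300] "GET / HTTP/1.1" 200 512\n'

# partition() cuts once, at the first space, and returns a 3-tuple.
ip, sep, rest = line.partition(' ')

# split() with no limit breaks the whole line into fields before indexing.
ip_from_split = line.split()[0]

# strip() only trims surrounding whitespace, so [0] is a single character.
first_char = line.strip()[0]
```

Both `partition(' ')` and `split()[0]` yield the first field, while `strip()[0]` yields only its first character, so a counter keyed on `line.strip()[0]` would group addresses by leading digit.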
> Federico Moreira wrote:
> > Hi all,
>
> > Im parsing a 4.1GB apache log to have stats about how many times an ip
> > request something from the server.
>
> > The first design of the algorithm was
>
> > for line in fileinput.input(sys.argv[1:]):
> >     ip = line.split()[0]
> >     if match_counter.
Arnaud Delobelle writes:
> match_total = dict((key, val()) for key, val in match_counter.iteritems())
Sorry I meant
match_total = dict((key, val.next())
                   for key, val in match_counter.iteritems())
--
Arnaud
bearophileh...@lycos.com writes:
> This can be a little faster still:
>
> match_counter = defaultdict(int)
> for line in fileinput.input(sys.argv[1:]):
>     ip = line.split(None, 1)[0]
>     match_counter[ip] += 1
>
> Bye,
> bearophile
Or maybe (untested):
match_counter = defaultdict(int)
for l
2008/12/16
> Python 3.0 does not support has_key, it's time to get used to not using it
> :)
>
Good to know
line.split(None, 1)[0] really speeds up the process.
Thanks again.
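One way to check this kind of claim on your own machine is a quick `timeit` comparison. This is only a sketch: the sample line and the helper names (`first_field_split`, `first_field_split_once`, `first_field_partition`, `time_extractors`) are hypothetical, and the timings will vary with the Python version and the shape of the log lines.

```python
import timeit

# Hypothetical sample line (an assumption, not data from the thread).
LINE = '10.0.0.1 - - [16/Dec/2008:12:07:14 -0300] "GET / HTTP/1.1" 200 512\n'


def first_field_split(line):
    # Splits the whole line into fields, then keeps only the first.
    return line.split()[0]


def first_field_split_once(line):
    # maxsplit=1 stops after the first run of whitespace.
    return line.split(None, 1)[0]


def first_field_partition(line):
    # Cuts once at the first space character.
    return line.partition(' ')[0]


def time_extractors(number=100000):
    """Return {function name: seconds} for `number` calls of each extractor."""
    results = {}
    for func in (first_field_split, first_field_split_once,
                 first_field_partition):
        results[func.__name__] = timeit.timeit(
            lambda f=func: f(LINE), number=number)
    return results


if __name__ == '__main__':
    for name, seconds in time_extractors().items():
        print(name, round(seconds, 3))
```

All three helpers return the same first field for a space-separated line; only the amount of work done on the rest of the line differs.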
Quoth Lie Ryan :
> On Tue, 16 Dec 2008 12:07:14 -0300, Federico Moreira wrote:
>
> > Hi all,
> >
> > Im parsing a 4.1GB apache log to have stats about how many times an ip
> > request something from the server.
> >
> > The first design of the algorithm was
> >
> > for line in fileinput.input(sy
The defaultdict option looks faster than the standard dict (by approx. 20 secs).
Now I have:
#
import fileinput
import sys
from collections import defaultdict
match_counter = defaultdict(int)
for line in fileinput.input(sys.argv[1:]):
    match_counter[line.split()[0]] +=
MRAB:
> from collections import defaultdict
> match_counter = defaultdict(int)
> for line in fileinput.input(sys.argv[1:]):
>     ip = line.split()[0]
>     match_counter[ip] += 1
This can be a little faster still:
match_counter = defaultdict(int)
for line in fileinput.input(sys.argv[1:]):
Lie Ryan wrote:
> On Tue, 16 Dec 2008 12:07:14 -0300, Federico Moreira wrote:
>
>
>> Hi all,
>>
>> Im parsing a 4.1GB apache log to have stats about how many times an ip
>> request something from the server.
>>
>> The first design of the algorithm was
>>
>> for line in fileinput.input(sys.argv[1
On Tue, 16 Dec 2008 12:07:14 -0300, Federico Moreira wrote:
> Hi all,
>
> Im parsing a 4.1GB apache log to have stats about how many times an ip
> request something from the server.
>
> The first design of the algorithm was
>
> for line in fileinput.input(sys.argv[1:]):
> ip = line.split()[
Federico Moreira wrote:
Hi all,
I'm parsing a 4.1GB Apache log to get stats about how many times an IP
requests something from the server.
The first design of the algorithm was
for line in fileinput.input(sys.argv[1:]):
    ip = line.split()[0]
    if match_counter.has_key(ip):
        match_
Hi all,
I'm parsing a 4.1GB Apache log to get stats about how many times an IP
requests something from the server.
The first design of the algorithm was
for line in fileinput.input(sys.argv[1:]):
    ip = line.split()[0]
    if match_counter.has_key(ip):
        match_counter[ip] += 1
    else:
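The loop above is cut off in this listing, and `has_key` no longer exists in Python 3. A minimal sketch of the same counting idea using the `in` operator follows. It assumes the missing `else` branch sets the count to 1, which the excerpt does not show; the function name `count_first_fields` is hypothetical, and it takes any iterable of lines so it can be tried without a log file.

```python
def count_first_fields(lines):
    """Count how often each first whitespace-separated field occurs."""
    match_counter = {}
    for line in lines:
        fields = line.split(None, 1)
        if not fields:
            # Blank line: nothing to count.
            continue
        ip = fields[0]
        if ip in match_counter:      # replaces match_counter.has_key(ip)
            match_counter[ip] += 1
        else:
            # Assumption: first sighting starts the count at 1.
            match_counter[ip] = 1
    return match_counter
```

Passing `fileinput.input(sys.argv[1:])` as `lines` would read the files named on the command line, as in the original loop.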