Re: Generator slower than iterator?

2008-12-20 Thread Federico Moreira
Wow, thanks again =) -- http://mail.python.org/mailman/listinfo/python-list

Re: Generator slower than iterator?

2008-12-19 Thread Arnaud Delobelle
MRAB writes:
> Federico Moreira wrote:
>> Great, 2min 34 secs with the open method =)
>>
>> but why?
>>
>> ip, sep, rest = line.partition(' ')
>> match_counter[ip] += 1
>>
>> instead of
>>
>> match_counter[line.strip()[0]] += 1
>>
>> strip really takes more time than partition?
>>
>> I'm h

Re: Generator slower than iterator?

2008-12-19 Thread Federico Moreira
Yep, I meant split, sorry. Thanks for the answer!

Re: Generator slower than iterator?

2008-12-19 Thread MRAB
Federico Moreira wrote:
> Great, 2min 34 secs with the open method =)
> but why?
>     ip, sep, rest = line.partition(' ')
>     match_counter[ip] += 1
> instead of
>     match_counter[line.strip()[0]] += 1
> strip really takes more time than partition?
> I'm having the same results with both of them right now.

Re: Generator slower than iterator?

2008-12-19 Thread Federico Moreira
Great, 2min 34 secs with the open method =)
but why?
    ip, sep, rest = line.partition(' ')
    match_counter[ip] += 1
instead of
    match_counter[line.strip()[0]] += 1
strip really takes more time than partition?
I'm having the same results with both of them right now.
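For reference, a minimal sketch of what the three expressions discussed in this message return for a single log line. The sample line and its address are invented for illustration, and the snippet is written for Python 3. Note that `line.strip()[0]` yields only the first character of the line, which is presumably why the follow-up says `split` was meant.

```python
# Hypothetical Apache-style log line; the address and request are invented.
line = '192.0.2.10 - - [16/Dec/2008:12:07:14 -0300] "GET / HTTP/1.1" 200 512\n'

# partition splits once at the first space and returns a 3-tuple
# (text before the separator, the separator, text after it).
ip_partition, sep, rest = line.partition(' ')

# split with maxsplit=1 also stops after the first run of whitespace.
ip_split = line.split(None, 1)[0]

# strip only trims surrounding whitespace, so [0] is a single character.
first_char = line.strip()[0]

print(ip_partition, ip_split, first_char)
```

Both `partition(' ')` and `split(None, 1)` avoid splitting the rest of the line into fields, which is the part a plain `split()` spends extra work on.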

Re: Generator slower than iterator?

2008-12-19 Thread Raymond Hettinger
> Federico Moreira wrote:
> > Hi all,
> >
> > Im parsing a 4.1GB apache log to have stats about how many times an ip
> > request something from the server.
> >
> > The first design of the algorithm was
> >
> > for line in fileinput.input(sys.argv[1:]):
> >     ip = line.split()[0]
> >     if match_counter.

Re: Generator slower than iterator?

2008-12-16 Thread Arnaud Delobelle
Arnaud Delobelle writes:
> match_total = dict((key, val()) for key, val in match_counter.iteritems())
Sorry I meant
    match_total = dict((key, val.next()) for key, val in match_counter.iteritems())
-- Arnaud

Re: Generator slower than iterator?

2008-12-16 Thread Arnaud Delobelle
bearophileh...@lycos.com writes:
> This can be a little faster still:
>
> match_counter = defaultdict(int)
> for line in fileinput.input(sys.argv[1:]):
>     ip = line.split(None, 1)[0]
>     match_counter[ip] += 1
>
> Bye,
> bearophile
Or maybe (untested):
    match_counter = defaultdict(int)
    for l

Re: Generator slower than iterator?

2008-12-16 Thread Federico Moreira
2008/12/16
> Python 3.0 does not support has_key, it's time to get used to not using it
> :)
Good to know. line.split(None, 1)[0] really speeds up the process. Thanks again.
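As a hedged illustration of the `has_key` remark, here is a small sketch of two spellings that work on Python 3 (the `in` test also works on Python 2): the `in` operator and `dict.get`. The function names and the sample addresses are hypothetical, and the body of the `else` branch is an assumption, since the original loop is cut off at that point in this archive.

```python
def count_with_in(ips):
    """Count occurrences using the `in` operator instead of dict.has_key()."""
    counts = {}
    for ip in ips:
        if ip in counts:          # replaces counts.has_key(ip)
            counts[ip] += 1
        else:
            counts[ip] = 1        # assumed: start a new counter at 1
    return counts


def count_with_get(ips):
    """Same result with dict.get and a default of 0, avoiding the branch."""
    counts = {}
    for ip in ips:
        counts[ip] = counts.get(ip, 0) + 1
    return counts


sample = ['192.0.2.1', '192.0.2.2', '192.0.2.1']  # invented addresses
print(count_with_in(sample))
print(count_with_get(sample))
```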

Re: Generator slower than iterator?

2008-12-16 Thread rdmurray
Quoth Lie Ryan:
> On Tue, 16 Dec 2008 12:07:14 -0300, Federico Moreira wrote:
>
> > Hi all,
> >
> > Im parsing a 4.1GB apache log to have stats about how many times an ip
> > request something from the server.
> >
> > The first design of the algorithm was
> >
> > for line in fileinput.input(sy

Re: Generator slower than iterator?

2008-12-16 Thread Federico Moreira
The defaultdict option looks faster than the standard dict (20 secs approx.). Now I have:
    #
    import fileinput
    import sys
    from collections import defaultdict
    match_counter = defaultdict(int)
    for line in fileinput.input(sys.argv[1:]):
        match_counter[line.split()[0]] +=
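A minimal, self-contained sketch of the `defaultdict` counting approach described here, written for Python 3. It takes any iterable of lines so it can be tried without a log file. The function name `count_ips`, the blank-line guard, and the sample lines are assumptions added for illustration, not part of the posted script.

```python
from collections import defaultdict


def count_ips(lines):
    """Count how often each first field (the client address) appears."""
    match_counter = defaultdict(int)  # missing keys start at int() == 0
    for line in lines:
        fields = line.split()
        if fields:                    # assumed guard: skip blank lines
            match_counter[fields[0]] += 1
    return match_counter


# Invented sample data standing in for an Apache access log.
sample_lines = [
    '192.0.2.1 - - [16/Dec/2008:12:07:14 -0300] "GET / HTTP/1.1" 200 512\n',
    '192.0.2.2 - - [16/Dec/2008:12:07:15 -0300] "GET /a HTTP/1.1" 200 128\n',
    '\n',
    '192.0.2.1 - - [16/Dec/2008:12:07:16 -0300] "GET /b HTTP/1.1" 404 64\n',
]

counts = count_ips(sample_lines)
for ip, hits in sorted(counts.items(), key=lambda item: item[1], reverse=True):
    print(ip, hits)
```

To run it over real files in the spirit of the posted script, `count_ips(fileinput.input(sys.argv[1:]))` should work, since `fileinput.input` yields lines one at a time.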

Re: Generator slower than iterator?

2008-12-16 Thread bearophileHUGS
MRAB:
> from collections import defaultdict
> match_counter = defaultdict(int)
> for line in fileinput.input(sys.argv[1:]):
>     ip = line.split()[0]
>     match_counter[ip] += 1
This can be a little faster still:
    match_counter = defaultdict(int)
    for line in fileinput.input(sys.argv[1:]):
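One way to check claims like "a little faster still" on your own machine is a small `timeit` comparison of the field-extraction variants that come up in this thread. This is only a sketch on synthetic, invented log lines in Python 3. Absolute numbers will differ by interpreter and hardware, and on a 4.1GB file the disk reads may dominate anyway.

```python
import timeit

# Synthetic log lines (invented); real logs have more varied fields.
lines = [
    '192.0.2.%d - - [16/Dec/2008:12:07:14 -0300] "GET /index.html HTTP/1.1" 200 512\n'
    % (i % 256)
    for i in range(10000)
]


def first_field_split(line):
    # Splits the whole line into fields, then keeps only the first.
    return line.split()[0]


def first_field_maxsplit(line):
    # Stops after the first run of whitespace.
    return line.split(None, 1)[0]


def first_field_partition(line):
    # Stops at the first space character.
    return line.partition(' ')[0]


extractors = (first_field_split, first_field_maxsplit, first_field_partition)

for extract in extractors:
    seconds = timeit.timeit(lambda: [extract(l) for l in lines], number=20)
    print('%-24s %.3f s' % (extract.__name__, seconds))
```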

Re: Generator slower than iterator?

2008-12-16 Thread Gary Herron
Lie Ryan wrote:
> On Tue, 16 Dec 2008 12:07:14 -0300, Federico Moreira wrote:
>
>> Hi all,
>>
>> Im parsing a 4.1GB apache log to have stats about how many times an ip
>> request something from the server.
>>
>> The first design of the algorithm was
>>
>> for line in fileinput.input(sys.argv[1

Re: Generator slower than iterator?

2008-12-16 Thread Lie Ryan
On Tue, 16 Dec 2008 12:07:14 -0300, Federico Moreira wrote:
> Hi all,
>
> Im parsing a 4.1GB apache log to have stats about how many times an ip
> request something from the server.
>
> The first design of the algorithm was
>
> for line in fileinput.input(sys.argv[1:]):
>     ip = line.split()[

Re: Generator slower than iterator?

2008-12-16 Thread Lie Ryan
On Tue, 16 Dec 2008 12:07:14 -0300, Federico Moreira wrote:
> Hi all,
>
> Im parsing a 4.1GB apache log to have stats about how many times an ip
> request something from the server.
>
> The first design of the algorithm was
>
> for line in fileinput.input(sys.argv[1:]):
>     ip = line.split()[

Re: Generator slower than iterator?

2008-12-16 Thread MRAB
Federico Moreira wrote:
> Hi all,
>
> Im parsing a 4.1GB apache log to have stats about how many times an ip
> request something from the server.
>
> The first design of the algorithm was
>
> for line in fileinput.input(sys.argv[1:]):
>     ip = line.split()[0]
>     if match_counter.has_key(ip):
>         match_

Generator slower than iterator?

2008-12-16 Thread Federico Moreira
Hi all,
I'm parsing a 4.1GB Apache log to get stats about how many times an IP requests something from the server.
The first design of the algorithm was
    for line in fileinput.input(sys.argv[1:]):
        ip = line.split()[0]
        if match_counter.has_key(ip):
            match_counter[ip] += 1
        else: