Hi all, I'm parsing a 4.1 GB Apache log to get stats on how many times each IP requests something from the server.
The first version of the algorithm was:

    for line in fileinput.input(sys.argv[1:]):
        ip = line.split()[0]
        if match_counter.has_key(ip):
            match_counter[ip] += 1
        else:
            match_counter[ip] = 1

It took 3 min 58 s to give me the stats. Then I tried a generator solution like:

    def generateit():
        for line in fileinput.input(sys.argv[1:]):
            yield line.split()[0]

    for ip in generateit():
        # ...the same if statement as above

Instead of being faster, it took 4 min 20 s.

Should I leave fileinput behind? Am I using generators with the wrong approach?

Thanks in advance,
Federico.
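For comparison, here is a minimal sketch of the same counting loop without fileinput, written for Python 3. The names count_ips and count_files are hypothetical helpers introduced only for this illustration, and nothing here has been timed against the 4.1 GB log, so any speed-up is an assumption to verify. Two things are worth noting: iterating over the file object directly avoids fileinput's extra per-line bookkeeping, and a generator that only wraps the loop does not remove any work, it adds one more step per line.

```python
from collections import Counter


def count_ips(lines):
    """Count how often each first field (assumed to be the client IP) appears."""
    counts = Counter()
    for line in lines:
        # Split at most once: only the first whitespace-separated field is needed.
        fields = line.split(None, 1)
        if fields:  # skip blank lines instead of raising IndexError
            counts[fields[0]] += 1
    return counts


def count_files(paths):
    """Sum the per-IP counts over several log files."""
    totals = Counter()
    for path in paths:
        # Iterate over the file object directly rather than via fileinput.
        with open(path, errors="replace") as log:
            totals.update(count_ips(log))
    return totals
```

A script could call count_files(sys.argv[1:]) (after importing sys) and print the result of .most_common() to list the busiest IPs first.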