[EMAIL PROTECTED] wrote:

> Xiao Jianfeng wrote:
>
>> First, I must say thanks to all of you. And I'm really sorry that I
>> didn't describe my problem clearly.
>>
>> There are many tokens in the file. Every time I find a token, I have
>> to get the data on the next line and do some operation with it. It
>> would be easy to find just one token using the above method, but there
>> is more than one.
>>
>> My method was:
>>
>> f_in = open('input_file', 'r')
>> data_all = f_in.readlines()
>> f_in.close()
>>
>> for i in range(len(data_all)):
>>     line = data_all[i]
>>     if token in line:
>>         # do something with data_all[i + 1]
>>
>> Since my method needs to read the whole file into memory, I think it
>> may not be efficient when processing very big files.
>>
>> I really appreciate all suggestions! Thanks again.
>
> Something like this:
>
> for x in fh:
>     if not has_token(x): continue
>     else: process(fh.next())
>
> You can also create an iterator with iter(fh), but I don't think that
> is necessary.
>
> This uses the "side effect" to your advantage. I was bitten by the
> iterator's side effect before, but for your particular app it becomes
> an advantage.

Thanks all of you!
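For reference, here is the suggestion above fleshed out into a runnable
sketch. The filling-in is mine, not the original poster's exact code:
the token test becomes a plain substring check, process() is a
placeholder, and next(fh) is the spelling of fh.next() that also works
on newer Pythons:

    def process(line):
        # placeholder: do something with the line after the token
        print(line.strip())

    def scan(path, token):
        with open(path) as fh:
            for line in fh:
                if token in line:
                    # The file object is its own iterator, so next(fh)
                    # consumes the following line and the for loop
                    # resumes after it -- the "side effect" used to
                    # advantage here.
                    try:
                        process(next(fh))
                    except StopIteration:
                        break  # token was on the very last line

    scan('input_file', 'some_token')  # file name and token are placeholders

Note that the line consumed by next(fh) is never itself tested for the
token, which matches the requirement above; the readlines() version
tests every line.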
I have compared the two methods:

 (1) "for x in fh:"
 (2) reading the whole file into memory first

on two files, one of 80M and the other of 815M. The first method gave a
speedup of about 40% on the first file and about 25% on the second.

Sorry for my bad English, and I hope I haven't confused people.

Regards,

xiaojf
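P.S. For anyone who wants to redo the comparison, a minimal harness
along these lines should do (this is a sketch, not my exact script;
the file name and token are placeholders):

    import time

    def method_stream(path, token):
        # (1) iterate over the file object directly, line by line
        hits = 0
        with open(path) as fh:
            for line in fh:
                if token in line:
                    next(fh, '')  # grab/skip the line after the token
                    hits += 1
        return hits

    def method_readlines(path, token):
        # (2) read the whole file into memory first
        with open(path) as fh:
            data_all = fh.readlines()
        hits = 0
        for i in range(len(data_all) - 1):
            if token in data_all[i]:
                hits += 1  # data_all[i + 1] is the line after the token
        return hits

    for method in (method_stream, method_readlines):
        start = time.time()
        method('input_file', 'some_token')
        print(method.__name__, '%.2fs' % (time.time() - start))

(The two differ slightly when a token appears on the line right after
another token: the streaming version skips that consumed line, the
readlines() version still tests it.)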