On 8/7/07, Sullivan WxPyQtKinter <[EMAIL PROTECTED]> wrote: > I have a huge log file which contains 3,453,299,000 lines with > different lengths. It is not possible to calculate the absolute > position of the beginning of the one billionth line. Are there > efficient way to seek to the beginning of that line in python? > > This program: > for i in range(1000000000): > f.readline() > is absolutely every slow.... > > Thank you so much for help.
There is no fast way to do this, unless the lines are of fixed length (in which case you can use f.seek() to move to the correct spot). The reason is that there is no way to find the position of the billionth line without scanning the whole file. You should split your logs into smaller files in the future. You might be able to do this a very tiny bit faster by using the split utility and have it split the log file into smaller chunks (split can split by line amounts), but since that still has to scan the file it will will be IO bound. -- Evan Klitzke <[EMAIL PROTECTED]> -- http://mail.python.org/mailman/listinfo/python-list