manstey wrote:
> But will this work if I don't know parts in advance.

Yes, it will work as long as the highest part number in the whole file is
not very high. The algorithm only needs to store N records in memory,
where N is the highest part number in the whole file.

> I only know parts
> by reading through the file, which has 450,000 lines.

Lines or records? I created a sequence of 10,000,000 numbers, which
matches your ten million records, like this:

    def many_numbers():
        for n in xrange(1000000):
            for part in xrange(10):
                yield part

    parts = many_numbers()

and the code processed it in 13 seconds, consuming virtually no memory.
That is the advantage of iterators and generators: you can process long
sequences without allocating a lot of memory.
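
To make the memory point concrete, here is a minimal sketch, not the code that was timed above. It assumes Python 3, so it uses range where the original used xrange. The helper count_and_sum and the sys.getsizeof comparison are my own illustration, and the sizes are kept small so it runs quickly.

```python
import sys


def many_numbers(repeats=1000000, parts=10):
    """Yield part numbers 0..parts-1, `repeats` times over, one at a time."""
    for _ in range(repeats):
        for part in range(parts):
            yield part


def count_and_sum(iterable):
    """Consume any iterable in a single pass, keeping only two integers."""
    count = 0
    total = 0
    for value in iterable:
        count += 1
        total += value
    return count, total


# Hypothetical small sizes, chosen only so the comparison is fast.
lazy = many_numbers(repeats=1000, parts=10)         # nothing generated yet
eager = list(many_numbers(repeats=1000, parts=10))  # all values held at once

lazy_size = sys.getsizeof(lazy)    # the generator object itself
eager_size = sys.getsizeof(eager)  # the list's reference array, items excluded

count, total = count_and_sum(lazy)
print(count, total, lazy_size < eager_size)
```

The generator's size stays the same however large repeats gets, while the list grows with the number of items. Note that sys.getsizeof reports only the container's own footprint, so it understates what the list really costs.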