Chris Rebert wrote:
>> The running result was that reading a 500M file consumed almost 2GB
>> of RAM; I cannot figure it out. Somebody help!
>
> If you could store the floats themselves, rather than their string
> representations, that would be more space-efficient. You could then
> also use the `array` module, which is more space-efficient than lists
> (http://docs.python.org/library/array.html ). Numpy would also be
> worth investigating since multidimensional arrays are involved.
>
> The next obvious question would then be: do you /really/ need /all/ of
> the data in memory at once?
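For concreteness, here is a rough sketch of that suggestion. The file
name 'data.txt' and the layout (whitespace-separated floats, one row
per line) are just assumptions about the OP's data, not something
stated in the original post:

import array

# Assumed layout: whitespace-separated floats, one row per line.
values = array.array('d')  # raw 8-byte doubles, no per-value object overhead
with open('data.txt') as f:
    for line in f:
        values.extend(float(x) for x in line.split())

# If numpy is available, numpy.loadtxt('data.txt') would give a real
# two-dimensional array instead of one flat sequence.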
This is what you (OP) should think about really hard before resorting
to the optimizations mentioned above. Perhaps you can explain what you
are doing with the data once you've loaded it into memory?

> Also, just so you're aware:
> http://docs.python.org/library/sys.html#sys.getsizeof

To give you an idea of how memory usage explodes:

>>> import sys
>>> line = "1.23 4.56 7.89 0.12\n"
>>> len(line)  # size in the file
20
>>> sys.getsizeof(line)
60
>>> formatted = ["%2.6E" % float(x) for x in line.split()]
>>> sys.getsizeof(formatted) + sum(sys.getsizeof(s) for s in formatted)
312
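And for the other side of the comparison, a quick sketch contrasting a
list of float objects with an array('d') holding the same values.
Exact byte counts vary by platform and Python version, so I print them
rather than quote specific numbers:

import sys
import array

line = "1.23 4.56 7.89 0.12\n"
floats = [float(x) for x in line.split()]

# A list pays for the list object itself plus one boxed float object
# per value.
list_bytes = sys.getsizeof(floats) + sum(sys.getsizeof(f) for f in floats)

# array('d') packs the same values as raw doubles behind one header.
packed = array.array('d', floats)
array_bytes = sys.getsizeof(packed)

print(list_bytes, array_bytes)  # the array should come out much smaller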