Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa...@spamschutz.glglgl.de> writes:
> On 19.10.2012 at 21:03, Pradipto Banerjee wrote:
> [...]
>> Still got MemoryError, but at least this time python tried to use the
>> physical memory. What I noticed is that before it gave me the error
>> it used up to 1.5GB (of the 2.23 GB originally shown as available) -
>> so in general, python takes up more memory than the size of the file
>> itself.
>
> Of course - the file is not the only thing held by the process.
>
> I see several approaches here:
>
> * Process the file part by part - as the others already suggested,
>   line-wise; but if you have e.g. a binary file format, other partings
>   may be suitable as well, e.g. a fixed block size, or parts given by
>   the file format.
>
> * If you absolutely have to keep the whole file data in memory, split
>   it up into several strings. Why? Well, the free space in virtual
>   memory is not necessarily contiguous, so even if you have 1.5G free,
>   you might not be able to read 1.5G at once - but you might succeed
>   in reading 3 * 0.5G.
>
> * Try mmap; if you're lucky, it will give you access to your data.

(Note that it is completely unreasonable to load several GB of data into
a 32-bit address space, especially if it is text.)

So my real advice would be:

* Read the file line by line and pack the contents of every line into a
  list of objects; once you have all your stuff, process it.

--
Alain.
--
http://mail.python.org/mailman/listinfo/python-list
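A minimal sketch of the part-by-part and line-by-line suggestions above. The function names, the fixed block size, and the comma-split parsing are illustrative assumptions, not anything from the thread; the point is only that neither approach ever holds the whole file in one string:

```python
def iter_blocks(path, block_size=64 * 1024):
    """Yield a file as fixed-size binary blocks (one possible 'parting')."""
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:       # empty read means end of file
                break
            yield block

def load_records(path):
    """Read a text file line by line, packing each line into a list of objects.

    Iterating over the file object is lazy, so only one line is in memory
    at a time; the parse (split on commas) is just a placeholder.
    """
    records = []
    with open(path) as f:
        for line in f:
            records.append(line.rstrip("\n").split(","))
    return records
```

Even when the final list of parsed objects is large, each parsed line is usually much smaller than its raw text, and the process never needs one contiguous file-sized buffer.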
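The mmap suggestion can be sketched as follows. The operating system maps the file into the address space and pages data in on demand, so no single huge read() is needed; counting newlines is just a stand-in task I chose for illustration (it requires a non-empty file, since mmap rejects zero-length mappings):

```python
import mmap

def count_lines_mmap(path):
    """Count newlines in a file via mmap, without reading it all at once."""
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            count = 0
            pos = mm.find(b"\n")        # find() scans the mapping lazily
            while pos != -1:
                count += 1
                pos = mm.find(b"\n", pos + 1)
            return count
        finally:
            mm.close()
</imports>
```

Note the caveat from the thread still applies: on a 32-bit Python, the mapping must fit in the (fragmented) virtual address space, so mmap helps but is not a cure-all for multi-GB files there.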