2009/10/10 Xbox Muncher <xboxmunc...@gmail.com>:
> What does flush do technically?
> "Flush the internal buffer, like stdio's fflush(). This may be a no-op on
> some file-like objects."
>
> The reason I thought that closing the file after I've written about 500MB
> file data to it, was smart -> was because I thought that python stores that
> data in memory or keeps info about it somehow and only deletes this memory of
> it when I close the file.
> When I write to a file in 'wb' mode at 500 bytes at a time.. I see that the
> file size changes as I continue to add more data, maybe not in exact 500 byte
> sequences as my code logic but it becomes bigger as I make more iterations
> still.
>
> Seeing this, I know that the data is definitely being written pretty
> immediately to the file and not being held in memory for very long. Or is
> it...? Does it still keep it in this "internal buffer" if I don't close the
> file. If it does, then flush() is exactly what I need to free the internal
> buffer, which is what I was trying to do when I closed the file anyways...
>
> However, from your replies I take it that python doesn't store this data in
> an internal buffer and DOES immediately dispose of the data into the file
> itself (of course it still exists in variables I put it in). So, closing the
> file doesn't free up any more memory.

Python file I/O is buffered. That means there is a memory buffer that holds a small amount of the file as it is read or written. Your original example writes 5 bytes at a time. With unbuffered I/O, this would write to the disk on every call to write(). (The OS also has some buffering; I'm ignoring that.)

With buffered writes, a memory buffer is allocated to hold the data. The write() call just puts data into the buffer; when the buffer is full, it is written to the disk. This is a flush. Calling flush() forces the buffer to be written.

So, a few points about your questions:
- Calling flush() after each write() will cause a disk write. This is probably not what you want; it will slow down the output considerably.
- Calling flush() does not de-allocate the buffer, it just writes its contents. So calling flush() should not change the amount of memory used.
- The buffer is pretty small, maybe 8K or 32K. You can specify the buffer size as an argument to open(), but really you probably want the system default.

Kent
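
One way to see the buffering described above is to compare the size of the file on disk before and after flush(). This is only a rough sketch, not code from the thread: it assumes Python 3 and the default buffer size, and the helper name observe_buffering and the 5-byte chunk are made up for illustration.

```python
import os
import tempfile


def observe_buffering(chunk=b"12345"):
    """Return the on-disk file size before flush(), after flush(), and after close()."""
    # Create an empty temporary file and release the low-level handle,
    # so the file can be reopened with the ordinary buffered open().
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        f = open(path, "wb")  # default buffering (assumed to be a few KB)
        try:
            f.write(chunk)  # the data goes into Python's in-memory buffer
            size_before_flush = os.path.getsize(path)
            f.flush()  # force the buffer contents out to the file
            size_after_flush = os.path.getsize(path)
        finally:
            f.close()
        size_after_close = os.path.getsize(path)
    finally:
        os.remove(path)
    return size_before_flush, size_after_flush, size_after_close


sizes = observe_buffering()
print(sizes)
```

With a chunk this small, the first size should still be 0, because nothing has left the buffer yet; the other two should equal the chunk length. Writing more data than the buffer holds makes the file grow without any explicit flush(), which would explain the file size creeping up during a long run of small writes.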