What does flush() do technically? "Flush the internal buffer, like stdio's fflush(). This may be a no-op on some file-like objects."
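Here's a quick sketch of the buffering I'm asking about (the filename and the 1 MiB buffer size are made up for illustration; Python 2 assumed):

    import os

    f = open('demo.bin', 'wb', 1 << 20)  # third argument is the buffer size in bytes
    f.write(b'x' * 500)     # these 500 bytes may sit in Python's internal buffer...
    f.flush()               # ...until flush() hands them to the OS,
    os.fsync(f.fileno())    # and fsync() pushes the OS's own cache out to the disk
    f.close()

So flush() empties Python's buffer, while fsync() goes one level further down to the operating system's cache.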
The reason I thought that closing the file after I'd written about 500 MB of data to it was smart is that I thought Python stores that data in memory, or keeps some record of it, and only frees that memory when I close the file.

When I write to a file in 'wb' mode 500 bytes at a time, I see that the file size changes as I continue to add more data. Maybe not in exact 500-byte increments matching my code logic, but it gets bigger as I make more iterations. Seeing this, I know the data is definitely being written to the file pretty much immediately and not being held in memory for very long. Or is it...? Does Python still keep it in this "internal buffer" if I don't close the file? If it does, then flush() is exactly what I need to free the internal buffer, which is what I was trying to do by closing the file anyway.

However, from your replies I take it that Python doesn't hold this data in an internal buffer and DOES immediately dispose of the data into the file itself (of course it still exists in whatever variables I put it in). So closing the file doesn't free up any more memory.

On Sat, Oct 10, 2009 at 7:02 AM, Dave Angel <da...@ieee.org> wrote:

> xbmuncher wrote:
>
>> Which piece of code will conserve more memory?
>> I think that code #2 will, because I close the file more often, thus
>> freeing more memory by closing it.
>> Am I right in this thinking, or does it not save me any more bytes in
>> memory by closing the file often?
>> Sure, I realize that in my example it doesn't save much if it does, but
>> I'm dealing with writing large files, so every byte freed in memory
>> counts. Thanks.
>>
>> CODE #1:
>>
>> def getData(): return '12345' #5 bytes
>> f = open('file.ext', 'wb')
>> for i in range(2000):
>>     f.write(getData())
>>
>> f.close()
>>
>> CODE #2:
>>
>> def getData(): return '12345' #5 bytes
>> f = open('file.ext', 'wb')
>> for i in range(2000):
>>     f.write(getData())
>>     if i == 5:
>>         f.close()
>>         f = open('file.ext', 'ab')
>>         i = 1
>>     i = i + 1
>>
>> f.close()
>
> You don't save a noticeable amount of memory usage by closing and
> immediately reopening the file. The amount that the system buffers
> probably wouldn't depend on file size, in any case. When dealing with
> large files, the thing to watch is how much of the data you've got in
> your own lists and dictionaries, not how much the file subsystem and OS
> are using.
>
> But you have other issues in your code.
>
> 1) You don't say what version of Python you're using, so I'll assume
> it's version 2.x. If so, then range() is unnecessarily using a lot of
> memory. It builds a list of ints when an iterator would do just as well.
> Use xrange(). (In Python 3.x, xrange() was renamed to range().) This may
> not matter for small values, but as the number gets bigger, so does the
> amount of wastage.
>
> 2) By using the same local variable for the for loop as for your
> "should I close" counter, you're defeating the logic. As it stands, it'll
> only do the close() once. Either rename one of them, or do the simpler
> test:
>
>     if i % 5 == 0:
>         f.close()
>         f = open(...)
>
> 3) Close-and-reopen has three other effects. One, it's slow. Two,
> append mode isn't guaranteed by the C standard to always position at the
> end (!). And three, it flushes the data. That can be a very useful
> result, in case the computer crashes while spending a long time updating
> a file.
>
> I'd suggest sometimes doing a flush() call on the file, if you know
> you'll be spending a long time updating it.
> But I wouldn't bother closing it.
>
> DaveA
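For reference, here is roughly what Dave's suggestions look like combined into one loop (a sketch only, not his exact code; Python 2 assumed, and the flush interval of 500 is an arbitrary choice of mine):

    def getData():
        return '12345'  # the same 5-byte toy payload from the original post

    f = open('file.ext', 'wb')
    for i in xrange(2000):    # xrange() iterates without building a 2000-element list (Python 2)
        f.write(getData())
        if i % 500 == 0:      # arbitrary interval: flush periodically during long writes
            f.flush()         # push the internal buffer to the OS, no close/reopen needed
    f.close()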