Oh yeah, it's Python 2.6.

On Sat, Oct 10, 2009 at 10:32 AM, Xbox Muncher <xboxmunc...@gmail.com> wrote:
> What does flush do technically?
> "Flush the internal buffer, like stdio's fflush(). This may be a no-op on
> some file-like objects."
>
> The reason I thought that closing the file after I've written about 500MB
> file data to it, was smart -> was because I thought that python stores that
> data in memory or keeps info about it somehow and only deletes this memory
> of it when I close the file.
> When I write to a file in 'wb' mode at 500 bytes at a time.. I see that the
> file size changes as I continue to add more data, maybe not in exact 500
> byte sequences as my code logic but it becomes bigger as I make more
> iterations still.
>
> Seeing this, I know that the data is definitely being written pretty
> immediately to the file and not being held in memory for very long. Or is
> it...? Does it still keep it in this "internal buffer" if I don't close the
> file. If it does, then flush() is exactly what I need to free the internal
> buffer, which is what I was trying to do when I closed the file anyways...
>
> However, from your replies I take it that python doesn't store this data in
> an internal buffer and DOES immediately dispose of the data into the file
> itself (of course it still exists in variables I put it in). So, closing the
> file doesn't free up any more memory.
>
>
> On Sat, Oct 10, 2009 at 7:02 AM, Dave Angel <da...@ieee.org> wrote:
>
>> xbmuncher wrote:
>>
>>> Which piece of code will conserve more memory?
>>> I think that code #2 will because I close the file more often, thus
>>> freeing more memory by closing it.
>>> Am I right in this thinking... or does it not save me any more bytes in
>>> memory by closing the file often?
>>> Sure I realize that in my example it doesn't save much if it does... but
>>> I'm dealing with writing large files.. so every byte freed in memory
>>> counts.
>>> Thanks.
>>>
>>> CODE #1:
>>> def getData(): return '12345' #5 bytes
>>> f = open('file.ext', 'wb')
>>> for i in range(2000):
>>>     f.write(getData())
>>>
>>> f.close()
>>>
>>>
>>> CODE #2:
>>> def getData(): return '12345' #5 bytes
>>> f = open('file.ext', 'wb')
>>> for i in range(2000):
>>>     f.write(getData())
>>>     if i == 5:
>>>         f.close()
>>>         f = open('file.ext', 'ab')
>>>         i = 1
>>>     i = i + 1
>>>
>>> f.close()
>>>
>>>
>>>
>> You don't save a noticeable amount of memory usage by closing and
>> immediately reopening the file. The amount that the system buffers probably
>> wouldn't depend on file size, in any case. When dealing with large files,
>> the thing to watch is how much of the data you've got in your own lists and
>> dictionaries, not how much the file subsystem and OS are using.
>>
>> But you have other issues in your code.
>>
>> 1) you don't say what version of Python you're using. So I'll assume it's
>> version 2.x. If so, then range is unnecessarily using a lot of memory. It
>> builds a list of ints, when an iterator would do just as well. Use
>> xrange(). ( In Python 3.x, xrange() was renamed to be called range(). )
>> This may not matter for small values, but as the number gets bigger, so
>> would the amount of wastage.
>>
>> 2) By using the same local for the for loop as for your "should I close"
>> counter, you're defeating the logic. As it stands, it'll only do the
>> close() once. Either rename one of these, or do the simpler test, of
>>     if i%5 == 0:
>>         f.close()
>>         f = open....
>>
>> 3) Close and re-open has three other effects. One, it's slow. Two,
>> append-mode isn't guaranteed by the C standard to always position at the end
>> (!). And three, it flushes the data. That can be a very useful result, in
>> case the computer crashes while spending a long time updating a file.
>>
>> I'd suggest sometimes doing a flush() call on the file, if you know you'll
>> be spending a long time updating it. But I wouldn't bother closing it.
>>
>> DaveA
>>
>>
>>
>
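
For reference, here is a minimal sketch of the approach Dave suggests: keep the file open and call flush() every so often instead of closing and reopening it. The names get_data, write_chunks and flush_every are hypothetical and were picked only for this illustration; none of them come from the thread. The chunk is written as a bytes literal so the same code should run on both Python 2.6 and Python 3.

```python
import os


def get_data():
    # Hypothetical stand-in for the real data source; returns 5 bytes.
    return b'12345'


def write_chunks(path, n_chunks, flush_every=500):
    """Write n_chunks chunks to path, flushing every flush_every writes.

    Returns the final size of the file in bytes.
    """
    with open(path, 'wb') as f:
        # On Python 2, xrange() would avoid building a list here.
        for i in range(n_chunks):
            f.write(get_data())
            # The loop variable is only read, never reassigned, so this
            # test fires on every flush_every-th write.
            if (i + 1) % flush_every == 0:
                # Hands buffered data to the OS. It does not by itself
                # force the data onto the disk; os.fsync(f.fileno())
                # would be the extra step for that.
                f.flush()
    # Leaving the with-block closes the file, which flushes the rest.
    return os.path.getsize(path)
```

Calling write_chunks('file.ext', 2000) should leave a 10000-byte file, the same result as CODE #1, with the file opened and closed only once. How often to flush is a judgment call; 500 writes is an arbitrary value chosen for the sketch.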
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor