> Suppose we have a very large file, and wanna remove 'n' bytes in the
> middle of the file. My thought is:
> 1, read() until we reach the bytes should be removed, and mark the
> position as 'pos'.
> 2, seek(tell() + n) bytes
> 3, read() until we reach the end of the file, into a variable, say 'a'
> 4, seek(pos) back to 'pos'
> 5, write(a)
> 6, truncate()
>
> If the file is really large, the performance may be a problem.
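For reference, the six quoted steps could be sketched roughly as follows. This is only a minimal sketch, not the original poster's code: the name remove_bytes_naive is made up for illustration, it assumes "f" is a seekable file object opened in binary read/write mode (e.g. 'r+b'), and it seeks straight to the tail instead of read()ing up to 'pos', which should be equivalent for step 1.

```python
import io

def remove_bytes_naive(f, pos, n):
    """Hypothetical helper: remove n bytes starting at pos,
    following the quoted steps literally."""
    f.seek(pos + n)   # steps 1-2: jump just past the bytes to remove
    a = f.read()      # step 3: read the entire tail into memory
    f.seek(pos)       # step 4: go back to 'pos'
    f.write(a)        # step 5: write the tail over the removed bytes
    f.truncate()      # step 6: cut the file at the current position

if __name__ == '__main__':
    f = io.BytesIO(b'0123456789')
    remove_bytes_naive(f, 3, 4)
    print(f.getvalue())
```

Note that "a" holds everything after the removed region at once, so memory use grows with the size of the file.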
The biggest problem I see is step #3, which may read a massive amount of data. If you're dealing with a multi-gigabyte file and you want to delete 5 bytes beginning 20 bytes into the file, step #3 reads file_size-(20+5) bytes into memory and then spews them all back out.

A better way might involve reading a fixed-size chunk each time and then writing that back to its proper offset.

    def shift(f, offset, size, buffer_size=1024*1024):
        """Delete "size" bytes from file "f", starting at "offset",
        shifting the remainder of the file down to fill the gap.

        "f" must be open for reading and writing in binary mode.
        The buffer_size can be tweaked for performance preferences,
        defaulting to 1 megabyte.
        """
        read_pos = offset + size   # next byte to copy from
        write_pos = offset         # where that byte belongs
        while True:
            f.seek(read_pos)
            buffer = f.read(buffer_size)
            if not buffer:
                break
            f.seek(write_pos)
            f.write(buffer)
            # advance by what was actually read, which may be
            # less than buffer_size on the final chunk
            read_pos += len(buffer)
            write_pos += len(buffer)
        # cut the file off right after the last byte written
        f.truncate(write_pos)

    if __name__ == '__main__':
        from io import BytesIO
        offset = ord('p')
        size = 5
        buffer_size = 30
        f = BytesIO(bytes(bytearray(range(256))))
        print(repr(f.read()))
        print('=' * 50)
        f.seek(0)
        shift(f, offset, size, buffer_size)
        f.seek(0)
        print(repr(f.read()))

> Is there a clever way to finish? Could mmap() help? Thx

No idea regarding mmap().

-tkc
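P.S. On the mmap() question, here is one way it might be applied, offered as a sketch under assumptions and not as something benchmarked here. It relies on the standard library's mmap.move(dest, src, count) to slide the tail of the file down, then truncates once the mapping is closed. The function name remove_bytes_mmap and the choice to take a path are made up for illustration. Mapping a multi-gigabyte file may fail on a 32-bit build, and whether this beats the chunked loop is an open question.

```python
import mmap

def remove_bytes_mmap(path, offset, size):
    """Hypothetical helper: remove `size` bytes at `offset` from the
    file at `path` by moving the tail down inside a memory map."""
    with open(path, 'r+b') as f:
        f.seek(0, 2)                 # seek to the end to learn the length
        total = f.tell()
        if offset < 0 or size <= 0 or offset + size > total:
            raise ValueError('region to remove is outside the file')
        tail = total - (offset + size)   # bytes that follow the removed region
        with mmap.mmap(f.fileno(), 0) as m:   # length 0 maps the whole file
            m.move(offset, offset + size, tail)
            m.flush()
        # truncate only after the mapping is closed
        f.truncate(total - size)
```

The whole file is mapped but not read into a Python object, so memory pressure is left to the operating system's paging.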