Here are two parts of the doc on atomic commit behavior in SQLite, from http://sqlite.org/atomiccommit.html as retrieved on 2010-10-03.

Section 2.0, "Hardware Assumptions", states:

| SQLite does not assume that a sector write is atomic. However, it does
| assume that a sector write is linear. By "linear" we mean that SQLite
| assumes that when writing a sector, the hardware begins at one end of
| the data and writes byte by byte until it gets to the other end. The
| write might go from beginning to end or from end to beginning. If a
| power failure occurs in the middle of a sector write it might be that
| part of the sector was modified and another part was left
| unchanged. The key assumption by SQLite is that if any part of the
| sector gets changed, then either the first or the last bytes will be
| changed.

I interpret this to imply that if we have a sector S that contains the bitstring D and a system crash occurs while writing a different string D' to S, a subsequent read of S will return E such that E[j] = D[j] for all j where D[j] = D'[j]. In other words, bits that are not being changed do not get corrupted by a partial sector write. (This is a weaker statement than my interpretation of the above that the bits that get flipped will be in a contiguous region starting from one of the ends.)

However, in section 6.1, "Always Journal Complete Sectors", I see:

| It is important to store all pages of a sector in the rollback journal
| in order to prevent database corruption following a power loss while
| writing the sector. Suppose that pages 1, 2, 3, and 4 are all stored
| in sector 1 and that page 2 is modified. In order to write the changes
| to page 2, the underlying hardware must also rewrite the content of
| pages 1, 3, and 4 since the hardware must write the complete
| sector. If this write operation is interrupted by a power outage, one
| or more of the pages 1, 3, or 4 might be left with incorrect data.

This would seem to mean that my initial interpretation of the paragraph in section 2.0 is wrong, because it would imply that the other pages remain untouched.
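
To make the interpretation above concrete, here is a minimal Python sketch of the model I have in mind. It is only an illustration of my reading of section 2.0, not anything taken from SQLite: the function names, the 4-byte "pages", and the assumption that a crashed write leaves a clean prefix (or suffix) of new bytes followed by old bytes are all hypothetical.

```python
def linear_partial_write(old, new, cut, from_start=True):
    """Model a crashed linear sector write (my assumption, not SQLite code).

    `cut` bytes, counted from the end the hardware started at, already hold
    the new data; every other byte still holds the old data.
    """
    assert len(old) == len(new) and 0 <= cut <= len(old)
    if from_start:
        return new[:cut] + old[cut:]
    return old[:len(old) - cut] + new[len(new) - cut:]


def unchanged_bytes_preserved(old, new, seen):
    """Check E[j] == D[j] for every j where D[j] == D'[j]."""
    return all(s == o for o, n, s in zip(old, new, seen) if o == n)


# Hypothetical sector holding four tiny pages; only page 2 is modified.
old = b"1111" + b"2222" + b"3333" + b"4444"
new = b"1111" + b"2XY2" + b"3333" + b"4444"

# Try every crash point, writing from either end of the sector.
results = [
    unchanged_bytes_preserved(old, new, linear_partial_write(old, new, cut, d))
    for cut in range(len(old) + 1)
    for d in (True, False)
]
print(all(results))
```

Under this model the check holds at every crash point, so pages 1, 3, and 4 could never be damaged. Section 6.1 says they can be, so the real hardware must be allowed to expose states that this model excludes.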
If the identical page 1 is written back to the database file, then a partial page 2, and then a crash occurs, pages 1, 3, and 4 should all remain intact, and similarly if the write begins from the end or if the crash occurs in any other location. Is it that the parts that are "touched" by a linear write are potentially totally corrupted by a crash? Could anyone provide some clarification on what states the underlying OS+hardware stack is "allowed" to expose after a crashed sector write in order for the journaling mechanism to work properly?

I'm also curious as to the source data for these sorts of assumptions regarding what modern OS+hardware stacks do when rewriting sectors of a mass storage device. In particular, I haven't found good data on what results occur after power failures during rewriting sectors of a hard disk or "rewriting" sectors of a flash-based device, or similar information for filesystems that don't do in-place writes. Pointers would be appreciated.

   ---> Drake Wilson