On 30/04/13 22:35, Stephen Allen wrote:

> P.S.  Andy, please chime in where I've gotten something wrong (as I
> undoubtedly have, it's a pretty complex bit of the code)

The description is spot-on.

The only clarification is that the blocks in a transaction are always written to the file-based journal, and then the commit record is written. This is the true point at which a transaction commits: a single disk operation, the sync on the journal. The journal is then replayed to update the database. The in-memory blocks are not written directly, even if there is only one writer around.
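As a rough illustration of that ordering (blocks to the journal, then the commit record, then one sync, then replay), here is a minimal Python sketch. It is not the TDB code: the record layout, the names `journal_write` and `replay`, and the use of a dict to stand in for the database are all assumptions made for the example.

```python
import os
import struct

# Hypothetical record layout: type (1 byte), block id (4 bytes), payload length (4 bytes).
HEADER = struct.Struct(">BII")
BLOCK, COMMIT = 1, 2


def journal_write(journal_path, blocks):
    """Append every block of the transaction, then the commit record, then sync."""
    with open(journal_path, "ab") as j:
        for block_id, data in blocks.items():
            j.write(HEADER.pack(BLOCK, block_id, len(data)) + data)
        j.write(HEADER.pack(COMMIT, 0, 0))
        j.flush()
        os.fsync(j.fileno())  # the commit point: one sync on the journal


def replay(journal_path, db):
    """Apply to db (a dict standing in for the database) only committed transactions."""
    pending = {}
    with open(journal_path, "rb") as j:
        while True:
            head = j.read(HEADER.size)
            if len(head) < HEADER.size:
                break  # end of journal; blocks without a commit record are dropped
            kind, block_id, length = HEADER.unpack(head)
            if kind == COMMIT:
                db.update(pending)
                pending = {}
            else:
                data = j.read(length)
                if len(data) < length:
                    break  # torn block record, never committed
                pending[block_id] = data
    return db
```

Blocks that reach the journal without a following commit record are ignored on replay, which is why the sync after the commit record is the only step that decides whether the transaction happened.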

Long term, it would be good to move to a single-write transaction system where the data is written to the index files as append operations, not written to the journal and then to the indexes in-place. It is a significant change to the file formats but also to the way the B+Trees work, because currently they don't need to understand transactions, only that they are given a block manager (which is transactional).
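To make the single-write idea concrete, here is one possible shape, again as a Python sketch under stated assumptions rather than a proposed design. The class `AppendOnlyStore`, its record layout, and the in-memory offset index are all hypothetical; a real B+Tree would also have to append new versions of its interior nodes, which this sketch leaves out.

```python
import os
import struct

# Hypothetical record layout: type (1 byte), block id (4 bytes), payload length (4 bytes).
REC = struct.Struct(">BII")
BLOCK, COMMIT = 1, 2


class AppendOnlyStore:
    """Each block version is written once, at the end of the data file."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # block id -> (offset, length) of latest committed version
        open(path, "ab").close()  # create the file if it does not exist
        self._recover()

    def _recover(self):
        # Rebuild the index from committed records and cut off any uncommitted tail.
        pending, good = {}, 0
        with open(self.path, "r+b") as f:
            while True:
                head = f.read(REC.size)
                if len(head) < REC.size:
                    break
                kind, block_id, length = REC.unpack(head)
                if kind == COMMIT:
                    self.index.update(pending)
                    pending, good = {}, f.tell()
                else:
                    offset = f.tell()
                    if len(f.read(length)) < length:
                        break
                    pending[block_id] = (offset, length)
            f.truncate(good)

    def commit(self, blocks):
        # Append the new block versions, then the commit record, then sync once.
        pos = os.path.getsize(self.path)
        staged = {}
        with open(self.path, "ab") as f:
            for block_id, data in blocks.items():
                f.write(REC.pack(BLOCK, block_id, len(data)) + data)
                staged[block_id] = (pos + REC.size, len(data))
                pos += REC.size + len(data)
            f.write(REC.pack(COMMIT, 0, 0))
            f.flush()
            os.fsync(f.fileno())
        self.index.update(staged)

    def read(self, block_id):
        offset, length = self.index[block_id]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)
```

The point of the sketch is that nothing is copied a second time: old versions stay where they are, and the store itself has to know which versions are committed. That is the sense in which the structure on top can no longer rely on a transactional block manager alone.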

        Andy
