> > 2. Byte-granularity means that read-modify-write is necessary to append
> > entries to the journal. Therefore a failure could destroy previously
> > committed entries.
> >
> > Any ideas how existing journals handle this?
>
> You commit only whole blocks. So in this case we can consider a block
> only committed as soon as a TYPE_END entry has been written (and after
> that we won't touch it any more until the journalled changes have been
> flushed to disk).
>
> There's one "interesting" case: cache=writethrough. I'm not entirely
> sure yet what to do with it, but it's slow anyway, so using one block
> per entry and therefore flushing the journal very often might actually
> be not totally unreasonable.

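To make the whole-block commit rule concrete, here is a rough sketch in Python (not QEMU code). Only TYPE_END and the 4k block size come from the discussion above; the entry header layout, TYPE_DATA, and the helper names are assumptions made up for illustration. A block counts as committed only if a TYPE_END entry is found after its data entries, and replay stops at the first block without one.

```python
import struct

BLOCK_SIZE = 4096              # 4k block from the thread; could become a header field
TYPE_DATA = 1                  # hypothetical type for an ordinary journal entry
TYPE_END = 2                   # entry that marks the block as committed
HEADER = struct.Struct("<BH")  # assumed entry header: type (1 byte), payload length (2 bytes)


def build_block(payloads, commit=True):
    """Pack payloads into one zero-padded block, ending with TYPE_END if commit is set."""
    buf = bytearray()
    for p in payloads:
        buf += HEADER.pack(TYPE_DATA, len(p)) + p
    if commit:
        buf += HEADER.pack(TYPE_END, 0)
    if len(buf) > BLOCK_SIZE:
        raise ValueError("entries do not fit in one block")
    return bytes(buf.ljust(BLOCK_SIZE, b"\0"))


def parse_block(block):
    """Return the payloads of a committed block, or None if TYPE_END is missing."""
    payloads, off = [], 0
    while off + HEADER.size <= len(block):
        etype, length = HEADER.unpack_from(block, off)
        off += HEADER.size
        if etype == TYPE_END:
            return payloads
        if etype != TYPE_DATA or off + length > len(block):
            return None        # padding or garbage before TYPE_END: not committed
        payloads.append(block[off:off + length])
        off += length
    return None


def replay(journal):
    """Collect payloads from committed blocks, stopping at the first uncommitted one."""
    out = []
    for start in range(0, len(journal), BLOCK_SIZE):
        entries = parse_block(journal[start:start + BLOCK_SIZE])
        if entries is None:
            break
        out.extend(entries)
    return out
```

In this sketch a block is written once, in full, and never modified afterwards, so appending never needs a read-modify-write of committed data. The cost shows up in the cache=writethrough case: with one entry per block, N metadata updates mean N block writes and N flushes.
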
Using one block per entry would surely kill performance, because it would
mean one I/O per metadata update written to disk.

> Another thing I'm not sure about is whether a fixed 4k block is good or
> if we should leave it configurable. I don't think making it an option
> would hurt (not necessarily modifiable with qemu-img, but as a field
> in the file format).

I agree. I am also thinking about making the number of blocks flushed at
once configurable.

Benoît