Hello,

Thanks for your work, which seems promising! I have a number of comments
and questions.

- avoid splitting the patch in two when patch 2 cannot build without
  patch 3; you might as well submit a single patch, since it all goes together.

- have you benchmarked the overhead that is introduced?
- It seems that it is the writeback mode that is implemented? It would be 
  useful to say so in the header.
- For data blocks, I have seen in the jbd2 documentation that
  some escaping is needed when a data block happens to start
  with the htobe32(JBD2_MAGIC_NUMBER) magic number (see the sketch
  after the link).
  
https://www.kernel.org/doc/html/latest/filesystems/ext4/journal.html#data-block
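
In case it helps, here is a minimal sketch of the escaping rule as I
understand it from that documentation (journal_escape_block is just an
illustrative name, not something from your patch):

#include <stdint.h>
#include <string.h>
#include <endian.h>

#define JBD2_MAGIC_NUMBER 0xc03b3998U
#define JBD2_FLAG_ESCAPE  1   /* journal copy had its leading magic zeroed */

/* Prepare BUF (one file-system block) for being copied into the journal.
   Returns the tag flags to record in the descriptor block entry.  */
static uint32_t
journal_escape_block (void *buf)
{
  uint32_t magic = htobe32 (JBD2_MAGIC_NUMBER);

  if (memcmp (buf, &magic, sizeof magic) == 0)
    {
      /* The block starts with the journal magic: zero the first four
         bytes in the journal copy and flag it, so that replay knows to
         restore the magic before writing the block to its final place.  */
      memset (buf, 0, sizeof magic);
      return JBD2_FLAG_ESCAPE;
    }
  return 0;
}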

- there are two versions of the journal superblock (v1 vs v2), don't we
  need to support that somehow? A minimal version check is sketched below.
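
There is at least the h_blocktype field of the journal superblock to look
at; something along these lines (the header struct follows the on-disk
layout from the documentation above, the function name is illustrative):

#include <stdint.h>
#include <errno.h>
#include <endian.h>

#define JBD2_MAGIC_NUMBER  0xc03b3998U
#define JBD2_SUPERBLOCK_V1 3
#define JBD2_SUPERBLOCK_V2 4

/* Common header of every journal block, per the on-disk format.  */
struct journal_header
{
  uint32_t h_magic;
  uint32_t h_blocktype;
  uint32_t h_sequence;
};

/* For a v1 superblock, the feature fields are not meaningful and have to
   be treated as zero.  */
static error_t
journal_check_superblock_version (const struct journal_header *hdr, int *is_v2)
{
  if (be32toh (hdr->h_magic) != JBD2_MAGIC_NUMBER)
    return EINVAL;

  switch (be32toh (hdr->h_blocktype))
    {
    case JBD2_SUPERBLOCK_V1:
      *is_v2 = 0;
      return 0;
    case JBD2_SUPERBLOCK_V2:
      *is_v2 = 1;
      return 0;
    default:
      return EINVAL;
    }
}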

- It would be useful to reorder functions in the file, to have them in the
  order in which they happen:
  start, dirty, stop, commit, force.

- journal_read_block should pass out_buf to store_read for in-place filling,
  and only copy if store_read reallocated the buffer (which should very
  rarely happen in practice); see the store_read sketch after this list.
- why re-read the journal superblock in journal_update_superblock?
  There is no reason why it would change; you can read it once at mount
  time and just write the new version.
- the yield() call will lead to performance trouble :) The scheduler is
  really not forced to actually yield; better use a condition variable,
  which will behave exactly as you need (see the condition variable
  sketch after this list).
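
Regarding the yield() point, what I have in mind is the usual condition
variable pattern, roughly this (all names are illustrative):

#include <pthread.h>

static pthread_mutex_t journal_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t commit_done = PTHREAD_COND_INITIALIZER;
static int commit_in_progress;

/* Callers that currently spin on yield() just sleep here instead.  */
static void
wait_for_commit (void)
{
  pthread_mutex_lock (&journal_lock);
  while (commit_in_progress)
    pthread_cond_wait (&commit_done, &journal_lock);
  pthread_mutex_unlock (&journal_lock);
}

/* The committing thread wakes everybody up once it is done.  */
static void
finish_commit (void)
{
  pthread_mutex_lock (&journal_lock);
  commit_in_progress = 0;
  pthread_cond_broadcast (&commit_done);
  pthread_mutex_unlock (&journal_lock);
}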
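
Regarding journal_read_block above, here is roughly what I mean, to be
adapted to the actual parameters in your patch; store_read fills the
caller's buffer in place when it can, and only hands back a newly mapped
buffer otherwise:

#include <hurd/store.h>
#include <string.h>
#include <sys/mman.h>

/* Read one journal block at BLOCK (in store blocks) into OUT_BUF, which
   must be BLOCK_SIZE bytes.  */
static error_t
journal_read_block (struct store *store, store_offset_t block,
                    size_t block_size, void *out_buf)
{
  void *buf = out_buf;
  size_t len = block_size;
  error_t err;

  err = store_read (store, block, block_size, &buf, &len);
  if (err)
    return err;

  if (buf != out_buf)
    {
      /* store_read had to allocate a fresh buffer; copy and drop it.  */
      memcpy (out_buf, buf, block_size);
      munmap (buf, len);
    }
  return 0;
}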

- diskfs_notify_change: is that the only place where the call should be?
  It seems there are many more if (diskfs_synchronous) diskfs_node_update()
  calls.

- About the journal_force_checkpoint 5-second loop: we would probably want
  to combine that with the periodic_sync_thread? A rough sketch follows.
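
Roughly what I have in mind, with pseudo-names to be adapted to the actual
code (and the ordering between the two calls to be thought through):

#include <unistd.h>
#include <hurd/diskfs.h>

void journal_force_checkpoint (void);   /* from your patch, signature assumed */

/* Let the periodic sync thread also checkpoint the journal, instead of
   running a separate 5-second loop for it.  */
static void *
periodic_sync_thread (void *arg)
{
  int interval = *(int *) arg;

  for (;;)
    {
      sleep (interval);
      diskfs_sync_everything (0);
      journal_force_checkpoint ();
    }
  return NULL;
}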

- More generally, it is not clear that you are respecting the ordering
  of journal writes vs normal writes. AIUI the principle is that
  we write to the journal and flush that *before* writing anything
  normally, so that either we have the old version only, or we also have
  the journal entry which we can replay, or we have the new version, and
  no mixture of these.
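
To make the expected ordering concrete, this is the kind of sequence I
would expect a commit to follow; all the helpers here are hypothetical
placeholders for whatever the patch actually uses:

#include <errno.h>

struct transaction;   /* whatever the patch's transaction state is */

/* Hypothetical helpers standing in for the patch's own functions.  */
error_t journal_write_blocks (struct transaction *t);
error_t journal_write_commit_record (struct transaction *t);
error_t journal_flush_to_disk (void);
error_t writeback_transaction_blocks (struct transaction *t);

static error_t
commit_transaction (struct transaction *t)
{
  error_t err;

  /* 1. Write the descriptor and data blocks into the journal area.  */
  err = journal_write_blocks (t);
  if (err)
    return err;

  /* 2. Make sure they are on disk before the commit record.  */
  err = journal_flush_to_disk ();
  if (err)
    return err;

  /* 3. Write and flush the commit record; only now is the transaction
     durable and replayable.  */
  err = journal_write_commit_record (t);
  if (!err)
    err = journal_flush_to_disk ();
  if (err)
    return err;

  /* 4. Only after that may the blocks go to their home locations, so
     recovery always sees either the old contents or a replayable
     journal entry, never a mixture.  */
  return writeback_transaction_blocks (t);
}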

- The hardcoded 50 value in journal_commit_transaction seems a bit ugly :)
  We'd probably want to put a lower limit, but on a fast system 50 might not
  be enough for proper overlapping for performance; we'd probably rather
  compute a value based on the journal size (a possible computation is
  sketched after this list).
- journal_transaction_commit may return E2BIG; we should detect that
  earlier and split the transaction.
- there is an ext2_panic call when the transaction is too big for the
  journal; can that actually happen? We should probably check for that
  early, in journal_dirty_block?
- It's not clear to me how the journal is getting flushed. The only thing I
  have seen is that when the journal gets full, we call
  journal_sync_everything, which syncs everything.
  AIUI we should normally be flushing the log progressively, as the normal
  writes get done?
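
For the batch size, something like this instead of the hardcoded 50 (the
journal_blocks parameter is hypothetical, use whatever your journal state
records about the journal size):

/* Derive the number of blocks to submit per batch from the journal size,
   with a small floor so tiny journals still make progress.  */
static unsigned int
journal_batch_size (unsigned int journal_blocks)
{
  unsigned int batch = journal_blocks / 4;

  if (batch < 32)
    batch = 32;
  return batch;
}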

Samuel
