Milos Nikic, on Mon, 21 Jul 2025 11:38:00 -0700, wrote:
> > Which kind of operations is spamming? As I mentioned, we most probably
> > want to implement relatime, that'll be useful to avoid many writes
> > anyway.
>
> Mainly `utime` updates to `/dev/null` and `/dev/random`.

Which would be caught by relatime.

« Access time is only updated if the previous access time was earlier
than or equal to the current modify or change time. »

Better take the time to implement that, since that'll save the
corresponding inode writes too.

> > Better use the ext3/4 native way of allocating blocks for the journal.
>
> That’s exactly what I’d like to do next, but I’m not sure how to get there in
> this context. Would this involve allocating blocks outside the main filesystem
> namespace via libstore? Any pointers or examples would be really appreciated.

No, it's still in the disk storage. It's just that ext3 has a way to
reserve blocks for the journal. I don't know a reference for this but it
should be easy to find.

> > Does the normal path lookup not work? At worst by rearranging some code
> > to provide an internal version not meant for RPCs.
>
> That’s the trick: the issue isn’t how, but *when*.
> The journal contains information from before the crash, but after reboot,
> we’re walking a post-crash live filesystem. If we try to resolve inode paths
> at boot, we might end up with mismatches, or restoring paths that no longer
> make sense.

But the journal is supposed to be in an order that makes sense
sequentially. Again, better check how ext3/4/jbd are doing it, rather than
trying to re-invent them.

> One additional note: while testing I have discovered that the filesystem
> remains read-only at that early point and it only stops being read-only after
> the RPCs come online.
> If I just call diskfs_node_update that early (as I do in the patch) it
> silently has no effect (!!!)

You probably just want to set diskfs_readonly = 0 while playing the
journal, and reset it to what it was (as asked on the command line etc.)
just before unleashing RPCs.

> On the other hand, once RPCs are up, trying to walk the FS to replay changes
> risks deadlocks.

Sure, you don't want that.

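To make the diskfs_readonly suggestion above concrete, here is a minimal
sketch of the save/clear/restore pattern. It is only an illustration: the
variable below is a local stand-in for the libdiskfs flag so that the
snippet compiles on its own, and replay_journal() and
replay_journal_writable() are hypothetical names, not existing functions.
It also assumes the flag is the only thing keeping the writes from going
through, which would need checking.

```c
#include <errno.h>

/* Stand-in for the libdiskfs global of the same name, so that this
   sketch is self-contained.  A real translator would use the
   declaration from <hurd/diskfs.h> instead.  */
int diskfs_readonly = 1;

/* Hypothetical replay routine.  Here it only models the fact that
   writes are refused while the filesystem is read-only.  */
int
replay_journal (void)
{
  if (diskfs_readonly)
    return EROFS;
  /* ... write the journaled blocks back, in journal order ... */
  return 0;
}

/* Hypothetical wrapper: make the filesystem writable for the duration
   of the replay, then put the flag back to what it was.  Meant to be
   called before any RPC is being served.  */
int
replay_journal_writable (void)
{
  int saved_readonly = diskfs_readonly;
  int err;

  diskfs_readonly = 0;
  err = replay_journal ();
  diskfs_readonly = saved_readonly;

  return err;
}
```

Saving the old value rather than hard-coding it matters: the flag has to
come back to whatever the command line asked for, be it read-only or
writable.
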
> It feels like journaling recovery needs to happen in a carefully coordinated
> phase, perhaps a new pre-init mode, or deeper integration with `diskfs`
> itself.

Yes. Feel free to add hooks if libdiskfs doesn't have what you need.

Samuel
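
PS: for the relatime part, the rule quoted above boils down to a
comparison of timestamps. A minimal sketch, assuming the three times are
available as struct timespec; relatime_wants_atime_update() and
timespec_le() are hypothetical names, not existing libdiskfs functions.

```c
#include <stdbool.h>
#include <time.h>

/* Return true if A is earlier than or equal to B.  */
static bool
timespec_le (const struct timespec *a, const struct timespec *b)
{
  if (a->tv_sec != b->tv_sec)
    return a->tv_sec < b->tv_sec;
  return a->tv_nsec <= b->tv_nsec;
}

/* Hypothetical helper implementing only the quoted rule: the access
   time is worth updating if the previous access time is earlier than
   or equal to the modification time or the change time.  */
bool
relatime_wants_atime_update (const struct timespec *atim,
                             const struct timespec *mtim,
                             const struct timespec *ctim)
{
  return timespec_le (atim, mtim) || timespec_le (atim, ctim);
}
```

With this, repeated reads of a node whose mtime and ctime do not change
trigger at most one atime update, so the inode does not need to be
written back each time. Linux additionally refreshes an atime that is
more than a day old; that part is left out here.
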