Hello,

Milos Nikic, on Tue 17 Feb 2026 11:00:34 -0800, wrote:
> Ok let me maybe explain myself better and how I understand what is going on.

Ok, but my point is that it's in the source code that this should be explained :)

> This actually helps with the fact that things are first in the journal and
> only then in the file system.

"Helping" is not enough :)

> Yes, journal_block_is_active function is the bulwark against filesystem writes
> happening before the journal.

And thus it definitely needs to be documented in the source code itself, so readers get it easily.

> I added logic into the ext2 pager to notify the journal when it is writing
> blocks. Now the journal keeps track of which committed transactions it can
> "retire" and progress the superblock tail.

Cool :)

> The Issue: There are files and blocks (/dev/null, /tmp folder,
> /tmp/.X11-unix, /var/log and some others) that seem to get hammered a
> lot with metadata updates (mostly timestamps), [...]
> It seems to me most of these are just access time updates. One idea would be
> to simply ignore atime updates in the journal logic so we don't wait for them?

/var/log is expected for data, but e.g. /dev/null is *really* not expected. Normally, the relatime option should already be taking care of updating atime only once a day per file when it's already younger than mtime/ctime. If it's not, we should really fix it; we have no reason to write that often.

> yet the ext2 pager never seems to write them back.

For translators, it is expected that no data is written. But we still shouldn't need to update the time; that's a bug that should be fixed.

Milos Nikic, on Tue 17 Feb 2026 19:57:33 -0800, wrote:
> Yes some "files" like /dev/null are a translators and can be excluded based on
> mode alone. (whether that is good idea, is a separate question)

We shouldn't have to exclude explicitly, relatime should be enough.

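To make the relatime expectation above concrete, here is a minimal sketch of the decision it implies. The function name, parameter names and the one-day constant are assumptions for illustration only; this is not code from ext2fs or libdiskfs, just the usual relatime rule as I understand it.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical constant: refresh atime at most once per day.  */
#define SKETCH_RELATIME_MAX_AGE (24 * 60 * 60)

/* Hypothetical helper (not an existing ext2fs/libdiskfs function):
   return true if an access at time NOW should update the on-disk atime
   under relatime semantics.  */
static bool
sketch_relatime_needs_update (time_t atime, time_t mtime, time_t chgtime,
                              time_t now)
{
  /* atime is not newer than the last data or status change: update it
     once, so that "accessed since last modified" stays detectable.  */
  if (atime <= mtime || atime <= chgtime)
    return true;

  /* atime is already newer than mtime/ctime: only refresh it when it is
     at least one day old.  */
  if (now - atime >= SKETCH_RELATIME_MAX_AGE)
    return true;

  /* Otherwise the access does not need to dirty the inode at all.  */
  return false;
}
```

With a check like this, repeated reads of an unchanged node should dirty its inode at most once a day, so frequent timestamp-only updates on something like /dev/null would point at a bug elsewhere.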
> But there are other files that look perfectly ordinary:
> For instance:
> /etc/resolv.conf
> or
> /tmp/.X11-unix
> /tmp/.ICE-unix
>
> Occasionally their mode drops to 0, but overall these files are regular files
> (not translators) that for some reason isn't handled by the ext2 pager.
>
> And its not all atime updates either...there are "other" updates as well.
> This is all early boot though, but i still don't understand why isn't ext2
> pager handling them.

Possibly there's a bug to fix in there.

> To my mind comes a few things, if we want to pursue them:
> 1) Aggressive Filtering (The "Strict Lazy" approach)
>    Logic: If !S_ISREG(mode) && !S_ISDIR(mode), ignore ALL timestamp-only
>    updates. Only journal if mode/uid/size changes.
>    Pros: Likely solves the issue completely.
>    Cons: Potentially risky if a file transitions states (e.g., git temp
>    files) or if we miss legitimate metadata updates on special nodes.

Yeah, we don't want that.

> 2) Active Checkpointing (The "Sweeper")
>    Logic: If a transaction is stuck waiting for blocks, the journal thread
>    explicitly calls store_write for those blocks, bypassing the Pager's
>    dirty-check.
>    Pros: Guarantees consistency.
>    Cons: High complexity. It fights the Pager's logic and seems like a large
>    architectural change.

We don't want to paper over what looks like a pager bug. We want to fix the pager.

> 3) Perhaps just abandon block by block tail advancement idea for now, and
>    revert to "flush when almost full" approach which works well.

If the scenario doesn't happen too often, the flush-when-almost-full approach can stay alongside the progressive flush.

Samuel
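For reference, the combination discussed in the last point (progressive tail advancement, with flush-when-almost-full kept as a fallback) could look roughly like the sketch below. All names, the fixed-size array and the 3/4 threshold are hypothetical and chosen only for illustration; the actual journal code in the patch may be organised quite differently.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_TXNS 8

/* Hypothetical bookkeeping for one committed transaction.  */
struct txn_sketch
{
  unsigned pending_blocks;   /* blocks the pager has not yet written back */
};

/* Hypothetical journal state: committed transactions, oldest first.  */
struct journal_sketch
{
  struct txn_sketch txns[SKETCH_MAX_TXNS];
  size_t ntxns;              /* number of committed transactions recorded */
  size_t tail;               /* oldest transaction that is not yet retired */
};

/* Called when the pager reports that it wrote one block belonging to
   transaction IDX.  Transactions are retired strictly in order, so one
   block that is never written back keeps the tail stuck.  */
static void
journal_sketch_block_written (struct journal_sketch *j, size_t idx)
{
  if (idx < j->ntxns && j->txns[idx].pending_blocks > 0)
    j->txns[idx].pending_blocks--;

  while (j->tail < j->ntxns && j->txns[j->tail].pending_blocks == 0)
    j->tail++;
}

/* Fallback: ask for a full flush once the journal is at least 3/4 full
   (threshold is an assumption).  */
static bool
journal_sketch_needs_full_flush (size_t used_blocks, size_t total_blocks)
{
  return used_blocks * 4 >= total_blocks * 3;
}
```

The in-order loop is where the problem from this thread would show up: if the pager never reports a block of the oldest transaction, the tail stays put and only the fallback frees space.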
