On Sun, Apr 9, 2023 at 11:05 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Googling finds a lot of suggestions that O_DIRECT doesn't play nice
> with btrfs, for example
>
> https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg92824.html
>
> It's not clear to me how much of that lore is still current,
> but it's disturbing.
I think that particular thing might relate to modifications of the
user buffer while a write is in progress (breaking btrfs's internal
checksums). I don't think we should ever do that ourselves (not least
because it'd break our own checksums). We lock the page during the
write so no one can do that, and then we sleep in a synchronous
syscall.

Here's something recent. I guess it's probably not relevant (a fault
on our buffer that we recently touched sounds pretty unlikely), but
who knows... (developer lists for file systems are truly terrifying
places to drive through).

https://lore.kernel.org/linux-btrfs/20230315195231.gw10...@twin.jikos.cz/T/

It's odd, though, if it is their bug and not ours: I'd expect our
friends in other databases to have hit all that sort of thing years
ago, since many comparable systems have a direct I/O knob*. What are
we doing differently? Are our multiple processes a factor here,
breaking some coherency logic? Unsurprisingly, having compression on
as Andrew does actually involves buffering anyway[1] despite our
O_DIRECT flag, but maybe that's saying writes are buffered but reads
are still direct (?), which sounds like the sort of initial conditions
that might produce a coherency bug. I dunno.

I gather that btrfs is actually Fedora's default file system (or maybe
it's just "laptops and desktops"[2]?). I wonder if any of the several
green Fedora systems in the 'farm are using btrfs. I wonder if they
are using different mount options (thinking again of compression).

*Probably a good reason to add a more prominent warning that the
feature is developer-only, experimental and not for production use.
I'm thinking a warning at startup or something.

[1] https://btrfs.readthedocs.io/en/latest/Compression.html
[2] https://fedoraproject.org/wiki/Changes/BtrfsByDefault
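
PS: For anyone who wants to check which mount options a given machine
is using, here is a minimal sketch (not anything from the tree) that
reads /proc/mounts-style text and reports whether a data directory
sits on btrfs with a compress* mount option. The function names and
the example paths are made up for illustration, and it assumes Linux.
It only looks at mount options, so compression enabled per file or per
directory through btrfs properties would not show up, and it does not
resolve symlinks in the path.

```python
import os


def parse_mounts(text):
    """Parse /proc/mounts-style text into (mount_point, fstype, options)."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        # Fields: device, mount point, fs type, comma-separated options, ...
        # The kernel escapes spaces in mount points as \040.
        mount_point = fields[1].replace("\\040", " ")
        entries.append((mount_point, fields[2], fields[3].split(",")))
    return entries


def find_mount(path, entries):
    """Return the entry whose mount point is the longest prefix of path."""
    path = os.path.abspath(path)
    best = None
    for entry in entries:
        mp = entry[0]
        prefix = mp if mp.endswith("/") else mp + "/"
        if path == mp or path.startswith(prefix):
            # ">=" so that a later mount over the same point wins.
            if best is None or len(mp) >= len(best[0]):
                best = entry
    return best


def btrfs_compression(path, mounts_text):
    """Return None if path is not on btrfs, "" if it is on btrfs with no
    compress* mount option, else the option (e.g. "compress=zstd:3")."""
    entry = find_mount(path, parse_mounts(mounts_text))
    if entry is None or entry[1] != "btrfs":
        return None
    for opt in entry[2]:
        if opt.startswith("compress"):
            return opt
    return ""
```

On a live system it could be run as
btrfs_compression("/path/to/pgdata", open("/proc/mounts").read()),
with the path replaced by the real data directory.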