On Sun, Oct 22, 2023 at 09:40:53AM -0400, Greg Troxel wrote:
> mlel...@serpens.de (Michael van Elst) writes:
>
> > g...@lexort.com (Greg Troxel) writes:
> >
> >>> vnd opens the backing file when the unit is created and closes
> >>> the backing file when the unit is destroyed. Then you can access
> >>> the file again.
> >
> >>Is there a guarantee of cache consistency for writes before and reads
> >>after?
> >
> > Before the unit is created you can access the file and after the
> > unit is destroyed you can access the file. That's always safe.
>
> Sorry if I'm failing to understand something obvious, but with a caching
> layer that has file contents, how are the cache contents invalidated?
>
> Specifically (but loosely in commands)
>
>   let's assume the vnd is small and there is a lot of RAM available
>
>   process opens the file and reads it
>
>   vnconfig
>
>   mount vnd0 /mnt
>
>   date > /mnt/somefile
>
>   umount /mnt
>
>   vnconfig -u
>
>   process opens the file and reads it
>
> Without fs cache invalidation, stale data can be returned.
>
> If there is explicit invalidation, it would be nice to say that
> precisely but I am not understanding that it is there. Reading vnd.c, I
> don't see any cache invalidation on detach. The only explicit
> invalidation I find is in setcred from VNDIOCSET.
>
> I guess that prevents the above, but doesn't prevent
>
>   vnconfig
>
>   mount
>
>   read backing file
>
>   write to mount
>
>   unmount
>
>   detach
>
>   read backing file
>
> so maybe we need a vinvalbuf on detach?
>
> > I also think that when the unit is configured but not opened
> > (by device access or mounts) it is safe to access the file.
>
> As I read the code, reads are ok but will leave possibly stale data in
> the cache for post-close.
>
> >>> The data is written directly to the allocated blocks of the file.
> >>> So exclusively opening the backing file _or_ the vnd unit should
> >>> also be safe. But that's not much different from accessing any file
> >>> concurrently, which also leads to "corrupt", inconsistent backups.
> >
> >>That's a different kind of corrupt.
> >
> > Yes, but in the end it's the same, the "backup" isn't usable.
>
> I am expecting that after deconfiguring, a read of the entire file is
> guaranteed consistent, but I think we're missing invalidate on close.
>
> > You cannot access the backing file to get a consistent state of the
> > data while a unit is in use. And that's independent of how vnd accesses
> > the bits.
>
> Agreed; that's more or less like using a backup program on database
> files while the database is running.
>
> > N.B. if you want to talk about dangers, think about fdiscard(). I
> > doubt that it is safe in the context of the vnd optimization.
>
> It seems clear that pretty much any file operations are unsafe while the
> vnd is configured. That seems like an entirely reasonable situation and
> if that's the rule, easy to document.
>
> I wrote a test script and it shows that stale reads happen. When I run
> this on UFS2 (netbsd-10), I find that all 4 files are all zero. When I
> run it on zfs (also netbsd-10), I find that 000 and 001 are all zero and
> 002 and 003 are the same. (I am guessing that zfs doesn't use the
> direct operations, or caches differently; here I haven't the slightest
> idea what is happening.)
>
> 10 minutes later, reading VND is still all zeros. With a new vnconfig,
> it still reads as all zeros.
>
>
> #!/bin/sh
>
> dd if=/dev/zero of=VND bs=1m count=1
> cat VND > VND.000
> vnconfig vnd0 VND
> cat VND > VND.001
> newfs /dev/rvnd0a
> cat VND > VND.002
> vnconfig -u vnd0
> cat VND > VND.003
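
For checking the snapshots that the quoted script leaves behind, something
like the following might do. It is only a minimal sketch, assuming the
VND.000 through VND.003 files sit in the current directory; is_all_zero is
a helper name made up here, not anything from the script or from NetBSD.

```shell
#!/bin/sh

# Hypothetical helper (not part of the quoted script): succeeds if the
# named file contains nothing but NUL bytes.
is_all_zero() {
    # Delete every NUL byte, then count whatever bytes are left over.
    [ "$(tr -d '\000' < "$1" | wc -c | tr -d ' ')" -eq 0 ]
}

# Report on each snapshot; files that do not exist are skipped.
for f in VND.000 VND.001 VND.002 VND.003; do
    [ -f "$f" ] || continue
    if is_all_zero "$f"; then
        echo "$f: all zero"
    else
        echo "$f: has non-zero data"
    fi
done
```

Note that an empty file also counts as all zero with this check.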

At least this DTRT:

dd if=VND of=VND.004 iflag=direct

This reminds me of the way AIX handled O_DIRECT vs mmap, etc. From
https://www.ibm.com/docs/en/aix/7.2?topic=tuning-direct-io:

  To avoid consistency issues, if there are multiple calls to open a
  file and one or more of the calls did not specify O_DIRECT and another
  open specified O_DIRECT, the file stays in the normal cached I/O mode.
  Similarly, if the file is mapped into memory through the shmat() or
  mmap() system calls, it stays in normal cached mode. If the last
  conflicting, non-direct access is eliminated, then the file system
  will move the file into direct I/O mode (either by using the close(),
  munmap(), or shmdt() subroutines). Changing from normal mode to direct
  I/O mode can be expensive because all modified pages in memory will
  have to be flushed to disk at that point.

An elegant, albeit complex, solution.

-- 
Paul Ripke
"Great minds discuss ideas, average minds discuss events, small minds
 discuss people."
-- Disputed: Often attributed to Eleanor Roosevelt. 1948.