Thor Lancelot Simon <t...@panix.com> writes:

> On Sun, Jul 07, 2024 at 02:07:40PM -0400, Greg Troxel wrote:
>> I ran into a test failure with bup, where it was restoring a sparse file
>> and trying to validate the resulting disk usage. It turns out that on
>> zfs (NetBSD 10), when you write a file, it shows as using 1 block and
>> then some seconds later shows as using the right amount.
>
> When you say "validate the resulting disk usage" and "using 1 block" what
> do you mean, exactly? If the file is sparse, I can't see how there's any
> bug unless the wrong st_size is returned by stat() or the wrong length
> returned by lseek().

First, in actual operation, bup just does fs ops and there is no issue.
This is a test case, to see if backing up and restoring a sparse file
results in a sparse file. I realize that doing this 100% right probably
requires a logging fuse driver and a lot of complexity.

What I have been doing as a proxy is the script below, which skips 10 MB
and writes 1 MB. Since the file is sparse, one would expect about 1 MB
of usage, not 11 MB, and not 1 block. Yes, this is not 100% reliable,
but the point is to catch regressions where it results in 11 MB, or
where the test itself is broken. So there is a check:

  is the amount of space at least the data we actually wrote?
  is it well less than the sparse file's nominal length?

and this does not seem unreasonable, even if it is not 100% sound.

> du counts allocated blocks as reported by stat(). A sparse file might
> legitimately report 0, 1, or any other value, even values that exceed
> (st_size / st_blksize). And the number of allocated blocks can absolutely

Yes, but a sparse file with 10 MB seeked over and 1 MB of legit urandom
data more or less has to take up more than 1 block.

> change even while st_size stays the same - consider a filesystem with
> background deduplication or compression, both of which some variants of
> ZFS have, but ZFS is not the only filesystem with these features.

Sure, I get that zfs makes this hard.

> If bup is relying on some particular block allocation behavior, that seems
> like a bug.

It is only tests, trying to catch problems. The actual operation tries
to rely only on POSIX.
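
For anyone who wants to poke at this on their own filesystem, here is a
rough Python sketch of the two-question check described above. It is not
the script from the bup test suite; the function names, the return
strings, and the "halfway between written and nominal" cutoff for "well
less" are all my own assumptions, picked only for illustration. It also
assumes a Unix-like system where stat() exposes st_blocks.

```python
import os
import tempfile

MB = 1024 * 1024


def make_sparse_file(path, hole=10 * MB, data=1 * MB):
    """Seek past `hole` bytes, then write `data` bytes of random data."""
    with open(path, "wb") as f:
        f.seek(hole)                # nothing is written for the skipped range
        f.write(os.urandom(data))   # random data, so it should not compress


def allocated_bytes(path):
    """Allocated space as reported by stat().

    Python documents st_blocks as a count of 512-byte blocks; the value
    is whatever the filesystem reports at this moment and may change later.
    """
    return os.stat(path).st_blocks * 512


def classify_usage(allocated, written, nominal):
    """Apply the two questions from the test (hypothetical thresholds).

    1. Is the allocated space at least the data actually written?
    2. Is it well below the nominal length? "Well below" is taken here,
       as an assumption, to mean below the midpoint of written and nominal.
    """
    if allocated < written:
        return "under-reported"
    if allocated >= (written + nominal) // 2:
        return "not sparse"
    return "sparse"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse-test")
        make_sparse_file(path)
        nominal = os.stat(path).st_size
        verdict = classify_usage(allocated_bytes(path), 1 * MB, nominal)
        print(nominal, allocated_bytes(path), verdict)
```

On a filesystem that reports 1 block right after the write, this would
print "under-reported" at first and could print "sparse" if run again a
few seconds later, which is the behavior that tripped the test. The
verdict is only a heuristic over st_blocks, so it is no more sound than
the test it imitates.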