On Thu, Jul 26, 2018 at 7:58 PM Kevin Wolf <kw...@redhat.com> wrote: > Am 26.07.2018 um 18:46 hat Nir Soffer geschrieben: > > On Thu, Jul 26, 2018 at 7:10 PM Eric Blake <ebl...@redhat.com> wrote: > > > > > On 07/26/2018 10:33 AM, Kevin Wolf wrote: > > > > > > > > > > > As far as I know, the comment you quoted is accurate for BLKDISCARD > and > > > > BLKZEROOUT, but not for the fallocate() flags. > > > > > > > > > > I sure wish the man pages were more explicit on what guarantees each > > > flag offers. > > > > > > >> Hmm - that thread also mentions FALLOC_FL_NO_HIDE_STALE, which is a > new > > > flag > > > >> not present/documented on Fedora 28. I wonder if it helps, too. > > > >> > > > >>> > > > >>> FALLOC_FL_ZERO_RANGE in contrast implements write_zeroes without > unmap. > > > >> > > > >> I thought the opposite: FALLOC_FL_ZERO_RANGE guarantees that you > read > > > back > > > >> 0, using whatever is most efficient under the hood (in the case of > block > > > >> devices, unmapping that reliably reads back as zero is favored). > > > > > > > > See the code I quoted above, FALLOC_FL_ZERO_RANGE calls > > > > blkdev_issue_zeroout() with BLKDEV_ZERO_NOUNMAP internally. > > > > > > Having to resort to reading the kernel code to learn what the > guarantees > > > are is annoying (it's nice that GPL guarantees that we CAN do that, but > > > it means the man pages could use some TLC so that we don't have to). > > > Especially since the kernel code has changed over time. > > > > > > But your paste of current kernel code is rather hard to argue against. > > > > I don't think we should depend on undocumented kernel code. > > Yes, the man page says "filesystem blocks" instead of just "blocks". If > you're in a nit-picking mood, that excludes block devices. But I think > it's pretty obvious what the intended meaning is and how it consistently > extends to block devices, and the code confirms it. > > That's by far not "undocumented kernel code" in my book. >
Maybe it is more correct to say this is poorly documented. The manual is very specific about supported filesystems, but there nothing about block devices: Deallocating file space Specifying the FALLOC_FL_PUNCH_HOLE flag (available since Linux 2.6.38) in mode deallocates space (i.e., creates a hole) in the byte range starting at offset and continuing for len bytes. Within the specified range, partial filesystem blocks are zeroed, and whole filesystem blocks are removed from the file. After a successful call, subsequent reads from this range will return zeroes. The FALLOC_FL_PUNCH_HOLE flag must be ORed with FALLOC_FL_KEEP_SIZE in mode; in other words, even when punching off the end of the file, the file size (as reported by stat(2)) does not change. Not all filesystems support FALLOC_FL_PUNCH_HOLE; if a filesystem doesn't support the operation, an error is returned. The operation is supported on at least the following filesystems: * XFS (since Linux 2.6.38) * ext4 (since Linux 3.0) * Btrfs (since Linux 3.7) * tmpfs(5) (since Linux 3.5) I expect to see something like: * block devices (since Linux 4.9) Anyway this looks the same level of documentation as FALLOC_FL_ZERO_RANGE, so if we are ok with using it, FALLOC_FL_PUNCH_HOLE is fine. Does this change mean that oVirt can start to enable discard for VM when disks are using wipe-after-delete? Currently we disable discard if a disk is marked for wiping, since discard does not guarantee that discarded ranges are zeroed. Nir