On Wed, Aug 6, 2025 at 9:20 AM Alan Somers <asom...@freebsd.org> wrote:
>
> On Wed, Aug 6, 2025 at 9:54 AM Rick Macklem <rick.mack...@gmail.com> wrote:
>>
>> On Wed, Aug 6, 2025 at 8:32 AM Alan Somers <asom...@freebsd.org> wrote:
>> >
>> > On Wed, Aug 6, 2025 at 9:18 AM Rick Macklem <rick.mack...@gmail.com> wrote:
>> >>
>> >> Hi,
>> >>
>> >> NFSv4.2 has a CLONE operation. It is described as doing:
>> >> The CLONE operation is used to clone file content from a source file
>> >> specified by the SAVED_FH value into a destination file specified by
>> >> CURRENT_FH without actually copying the data, e.g., by using a
>> >> copy-on-write mechanism.
>> >> (It takes arguments for 2 files, with byte offsets and a length.)
>> >> The offsets must be aligned to a value returned by the NFSv4.2 server.
>> >> 12.2.1. Attribute 77: clone_blksize
>> >>
>> >> The clone_blksize attribute indicates the granularity of a CLONE
>> >> operation.
>> >>
>> >> Does ZFS block cloning do this?
>> >>
>> >> I am asking now, because although it might be too late,
>> >> if the answer is "yes", I'd like to get VOP calls into 15.0
>> >> for it. (Hopefully with the VOP calls in place, the rest could
>> >> go in sometime later, when I find the time to do it.)
>> >>
>> >> Thanks in advance for any comments, rick
>> >
>> >
>> > Yes, it does that right now, if the feature@block_cloning pool attribute
>> > is enabled. It works with VOP_COPY_FILE_RANGE. Does NFS really need a
>> > new VOP?
>> Either a new VOP or maybe a new flag argument for VOP_COPY_FILE_RANGE().
>> Linux defined a flag argument for their copy_file_range(), but they have
>> never defined any flags. Of course, that doesn't mean there cannot be a
>> "kernel internal" flag.
>>
>> So maybe adding a new VOP can be avoided. That would be nice, given the
>> timing of the 15.0 release and other churn going on.
>>
>> The difference for NFSv4.2 is that CLONE cannot return with partial
>> completion.
>> (It assumes that a CLONE of any size will complete quickly enough for an RPC.
>> Although there is no fixed limit, most assume an RPC reply should happen in
>> 1-2sec at most. For COPY, the server can return with only part of the
>> copy done.)
>> It also includes alignment restrictions for the byte offsets.
>>
>> There is also the alignment restriction on CLONE. There doesn't seem to be
>> an alignment restriction on zfs_clone_range(), but maybe it is buried inside
>> it?
>> I think adding yet another pathconf name to get the alignment requirement and
>> whether or not the file system supports it would work without any VOP change.
>>
>> rick
>
>
> zfs_clone_range doesn't have any alignment restrictions. But if the argument
> isn't aligned to a record boundary, ZFS will actually copy a partial record,
> rather than clone it. Regarding the copy-to-completion requirement, could
> that be implemented within nfs by looping over VOP_COPY_FILE_RANGE?

But the reason behind partial completion is the time restriction.
The NFSv4.2 server limits the size to vfs.nfsd.maxcopyrange and sets a
1sec time limit via a flag to vn_copy_file_range().
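To illustrate why COPY can return partial completion, here is a rough
userland sketch of a copy loop bounded by both a size cap and a time
budget. It is not the nfsd code: bounded_copy(), chunk_copy_fn,
struct copy_result and demo_chunk() are all made-up names, the size cap
only stands in for something like vfs.nfsd.maxcopyrange, and the timing
uses C11 timespec_get() (wall clock) just to keep the example portable.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical callback: copy up to len bytes starting at off.
 * Returns bytes copied, 0 at end of file, or -1 on error. */
typedef int64_t (*chunk_copy_fn)(void *ctx, uint64_t off, uint64_t len);

struct copy_result {
    uint64_t copied;   /* bytes actually copied */
    int      partial;  /* nonzero if fewer than len bytes were copied */
    int      error;    /* nonzero if the callback reported an error */
};

/* Seconds elapsed since *start (wall clock, so only approximate). */
static double
elapsed_sec(const struct timespec *start)
{
    struct timespec now;

    timespec_get(&now, TIME_UTC);
    return (double)(now.tv_sec - start->tv_sec) +
        (double)(now.tv_nsec - start->tv_nsec) / 1e9;
}

/* Copy at most max_range bytes, and stop early once budget_sec is used up.
 * The caller sees how far the copy got and can issue another request. */
static struct copy_result
bounded_copy(chunk_copy_fn fn, void *ctx, uint64_t off, uint64_t len,
    uint64_t max_range, double budget_sec)
{
    struct copy_result res = { 0, 0, 0 };
    struct timespec start;
    uint64_t want = len < max_range ? len : max_range;

    timespec_get(&start, TIME_UTC);
    while (res.copied < want) {
        int64_t n = fn(ctx, off + res.copied, want - res.copied);

        if (n < 0) {
            res.error = 1;
            break;
        }
        if (n == 0)
            break;                      /* end of file */
        res.copied += (uint64_t)n;
        if (elapsed_sec(&start) >= budget_sec)
            break;                      /* time budget exhausted */
    }
    res.partial = res.copied < len;
    return res;
}

/* Stand-in for a real copy: pretends to move at most 4096 bytes per call. */
static int64_t
demo_chunk(void *ctx, uint64_t off, uint64_t len)
{
    (void)ctx;
    (void)off;
    return (int64_t)(len < 4096 ? len : 4096);
}
```

With a generous budget, bounded_copy(demo_chunk, NULL, 0, 10000, 1u << 20, 60.0)
copies everything; with a small max_range or a zero budget it stops early
and sets partial, which is the case a COPY reply can express and a CLONE
reply cannot.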
For CLONE, it needs to either:
- be able to complete the entire "copy" within 1-2sec under normal
  circumstances, irrespective of length.
or
- return not supported, so the client will switch to using COPY.

rick
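The either/or above, plus the offset alignment rule quoted earlier, could
be sketched as a small pre-check done before any cloning is attempted.
This is only an illustration: enum clone_reply, clone_decide() and the
fs_can_clone argument are invented names, and clone_blksize here simply
stands for whatever alignment value the server advertises (for example
via the pathconf idea mentioned in the thread).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcomes of the pre-check; not real NFS status codes. */
enum clone_reply {
    CLONE_REPLY_DO_CLONE,   /* safe to attempt an all-or-nothing clone */
    CLONE_REPLY_INVAL,      /* offsets not aligned to clone_blksize */
    CLONE_REPLY_NOTSUPP     /* no fast clone; client should fall back to COPY */
};

/*
 * Decide how to answer a CLONE request.
 * fs_can_clone: assumption that the file system can report whether it is
 *               able to clone quickly (e.g. block cloning is enabled).
 * clone_blksize: advertised alignment; 0 is treated as "not supported".
 */
static enum clone_reply
clone_decide(uint64_t src_off, uint64_t dst_off, uint64_t clone_blksize,
    bool fs_can_clone)
{
    if (!fs_can_clone || clone_blksize == 0)
        return CLONE_REPLY_NOTSUPP;
    if (src_off % clone_blksize != 0 || dst_off % clone_blksize != 0)
        return CLONE_REPLY_INVAL;
    return CLONE_REPLY_DO_CLONE;
}
```

For example, with a 128K clone_blksize, clone_decide(0, 131072, 131072, true)
allows the clone, while clone_decide(4096, 0, 131072, true) rejects the
unaligned source offset. The point of the NOTSUPP branch is that there is
no partial result to report: the server either clones the whole range
quickly or tells the client to use COPY instead.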