On Wed, Aug 6, 2025 at 9:20 AM Alan Somers <asom...@freebsd.org> wrote:
>
> On Wed, Aug 6, 2025 at 9:54 AM Rick Macklem <rick.mack...@gmail.com> wrote:
>>
>> On Wed, Aug 6, 2025 at 8:32 AM Alan Somers <asom...@freebsd.org> wrote:
>> >
>> > On Wed, Aug 6, 2025 at 9:18 AM Rick Macklem <rick.mack...@gmail.com> wrote:
>> >>
>> >> Hi,
>> >>
>> >> NFSv4.2 has a CLONE operation. It is described as doing:
>> >>    The CLONE operation is used to clone file content from a source file
>> >>    specified by the SAVED_FH value into a destination file specified by
>> >>    CURRENT_FH without actually copying the data, e.g., by using a
>> >>    copy-on-write mechanism.
>> >> (It takes arguments for 2 files, with byte offsets and a length.)
>> >> The offsets must be aligned to a value returned by the NFSv4.2 server.
>> >> 12.2.1.  Attribute 77: clone_blksize
>> >>
>> >>    The clone_blksize attribute indicates the granularity of a CLONE
>> >>    operation.
>> >>
>> >> Does ZFS block cloning do this?
>> >>
>> >> I am asking now, because although it might be too late,
>> >> if the answer is "yes", I'd like to get VOP calls into 15.0
>> >> for it. (Hopefully with the VOP calls in place, the rest could
>> >> go in sometime later, when I find the time to do it.)
>> >>
>> >> Thanks in advance for any comments, rick
>> >
>> >
>> > Yes, it does that right now, if the feature@block_cloning pool attribute 
>> > is enabled.  It works with VOP_COPY_FILE_RANGE.  Does NFS really need a 
>> > new VOP?
>> Either a new VOP or maybe a new flag argument for VOP_COPY_FILE_RANGE().
>> Linux defined a flag argument for their copy_file_range(), but they have
>> never defined any flags. Of course, that doesn't mean there cannot be a
>> "kernel internal" flag.
>>
>> So maybe adding a new VOP can be avoided. That would be nice, given the
>> timing of the 15.0 release and other churn going on.
>>
>> The difference for NFSv4.2 is that CLONE cannot return with partial
>> completion. (It assumes that a CLONE of any size will complete quickly
>> enough for an RPC. Although there is no fixed limit, most assume an RPC
>> reply should happen in 1-2sec at most. For COPY, the server can return
>> with only part of the copy done.)
>> It also includes alignment restrictions for the byte offsets.
>>
>> There is also the alignment restriction on CLONE. There doesn't seem to be
>> an alignment restriction on zfs_clone_range(), but maybe it is buried
>> inside it?
>> I think adding yet another pathconf name to get the alignment requirement and
>> whether or not the file system supports it would work without any VOP change.
>>
>> rick
>
>
> zfs_clone_range doesn't have any alignment restrictions.  But if the argument 
> isn't aligned to a record boundary, ZFS will actually copy a partial record, 
> rather than clone it.  Regarding the copy-to-completion requirement, could 
> that be implemented within nfs by looping over VOP_COPY_FILE_RANGE?
But the reason behind partial completion is the time restriction. The NFSv4.2
server limits the size to vfs.nfsd.maxcopyrange and sets a 1sec time limit
via a flag to vn_copy_file_range().

For CLONE, it needs to either:
- be able to complete the entire "copy" within 1-2sec under normal
  circumstances, irrespective of length.
or
- return not supported, so the client will switch to using COPY.
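
To make the two outcomes above concrete, here is a rough userland sketch in C
of "loop until the whole range is done, or give up and report not supported".
It is only a model, not nfsd code: clone_to_completion(), copy_step_fn,
now_ms_fn, struct fake_fs and the CLONE_DONE/CLONE_NOTSUPP codes are all names
made up for illustration, and the step callback merely stands in for one
VOP_COPY_FILE_RANGE()-style call that may return after partial progress.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch (not NFS or kernel constants). */
enum clone_result { CLONE_DONE, CLONE_NOTSUPP };

/*
 * Hypothetical stand-in for one VOP_COPY_FILE_RANGE()-style call: handles
 * up to `want` bytes and returns how many it processed (0 = no progress).
 */
typedef uint64_t (*copy_step_fn)(void *ctx, uint64_t want);

/* Hypothetical monotonic clock in milliseconds, injected so it can be faked. */
typedef uint64_t (*now_ms_fn)(void *ctx);

/*
 * Repeat the step until `len` bytes are handled. If a step makes no
 * progress, or work remains after `budget_ms` has elapsed, report
 * "not supported" so the caller can fall back to COPY.
 */
static enum clone_result
clone_to_completion(copy_step_fn step, now_ms_fn now, void *ctx,
    uint64_t len, uint64_t budget_ms)
{
	uint64_t start;

	assert(step != NULL && now != NULL);
	start = now(ctx);
	while (len > 0) {
		uint64_t done = step(ctx, len);

		if (done == 0 || done > len)
			return (CLONE_NOTSUPP);
		len -= done;
		if (len > 0 && now(ctx) - start > budget_ms)
			return (CLONE_NOTSUPP);
	}
	return (CLONE_DONE);
}

/* Fake "file system" for trying the loop: fixed chunk size and cost per call. */
struct fake_fs {
	uint64_t chunk;		/* bytes handled per call */
	uint64_t ms_per_call;	/* simulated time each call takes */
	uint64_t clock_ms;	/* simulated monotonic clock */
};

static uint64_t
fake_step(void *ctx, uint64_t want)
{
	struct fake_fs *f = ctx;

	f->clock_ms += f->ms_per_call;
	return (want < f->chunk ? want : f->chunk);
}

static uint64_t
fake_now(void *ctx)
{
	return (((struct fake_fs *)ctx)->clock_ms);
}
```

Note that the sketch glosses over the hard part: when it gives up midway,
some of the range has already been processed, and since CLONE cannot complete
partially, a real server would either have to know up front that the whole
range will finish in time or refuse before starting.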

rick
