Re: RFC: Use of VOP_ALLOCATE() by NFSV4.2 nfsd
On 10.10.21 at 05:52, Alan Somers wrote:
> On Sat, Oct 9, 2021 at 7:13 PM Rick Macklem wrote:
>> This leads me to a couple of questions:
>> - Is there a good reason for not using vop_stdallocate() for ZFS?
>
> Yes. posix_fallocate is supposed to guarantee that subsequent writes
> to the file will not fail with ENOSPC. But ZFS, being a copy-on-write
> file system, cannot possibly guarantee that. See SVN r325320.

This is not entirely true: ZFS supports reservations, and it could thus
support the pre-allocation of space that is later "filled".

These reservations would be subtracted from the total free space, and it
would be guaranteed that this free space is available for the file for
which the pre-allocation has been requested.

This would require that the allocate() call record the block range for
which an allocation is requested (and for which no disk blocks are
currently allocated), without assigning any backing blocks at that time.

Later writes to that range would allocate disk blocks and at the same
time reduce the amount that is reserved, removing the range that is now
allocated from the recorded pre-allocation range.

This would of course require adding to the znode the block ranges that
are reserved but not yet backed by disk blocks, and keeping the total
count of blocks reserved for this purpose in a separate variable,
alongside the other types of reservations.

>> - Should I try and support both file system types via vop_stdallocate()
>> or not support Allocate at all?
>
> Since you can't possibly support it for ZFS (not to mention other file
> systems like fusefs) you'll have to not support it at all.

While I do think that an allocate() operation could be implemented in
ZFS, it is obvious that this does not apply to all possible fusefs
filesystems (which do not even need to support the concept of an
allocation of blocks or ranges).

Regards, STefan
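The bookkeeping proposed in this message can be sketched as a small user-space model in C. This is a toy, not ZFS code: `struct toy_pool`, `struct toy_file`, `toy_allocate()` and `toy_write_block()` are hypothetical names invented for illustration, a file is limited to 64 blocks tracked in bitmasks, and a rewrite of an already backed block is assumed to need one free block only transiently (no snapshots pinning the old block).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_FILE_BLOCKS 64u

/* Pool-wide counters (hypothetical): reserved blocks are no longer "free". */
struct toy_pool {
    uint64_t free_blocks;      /* neither allocated nor reserved */
    uint64_t reserved_blocks;  /* promised to pre-allocations, not yet written */
};

/* Per-file state (hypothetical): one bit per block. */
struct toy_file {
    uint64_t backed;    /* blocks that already have disk blocks */
    uint64_t reserved;  /* blocks reserved but not yet backed */
};

/* Bitmask covering blocks [start, start + count); caller validates range. */
static uint64_t toy_range_mask(unsigned start, unsigned count)
{
    uint64_t m = (count >= 64u) ? ~UINT64_C(0) : ((UINT64_C(1) << count) - 1);
    return m << start;
}

static unsigned toy_popcount(uint64_t v)
{
    unsigned n = 0;
    for (; v != 0; v &= v - 1)
        n++;
    return n;
}

/* Record a reservation for every block in the range that is not yet backed. */
int toy_allocate(struct toy_pool *p, struct toy_file *f,
                 unsigned start, unsigned count)
{
    if (count == 0 || start >= TOY_FILE_BLOCKS ||
        count > TOY_FILE_BLOCKS - start)
        return EINVAL;
    uint64_t want = toy_range_mask(start, count) & ~f->backed & ~f->reserved;
    unsigned n = toy_popcount(want);
    if (n > p->free_blocks)
        return ENOSPC;
    p->free_blocks -= n;
    p->reserved_blocks += n;
    f->reserved |= want;
    return 0;
}

/* Write one block: consume the reservation if there is one. */
int toy_write_block(struct toy_pool *p, struct toy_file *f, unsigned blk)
{
    if (blk >= TOY_FILE_BLOCKS)
        return EINVAL;
    uint64_t bit = UINT64_C(1) << blk;
    if (f->reserved & bit) {
        assert(p->reserved_blocks > 0);
        f->reserved &= ~bit;        /* range is now allocated ... */
        p->reserved_blocks--;       /* ... so the reservation shrinks */
    } else if (f->backed & bit) {
        if (p->free_blocks == 0)    /* CoW rewrite needs a fresh block */
            return ENOSPC;
        /* assumed: old block is released at once, so the net change is 0 */
    } else {
        if (p->free_blocks == 0)
            return ENOSPC;
        p->free_blocks--;
    }
    f->backed |= bit;
    return 0;
}
```

In this model the first write into a reserved range cannot fail for lack of space, which is the property the proposal aims for. A later rewrite of the same block goes through the unreserved path again, so the model says nothing about rewrites after the reservation has been consumed.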
Re: RFC: Use of VOP_ALLOCATE() by NFSV4.2 nfsd
On 10-10-2021 07:57, Rick Macklem wrote:
>>> This leads me to a couple of questions:
>>> - Is there a good reason for not using vop_stdallocate() for ZFS?
>> Yes. posix_fallocate is supposed to guarantee that subsequent writes
>> to the file will not fail with ENOSPC. But ZFS, being a copy-on-write
>> file system, cannot possibly guarantee that. See SVN r325320.
> However, vop_stdallocate() just does VOP_WRITE()s to the area (with
> bytes of data all zeros). Wouldn't that satisfy the criteria?

I had the same problem in Ceph, where a guaranteed writable space is
required for keeping a log of modifications to the system. Not having
this space might cause loss of data.

Writing all zeros is probably even worse on filesystems that have
compression set: almost nothing is allocated, and so there is no
guarantee at all.

The next trick was to write random data, but then you run into the
problem signaled by Alan and Warner: because of the CoW nature, new
writes will need free space.

The solution was to actually create a specific zpool just for this.
But that will not help you with NFS 4.2, I guess.

--WjW
Re: RFC: Use of VOP_ALLOCATE() by NFSV4.2 nfsd
Alan Somers wrote:
[stuff snipped]
>Rick Macklem wrote:
> >Alan Somers wrote:
> > >Yes. posix_fallocate is supposed to guarantee that subsequent writes
> > >to the file will not fail with ENOSPC. But ZFS, being a copy-on-write
> > >file system, cannot possibly guarantee that. See SVN r325320.
> > However, vop_stdallocate() just does VOP_WRITE()s to the area (with
> > bytes of data all zeros). Wouldn't that satisfy the criteria?
>
> No. It works for UFS, which is an overwriting file system. But for
> ZFS, when the user comes back later to rewrite those same offsets, ZFS
> will actually allocate new LBAs for them.
Righto. I get it now.

Looks like I must disable it in the server, unless there is a way to
enable it on a per file system basis (which I do not believe is the
case for NFSv4.2, although that isn't completely clear from the RFC,
which says each operation is optional, but does not mention
"per file system").

Thanks everyone, for your replies, rick

> > >> - Should I try and support both file system types via vop_stdallocate()
> >> or not support Allocate at all?
> >
> >Since you can't possibly support it for ZFS (not to mention other file
> >systems like fusefs) you'll have to not support it at all.
> It does sound like not supporting it is the best alternative.
>
> rick
>
> >
> > Btw, as a bit of an aside, "cc" uses posix_fallocate() and in weird ways,
> > such as offset=0, len=1. Why, I have no idea?
> >
> > Thanks in advance for any comments, rick
> >
Re: RFC: Use of VOP_ALLOCATE() by NFSV4.2 nfsd
On Sat, Oct 9, 2021 at 11:57 PM Rick Macklem wrote:
>
> Alan Somers wrote:
> >On Sat, Oct 9, 2021 at 7:13 PM Rick Macklem wrote:
> >>
> >> Hi,
> >>
> >> I ran into an issue this week during the nf...@ietf.org's testing event.
> >> UFS - supports VOP_ALLOCATE() by using vop_stdallocate().
> >> ZFS - just return EINVAL for VOP_ALLOCATE().
> >>
> >> An NFSv4.2 server can either support Allocate or not, but it has to be
> >> for all exported file systems.
> >
> >That seems like a protocol bug to me. Could this be fixed in a future
> >NFS revision?
> Who knows. I don't see any interest in a 4.3. 4.2 is extensible, but I think
> this is now "cast in stone".
>
> >>
> >> This leads me to a couple of questions:
> >> - Is there a good reason for not using vop_stdallocate() for ZFS?
> >
> >Yes. posix_fallocate is supposed to guarantee that subsequent writes
> >to the file will not fail with ENOSPC. But ZFS, being a copy-on-write
> >file system, cannot possibly guarantee that. See SVN r325320.
> However, vop_stdallocate() just does VOP_WRITE()s to the area (with
> bytes of data all zeros). Wouldn't that satisfy the criteria?

No. It works for UFS, which is an overwriting file system. But for
ZFS, when the user comes back later to rewrite those same offsets, ZFS
will actually allocate new LBAs for them.

> >> - Should I try and support both file system types via vop_stdallocate()
> >> or not support Allocate at all?
> >
> >Since you can't possibly support it for ZFS (not to mention other file
> >systems like fusefs) you'll have to not support it at all.
> It does sound like not supporting it is the best alternative.
>
> rick
>
> >
> > Btw, as a bit of an aside, "cc" uses posix_fallocate() and in weird ways,
> > such as offset=0, len=1. Why, I have no idea?
> >
> > Thanks in advance for any comments, rick
> >
Re: RFC: Use of VOP_ALLOCATE() by NFSV4.2 nfsd
On Sat, Oct 9, 2021, 11:58 PM Rick Macklem wrote:
> Alan Somers wrote:
> >On Sat, Oct 9, 2021 at 7:13 PM Rick Macklem wrote:
> >>
> >> Hi,
> >>
> >> I ran into an issue this week during the nf...@ietf.org's testing event.
> >> UFS - supports VOP_ALLOCATE() by using vop_stdallocate().
> >> ZFS - just return EINVAL for VOP_ALLOCATE().
> >>
> >> An NFSv4.2 server can either support Allocate or not, but it has to be
> >> for all exported file systems.
> >
> >That seems like a protocol bug to me. Could this be fixed in a future
> >NFS revision?
> Who knows. I don't see any interest in a 4.3. 4.2 is extensible, but I think
> this is now "cast in stone".
>
> >>
> >> This leads me to a couple of questions:
> >> - Is there a good reason for not using vop_stdallocate() for ZFS?
> >
> >Yes. posix_fallocate is supposed to guarantee that subsequent writes
> >to the file will not fail with ENOSPC. But ZFS, being a copy-on-write
> >file system, cannot possibly guarantee that. See SVN r325320.
> However, vop_stdallocate() just does VOP_WRITE()s to the area (with
> bytes of data all zeros). Wouldn't that satisfy the criteria?
>

Since it is log based, that would make it worse. The blocks aren't
instantly reclaimed when marked invalid, so you'd need storage for both,
and the zeroed blocks could cause a resource shortage when the real
writes come in. ZFS doesn't have a reservation system to reserve blocks
in the log for a given file...

Warner

> >> - Should I try and support both file system types via vop_stdallocate()
> >> or not support Allocate at all?
> >
> >Since you can't possibly support it for ZFS (not to mention other file
> >systems like fusefs) you'll have to not support it at all.
> It does sound like not supporting it is the best alternative.
>
> rick
>
> >
> > Btw, as a bit of an aside, "cc" uses posix_fallocate() and in weird ways,
> > such as offset=0, len=1. Why, I have no idea?
> >
> > Thanks in advance for any comments, rick
> >
> >
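The objection raised in this thread (zero-filling a range, as vop_stdallocate() is described as doing, protects later rewrites on an overwriting file system but not on a copy-on-write one) can be illustrated with a toy model in C. Everything here is hypothetical and invented for illustration: `struct toy_fs`, `toy_write()` and `toy_zero_fill()` are not kernel interfaces, and the model assumes, as described above, that a superseded block is not reclaimed immediately.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BLOCKS 8u

enum toy_kind { TOY_OVERWRITE, TOY_COW };

struct toy_fs {
    enum toy_kind kind;
    unsigned free_blocks;      /* blocks available for new allocations */
    bool mapped[TOY_BLOCKS];   /* does the file block have backing storage? */
};

/* Write one file block, returning 0 or an errno value. */
int toy_write(struct toy_fs *fs, unsigned blk)
{
    if (blk >= TOY_BLOCKS)
        return EINVAL;
    /* Overwriting file system: a mapped block is rewritten in place. */
    if (fs->mapped[blk] && fs->kind == TOY_OVERWRITE)
        return 0;
    /*
     * First write to a block, or any rewrite on the CoW model:
     * a fresh block is needed. Assumed: the old block is not
     * reclaimed at once, so storage for both is held for a while.
     */
    if (fs->free_blocks == 0)
        return ENOSPC;
    fs->free_blocks--;
    fs->mapped[blk] = true;
    return 0;
}

/* "Allocate" a range the naive way: write a zero block to each offset. */
int toy_zero_fill(struct toy_fs *fs, unsigned start, unsigned count)
{
    assert(fs != NULL);
    for (unsigned i = 0; i < count; i++) {
        int error = toy_write(fs, start + i);
        if (error != 0)
            return error;
    }
    return 0;
}
```

With two free blocks, zero-filling blocks 0 and 1 succeeds on both models and leaves no free space. A subsequent `toy_write(fs, 0)` then returns 0 on the `TOY_OVERWRITE` model and `ENOSPC` on the `TOY_COW` model, which is the failure the pre-allocation was supposed to rule out.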
Re: RFC: Use of VOP_ALLOCATE() by NFSV4.2 nfsd
Alan Somers wrote:
>On Sat, Oct 9, 2021 at 7:13 PM Rick Macklem wrote:
>>
>> Hi,
>>
>> I ran into an issue this week during the nf...@ietf.org's testing event.
>> UFS - supports VOP_ALLOCATE() by using vop_stdallocate().
>> ZFS - just return EINVAL for VOP_ALLOCATE().
>>
>> An NFSv4.2 server can either support Allocate or not, but it has to be
>> for all exported file systems.
>
>That seems like a protocol bug to me. Could this be fixed in a future
>NFS revision?
Who knows. I don't see any interest in a 4.3. 4.2 is extensible, but I think
this is now "cast in stone".

>>
>> This leads me to a couple of questions:
>> - Is there a good reason for not using vop_stdallocate() for ZFS?
>
>Yes. posix_fallocate is supposed to guarantee that subsequent writes
>to the file will not fail with ENOSPC. But ZFS, being a copy-on-write
>file system, cannot possibly guarantee that. See SVN r325320.
However, vop_stdallocate() just does VOP_WRITE()s to the area (with
bytes of data all zeros). Wouldn't that satisfy the criteria?

>> - Should I try and support both file system types via vop_stdallocate()
>> or not support Allocate at all?
>
>Since you can't possibly support it for ZFS (not to mention other file
>systems like fusefs) you'll have to not support it at all.
It does sound like not supporting it is the best alternative.

rick

>
> Btw, as a bit of an aside, "cc" uses posix_fallocate() and in weird ways,
> such as offset=0, len=1. Why, I have no idea?
>
> Thanks in advance for any comments, rick
>
Re: RFC: Use of VOP_ALLOCATE() by NFSV4.2 nfsd
From: Alan Somers
Sent: Saturday, October 9, 2021 11:52 PM
To: Rick Macklem
Cc: FreeBSD Current
Subject: Re: RFC: Use of VOP_ALLOCATE() by NFSV4.2 nfsd

On Sat, Oct 9, 2021 at 7:13 PM Rick Macklem wrote:
>
> Hi,
>
> I ran into an issue this week during the nf...@ietf.org's testing event.
> UFS - supports VOP_ALLOCATE() by using vop_stdallocate().
> ZFS - just return EINVAL for VOP_ALLOCATE().
>
> An NFSv4.2 server can either support Allocate or not, but it has to be
> for all exported file systems.

That seems like a protocol bug to me. Could this be fixed in a future
NFS revision?

>
> This leads me to a couple of questions:
> - Is there a good reason for not using vop_stdallocate() for ZFS?

Yes. posix_fallocate is supposed to guarantee that subsequent writes
to the file will not fail with ENOSPC. But ZFS, being a copy-on-write
file system, cannot possibly guarantee that. See SVN r325320.

> - Should I try and support both file system types via vop_stdallocate()
> or not support Allocate at all?

Since you can't possibly support it for ZFS (not to mention other file
systems like fusefs) you'll have to not support it at all.

>
> Btw, as a bit of an aside, "cc" uses posix_fallocate() and in weird ways,
> such as offset=0, len=1. Why, I have no idea?
>
> Thanks in advance for any comments, rick
>
Re: RFC: Use of VOP_ALLOCATE() by NFSV4.2 nfsd
On Sat, Oct 9, 2021 at 7:13 PM Rick Macklem wrote:
>
> Hi,
>
> I ran into an issue this week during the nf...@ietf.org's testing event.
> UFS - supports VOP_ALLOCATE() by using vop_stdallocate().
> ZFS - just return EINVAL for VOP_ALLOCATE().
>
> An NFSv4.2 server can either support Allocate or not, but it has to be
> for all exported file systems.

That seems like a protocol bug to me. Could this be fixed in a future
NFS revision?

>
> This leads me to a couple of questions:
> - Is there a good reason for not using vop_stdallocate() for ZFS?

Yes. posix_fallocate is supposed to guarantee that subsequent writes
to the file will not fail with ENOSPC. But ZFS, being a copy-on-write
file system, cannot possibly guarantee that. See SVN r325320.

> - Should I try and support both file system types via vop_stdallocate()
> or not support Allocate at all?

Since you can't possibly support it for ZFS (not to mention other file
systems like fusefs) you'll have to not support it at all.

>
> Btw, as a bit of an aside, "cc" uses posix_fallocate() and in weird ways,
> such as offset=0, len=1. Why, I have no idea?
>
> Thanks in advance for any comments, rick
>
RFC: Use of VOP_ALLOCATE() by NFSV4.2 nfsd
Hi,

I ran into an issue this week during the nf...@ietf.org's testing event.
UFS - supports VOP_ALLOCATE() by using vop_stdallocate().
ZFS - just return EINVAL for VOP_ALLOCATE().

An NFSv4.2 server can either support Allocate or not, but it has to be
for all exported file systems.

This leads me to a couple of questions:
- Is there a good reason for not using vop_stdallocate() for ZFS?
- Should I try and support both file system types via vop_stdallocate()
  or not support Allocate at all?

Btw, as a bit of an aside, "cc" uses posix_fallocate() and in weird ways,
such as offset=0, len=1. Why, I have no idea?

Thanks in advance for any comments, rick
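For readers following along from userland, the difference described above shows up as the return value of posix_fallocate(). The sketch below is a minimal illustration, not code from nfsd: `try_preallocate()`, `classify_prealloc_error()` and the `prealloc_result` enum are hypothetical names. It relies only on the POSIX convention that posix_fallocate() returns an error number directly rather than setting errno. Treating EINVAL as "not supported by this file system" is an assumption based on the ZFS behaviour mentioned in this message; POSIX also uses EINVAL for invalid arguments, so the mapping is only a heuristic.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

enum prealloc_result {
    PREALLOC_OK,           /* space is allocated */
    PREALLOC_UNSUPPORTED,  /* file system declines to pre-allocate (assumed) */
    PREALLOC_NO_SPACE,     /* not enough free space right now */
    PREALLOC_FAILED        /* any other error, e.g. a bad descriptor */
};

/* Map a posix_fallocate() return value to a coarse outcome. */
enum prealloc_result classify_prealloc_error(int error)
{
    switch (error) {
    case 0:
        return PREALLOC_OK;
    case EINVAL:       /* assumption: what a ZFS-backed file reports */
    case EOPNOTSUPP:   /* some systems report lack of support this way */
        return PREALLOC_UNSUPPORTED;
    case ENOSPC:
        return PREALLOC_NO_SPACE;
    default:
        return PREALLOC_FAILED;
    }
}

/* Try to pre-allocate [offset, offset + len) of an open file. */
enum prealloc_result try_preallocate(int fd, off_t offset, off_t len)
{
    assert(len > 0);  /* a zero length is invalid and would also give EINVAL */
    /* posix_fallocate() returns the error number; errno is not set. */
    return classify_prealloc_error(posix_fallocate(fd, offset, len));
}
```

A call such as `try_preallocate(fd, 0, 1)` mirrors the offset=0, len=1 pattern mentioned in the aside. A caller that gets `PREALLOC_UNSUPPORTED` can simply carry on without the guarantee.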