Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dave Chinner
On Fri, Aug 11, 2017 at 07:31:54PM -0700, Darrick J. Wong wrote:
> On Sat, Aug 12, 2017 at 10:30:34AM +1000, Dave Chinner wrote:
> > On Fri, Aug 11, 2017 at 04:42:18PM -0700, Dan Williams wrote:
> > > On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner  wrote:
> > > > On Thu, Aug 10, 2017 at 11:39:28PM -0700, Dan Williams wrote:
> > > >> >From falloc.h:
> > > >>
> > > >> FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of 
> > > >> the
> > > >> file logical-to-physical extent offset mappings in the file. The
> > > >> purpose is to allow an application to assume that there are no 
> > > >> holes
> > > >> or shared extents in the file and that the metadata needed to find
> > > >> all the physical extents of the file is stable and can never be
> > > >> dirtied.
> > > >>
> > > >> For now this patch only permits setting the in-memory state of
> > > >> S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is
> > > >> saved for later patches.
> > > >>
> > > >> The implementation is careful to not allow the immutable state to 
> > > >> change
> > > >> while any process might have any established mappings. It reuses the
> > > >> existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare
> > > >> extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL
> > > >> while it validates the file is in the proper state and sets
> > > >> S_IOMAP_IMMUTABLE.
> > > >
> > > > SO I've been thinking about this - I'm thinking that we need to
> > > > separate the changes to the extent map from the action of sealing
> > > > the extent map.
> > > >
> > > > That is, I have a need to freeze an extent map without any
> > > > modification to it at all and breaking all the sharing and filling
> > > > all the holes completely screws up the file layout I need to
> > > > preserve. i.e. I want to be able to freeze the maps of a pair of
> > > > reflinked files so I can use FIEMAP to query only the changed blocks
> > > > and then read out the data in the private/changed blocks in the
> > > > reflinked file. I only need a temporary seal (if we crash it goes
> > > > away), so maybe there's another command modifier flag needed here,
> > > > too.
> > > >
> > > > The DAX allocation requirements can be handled by a preallocation
> > > > call to filll holes with zeros, followed by an unshare call to break
> > > > all the COW mappings, and then the extent map can be sealed. If you
> > > > need to check for holes after sealing, SEEK_HOLE will tell you what
> > > > you need to know...
> > > >
> > > > My preference really is for each fallocate command to do just one
> > > > thing - having the seal operation also modify the extent map
> > > > means it's not useful for the use cases where we need the extent map
> > > > to remain unmodified
> > > >
> > > > Thoughts?
> > > 
> > > That does seem to better follow the principle of least surprise and
> > > make the interface more composable. However, for the DAX case do we
> > > now end up needing a SEEK_SHARED, or something similar to check that
> > > we sealed the file without shared extents?
> 
> Well, fiemap/bmap will tell you (presumably after you confirm that the
> file got sealed or whatever) if there are shared extents.

Iterating potentially hundreds of thousands of extents just to find
the one shared extent seems like crazy thing to ask people to do.

> > Don't we have an inode attribute flag for that? There's definitely a
> > flag in the on disk XFS inode to indicate that there are shared
> > extents on the file.
> > 
> > H - for some reason it's not exposed in FS_IOC_FSGETXATTR.
> > Darrick? Any reason we don't expose the "this file has shared
> > extents" flag to userspace? How are we checking that on-disk state
> > in xfstests?
> > 
> > As it is, if we're adding an immutable extent flag, we've got to be
> > able to query the immutable extent state of a file from userspace so
> > I'm thinking that we'd need to expose it via the same interface that
> > exposes the immutable flag. i.e. we could add the "shared extents
> > present" flag to FS_IOC_FSGETXATTR at the same time...
> 
> We used to have a FSGETXATTR flag; iirc hch nak'd it during the review.

Ok, got a link? Seems strange to not be able to query this state
even for testing/validation purposes...

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dave Chinner
On Fri, Aug 11, 2017 at 07:31:54PM -0700, Darrick J. Wong wrote:
> On Sat, Aug 12, 2017 at 10:30:34AM +1000, Dave Chinner wrote:
> > On Fri, Aug 11, 2017 at 04:42:18PM -0700, Dan Williams wrote:
> > > On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner  wrote:
> > > > On Thu, Aug 10, 2017 at 11:39:28PM -0700, Dan Williams wrote:
> > > >> >From falloc.h:
> > > >>
> > > >> FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of 
> > > >> the
> > > >> file logical-to-physical extent offset mappings in the file. The
> > > >> purpose is to allow an application to assume that there are no 
> > > >> holes
> > > >> or shared extents in the file and that the metadata needed to find
> > > >> all the physical extents of the file is stable and can never be
> > > >> dirtied.
> > > >>
> > > >> For now this patch only permits setting the in-memory state of
> > > >> S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is
> > > >> saved for later patches.
> > > >>
> > > >> The implementation is careful to not allow the immutable state to 
> > > >> change
> > > >> while any process might have any established mappings. It reuses the
> > > >> existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare
> > > >> extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL
> > > >> while it validates the file is in the proper state and sets
> > > >> S_IOMAP_IMMUTABLE.
> > > >
> > > > SO I've been thinking about this - I'm thinking that we need to
> > > > separate the changes to the extent map from the action of sealing
> > > > the extent map.
> > > >
> > > > That is, I have a need to freeze an extent map without any
> > > > modification to it at all and breaking all the sharing and filling
> > > > all the holes completely screws up the file layout I need to
> > > > preserve. i.e. I want to be able to freeze the maps of a pair of
> > > > reflinked files so I can use FIEMAP to query only the changed blocks
> > > > and then read out the data in the private/changed blocks in the
> > > > reflinked file. I only need a temporary seal (if we crash it goes
> > > > away), so maybe there's another command modifier flag needed here,
> > > > too.
> > > >
> > > > The DAX allocation requirements can be handled by a preallocation
> > > > call to filll holes with zeros, followed by an unshare call to break
> > > > all the COW mappings, and then the extent map can be sealed. If you
> > > > need to check for holes after sealing, SEEK_HOLE will tell you what
> > > > you need to know...
> > > >
> > > > My preference really is for each fallocate command to do just one
> > > > thing - having the seal operation also modify the extent map
> > > > means it's not useful for the use cases where we need the extent map
> > > > to remain unmodified
> > > >
> > > > Thoughts?
> > > 
> > > That does seem to better follow the principle of least surprise and
> > > make the interface more composable. However, for the DAX case do we
> > > now end up needing a SEEK_SHARED, or something similar to check that
> > > we sealed the file without shared extents?
> 
> Well, fiemap/bmap will tell you (presumably after you confirm that the
> file got sealed or whatever) if there are shared extents.

Iterating potentially hundreds of thousands of extents just to find
the one shared extent seems like crazy thing to ask people to do.

> > Don't we have an inode attribute flag for that? There's definitely a
> > flag in the on disk XFS inode to indicate that there are shared
> > extents on the file.
> > 
> > H - for some reason it's not exposed in FS_IOC_FSGETXATTR.
> > Darrick? Any reason we don't expose the "this file has shared
> > extents" flag to userspace? How are we checking that on-disk state
> > in xfstests?
> > 
> > As it is, if we're adding an immutable extent flag, we've got to be
> > able to query the immutable extent state of a file from userspace so
> > I'm thinking that we'd need to expose it via the same interface that
> > exposes the immutable flag. i.e. we could add the "shared extents
> > present" flag to FS_IOC_FSGETXATTR at the same time...
> 
> We used to have a FSGETXATTR flag; iirc hch nak'd it during the review.

Ok, got a link? Seems strange to not be able to query this state
even for testing/validation purposes...

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Darrick J. Wong
On Sat, Aug 12, 2017 at 10:30:34AM +1000, Dave Chinner wrote:
> On Fri, Aug 11, 2017 at 04:42:18PM -0700, Dan Williams wrote:
> > On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner  wrote:
> > > On Thu, Aug 10, 2017 at 11:39:28PM -0700, Dan Williams wrote:
> > >> >From falloc.h:
> > >>
> > >> FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the
> > >> file logical-to-physical extent offset mappings in the file. The
> > >> purpose is to allow an application to assume that there are no holes
> > >> or shared extents in the file and that the metadata needed to find
> > >> all the physical extents of the file is stable and can never be
> > >> dirtied.
> > >>
> > >> For now this patch only permits setting the in-memory state of
> > >> S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is
> > >> saved for later patches.
> > >>
> > >> The implementation is careful to not allow the immutable state to change
> > >> while any process might have any established mappings. It reuses the
> > >> existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare
> > >> extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL
> > >> while it validates the file is in the proper state and sets
> > >> S_IOMAP_IMMUTABLE.
> > >
> > > SO I've been thinking about this - I'm thinking that we need to
> > > separate the changes to the extent map from the action of sealing
> > > the extent map.
> > >
> > > That is, I have a need to freeze an extent map without any
> > > modification to it at all and breaking all the sharing and filling
> > > all the holes completely screws up the file layout I need to
> > > preserve. i.e. I want to be able to freeze the maps of a pair of
> > > reflinked files so I can use FIEMAP to query only the changed blocks
> > > and then read out the data in the private/changed blocks in the
> > > reflinked file. I only need a temporary seal (if we crash it goes
> > > away), so maybe there's another command modifier flag needed here,
> > > too.
> > >
> > > The DAX allocation requirements can be handled by a preallocation
> > > call to filll holes with zeros, followed by an unshare call to break
> > > all the COW mappings, and then the extent map can be sealed. If you
> > > need to check for holes after sealing, SEEK_HOLE will tell you what
> > > you need to know...
> > >
> > > My preference really is for each fallocate command to do just one
> > > thing - having the seal operation also modify the extent map
> > > means it's not useful for the use cases where we need the extent map
> > > to remain unmodified
> > >
> > > Thoughts?
> > 
> > That does seem to better follow the principle of least surprise and
> > make the interface more composable. However, for the DAX case do we
> > now end up needing a SEEK_SHARED, or something similar to check that
> > we sealed the file without shared extents?

Well, fiemap/bmap will tell you (presumably after you confirm that the
file got sealed or whatever) if there are shared extents.

> Don't we have an inode attribute flag for that? There's definitely a
> flag in the on disk XFS inode to indicate that there are shared
> extents on the file.
> 
> H - for some reason it's not exposed in FS_IOC_FSGETXATTR.
> Darrick? Any reason we don't expose the "this file has shared
> extents" flag to userspace? How are we checking that on-disk state
> in xfstests?
> 
> As it is, if we're adding an immutable extent flag, we've got to be
> able to query the immutable extent state of a file from userspace so
> I'm thinking that we'd need to expose it via the same interface that
> exposes the immutable flag. i.e. we could add the "shared extents
> present" flag to FS_IOC_FSGETXATTR at the same time...

We used to have a FSGETXATTR flag; iirc hch nak'd it during the review.

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> da...@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Darrick J. Wong
On Sat, Aug 12, 2017 at 10:30:34AM +1000, Dave Chinner wrote:
> On Fri, Aug 11, 2017 at 04:42:18PM -0700, Dan Williams wrote:
> > On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner  wrote:
> > > On Thu, Aug 10, 2017 at 11:39:28PM -0700, Dan Williams wrote:
> > >> >From falloc.h:
> > >>
> > >> FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the
> > >> file logical-to-physical extent offset mappings in the file. The
> > >> purpose is to allow an application to assume that there are no holes
> > >> or shared extents in the file and that the metadata needed to find
> > >> all the physical extents of the file is stable and can never be
> > >> dirtied.
> > >>
> > >> For now this patch only permits setting the in-memory state of
> > >> S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is
> > >> saved for later patches.
> > >>
> > >> The implementation is careful to not allow the immutable state to change
> > >> while any process might have any established mappings. It reuses the
> > >> existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare
> > >> extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL
> > >> while it validates the file is in the proper state and sets
> > >> S_IOMAP_IMMUTABLE.
> > >
> > > SO I've been thinking about this - I'm thinking that we need to
> > > separate the changes to the extent map from the action of sealing
> > > the extent map.
> > >
> > > That is, I have a need to freeze an extent map without any
> > > modification to it at all and breaking all the sharing and filling
> > > all the holes completely screws up the file layout I need to
> > > preserve. i.e. I want to be able to freeze the maps of a pair of
> > > reflinked files so I can use FIEMAP to query only the changed blocks
> > > and then read out the data in the private/changed blocks in the
> > > reflinked file. I only need a temporary seal (if we crash it goes
> > > away), so maybe there's another command modifier flag needed here,
> > > too.
> > >
> > > The DAX allocation requirements can be handled by a preallocation
> > > call to filll holes with zeros, followed by an unshare call to break
> > > all the COW mappings, and then the extent map can be sealed. If you
> > > need to check for holes after sealing, SEEK_HOLE will tell you what
> > > you need to know...
> > >
> > > My preference really is for each fallocate command to do just one
> > > thing - having the seal operation also modify the extent map
> > > means it's not useful for the use cases where we need the extent map
> > > to remain unmodified
> > >
> > > Thoughts?
> > 
> > That does seem to better follow the principle of least surprise and
> > make the interface more composable. However, for the DAX case do we
> > now end up needing a SEEK_SHARED, or something similar to check that
> > we sealed the file without shared extents?

Well, fiemap/bmap will tell you (presumably after you confirm that the
file got sealed or whatever) if there are shared extents.

> Don't we have an inode attribute flag for that? There's definitely a
> flag in the on disk XFS inode to indicate that there are shared
> extents on the file.
> 
> H - for some reason it's not exposed in FS_IOC_FSGETXATTR.
> Darrick? Any reason we don't expose the "this file has shared
> extents" flag to userspace? How are we checking that on-disk state
> in xfstests?
> 
> As it is, if we're adding an immutable extent flag, we've got to be
> able to query the immutable extent state of a file from userspace so
> I'm thinking that we'd need to expose it via the same interface that
> exposes the immutable flag. i.e. we could add the "shared extents
> present" flag to FS_IOC_FSGETXATTR at the same time...

We used to have a FSGETXATTR flag; iirc hch nak'd it during the review.

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> da...@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dave Chinner
On Fri, Aug 11, 2017 at 04:42:18PM -0700, Dan Williams wrote:
> On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner  wrote:
> > On Thu, Aug 10, 2017 at 11:39:28PM -0700, Dan Williams wrote:
> >> >From falloc.h:
> >>
> >> FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the
> >> file logical-to-physical extent offset mappings in the file. The
> >> purpose is to allow an application to assume that there are no holes
> >> or shared extents in the file and that the metadata needed to find
> >> all the physical extents of the file is stable and can never be
> >> dirtied.
> >>
> >> For now this patch only permits setting the in-memory state of
> >> S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is
> >> saved for later patches.
> >>
> >> The implementation is careful to not allow the immutable state to change
> >> while any process might have any established mappings. It reuses the
> >> existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare
> >> extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL
> >> while it validates the file is in the proper state and sets
> >> S_IOMAP_IMMUTABLE.
> >
> > SO I've been thinking about this - I'm thinking that we need to
> > separate the changes to the extent map from the action of sealing
> > the extent map.
> >
> > That is, I have a need to freeze an extent map without any
> > modification to it at all and breaking all the sharing and filling
> > all the holes completely screws up the file layout I need to
> > preserve. i.e. I want to be able to freeze the maps of a pair of
> > reflinked files so I can use FIEMAP to query only the changed blocks
> > and then read out the data in the private/changed blocks in the
> > reflinked file. I only need a temporary seal (if we crash it goes
> > away), so maybe there's another command modifier flag needed here,
> > too.
> >
> > The DAX allocation requirements can be handled by a preallocation
> > call to filll holes with zeros, followed by an unshare call to break
> > all the COW mappings, and then the extent map can be sealed. If you
> > need to check for holes after sealing, SEEK_HOLE will tell you what
> > you need to know...
> >
> > My preference really is for each fallocate command to do just one
> > thing - having the seal operation also modify the extent map
> > means it's not useful for the use cases where we need the extent map
> > to remain unmodified
> >
> > Thoughts?
> 
> That does seem to better follow the principle of least surprise and
> make the interface more composable. However, for the DAX case do we
> now end up needing a SEEK_SHARED, or something similar to check that
> we sealed the file without shared extents?

Don't we have an inode attribute flag for that? There's definitely a
flag in the on disk XFS inode to indicate that there are shared
extents on the file.

H - for some reason it's not exposed in FS_IOC_FSGETXATTR.
Darrick? Any reason we don't expose the "this file has shared
extents" flag to userspace? How are we checking that on-disk state
in xfstests?

As it is, if we're adding an immutable extent flag, we've got to be
able to query the immutable extent state of a file from userspace so
I'm thinking that we'd need to expose it via the same interface that
exposes the immutable flag. i.e. we could add the "shared extents
present" flag to FS_IOC_FSGETXATTR at the same time...

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dave Chinner
On Fri, Aug 11, 2017 at 04:42:18PM -0700, Dan Williams wrote:
> On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner  wrote:
> > On Thu, Aug 10, 2017 at 11:39:28PM -0700, Dan Williams wrote:
> >> >From falloc.h:
> >>
> >> FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the
> >> file logical-to-physical extent offset mappings in the file. The
> >> purpose is to allow an application to assume that there are no holes
> >> or shared extents in the file and that the metadata needed to find
> >> all the physical extents of the file is stable and can never be
> >> dirtied.
> >>
> >> For now this patch only permits setting the in-memory state of
> >> S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is
> >> saved for later patches.
> >>
> >> The implementation is careful to not allow the immutable state to change
> >> while any process might have any established mappings. It reuses the
> >> existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare
> >> extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL
> >> while it validates the file is in the proper state and sets
> >> S_IOMAP_IMMUTABLE.
> >
> > SO I've been thinking about this - I'm thinking that we need to
> > separate the changes to the extent map from the action of sealing
> > the extent map.
> >
> > That is, I have a need to freeze an extent map without any
> > modification to it at all and breaking all the sharing and filling
> > all the holes completely screws up the file layout I need to
> > preserve. i.e. I want to be able to freeze the maps of a pair of
> > reflinked files so I can use FIEMAP to query only the changed blocks
> > and then read out the data in the private/changed blocks in the
> > reflinked file. I only need a temporary seal (if we crash it goes
> > away), so maybe there's another command modifier flag needed here,
> > too.
> >
> > The DAX allocation requirements can be handled by a preallocation
> > call to filll holes with zeros, followed by an unshare call to break
> > all the COW mappings, and then the extent map can be sealed. If you
> > need to check for holes after sealing, SEEK_HOLE will tell you what
> > you need to know...
> >
> > My preference really is for each fallocate command to do just one
> > thing - having the seal operation also modify the extent map
> > means it's not useful for the use cases where we need the extent map
> > to remain unmodified
> >
> > Thoughts?
> 
> That does seem to better follow the principle of least surprise and
> make the interface more composable. However, for the DAX case do we
> now end up needing a SEEK_SHARED, or something similar to check that
> we sealed the file without shared extents?

Don't we have an inode attribute flag for that? There's definitely a
flag in the on disk XFS inode to indicate that there are shared
extents on the file.

H - for some reason it's not exposed in FS_IOC_FSGETXATTR.
Darrick? Any reason we don't expose the "this file has shared
extents" flag to userspace? How are we checking that on-disk state
in xfstests?

As it is, if we're adding an immutable extent flag, we've got to be
able to query the immutable extent state of a file from userspace so
I'm thinking that we'd need to expose it via the same interface that
exposes the immutable flag. i.e. we could add the "shared extents
present" flag to FS_IOC_FSGETXATTR at the same time...

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dan Williams
On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner  wrote:
> On Thu, Aug 10, 2017 at 11:39:28PM -0700, Dan Williams wrote:
>> >From falloc.h:
>>
>> FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the
>> file logical-to-physical extent offset mappings in the file. The
>> purpose is to allow an application to assume that there are no holes
>> or shared extents in the file and that the metadata needed to find
>> all the physical extents of the file is stable and can never be
>> dirtied.
>>
>> For now this patch only permits setting the in-memory state of
>> S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is
>> saved for later patches.
>>
>> The implementation is careful to not allow the immutable state to change
>> while any process might have any established mappings. It reuses the
>> existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare
>> extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL
>> while it validates the file is in the proper state and sets
>> S_IOMAP_IMMUTABLE.
>
> SO I've been thinking about this - I'm thinking that we need to
> separate the changes to the extent map from the action of sealing
> the extent map.
>
> That is, I have a need to freeze an extent map without any
> modification to it at all and breaking all the sharing and filling
> all the holes completely screws up the file layout I need to
> preserve. i.e. I want to be able to freeze the maps of a pair of
> reflinked files so I can use FIEMAP to query only the changed blocks
> and then read out the data in the private/changed blocks in the
> reflinked file. I only need a temporary seal (if we crash it goes
> away), so maybe there's another command modifier flag needed here,
> too.
>
> The DAX allocation requirements can be handled by a preallocation
> call to filll holes with zeros, followed by an unshare call to break
> all the COW mappings, and then the extent map can be sealed. If you
> need to check for holes after sealing, SEEK_HOLE will tell you what
> you need to know...
>
> My preference really is for each fallocate command to do just one
> thing - having the seal operation also modify the extent map
> means it's not useful for the use cases where we need the extent map
> to remain unmodified
>
> Thoughts?

That does seem to better follow the principle of least surprise and
make the interface more composable. However, for the DAX case do we
now end up needing a SEEK_SHARED, or something similar to check that
we sealed the file without shared extents?


Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dan Williams
On Fri, Aug 11, 2017 at 4:27 PM, Dave Chinner  wrote:
> On Thu, Aug 10, 2017 at 11:39:28PM -0700, Dan Williams wrote:
>> >From falloc.h:
>>
>> FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the
>> file logical-to-physical extent offset mappings in the file. The
>> purpose is to allow an application to assume that there are no holes
>> or shared extents in the file and that the metadata needed to find
>> all the physical extents of the file is stable and can never be
>> dirtied.
>>
>> For now this patch only permits setting the in-memory state of
>> S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is
>> saved for later patches.
>>
>> The implementation is careful to not allow the immutable state to change
>> while any process might have any established mappings. It reuses the
>> existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare
>> extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL
>> while it validates the file is in the proper state and sets
>> S_IOMAP_IMMUTABLE.
>
> SO I've been thinking about this - I'm thinking that we need to
> separate the changes to the extent map from the action of sealing
> the extent map.
>
> That is, I have a need to freeze an extent map without any
> modification to it at all and breaking all the sharing and filling
> all the holes completely screws up the file layout I need to
> preserve. i.e. I want to be able to freeze the maps of a pair of
> reflinked files so I can use FIEMAP to query only the changed blocks
> and then read out the data in the private/changed blocks in the
> reflinked file. I only need a temporary seal (if we crash it goes
> away), so maybe there's another command modifier flag needed here,
> too.
>
> The DAX allocation requirements can be handled by a preallocation
> call to filll holes with zeros, followed by an unshare call to break
> all the COW mappings, and then the extent map can be sealed. If you
> need to check for holes after sealing, SEEK_HOLE will tell you what
> you need to know...
>
> My preference really is for each fallocate command to do just one
> thing - having the seal operation also modify the extent map
> means it's not useful for the use cases where we need the extent map
> to remain unmodified
>
> Thoughts?

That does seem to better follow the principle of least surprise and
make the interface more composable. However, for the DAX case do we
now end up needing a SEEK_SHARED, or something similar to check that
we sealed the file without shared extents?


Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dave Chinner
On Thu, Aug 10, 2017 at 11:39:28PM -0700, Dan Williams wrote:
> >From falloc.h:
> 
> FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the
> file logical-to-physical extent offset mappings in the file. The
> purpose is to allow an application to assume that there are no holes
> or shared extents in the file and that the metadata needed to find
> all the physical extents of the file is stable and can never be
> dirtied.
> 
> For now this patch only permits setting the in-memory state of
> S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is
> saved for later patches.
> 
> The implementation is careful to not allow the immutable state to change
> while any process might have any established mappings. It reuses the
> existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare
> extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL
> while it validates the file is in the proper state and sets
> S_IOMAP_IMMUTABLE.

SO I've been thinking about this - I'm thinking that we need to
separate the changes to the extent map from the action of sealing
the extent map.

That is, I have a need to freeze an extent map without any
modification to it at all and breaking all the sharing and filling
all the holes completely screws up the file layout I need to
preserve. i.e. I want to be able to freeze the maps of a pair of
reflinked files so I can use FIEMAP to query only the changed blocks
and then read out the data in the private/changed blocks in the
reflinked file. I only need a temporary seal (if we crash it goes
away), so maybe there's another command modifier flag needed here,
too.

The DAX allocation requirements can be handled by a preallocation
call to filll holes with zeros, followed by an unshare call to break
all the COW mappings, and then the extent map can be sealed. If you
need to check for holes after sealing, SEEK_HOLE will tell you what
you need to know...

My preference really is for each fallocate command to do just one
thing - having the seal operation also modify the extent map
means it's not useful for the use cases where we need the extent map
to remain unmodified

Thoughts?

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dave Chinner
On Thu, Aug 10, 2017 at 11:39:28PM -0700, Dan Williams wrote:
> >From falloc.h:
> 
> FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the
> file logical-to-physical extent offset mappings in the file. The
> purpose is to allow an application to assume that there are no holes
> or shared extents in the file and that the metadata needed to find
> all the physical extents of the file is stable and can never be
> dirtied.
> 
> For now this patch only permits setting the in-memory state of
> S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is
> saved for later patches.
> 
> The implementation is careful to not allow the immutable state to change
> while any process might have any established mappings. It reuses the
> existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare
> extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL
> while it validates the file is in the proper state and sets
> S_IOMAP_IMMUTABLE.

SO I've been thinking about this - I'm thinking that we need to
separate the changes to the extent map from the action of sealing
the extent map.

That is, I have a need to freeze an extent map without any
modification to it at all and breaking all the sharing and filling
all the holes completely screws up the file layout I need to
preserve. i.e. I want to be able to freeze the maps of a pair of
reflinked files so I can use FIEMAP to query only the changed blocks
and then read out the data in the private/changed blocks in the
reflinked file. I only need a temporary seal (if we crash it goes
away), so maybe there's another command modifier flag needed here,
too.

The DAX allocation requirements can be handled by a preallocation
call to filll holes with zeros, followed by an unshare call to break
all the COW mappings, and then the extent map can be sealed. If you
need to check for holes after sealing, SEEK_HOLE will tell you what
you need to know...

My preference really is for each fallocate command to do just one
thing - having the seal operation also modify the extent map
means it's not useful for the use cases where we need the extent map
to remain unmodified

Thoughts?

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


[PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dan Williams
>From falloc.h:

FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the
file logical-to-physical extent offset mappings in the file. The
purpose is to allow an application to assume that there are no holes
or shared extents in the file and that the metadata needed to find
all the physical extents of the file is stable and can never be
dirtied.

For now this patch only permits setting the in-memory state of
S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is
saved for later patches.

The implementation is careful to not allow the immutable state to change
while any process might have any established mappings. It reuses the
existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare
extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL
while it validates the file is in the proper state and sets
S_IOMAP_IMMUTABLE.

Cc: Jan Kara 
Cc: Jeff Moyer 
Cc: Christoph Hellwig 
Cc: Ross Zwisler 
Cc: Alexander Viro 
Suggested-by: Dave Chinner 
Suggested-by: "Darrick J. Wong" 
Signed-off-by: Dan Williams 
---
 fs/open.c   |   16 --
 fs/xfs/xfs_bmap_util.c  |   72 +++
 fs/xfs/xfs_bmap_util.h  |2 +
 fs/xfs/xfs_file.c   |   14 ++--
 include/linux/falloc.h  |3 +-
 include/uapi/linux/falloc.h |   17 ++
 6 files changed, 117 insertions(+), 7 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 7395860d7164..76f57f7465c4 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -273,6 +273,17 @@ int vfs_fallocate(struct file *file, int mode, loff_t 
offset, loff_t len)
(mode & ~(FALLOC_FL_UNSHARE_RANGE | FALLOC_FL_KEEP_SIZE)))
return -EINVAL;
 
+   /*
+* Seal block map operation should only be used exclusively, and
+* with the IMMUTABLE capability.
+*/
+   if (mode & FALLOC_FL_SEAL_BLOCK_MAP) {
+   if (!capable(CAP_LINUX_IMMUTABLE))
+   return -EPERM;
+   if (mode & ~FALLOC_FL_SEAL_BLOCK_MAP)
+   return -EINVAL;
+   }
+
if (!(file->f_mode & FMODE_WRITE))
return -EBADF;
 
@@ -292,9 +303,10 @@ int vfs_fallocate(struct file *file, int mode, loff_t 
offset, loff_t len)
return -ETXTBSY;
 
/*
-* We cannot allow any allocation changes on an iomap immutable file
+* We cannot allow any allocation changes on an iomap immutable file,
+* but we can allow the fs to validate if this request is redundant.
 */
-   if (IS_IOMAP_IMMUTABLE(inode))
+   if (IS_IOMAP_IMMUTABLE(inode) && !(mode & FALLOC_FL_SEAL_BLOCK_MAP))
return -ETXTBSY;
 
/*
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 8427b0003c79..2ac8f4ed5723 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1390,6 +1390,78 @@ xfs_zero_file_space(
 
 }
 
+int
+xfs_seal_file_space(
+   struct xfs_inode*ip,
+   xfs_off_t   offset,
+   xfs_off_t   len)
+{
+   struct inode*inode = VFS_I(ip);
+   struct address_space*mapping = inode->i_mapping;
+   struct xfs_mount*mp = ip->i_mount;
+   uintblksize;
+   int error;
+
+   ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL));
+
+   if (offset)
+   return -EINVAL;
+
+   /* Before we unshare + allocate, validate if there is any work to do. */
+   xfs_ilock(ip, XFS_ILOCK_EXCL);
+   error = -EINVAL;
+   if (len != i_size_read(inode))
+   goto out_unlock;
+
+   error = 0;
+   if (IS_IOMAP_IMMUTABLE(inode))
+   goto out_unlock;
+   xfs_iunlock(ip, XFS_ILOCK_EXCL);
+
+
+   error = xfs_reflink_unshare(ip, offset, len);
+   if (error)
+   return error;
+
+   blksize = 1 << mp->m_sb.sb_blocklog;
+   error = xfs_alloc_file_space(ip, round_down(offset, blksize),
+   round_up(offset + len, blksize) -
+   round_down(offset, blksize),
+   XFS_BMAPI_CONVERT | XFS_BMAPI_ZERO);
+   if (error)
+   return error;
+
+
+   xfs_ilock(ip, XFS_ILOCK_EXCL);
+   /*
+* Did we race a size change? Note that since we hold the mmap
+* and i/o locks we cannot race hole punch.
+*/
+   error = -EINVAL;
+   if (len != i_size_read(inode))
+   goto out_unlock;
+
+   /*
+* Allow DAX path to assume that the state of S_IOMAP_IMMUTABLE
+* will never change while any mapping is established.
+*/
+   error = -EBUSY;
+   if (mapping_mapped(mapping))
+   goto out_unlock;
+
+   

[PATCH v3 2/6] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP

2017-08-11 Thread Dan Williams
>From falloc.h:

FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the
file logical-to-physical extent offset mappings in the file. The
purpose is to allow an application to assume that there are no holes
or shared extents in the file and that the metadata needed to find
all the physical extents of the file is stable and can never be
dirtied.

For now this patch only permits setting the in-memory state of
S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is
saved for later patches.

The implementation is careful to not allow the immutable state to change
while any process might have any established mappings. It reuses the
existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare
extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL
while it validates the file is in the proper state and sets
S_IOMAP_IMMUTABLE.

Cc: Jan Kara 
Cc: Jeff Moyer 
Cc: Christoph Hellwig 
Cc: Ross Zwisler 
Cc: Alexander Viro 
Suggested-by: Dave Chinner 
Suggested-by: "Darrick J. Wong" 
Signed-off-by: Dan Williams 
---
 fs/open.c   |   16 --
 fs/xfs/xfs_bmap_util.c  |   72 +++
 fs/xfs/xfs_bmap_util.h  |2 +
 fs/xfs/xfs_file.c   |   14 ++--
 include/linux/falloc.h  |3 +-
 include/uapi/linux/falloc.h |   17 ++
 6 files changed, 117 insertions(+), 7 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 7395860d7164..76f57f7465c4 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -273,6 +273,17 @@ int vfs_fallocate(struct file *file, int mode, loff_t 
offset, loff_t len)
(mode & ~(FALLOC_FL_UNSHARE_RANGE | FALLOC_FL_KEEP_SIZE)))
return -EINVAL;
 
+   /*
+* Seal block map operation should only be used exclusively, and
+* with the IMMUTABLE capability.
+*/
+   if (mode & FALLOC_FL_SEAL_BLOCK_MAP) {
+   if (!capable(CAP_LINUX_IMMUTABLE))
+   return -EPERM;
+   if (mode & ~FALLOC_FL_SEAL_BLOCK_MAP)
+   return -EINVAL;
+   }
+
if (!(file->f_mode & FMODE_WRITE))
return -EBADF;
 
@@ -292,9 +303,10 @@ int vfs_fallocate(struct file *file, int mode, loff_t 
offset, loff_t len)
return -ETXTBSY;
 
/*
-* We cannot allow any allocation changes on an iomap immutable file
+* We cannot allow any allocation changes on an iomap immutable file,
+* but we can allow the fs to validate if this request is redundant.
 */
-   if (IS_IOMAP_IMMUTABLE(inode))
+   if (IS_IOMAP_IMMUTABLE(inode) && !(mode & FALLOC_FL_SEAL_BLOCK_MAP))
return -ETXTBSY;
 
/*
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 8427b0003c79..2ac8f4ed5723 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1390,6 +1390,78 @@ xfs_zero_file_space(
 
 }
 
+int
+xfs_seal_file_space(
+   struct xfs_inode*ip,
+   xfs_off_t   offset,
+   xfs_off_t   len)
+{
+   struct inode*inode = VFS_I(ip);
+   struct address_space*mapping = inode->i_mapping;
+   struct xfs_mount*mp = ip->i_mount;
+   uintblksize;
+   int error;
+
+   ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL));
+
+   if (offset)
+   return -EINVAL;
+
+   /* Before we unshare + allocate, validate if there is any work to do. */
+   xfs_ilock(ip, XFS_ILOCK_EXCL);
+   error = -EINVAL;
+   if (len != i_size_read(inode))
+   goto out_unlock;
+
+   error = 0;
+   if (IS_IOMAP_IMMUTABLE(inode))
+   goto out_unlock;
+   xfs_iunlock(ip, XFS_ILOCK_EXCL);
+
+
+   error = xfs_reflink_unshare(ip, offset, len);
+   if (error)
+   return error;
+
+   blksize = 1 << mp->m_sb.sb_blocklog;
+   error = xfs_alloc_file_space(ip, round_down(offset, blksize),
+   round_up(offset + len, blksize) -
+   round_down(offset, blksize),
+   XFS_BMAPI_CONVERT | XFS_BMAPI_ZERO);
+   if (error)
+   return error;
+
+
+   xfs_ilock(ip, XFS_ILOCK_EXCL);
+   /*
+* Did we race a size change? Note that since we hold the mmap
+* and i/o locks we cannot race hole punch.
+*/
+   error = -EINVAL;
+   if (len != i_size_read(inode))
+   goto out_unlock;
+
+   /*
+* Allow DAX path to assume that the state of S_IOMAP_IMMUTABLE
+* will never change while any mapping is established.
+*/
+   error = -EBUSY;
+   if (mapping_mapped(mapping))
+   goto out_unlock;
+
+   /* Did we race someone attempting to share extents? */
+   if (xfs_is_reflink_inode(ip))
+   goto out_unlock;
+
+   error = 0;
+   inode->i_flags |=