Re: [PATCH v2 2/5] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP
On Fri, Aug 04, 2017 at 04:43:50PM -0700, Dan Williams wrote: > On Fri, Aug 4, 2017 at 4:31 PM, Dave Chinnerwrote: > > On Thu, Aug 03, 2017 at 07:28:17PM -0700, Dan Williams wrote: > >> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c > >> index fe0f8f7f4bb7..46d8eb9e19fc 100644 > >> --- a/fs/xfs/xfs_bmap_util.c > >> +++ b/fs/xfs/xfs_bmap_util.c > >> @@ -1393,6 +1393,107 @@ xfs_zero_file_space( > >> > >> } > >> > >> +/* Return 1 if hole detected, 0 if not, and < 0 if fail to determine */ > >> +STATIC int > >> +xfs_file_has_holes( > >> + struct xfs_inode*ip) > >> +{ > > > > Why do we need this function? > > > > We've just run xfs_alloc_file_space() across the entire range we > > are sealing, so we've already guaranteed that it won't have holes > > in it. > > I'm sure this is due to my ignorance of the scope of XFS_IOLOCK_EXCL > vs XFS_ILOCK_EXCL. I had assumed that since we drop and retake > XFS_ILOCK_EXCL that we need to re-validate the block map before > setting S_IOMAP_IMMUTABLE. THe ILOCK is there to protect the inode metadata when there is concurrent access through the IO/MMAP lock paths. However, if we hold the IOLOCK_EXCL and the MMAPLOCK_EXCL, then nothing can get through the IO interfaces to modify the data in the file. This is required because APIs that directly modify the extent map (e.g. fallocate, truncate, etc) have to lock out the IO path to ensure there are no IOs in flight across the range we are manipulating. Holding these locks also locks out other APIs that modify the extent map and so effectively nothing else can be accessing or modifying the extent map while a fallocate or truncate operation is in progress. Cheers, Dave. -- Dave Chinner da...@fromorbit.com ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
Re: [PATCH v2 2/5] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP
On Fri, Aug 4, 2017 at 4:31 PM, Dave Chinnerwrote: > On Thu, Aug 03, 2017 at 07:28:17PM -0700, Dan Williams wrote: >> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c >> index fe0f8f7f4bb7..46d8eb9e19fc 100644 >> --- a/fs/xfs/xfs_bmap_util.c >> +++ b/fs/xfs/xfs_bmap_util.c >> @@ -1393,6 +1393,107 @@ xfs_zero_file_space( >> >> } >> >> +/* Return 1 if hole detected, 0 if not, and < 0 if fail to determine */ >> +STATIC int >> +xfs_file_has_holes( >> + struct xfs_inode*ip) >> +{ > > Why do we need this function? > > We've just run xfs_alloc_file_space() across the entire range we > are sealing, so we've already guaranteed that it won't have holes > in it. I'm sure this is due to my ignorance of the scope of XFS_IOLOCK_EXCL vs XFS_ILOCK_EXCL. I had assumed that since we drop and retake XFS_ILOCK_EXCL that we need to re-validate the block map before setting S_IOMAP_IMMUTABLE. ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
Re: [PATCH v2 2/5] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP
On Thu, Aug 03, 2017 at 07:28:17PM -0700, Dan Williams wrote: > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c > index fe0f8f7f4bb7..46d8eb9e19fc 100644 > --- a/fs/xfs/xfs_bmap_util.c > +++ b/fs/xfs/xfs_bmap_util.c > @@ -1393,6 +1393,107 @@ xfs_zero_file_space( > > } > > +/* Return 1 if hole detected, 0 if not, and < 0 if fail to determine */ > +STATIC int > +xfs_file_has_holes( > + struct xfs_inode*ip) > +{ Why do we need this function? We've just run xfs_alloc_file_space() across the entire range we are sealing, so we've already guaranteed that it won't have holes in it. Cheers, Dave. -- Dave Chinner da...@fromorbit.com ___ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
Re: [PATCH v2 2/5] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP
On Fri, Aug 4, 2017 at 12:46 PM, Darrick J. Wongwrote: > On Thu, Aug 03, 2017 at 07:28:17PM -0700, Dan Williams wrote: >> >From falloc.h: >> >> FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the >> file logical-to-physical extent offset mappings in the file. The >> purpose is to allow an application to assume that there are no holes >> or shared extents in the file and that the metadata needed to find >> all the physical extents of the file is stable and can never be >> dirtied. >> >> For now this patch only permits setting the in-memory state of >> S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is >> saved for later patches. >> >> The implementation is careful to not allow the immutable state to change >> while any process might have any established mappings. It reuses the >> existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare >> extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL >> while it validates the file is in the proper state and sets >> S_IOMAP_IMMUTABLE. >> >> Cc: Jan Kara >> Cc: Jeff Moyer >> Cc: Christoph Hellwig >> Cc: Ross Zwisler >> Cc: Alexander Viro >> Suggested-by: Dave Chinner >> Suggested-by: "Darrick J. Wong" >> Signed-off-by: Dan Williams >> --- >> fs/open.c | 11 + >> fs/xfs/xfs_bmap_util.c | 101 >> +++ >> fs/xfs/xfs_bmap_util.h |2 + >> fs/xfs/xfs_file.c | 14 -- >> include/linux/falloc.h |3 + >> include/uapi/linux/falloc.h | 19 >> 6 files changed, 145 insertions(+), 5 deletions(-) >> >> diff --git a/fs/open.c b/fs/open.c >> index 7395860d7164..e3aae59785ae 100644 >> --- a/fs/open.c >> +++ b/fs/open.c >> @@ -273,6 +273,17 @@ int vfs_fallocate(struct file *file, int mode, loff_t >> offset, loff_t len) >> (mode & ~(FALLOC_FL_UNSHARE_RANGE | FALLOC_FL_KEEP_SIZE))) >> return -EINVAL; >> >> + /* >> + * Seal block map operation should only be used exclusively, and >> + * with the IMMUTABLE capability. >> + */ >> + if (mode & FALLOC_FL_SEAL_BLOCK_MAP) { >> + if (!capable(CAP_LINUX_IMMUTABLE)) >> + return -EPERM; >> + if (mode & ~FALLOC_FL_SEAL_BLOCK_MAP) >> + return -EINVAL; >> + } >> + >> if (!(file->f_mode & FMODE_WRITE)) >> return -EBADF; >> >> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c >> index fe0f8f7f4bb7..46d8eb9e19fc 100644 >> --- a/fs/xfs/xfs_bmap_util.c >> +++ b/fs/xfs/xfs_bmap_util.c >> @@ -1393,6 +1393,107 @@ xfs_zero_file_space( >> >> } >> >> +/* Return 1 if hole detected, 0 if not, and < 0 if fail to determine */ >> +STATIC int >> +xfs_file_has_holes( >> + struct xfs_inode*ip) >> +{ >> + struct xfs_mount*mp = ip->i_mount; >> + struct xfs_bmbt_irec*map; >> + const int map_size = 10; /* constrain memory overhead */ >> + int i, nmaps; >> + int error = 0; >> + xfs_fileoff_t lblkno = 0; >> + xfs_filblks_t maxlblkcnt; >> + >> + map = kmem_alloc(map_size * sizeof(*map), KM_SLEEP); > > Sleeping with an inode fully locked and (eventually) a running > transaction? Yikes. > > Just allocate one xfs_bmbt_irec on the stack and pass in nmaps=1. > > This method might fit better in libxfs/xfs_bmap.c where it'll be > able to scan the extent list more quickly with the iext helpers. Ok, I'll take a look. > >> + >> + maxlblkcnt = XFS_B_TO_FSB(mp, i_size_read(VFS_I(ip))); >> + do { >> + nmaps = map_size; >> + error = xfs_bmapi_read(ip, lblkno, maxlblkcnt - lblkno, >> +map, , 0); >> + if (error) >> + break; >> + >> + ASSERT(nmaps <= map_size); >> + for (i = 0; i < nmaps; i++) { >> + lblkno += map[i].br_blockcount; >> + if (map[i].br_startblock == HOLESTARTBLOCK) { > > I think we also need to check for unwritten extents here, because a > write to an unwritten block requires some zeroing and a mapping metdata > update. Will do. > >> + error = 1; >> + break; >> + } >> + } >> + } while (nmaps > 0 && error == 0); >> + >> + kmem_free(map); >> + return error; >> +} >> + >> +int >> +xfs_seal_file_space( >> + struct xfs_inode*ip, >> + xfs_off_t offset, >> + xfs_off_t len) >> +{ >> + struct inode*inode = VFS_I(ip); >> + struct address_space*mapping = inode->i_mapping; >> +
Re: [PATCH v2 2/5] fs, xfs: introduce FALLOC_FL_SEAL_BLOCK_MAP
On Thu, Aug 03, 2017 at 07:28:17PM -0700, Dan Williams wrote: > >From falloc.h: > > FALLOC_FL_SEAL_BLOCK_MAP is used to seal (make immutable) all of the > file logical-to-physical extent offset mappings in the file. The > purpose is to allow an application to assume that there are no holes > or shared extents in the file and that the metadata needed to find > all the physical extents of the file is stable and can never be > dirtied. > > For now this patch only permits setting the in-memory state of > S_IOMAP_IMMMUTABLE. Support for clearing and persisting the state is > saved for later patches. > > The implementation is careful to not allow the immutable state to change > while any process might have any established mappings. It reuses the > existing xfs_reflink_unshare() and xfs_alloc_file_space() to unshare > extents and fill all holes in the file. It then holds XFS_ILOCK_EXCL > while it validates the file is in the proper state and sets > S_IOMAP_IMMUTABLE. > > Cc: Jan Kara> Cc: Jeff Moyer > Cc: Christoph Hellwig > Cc: Ross Zwisler > Cc: Alexander Viro > Suggested-by: Dave Chinner > Suggested-by: "Darrick J. Wong" > Signed-off-by: Dan Williams > --- > fs/open.c | 11 + > fs/xfs/xfs_bmap_util.c | 101 > +++ > fs/xfs/xfs_bmap_util.h |2 + > fs/xfs/xfs_file.c | 14 -- > include/linux/falloc.h |3 + > include/uapi/linux/falloc.h | 19 > 6 files changed, 145 insertions(+), 5 deletions(-) > > diff --git a/fs/open.c b/fs/open.c > index 7395860d7164..e3aae59785ae 100644 > --- a/fs/open.c > +++ b/fs/open.c > @@ -273,6 +273,17 @@ int vfs_fallocate(struct file *file, int mode, loff_t > offset, loff_t len) > (mode & ~(FALLOC_FL_UNSHARE_RANGE | FALLOC_FL_KEEP_SIZE))) > return -EINVAL; > > + /* > + * Seal block map operation should only be used exclusively, and > + * with the IMMUTABLE capability. > + */ > + if (mode & FALLOC_FL_SEAL_BLOCK_MAP) { > + if (!capable(CAP_LINUX_IMMUTABLE)) > + return -EPERM; > + if (mode & ~FALLOC_FL_SEAL_BLOCK_MAP) > + return -EINVAL; > + } > + > if (!(file->f_mode & FMODE_WRITE)) > return -EBADF; > > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c > index fe0f8f7f4bb7..46d8eb9e19fc 100644 > --- a/fs/xfs/xfs_bmap_util.c > +++ b/fs/xfs/xfs_bmap_util.c > @@ -1393,6 +1393,107 @@ xfs_zero_file_space( > > } > > +/* Return 1 if hole detected, 0 if not, and < 0 if fail to determine */ > +STATIC int > +xfs_file_has_holes( > + struct xfs_inode*ip) > +{ > + struct xfs_mount*mp = ip->i_mount; > + struct xfs_bmbt_irec*map; > + const int map_size = 10; /* constrain memory overhead */ > + int i, nmaps; > + int error = 0; > + xfs_fileoff_t lblkno = 0; > + xfs_filblks_t maxlblkcnt; > + > + map = kmem_alloc(map_size * sizeof(*map), KM_SLEEP); Sleeping with an inode fully locked and (eventually) a running transaction? Yikes. Just allocate one xfs_bmbt_irec on the stack and pass in nmaps=1. This method might fit better in libxfs/xfs_bmap.c where it'll be able to scan the extent list more quickly with the iext helpers. > + > + maxlblkcnt = XFS_B_TO_FSB(mp, i_size_read(VFS_I(ip))); > + do { > + nmaps = map_size; > + error = xfs_bmapi_read(ip, lblkno, maxlblkcnt - lblkno, > +map, , 0); > + if (error) > + break; > + > + ASSERT(nmaps <= map_size); > + for (i = 0; i < nmaps; i++) { > + lblkno += map[i].br_blockcount; > + if (map[i].br_startblock == HOLESTARTBLOCK) { I think we also need to check for unwritten extents here, because a write to an unwritten block requires some zeroing and a mapping metdata update. > + error = 1; > + break; > + } > + } > + } while (nmaps > 0 && error == 0); > + > + kmem_free(map); > + return error; > +} > + > +int > +xfs_seal_file_space( > + struct xfs_inode*ip, > + xfs_off_t offset, > + xfs_off_t len) > +{ > + struct inode*inode = VFS_I(ip); > + struct address_space*mapping = inode->i_mapping; > + int error; > + > + ASSERT(xfs_isilocked(ip, XFS_MMAPLOCK_EXCL)); The IOLOCK must be held here too. > + > + if (offset) > + return -EINVAL; > + > + error = xfs_reflink_unshare(ip, offset, len); > +