Re: [Cluster-devel] [PATCH v2 00/23] fs-verity support for XFS
Hi Darrick,

On Tue, Apr 04, 2023 at 09:39:42AM -0700, Darrick J. Wong wrote:
> On Tue, Apr 04, 2023 at 04:52:56PM +0200, Andrey Albershteyn wrote:
> > Hi all,
> >
> > This is V2 of fs-verity support in XFS. In this series I did
> > numerous changes from V1 which are described below.
> >
> > This patchset introduces fs-verity [5] support for XFS. This
> > implementation utilizes extended attributes to store fs-verity
> > metadata. The Merkle tree blocks are stored in the remote extended
> > attributes.
> >
> > A few key points:
> > - fs-verity metadata is stored in extended attributes
> > - Direct path and DAX are disabled for inodes with fs-verity
> > - Pages are verified in iomap's read IO path (offloaded to
> >   workqueue)
> > - New workqueue for verification processing
> > - New ro-compat flag
> > - Inodes with fs-verity have new on-disk diflag
> > - xfs_attr_get() can return buffer with the attribute
> >
> > The patchset is tested with xfstests -g auto on xfs_1k, xfs_4k,
> > xfs_1k_quota, and xfs_4k_quota. Haven't found any major failures.
> >
> > Patches [6/23] and [7/23] touch ext4, f2fs, btrfs, and patch [8/23]
> > touches erofs, gfs2, and zonefs.
> >
> > The patchset consists of four parts:
> > - [1..4]: Patches from Parent Pointer patchset which add binary
> >   xattr names with a few deps
> > - [5..7]: Improvements to core fs-verity
> > - [8..9]: Add read path verification to iomap
> > - [10..23]: Integration of fs-verity to xfs
> >
> > Changes from V1:
> > - Added parent pointer patches for easier testing
> > - Many issues and refactoring points fixed from the V1 review
> > - Adjusted for recent changes in fs-verity core (folios, non-4k)
> > - Dropped disabling of large folios
> > - Completely new fsverity patches (fix, callout, log_blocksize)
> > - Change approach to verification in iomap to the same one as in
> >   write path. Callouts to fs instead of direct fs-verity use.
> > - New XFS workqueue for post read folio verification
> > - xfs_attr_get() can return underlying xfs_buf
> > - xfs_bufs are marked with XBF_VERITY_CHECKED to track verified
> >   blocks
> >
> > kernel:
> > [1]: https://github.com/alberand/linux/tree/xfs-verity-v2
> >
> > xfsprogs:
> > [2]: https://github.com/alberand/xfsprogs/tree/fsverity-v2
>
> Will there be any means for xfs_repair to check the merkle tree contents?
> Should it clear the ondisk inode flag if it decides to trash the xattr
> structure, or is it ok to let the kernel deal with flag set and no
> verity data?

fsverity-utils can compute the Merkle tree offline, so xfs_repair could
do the same and compare the result. It could also check that all Merkle
tree blocks are present. The flag without a tree is probably bad: all
reads will fail, and it won't be possible to regenerate the tree (enable
also checks for the flag).

--
- Andrey
Re: [Cluster-devel] [PATCH v2 00/23] fs-verity support for XFS
On Tue, Apr 04, 2023 at 04:37:13PM -0700, Eric Biggers wrote:
> On Tue, Apr 04, 2023 at 04:52:56PM +0200, Andrey Albershteyn wrote:
> > The patchset is tested with xfstests -g auto on xfs_1k, xfs_4k,
> > xfs_1k_quota, and xfs_4k_quota. Haven't found any major failures.
>
> Just to double check, did you verify that the tests in the "verity" group are
> running, and were not skipped?

Yes, the xfstests branch linked in the cover letter has a patch that
enables them for XFS (the xfsprogs branch is also needed).

>
> - Eric
>

--
- Andrey
Re: [Cluster-devel] [PATCH v2 21/23] xfs: handle merkle tree block size != fs blocksize != PAGE_SIZE
Hi Darrick,

On Tue, Apr 04, 2023 at 09:36:02AM -0700, Darrick J. Wong wrote:
> On Tue, Apr 04, 2023 at 04:53:17PM +0200, Andrey Albershteyn wrote:
> > In case of different Merkle tree block size fs-verity expects
> > ->read_merkle_tree_page() to return Merkle tree page filled with
> > Merkle tree blocks. The XFS stores each merkle tree block under
> > extended attribute. Those attributes are addressed by block offset
> > into Merkle tree.
> >
> > This patch make ->read_merkle_tree_page() to fetch multiple merkle
> > tree blocks based on size ratio. Also the reference to each xfs_buf
> > is passed with page->private to ->drop_page().
> >
> > Signed-off-by: Andrey Albershteyn
> > ---
> >  fs/xfs/xfs_verity.c | 74 +++--
> >  fs/xfs/xfs_verity.h |  8 +
> >  2 files changed, 66 insertions(+), 16 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
> > index a9874ff4efcd..ef0aff216f06 100644
> > --- a/fs/xfs/xfs_verity.c
> > +++ b/fs/xfs/xfs_verity.c
> > @@ -134,6 +134,10 @@ xfs_read_merkle_tree_page(
> >  	struct page		*page = NULL;
> >  	__be64			name = cpu_to_be64(index << PAGE_SHIFT);
> >  	uint32_t		bs = 1 << log_blocksize;
> > +	int			blocks_per_page =
> > +		(1 << (PAGE_SHIFT - log_blocksize));
> > +	int			n = 0;
> > +	int			offset = 0;
> >  	struct xfs_da_args	args = {
> >  		.dp		= ip,
> >  		.attr_filter	= XFS_ATTR_VERITY,
> > @@ -143,26 +147,59 @@ xfs_read_merkle_tree_page(
> >  		.valuelen	= bs,
> >  	};
> >  	int			error = 0;
> > +	bool			is_checked = true;
> > +	struct xfs_verity_buf_list	*buf_list;
> >
> >  	page = alloc_page(GFP_KERNEL);
> >  	if (!page)
> >  		return ERR_PTR(-ENOMEM);
> >
> > -	error = xfs_attr_get(&args);
> > -	if (error) {
> > -		kmem_free(args.value);
> > -		xfs_buf_rele(args.bp);
> > +	buf_list = kzalloc(sizeof(struct xfs_verity_buf_list), GFP_KERNEL);
> > +	if (!buf_list) {
> >  		put_page(page);
> > -		return ERR_PTR(-EFAULT);
> > +		return ERR_PTR(-ENOMEM);
> >  	}
> >
> > -	if (args.bp->b_flags & XBF_VERITY_CHECKED)
> > +	/*
> > +	 * Fill the page with Merkle tree blocks. The blocks_per_page is higher
> > +	 * than 1 when fs block size != PAGE_SIZE or Merkle tree block size !=
> > +	 * PAGE_SIZE
> > +	 */
> > +	for (n = 0; n < blocks_per_page; n++) {
>
> Ahah, ok, that's why we can't pass the xfs_buf pages up to fsverity.
>
> > +		offset = bs * n;
> > +		name = cpu_to_be64(((index << PAGE_SHIFT) + offset));
>
> Really this ought to be a typechecked helper...
>
> struct xfs_fsverity_merkle_key {
> 	__be64	merkleoff;
> };
>
> static inline void
> xfs_fsverity_merkle_key_to_disk(struct xfs_fsverity_merkle_key *k, loff_t pos)
> {
> 	k->merkleoff = cpu_to_be64(pos);
> }

Sure, thanks, will change this.

> > +		args.name = (const uint8_t *)&name;
> > +
> > +		error = xfs_attr_get(&args);
> > +		if (error) {
> > +			kmem_free(args.value);
> > +			/*
> > +			 * No more Merkle tree blocks (e.g. this was the last
> > +			 * block of the tree)
> > +			 */
> > +			if (error == -ENOATTR)
> > +				break;
> > +			xfs_buf_rele(args.bp);
> > +			put_page(page);
> > +			kmem_free(buf_list);
> > +			return ERR_PTR(-EFAULT);
> > +		}
> > +
> > +		buf_list->bufs[buf_list->buf_count++] = args.bp;
> > +
> > +		/* One of the buffers was dropped */
> > +		if (!(args.bp->b_flags & XBF_VERITY_CHECKED))
> > +			is_checked = false;
>
> If there's enough memory pressure to cause the merkle tree pages to get
> evicted, what are the chances that the xfs_bufs survive the eviction?

The merkle tree pages are dropped after verification. When a page is
dropped, the xfs_buf is marked as verified. If fs-verity wants to verify
again it will get the same verified buffer. If the buffer is evicted it
won't have the verified state. So, with enough memo
Re: [Cluster-devel] [PATCH v2 21/23] xfs: handle merkle tree block size != fs blocksize != PAGE_SIZE
Hi Eric,

On Tue, Apr 04, 2023 at 04:32:24PM -0700, Eric Biggers wrote:
> Hi Andrey,
>
> On Tue, Apr 04, 2023 at 04:53:17PM +0200, Andrey Albershteyn wrote:
> > In case of different Merkle tree block size fs-verity expects
> > ->read_merkle_tree_page() to return Merkle tree page filled with
> > Merkle tree blocks. The XFS stores each merkle tree block under
> > extended attribute. Those attributes are addressed by block offset
> > into Merkle tree.
> >
> > This patch make ->read_merkle_tree_page() to fetch multiple merkle
> > tree blocks based on size ratio. Also the reference to each xfs_buf
> > is passed with page->private to ->drop_page().
> >
> > Signed-off-by: Andrey Albershteyn
> > ---
> >  fs/xfs/xfs_verity.c | 74 +++--
> >  fs/xfs/xfs_verity.h |  8 +
> >  2 files changed, 66 insertions(+), 16 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c
> > index a9874ff4efcd..ef0aff216f06 100644
> > --- a/fs/xfs/xfs_verity.c
> > +++ b/fs/xfs/xfs_verity.c
> > @@ -134,6 +134,10 @@ xfs_read_merkle_tree_page(
> >  	struct page		*page = NULL;
> >  	__be64			name = cpu_to_be64(index << PAGE_SHIFT);
> >  	uint32_t		bs = 1 << log_blocksize;
> > +	int			blocks_per_page =
> > +		(1 << (PAGE_SHIFT - log_blocksize));
> > +	int			n = 0;
> > +	int			offset = 0;
> >  	struct xfs_da_args	args = {
> >  		.dp		= ip,
> >  		.attr_filter	= XFS_ATTR_VERITY,
> > @@ -143,26 +147,59 @@ xfs_read_merkle_tree_page(
> >  		.valuelen	= bs,
> >  	};
> >  	int			error = 0;
> > +	bool			is_checked = true;
> > +	struct xfs_verity_buf_list	*buf_list;
> >
> >  	page = alloc_page(GFP_KERNEL);
> >  	if (!page)
> >  		return ERR_PTR(-ENOMEM);
> >
> > -	error = xfs_attr_get(&args);
> > -	if (error) {
> > -		kmem_free(args.value);
> > -		xfs_buf_rele(args.bp);
> > +	buf_list = kzalloc(sizeof(struct xfs_verity_buf_list), GFP_KERNEL);
> > +	if (!buf_list) {
> >  		put_page(page);
> > -		return ERR_PTR(-EFAULT);
> > +		return ERR_PTR(-ENOMEM);
> >  	}
> >
> > -	if (args.bp->b_flags & XBF_VERITY_CHECKED)
> > +	/*
> > +	 * Fill the page with Merkle tree blocks. The blocks_per_page is higher
> > +	 * than 1 when fs block size != PAGE_SIZE or Merkle tree block size !=
> > +	 * PAGE_SIZE
> > +	 */
> > +	for (n = 0; n < blocks_per_page; n++) {
> > +		offset = bs * n;
> > +		name = cpu_to_be64(((index << PAGE_SHIFT) + offset));
> > +		args.name = (const uint8_t *)&name;
> > +
> > +		error = xfs_attr_get(&args);
> > +		if (error) {
> > +			kmem_free(args.value);
> > +			/*
> > +			 * No more Merkle tree blocks (e.g. this was the last
> > +			 * block of the tree)
> > +			 */
> > +			if (error == -ENOATTR)
> > +				break;
> > +			xfs_buf_rele(args.bp);
> > +			put_page(page);
> > +			kmem_free(buf_list);
> > +			return ERR_PTR(-EFAULT);
> > +		}
> > +
> > +		buf_list->bufs[buf_list->buf_count++] = args.bp;
> > +
> > +		/* One of the buffers was dropped */
> > +		if (!(args.bp->b_flags & XBF_VERITY_CHECKED))
> > +			is_checked = false;
> > +
> > +		memcpy(page_address(page) + offset, args.value, args.valuelen);
> > +		kmem_free(args.value);
> > +		args.value = NULL;
> > +	}
>
> I was really hoping for a solution where the cached data can be used directly,
> instead of allocating a temporary page and copying the cached data into it every
> time the cache is accessed. The problem with what you have now is that every
> time a single 32-byte hash is accessed, a full page (potentially 64KB!) will be
> allocated and filled. That's not very efficient. The need to allocate the
> temporary page can also cause ENOMEM (which will get reported as EIO).
>
> Did you consider alternatives that would work more efficiently? I think it
> would be worth designing something that works prop
Re: [Cluster-devel] [PATCH v2 19/23] xfs: disable direct read path for fs-verity sealed files
On Tue, Apr 04, 2023 at 09:10:47AM -0700, Darrick J. Wong wrote:
> On Tue, Apr 04, 2023 at 04:53:15PM +0200, Andrey Albershteyn wrote:
> > The direct path is not supported on verity files. Attempts to use direct
> > I/O path on such files should fall back to buffered I/O path.
> >
> > Signed-off-by: Andrey Albershteyn
> > ---
> >  fs/xfs/xfs_file.c | 14 +++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index 947b5c436172..9e072e82f6c1 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -244,7 +244,8 @@ xfs_file_dax_read(
> >  	struct kiocb		*iocb,
> >  	struct iov_iter		*to)
> >  {
> > -	struct xfs_inode	*ip = XFS_I(iocb->ki_filp->f_mapping->host);
> > +	struct inode		*inode = iocb->ki_filp->f_mapping->host;
> > +	struct xfs_inode	*ip = XFS_I(inode);
> >  	ssize_t			ret = 0;
> >
> >  	trace_xfs_file_dax_read(iocb, to);
> > @@ -297,10 +298,17 @@ xfs_file_read_iter(
> >
> >  	if (IS_DAX(inode))
> >  		ret = xfs_file_dax_read(iocb, to);
> > -	else if (iocb->ki_flags & IOCB_DIRECT)
> > +	else if (iocb->ki_flags & IOCB_DIRECT && !fsverity_active(inode))
> >  		ret = xfs_file_dio_read(iocb, to);
> > -	else
> > +	else {
> > +		/*
> > +		 * In case fs-verity is enabled, we also fallback to the
> > +		 * buffered read from the direct read path. Therefore,
> > +		 * IOCB_DIRECT is set and need to be cleared
> > +		 */
> > +		iocb->ki_flags &= ~IOCB_DIRECT;
> >  		ret = xfs_file_buffered_read(iocb, to);
>
> XFS doesn't usually allow directio fallback to the pagecache. Why
> would fsverity be any different?

I didn't know that. This is what happens on ext4, so I did the same.
Then it probably makes sense to just return an error for direct I/O on
a verity-sealed file.

>
> --D
>
> > +	}
> >
> >  	if (ret > 0)
> >  		XFS_STATS_ADD(mp, xs_read_bytes, ret);
> > --
> > 2.38.4
> >
>

--
- Andrey
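
A minimal userspace sketch of the alternative discussed above, assuming
direct I/O on a verity file is rejected instead of being redirected to
the page cache. The function name, the enum, and the choice of
-EOPNOTSUPP are illustrative assumptions; the thread does not settle on
an error code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical labels for the three read paths in xfs_file_read_iter(). */
enum read_path { READ_DAX, READ_DIRECT, READ_BUFFERED };

/*
 * Pick a read path, or return a negative errno when the request cannot
 * be served. Direct I/O on a verity file is refused (assumed error code)
 * rather than falling back to buffered I/O.
 */
static int pick_read_path(bool is_dax, bool want_direct, bool verity_active)
{
	if (is_dax)
		return READ_DAX;
	if (want_direct) {
		if (verity_active)
			return -EOPNOTSUPP;	/* assumption, not from the patch */
		return READ_DIRECT;
	}
	return READ_BUFFERED;
}
```

With this shape the caller never has to clear IOCB_DIRECT behind the
application's back, which is the behaviour Darrick questions.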
Re: [Cluster-devel] [PATCH v2 16/23] xfs: add inode on-disk VERITY flag
Hi Eric and Dave,

On Wed, Apr 05, 2023 at 09:56:33AM +1000, Dave Chinner wrote:
> On Tue, Apr 04, 2023 at 03:41:23PM -0700, Eric Biggers wrote:
> > Hi Andrey,
> >
> > On Tue, Apr 04, 2023 at 04:53:12PM +0200, Andrey Albershteyn wrote:
> > > Add flag to mark inodes which have fs-verity enabled on them (i.e.
> > > descriptor exist and tree is built).
> > >
> > > Signed-off-by: Andrey Albershteyn
> > > ---
> > >  fs/ioctl.c                 | 4
> > >  fs/xfs/libxfs/xfs_format.h | 4 +++-
> > >  fs/xfs/xfs_inode.c         | 2 ++
> > >  fs/xfs/xfs_iops.c          | 2 ++
> > >  include/uapi/linux/fs.h    | 1 +
> > >  5 files changed, 12 insertions(+), 1 deletion(-)
> > [...]
> > > diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> > > index b7b56871029c..5172a2eb902c 100644
> > > --- a/include/uapi/linux/fs.h
> > > +++ b/include/uapi/linux/fs.h
> > > @@ -140,6 +140,7 @@ struct fsxattr {
> > >  #define FS_XFLAG_FILESTREAM	0x00004000	/* use filestream allocator */
> > >  #define FS_XFLAG_DAX		0x00008000	/* use DAX for IO */
> > >  #define FS_XFLAG_COWEXTSIZE	0x00010000	/* CoW extent size allocator hint */
> > > +#define FS_XFLAG_VERITY		0x00020000	/* fs-verity sealed inode */
> > >  #define FS_XFLAG_HASATTR	0x80000000	/* no DIFLAG for this */
> > >
> >
> > I don't think "xfs: add inode on-disk VERITY flag" is an accurate description of
> > a patch that involves adding something to the UAPI.
>
> Well it does that, but it also adds the UAPI for querying the
> on-disk flag via the FS_IOC_FSGETXATTR interface as well. It
> probably should be split up into two patches.

Sure.

>
> > Should the other filesystems support this new flag too?
>
> I think they should get it automatically now that it has been
> defined for FS_IOC_FSGETXATTR and added to the generic fileattr flag
> fill functions in fs/ioctl.c.
>
> > I'd also like all ways of getting the verity flag to continue to be mentioned in
> > Documentation/filesystems/fsverity.rst. The existing methods (FS_IOC_GETFLAGS
> > and statx) are already mentioned there.
>
> *nod*
>

Ok, sure, I missed that. I will split this patch and add the description.

--
- Andrey
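
For reference, a userspace sketch of how the flag could be queried
through FS_IOC_FSGETXATTR once it is exposed. The helper name is
hypothetical, and the fallback definition of FS_XFLAG_VERITY (the bit
after FS_XFLAG_COWEXTSIZE) is an assumption used only when the installed
headers do not provide the macro.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

#ifndef FS_XFLAG_VERITY
#define FS_XFLAG_VERITY	0x00020000	/* assumed value, see lead-in */
#endif

/*
 * Hypothetical helper: returns 1 if the verity xflag is set on the file,
 * 0 if it is clear, or a negative errno if the file cannot be opened or
 * the filesystem does not implement FS_IOC_FSGETXATTR.
 */
static int file_has_verity_xflag(const char *path)
{
	struct fsxattr fa;
	int fd, ret;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	if (ioctl(fd, FS_IOC_FSGETXATTR, &fa) < 0)
		ret = -errno;
	else
		ret = (fa.fsx_xflags & FS_XFLAG_VERITY) ? 1 : 0;

	close(fd);
	return ret;
}
```

The other two methods mentioned in the thread, FS_IOC_GETFLAGS and
statx, remain the documented ways to read the flag on filesystems that
already support fs-verity.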
Re: [Cluster-devel] [PATCH v2 09/23] iomap: allow filesystem to implement read path verification
Hi Christoph,

On Tue, Apr 04, 2023 at 08:37:02AM -0700, Christoph Hellwig wrote:
> >  	if (iomap_block_needs_zeroing(iter, pos)) {
> >  		folio_zero_range(folio, poff, plen);
> > +		if (iomap->flags & IOMAP_F_READ_VERITY) {
>
> Why do we need the new flag vs just testing that folio_ops and
> folio_ops->verify_folio is non-NULL?

Yes, it can be just a test. I hadn't noticed that the flag is used only
here; initially I used it in several places.

>
> > -		ctx->bio = bio_alloc(iomap->bdev, bio_max_segs(nr_vecs),
> > -				     REQ_OP_READ, gfp);
> > +		ctx->bio = bio_alloc_bioset(iomap->bdev, bio_max_segs(nr_vecs),
> > +				REQ_OP_READ, GFP_NOFS, _read_ioend_bioset);
>
> All other callers don't really need the larger bioset, so I'd avoid
> the unconditional allocation here, but more on that later.

Ok, makes sense.

>
> > +		ioend = container_of(ctx->bio, struct iomap_read_ioend,
> > +				read_inline_bio);
> > +		ioend->io_inode = iter->inode;
> > +		if (ctx->ops && ctx->ops->prepare_ioend)
> > +			ctx->ops->prepare_ioend(ioend);
> > +
>
> So what we're doing in writeback and direct I/O, is to:
>
>  a) have a submit_bio hook
>  b) allow the file system to then hook the bi_end_io caller
>  c) (only in direct I/O for now) allow the file system to provide
>     a bio_set to allocate from

I see.

>
> I wonder if that also makes sense and keep all the deferral in the
> file system. We'll need that for the btrfs iomap conversion anyway,
> and it seems more flexible. The ioend processing would then move into
> XFS.
>

Not sure what you mean here.

> > @@ -156,6 +160,11 @@ struct iomap_folio_ops {
> >  	 * locked by the iomap code.
> >  	 */
> >  	bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
> > +
> > +	/*
> > +	 * Verify folio when successfully read
> > +	 */
> > +	bool (*verify_folio)(struct folio *folio, loff_t pos, unsigned int len);
>
> Why isn't this in iomap_readpage_ops?
>

Yes, it can be. It appeared more relevant to _folio_ops to me; is there
any particular reason to move it there? I don't mind moving it to
iomap_readpage_ops.

--
- Andrey
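
The first point in this exchange, testing for the hook instead of
carrying a separate IOMAP_F_READ_VERITY flag, can be sketched in plain
C. The types below are toy stand-ins, not the kernel's struct folio or
iomap ops tables, and finish_read() is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a folio; only here so the sketch compiles. */
struct folio { int id; };

struct folio_ops {
	/* Optional hook: NULL means no read-path verification is needed. */
	bool (*verify_folio)(struct folio *folio, long pos, unsigned int len);
};

static int verify_calls;	/* counts hook invocations for the example */

/* Example hook that always accepts the folio. */
static bool fake_verify(struct folio *folio, long pos, unsigned int len)
{
	(void)folio;
	(void)pos;
	(void)len;
	verify_calls++;
	return true;
}

/*
 * Returns true when the range may be marked up to date. The presence of
 * the hook is the only signal; no extra flag is consulted.
 */
static bool finish_read(const struct folio_ops *ops, struct folio *folio,
			long pos, unsigned int len)
{
	if (ops && ops->verify_folio)
		return ops->verify_folio(folio, pos, len);
	return true;	/* nothing to verify */
}
```

Filesystems that never set the hook pay only for the NULL check, which
is the property the review comment is after.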
Re: [Cluster-devel] [PATCH v2 06/23] fsverity: add drop_page() callout
Hi Dave,

On Wed, Apr 05, 2023 at 09:40:19AM +1000, Dave Chinner wrote:
> On Tue, Apr 04, 2023 at 04:53:02PM +0200, Andrey Albershteyn wrote:
> > Allow filesystem to make additional processing on verified pages
> > instead of just dropping a reference. This will be used by XFS for
> > internal buffer cache manipulation in further patches. The btrfs,
> > ext4, and f2fs just drop the reference.
> >
> > Signed-off-by: Andrey Albershteyn
> > ---
> >  fs/btrfs/verity.c         | 12
> >  fs/ext4/verity.c          |  6 ++
> >  fs/f2fs/verity.c          |  6 ++
> >  fs/verity/read_metadata.c |  4 ++--
> >  fs/verity/verify.c        |  6 +++---
> >  include/linux/fsverity.h  | 10 ++
> >  6 files changed, 39 insertions(+), 5 deletions(-)
> >
> > diff --git a/fs/btrfs/verity.c b/fs/btrfs/verity.c
> > index c5ff16f9e9fa..4c2c09204bb4 100644
> > --- a/fs/btrfs/verity.c
> > +++ b/fs/btrfs/verity.c
> > @@ -804,10 +804,22 @@ static int btrfs_write_merkle_tree_block(struct inode *inode, const void *buf,
> >  			       pos, buf, size);
> >  }
> >
> > +/*
> > + * fsverity op that releases the reference obtained by ->read_merkle_tree_page()
> > + *
> > + * @page: reference to the page which can be released
> > + *
> > + */
> > +static void btrfs_drop_page(struct page *page)
> > +{
> > +	put_page(page);
> > +}
> > +
> >  const struct fsverity_operations btrfs_verityops = {
> >  	.begin_enable_verity = btrfs_begin_enable_verity,
> >  	.end_enable_verity = btrfs_end_enable_verity,
> >  	.get_verity_descriptor = btrfs_get_verity_descriptor,
> >  	.read_merkle_tree_page = btrfs_read_merkle_tree_page,
> >  	.write_merkle_tree_block = btrfs_write_merkle_tree_block,
> > +	.drop_page = &btrfs_drop_page,
> >  };
>
> Ok, that's a generic put_page() call.
>
> > diff --git a/fs/verity/verify.c b/fs/verity/verify.c
> > index f50e3b5b52c9..c2fc4c86af34 100644
> > --- a/fs/verity/verify.c
> > +++ b/fs/verity/verify.c
> > @@ -210,7 +210,7 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
> >  		if (is_hash_block_verified(vi, hpage, hblock_idx)) {
> >  			memcpy_from_page(_want_hash, hpage, hoffset, hsize);
> >  			want_hash = _want_hash;
> > -			put_page(hpage);
> > +			inode->i_sb->s_vop->drop_page(hpage);
> >  			goto descend;
>
> 			fsverity_drop_page(inode, hpage);
>
> static inline void
> fsverity_drop_page(struct inode *inode, struct page *page)
> {
> 	if (inode->i_sb->s_vop->drop_page)
> 		inode->i_sb->s_vop->drop_page(page);
> 	else
> 		put_page(page);
> }
>
> And then you don't need to add the functions to each of the
> filesystems nor make an indirect call just to run put_page().

Sure, this makes more sense, thank you!

--
- Andrey
Re: [Cluster-devel] [PATCH v2 05/23] fsverity: make fsverity_verify_folio() accept folio's offset and size
Hi Christoph,

On Tue, Apr 04, 2023 at 08:30:36AM -0700, Christoph Hellwig wrote:
> On Tue, Apr 04, 2023 at 04:53:01PM +0200, Andrey Albershteyn wrote:
> > Not the whole folio always need to be verified by fs-verity (e.g.
> > with 1k blocks). Use passed folio's offset and size.
>
> Why can't those callers just call fsverity_verify_blocks directly?
>

They can. Calling _verify_folio with an explicit offset and size seemed
clearer to me, but I'm ok with dropping this patch and keeping the
full-folio verify function.

--
- Andrey
[Cluster-devel] [PATCH v2 18/23] xfs: don't allow to enable DAX on fs-verity sealed inode
fs-verity doesn't support DAX. Forbid filesystem to enable DAX on
inodes which already have fs-verity enabled. The opposite is checked
when fs-verity is enabled, it won't be enabled if DAX is.

Signed-off-by: Andrey Albershteyn
---
 fs/xfs/xfs_iops.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 5398be75a76a..e0d7107a9ba1 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1204,6 +1204,8 @@ xfs_inode_should_enable_dax(
 		return false;
 	if (!xfs_inode_supports_dax(ip))
 		return false;
+	if (ip->i_diflags2 & XFS_DIFLAG2_VERITY)
+		return false;
 	if (xfs_has_dax_always(ip->i_mount))
 		return true;
 	if (ip->i_diflags2 & XFS_DIFLAG2_DAX)
--
2.38.4
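
The commit message describes a two-way exclusion: DAX is refused on a
verity inode, and verity is refused on a DAX inode. A toy model of both
checks follows; the struct, the function names, and the -EINVAL return
are assumptions made for illustration, not code from the series.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the inode state the checks consult. */
struct inode_state {
	bool supports_dax;	/* device and file type allow DAX */
	bool dax_always;	/* mounted with dax=always */
	bool dax_flag;		/* per-inode DAX flag */
	bool verity;		/* fs-verity already enabled */
	bool dax_active;	/* inode currently uses DAX */
};

/* Mirrors the order of checks in the hunk: verity wins over dax=always. */
static bool should_enable_dax(const struct inode_state *s)
{
	if (!s->supports_dax)
		return false;
	if (s->verity)
		return false;
	if (s->dax_always)
		return true;
	return s->dax_flag;
}

/* The opposite direction; the error code is an assumption. */
static int can_enable_verity(const struct inode_state *s)
{
	return s->dax_active ? -EINVAL : 0;
}
```

Placing the verity test before the dax=always test matters: otherwise a
mount option would override the per-inode restriction.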
[Cluster-devel] [PATCH v2 12/23] xfs: introduce workqueue for post read IO work
As noted by Dave there are two problems with using fs-verity's
workqueue in XFS:

1. High priority workqueues are used within XFS to ensure that data
   IO completion cannot stall processing of journal IO completions.
   Hence using a WQ_HIGHPRI workqueue directly in the user data IO
   path is a potential filesystem livelock/deadlock vector.

2. The fsverity workqueue is global - it creates a cross-filesystem
   contention point.

This patch adds per-filesystem, per-cpu workqueue for fsverity work.

Signed-off-by: Andrey Albershteyn
---
 fs/xfs/xfs_mount.h | 1 +
 fs/xfs/xfs_super.c | 9 +
 2 files changed, 10 insertions(+)

diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index f3269c0626f0..53a4a9304937 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -107,6 +107,7 @@ typedef struct xfs_mount {
 	struct xfs_mru_cache	*m_filestream;  /* per-mount filestream data */
 	struct workqueue_struct *m_buf_workqueue;
 	struct workqueue_struct	*m_unwritten_workqueue;
+	struct workqueue_struct	*m_postread_workqueue;
 	struct workqueue_struct	*m_reclaim_workqueue;
 	struct workqueue_struct *m_sync_workqueue;
 	struct workqueue_struct *m_blockgc_wq;
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 4f814f9e12ab..d6f22cb94ee2 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -548,6 +548,12 @@ xfs_init_mount_workqueues(
 	if (!mp->m_unwritten_workqueue)
 		goto out_destroy_buf;
 
+	mp->m_postread_workqueue = alloc_workqueue("xfs-pread/%s",
+			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM),
+			0, mp->m_super->s_id);
+	if (!mp->m_postread_workqueue)
+		goto out_destroy_postread;
+
 	mp->m_reclaim_workqueue = alloc_workqueue("xfs-reclaim/%s",
 			XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM),
 			0, mp->m_super->s_id);
@@ -581,6 +587,8 @@ xfs_init_mount_workqueues(
 	destroy_workqueue(mp->m_reclaim_workqueue);
 out_destroy_unwritten:
 	destroy_workqueue(mp->m_unwritten_workqueue);
+out_destroy_postread:
+	destroy_workqueue(mp->m_postread_workqueue);
 out_destroy_buf:
 	destroy_workqueue(mp->m_buf_workqueue);
 out:
@@ -596,6 +604,7 @@ xfs_destroy_mount_workqueues(
 	destroy_workqueue(mp->m_inodegc_wq);
 	destroy_workqueue(mp->m_reclaim_workqueue);
 	destroy_workqueue(mp->m_unwritten_workqueue);
+	destroy_workqueue(mp->m_postread_workqueue);
 	destroy_workqueue(mp->m_buf_workqueue);
 }
 
--
2.38.4
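
A note on the error unwinding in the xfs_init_mount_workqueues() hunk:
as posted, a failed m_postread_workqueue allocation appears to jump to
out_destroy_postread, which would hand the NULL queue to
destroy_workqueue() and skip the unwritten queue. The usual goto ladder
jumps to the label of the previous successful allocation and lists the
labels in reverse allocation order. Below is a userspace sketch of that
pattern; fake_alloc(), fake_destroy(), and struct mount_wqs are
hypothetical stand-ins for alloc_workqueue(), destroy_workqueue(), and
struct xfs_mount.

```c
#include <stddef.h>
#include <stdlib.h>

struct mount_wqs {
	void *buf;
	void *unwritten;
	void *postread;
	void *reclaim;
};

static int live;	/* allocations not yet destroyed */

/* Stand-in allocator: the fail_at-th call returns NULL (0 = never fail). */
static void *fake_alloc(int *calls, int fail_at)
{
	void *p;

	if (++(*calls) == fail_at)
		return NULL;
	p = malloc(1);
	if (p)
		live++;
	return p;
}

static void fake_destroy(void *p)
{
	free(p);
	live--;
}

static int init_wqs(struct mount_wqs *m, int fail_at)
{
	int calls = 0;

	m->buf = fake_alloc(&calls, fail_at);
	if (!m->buf)
		goto out;
	m->unwritten = fake_alloc(&calls, fail_at);
	if (!m->unwritten)
		goto out_destroy_buf;
	m->postread = fake_alloc(&calls, fail_at);
	if (!m->postread)
		goto out_destroy_unwritten;	/* undo the previous step only */
	m->reclaim = fake_alloc(&calls, fail_at);
	if (!m->reclaim)
		goto out_destroy_postread;
	return 0;

	/* Labels in reverse allocation order. */
out_destroy_postread:
	fake_destroy(m->postread);
out_destroy_unwritten:
	fake_destroy(m->unwritten);
out_destroy_buf:
	fake_destroy(m->buf);
out:
	return -1;
}
```

Failing at any step leaves no queue allocated, and no destroy call ever
sees a NULL pointer.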
[Cluster-devel] [PATCH v2 22/23] xfs: add fs-verity ioctls
Add fs-verity ioctls to enable, dump metadata (descriptor and Merkle
tree pages) and obtain file's digest.

Signed-off-by: Andrey Albershteyn
---
 fs/xfs/xfs_ioctl.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 3d6d680b6cf3..ffa04f0aed4a 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -42,6 +42,7 @@
 #include
 #include
 #include
+#include
 
 /*
  * xfs_find_handle maps from userspace xfs_fsop_handlereq structure to
@@ -2154,6 +2155,22 @@ xfs_file_ioctl(
 		return error;
 	}
 
+	case FS_IOC_ENABLE_VERITY:
+		if (!xfs_has_verity(mp))
+			return -EOPNOTSUPP;
+		return fsverity_ioctl_enable(filp, (const void __user *)arg);
+
+	case FS_IOC_MEASURE_VERITY:
+		if (!xfs_has_verity(mp))
+			return -EOPNOTSUPP;
+		return fsverity_ioctl_measure(filp, (void __user *)arg);
+
+	case FS_IOC_READ_VERITY_METADATA:
+		if (!xfs_has_verity(mp))
+			return -EOPNOTSUPP;
+		return fsverity_ioctl_read_metadata(filp,
+				(const void __user *)arg);
+
 	default:
 		return -ENOTTY;
 	}
--
2.38.4
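
From userspace, the measure ioctl wired up above is driven through
struct fsverity_digest from linux/fsverity.h. A minimal sketch follows;
measure_verity() and MAX_DIGEST are hypothetical names, and the 64-byte
bound is an assumption sized for the largest digest fs-verity currently
defines (SHA-512).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fsverity.h>

#define MAX_DIGEST 64	/* assumed upper bound on the digest size */

/*
 * Hypothetical helper: copies the file's fs-verity digest into out
 * (which must hold MAX_DIGEST bytes) and returns its length, or a
 * negative errno. Files without verity typically fail with ENODATA,
 * and filesystems without support with ENOTTY or EOPNOTSUPP.
 */
static int measure_verity(const char *path, unsigned char *out)
{
	struct fsverity_digest *d;
	int fd, ret;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	d = calloc(1, sizeof(*d) + MAX_DIGEST);
	if (!d) {
		close(fd);
		return -ENOMEM;
	}
	d->digest_size = MAX_DIGEST;	/* in: space available, out: bytes used */

	if (ioctl(fd, FS_IOC_MEASURE_VERITY, d) < 0) {
		ret = -errno;
	} else {
		memcpy(out, d->digest, d->digest_size);
		ret = d->digest_size;
	}

	free(d);
	close(fd);
	return ret;
}
```

On a filesystem without the verity feature bit, the xfs_has_verity()
check in the hunk would surface here as -EOPNOTSUPP.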
[Cluster-devel] [PATCH v2 19/23] xfs: disable direct read path for fs-verity sealed files
The direct path is not supported on verity files. Attempts to use direct
I/O path on such files should fall back to buffered I/O path.

Signed-off-by: Andrey Albershteyn
---
 fs/xfs/xfs_file.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 947b5c436172..9e072e82f6c1 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -244,7 +244,8 @@ xfs_file_dax_read(
 	struct kiocb		*iocb,
 	struct iov_iter		*to)
 {
-	struct xfs_inode	*ip = XFS_I(iocb->ki_filp->f_mapping->host);
+	struct inode		*inode = iocb->ki_filp->f_mapping->host;
+	struct xfs_inode	*ip = XFS_I(inode);
 	ssize_t			ret = 0;
 
 	trace_xfs_file_dax_read(iocb, to);
@@ -297,10 +298,17 @@ xfs_file_read_iter(
 
 	if (IS_DAX(inode))
 		ret = xfs_file_dax_read(iocb, to);
-	else if (iocb->ki_flags & IOCB_DIRECT)
+	else if (iocb->ki_flags & IOCB_DIRECT && !fsverity_active(inode))
 		ret = xfs_file_dio_read(iocb, to);
-	else
+	else {
+		/*
+		 * In case fs-verity is enabled, we also fallback to the
+		 * buffered read from the direct read path. Therefore,
+		 * IOCB_DIRECT is set and need to be cleared
+		 */
+		iocb->ki_flags &= ~IOCB_DIRECT;
 		ret = xfs_file_buffered_read(iocb, to);
+	}
 
 	if (ret > 0)
 		XFS_STATS_ADD(mp, xs_read_bytes, ret);
--
2.38.4
[Cluster-devel] [PATCH v2 00/23] fs-verity support for XFS
Hi all,

This is V2 of fs-verity support in XFS. In this series I did
numerous changes from V1 which are described below.

This patchset introduces fs-verity [5] support for XFS. This
implementation utilizes extended attributes to store fs-verity
metadata. The Merkle tree blocks are stored in the remote extended
attributes.

A few key points:
- fs-verity metadata is stored in extended attributes
- Direct path and DAX are disabled for inodes with fs-verity
- Pages are verified in iomap's read IO path (offloaded to
  workqueue)
- New workqueue for verification processing
- New ro-compat flag
- Inodes with fs-verity have new on-disk diflag
- xfs_attr_get() can return buffer with the attribute

The patchset is tested with xfstests -g auto on xfs_1k, xfs_4k,
xfs_1k_quota, and xfs_4k_quota. Haven't found any major failures.

Patches [6/23] and [7/23] touch ext4, f2fs, btrfs, and patch [8/23]
touches erofs, gfs2, and zonefs.

The patchset consists of four parts:
- [1..4]: Patches from Parent Pointer patchset which add binary
  xattr names with a few deps
- [5..7]: Improvements to core fs-verity
- [8..9]: Add read path verification to iomap
- [10..23]: Integration of fs-verity to xfs

Changes from V1:
- Added parent pointer patches for easier testing
- Many issues and refactoring points fixed from the V1 review
- Adjusted for recent changes in fs-verity core (folios, non-4k)
- Dropped disabling of large folios
- Completely new fsverity patches (fix, callout, log_blocksize)
- Change approach to verification in iomap to the same one as in
  write path. Callouts to fs instead of direct fs-verity use.
- New XFS workqueue for post read folio verification
- xfs_attr_get() can return underlying xfs_buf
- xfs_bufs are marked with XBF_VERITY_CHECKED to track verified
  blocks

kernel:
[1]: https://github.com/alberand/linux/tree/xfs-verity-v2

xfsprogs:
[2]: https://github.com/alberand/xfsprogs/tree/fsverity-v2

xfstests:
[3]: https://github.com/alberand/xfstests/tree/fsverity-v2

v1:
[4]: https://lore.kernel.org/linux-xfs/20221213172935.680971-1-aalbe...@redhat.com/

fs-verity:
[5]: https://www.kernel.org/doc/html/latest/filesystems/fsverity.html

Thanks,
Andrey

Allison Henderson (4):
  xfs: Add new name to attri/d
  xfs: add parent pointer support to attribute code
  xfs: define parent pointer xattr format
  xfs: Add xfs_verify_pptr

Andrey Albershteyn (19):
  fsverity: make fsverity_verify_folio() accept folio's offset and size
  fsverity: add drop_page() callout
  fsverity: pass Merkle tree block size to ->read_merkle_tree_page()
  iomap: hoist iomap_readpage_ctx from the iomap_readahead/_folio
  iomap: allow filesystem to implement read path verification
  xfs: add XBF_VERITY_CHECKED xfs_buf flag
  xfs: add XFS_DA_OP_BUFFER to make xfs_attr_get() return buffer
  xfs: introduce workqueue for post read IO work
  xfs: add iomap's readpage operations
  xfs: add attribute type for fs-verity
  xfs: add fs-verity ro-compat flag
  xfs: add inode on-disk VERITY flag
  xfs: initialize fs-verity on file open and cleanup on inode
    destruction
  xfs: don't allow to enable DAX on fs-verity sealed inode
  xfs: disable direct read path for fs-verity sealed files
  xfs: add fs-verity support
  xfs: handle merkle tree block size != fs blocksize != PAGE_SIZE
  xfs: add fs-verity ioctls
  xfs: enable ro-compat fs-verity flag

 fs/btrfs/verity.c               |  15 +-
 fs/erofs/data.c                 |  12 +-
 fs/ext4/verity.c                |   9 +-
 fs/f2fs/verity.c                |   9 +-
 fs/gfs2/aops.c                  |  10 +-
 fs/ioctl.c                      |   4 +
 fs/iomap/buffered-io.c          |  89 ++-
 fs/verity/read_metadata.c       |   7 +-
 fs/verity/verify.c              |   9 +-
 fs/xfs/Makefile                 |   1 +
 fs/xfs/libxfs/xfs_attr.c        |  81 +-
 fs/xfs/libxfs/xfs_attr.h        |   7 +-
 fs/xfs/libxfs/xfs_attr_leaf.c   |   7 +
 fs/xfs/libxfs/xfs_attr_remote.c |  13 +-
 fs/xfs/libxfs/xfs_da_btree.h    |   7 +-
 fs/xfs/libxfs/xfs_da_format.h   |  46 +-
 fs/xfs/libxfs/xfs_format.h      |  14 +-
 fs/xfs/libxfs/xfs_log_format.h  |   8 +-
 fs/xfs/libxfs/xfs_sb.c          |   2 +
 fs/xfs/scrub/attr.c             |   4 +-
 fs/xfs/xfs_aops.c               |  61 +++-
 fs/xfs/xfs_attr_item.c          | 142 +++---
 fs/xfs/xfs_attr_item.h          |   1 +
 fs/xfs/xfs_attr_list.c          |  17 ++-
 fs/xfs/xfs_buf.h                |  17 ++-
 fs/xfs/xfs_file.c               |  22 ++-
 fs/xfs/xfs_inode.c              |   2 +
 fs/xfs/xfs_inode.h              |   3 +-
 fs/xfs/xfs_ioctl.c              |  22 +++
 fs/xfs/xfs_iomap.c              |  14 ++
 fs/xfs/xfs_iops.c               |   4 +
 fs/xfs/xfs_linux.h              |   1 +
 fs/xfs/xfs_mount.h              |   3 +
 fs/xfs/xfs_ondisk.h             |   4 +
 fs/xfs/xfs_super.c              |  19 +++
 fs/xfs/xfs_trace.h              |   1 +
 fs/xfs/xfs_verity.c             | 256
 fs/
[Cluster-devel] [PATCH v2 01/23] xfs: Add new name to attri/d
From: Allison Henderson This patch adds two new fields to the atti/d. They are nname and nnamelen. This will be used for parent pointer updates since a rename operation may cause the parent pointer to update both the name and value. So we need to carry both the new name as well as the target name in the attri/d. Signed-off-by: Allison Henderson Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_attr.c | 12 ++- fs/xfs/libxfs/xfs_attr.h | 4 +- fs/xfs/libxfs/xfs_da_btree.h | 2 + fs/xfs/libxfs/xfs_log_format.h | 6 +- fs/xfs/xfs_attr_item.c | 135 +++-- fs/xfs/xfs_attr_item.h | 1 + 6 files changed, 133 insertions(+), 27 deletions(-) diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c index e28d93d232de..b1dbed7655e8 100644 --- a/fs/xfs/libxfs/xfs_attr.c +++ b/fs/xfs/libxfs/xfs_attr.c @@ -423,6 +423,12 @@ xfs_attr_complete_op( args->op_flags &= ~XFS_DA_OP_REPLACE; if (do_replace) { args->attr_filter &= ~XFS_ATTR_INCOMPLETE; + if (args->new_namelen > 0) { + args->name = args->new_name; + args->namelen = args->new_namelen; + args->hashval = xfs_da_hashname(args->name, + args->namelen); + } return replace_state; } return XFS_DAS_DONE; @@ -922,9 +928,13 @@ xfs_attr_defer_replace( struct xfs_da_args *args) { struct xfs_attr_intent *new; + int op_flag; int error = 0; - error = xfs_attr_intent_init(args, XFS_ATTRI_OP_FLAGS_REPLACE, ); + op_flag = args->new_namelen == 0 ? XFS_ATTRI_OP_FLAGS_REPLACE : + XFS_ATTRI_OP_FLAGS_NVREPLACE; + + error = xfs_attr_intent_init(args, op_flag, ); if (error) return error; diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h index 81be9b3e4004..3e81f3f48560 100644 --- a/fs/xfs/libxfs/xfs_attr.h +++ b/fs/xfs/libxfs/xfs_attr.h @@ -510,8 +510,8 @@ struct xfs_attr_intent { struct xfs_da_args *xattri_da_args; /* -* Shared buffer containing the attr name and value so that the logging -* code can share large memory buffers between log items. 
+* Shared buffer containing the attr name, new name, and value so that +* the logging code can share large memory buffers between log items. */ struct xfs_attri_log_nameval*xattri_nameval; diff --git a/fs/xfs/libxfs/xfs_da_btree.h b/fs/xfs/libxfs/xfs_da_btree.h index ffa3df5b2893..a4b29827603f 100644 --- a/fs/xfs/libxfs/xfs_da_btree.h +++ b/fs/xfs/libxfs/xfs_da_btree.h @@ -55,7 +55,9 @@ enum xfs_dacmp { typedef struct xfs_da_args { struct xfs_da_geometry *geo;/* da block geometry */ const uint8_t *name; /* string (maybe not NULL terminated) */ + const uint8_t *new_name; /* new attr name */ int namelen;/* length of string (maybe no NULL) */ + int new_namelen;/* new attr name len */ uint8_t filetype; /* filetype of inode for directories */ void*value; /* set of bytes (maybe contain NULLs) */ int valuelen; /* length of value */ diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h index f13e0809dc63..ae9c99762a24 100644 --- a/fs/xfs/libxfs/xfs_log_format.h +++ b/fs/xfs/libxfs/xfs_log_format.h @@ -117,7 +117,8 @@ struct xfs_unmount_log_format { #define XLOG_REG_TYPE_ATTRD_FORMAT 28 #define XLOG_REG_TYPE_ATTR_NAME29 #define XLOG_REG_TYPE_ATTR_VALUE 30 -#define XLOG_REG_TYPE_MAX 30 +#define XLOG_REG_TYPE_ATTR_NNAME 31 +#define XLOG_REG_TYPE_MAX 31 /* @@ -957,6 +958,7 @@ struct xfs_icreate_log { #define XFS_ATTRI_OP_FLAGS_SET 1 /* Set the attribute */ #define XFS_ATTRI_OP_FLAGS_REMOVE 2 /* Remove the attribute */ #define XFS_ATTRI_OP_FLAGS_REPLACE 3 /* Replace the attribute */ +#define XFS_ATTRI_OP_FLAGS_NVREPLACE 4 /* Replace attr name and val */ #define XFS_ATTRI_OP_FLAGS_TYPE_MASK 0xFF/* Flags type mask */ /* @@ -974,7 +976,7 @@ struct xfs_icreate_log { struct xfs_attri_log_format { uint16_talfi_type; /* attri log item type */ uint16_talfi_size; /* size of this item */ - uint32_t__pad; /* pad to 64 bit aligned */ + uint32_talfi_nname_len; /* attr new name length */ uint64_talfi_id;/* attri identifier */ uint64_talfi_ino; /* the inode for this 
attr operation */ uint32_talfi_op_flags; /* marks the op as a set or remove */ diff --git a/fs/xfs/xfs_attr_item.c
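The two decisions this patch adds (which replace op flag to log, and when to switch over to the new name) can be illustrated with a small userspace sketch. struct demo_da_args and the demo_* helpers are hypothetical stand-ins for struct xfs_da_args and the kernel functions; only the two op flag values are taken from the hunk above, and the hash recalculation is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the xfs_log_format.h hunk above. */
#define XFS_ATTRI_OP_FLAGS_REPLACE   3 /* Replace the attribute */
#define XFS_ATTRI_OP_FLAGS_NVREPLACE 4 /* Replace attr name and val */

/* Hypothetical cut-down stand-in for struct xfs_da_args. */
struct demo_da_args {
	const unsigned char *name;
	int namelen;
	const unsigned char *new_name;
	int new_namelen;
};

/* Mirrors the selection in xfs_attr_defer_replace(). */
static int demo_replace_op_flag(const struct demo_da_args *args)
{
	return args->new_namelen == 0 ? XFS_ATTRI_OP_FLAGS_REPLACE
				      : XFS_ATTRI_OP_FLAGS_NVREPLACE;
}

/* Mirrors the name switch in xfs_attr_complete_op(), without the hash. */
static void demo_switch_to_new_name(struct demo_da_args *args)
{
	if (args->new_namelen > 0) {
		assert(args->new_name != NULL);
		args->name = args->new_name;
		args->namelen = args->new_namelen;
	}
}
```

A plain replace leaves new_namelen at zero and so keeps logging XFS_ATTRI_OP_FLAGS_REPLACE; only a rename-style update that carries a new name selects the NVREPLACE variant.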
[Cluster-devel] [PATCH v2 15/23] xfs: add fs-verity ro-compat flag
To mark inodes sealed with fs-verity, a new XFS_DIFLAG2_VERITY flag will be added in a later patch. This requires a ro-compat flag to let older kernels know that a filesystem with fs-verity cannot be modified.

Signed-off-by: Andrey Albershteyn
---
 fs/xfs/libxfs/xfs_format.h | 1 +
 fs/xfs/libxfs/xfs_sb.c     | 2 ++
 fs/xfs/xfs_mount.h         | 2 ++
 3 files changed, 5 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index 371dc07233e0..ef617be2839c 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -353,6 +353,7 @@ xfs_sb_has_compat_feature( #define XFS_SB_FEAT_RO_COMPAT_RMAPBT (1 << 1)/* reverse map btree */ #define XFS_SB_FEAT_RO_COMPAT_REFLINK (1 << 2)/* reflinked files */ #define XFS_SB_FEAT_RO_COMPAT_INOBTCNT (1 << 3)/* inobt block counts */ +#define XFS_SB_FEAT_RO_COMPAT_VERITY (1 << 4)/* fs-verity */ #define XFS_SB_FEAT_RO_COMPAT_ALL \ (XFS_SB_FEAT_RO_COMPAT_FINOBT | \ XFS_SB_FEAT_RO_COMPAT_RMAPBT | \
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index 99cc03a298e2..b1f1b21e8953 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -161,6 +161,8 @@ xfs_sb_version_to_features( features |= XFS_FEAT_REFLINK; if (sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_INOBTCNT) features |= XFS_FEAT_INOBTCNT; + if (sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_VERITY) + features |= XFS_FEAT_VERITY; if (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_FTYPE) features |= XFS_FEAT_FTYPE; if (sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_SPINODES)
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index 53a4a9304937..9254c3cd9077 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -279,6 +279,7 @@ typedef struct xfs_mount { #define XFS_FEAT_BIGTIME (1ULL << 24)/* large timestamps */ #define XFS_FEAT_NEEDSREPAIR (1ULL << 25)/* needs xfs_repair */ #define XFS_FEAT_NREXT64 (1ULL << 26)/* large extent counters */ +#define XFS_FEAT_VERITY(1ULL << 27)/* fs-verity */ /* Mount features */ #define
XFS_FEAT_NOATTR2 (1ULL << 48)/* disable attr2 creation */ @@ -342,6 +343,7 @@ __XFS_HAS_FEAT(inobtcounts, INOBTCNT) __XFS_HAS_FEAT(bigtime, BIGTIME) __XFS_HAS_FEAT(needsrepair, NEEDSREPAIR) __XFS_HAS_FEAT(large_extent_counts, NREXT64) +__XFS_HAS_FEAT(verity, VERITY) /* * Mount features -- 2.38.4
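As a rough illustration of why a ro-compat bit protects such a filesystem from older kernels, here is a userspace sketch. The bit values are the ones in the hunks above; the DEMO_* masks and the demo_* helpers are hypothetical and only model the usual rule that a kernel refuses a read-write mount when the superblock carries ro-compat bits it does not know.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as they appear in the hunks above. */
#define XFS_SB_FEAT_RO_COMPAT_REFLINK (1 << 2)
#define XFS_SB_FEAT_RO_COMPAT_VERITY  (1 << 4)
#define XFS_FEAT_VERITY               (1ULL << 27)

/* Hypothetical masks: bits 0..3 known before this patch, bit 4 added by it. */
#define DEMO_OLD_KERNEL_RO_COMPAT_ALL 0x0fu
#define DEMO_NEW_KERNEL_RO_COMPAT_ALL \
	(DEMO_OLD_KERNEL_RO_COMPAT_ALL | XFS_SB_FEAT_RO_COMPAT_VERITY)

/* A read-write mount is only allowed if every ro-compat bit is understood. */
static bool demo_can_mount_rw(uint32_t sb_ro_compat, uint32_t known_mask)
{
	return (sb_ro_compat & ~known_mask) == 0;
}

/* Mirrors the new branch in xfs_sb_version_to_features(). */
static uint64_t demo_sb_to_features(uint32_t sb_ro_compat)
{
	uint64_t features = 0;

	if (sb_ro_compat & XFS_SB_FEAT_RO_COMPAT_VERITY)
		features |= XFS_FEAT_VERITY;
	return features;
}
```

With these masks a kernel that predates the patch can still mount the filesystem read-only, which is the point of using a ro-compat bit rather than an incompat one.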
[Cluster-devel] [PATCH v2 20/23] xfs: add fs-verity support
Add integration with fs-verity. XFS stores fs-verity metadata in extended attributes. The metadata consists of the verity descriptor and the Merkle tree blocks. The descriptor is stored under the "verity_descriptor" extended attribute. The Merkle tree blocks are stored under binary indexes.

When fs-verity is enabled on an inode, the XFS_IVERITY_CONSTRUCTION flag is set, meaning that the Merkle tree is being built. Initialization ends with storing the verity descriptor and setting the on-disk inode flag (XFS_DIFLAG2_VERITY).

Verification on read is done in iomap. Based on the inode verity flag, IOMAP_F_READ_VERITY is set in xfs_read_iomap_begin() to let iomap know that verification is needed.

Signed-off-by: Andrey Albershteyn
---
 fs/xfs/Makefile          | 1 +
 fs/xfs/libxfs/xfs_attr.c | 13 +++
 fs/xfs/xfs_inode.h       | 3 +-
 fs/xfs/xfs_iomap.c       | 3 +
 fs/xfs/xfs_ondisk.h      | 4 +
 fs/xfs/xfs_super.c       | 8 ++
 fs/xfs/xfs_verity.c      | 214 +++
 fs/xfs/xfs_verity.h      | 19
 8 files changed, 264 insertions(+), 1 deletion(-)
 create mode 100644 fs/xfs/xfs_verity.c
 create mode 100644 fs/xfs/xfs_verity.h

diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 92d88dc3c9f7..76174770d91a 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -130,6 +130,7 @@ xfs-$(CONFIG_XFS_POSIX_ACL) += xfs_acl.o xfs-$(CONFIG_SYSCTL) += xfs_sysctl.o xfs-$(CONFIG_COMPAT) += xfs_ioctl32.o xfs-$(CONFIG_EXPORTFS_BLOCK_OPS) += xfs_pnfs.o +xfs-$(CONFIG_FS_VERITY)+= xfs_verity.o # notify failure ifeq ($(CONFIG_MEMORY_FAILURE),y)
diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c index 298b74245267..39d9038fbeee 100644 --- a/fs/xfs/libxfs/xfs_attr.c +++ b/fs/xfs/libxfs/xfs_attr.c @@ -26,6 +26,7 @@ #include "xfs_trace.h" #include "xfs_attr_item.h" #include "xfs_xattr.h" +#include "xfs_verity.h" struct kmem_cache *xfs_attr_intent_cache; @@ -1635,6 +1636,18 @@ xfs_attr_namecheck( return xfs_verify_pptr(mp, (struct xfs_parent_name_rec *)name); } + if (flags & XFS_ATTR_VERITY) { + /* Merkle tree pages are stored under u64
indexes */ + if (length == sizeof(__be64)) + return true; + + /* Verity descriptor blocks are held in a named attribute. */ + if (length == XFS_VERITY_DESCRIPTOR_NAME_LEN) + return true; + + return false; + } + return xfs_str_attr_namecheck(name, length); } diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 69d21e42c10a..a95f28cb049f 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -324,7 +324,8 @@ static inline bool xfs_inode_has_large_extent_counts(struct xfs_inode *ip) * inactivation completes, both flags will be cleared and the inode is a * plain old IRECLAIMABLE inode. */ -#define XFS_INACTIVATING (1 << 13) +#define XFS_INACTIVATING (1 << 13) +#define XFS_IVERITY_CONSTRUCTION (1 << 14) /* merkle tree construction */ /* All inode state flags related to inode reclaim. */ #define XFS_ALL_IRECLAIM_FLAGS (XFS_IRECLAIMABLE | \ diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index e0f3c5d709f6..0adde39f02a5 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -143,6 +143,9 @@ xfs_bmbt_to_iomap( (ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP)) iomap->flags |= IOMAP_F_DIRTY; + if (fsverity_active(VFS_I(ip))) + iomap->flags |= IOMAP_F_READ_VERITY; + iomap->validity_cookie = sequence_cookie; iomap->folio_ops = _iomap_folio_ops; return 0; diff --git a/fs/xfs/xfs_ondisk.h b/fs/xfs/xfs_ondisk.h index 9737b5a9f405..7fe88ccda519 100644 --- a/fs/xfs/xfs_ondisk.h +++ b/fs/xfs/xfs_ondisk.h @@ -189,6 +189,10 @@ xfs_check_ondisk_structs(void) XFS_CHECK_VALUE(XFS_DQ_BIGTIME_EXPIRY_MIN << XFS_DQ_BIGTIME_SHIFT, 4); XFS_CHECK_VALUE(XFS_DQ_BIGTIME_EXPIRY_MAX << XFS_DQ_BIGTIME_SHIFT, 16299260424LL); + + /* fs-verity descriptor xattr name */ + XFS_CHECK_VALUE(strlen(XFS_VERITY_DESCRIPTOR_NAME), + XFS_VERITY_DESCRIPTOR_NAME_LEN); } #endif /* __XFS_ONDISK_H */ diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index d40de32362b1..b6e99ed3b187 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -30,6 +30,7 @@ #include "xfs_filestream.h" 
#include "xfs_quota.h" #include "xfs_sysfs.h" +#include "xfs_verity.h" #include "xfs_ondisk.h" #include "xfs_rmap_item.h" #include "xfs_refcount_item.h" @@ -1489,6 +1490,9 @@ xfs_fs_fill_super( sb->s_quota_types = QTYPE_MASK_USR | QTYPE_MASK_GRP | QTYPE_MASK_PRJ; #endif
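The name check for verity attributes in the xfs_attr_namecheck() hunk above accepts exactly two name shapes. A minimal userspace sketch of that rule follows, assuming the descriptor name is the "verity_descriptor" string from the commit message; the DEMO_* macros and demo_* functions are hypothetical and not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: descriptor xattr name as given in the commit message. */
#define DEMO_VERITY_DESCRIPTOR_NAME     "verity_descriptor"
#define DEMO_VERITY_DESCRIPTOR_NAME_LEN (sizeof(DEMO_VERITY_DESCRIPTOR_NAME) - 1)

/* Encode a Merkle tree byte offset as an 8-byte big-endian xattr name. */
static void demo_merkle_name(uint64_t offset, uint8_t name[8])
{
	for (int i = 7; i >= 0; i--) {
		name[i] = (uint8_t)(offset & 0xff);
		offset >>= 8;
	}
}

/* Length-only check, mirroring the XFS_ATTR_VERITY branch in the hunk. */
static bool demo_verity_namecheck(size_t length)
{
	/* Merkle tree blocks are stored under u64 indexes. */
	if (length == sizeof(uint64_t))
		return true;

	/* The verity descriptor is held in a named attribute. */
	if (length == DEMO_VERITY_DESCRIPTOR_NAME_LEN)
		return true;

	return false;
}
```

Note that, as in the hunk, only the length is inspected, so any 8-byte name passes and the descriptor name itself is not compared.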
[Cluster-devel] [PATCH v2 09/23] iomap: allow filesystem to implement read path verification
Add IOMAP_F_READ_VERITY, which indicates that iomap needs to verify the BIO (e.g. with fs-verity) after I/O is completed.

Add iomap_readpage_ops with a single optional ->prepare_ioend() to let the filesystem add a callout for configuring the read path ioend, mainly for setting the ->bi_end_io() callout. Add iomap_folio_ops->verify_folio() for direct folio verification.

The verification itself is supposed to happen on the filesystem side. It is done when the BIO is processed, by calling out to ->bi_end_io().

Export iomap_read_end_io() so that it can be called back from the filesystem callout after verification is done. The read path ioends are stored side by side with the BIOs allocated from iomap_read_ioend_bioset.

Signed-off-by: Andrey Albershteyn
---
 fs/iomap/buffered-io.c | 32 +---
 include/linux/iomap.h  | 26 ++
 2 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index d39be64b1da9..7e59c299c496 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -42,6 +42,7 @@ static inline struct iomap_page *to_iomap_page(struct folio *folio) } static struct bio_set iomap_ioend_bioset; +static struct bio_set iomap_read_ioend_bioset; static struct iomap_page * iomap_page_create(struct inode *inode, struct folio *folio, unsigned int flags) @@ -184,7 +185,7 @@ static void iomap_finish_folio_read(struct folio *folio, size_t offset, folio_unlock(folio); } -static void iomap_read_end_io(struct bio *bio) +void iomap_read_end_io(struct bio *bio) { int error = blk_status_to_errno(bio->bi_status); struct folio_iter fi; @@ -193,6 +194,7 @@ static void iomap_read_end_io(struct bio *bio) iomap_finish_folio_read(fi.folio, fi.offset, fi.length, error); bio_put(bio); } +EXPORT_SYMBOL_GPL(iomap_read_end_io); /** * iomap_read_inline_data - copy inline data into the page cache @@ -257,6 +259,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter, loff_t orig_pos = pos; size_t poff, plen; sector_t sector; + struct
iomap_read_ioend *ioend; if (iomap->type == IOMAP_INLINE) return iomap_read_inline_data(iter, folio); @@ -269,6 +272,13 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter, if (iomap_block_needs_zeroing(iter, pos)) { folio_zero_range(folio, poff, plen); + if (iomap->flags & IOMAP_F_READ_VERITY) { + if (!iomap->folio_ops->verify_folio(folio, poff, plen)) { + folio_set_error(folio); + goto done; + } + } + iomap_set_range_uptodate(folio, iop, poff, plen); goto done; } @@ -290,8 +300,8 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter, if (ctx->rac) /* same as readahead_gfp_mask */ gfp |= __GFP_NORETRY | __GFP_NOWARN; - ctx->bio = bio_alloc(iomap->bdev, bio_max_segs(nr_vecs), -REQ_OP_READ, gfp); + ctx->bio = bio_alloc_bioset(iomap->bdev, bio_max_segs(nr_vecs), + REQ_OP_READ, GFP_NOFS, _read_ioend_bioset); /* * If the bio_alloc fails, try it again for a single page to * avoid having to deal with partial page reads. This emulates @@ -305,6 +315,13 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter, ctx->bio->bi_opf |= REQ_RAHEAD; ctx->bio->bi_iter.bi_sector = sector; ctx->bio->bi_end_io = iomap_read_end_io; + + ioend = container_of(ctx->bio, struct iomap_read_ioend, + read_inline_bio); + ioend->io_inode = iter->inode; + if (ctx->ops && ctx->ops->prepare_ioend) + ctx->ops->prepare_ioend(ioend); + bio_add_folio(ctx->bio, folio, plen, poff); } @@ -1813,6 +1830,15 @@ EXPORT_SYMBOL_GPL(iomap_writepages); static int __init iomap_init(void) { + int error = 0; + + error = bioset_init(_read_ioend_bioset, + 4 * (PAGE_SIZE / SECTOR_SIZE), + offsetof(struct iomap_read_ioend, read_inline_bio), + BIOSET_NEED_BVECS); + if (error) + return error; + return bioset_init(_ioend_bioset, 4 * (PAGE_SIZE / SECTOR_SIZE), offsetof(struct iomap_ioend, io_inline_bio), BIOSET_NEED_BVECS); diff --git a/include/linux/iomap.h b/include/linux/iomap.h index 0fbce375265d..9a17b53309c9 100644 --- a/include/linux/iomap.h +++ b/incl
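The "side by side" storage works because bioset_init() is given offsetof(struct iomap_read_ioend, read_inline_bio) as its front pad, so every bio from that set is preceded by room for the rest of the structure and container_of() can recover it. The layout trick can be sketched in userspace as below; struct demo_bio, struct demo_read_ioend and the demo_* helpers are hypothetical stand-ins, and plain calloc() takes the place of the bioset.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace equivalent of the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct bio. */
struct demo_bio {
	int status;
};

/* Stand-in for struct iomap_read_ioend; the embedded bio must come last. */
struct demo_read_ioend {
	void		*io_inode;
	struct demo_bio	read_inline_bio;
};

/*
 * Allocate front pad plus bio in one block and hand back only the bio,
 * which is what a bioset with a front pad does for its callers.
 */
static struct demo_bio *demo_bio_alloc(void)
{
	char *mem = calloc(1, sizeof(struct demo_read_ioend));

	if (!mem)
		return NULL;
	return (struct demo_bio *)
		(mem + offsetof(struct demo_read_ioend, read_inline_bio));
}

/* Free through the enclosing structure, not through the bio pointer. */
static void demo_bio_free(struct demo_bio *bio)
{
	free(demo_container_of(bio, struct demo_read_ioend, read_inline_bio));
}
```

The caller only ever sees a bio pointer, yet the completion path can reach the per-read state without a separate allocation or lookup.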
[Cluster-devel] [PATCH v2 17/23] xfs: initialize fs-verity on file open and cleanup on inode destruction
fs-verity will read and attach metadata (not the tree itself) from a disk for those inodes which already have fs-verity enabled. Signed-off-by: Andrey Albershteyn --- fs/xfs/xfs_file.c | 8 fs/xfs/xfs_super.c | 2 ++ 2 files changed, 10 insertions(+) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 705250f9f90a..947b5c436172 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -31,6 +31,7 @@ #include #include #include +#include static const struct vm_operations_struct xfs_file_vm_ops; @@ -1169,9 +1170,16 @@ xfs_file_open( struct inode*inode, struct file *file) { + int error = 0; + if (xfs_is_shutdown(XFS_M(inode->i_sb))) return -EIO; file->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC | FMODE_BUF_WASYNC; + + error = fsverity_file_open(inode, file); + if (error) + return error; + return generic_file_open(inode, file); } diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index d6f22cb94ee2..d40de32362b1 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -46,6 +46,7 @@ #include #include #include +#include static const struct super_operations xfs_super_operations; @@ -667,6 +668,7 @@ xfs_fs_destroy_inode( ASSERT(!rwsem_is_locked(>i_rwsem)); XFS_STATS_INC(ip->i_mount, vn_rele); XFS_STATS_INC(ip->i_mount, vn_remove); + fsverity_cleanup_inode(inode); xfs_inode_mark_reclaimable(ip); } -- 2.38.4
[Cluster-devel] [PATCH v2 02/23] xfs: add parent pointer support to attribute code
From: Allison Henderson Add the new parent attribute type. XFS_ATTR_PARENT is used only for parent pointer entries; it uses reserved blocks like XFS_ATTR_ROOT. Signed-off-by: Mark Tinguely Signed-off-by: Dave Chinner Signed-off-by: Allison Henderson Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_attr.c | 4 +++- fs/xfs/libxfs/xfs_da_format.h | 5 - fs/xfs/libxfs/xfs_log_format.h | 1 + fs/xfs/scrub/attr.c| 2 +- 4 files changed, 9 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c index b1dbed7655e8..101823772bf9 100644 --- a/fs/xfs/libxfs/xfs_attr.c +++ b/fs/xfs/libxfs/xfs_attr.c @@ -976,11 +976,13 @@ xfs_attr_set( struct xfs_inode*dp = args->dp; struct xfs_mount*mp = dp->i_mount; struct xfs_trans_restres; - boolrsvd = (args->attr_filter & XFS_ATTR_ROOT); + boolrsvd; int error, local; int rmt_blks = 0; unsigned inttotal; + rsvd = (args->attr_filter & (XFS_ATTR_ROOT | XFS_ATTR_PARENT)) != 0; + if (xfs_is_shutdown(dp->i_mount)) return -EIO; diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h index 25e2841084e1..3dc03968bba6 100644 --- a/fs/xfs/libxfs/xfs_da_format.h +++ b/fs/xfs/libxfs/xfs_da_format.h @@ -688,12 +688,15 @@ struct xfs_attr3_leafblock { #defineXFS_ATTR_LOCAL_BIT 0 /* attr is stored locally */ #defineXFS_ATTR_ROOT_BIT 1 /* limit access to trusted attrs */ #defineXFS_ATTR_SECURE_BIT 2 /* limit access to secure attrs */ +#defineXFS_ATTR_PARENT_BIT 3 /* parent pointer attrs */ #defineXFS_ATTR_INCOMPLETE_BIT 7 /* attr in middle of create/delete */ #define XFS_ATTR_LOCAL (1u << XFS_ATTR_LOCAL_BIT) #define XFS_ATTR_ROOT (1u << XFS_ATTR_ROOT_BIT) #define XFS_ATTR_SECURE(1u << XFS_ATTR_SECURE_BIT) +#define XFS_ATTR_PARENT(1u << XFS_ATTR_PARENT_BIT) #define XFS_ATTR_INCOMPLETE(1u << XFS_ATTR_INCOMPLETE_BIT) -#define XFS_ATTR_NSP_ONDISK_MASK (XFS_ATTR_ROOT | XFS_ATTR_SECURE) +#define XFS_ATTR_NSP_ONDISK_MASK \ + (XFS_ATTR_ROOT | XFS_ATTR_SECURE | XFS_ATTR_PARENT) /* * Alignment for namelist and 
valuelist entries (since they are mixed diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h index ae9c99762a24..727b5a858028 100644 --- a/fs/xfs/libxfs/xfs_log_format.h +++ b/fs/xfs/libxfs/xfs_log_format.h @@ -967,6 +967,7 @@ struct xfs_icreate_log { */ #define XFS_ATTRI_FILTER_MASK (XFS_ATTR_ROOT | \ XFS_ATTR_SECURE | \ +XFS_ATTR_PARENT | \ XFS_ATTR_INCOMPLETE) /* diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c index 31529b9bf389..9d2e33743ecd 100644 --- a/fs/xfs/scrub/attr.c +++ b/fs/xfs/scrub/attr.c @@ -441,7 +441,7 @@ xchk_xattr_rec( /* Retrieve the entry and check it. */ hash = be32_to_cpu(ent->hashval); badflags = ~(XFS_ATTR_LOCAL | XFS_ATTR_ROOT | XFS_ATTR_SECURE | - XFS_ATTR_INCOMPLETE); + XFS_ATTR_INCOMPLETE | XFS_ATTR_PARENT); if ((ent->flags & badflags) != 0) xchk_da_set_corrupt(ds, level); if (ent->flags & XFS_ATTR_LOCAL) { -- 2.38.4
[Cluster-devel] [PATCH v2 14/23] xfs: add attribute type for fs-verity
The Merkle tree blocks and the descriptor are stored in the extended attributes of the inode. Add a new attribute type for fs-verity metadata.

Add XFS_ATTR_INTERNAL_MASK to skip parent pointer and fs-verity attributes, as those are only for internal use. While we're at it, add a few comments in relevant places noting that internally visible attributes are not supposed to be handled via the interface defined in xfs_xattr.c.

Signed-off-by: Andrey Albershteyn
---
 fs/xfs/libxfs/xfs_da_format.h  | 10 +-
 fs/xfs/libxfs/xfs_log_format.h | 1 +
 fs/xfs/xfs_ioctl.c             | 5 +
 fs/xfs/xfs_trace.h             | 1 +
 fs/xfs/xfs_xattr.c             | 9 +
 5 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h index 75b13807145d..2b5967befc2e 100644 --- a/fs/xfs/libxfs/xfs_da_format.h +++ b/fs/xfs/libxfs/xfs_da_format.h @@ -689,14 +689,22 @@ struct xfs_attr3_leafblock { #defineXFS_ATTR_ROOT_BIT 1 /* limit access to trusted attrs */ #defineXFS_ATTR_SECURE_BIT 2 /* limit access to secure attrs */ #defineXFS_ATTR_PARENT_BIT 3 /* parent pointer attrs */ +#defineXFS_ATTR_VERITY_BIT 4 /* verity merkle tree and descriptor */ #defineXFS_ATTR_INCOMPLETE_BIT 7 /* attr in middle of create/delete */ #define XFS_ATTR_LOCAL (1u << XFS_ATTR_LOCAL_BIT) #define XFS_ATTR_ROOT (1u << XFS_ATTR_ROOT_BIT) #define XFS_ATTR_SECURE(1u << XFS_ATTR_SECURE_BIT) #define XFS_ATTR_PARENT(1u << XFS_ATTR_PARENT_BIT) +#define XFS_ATTR_VERITY(1u << XFS_ATTR_VERITY_BIT) #define XFS_ATTR_INCOMPLETE(1u << XFS_ATTR_INCOMPLETE_BIT) #define XFS_ATTR_NSP_ONDISK_MASK \ - (XFS_ATTR_ROOT | XFS_ATTR_SECURE | XFS_ATTR_PARENT) + (XFS_ATTR_ROOT | XFS_ATTR_SECURE | XFS_ATTR_PARENT | \ +XFS_ATTR_VERITY) + +/* + * Internal attributes not exposed to the user + */ +#define XFS_ATTR_INTERNAL_MASK (XFS_ATTR_PARENT | XFS_ATTR_VERITY) /* * Alignment for namelist and valuelist entries (since they are mixed
diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h index 727b5a858028..678eacb7925c 100644
--- a/fs/xfs/libxfs/xfs_log_format.h +++ b/fs/xfs/libxfs/xfs_log_format.h @@ -968,6 +968,7 @@ struct xfs_icreate_log { #define XFS_ATTRI_FILTER_MASK (XFS_ATTR_ROOT | \ XFS_ATTR_SECURE | \ XFS_ATTR_PARENT | \ +XFS_ATTR_VERITY | \ XFS_ATTR_INCOMPLETE) /* diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 55bb01173cde..3d6d680b6cf3 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -351,6 +351,11 @@ static unsigned int xfs_attr_filter( u32 ioc_flags) { + /* +* Only externally visible attributes should be specified here. +* Internally used attributes (such as parent pointers or fs-verity) +* should not be exposed to userspace. +*/ if (ioc_flags & XFS_IOC_ATTR_ROOT) return XFS_ATTR_ROOT; if (ioc_flags & XFS_IOC_ATTR_SECURE) diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 9c0006c55fec..e842b9d145cb 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -79,6 +79,7 @@ struct xfs_perag; #define XFS_ATTR_FILTER_FLAGS \ { XFS_ATTR_ROOT,"ROOT" }, \ { XFS_ATTR_SECURE, "SECURE" }, \ + { XFS_ATTR_VERITY, "VERITY" }, \ { XFS_ATTR_INCOMPLETE, "INCOMPLETE" } DECLARE_EVENT_CLASS(xfs_attr_list_class, diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c index 7b9a0ed1b11f..5a71797fbd44 100644 --- a/fs/xfs/xfs_xattr.c +++ b/fs/xfs/xfs_xattr.c @@ -20,6 +20,12 @@ #include +/* + * This file defines interface to work with externally visible extended + * attributes, such as those in system or security namespaces. This interface + * should not be used for internally used attributes (consider xfs_attr.c). + */ + /* * Get permission to use log-assisted atomic exchange of file extents. * @@ -234,6 +240,9 @@ xfs_xattr_put_listent( ASSERT(context->count >= 0); + if (flags & XFS_ATTR_INTERNAL_MASK) + return; + if (flags & XFS_ATTR_ROOT) { #ifdef CONFIG_XFS_POSIX_ACL if (namelen == SGI_ACL_FILE_SIZE && -- 2.38.4
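The effect of XFS_ATTR_INTERNAL_MASK on attribute listing can be shown with a short userspace sketch. The flag values match the xfs_da_format.h hunk; struct demo_attr and demo_list_visible() are hypothetical and only model the early return added to xfs_xattr_put_listent().

```c
#include <stddef.h>

/* Flag values from the xfs_da_format.h hunk above. */
#define XFS_ATTR_ROOT   (1u << 1)
#define XFS_ATTR_SECURE (1u << 2)
#define XFS_ATTR_PARENT (1u << 3)
#define XFS_ATTR_VERITY (1u << 4)

/* Internal attributes not exposed to the user. */
#define XFS_ATTR_INTERNAL_MASK (XFS_ATTR_PARENT | XFS_ATTR_VERITY)

/* Hypothetical in-memory attribute entry. */
struct demo_attr {
	const char	*name;
	unsigned int	flags;
};

/*
 * Copy the names of user-visible attributes into out[] and return how
 * many were copied; out[] must have room for count entries.
 */
static size_t demo_list_visible(const struct demo_attr *attrs, size_t count,
				const char **out)
{
	size_t n = 0;

	for (size_t i = 0; i < count; i++) {
		if (attrs[i].flags & XFS_ATTR_INTERNAL_MASK)
			continue;	/* parent pointer or fs-verity */
		out[n++] = attrs[i].name;
	}
	return n;
}
```

Filtering at the listing callback keeps internal entries out of listxattr-style output while they remain reachable through the internal xfs_attr.c paths.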
[Cluster-devel] [PATCH v2 04/23] xfs: Add xfs_verify_pptr
From: Allison Henderson Attribute names of parent pointers are not strings. So we need to modify attr_namecheck to verify parent pointer records when the XFS_ATTR_PARENT flag is set. Signed-off-by: Allison Henderson Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_attr.c | 47 --- fs/xfs/libxfs/xfs_attr.h | 3 ++- fs/xfs/libxfs/xfs_da_format.h | 8 ++ fs/xfs/scrub/attr.c | 2 +- fs/xfs/xfs_attr_item.c| 11 +--- fs/xfs/xfs_attr_list.c| 17 + 6 files changed, 74 insertions(+), 14 deletions(-) diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c index 101823772bf9..711022742e34 100644 --- a/fs/xfs/libxfs/xfs_attr.c +++ b/fs/xfs/libxfs/xfs_attr.c @@ -1577,9 +1577,33 @@ xfs_attr_node_get( return error; } -/* Returns true if the attribute entry name is valid. */ -bool -xfs_attr_namecheck( +/* + * Verify parent pointer attribute is valid. + * Return true on success or false on failure + */ +STATIC bool +xfs_verify_pptr( + struct xfs_mount*mp, + const struct xfs_parent_name_rec*rec) +{ + xfs_ino_t p_ino; + xfs_dir2_dataptr_t p_diroffset; + + p_ino = be64_to_cpu(rec->p_ino); + p_diroffset = be32_to_cpu(rec->p_diroffset); + + if (!xfs_verify_ino(mp, p_ino)) + return false; + + if (p_diroffset > XFS_DIR2_MAX_DATAPTR) + return false; + + return true; +} + +/* Returns true if the string attribute entry name is valid. */ +static bool +xfs_str_attr_namecheck( const void *name, size_t length) { @@ -1594,6 +1618,23 @@ xfs_attr_namecheck( return !memchr(name, 0, length); } +/* Returns true if the attribute entry name is valid. 
*/ +bool +xfs_attr_namecheck( + struct xfs_mount*mp, + const void *name, + size_t length, + int flags) +{ + if (flags & XFS_ATTR_PARENT) { + if (length != sizeof(struct xfs_parent_name_rec)) + return false; + return xfs_verify_pptr(mp, (struct xfs_parent_name_rec *)name); + } + + return xfs_str_attr_namecheck(name, length); +} + int __init xfs_attr_intent_init_cache(void) { diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h index 3e81f3f48560..b79dae788cfb 100644 --- a/fs/xfs/libxfs/xfs_attr.h +++ b/fs/xfs/libxfs/xfs_attr.h @@ -547,7 +547,8 @@ int xfs_attr_get(struct xfs_da_args *args); int xfs_attr_set(struct xfs_da_args *args); int xfs_attr_set_iter(struct xfs_attr_intent *attr); int xfs_attr_remove_iter(struct xfs_attr_intent *attr); -bool xfs_attr_namecheck(const void *name, size_t length); +bool xfs_attr_namecheck(struct xfs_mount *mp, const void *name, size_t length, + int flags); int xfs_attr_calc_size(struct xfs_da_args *args, int *local); void xfs_init_attr_trans(struct xfs_da_args *args, struct xfs_trans_res *tres, unsigned int *total); diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h index b02b67f1999e..75b13807145d 100644 --- a/fs/xfs/libxfs/xfs_da_format.h +++ b/fs/xfs/libxfs/xfs_da_format.h @@ -731,6 +731,14 @@ xfs_attr3_leaf_name(xfs_attr_leafblock_t *leafp, int idx) return &((char *)leafp)[be16_to_cpu(entries[idx].nameidx)]; } +static inline int +xfs_attr3_leaf_flags(xfs_attr_leafblock_t *leafp, int idx) +{ + struct xfs_attr_leaf_entry *entries = xfs_attr3_leaf_entryp(leafp); + + return entries[idx].flags; +} + static inline xfs_attr_leaf_name_remote_t * xfs_attr3_leaf_name_remote(xfs_attr_leafblock_t *leafp, int idx) { diff --git a/fs/xfs/scrub/attr.c b/fs/xfs/scrub/attr.c index 9d2e33743ecd..2a79a13cb600 100644 --- a/fs/xfs/scrub/attr.c +++ b/fs/xfs/scrub/attr.c @@ -129,7 +129,7 @@ xchk_xattr_listent( } /* Does this name make sense? 
*/ - if (!xfs_attr_namecheck(name, namelen)) { + if (!xfs_attr_namecheck(sx->sc->mp, name, namelen, flags)) { xchk_fblock_set_corrupt(sx->sc, XFS_ATTR_FORK, args.blkno); return; } diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c index 95e9ecbb4a67..da807f286a09 100644 --- a/fs/xfs/xfs_attr_item.c +++ b/fs/xfs/xfs_attr_item.c @@ -593,7 +593,8 @@ xfs_attri_item_recover( */ attrp = >attri_format; if (!xfs_attri_validate(mp, attrp) || - !xfs_attr_namecheck(nv->name.i_addr, nv->name.i_len)) + !xfs_attr_namecheck(mp, nv->name.i_addr, nv->name.i_len, + attrp->alfi_attr_filter)) return -EFSCORRUPTED; error = xlog_recover_iget(mp, attrp->alfi_ino, ); @@ -804,7 +805,8 @@ xlog_recover_attri_commit_pass2( } attr_name = item->ri_buf[i].i_addr; - if
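The dispatch between binary parent pointer names and ordinary string names can be sketched in userspace as follows. The record layout, the DEMO_* limits and demo_verify_ino() are hypothetical simplifications (the real struct xfs_parent_name_rec and xfs_verify_ino() are not reproduced here), and the maximum name length check for string names is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk record: big-endian fields kept as raw bytes. */
struct demo_parent_name_rec {
	uint8_t p_ino[8];
	uint8_t p_diroffset[4];
};

/* Hypothetical limits, standing in for the real geometry checks. */
#define DEMO_MAX_INO     ((uint64_t)1 << 32)
#define DEMO_MAX_DATAPTR 0x7fffffffu

static uint64_t demo_get_be64(const uint8_t *b)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | b[i];
	return v;
}

static uint32_t demo_get_be32(const uint8_t *b)
{
	uint32_t v = 0;

	for (int i = 0; i < 4; i++)
		v = (v << 8) | b[i];
	return v;
}

/* Stand-in for xfs_verify_ino(): nonzero and below a made-up limit. */
static bool demo_verify_ino(uint64_t ino)
{
	return ino != 0 && ino < DEMO_MAX_INO;
}

/* Same shape as xfs_verify_pptr(): decode, then range-check each field. */
static bool demo_verify_pptr(const struct demo_parent_name_rec *rec)
{
	if (!demo_verify_ino(demo_get_be64(rec->p_ino)))
		return false;
	if (demo_get_be32(rec->p_diroffset) > DEMO_MAX_DATAPTR)
		return false;
	return true;
}

/* Binary names are validated as records, string names must have no NUL. */
static bool demo_attr_namecheck(const void *name, size_t length,
				bool is_parent)
{
	if (is_parent) {
		if (length != sizeof(struct demo_parent_name_rec))
			return false;
		return demo_verify_pptr(name);
	}
	return !memchr(name, 0, length);
}
```

The length test comes first so that the record decode never reads past a short name.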
[Cluster-devel] [PATCH v2 21/23] xfs: handle merkle tree block size != fs blocksize != PAGE_SIZE
When the Merkle tree block size differs from the page size, fs-verity expects ->read_merkle_tree_page() to return a page filled with Merkle tree blocks. XFS stores each Merkle tree block under its own extended attribute. Those attributes are addressed by the block's offset into the Merkle tree.

This patch makes ->read_merkle_tree_page() fetch multiple Merkle tree blocks based on the size ratio. Also, a reference to each xfs_buf is passed to ->drop_page() via page->private.

Signed-off-by: Andrey Albershteyn
---
 fs/xfs/xfs_verity.c | 74 +++--
 fs/xfs/xfs_verity.h | 8 +
 2 files changed, 66 insertions(+), 16 deletions(-)

diff --git a/fs/xfs/xfs_verity.c b/fs/xfs/xfs_verity.c index a9874ff4efcd..ef0aff216f06 100644 --- a/fs/xfs/xfs_verity.c +++ b/fs/xfs/xfs_verity.c @@ -134,6 +134,10 @@ xfs_read_merkle_tree_page( struct page *page = NULL; __be64 name = cpu_to_be64(index << PAGE_SHIFT); uint32_tbs = 1 << log_blocksize; + int blocks_per_page = + (1 << (PAGE_SHIFT - log_blocksize)); + int n = 0; + int offset = 0; struct xfs_da_args args = { .dp = ip, .attr_filter= XFS_ATTR_VERITY, @@ -143,26 +147,59 @@ xfs_read_merkle_tree_page( .valuelen = bs, }; int error = 0; + boolis_checked = true; + struct xfs_verity_buf_list *buf_list; page = alloc_page(GFP_KERNEL); if (!page) return ERR_PTR(-ENOMEM); - error = xfs_attr_get(); - if (error) { - kmem_free(args.value); - xfs_buf_rele(args.bp); + buf_list = kzalloc(sizeof(struct xfs_verity_buf_list), GFP_KERNEL); + if (!buf_list) { put_page(page); - return ERR_PTR(-EFAULT); + return ERR_PTR(-ENOMEM); } - if (args.bp->b_flags & XBF_VERITY_CHECKED) + /* +* Fill the page with Merkle tree blocks.
The blcoks_per_page is higher +* than 1 when fs block size != PAGE_SIZE or Merkle tree block size != +* PAGE SIZE +*/ + for (n = 0; n < blocks_per_page; n++) { + offset = bs * n; + name = cpu_to_be64(((index << PAGE_SHIFT) + offset)); + args.name = (const uint8_t *) + + error = xfs_attr_get(); + if (error) { + kmem_free(args.value); + /* +* No more Merkle tree blocks (e.g. this was the last +* block of the tree) +*/ + if (error == -ENOATTR) + break; + xfs_buf_rele(args.bp); + put_page(page); + kmem_free(buf_list); + return ERR_PTR(-EFAULT); + } + + buf_list->bufs[buf_list->buf_count++] = args.bp; + + /* One of the buffers was dropped */ + if (!(args.bp->b_flags & XBF_VERITY_CHECKED)) + is_checked = false; + + memcpy(page_address(page) + offset, args.value, args.valuelen); + kmem_free(args.value); + args.value = NULL; + } + + if (is_checked) SetPageChecked(page); + page->private = (unsigned long)buf_list; - page->private = (unsigned long)args.bp; - memcpy(page_address(page), args.value, args.valuelen); - - kmem_free(args.value); return page; } @@ -191,16 +228,21 @@ xfs_write_merkle_tree_block( static void xfs_drop_page( - struct page *page) + struct page *page) { - struct xfs_buf *buf = (struct xfs_buf *)page->private; + int i = 0; + struct xfs_verity_buf_list *buf_list = + (struct xfs_verity_buf_list *)page->private; - ASSERT(buf != NULL); + ASSERT(buf_list != NULL); - if (PageChecked(page)) - buf->b_flags |= XBF_VERITY_CHECKED; + for (i = 0; i < buf_list->buf_count; i++) { + if (PageChecked(page)) + buf_list->bufs[i]->b_flags |= XBF_VERITY_CHECKED; + xfs_buf_rele(buf_list->bufs[i]); + } - xfs_buf_rele(buf); + kmem_free(buf_list); put_page(page); } diff --git a/fs/xfs/xfs_verity.h b/fs/xfs/xfs_verity.h index ae5d87ca32a8..433b2f4ae3bc 100644 --- a/fs/xfs/xfs_verity.h +++ b/fs/xfs/xfs_verity.h @@ -16,4 +16,12 @@ extern const struct fsverity_operations xfs_verity_ops; #define xfs_verity_ops NULL #endif /* CONFIG_FS_VERITY */ +/* Minimal Mer
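The arithmetic behind the fill loop can be checked with a small userspace sketch. It assumes 4 KiB pages (DEMO_PAGE_SHIFT is hypothetical), and demo_fill_page() replaces the xattr lookup with a simple "is this offset inside the tree" test that stands in for the -ENOATTR case.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Number of Merkle tree blocks that fit into one page. */
static unsigned int demo_blocks_per_page(unsigned int log_blocksize)
{
	assert(log_blocksize <= DEMO_PAGE_SHIFT);
	return 1u << (DEMO_PAGE_SHIFT - log_blocksize);
}

/* Byte offset into the tree that names block n of page index. */
static uint64_t demo_block_key(uint64_t index, unsigned int log_blocksize,
			       unsigned int n)
{
	uint64_t bs = (uint64_t)1 << log_blocksize;

	return (index << DEMO_PAGE_SHIFT) + bs * n;
}

/*
 * Collect the keys needed to fill one page, stopping at the first block
 * past the end of the tree. Returns the number of keys stored in keys[],
 * which must have room for demo_blocks_per_page() entries.
 */
static unsigned int demo_fill_page(uint64_t index, unsigned int log_blocksize,
				   uint64_t tree_size, uint64_t *keys)
{
	unsigned int filled = 0;

	for (unsigned int n = 0; n < demo_blocks_per_page(log_blocksize); n++) {
		uint64_t key = demo_block_key(index, log_blocksize, n);

		if (key >= tree_size)
			break;	/* no more Merkle tree blocks */
		keys[filled++] = key;
	}
	return filled;
}
```

With 1 KiB blocks a page holds four blocks, so page index 2 covers the keys 8192, 9216, 10240 and 11264; a tree that ends at 10240 bytes fills only the first two slots of that page.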
[Cluster-devel] [PATCH v2 08/23] iomap: hoist iomap_readpage_ctx from the iomap_readahead/_folio
Make filesystems create readpage context, similar as iomap_writepage_ctx in write path. This will allow filesystem to pass _ops to iomap for ioend configuration (->prepare_ioend) which in turn would be used to set BIO end callout (bio->bi_end_io). This will be utilized in further patches by fs-verity to verify pages on BIO completion by XFS. Signed-off-by: Andrey Albershteyn --- fs/erofs/data.c| 12 +++-- fs/gfs2/aops.c | 10 ++-- fs/iomap/buffered-io.c | 57 -- fs/xfs/xfs_aops.c | 16 +--- fs/zonefs/file.c | 12 +++-- include/linux/iomap.h | 13 -- 6 files changed, 73 insertions(+), 47 deletions(-) diff --git a/fs/erofs/data.c b/fs/erofs/data.c index e16545849ea7..189591249f61 100644 --- a/fs/erofs/data.c +++ b/fs/erofs/data.c @@ -344,12 +344,20 @@ int erofs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, */ static int erofs_read_folio(struct file *file, struct folio *folio) { - return iomap_read_folio(folio, _iomap_ops); + struct iomap_readpage_ctx ctx = { + .cur_folio = folio, + }; + + return iomap_read_folio(, _iomap_ops); } static void erofs_readahead(struct readahead_control *rac) { - return iomap_readahead(rac, _iomap_ops); + struct iomap_readpage_ctx ctx = { + .rac = rac, + }; + + return iomap_readahead(, _iomap_ops); } static sector_t erofs_bmap(struct address_space *mapping, sector_t block) diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c index a5f4be6b9213..2764e1e99e8b 100644 --- a/fs/gfs2/aops.c +++ b/fs/gfs2/aops.c @@ -453,10 +453,13 @@ static int gfs2_read_folio(struct file *file, struct folio *folio) struct gfs2_inode *ip = GFS2_I(inode); struct gfs2_sbd *sdp = GFS2_SB(inode); int error; + struct iomap_readpage_ctx ctx = { + .cur_folio = folio, + }; if (!gfs2_is_jdata(ip) || (i_blocksize(inode) == PAGE_SIZE && !folio_buffers(folio))) { - error = iomap_read_folio(folio, _iomap_ops); + error = iomap_read_folio(, _iomap_ops); } else if (gfs2_is_stuffed(ip)) { error = stuffed_readpage(ip, >page); folio_unlock(folio); @@ -528,13 +531,16 @@ static 
void gfs2_readahead(struct readahead_control *rac) { struct inode *inode = rac->mapping->host; struct gfs2_inode *ip = GFS2_I(inode); + struct iomap_readpage_ctx ctx = { + .rac = rac, + }; if (gfs2_is_stuffed(ip)) ; else if (gfs2_is_jdata(ip)) mpage_readahead(rac, gfs2_block_map); else - iomap_readahead(rac, _iomap_ops); + iomap_readahead(, _iomap_ops); } /** diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 6f4c97a6d7e9..d39be64b1da9 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -194,13 +194,6 @@ static void iomap_read_end_io(struct bio *bio) bio_put(bio); } -struct iomap_readpage_ctx { - struct folio*cur_folio; - boolcur_folio_in_bio; - struct bio *bio; - struct readahead_control *rac; -}; - /** * iomap_read_inline_data - copy inline data into the page cache * @iter: iteration structure @@ -325,32 +318,29 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter, return pos - orig_pos + plen; } -int iomap_read_folio(struct folio *folio, const struct iomap_ops *ops) +int iomap_read_folio(struct iomap_readpage_ctx *ctx, const struct iomap_ops *ops) { struct iomap_iter iter = { - .inode = folio->mapping->host, - .pos= folio_pos(folio), - .len= folio_size(folio), - }; - struct iomap_readpage_ctx ctx = { - .cur_folio = folio, + .inode = ctx->cur_folio->mapping->host, + .pos= folio_pos(ctx->cur_folio), + .len= folio_size(ctx->cur_folio), }; int ret; trace_iomap_readpage(iter.inode, 1); while ((ret = iomap_iter(, ops)) > 0) - iter.processed = iomap_readpage_iter(, , 0); + iter.processed = iomap_readpage_iter(, ctx, 0); if (ret < 0) - folio_set_error(folio); + folio_set_error(ctx->cur_folio); - if (ctx.bio) { - submit_bio(ctx.bio); - WARN_ON_ONCE(!ctx.cur_folio_in_bio); + if (ctx->bio) { + submit_bio(ctx->bio); + WARN_ON_ONCE(!ctx->cur_folio_in_bio); } else { - WARN_ON_ONCE(ctx.cur_folio_in_bio); - folio_unlock(folio); + WARN_ON_ONCE(ctx->cur_folio_in_bio); + folio_
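The point of hoisting the context is that the caller, not iomap, decides what goes into it. A userspace sketch of that pattern is below, assuming the context carries an optional ops pointer as the ctx->ops use in patch 09 suggests; every demo_* name is hypothetical and the completion handler is reduced to an enum tag.

```c
#include <stddef.h>

/* Which completion handler ends up configured for the read. */
enum demo_end_io {
	DEMO_END_IO_DEFAULT,	/* plain read completion */
	DEMO_END_IO_VERIFY,	/* filesystem verifies before completing */
};

struct demo_ioend {
	enum demo_end_io end_io;
};

/* Optional callouts a filesystem may pass in through the context. */
struct demo_readpage_ops {
	void (*prepare_ioend)(struct demo_ioend *ioend);
};

/* Caller-owned read context, mirroring the hoisted iomap_readpage_ctx. */
struct demo_readpage_ctx {
	int				cur_folio; /* stand-in for a folio */
	const struct demo_readpage_ops	*ops;	   /* may be NULL */
};

/* Generic read path: apply the default, then let the filesystem adjust. */
static enum demo_end_io demo_read_folio(const struct demo_readpage_ctx *ctx)
{
	struct demo_ioend ioend = { .end_io = DEMO_END_IO_DEFAULT };

	if (ctx->ops && ctx->ops->prepare_ioend)
		ctx->ops->prepare_ioend(&ioend);
	return ioend.end_io;
}

/* A filesystem that wants verification on completion. */
static void demo_verity_prepare_ioend(struct demo_ioend *ioend)
{
	ioend->end_io = DEMO_END_IO_VERIFY;
}

static const struct demo_readpage_ops demo_verity_ops = {
	.prepare_ioend = demo_verity_prepare_ioend,
};
```

Callers that leave ops unset, as the erofs and gfs2 conversions above do, keep the existing behaviour, which is why the conversion is mechanical for them.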
[Cluster-devel] [PATCH v2 11/23] xfs: add XFS_DA_OP_BUFFER to make xfs_attr_get() return buffer
One of the essential ideas of fs-verity is that pages which have already
been verified do not need to be re-verified while they remain in the
page cache.

XFS stores Merkle tree blocks in extended attributes. Each attribute
holds one Merkle tree block. We cannot directly mark the underlying
xfs_buf's pages as checked: they are not aligned with the xattr value,
and we don't hold a reference to that buffer, which is released
immediately after the value is copied out.

One way to track that a block was verified is to mark the xattr's buffer
as verified. If the buffer is evicted, the incore XBF_VERITY_CHECKED flag
is lost. When the xattr is read again, xfs_attr_get() returns a new
buffer without the flag. The flag is then used to tell fs-verity whether
it is a new page or a cached one.

This patch adds XFS_DA_OP_BUFFER to tell xfs_attr_get() to
xfs_buf_hold() the underlying buffer and return it as xfs_da_args->bp.
The caller must then xfs_buf_rele() the buffer.

Signed-off-by: Andrey Albershteyn
---
 fs/xfs/libxfs/xfs_attr.c        |  5 ++++-
 fs/xfs/libxfs/xfs_attr_leaf.c   |  7 +++++++
 fs/xfs/libxfs/xfs_attr_remote.c | 13 +++++++++++--
 fs/xfs/libxfs/xfs_da_btree.h    |  5 ++++-
 4 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 711022742e34..298b74245267 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -251,6 +251,8 @@ xfs_attr_get_ilocked(
  * If the attribute is found, but exceeds the size limit set by the caller in
  * args->valuelen, return -ERANGE with the size of the attribute that was found
  * in args->valuelen.
+ *
+ * When XFS_DA_OP_BUFFER is set, the caller has to release the buffer args->bp.
  */
 int
 xfs_attr_get(
@@ -269,7 +271,8 @@ xfs_attr_get(
 	args->hashval = xfs_da_hashname(args->name, args->namelen);
 
 	/* Entirely possible to look up a name which doesn't exist */
-	args->op_flags = XFS_DA_OP_OKNOENT;
+	args->op_flags = XFS_DA_OP_OKNOENT |
+					(args->op_flags & XFS_DA_OP_BUFFER);
 
 	lock_mode = xfs_ilock_attr_map_shared(args->dp);
 	error = xfs_attr_get_ilocked(args);
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index beee51ad75ce..112bb2604c89 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -2533,6 +2533,13 @@ xfs_attr3_leaf_getvalue(
 		name_loc = xfs_attr3_leaf_name_local(leaf, args->index);
 		ASSERT(name_loc->namelen == args->namelen);
 		ASSERT(memcmp(args->name, name_loc->nameval, args->namelen) == 0);
+
+		/* must be released by the caller */
+		if (args->op_flags & XFS_DA_OP_BUFFER) {
+			xfs_buf_hold(bp);
+			args->bp = bp;
+		}
+
 		return xfs_attr_copy_value(args,
 					&name_loc->nameval[args->namelen],
 					be16_to_cpu(name_loc->valuelen));
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index d440393b40eb..72908e0e1c86 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -424,9 +424,18 @@ xfs_attr_rmtval_get(
 			error = xfs_attr_rmtval_copyout(mp, bp, args->dp->i_ino,
 							&offset, &valuelen,
 							&dst);
-			xfs_buf_relse(bp);
-			if (error)
+			xfs_buf_unlock(bp);
+			/* must be released by the caller */
+			if (args->op_flags & XFS_DA_OP_BUFFER)
+				args->bp = bp;
+			else
+				xfs_buf_rele(bp);
+
+			if (error) {
+				if (args->op_flags & XFS_DA_OP_BUFFER)
+					xfs_buf_rele(args->bp);
 				return error;
+			}
 
 			/* roll attribute extent map forwards */
 			lblkno += map[i].br_blockcount;
diff --git a/fs/xfs/libxfs/xfs_da_btree.h b/fs/xfs/libxfs/xfs_da_btree.h
index a4b29827603f..269d26730bca 100644
--- a/fs/xfs/libxfs/xfs_da_btree.h
+++ b/fs/xfs/libxfs/xfs_da_btree.h
@@ -61,6 +61,7 @@ typedef struct xfs_da_args {
 	uint8_t		filetype;	/* filetype of inode for directories */
 	void		*value;		/* set of bytes (maybe contain NULLs) */
 	int		valuelen;	/* length of value */
+	struct xfs_buf	*bp;		/* OUT: xfs_buf which contains the attr */
 	unsigned int	attr_filter;	/* XFS_ATTR_{ROOT,SECURE,INCOMPLETE} */
 	unsigned int	attr_flags;	/* XATTR_{CREATE,REPLACE} */
 	xfs_dahash_t	hashval;
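The ownership rule described in the commit message (the lookup hands an extra reference to the caller only when asked, and the caller must drop it) can be modelled with a toy reference count. This is a rough userspace sketch, not the XFS buffer cache: `struct buf`, `struct get_args`, `OP_BUFFER`, `buf_hold()`, `buf_rele()` and `attr_get()` are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

#define OP_BUFFER (1u << 0)	/* hypothetical flag, modelled on XFS_DA_OP_BUFFER */

struct buf {
	int refcount;
};

struct get_args {
	unsigned int op_flags;
	struct buf *bp;		/* OUT: set only when OP_BUFFER was requested */
};

static void buf_hold(struct buf *bp)
{
	bp->refcount++;
}

static void buf_rele(struct buf *bp)
{
	assert(bp->refcount > 0);
	bp->refcount--;
}

/* Toy lookup: takes its own reference, optionally hands one to the caller. */
static void attr_get(struct get_args *args, struct buf *bp)
{
	buf_hold(bp);			/* reference held during the lookup */
	args->bp = NULL;
	if (args->op_flags & OP_BUFFER) {
		buf_hold(bp);		/* extra reference owned by the caller */
		args->bp = bp;
	}
	buf_rele(bp);			/* lookup reference is always dropped */
}
```

A caller that sets the flag ends up with exactly one reference to release; a caller that does not set it never sees the buffer at all.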
[Cluster-devel] [PATCH v2 16/23] xfs: add inode on-disk VERITY flag
Add a flag to mark inodes which have fs-verity enabled on them (i.e.
the descriptor exists and the tree is built).

Signed-off-by: Andrey Albershteyn
---
 fs/ioctl.c                 | 4 ++++
 fs/xfs/libxfs/xfs_format.h | 4 +++-
 fs/xfs/xfs_inode.c         | 2 ++
 fs/xfs/xfs_iops.c          | 2 ++
 include/uapi/linux/fs.h    | 1 +
 5 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 5b2481cd4750..a274b33b2fd0 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -480,6 +480,8 @@ void fileattr_fill_xflags(struct fileattr *fa, u32 xflags)
 		fa->flags |= FS_DAX_FL;
 	if (fa->fsx_xflags & FS_XFLAG_PROJINHERIT)
 		fa->flags |= FS_PROJINHERIT_FL;
+	if (fa->fsx_xflags & FS_XFLAG_VERITY)
+		fa->flags |= FS_VERITY_FL;
 }
 EXPORT_SYMBOL(fileattr_fill_xflags);
 
@@ -510,6 +512,8 @@ void fileattr_fill_flags(struct fileattr *fa, u32 flags)
 		fa->fsx_xflags |= FS_XFLAG_DAX;
 	if (fa->flags & FS_PROJINHERIT_FL)
 		fa->fsx_xflags |= FS_XFLAG_PROJINHERIT;
+	if (fa->flags & FS_VERITY_FL)
+		fa->fsx_xflags |= FS_XFLAG_VERITY;
 }
 EXPORT_SYMBOL(fileattr_fill_flags);
 
diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index ef617be2839c..ccb2ae5c2c93 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -1070,16 +1070,18 @@ static inline void xfs_dinode_put_rdev(struct xfs_dinode *dip, xfs_dev_t rdev)
 #define XFS_DIFLAG2_COWEXTSIZE_BIT	2	/* copy on write extent size hint */
 #define XFS_DIFLAG2_BIGTIME_BIT		3	/* big timestamps */
 #define XFS_DIFLAG2_NREXT64_BIT		4	/* large extent counters */
+#define XFS_DIFLAG2_VERITY_BIT		5	/* inode sealed by fsverity */
 
 #define XFS_DIFLAG2_DAX		(1 << XFS_DIFLAG2_DAX_BIT)
 #define XFS_DIFLAG2_REFLINK	(1 << XFS_DIFLAG2_REFLINK_BIT)
 #define XFS_DIFLAG2_COWEXTSIZE	(1 << XFS_DIFLAG2_COWEXTSIZE_BIT)
 #define XFS_DIFLAG2_BIGTIME	(1 << XFS_DIFLAG2_BIGTIME_BIT)
 #define XFS_DIFLAG2_NREXT64	(1 << XFS_DIFLAG2_NREXT64_BIT)
+#define XFS_DIFLAG2_VERITY	(1 << XFS_DIFLAG2_VERITY_BIT)
 
 #define XFS_DIFLAG2_ANY \
 	(XFS_DIFLAG2_DAX | XFS_DIFLAG2_REFLINK | XFS_DIFLAG2_COWEXTSIZE | \
-	 XFS_DIFLAG2_BIGTIME | XFS_DIFLAG2_NREXT64)
+	 XFS_DIFLAG2_BIGTIME | XFS_DIFLAG2_NREXT64 | XFS_DIFLAG2_VERITY)
 
 static inline bool xfs_dinode_has_bigtime(const struct xfs_dinode *dip)
 {
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 5808abab786c..3b2bf9e7580b 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -634,6 +634,8 @@ xfs_ip2xflags(
 			flags |= FS_XFLAG_DAX;
 		if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE)
 			flags |= FS_XFLAG_COWEXTSIZE;
+		if (ip->i_diflags2 & XFS_DIFLAG2_VERITY)
+			flags |= FS_XFLAG_VERITY;
 	}
 
 	if (xfs_inode_has_attr_fork(ip))
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 24718adb3c16..5398be75a76a 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1232,6 +1232,8 @@ xfs_diflags_to_iflags(
 		flags |= S_NOATIME;
 	if (init && xfs_inode_should_enable_dax(ip))
 		flags |= S_DAX;
+	if (xflags & FS_XFLAG_VERITY)
+		flags |= S_VERITY;
 
 	/*
 	 * S_DAX can only be set during inode initialization and is never set by
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index b7b56871029c..5172a2eb902c 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -140,6 +140,7 @@ struct fsxattr {
 #define FS_XFLAG_FILESTREAM	0x00004000	/* use filestream allocator */
 #define FS_XFLAG_DAX		0x00008000	/* use DAX for IO */
 #define FS_XFLAG_COWEXTSIZE	0x00010000	/* CoW extent size allocator hint */
+#define FS_XFLAG_VERITY		0x00020000	/* fs-verity sealed inode */
 #define FS_XFLAG_HASATTR	0x80000000	/* no DIFLAG for this */
 
 /* the read-only stuff doesn't really belong here, but any other place is
-- 
2.38.4
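The patch threads one logical property through three separate bit spaces (on-disk diflags2, the fsxattr xflags, and the VFS inode flags). A small userspace sketch of that translation chain follows. The bit number 5 is taken from the patch; `XFLAG_VERITY` assumes the value 0x00020000, `IFLAG_VERITY` is a made-up stand-in for S_VERITY, and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DIFLAG2_VERITY_BIT	5			/* bit number from the patch */
#define DIFLAG2_VERITY		(1u << DIFLAG2_VERITY_BIT)
#define XFLAG_VERITY		0x00020000u		/* assumed xflag value */
#define IFLAG_VERITY		(1u << 0)		/* hypothetical stand-in for S_VERITY */

/* On-disk inode flags -> fsxattr-style xflags. */
static uint32_t diflags2_to_xflags(uint64_t diflags2)
{
	return (diflags2 & DIFLAG2_VERITY) ? XFLAG_VERITY : 0;
}

/* xflags -> in-memory inode flags. */
static uint32_t xflags_to_iflags(uint32_t xflags)
{
	/* Each layer owns its own bit space; the raw values must not be mixed. */
	assert(XFLAG_VERITY != DIFLAG2_VERITY);
	return (xflags & XFLAG_VERITY) ? IFLAG_VERITY : 0;
}
```

Keeping the conversions explicit, rather than reusing one numeric value across layers, is what lets each layer renumber its bits independently.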
[Cluster-devel] [PATCH v2 13/23] xfs: add iomap's readpage operations
The read IO path provides a callout for configuring the ioend. This
allows a filesystem to add verification of completed BIOs.
xfs_prepare_read_ioend() configures bio->bi_end_io, which places the
verification task on the workqueue. The task does the fs-verity
verification and then calls back into iomap to finish the IO.

This patch adds the callout implementations that verify pages with
fs-verity. It also implements the folio operation .verify_folio for
direct folio verification by fs-verity.

Signed-off-by: Andrey Albershteyn
---
 fs/xfs/xfs_aops.c  | 45 +++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_iomap.c | 11 +++++++++++
 fs/xfs/xfs_linux.h |  1 +
 3 files changed, 57 insertions(+)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index daa0dd4768fb..2a3ab5afd665 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -548,6 +548,49 @@ xfs_vm_bmap(
 	return iomap_bmap(mapping, block, &xfs_read_iomap_ops);
 }
 
+static void
+xfs_read_work_end_io(
+	struct work_struct *work)
+{
+	struct iomap_read_ioend *ioend =
+		container_of(work, struct iomap_read_ioend, work);
+	struct bio *bio = &ioend->read_inline_bio;
+
+	fsverity_verify_bio(bio);
+	iomap_read_end_io(bio);
+	/*
+	 * The iomap_read_ioend has been freed by bio_put() in
+	 * iomap_read_end_io()
+	 */
+}
+
+static void
+xfs_read_end_io(
+	struct bio *bio)
+{
+	struct iomap_read_ioend *ioend =
+		container_of(bio, struct iomap_read_ioend, read_inline_bio);
+	struct xfs_inode	*ip = XFS_I(ioend->io_inode);
+
+	WARN_ON_ONCE(!queue_work(ip->i_mount->m_postread_workqueue,
+					&ioend->work));
+}
+
+static void
+xfs_prepare_read_ioend(
+	struct iomap_read_ioend *ioend)
+{
+	if (!fsverity_active(ioend->io_inode))
+		return;
+
+	INIT_WORK(&ioend->work, &xfs_read_work_end_io);
+	ioend->read_inline_bio.bi_end_io = &xfs_read_end_io;
+}
+
+static const struct iomap_readpage_ops xfs_readpage_ops = {
+	.prepare_ioend		= &xfs_prepare_read_ioend,
+};
+
 STATIC int
 xfs_vm_read_folio(
 	struct file		*unused,
@@ -555,6 +598,7 @@ xfs_vm_read_folio(
 {
 	struct iomap_readpage_ctx ctx = {
 		.cur_folio	= folio,
+		.ops		= &xfs_readpage_ops,
 	};
 
 	return iomap_read_folio(&ctx, &xfs_read_iomap_ops);
@@ -566,6 +610,7 @@ xfs_vm_readahead(
 {
 	struct iomap_readpage_ctx ctx = {
 		.rac		= rac,
+		.ops		= &xfs_readpage_ops,
 	};
 
 	iomap_readahead(&ctx, &xfs_read_iomap_ops);
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 285885c308bd..e0f3c5d709f6 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -27,6 +27,7 @@
 #include "xfs_dquot_item.h"
 #include "xfs_dquot.h"
 #include "xfs_reflink.h"
+#include "xfs_verity.h"
 
 #define XFS_ALLOC_ALIGN(mp, off) \
 	(((off) >> mp->m_allocsize_log) << mp->m_allocsize_log)
@@ -83,8 +84,18 @@ xfs_iomap_valid(
 	return true;
 }
 
+static bool
+xfs_verify_folio(
+	struct folio	*folio,
+	loff_t		pos,
+	unsigned int	len)
+{
+	return fsverity_verify_folio(folio, len, pos);
+}
+
 static const struct iomap_folio_ops xfs_iomap_folio_ops = {
 	.iomap_valid		= xfs_iomap_valid,
+	.verify_folio		= xfs_verify_folio,
 };
 
 int
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index e88f18f85e4b..c574fbf4b67d 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -63,6 +63,7 @@ typedef __u32			xfs_nlink_t;
 #include
 #include
 #include
+#include
 #include
 #include
 
-- 
2.38.4
[Cluster-devel] [PATCH v2 05/23] fsverity: make fsverity_verify_folio() accept folio's offset and size
The whole folio does not always need to be verified by fs-verity (e.g.
with 1k blocks). Use the passed folio offset and size.

Signed-off-by: Andrey Albershteyn
---
 include/linux/fsverity.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 119a3266791f..6d7a4b3ea626 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -249,9 +249,10 @@ static inline void fsverity_enqueue_verify_work(struct work_struct *work)
 
 #endif	/* !CONFIG_FS_VERITY */
 
-static inline bool fsverity_verify_folio(struct folio *folio)
+static inline bool fsverity_verify_folio(struct folio *folio, size_t len,
+					 size_t offset)
 {
-	return fsverity_verify_blocks(folio, folio_size(folio), 0);
+	return fsverity_verify_blocks(folio, len, offset);
 }
 
 static inline bool fsverity_verify_page(struct page *page)
-- 
2.38.4
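To make the 1k-block case concrete, here is a small arithmetic sketch of which blocks a (len, offset) pair selects within a folio. It assumes that both values are multiples of a power-of-two block size; `first_block()` and `blocks_to_verify()` are hypothetical helpers for illustration and are not part of fs-verity.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the first block covered by a byte offset within the folio. */
static size_t first_block(size_t offset, size_t block_size)
{
	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
	assert(offset % block_size == 0);
	return offset / block_size;
}

/* Number of blocks covered by a length starting at an aligned offset. */
static size_t blocks_to_verify(size_t len, size_t offset, size_t block_size)
{
	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
	assert(offset % block_size == 0 && len % block_size == 0);
	return len / block_size;
}
```

With a 4096-byte folio and 1024-byte blocks, a read of 1024 bytes at offset 2048 covers a single block (index 2), whereas verifying the whole folio would cover all four.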
[Cluster-devel] [PATCH v2 07/23] fsverity: pass Merkle tree block size to ->read_merkle_tree_page()
XFS needs to know the size of a Merkle tree block, as these blocks are
not stored consecutively in fs blocks but under indexes in extended
attributes.

Signed-off-by: Andrey Albershteyn
---
 fs/btrfs/verity.c         | 3 ++-
 fs/ext4/verity.c          | 3 ++-
 fs/f2fs/verity.c          | 3 ++-
 fs/verity/read_metadata.c | 3 ++-
 fs/verity/verify.c        | 3 ++-
 include/linux/fsverity.h  | 3 ++-
 6 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/verity.c b/fs/btrfs/verity.c
index 4c2c09204bb4..737ad277b15a 100644
--- a/fs/btrfs/verity.c
+++ b/fs/btrfs/verity.c
@@ -713,7 +713,8 @@ int btrfs_get_verity_descriptor(struct inode *inode, void *buf, size_t buf_size)
  */
 static struct page *btrfs_read_merkle_tree_page(struct inode *inode,
 						pgoff_t index,
-						unsigned long num_ra_pages)
+						unsigned long num_ra_pages,
+						u8 log_blocksize)
 {
 	struct page *page;
 	u64 off = (u64)index << PAGE_SHIFT;
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index 35a2feb6fd68..cbf1253dd14a 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -361,7 +361,8 @@ static int ext4_get_verity_descriptor(struct inode *inode, void *buf,
 
 static struct page *ext4_read_merkle_tree_page(struct inode *inode,
 					       pgoff_t index,
-					       unsigned long num_ra_pages)
+					       unsigned long num_ra_pages,
+					       u8 log_blocksize)
 {
 	struct page *page;
 
diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
index 019c7a6c6bcf..63c6a1b1bdef 100644
--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -256,7 +256,8 @@ static int f2fs_get_verity_descriptor(struct inode *inode, void *buf,
 
 static struct page *f2fs_read_merkle_tree_page(struct inode *inode,
 					       pgoff_t index,
-					       unsigned long num_ra_pages)
+					       unsigned long num_ra_pages,
+					       u8 log_blocksize)
 {
 	struct page *page;
 
diff --git a/fs/verity/read_metadata.c b/fs/verity/read_metadata.c
index cab1612bf4a3..d6cc58c24a2e 100644
--- a/fs/verity/read_metadata.c
+++ b/fs/verity/read_metadata.c
@@ -44,7 +44,8 @@ static int fsverity_read_merkle_tree(struct inode *inode,
 		struct page *page;
 		const void *virt;
 
-		page = vops->read_merkle_tree_page(inode, index, num_ra_pages);
+		page = vops->read_merkle_tree_page(inode, index, num_ra_pages,
+						   vi->tree_params.log_blocksize);
 		if (IS_ERR(page)) {
 			err = PTR_ERR(page);
 			fsverity_err(inode,
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index c2fc4c86af34..9213b1e5ed2c 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -199,7 +199,8 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 
 		hpage = inode->i_sb->s_vop->read_merkle_tree_page(inode,
 				hpage_idx, level == 0 ? min(max_ra_pages,
-					params->tree_pages - hpage_idx) : 0);
+					params->tree_pages - hpage_idx) : 0,
+				params->log_blocksize);
 		if (IS_ERR(hpage)) {
 			err = PTR_ERR(hpage);
 			fsverity_err(inode,
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 3e923a8e0d6f..ad07a1d10fdf 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -103,7 +103,8 @@ struct fsverity_operations {
 	 */
 	struct page *(*read_merkle_tree_page)(struct inode *inode,
 					      pgoff_t index,
-					      unsigned long num_ra_pages);
+					      unsigned long num_ra_pages,
+					      u8 log_blocksize);
 
 	/**
 	 * Write a Merkle tree block to the given inode.
-- 
2.38.4
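Why a filesystem that keys tree blocks by index needs log_blocksize can be shown with a little arithmetic: the callout receives a page index, and converting it to a block index requires the block size. The sketch below is an assumption about how such a mapping could look, not how XFS keys its attributes (this series excerpt does not show that); `blocks_per_page()` and `page_to_first_block()` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* How many Merkle tree blocks fit in one page. */
static unsigned int blocks_per_page(unsigned int page_shift, uint8_t log_blocksize)
{
	assert(log_blocksize <= page_shift);
	return 1u << (page_shift - log_blocksize);
}

/* Index of the first tree block stored in the page with the given index. */
static uint64_t page_to_first_block(uint64_t page_index,
				    unsigned int page_shift,
				    uint8_t log_blocksize)
{
	assert(log_blocksize <= page_shift);
	return page_index << (page_shift - log_blocksize);
}
```

For 4k pages (shift 12) and 1k tree blocks (log_blocksize 10), page 3 starts at tree block 12 and holds four blocks; when the block size equals the page size the mapping is the identity.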
[Cluster-devel] [PATCH v2 10/23] xfs: add XBF_VERITY_CHECKED xfs_buf flag
The flag means that the value of the extended attribute in the buffer
has been verified. The underlying pages have PageChecked() == false
(the way fs-verity identifies verified pages), as the page content will
be copied out to newly allocated pages in later patches.

The flag is used later to SetPageChecked() on pages handed to
fs-verity.

Signed-off-by: Andrey Albershteyn
---
 fs/xfs/xfs_buf.h | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 549c60942208..8cc86fed962b 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -24,14 +24,15 @@ struct xfs_buf;
 
 #define XFS_BUF_DADDR_NULL	((xfs_daddr_t) (-1LL))
 
-#define XBF_READ	 (1u << 0) /* buffer intended for reading from device */
-#define XBF_WRITE	 (1u << 1) /* buffer intended for writing to device */
-#define XBF_READ_AHEAD	 (1u << 2) /* asynchronous read-ahead */
-#define XBF_NO_IOACCT	 (1u << 3) /* bypass I/O accounting (non-LRU bufs) */
-#define XBF_ASYNC	 (1u << 4) /* initiator will not wait for completion */
-#define XBF_DONE	 (1u << 5) /* all pages in the buffer uptodate */
-#define XBF_STALE	 (1u << 6) /* buffer has been staled, do not find it */
-#define XBF_WRITE_FAIL	 (1u << 7) /* async writes have failed on this buffer */
+#define XBF_READ		(1u << 0) /* buffer intended for reading from device */
+#define XBF_WRITE		(1u << 1) /* buffer intended for writing to device */
+#define XBF_READ_AHEAD		(1u << 2) /* asynchronous read-ahead */
+#define XBF_NO_IOACCT		(1u << 3) /* bypass I/O accounting (non-LRU bufs) */
+#define XBF_ASYNC		(1u << 4) /* initiator will not wait for completion */
+#define XBF_DONE		(1u << 5) /* all pages in the buffer uptodate */
+#define XBF_STALE		(1u << 6) /* buffer has been staled, do not find it */
+#define XBF_WRITE_FAIL		(1u << 7) /* async writes have failed on this buffer */
+#define XBF_VERITY_CHECKED	(1u << 8) /* buffer was verified by fs-verity */
 
 /* buffer type flags for write callbacks */
 #define _XBF_INODES	 (1u << 16)/* inode buffer */
-- 
2.38.4
[Cluster-devel] [PATCH v2 23/23] xfs: enable ro-compat fs-verity flag
Finalize fs-verity integration in XFS by making the kernel fs-verity
aware via the ro-compat flag.

Signed-off-by: Andrey Albershteyn
---
 fs/xfs/libxfs/xfs_format.h | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index ccb2ae5c2c93..a21612319765 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -355,10 +355,11 @@ xfs_sb_has_compat_feature(
 #define XFS_SB_FEAT_RO_COMPAT_INOBTCNT (1 << 3)		/* inobt block counts */
 #define XFS_SB_FEAT_RO_COMPAT_VERITY   (1 << 4)		/* fs-verity */
 #define XFS_SB_FEAT_RO_COMPAT_ALL \
-		(XFS_SB_FEAT_RO_COMPAT_FINOBT | \
-		 XFS_SB_FEAT_RO_COMPAT_RMAPBT | \
-		 XFS_SB_FEAT_RO_COMPAT_REFLINK| \
-		 XFS_SB_FEAT_RO_COMPAT_INOBTCNT)
+		(XFS_SB_FEAT_RO_COMPAT_FINOBT  | \
+		 XFS_SB_FEAT_RO_COMPAT_RMAPBT  | \
+		 XFS_SB_FEAT_RO_COMPAT_REFLINK | \
+		 XFS_SB_FEAT_RO_COMPAT_INOBTCNT| \
+		 XFS_SB_FEAT_RO_COMPAT_VERITY)
 #define XFS_SB_FEAT_RO_COMPAT_UNKNOWN	~XFS_SB_FEAT_RO_COMPAT_ALL
 static inline bool
 xfs_sb_has_ro_compat_feature(
-- 
2.38.4
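The effect of adding a bit to the "ALL" mask can be illustrated with a short mask check: a kernel whose known mask lacks the verity bit treats it as unknown, and unknown ro-compat features conventionally rule out a read-write mount. The sketch below is a simplified userspace model; the bit positions for INOBTCNT and VERITY come from the diff, while `RO_COMPAT_OLD_ALL`, `RO_COMPAT_NEW_ALL` and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RO_COMPAT_INOBTCNT	(1u << 3)	/* from the diff */
#define RO_COMPAT_VERITY	(1u << 4)	/* from the diff */
#define RO_COMPAT_OLD_ALL	0x0fu		/* bits 0..3: mask before this patch */
#define RO_COMPAT_NEW_ALL	(RO_COMPAT_OLD_ALL | RO_COMPAT_VERITY)

/* True if the superblock carries ro-compat bits outside the known mask. */
static bool has_unknown_ro_compat(uint32_t sb_features, uint32_t known_mask)
{
	return (sb_features & ~known_mask) != 0;
}

/* Simplified policy: read-write mounts require every bit to be known. */
static bool may_mount_rw(uint32_t sb_features, uint32_t known_mask)
{
	assert(known_mask != 0);
	return !has_unknown_ro_compat(sb_features, known_mask);
}
```

This is why the flag is only added to the mask in the last patch of the series: until then the kernel would refuse to write to a filesystem carrying the verity feature bit.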
[Cluster-devel] [PATCH v2 03/23] xfs: define parent pointer xattr format
From: Allison Henderson

We need to define the parent pointer attribute format before we start
adding support for it into all the code that needs to use it. The EA
format we will use encodes the following information:

	name={parent inode #, parent inode generation, dirent offset}
	value={dirent filename}

The inode/gen gives all the information we need to reliably identify the
parent without requiring child->parent lock ordering, and allows
userspace to do pathname component level reconstruction without the
kernel ever needing to verify the parent itself as part of ioctl calls.

By using the dirent offset in the EA name, we have a method of knowing
the exact parent pointer EA we need to modify/remove in rename/unlink
without an unbound EA name search.

By keeping the dirent name in the value, we have enough information to
be able to validate and reconstruct damaged directory trees. While the
diroffset of a filename alone is not unique enough to identify the
child, the {diroffset,filename,child_inode} tuple is sufficient. That
is, if the diroffset gets reused and points to a different filename, we
can detect that from the contents of the EA. If a link of the same name
is created, then we can check whether it points at the same inode as the
parent EA we currently have.

Signed-off-by: Dave Chinner
Signed-off-by: Allison Henderson
Reviewed-by: Darrick J. Wong
---
 fs/xfs/libxfs/xfs_da_format.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h
index 3dc03968bba6..b02b67f1999e 100644
--- a/fs/xfs/libxfs/xfs_da_format.h
+++ b/fs/xfs/libxfs/xfs_da_format.h
@@ -805,4 +805,29 @@ static inline unsigned int xfs_dir2_dirblock_bytes(struct xfs_sb *sbp)
 xfs_failaddr_t xfs_da3_blkinfo_verify(struct xfs_buf *bp,
 				      struct xfs_da3_blkinfo *hdr3);
 
+/*
+ * Parent pointer attribute format definition
+ *
+ * EA name encodes the parent inode number, generation and the offset of
+ * the dirent that points to the child inode. The EA value contains the
+ * same name as the dirent in the parent directory.
+ */
+struct xfs_parent_name_rec {
+	__be64  p_ino;
+	__be32  p_gen;
+	__be32  p_diroffset;
+};
+
+/*
+ * incore version of the above, also contains name pointers so callers
+ * can pass/obtain all the parent pointer information in a single structure
+ */
+struct xfs_parent_name_irec {
+	xfs_ino_t		p_ino;
+	uint32_t		p_gen;
+	xfs_dir2_dataptr_t	p_diroffset;
+	const char		*p_name;
+	uint8_t			p_namelen;
+};
+
 #endif /* __XFS_DA_FORMAT_H__ */
-- 
2.38.4
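As an illustration of the name format above, the fixed 16-byte EA name (a big-endian 64-bit inode number followed by two big-endian 32-bit fields) can be built in userspace as follows. This is a sketch only: `struct parent_irec`, `put_be32()`, `put_be64()` and `parent_name_encode()` are hypothetical names, and the kernel would use its own `__be64`/`__be32` conversion helpers rather than manual byte shifts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PARENT_NAME_LEN 16	/* 8 + 4 + 4 bytes, as in the on-disk record */

/* Hypothetical incore form: host-endian fields. */
struct parent_irec {
	uint64_t ino;
	uint32_t gen;
	uint32_t diroffset;
};

static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

static void put_be64(uint8_t *p, uint64_t v)
{
	put_be32(p, (uint32_t)(v >> 32));
	put_be32(p + 4, (uint32_t)v);
}

/* Serialize the incore record into the big-endian EA name bytes. */
static void parent_name_encode(uint8_t out[PARENT_NAME_LEN],
			       const struct parent_irec *irec)
{
	assert(out != NULL && irec != NULL);

	put_be64(out, irec->ino);
	put_be32(out + 8, irec->gen);
	put_be32(out + 12, irec->diroffset);
}
```

Because the name is a fixed-size binary key, a rename or unlink that knows the parent inode, generation and dirent offset can rebuild the exact key and look it up directly, which matches the "no unbound EA name search" point in the commit message.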
[Cluster-devel] [PATCH v2 06/23] fsverity: add drop_page() callout
Allow a filesystem to do additional processing on verified pages
instead of just dropping a reference. This will be used by XFS for
internal buffer cache manipulation in later patches. btrfs, ext4, and
f2fs just drop the reference.

Signed-off-by: Andrey Albershteyn
---
 fs/btrfs/verity.c         | 12 ++++++++++++
 fs/ext4/verity.c          |  6 ++++++
 fs/f2fs/verity.c          |  6 ++++++
 fs/verity/read_metadata.c |  4 ++--
 fs/verity/verify.c        |  6 +++---
 include/linux/fsverity.h  | 10 ++++++++++
 6 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/verity.c b/fs/btrfs/verity.c
index c5ff16f9e9fa..4c2c09204bb4 100644
--- a/fs/btrfs/verity.c
+++ b/fs/btrfs/verity.c
@@ -804,10 +804,22 @@ static int btrfs_write_merkle_tree_block(struct inode *inode, const void *buf,
 			       pos, buf, size);
 }
 
+/*
+ * fsverity op that releases the reference obtained by ->read_merkle_tree_page()
+ *
+ * @page:  reference to the page which can be released
+ *
+ */
+static void btrfs_drop_page(struct page *page)
+{
+	put_page(page);
+}
+
 const struct fsverity_operations btrfs_verityops = {
 	.begin_enable_verity     = btrfs_begin_enable_verity,
 	.end_enable_verity       = btrfs_end_enable_verity,
 	.get_verity_descriptor   = btrfs_get_verity_descriptor,
 	.read_merkle_tree_page   = btrfs_read_merkle_tree_page,
 	.write_merkle_tree_block = btrfs_write_merkle_tree_block,
+	.drop_page		 = &btrfs_drop_page,
 };
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index e4da1704438e..35a2feb6fd68 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -388,10 +388,16 @@ static int ext4_write_merkle_tree_block(struct inode *inode, const void *buf,
 	return pagecache_write(inode, buf, size, pos);
 }
 
+static void ext4_drop_page(struct page *page)
+{
+	put_page(page);
+}
+
 const struct fsverity_operations ext4_verityops = {
 	.begin_enable_verity	= ext4_begin_enable_verity,
 	.end_enable_verity	= ext4_end_enable_verity,
 	.get_verity_descriptor	= ext4_get_verity_descriptor,
 	.read_merkle_tree_page	= ext4_read_merkle_tree_page,
 	.write_merkle_tree_block = ext4_write_merkle_tree_block,
+	.drop_page		= &ext4_drop_page,
 };
diff --git a/fs/f2fs/verity.c b/fs/f2fs/verity.c
index 4fc95f353a7a..019c7a6c6bcf 100644
--- a/fs/f2fs/verity.c
+++ b/fs/f2fs/verity.c
@@ -283,10 +283,16 @@ static int f2fs_write_merkle_tree_block(struct inode *inode, const void *buf,
 	return pagecache_write(inode, buf, size, pos);
 }
 
+static void f2fs_drop_page(struct page *page)
+{
+	put_page(page);
+}
+
 const struct fsverity_operations f2fs_verityops = {
 	.begin_enable_verity	= f2fs_begin_enable_verity,
 	.end_enable_verity	= f2fs_end_enable_verity,
 	.get_verity_descriptor	= f2fs_get_verity_descriptor,
 	.read_merkle_tree_page	= f2fs_read_merkle_tree_page,
 	.write_merkle_tree_block = f2fs_write_merkle_tree_block,
+	.drop_page		= &f2fs_drop_page,
 };
diff --git a/fs/verity/read_metadata.c b/fs/verity/read_metadata.c
index 2aefc5565152..cab1612bf4a3 100644
--- a/fs/verity/read_metadata.c
+++ b/fs/verity/read_metadata.c
@@ -56,12 +56,12 @@ static int fsverity_read_merkle_tree(struct inode *inode,
 		virt = kmap_local_page(page);
 		if (copy_to_user(buf, virt + offs_in_page, bytes_to_copy)) {
 			kunmap_local(virt);
-			put_page(page);
+			inode->i_sb->s_vop->drop_page(page);
 			err = -EFAULT;
 			break;
 		}
 		kunmap_local(virt);
-		put_page(page);
+		inode->i_sb->s_vop->drop_page(page);
 
 		retval += bytes_to_copy;
 		buf += bytes_to_copy;
diff --git a/fs/verity/verify.c b/fs/verity/verify.c
index f50e3b5b52c9..c2fc4c86af34 100644
--- a/fs/verity/verify.c
+++ b/fs/verity/verify.c
@@ -210,7 +210,7 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		if (is_hash_block_verified(vi, hpage, hblock_idx)) {
 			memcpy_from_page(_want_hash, hpage, hoffset, hsize);
 			want_hash = _want_hash;
-			put_page(hpage);
+			inode->i_sb->s_vop->drop_page(hpage);
 			goto descend;
 		}
 		hblocks[level].page = hpage;
@@ -248,7 +248,7 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
 		SetPageChecked(hpage);
 		memcpy_from_page(_want_hash, hpage, hoffset, hsize);
 		want_hash = _want_hash;
-		put_page(hpage);
+		inode->i_sb->s_vop->drop_page(hpage);
 	}
 
 	/* Finally, verify the data block. */
@@ -259,7 +259,7 @@ verify_data_block(struct