> On Feb 5, 2021, at 1:39 AM, Eric Biggers <ebigg...@kernel.org> wrote:
> 
> On Thu, Feb 04, 2021 at 03:21:38PM -0800, Boris Burkov wrote:
>> +/*
>> + * drop all the items for this inode with this key_type.  Before
>> + * doing a verity enable we cleanup any existing verity items.
>> + *
>> + * This is also used to clean up if a verity enable failed half way
>> + * through
>> + */
>> +static int drop_verity_items(struct btrfs_inode *inode, u8 key_type)
> 
> I assume you checked whether there's already code in btrfs that does this?

Nikolay correctly called out btrfs_truncate_inode_items(), but I wanted to keep 
my fingers out of that in v0 of the patches.

>  This
> seems like a fairly generic thing that might be needed elsewhere in btrfs.
> Similarly for write_key_bytes() and read_key_bytes().
> 

It might make sense to make read/write_key_bytes() our generic functions, but 
unless I missed something I didn’t see great fits.

>> +/*
>> + * fsverity does a two pass setup for reading the descriptor, in the first 
>> pass
>> + * it calls with buf_size = 0 to query the size of the descriptor,
>> + * and then in the second pass it actually reads the descriptor off
>> + * disk.
>> + */
>> +static int btrfs_get_verity_descriptor(struct inode *inode, void *buf,
>> +                                   size_t buf_size)
>> +{
>> +    ssize_t ret = 0;
>> +
>> +    if (!buf_size) {
>> +            return read_key_bytes(BTRFS_I(inode),
>> +                                 BTRFS_VERITY_DESC_ITEM_KEY,
>> +                                 0, NULL, (u64)-1, NULL);
>> +    }
>> +
>> +    ret = read_key_bytes(BTRFS_I(inode),
>> +                         BTRFS_VERITY_DESC_ITEM_KEY, 0,
>> +                         buf, buf_size, NULL);
>> +    if (ret < 0)
>> +            return ret;
>> +
>> +    if (ret != buf_size)
>> +            return -EIO;
>> +
>> +    return buf_size;
>> +}
> 
> This doesn't return the right value when buf_size != 0 && buf_size != 
> desc_size.
> It's supposed to return the actual size or -ERANGE, like getxattr() does; see
> the comment above fsverity_operations::get_verity_descriptor.
> 
> It doesn't matter much because that case doesn't happen currently, but it 
> would
> be nice to keep things consistent.
> 

Forgot all about this, but I was going to suggest optimizing this part a bit, 
since btrfs is doing a tree search for each call.  I’d love to have a way to do 
it in one search-allocate-copy round.

>> +/*
>> + * reads and caches a merkle tree page.  These are stored in the btree,
>> + * but we cache them in the inode's address space after EOF.
>> + */
>> +static struct page *btrfs_read_merkle_tree_page(struct inode *inode,
>> +                                           pgoff_t index,
>> +                                           unsigned long num_ra_pages)
>> +{
>> +    struct page *p;
>> +    u64 start = index << PAGE_SHIFT;
>> +    unsigned long mapping_index;
>> +    ssize_t ret;
>> +    int err;
>> +
>> +    /*
>> +     * the file is readonly, so i_size can't change here.  We jump
>> +     * some pages past the last page to cache our merkles.  The goal
>> +     * is just to jump past any hugepages that might be mapped in.
>> +     */
>> +    mapping_index = (i_size_read(inode) >> PAGE_SHIFT) + 2048 + index;
> 
> btrfs allows files of up to the page cache limit of MAX_LFS_FILESIZE already.
> So if the Merkle tree pages are cached past EOF like this, it would be 
> necessary
> to limit the size of files that verity can be enabled on, like what ext4 and
> f2fs do.  See the -EFBIG case in pagecache_write() in fs/ext4/verity.c and
> fs/f2fs/verity.c.
> 
> Note that this extra limit isn't likely to be encountered in practice, as it
> would only decrease a very large limit by about 1%, and fs-verity isn't likely
> to be used on terabyte-sized files.
> 
> However maybe there's a way to avoid this weirdness entirely, e.g. by 
> allocating
> a temporary in-memory inode and using its address_space?


This is a good point, I think maybe just do the EFBIG dance for now?  It’s not 
a factor for the disk format, and we can add the separate address space any 
time.  For the common case today, I would prefer avoiding the overhead of 
allocating the temp inode/address_space.

-chris

Reply via email to