Re: [PATCH v10 24/25] fuse: Convert from readpages to readahead
On Wed, Mar 25, 2020 at 4:32 PM Matthew Wilcox wrote: > > On Wed, Mar 25, 2020 at 03:43:02PM +0100, Miklos Szeredi wrote: > > > > > > - while ((page = readahead_page(rac))) { > > > - if (fuse_readpages_fill(&data, page) != 0) > > > + nr_pages = min(readahead_count(rac), fc->max_pages); > > > > Missing fc->max_read clamp. > > Yeah, I realised that. I ended up doing ... > > + unsigned int i, max_pages, nr_pages = 0; > ... > + max_pages = min(fc->max_pages, fc->max_read / PAGE_SIZE); > > > > + ia = fuse_io_alloc(NULL, nr_pages); > > > + if (!ia) > > > return; > > > + ap = &ia->ap; > > > + __readahead_batch(rac, ap->pages, nr_pages); > > > > nr_pages = __readahead_batch(...)? > > That's the other bug ... this was designed for btrfs which has a fixed-size > buffer. But you want to dynamically allocate fuse_io_args(), so we need to > figure out the number of pages beforehand, which is a little awkward. I've > settled on this for the moment: > > for (;;) { >struct fuse_io_args *ia; > struct fuse_args_pages *ap; > > nr_pages = readahead_count(rac) - nr_pages; > if (nr_pages > max_pages) > nr_pages = max_pages; > if (nr_pages == 0) > break; > ia = fuse_io_alloc(NULL, nr_pages); > if (!ia) > return; > ap = &ia->ap; > __readahead_batch(rac, ap->pages, nr_pages); > for (i = 0; i < nr_pages; i++) { > fuse_wait_on_page_writeback(inode, > readahead_index(rac) + i); > ap->descs[i].length = PAGE_SIZE; > } > ap->num_pages = nr_pages; > fuse_send_readpages(ia, rac->file); > } > > but I'm not entirely happy with that either. Pondering better options. I think that's fine. Note how the original code possibly over-allocates the the page array, because it doesn't know the batch size beforehand. So this is already better. > > > This will give consecutive pages, right? > > readpages() was already being called with consecutive pages. Several > filesystems had code to cope with the pages being non-consecutive, but > that wasn't how the core code worked; if there was a discontiguity it > would send off the pages that were consecutive and start a new batch. > > __readahead_batch() can't return fewer than nr_pages, so you don't need > to check for that. That's far from obvious. I'd put a WARN_ON at least to make document the fact. Thanks, Miklos
Re: [PATCH v10 24/25] fuse: Convert from readpages to readahead
On Wed, Mar 25, 2020 at 03:43:02PM +0100, Miklos Szeredi wrote: > > > > - while ((page = readahead_page(rac))) { > > - if (fuse_readpages_fill(&data, page) != 0) > > + nr_pages = min(readahead_count(rac), fc->max_pages); > > Missing fc->max_read clamp. Yeah, I realised that. I ended up doing ... + unsigned int i, max_pages, nr_pages = 0; ... + max_pages = min(fc->max_pages, fc->max_read / PAGE_SIZE); > > + ia = fuse_io_alloc(NULL, nr_pages); > > + if (!ia) > > return; > > + ap = &ia->ap; > > + __readahead_batch(rac, ap->pages, nr_pages); > > nr_pages = __readahead_batch(...)? That's the other bug ... this was designed for btrfs which has a fixed-size buffer. But you want to dynamically allocate fuse_io_args(), so we need to figure out the number of pages beforehand, which is a little awkward. I've settled on this for the moment: for (;;) { struct fuse_io_args *ia; struct fuse_args_pages *ap; nr_pages = readahead_count(rac) - nr_pages; if (nr_pages > max_pages) nr_pages = max_pages; if (nr_pages == 0) break; ia = fuse_io_alloc(NULL, nr_pages); if (!ia) return; ap = &ia->ap; __readahead_batch(rac, ap->pages, nr_pages); for (i = 0; i < nr_pages; i++) { fuse_wait_on_page_writeback(inode, readahead_index(rac) + i); ap->descs[i].length = PAGE_SIZE; } ap->num_pages = nr_pages; fuse_send_readpages(ia, rac->file); } but I'm not entirely happy with that either. Pondering better options. > This will give consecutive pages, right? readpages() was already being called with consecutive pages. Several filesystems had code to cope with the pages being non-consecutive, but that wasn't how the core code worked; if there was a discontiguity it would send off the pages that were consecutive and start a new batch. __readahead_batch() can't return fewer than nr_pages, so you don't need to check for that.
Re: [PATCH v10 24/25] fuse: Convert from readpages to readahead
On Wed, Mar 25, 2020 at 1:02 PM Matthew Wilcox wrote: > > On Wed, Mar 25, 2020 at 10:42:56AM +0100, Miklos Szeredi wrote: > > > + while ((page = readahead_page(rac))) { > > > + if (fuse_readpages_fill(&data, page) != 0) > > > > Shouldn't this unlock + put page on error? > > We're certainly inconsistent between the two error exits from > fuse_readpages_fill(). But I think we can simplify the whole thing > ... how does this look to you? Nice, overall. > > - while ((page = readahead_page(rac))) { > - if (fuse_readpages_fill(&data, page) != 0) > + nr_pages = min(readahead_count(rac), fc->max_pages); Missing fc->max_read clamp. > + ia = fuse_io_alloc(NULL, nr_pages); > + if (!ia) > return; > + ap = &ia->ap; > + __readahead_batch(rac, ap->pages, nr_pages); nr_pages = __readahead_batch(...)? This will give consecutive pages, right? Thanks, Miklos
Re: [PATCH v10 24/25] fuse: Convert from readpages to readahead
On Wed, Mar 25, 2020 at 10:42:56AM +0100, Miklos Szeredi wrote: > > + while ((page = readahead_page(rac))) { > > + if (fuse_readpages_fill(&data, page) != 0) > > Shouldn't this unlock + put page on error? We're certainly inconsistent between the two error exits from fuse_readpages_fill(). But I think we can simplify the whole thing ... how does this look to you? diff --git a/fs/fuse/file.c b/fs/fuse/file.c index 5749505bcff6..57ea9a364e62 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -915,76 +915,32 @@ static void fuse_send_readpages(struct fuse_io_args *ia, struct file *file) fuse_readpages_end(fc, &ap->args, err); } -struct fuse_fill_data { - struct fuse_io_args *ia; - struct file *file; - struct inode *inode; - unsigned int nr_pages; - unsigned int max_pages; -}; - -static int fuse_readpages_fill(struct fuse_fill_data *data, struct page *page) -{ - struct fuse_io_args *ia = data->ia; - struct fuse_args_pages *ap = &ia->ap; - struct inode *inode = data->inode; - struct fuse_conn *fc = get_fuse_conn(inode); - - fuse_wait_on_page_writeback(inode, page->index); - - if (ap->num_pages && - (ap->num_pages == fc->max_pages || -(ap->num_pages + 1) * PAGE_SIZE > fc->max_read || -ap->pages[ap->num_pages - 1]->index + 1 != page->index)) { - data->max_pages = min_t(unsigned int, data->nr_pages, - fc->max_pages); - fuse_send_readpages(ia, data->file); - data->ia = ia = fuse_io_alloc(NULL, data->max_pages); - if (!ia) - return -ENOMEM; - ap = &ia->ap; - } - - if (WARN_ON(ap->num_pages >= data->max_pages)) { - unlock_page(page); - fuse_io_free(ia); - return -EIO; - } - - ap->pages[ap->num_pages] = page; - ap->descs[ap->num_pages].length = PAGE_SIZE; - ap->num_pages++; - data->nr_pages--; - return 0; -} - static void fuse_readahead(struct readahead_control *rac) { struct inode *inode = rac->mapping->host; struct fuse_conn *fc = get_fuse_conn(inode); - struct fuse_fill_data data; - struct page *page; if (is_bad_inode(inode)) return; - data.file = rac->file; - data.inode = inode; - data.nr_pages = readahead_count(rac); - data.max_pages = min_t(unsigned int, data.nr_pages, fc->max_pages); - data.ia = fuse_io_alloc(NULL, data.max_pages); - if (!data.ia) - return; + while (readahead_count(rac)) { + struct fuse_io_args *ia; + struct fuse_args_pages *ap; + unsigned int i, nr_pages; - while ((page = readahead_page(rac))) { - if (fuse_readpages_fill(&data, page) != 0) + nr_pages = min(readahead_count(rac), fc->max_pages); + ia = fuse_io_alloc(NULL, nr_pages); + if (!ia) return; + ap = &ia->ap; + __readahead_batch(rac, ap->pages, nr_pages); + for (i = 0; i < nr_pages; i++) { + fuse_wait_on_page_writeback(inode, ap->pages[i]->index); + ap->descs[i].length = PAGE_SIZE; + } + ap->num_pages = nr_pages; + fuse_send_readpages(ia, rac->file); } - - if (data.ia->ap.num_pages) - fuse_send_readpages(data.ia, rac->file); - else - fuse_io_free(data.ia); } static ssize_t fuse_cache_read_iter(struct kiocb *iocb, struct iov_iter *to)
Re: [PATCH v10 24/25] fuse: Convert from readpages to readahead
On Mon, Mar 23, 2020 at 9:23 PM Matthew Wilcox wrote: > > From: "Matthew Wilcox (Oracle)" > > Use the new readahead operation in fuse. Switching away from the > read_cache_pages() helper gets rid of an implicit call to put_page(), > so we can get rid of the get_page() call in fuse_readpages_fill(). > > Signed-off-by: Matthew Wilcox (Oracle) > Reviewed-by: Dave Chinner > Reviewed-by: William Kucharski > --- > fs/fuse/file.c | 46 +++--- > 1 file changed, 19 insertions(+), 27 deletions(-) > > diff --git a/fs/fuse/file.c b/fs/fuse/file.c > index 9d67b830fb7a..5749505bcff6 100644 > --- a/fs/fuse/file.c > +++ b/fs/fuse/file.c > @@ -923,9 +923,8 @@ struct fuse_fill_data { > unsigned int max_pages; > }; > > -static int fuse_readpages_fill(void *_data, struct page *page) > +static int fuse_readpages_fill(struct fuse_fill_data *data, struct page > *page) > { > - struct fuse_fill_data *data = _data; > struct fuse_io_args *ia = data->ia; > struct fuse_args_pages *ap = &ia->ap; > struct inode *inode = data->inode; > @@ -941,10 +940,8 @@ static int fuse_readpages_fill(void *_data, struct page > *page) > fc->max_pages); > fuse_send_readpages(ia, data->file); > data->ia = ia = fuse_io_alloc(NULL, data->max_pages); > - if (!ia) { > - unlock_page(page); > + if (!ia) > return -ENOMEM; > - } > ap = &ia->ap; > } > > @@ -954,7 +951,6 @@ static int fuse_readpages_fill(void *_data, struct page > *page) > return -EIO; > } > > - get_page(page); > ap->pages[ap->num_pages] = page; > ap->descs[ap->num_pages].length = PAGE_SIZE; > ap->num_pages++; > @@ -962,37 +958,33 @@ static int fuse_readpages_fill(void *_data, struct page > *page) > return 0; > } > > -static int fuse_readpages(struct file *file, struct address_space *mapping, > - struct list_head *pages, unsigned nr_pages) > +static void fuse_readahead(struct readahead_control *rac) > { > - struct inode *inode = mapping->host; > + struct inode *inode = rac->mapping->host; > struct fuse_conn *fc = get_fuse_conn(inode); > struct fuse_fill_data data; > - int err; > + struct page *page; > > - err = -EIO; > if (is_bad_inode(inode)) > - goto out; > + return; > > - data.file = file; > + data.file = rac->file; > data.inode = inode; > - data.nr_pages = nr_pages; > - data.max_pages = min_t(unsigned int, nr_pages, fc->max_pages); > -; > + data.nr_pages = readahead_count(rac); > + data.max_pages = min_t(unsigned int, data.nr_pages, fc->max_pages); > data.ia = fuse_io_alloc(NULL, data.max_pages); > - err = -ENOMEM; > if (!data.ia) > - goto out; > + return; > > - err = read_cache_pages(mapping, pages, fuse_readpages_fill, &data); > - if (!err) { > - if (data.ia->ap.num_pages) > - fuse_send_readpages(data.ia, file); > - else > - fuse_io_free(data.ia); > + while ((page = readahead_page(rac))) { > + if (fuse_readpages_fill(&data, page) != 0) Shouldn't this unlock + put page on error? Otherwise looks good. Thanks, Miklos