On Thu, 2018-07-12 at 09:08 -0600, Jens Axboe wrote:
> On 7/12/18 8:36 AM, Hannes Reinecke wrote:
> > Hi Jens, Christoph,
> > 
> > we're currently hunting down a silent data corruption occurring due
> > to
> > commit 72ecad22d9f1 ("block: support a full bio worth of IO for
> > simplified bdev direct-io").
> > 
> > While the whole thing is still hazy on the details, the one thing
> > we've
> > found is that reverting that patch fixes the data corruption.
> > 
> > And looking closer, I've found this:
> > 
> > static ssize_t
> > blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
> > {
> >     int nr_pages;
> > 
> >     nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES + 1);
> >     if (!nr_pages)
> >             return 0;
> >     if (is_sync_kiocb(iocb) && nr_pages <= BIO_MAX_PAGES)
> >             return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
> > 
> >     return __blkdev_direct_IO(iocb, iter, min(nr_pages,
> > BIO_MAX_PAGES));
> > }
> > 
> > When checking the call path
> > __blkdev_direct_IO()->bio_alloc_bioset()->bvec_alloc()
> > I found that bvec_alloc() will fail if nr_pages > BIO_MAX_PAGES.
> > 
> > So why is there the check for 'nr_pages <= BIO_MAX_PAGES' ?
> > It's not that we can handle it in __blkdev_direct_IO() ...
> 
> The logic could be cleaned up like below, the sync part is really all
> we care about. What is the test case for this? async or sync?

It's sync, and the corruption is in the __blkdev_direct_IO_simple()
path. It starts to occur with your "block: support a full bio worth of
IO for simplified bdev direct-io" patch, which causes the "simple" path
to be taken for larger IO sizes.

Martin



> 
> I also don't remember why it's BIO_MAX_PAGES + 1...
> 
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 0dd87aaeb39a..14ef3d71b55f 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -424,13 +424,13 @@ blkdev_direct_IO(struct kiocb *iocb, struct
> iov_iter *iter)
>  {
>       int nr_pages;
>  
> -     nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES + 1);
> +     nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES);
>       if (!nr_pages)
>               return 0;
> -     if (is_sync_kiocb(iocb) && nr_pages <= BIO_MAX_PAGES)
> +     if (is_sync_kiocb(iocb))
>               return __blkdev_direct_IO_simple(iocb, iter,
> nr_pages);
>  
> -     return __blkdev_direct_IO(iocb, iter, min(nr_pages,
> BIO_MAX_PAGES));
> +     return __blkdev_direct_IO(iocb, iter, nr_pages);
>  }
>  
>  static __init int blkdev_init(void)
> 

-- 
Dr. Martin Wilck <[email protected]>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

Reply via email to