On 7/12/18 10:14 AM, Hannes Reinecke wrote:
> On 07/12/2018 05:08 PM, Jens Axboe wrote:
>> On 7/12/18 8:36 AM, Hannes Reinecke wrote:
>>> Hi Jens, Christoph,
>>>
>>> we're currently hunting down a silent data corruption occurring due to
>>> commit 72ecad22d9f1 ("block: support a full bio worth of IO for
>>> simplified bdev direct-io").
>>>
>>> While the whole thing is still hazy on the details, the one thing we've
>>> found is that reverting that patch fixes the data corruption.
>>>
>>> And looking closer, I've found this:
>>>
>>> static ssize_t
>>> blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>>> {
>>>     int nr_pages;
>>>
>>>     nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES + 1);
>>>     if (!nr_pages)
>>>             return 0;
>>>     if (is_sync_kiocb(iocb) && nr_pages <= BIO_MAX_PAGES)
>>>             return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
>>>
>>>     return __blkdev_direct_IO(iocb, iter, min(nr_pages, BIO_MAX_PAGES));
>>> }
>>>
>>> When checking the call path
>>> __blkdev_direct_IO()->bio_alloc_bioset()->bvec_alloc()
>>> I found that bvec_alloc() will fail if nr_pages > BIO_MAX_PAGES.
>>>
>>> So why is there the check for 'nr_pages <= BIO_MAX_PAGES' ?
>>> It's not that we can handle it in __blkdev_direct_IO() ...
>>
>> The logic could be cleaned up like below, the sync part is really all
>> we care about. What is the test case for this? async or sync?
>>
>> I also don't remember why it's BIO_MAX_PAGES + 1...
>>
>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>> index 0dd87aaeb39a..14ef3d71b55f 100644
>> --- a/fs/block_dev.c
>> +++ b/fs/block_dev.c
>> @@ -424,13 +424,13 @@ blkdev_direct_IO(struct kiocb *iocb, struct iov_iter 
>> *iter)
>>   {
>>      int nr_pages;
>>   
>> -    nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES + 1);
>> +    nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES);
>>      if (!nr_pages)
>>              return 0;
>> -    if (is_sync_kiocb(iocb) && nr_pages <= BIO_MAX_PAGES)
>> +    if (is_sync_kiocb(iocb))
>>              return __blkdev_direct_IO_simple(iocb, iter, nr_pages);
>>   
>> -    return __blkdev_direct_IO(iocb, iter, min(nr_pages, BIO_MAX_PAGES));
>> +    return __blkdev_direct_IO(iocb, iter, nr_pages);
>>   }
>>   
>>   static __init int blkdev_init(void)
>>
> Hmm. We'll give it a go, but somehow I feel this won't solve our problem.

It probably won't, the only joker here is the BIO_MAX_PAGES + 1. But it
does simplify that part...

> Another question (which probably shows my complete ignorance):
> What happens if the iter holds >= BIO_MAX_PAGES?
> 'iov_iter_npages' will only ever return BIO_MAX_PAGES, independent on 
> whether the iter contains more pages.
> What happens to those left-over pages?
> Will they ever be processed?

Short read or write, we rely on being called again for the remainder.

-- 
Jens Axboe

Reply via email to