On 2019/08/06 9:25, Dave Chinner wrote:
> On Tue, Aug 06, 2019 at 12:05:51AM +0000, Damien Le Moal wrote:
>> On 2019/08/06 7:05, Damien Le Moal wrote:
>>> On 2019/08/06 6:59, Damien Le Moal wrote:
>>>> On 2019/08/06 6:28, Jens Axboe wrote:
>>>>> On 8/5/19 2:27 PM, Damien Le Moal wrote:
>>>>>> On 2019/08/06 6:26, Jens Axboe wrote:
>>>>>>>> In any case, looking again at this code, it looks like there is a
>>>>>>>> problem with dio->size being incremented early, even for fragments
>>>>>>>> that get BLK_QC_T_EAGAIN, because dio->size is being used in
>>>>>>>> blkdev_bio_end_io(). So an incorrect size can be reported to user
>>>>>>>> space in that case on completion (e.g. large asynchronous no-wait dio
>>>>>>>> that cannot be issued in one go).
>>>>>>>>
>>>>>>>> So maybe something like this ? (completely untested)
>>>>>>>
>>>>>>> I think that looks pretty good, I like not double accounting with
>>>>>>> this_size and dio->size, and we retain the old style ordering for the
>>>>>>> ret value.
>>>>>>
>>>>>> Do you want a proper patch with real testing backup ? I can send that
>>>>>> later today.
>>>>>
>>>>> Yeah that'd be great, I like your approach better.
>>>>>
>>>>
>>>> Looking again, I think this is not it yet: dio->size is being referenced
>>>> after
>>>> submit_bio(), so blkdev_bio_end_io() may see the old value if the bio
>>>> completes
>>>> before dio->size increment. So the use-after-free is still there. And since
>>>> blkdev_bio_end_io() processes completion to user space only when dio->ref
>>>> becomes 0, adding an atomic_inc/dec(&dio->ref) over the loop would not
>>>> help and
>>>> does not cover the single BIO case. Any idea how to address this one ?
>>>>
>>>
>>> May be add a bio_get/put() over the 2 places that do submit_bio() would
>>> work,
>>> for all cases (single/multi BIO, sync & async). E.g.:
>>>
>>> + bio_get(bio);
>>> qc = submit_bio(bio);
>>> if (qc == BLK_QC_T_EAGAIN) {
>>> if (!dio->size)
>>> ret = -EAGAIN;
>>> + bio_put(bio);
>>> goto error;
>>> }
>>> dio->size += bio_size;
>>> + bio_put(bio);
>>>
>>> Thoughts ?
>>>
>>
>> That does not work since the reference to dio->size in blkdev_bio_end_io()
>> depends on atomic_dec_and_test(&dio->ref) which counts the BIO fragments for
>> the
>> dio (+1 for async multi-bio case). So completion of the last bio can still
>> reference the old value of dio->size.
>
> Didn't we fix this same use-after-free in iomap_dio_rw() in commit
> 4ea899ead278 ("iomap: fix a use after free in iomap_dio_rw")?
Not so similar in this case with raw block device dio. But I will look more into
it for inspiration. Thank you for the pointer.
>
> Cheers,
>
> Dave.
>
--
Damien Le Moal
Western Digital Research