On 08/09/2017 08:17 PM, Shaohua Li wrote:
> On Wed, Aug 09, 2017 at 05:16:23PM -0500, Goldwyn Rodrigues wrote:
>>
>>
>> On 08/09/2017 03:21 PM, Shaohua Li wrote:
>>> On Wed, Aug 09, 2017 at 10:35:39AM -0500, Goldwyn Rodrigues wrote:
>>>>
>>>>
>>>> On 08/09/2017 10:02 AM, Shaohua Li wrote:
>>>>> On Wed, Aug 09, 2017 at 06:44:55AM -0500, Goldwyn Rodrigues wrote:
>>>>>>
>>>>>>
>>>>>> On 08/08/2017 03:32 PM, Shaohua Li wrote:
>>>>>>> On Wed, Jul 26, 2017 at 06:57:58PM -0500, Goldwyn Rodrigues wrote:
>>>>>>>> From: Goldwyn Rodrigues <rgold...@suse.com>
>>>>>>>>
>>>>>>>> Nowait is a feature of direct AIO, where users can request
>>>>>>>> to return immediately if the I/O is going to block. This translates
>>>>>>>> to REQ_NOWAIT in bio.bi_opf flags. While request based devices
>>>>>>>> don't wait, stacked devices such as md/dm will.
>>>>>>>>
>>>>>>>> In order to explicitly mark stacked devices as supported, we
>>>>>>>> set the QUEUE_FLAG_NOWAIT in the queue_flags and return -EAGAIN
>>>>>>>> whenever the device would block.
>>>>>>>
>>>>>>> probably you should route this patch to Jens first, DM/MD are different 
>>>>>>> trees.
>>>>>>
>>>>>> Yes, I have sent it to linux-block as well, and he has commented as well.
>>>>>>
>>>>>>
>>>>>>>  
>>>>>>>> Signed-off-by: Goldwyn Rodrigues <rgold...@suse.com>
>>>>>>>> ---
>>>>>>>>  block/blk-core.c       | 3 ++-
>>>>>>>>  include/linux/blkdev.h | 2 ++
>>>>>>>>  2 files changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>>>>>> index 970b9c9638c5..1c9a981d88e5 100644
>>>>>>>> --- a/block/blk-core.c
>>>>>>>> +++ b/block/blk-core.c
>>>>>>>> @@ -2025,7 +2025,8 @@ generic_make_request_checks(struct bio *bio)
>>>>>>>>         * if queue is not a request based queue.
>>>>>>>>         */
>>>>>>>>  
>>>>>>>> -      if ((bio->bi_opf & REQ_NOWAIT) && !queue_is_rq_based(q))
>>>>>>>> +      if ((bio->bi_opf & REQ_NOWAIT) && !queue_is_rq_based(q) &&
>>>>>>>> +          !blk_queue_supports_nowait(q))
>>>>>>>>                goto not_supported;
>>>>>>>>  
>>>>>>>>        part = bio->bi_bdev->bd_part;
>>>>>>>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>>>>>>>> index 25f6a0cb27d3..fae021ebec1b 100644
>>>>>>>> --- a/include/linux/blkdev.h
>>>>>>>> +++ b/include/linux/blkdev.h
>>>>>>>> @@ -633,6 +633,7 @@ struct request_queue {
>>>>>>>>  #define QUEUE_FLAG_REGISTERED  29     /* queue has been registered to 
>>>>>>>> a disk */
>>>>>>>>  #define QUEUE_FLAG_SCSI_PASSTHROUGH 30        /* queue supports SCSI 
>>>>>>>> commands */
>>>>>>>>  #define QUEUE_FLAG_QUIESCED    31     /* queue has been quiesced */
>>>>>>>> +#define QUEUE_FLAG_NOWAIT      32     /* stack device driver supports 
>>>>>>>> REQ_NOWAIT */
>>>>>>>>  
>>>>>>>>  #define QUEUE_FLAG_DEFAULT    ((1 << QUEUE_FLAG_IO_STAT) |            
>>>>>>>> \
>>>>>>>>                                 (1 << QUEUE_FLAG_STACKABLE)    |       
>>>>>>>> \
>>>>>>>> @@ -732,6 +733,7 @@ static inline void queue_flag_clear(unsigned int 
>>>>>>>> flag, struct request_queue *q)
>>>>>>>>  #define blk_queue_dax(q)      test_bit(QUEUE_FLAG_DAX, 
>>>>>>>> &(q)->queue_flags)
>>>>>>>>  #define blk_queue_scsi_passthrough(q) \
>>>>>>>>        test_bit(QUEUE_FLAG_SCSI_PASSTHROUGH, &(q)->queue_flags)
>>>>>>>> +#define blk_queue_supports_nowait(q)  test_bit(QUEUE_FLAG_NOWAIT, 
>>>>>>>> &(q)->queue_flags)
>>>>>>>
>>>>>>> Should this bit consider under layer disks? For example, one raid array 
>>>>>>> disk
>>>>>>> doesn't support NOWAIT, shouldn't we disable NOWAIT for the array?
>>>>>>
>>>>>> Yes, it should. I will add a check before setting the flag. Thanks.
>>>>>> Request-based devices don't wait. So, they would not have this flag set.
>>>>>> It is only the bio-based, with the  make_request_fn hook which need this.
>>>>>>
>>>>>>>  
>>>>>>> I have another generic question. If a bio is splitted into 2 bios, one 
>>>>>>> bio
>>>>>>> doesn't need to wait but the other need to wait. We will return -EAGAIN 
>>>>>>> for the
>>>>>>> second bio, so the whole bio will return -EAGAIN, but the first bio is 
>>>>>>> already
>>>>>>> dispatched to disk. Is this correct behavior?
>>>>>>>
>>>>>>
>>>>>> No, from a multi-device point of view, this is inconsistent. I have
>>>>>> tried the request bio returns -EAGAIN before the split, but I shall
>>>>>> check again. Where do you see this happening?
>>>>>
>>>>> No, this isn't multi-device specific, any driver can do it. Please see 
>>>>> blk_queue_split.
>>>>>
>>>>
>>>> In that case, the bio end_io function is chained and the bio of the
>>>> split will replicate the error to the parent (if not already set).
>>>
>>> this doesn't answer my question. So if a bio returns -EAGAIN, part of the 
>>> bio
>>> probably already dispatched to disk (if the bio is splitted to 2 bios, one
>>> returns -EAGAIN, the other one doesn't block and dispatch to disk), what 
>>> will
>>> application be going to do? I think this is different to other IO errors. 
>>> FOr
>>> other IO errors, application will handle the error, while we ask app to 
>>> retry
>>> the whole bio here and app doesn't know part of bio is already written to 
>>> disk.
>>
>> It is the same as for other I/O errors as well, such as EIO. You do not
>> know which bio of all submitted bio's returned the error EIO. The
>> application would and should consider the whole I/O as failed.
>>
>> The user application does not know of bios, or how it is going to be
>> split in the underlying layers. It knows at the system call level. In
>> this case, the EAGAIN will be returned to the user for the whole I/O not
>> as a part of the I/O. It is up to application to try the I/O again with
>> or without RWF_NOWAIT set. In direct I/O, it is bubbled out using
>> dio->io_error. You can read about it at the patch header for the initial
>> patchset at [1].
>>
>> Use case: It is for applications having two threads, a compute thread
>> and an I/O thread. It would try to push AIO as much as possible in the
>> compute thread using RWF_NOWAIT, and if it fails, would pass it on to
>> I/O thread which would perform without RWF_NOWAIT. End result if done
>> right is you save on context switches and all the
>> synchronization/messaging machinery to perform I/O.
>>
>> [1] http://marc.info/?l=linux-block&m=149789003305876&w=2
> 
> Yes, I knew the concept, but I didn't see previous patches mentioned the
> -EAGAIN actually should be taken as a real IO error. This means a lot to
> applications and make the API hard to use. I'm wondering if we should disable
> bio split for NOWAIT bio, which will make the -EAGAIN only mean 'try again'.
> 

Don't take it as EAGAIN, but read it as EWOULDBLOCK. Why do you say the
API is hard to use? Do you have a case to back it up?

No, not splitting the bio does not make sense here. I do not see any
advantage in it, unless you can present a case otherwise.


-- 
Goldwyn

Reply via email to