On 14.06.19 11:22, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 19:03, Max Reitz wrote:
>> [re-adding the original CCs, why not]
>>
>> On 13.06.19 16:30, Vladimir Sementsov-Ogievskiy wrote:
>>> 13.06.2019 17:21, Max Reitz wrote:
>>>> On 13.06.19 16:19, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 13.06.2019 1:08, Max Reitz wrote:
>>>>>> If the main loop cancels all block jobs while the block layer is not
>>>>>> drained, this cancelling may not happen instantaneously.  We can start
>>>>>> a drained section before vm_shutdown(), which entails another
>>>>>> bdrv_drain_all(); this nested bdrv_drain_all() will thus be a no-op,
>>>>>> basically.
>>>>>>
>>>>>> We do not have to end the drained section, because we actually do not
>>>>>> want any requests to happen from this point on.
>>>>>>
>>>>>> Signed-off-by: Max Reitz <[email protected]>
>>>>>> ---
>>>>>> I don't know whether it actually makes sense to never end this drained
>>>>>> section.  It makes sense to me.  Please correct me if I'm wrong.
>>>>>> ---
>>>>>>  vl.c | 11 +++++++++++
>>>>>>  1 file changed, 11 insertions(+)
>>>>>>
>>>>>> diff --git a/vl.c b/vl.c
>>>>>> index cd1fbc4cdc..3f8b3f74f5 100644
>>>>>> --- a/vl.c
>>>>>> +++ b/vl.c
>>>>>> @@ -4538,6 +4538,17 @@ int main(int argc, char **argv, char **envp)
>>>>>>       */
>>>>>>      migration_shutdown();
>>>>>>
>>>>>> +    /*
>>>>>> +     * We must cancel all block jobs while the block layer is drained,
>>>>>> +     * or cancelling will be affected by throttling and thus may block
>>>>>> +     * for an extended period of time.
>>>>>> +     * vm_shutdown() will bdrv_drain_all(), so we may as well include
>>>>>> +     * it in the drained section.
>>>>>> +     * We do not need to end this section, because we do not want any
>>>>>> +     * requests happening from here on anyway.
>>>>>> +     */
>>>>>> +    bdrv_drain_all_begin();
>>>>>> +
>>>>>>      /* No more vcpu or device emulation activity beyond this point */
>>>>>>      vm_shutdown();
>>>>>>
>>>>>>
>>>>>
>>>>> So, actually, the problem is that we may wait for job requests twice:
>>>>> on drain and then on cancel.
>>>>
>>>> We don’t wait on drain.  When the throttle node is drained, it will
>>>> ignore throttling (as noted in the cover letter).
>>>>
>>>> We do wait when cancelling a job while the throttle node isn’t drained,
>>>> though.  That’s the problem.
>>>
>>> Ah, understand now.
>>>
>>> Is it safe to drain_begin before stopping cpus?  We may then finish up
>>> with some I/O requests still queued somewhere..
>>
>> Hm...  Aren’t guest devices prohibited from issuing requests to the
>> block layer while their respective block device is drained?
>
> It's at least a buggy place; I remember Denis Plotnikov sent a patch to
> fix it and had a huge discussion with Kevin.
> And here it is:
> https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg00732.html

Ah, I even have that in my inbox...

The latest reply I see came in April, where Kevin asked for an RFC patch
in the current state:
https://lists.nongnu.org/archive/html/qemu-block/2019-04/msg00243.html

I’m not sure whether I should work around a potential bug here, when we
haven’t even agreed that it is a bug, and when it isn’t clear whether
this place would actually be affected.

Max
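
For reference, here is a compact sketch of the shutdown ordering discussed
above, using only the function names that appear in the quoted diff
(an illustrative recap of the intent, not a verbatim excerpt from vl.c):

    migration_shutdown();

    /*
     * Start a global drained section first.  While a throttle node is
     * drained it ignores its limits, so the block job cancellation done
     * later during shutdown cannot block on throttled requests.
     */
    bdrv_drain_all_begin();

    /*
     * vm_shutdown() calls bdrv_drain_all() itself; inside the drained
     * section that nested drain is basically a no-op.
     * No more vcpu or device emulation activity beyond this point.
     */
    vm_shutdown();

    /*
     * The drained section is deliberately never ended: no further block
     * layer requests are wanted for the rest of the shutdown path.
     */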
