Re: [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup

Max Reitz Tue, 19 Nov 2019 01:37:05 -0800

On 13.11.19 14:24, Sergio Lopez wrote:
> 
> Sergio Lopez <s...@redhat.com> writes:
> 
>> no-re...@patchew.org writes:
>>
>>> Patchew URL: https://patchew.org/QEMU/20191112113012.71136-1-...@redhat.com/
>>>
>>>
>>>
>>> Hi,
>>>
>>> This series failed the docker-quick@centos7 build test. Please find the 
>>> testing commands and
>>> their output below. If you have Docker installed, you can probably 
>>> reproduce it
>>> locally.
>>>
>>> === TEST SCRIPT BEGIN ===
>>> #!/bin/bash
>>> make docker-image-centos7 V=1 NETWORK=1
>>> time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
>>> === TEST SCRIPT END ===
>>>
>>>   TEST    iotest-qcow2: 268
>>> Failures: 141
>>
>> Hm... 141 didn't fail in my test machine. I'm going to have a look.
> 
> So here's the output:
> 
> --- /root/qemu/tests/qemu-iotests/141.out     2019-11-12 04:43:27.651557587 
> -0500
> +++ /root/qemu/build/tests/qemu-iotests/141.out.bad   2019-11-13 
> 08:12:06.575967337 -0500
> @@ -10,6 +10,8 @@
>  Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 
> backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "job0"}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
>  {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: node is 
> used as backing hd of 'NODE_NAME'"}}
>  {"return": {}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
> "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "job0"}}
> 
> Those extra lines, the "paused" and "running", are a result of the job
> being done in a transaction, within a drained section.
> 
> We can update 141.out, but now I'm wondering, was it safe creating the
> job at do_drive_backup() outside of a drained section, as
> qmp_drive_backup was doing?


I think it is.  Someone needs to drain the source node before attaching
the job filter (which intercepts writes), and bdrv_backup_top_append()
does precisely this.

If the source node is in an I/O thread, you could argue that the drain
starts later than when the user has invoked the backup command, and so
some writes might slip through.  That’s correct.  But at the same time,
it’s impossible to drain it the instant the command is received.  So
some writes might always slip through (and the drain will not stop them
either, it will just let them happen).

Therefore, I think it’s fine the way it is.

> Do you think there may be any potential drawbacks as a result of always
> doing it now inside a drained section?

Well, one drawback is clearly visible.  The job goes to paused for no
reason.

Max

signature.asc
Description: OpenPGP digital signature

Re: [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup

Reply via email to