01.08.2019 15:21, Max Reitz wrote:
> On 01.08.19 14:02, Vladimir Sementsov-Ogievskiy wrote:
>> 01.08.2019 14:37, Max Reitz wrote:
>>> On 01.08.19 13:32, Vladimir Sementsov-Ogievskiy wrote:
>>>> 01.08.2019 14:28, Max Reitz wrote:
>>>>> On 31.07.19 18:01, Vladimir Sementsov-Ogievskiy wrote:
>>>>>> 30.07.2019 21:28, John Snow wrote:
>>>>>>>
>>>>>>> On 7/30/19 12:32 PM, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>>> The write flags are constant, so let's store them in BackupBlockJob
>>>>>>>> instead of recalculating them. This also makes two boolean fields
>>>>>>>> unused, so drop them.
>>>>>>>>
>>>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <[email protected]>
>>>>>>>> ---
>>>>>>>>  block/backup.c | 24 ++++++++++++------------
>>>>>>>>  1 file changed, 12 insertions(+), 12 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/block/backup.c b/block/backup.c
>>>>>>>> index c5f941101a..4651649e9d 100644
>>>>>>>> --- a/block/backup.c
>>>>>>>> +++ b/block/backup.c
>>>>>>>> @@ -47,7 +47,6 @@ typedef struct BackupBlockJob {
>>>>>>>>      uint64_t len;
>>>>>>>>      uint64_t bytes_read;
>>>>>>>>      int64_t cluster_size;
>>>>>>>> -    bool compress;
>>>>>>>>      NotifierWithReturn before_write;
>>>>>>>>      QLIST_HEAD(, CowRequest) inflight_reqs;
>>>>>>>>
>>>>>>>> @@ -55,7 +54,7 @@ typedef struct BackupBlockJob {
>>>>>>>>      bool use_copy_range;
>>>>>>>>      int64_t copy_range_size;
>>>>>>>>
>>>>>>>> -    bool serialize_target_writes;
>>>>>>>> +    BdrvRequestFlags write_flags;
>>>>>>>>  } BackupBlockJob;
>>>>>>>>
>>>>>>>>  static const BlockJobDriver backup_job_driver;
>>>>>>>> @@ -110,10 +109,6 @@ static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
>>>>>>>>      BlockBackend *blk = job->common.blk;
>>>>>>>>      int nbytes;
>>>>>>>>      int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
>>>>>>>> -    int write_flags =
>>>>>>>> -        (job->serialize_target_writes ? BDRV_REQ_SERIALISING : 0) |
>>>>>>>> -        (job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
>>>>>>>> -
>>>>>>>>
>>>>>>>>      assert(QEMU_IS_ALIGNED(start, job->cluster_size));
>>>>>>>>      hbitmap_reset(job->copy_bitmap, start, job->cluster_size);
>>>>>>>> @@ -132,7 +127,7 @@ static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
>>>>>>>>      }
>>>>>>>>
>>>>>>>>      ret = blk_co_pwrite(job->target, start, nbytes, *bounce_buffer,
>>>>>>>> -                        write_flags);
>>>>>>>> +                        job->write_flags);
>>>>>>>>      if (ret < 0) {
>>>>>>>>          trace_backup_do_cow_write_fail(job, start, ret);
>>>>>>>>          if (error_is_read) {
>>>>>>>> @@ -160,7 +155,6 @@ static int coroutine_fn backup_cow_with_offload(BackupBlockJob *job,
>>>>>>>>      BlockBackend *blk = job->common.blk;
>>>>>>>>      int nbytes;
>>>>>>>>      int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
>>>>>>>> -    int write_flags = job->serialize_target_writes ? BDRV_REQ_SERIALISING : 0;
>>>>>>>>
>>>>>>>>      assert(QEMU_IS_ALIGNED(job->copy_range_size, job->cluster_size));
>>>>>>>>      assert(QEMU_IS_ALIGNED(start, job->cluster_size));
>>>>>>>> @@ -168,7 +162,7 @@ static int coroutine_fn backup_cow_with_offload(BackupBlockJob *job,
>>>>>>>>      nr_clusters = DIV_ROUND_UP(nbytes, job->cluster_size);
>>>>>>>>      hbitmap_reset(job->copy_bitmap, start, job->cluster_size * nr_clusters);
>>>>>>>>      ret = blk_co_copy_range(blk, start, job->target, start, nbytes,
>>>>>>>> -                            read_flags, write_flags);
>>>>>>>> +                            read_flags, job->write_flags);
>>>>>>>>      if (ret < 0) {
>>>>>>>>          trace_backup_do_cow_copy_range_fail(job, start, ret);
>>>>>>>>          hbitmap_set(job->copy_bitmap, start, job->cluster_size * nr_clusters);
>>>>>>>> @@ -638,10 +632,16 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>>>>>>>>      job->sync_mode = sync_mode;
>>>>>>>>      job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
>>>>>>>>                         sync_bitmap : NULL;
>>>>>>>> -    job->compress = compress;
>>>>>>>>
>>>>>>>> -    /* Detect image-fleecing (and similar) schemes */
>>>>>>>> -    job->serialize_target_writes = bdrv_chain_contains(target, bs);
>>>>>>>> +    /*
>>>>>>>> +     * Set write flags:
>>>>>>>> +     * 1. Detect image-fleecing (and similar) schemes
>>>>>>>> +     * 2. Handle compression
>>>>>>>> +     */
>>>>>>>> +    job->write_flags =
>>>>>>>> +        (bdrv_chain_contains(target, bs) ? BDRV_REQ_SERIALISING : 0) |
>>>>>>>> +        (compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
>>>>>>>> +
>>>>>>>>      job->cluster_size = cluster_size;
>>>>>>>>      job->copy_bitmap = copy_bitmap;
>>>>>>>>      copy_bitmap = NULL;
>>>>>>>>
>>>>>>>
>>>>>>> What happens if you did pass BDRV_REQ_WRITE_COMPRESSED to
>>>>>>> blk_co_copy_range? Is that rejected somewhere in the stack?
>>>>>>
>>>>>> Hmm, I'm afraid that it will be silently ignored.
>>>>>>
>>>>>> And I have one question related to copy offload too.
>>>>>>
>>>>>> Do we really need to handle max_transfer in the backup code for copy
>>>>>> offload? Is max_transfer really related to it?
>>>>>>
>>>>>> Anyway, bl.max_transfer should be handled in the generic copy_range
>>>>>> code in block/io.c (if it should be handled at all). I'm going to fix
>>>>>> that, but maybe I can just drop this limitation from backup?
>>>>>
>>>>> On a quick glance, it doesn’t look like a limitation to me but actually
>>>>> like the opposite.  backup_cow_with_bounce_buffer() only copies up to
>>>>> cluster_size, whereas backup_cow_with_offload() will copy up to the
>>>>> maximum transfer size permitted by both source and target for copy
>>>>> offloading.
>>>>
>>>> I mean, why not just copy the whole chunk that comes in through the
>>>> notifier and not care about max_transfer?  The backup loop copies
>>>> cluster by cluster anyway, so only the notifier may copy a larger
>>>> chunk.
>>>
>>> One thing that comes to mind is the hbitmap_get() check in
>>> backup_do_cow().  You don’t want to copy everything just because the
>>> first cluster needs to be copied.
>>
>> Hmm, but it seems that we do exactly this, and this is wrong.  But this
>> is a separate thing...
>
> You’re right.  It’s totally broken.  Nice.
>
> The following gets me a nice content mismatch:
>
> $ ./qemu-img create -f qcow2 src.qcow2 2M
> $ ./qemu-io -c 'write -P 42 0 2M' src.qcow2
> $ cp src.qcow2 ref.qcow2
>
> {"execute":"qmp_capabilities"}
> {"return": {}}
> {"execute":"blockdev-add","arguments":
>  {"node-name":"src","driver":"qcow2",
>   "file":{"driver":"file","filename":"src.qcow2"}}}
> {"return": {}}
> {"execute":"drive-backup","arguments":
>  {"device":"src","job-id":"backup","target":"tgt.qcow2",
>   "format":"qcow2","sync":"full","speed":1048576}}
> {"timestamp": {"seconds": 1564661742, "microseconds": 268384},
>  "event": "JOB_STATUS_CHANGE",
>  "data": {"status": "created", "id": "backup"}}
> {"timestamp": {"seconds": 1564661742, "microseconds": 268436},
>  "event": "JOB_STATUS_CHANGE",
>  "data": {"status": "running", "id": "backup"}}
> {"return": {}}
> {"execute":"human-monitor-command",
>  "arguments":{"command-line":
>   "qemu-io src \"write -P 23 1114112 65536\""}}
> {"return": ""}
> {"execute":"human-monitor-command",
>  "arguments":{"command-line":
>   "qemu-io src \"write -P 66 1048576 1M\""}}
> {"return": ""}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278362},
>  "event": "JOB_STATUS_CHANGE",
>  "data": {"status": "waiting", "id": "backup"}}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278534},
>  "event": "JOB_STATUS_CHANGE",
>  "data": {"status": "pending", "id": "backup"}}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278778},
>  "event": "BLOCK_JOB_COMPLETED",
>  "data": {"device": "backup", "len": 2097152, "offset": 2162688,
>   "speed": 1048576, "type": "backup"}}
> {"execute":"quit"}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278884},
>  "event": "JOB_STATUS_CHANGE",
>  "data": {"status": "concluded", "id": "backup"}}
> {"timestamp": {"seconds": 1564661744, "microseconds": 278946},
>  "event": "JOB_STATUS_CHANGE",
>  "data": {"status": "null", "id": "backup"}}
> {"return": {}}
>
> $ ./qemu-img compare ref.qcow2 tgt.qcow2
> Content mismatch at offset 1114112!
>
> Aww maaan.  Setting copy_range to false “fixes” it.  I guess this’ll
> need to be fixed for 4.1. :-/
Oops, yes. When I wrote "this is a separate thing...", I thought it was not
very bad to copy more than needed, but actually we must not copy clusters
that are not dirty in copy_bitmap, as they may already have been rewritten
by the guest.

>
>> About copy_range, I'm just not sure that max_transfer is a true
>> restriction for copy_range. For example, for file_posix, max_transfer
>> comes from some specific ioctl or from sysfs. Is it appropriate as a
>> limitation for copy_file_range?
>
> Who knows, but it’s probably the best approximation we have.

OK. Then I'll try to move the handling of max_transfer for copy_range into
block/io, like it's done for writes.

>
>> Also, Max, could you please take a look at the "[PATCH v3] blockjob:
>> drain all job nodes in block_job_drain" thread? Maybe what John is
>> asking about is obvious to you.
>
> Perhaps after fixing backup. :-/
>
> Max
>

-- 
Best regards,
Vladimir
