On 17.06.19 17:53, Kevin Wolf wrote:
> Am 17.06.2019 um 16:56 hat Max Reitz geschrieben:
>> On 17.06.19 12:36, Vladimir Sementsov-Ogievskiy wrote:
>>> 14.06.2019 23:03, Max Reitz wrote:
>>>> On 14.06.19 18:22, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 14.06.2019 15:57, Max Reitz wrote:
>>>>>> On 14.06.19 11:04, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>> 13.06.2019 18:57, Max Reitz wrote:
>>>>>>>> On 29.05.19 17:46, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>>>> Backup-top filter does copy-before-write operation. It should be
>>>>>>>>> inserted above active disk and has a target node for CBW, like the
>>>>>>>>> following:
>>>>>>>>>
>>>>>>>>>     +-------+
>>>>>>>>>     | Guest |
>>>>>>>>>     +-------+
>>>>>>>>>         |r,w
>>>>>>>>>         v
>>>>>>>>>     +------------+  target   +---------------+
>>>>>>>>>     | backup_top |---------->| target(qcow2) |
>>>>>>>>>     +------------+    CBW    +---------------+
>>>>>>>>>         |
>>>>>>>>> backing |r,w
>>>>>>>>>         v
>>>>>>>>>     +-------------+
>>>>>>>>>     | Active disk |
>>>>>>>>>     +-------------+
>>>>>>>>>
>>>>>>>>> The driver will be used in backup instead of write-notifiers.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <[email protected]>
>>>>>>>>> ---
>>>>>>>>>  block/backup-top.h  |  64 +++++++++
>>>>>>>>>  block/backup-top.c  | 322 ++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>  block/Makefile.objs |   2 +
>>>>>>>>>  3 files changed, 388 insertions(+)
>>>>>>>>>  create mode 100644 block/backup-top.h
>>>>>>>>>  create mode 100644 block/backup-top.c
>>>>>>>>>
>>>>>>>>> diff --git a/block/backup-top.h b/block/backup-top.h
>>>>>>>>> new file mode 100644
>>>>>>>>> index 0000000000..788e18c358
>>>>>>>>> --- /dev/null
>>>>>>>>> +++ b/block/backup-top.h
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>>>> +/*
>>>>>>>>> + * bdrv_backup_top_append
>>>>>>>>> + *
>>>>>>>>> + * Append backup_top filter node above @source node. @target node
>>>>>>>>> + * will receive the data backed up during CBW operations.
>>>>>>>>> + * New filter together with @target node are attached to @source
>>>>>>>>> + * aio context.
>>>>>>>>> + *
>>>>>>>>> + * The resulting filter node is implicit.
>>>>>>>>
>>>>>>>> Why? It’s just as easy for the caller to just make it implicit if it
>>>>>>>> should be. (And usually the caller should decide that.)
>>>>>>>
>>>>>>> Personally, I don't know what the reasons are for filters to be
>>>>>>> implicit or not, I just made it like the other job filters.. I can
>>>>>>> move making it implicit to the caller, or drop it altogether (if
>>>>>>> that works).
>>>>>>
>>>>>> Nodes are implicit if they haven’t been added consciously by the user.
>>>>>> A node added by a block job can be non-implicit, too, as mirror shows;
>>>>>> if the user specifies the filter-node-name option, they will know
>>>>>> about the node, thus it is no longer implicit.
>>>>>>
>>>>>> If the user doesn’t know about the node (they didn’t give the
>>>>>> filter-node-name option), the node is implicit.
>>>>>>
>>>>>
>>>>> Ok, I understand it. But it doesn't show why it should be implicit?
>>>>> Isn't it less buggy to make all filters explicit? We don't hide
>>>>> implicit nodes from query-named-block-nodes (the only interface to
>>>>> explore the whole graph for now) anyway. And we can't absolutely hide
>>>>> the side effects of an additional node in the graph.
>>>>
>>>> Well, we try, at least. At least we hide them from query-block.
>>>>
>>>>> So, is there any real benefit of supporting the separation into
>>>>> implicit and explicit filters? It seems to me that it only
>>>>> complicates things... In other words, what will break if we make all
>>>>> filters explicit?
>>>>
>>>> The theory is that qemu may decide to add nodes at any point, but at
>>>> least when managing chains etc., they may not be visible to the user.
>>>> I don’t think we can get rid of them so easily.
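[Editorial note: the copy-before-write operation the diagram in the patch describes can be illustrated with a standalone sketch. This is plain C with invented names and byte-granularity "disks" for brevity; it is not QEMU code, which works on clusters and tracks progress with a dirty bitmap.]

```c
/* Minimal sketch of copy-before-write (CBW): before a guest write is
 * allowed through to the source, the old data of the affected region
 * is copied to the target, so the target ends up with a point-in-time
 * snapshot of the source as of the moment the backup started. */
#include <stdbool.h>
#include <string.h>

#define DISK_SIZE 8

static unsigned char source[DISK_SIZE];  /* active disk */
static unsigned char target[DISK_SIZE];  /* backup target */
static bool copied[DISK_SIZE];           /* already backed up? */

/* Guest writes are routed through this function (the "filter"). */
static void cbw_write(int off, const unsigned char *buf, int len)
{
    for (int i = off; i < off + len; i++) {
        if (!copied[i]) {           /* back up old data exactly once */
            target[i] = source[i];
            copied[i] = true;
        }
    }
    memcpy(source + off, buf, len); /* then let the write through */
}

/* The background job copies everything the guest never touched. */
static void backup_finish(void)
{
    for (int i = 0; i < DISK_SIZE; i++) {
        if (!copied[i]) {
            target[i] = source[i];
            copied[i] = true;
        }
    }
}
```

The point of the ordering is that the guest write can never destroy data that has not yet reached the target.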
>
> I think the important point to understand here is that implicit filters
> are a concept to maintain compatibility with legacy (pre-blockdev)
> management tools which simply don't manage stuff on the node level. So
> changing the structure of the graph is not a problem for them and we
> just need to make sure that when they do something with a BlockBackend,
> we perform the action on the node that they actually mean.
>
> Modern management tools are expected to manage the graph on a node level
> and to assign node names to everything.
>
> Note that libvirt is close to becoming a modern management tool, so it's
> probably a good idea to now make all block jobs add filters where we'll
> need them in the long run.
>
>>>> One example that isn’t implemented yet is copy-on-read. In theory,
>>>> specifying copy-on-read=on for -drive should create an implicit COR
>>>> node on top. The user shouldn’t see that node when inspecting the
>>>> drive or when issuing block jobs on it, etc. And this node has to
>>>> stay there when the user does e.g. an active commit somewhere down
>>>> the chain.
>>>
>>> Why not instead just document that yes, a filter is created when you
>>> enable copy-on-read?
>>
>> Because the problem is that existing users are not aware of that.
>>
>> If they were, they could simply create a COR filter already.
>>
>> I suppose we could interpret the deprecation policy in a way that we
>> could change the behavior of -drive copy-on-read=on, but as I already
>> said, what’s the point of supporting -drive copy-on-read=on when you
>> can simply explicitly create a COR filter on top?
>
> I actually changed my opinion on how to do this since we introduced
> implicit filters. I now think that the right way to move forward is to
> make sure that the copy-on-read filter can do everything it needs to do,
> and then just completely deprecate -drive copy-on-read=on instead of
> adding compatibility magic to turn it into an implicit copy-on-read
> filter internally.
Sure, so your answer to “what’s the point of supporting -drive
copy-on-read=on” is “there is none”. Fair enough. :-)

>>>> That sounds like a horrible ordeal to implement, so it hasn’t been
>>>> done yet. Maybe it never will be. It isn’t that bad for the job
>>>> filters, because they generally freeze the block graph, so there is
>>>> no problem with potential modifications.
>>>>
>>>> All in all I do think having implicit nodes makes sense. Maybe not so
>>>> much now, but in the future (if someone implements converting -drive
>>>> COR and throttle options to implicit nodes...).
>>>
>>> But do we at least have a strong definition of how implicit nodes
>>> should behave under any graph-modifying operations around them?
>>
>> No. Depends on the node.
>>
>>> Should new implicit/explicit filters be created above or under them?
>>
>> That was always the most difficult question we had when we introduced
>> filters.
>>
>> The problem is that we never answered it in our code base.
>>
>> One day, we just added filters (“let’s see what happens”), and in the
>> beginning, they were all explicit, so we still didn’t have to answer
>> this question (in code). Then job filters were added, and we still
>> didn’t have to, because they set blockers so the graph couldn’t be
>> reorganized with them in place anyway.
>>
>> If we converted copy-on-read=on to a COR node, we would have to answer
>> that question.
>
> That's why we shouldn't do that, but just remove copy-on-read=on. :-)

And BB-level throttling, fair enough. (Although the first step would
probably be to make throttle groups non-experimental? Like, drop the x-
prefix for all their parameters.)

>>>>> But should the filter really do that job, or is it here only to do
>>>>> CBW? Maybe the target must have another parent which will take care
>>>>> of flushing.
>>>>>
>>>>> Ok, I think I'll flush the target here too for safety, and leave a
>>>>> comment that something smarter would be better.
>>>>> (or, if we decide to flush all children by default in generic code,
>>>>> I'll drop this handler)
>>>>
>>>> [1] Thinking more about this, for normal backups the target file does
>>>> not reflect a valid disk state until the backup is done; just like
>>>> for qemu-img convert. Just like qemu-img convert, there is therefore
>>>> no reason to flush the target until the job is done.
>
> Depends, the target could have the source as its backing file (like
> fleecing, but without exporting it right now), and then you could always
> have a consistent view on the target. Well, almost.
>
> Almost, because to guarantee this, we'd have to flush between each CBW
> and the corresponding write to the same block, to make sure that the old
> data is backed up before it is overwritten.

Yes, that’s what I meant by “enforce on the host that the target is
always flushed before the source”. Well, I meant to say there is no good
way of doing that, and I kind of don’t consider this a good way.

> Of course, this would perform about as badly as internal COW in qcow2...
> So probably not a guarantee we should be making by default. But it might
> make sense as an option.

I don’t know. “Use this option so your data isn’t lost in case of a
power failure, but it makes everything pretty slow”? Who is going to
explicitly enable that (before their first data loss)?

Max
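[Editorial note: the per-block ordering discussed above — flush the target between each CBW and the corresponding guest write — can be sketched with plain POSIX file I/O. The function name, the fixed buffer, and the minimal error handling are invented for the illustration; real code would operate on clusters and batch work.]

```c
#include <fcntl.h>
#include <unistd.h>

/* Sketch: make the copy-before-write durable on the target before the
 * guest write may touch the source. A crash at any point then leaves
 * the target with a consistent copy of the old data. An fsync() per
 * guest write is what makes this about as slow as qcow2-internal COW. */
static int guest_write(int src_fd, int tgt_fd, off_t off,
                       const void *buf, size_t len)
{
    char old[4096];

    if (len > sizeof(old)) {
        return -1;                          /* keep the sketch simple */
    }

    /* 1. Copy-before-write: read the old data from the source... */
    if (pread(src_fd, old, len, off) != (ssize_t)len) {
        return -1;
    }
    /* ...and save it to the backup target. */
    if (pwrite(tgt_fd, old, len, off) != (ssize_t)len) {
        return -1;
    }

    /* 2. Flush, so the old data reaches stable storage first. */
    if (fsync(tgt_fd) != 0) {
        return -1;
    }

    /* 3. Only now let the guest write through to the source. */
    if (pwrite(src_fd, buf, len, off) != (ssize_t)len) {
        return -1;
    }
    return 0;
}
```

Dropping step 2 (or moving it to job completion) is exactly the cheaper default discussed above: the target is then only guaranteed consistent once the whole job has finished and been flushed.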