On 02/26/2015 04:44 PM, Fam Zheng wrote:
> On Thu, 02/26 14:38, Wen Congyang wrote:
>> On 02/25/2015 10:46 AM, Fam Zheng wrote:
>>> On Tue, 02/24 15:50, Wen Congyang wrote:
>>>> On 02/12/2015 04:44 PM, Fam Zheng wrote:
>>>>> On Thu, 02/12 15:40, Wen Congyang wrote:
>>>>>> On 02/12/2015 03:21 PM, Fam Zheng wrote:
>>>>>>> Hi Congyang,
>>>>>>>
>>>>>>> On Thu, 02/12 11:07, Wen Congyang wrote:
>>>>>>>> +== Workflow ==
>>>>>>>> +The following is the image of block replication workflow:
>>>>>>>> +
>>>>>>>> +       +----------------------+             +------------------------+
>>>>>>>> +       |Primary Write Requests|             |Secondary Write Requests|
>>>>>>>> +       +----------------------+             +------------------------+
>>>>>>>> +                  |                                     |
>>>>>>>> +                  |                                    (4)
>>>>>>>> +                  |                                     V
>>>>>>>> +                  |                              /-------------\
>>>>>>>> +                  |      Copy and Forward        |             |
>>>>>>>> +                  |---------(1)----------+       | Disk Buffer |
>>>>>>>> +                  |                      |       |             |
>>>>>>>> +                  |                     (3)      \-------------/
>>>>>>>> +                  |                 speculative      ^
>>>>>>>> +                  |                write through    (2)
>>>>>>>> +                  |                      |           |
>>>>>>>> +                  V                      V           |
>>>>>>>> +           +--------------+           +----------------+
>>>>>>>> +           | Primary Disk |           | Secondary Disk |
>>>>>>>> +           +--------------+           +----------------+
>>>>>>>> +
>>>>>>>> +    1) Primary write requests will be copied and forwarded to Secondary
>>>>>>>> +       QEMU.
>>>>>>>> +    2) Before Primary write requests are written to Secondary disk, the
>>>>>>>> +       original sector content will be read from Secondary disk and
>>>>>>>> +       buffered in the Disk buffer, but it will not overwrite the existing
>>>>>>>> +       sector content in the Disk buffer.
>>>>>>>
>>>>>>> I'm a little confused by the tenses ("will be" versus "are") and terms. I am
>>>>>>> reading them as "s/will be/are/g"
>>>>>>>
>>>>>>> Why do you need this buffer?
>>>>>>
>>>>>> We only sync the disk till next checkpoint. Before next checkpoint, secondary
>>>>>> vm write to the buffer.
>>>>>>
>>>>>>>
>>>>>>> If both primary and secondary write to the same sector, what is saved in the
>>>>>>> buffer?
>>>>>>
>>>>>> The primary content will be written to the secondary disk, and the secondary content
>>>>>> is saved in the buffer.
>>>>>
>>>>> I wonder if alternatively this is possible with an imaginary "writable backing
>>>>> image" feature, as described below.
>>>>>
>>>>> When we have a normal backing chain,
>>>>>
>>>>>               {virtio-blk dev 'foo'}
>>>>>                          |
>>>>>                          |
>>>>>                          |
>>>>>     [base] <- [mid] <- (foo)
>>>>>
>>>>> Where [base] and [mid] are read only, (foo) is writable. When we add an overlay
>>>>> to an existing image on top,
>>>>>
>>>>>               {virtio-blk dev 'foo'}        {virtio-blk dev 'bar'}
>>>>>                          |                             |
>>>>>                          |                             |
>>>>>                          |                             |
>>>>>     [base] <- [mid] <- (foo) <---------------------- (bar)
>>>>>
>>>>> It's important to make sure that writes to 'foo' doesn't break data for 'bar'.
>>>>> We can utilize an automatic hidden drive-backup target:
>>>>>
>>>>>               {virtio-blk dev 'foo'}                                    {virtio-blk dev 'bar'}
>>>>>                          |                                                         |
>>>>>                          |                                                         |
>>>>>                          v                                                         v
>>>>>
>>>>>     [base] <- [mid] <- (foo) <----------------- (hidden target) <--------------- (bar)
>>>>>
>>>>>                          v                              ^
>>>>>                          v                              ^
>>>>>                          v                              ^
>>>>>                          v                              ^
>>>>>                          >>>> drive-backup sync=none >>>>
>>>>>
>>>>> So when guest writes to 'foo', the old data is moved to (hidden target), which
>>>>> remains unchanged from (bar)'s PoV.
>>>>>
>>>>> The drive in the middle is called hidden because QEMU creates it automatically,
>>>>> the naming is arbitrary.
>>>>
>>>> I don't understand this. In which function, the hidden target is created
>>>> automatically?
>>>>
>>>
>>> It's to be determined. This part is only in my mind :)
>>
>> What about this:
>> -drive file=nbd-target,if=none,id=nbd-target0 \
>> -drive file=active-disk,if=virtio,driver=qcow2,backing.file.filename=hidden-disk,backing.driver=qcow2,backing.backing=nbd-target0
>>
>
> It's close. I suppose backing.backing is referencing another drive as its
> backing_hd, then you cannot have the other backing.file.* option - they
> conflict.
> It would be something along:
>
>   -drive file=nbd-target,if=none,id=nbd-target0 \
>   -drive file=hidden-disk,if=none,id=hidden0,backing.backing=nbd-target0 \
>   -drive file=active-disk,if=virtio,driver=qcow2,backing.backing=hidden0
>
> Or for simplicity, s/backing.backing=/backing=/g

If using backing=drive_id, backing.backing and backing.file.* do not conflict.
backing.backing=$drive_id means that the id of the backing file's backing file
is $drive_id.

>
> Yes, adding these "backing=$drive_id" option is also exactly what we expect
> in order to support image-fleecing, but we haven't figured how to allow that
> without breaking other qmp operations like block jobs, etc.

I don't understand this. In which cases will qmp operations be broken? Can you
give me some examples?

Thanks
Wen Congyang

>
> Fam
> .
>
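
For readers following the thread, the Disk buffer semantics from steps 1) to 4) of the workflow, and the answer about both sides writing the same sector, can be sketched as a toy model in Python. This is only an illustration under stated assumptions and is not QEMU code: the class name `SecondaryDisk`, its method names, and the dict-based "sectors" are all hypothetical. The read path (buffer first, then disk) and the behaviour at a checkpoint (buffer discarded) are assumptions as well; the thread only says that the Secondary VM writes to the buffer until the next checkpoint.

```python
class SecondaryDisk:
    """Toy model of the Secondary side. Sectors are dict keys, not real I/O."""

    def __init__(self, sectors):
        self.disk = dict(sectors)  # contents of the Secondary disk
        self.buffer = {}           # Disk buffer: sector -> content

    def primary_write(self, sector, data):
        """Handle a write request forwarded from the Primary (step 1)."""
        # Step 2: save the original sector content in the buffer, but do not
        # overwrite content the buffer already holds for this sector.
        if sector not in self.buffer:
            self.buffer[sector] = self.disk.get(sector)
        # Step 3: write the Primary data through to the Secondary disk.
        self.disk[sector] = data

    def secondary_write(self, sector, data):
        """Step 4: Secondary VM writes only ever land in the buffer."""
        self.buffer[sector] = data

    def secondary_read(self, sector):
        """Assumption: the Secondary VM sees the buffer first, then the disk."""
        if sector in self.buffer:
            return self.buffer[sector]
        return self.disk.get(sector)

    def checkpoint(self):
        """Assumption: the buffer is discarded at a checkpoint."""
        self.buffer.clear()
```

With this model, if the Secondary VM writes sector 0 and the Primary then writes the same sector, `disk[0]` holds the Primary content while `secondary_read(0)` still returns the Secondary content, which matches the answer given above.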