04.10.2019 18:27, Max Reitz wrote:
> On 04.10.19 17:04, Vladimir Sementsov-Ogievskiy wrote:
>> 04.10.2019 17:48, Max Reitz wrote:
>>> On 04.10.19 15:22, Vladimir Sementsov-Ogievskiy wrote:
>>>> 04.10.2019 15:59, Max Reitz wrote:
>>>>> On 03.10.19 11:34, Vladimir Sementsov-Ogievskiy wrote:
>>>>>> 02.10.2019 18:52, Max Reitz wrote:
>>>>>>> On 02.10.19 17:06, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>>> 02.10.2019 18:03, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>>>> 02.10.2019 17:57, Max Reitz wrote:
>>>>>>>>>> On 12.09.19 17:13, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>>>>>> Prior to 9adc1cb49af8d, do_sync_target_write had a bug: it reset an
>>>>>>>>>>> aligned-up region in the dirty bitmap, which means that we might not
>>>>>>>>>>> copy some bytes while assuming them copied, which actually leads to
>>>>>>>>>>> producing a corrupted target.
>>>>>>>>>>>
>>>>>>>>>>> So 9adc1cb49af8d forced the dirty bitmap granularity to be
>>>>>>>>>>> request_alignment for the mirror-top filter, so that we never work
>>>>>>>>>>> with unaligned requests. However, forcing a large alignment obviously
>>>>>>>>>>> decreases the performance of unaligned requests.
>>>>>>>>>>>
>>>>>>>>>>> This commit provides another solution to the problem: if the
>>>>>>>>>>> unaligned padding is already dirty, we can safely ignore it, as
>>>>>>>>>>> 1. It's dirty, so it will be copied by mirror_iteration anyway.
>>>>>>>>>>> 2. It's dirty, so by skipping it now we don't increase the dirtiness
>>>>>>>>>>>    of the bitmap and therefore don't damage the "synchronicity" of
>>>>>>>>>>>    the write-blocking mirror.
>>>>>>>>>>
>>>>>>>>>> But that's not what active mirror is for. The point of active mirror
>>>>>>>>>> is that it must converge, because every guest write will contribute
>>>>>>>>>> towards that goal.
>>>>>>>>>>
>>>>>>>>>> If you skip active mirroring for unaligned guest writes, they will
>>>>>>>>>> not contribute towards converging, but in fact lead to the opposite.
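For reference, the skip-if-dirty clipping from the commit message can be sketched as a self-contained toy model. This is illustrative only, not the actual patch: ALIGN_UP/ALIGN_DOWN, GRANULARITY, chunk_is_dirty() and clip_for_sync_copy() are hypothetical stand-ins for QEMU's alignment macros, the job granularity, a dirty-bitmap query, and the clipping done in do_sync_target_write:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for QEMU_ALIGN_DOWN / QEMU_ALIGN_UP (positive values only). */
#define ALIGN_DOWN(x, a) ((x) / (a) * (a))
#define ALIGN_UP(x, a)   ALIGN_DOWN((x) + (a) - 1, (a))

#define GRANULARITY 4096

static bool dirty[16]; /* toy dirty bitmap: one flag per chunk */

static bool chunk_is_dirty(int64_t byte)
{
    return dirty[byte / GRANULARITY];
}

/*
 * Clip [*offset, *offset + *bytes) for the synchronous copy.  An
 * unaligned head or tail whose chunk is already dirty is skipped: it
 * will be copied by the background mirror_iteration anyway, and
 * skipping it sets no new dirty bits.  Returns false if nothing is
 * left to copy synchronously.
 */
static bool clip_for_sync_copy(int64_t *offset, int64_t *bytes)
{
    int64_t start = *offset;
    int64_t end = *offset + *bytes;

    if (start % GRANULARITY && chunk_is_dirty(start)) {
        start = ALIGN_UP(start, GRANULARITY);
    }
    if (end % GRANULARITY && chunk_is_dirty(end - 1)) {
        end = ALIGN_DOWN(end, GRANULARITY);
    }
    if (start >= end) {
        return false; /* whole request lies in already-dirty chunks */
    }
    *offset = start;
    *bytes = end - start;
    return true;
}
```

Note the sketch covers only the skip-if-dirty case; a clean unaligned head or tail still has to be copied, and the real code must then take care not to reset bitmap bits beyond the bytes actually written (the pre-9adc1cb49af8d bug).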
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> They will not contribute only if the region is already dirty.
>>>>>>>>> Actually, after the first iteration of mirroring (copying the whole
>>>>>>>>> disk), all following writes will contribute, so the whole process
>>>>>>>>> must converge. It is a bit similar to running one mirror loop in
>>>>>>>>> normal mode and then enabling write-blocking.
>>>>>>>>
>>>>>>>> In other words, we don't need "all guest writes contribute" to
>>>>>>>> converge; "all guest writes don't create new dirty bits" is enough,
>>>>>>>> as we have a parallel mirror iteration which continuously handles
>>>>>>>> dirty bits.
>>>>>>>
>>>>>>> Hm, in a sense. But it does mean that guest writes will not contribute
>>>>>>> to convergence.
>>>>>>>
>>>>>>> And that's against the current definition of write-blocking, which
>>>>>>> does state that "when data is written to the source, write it
>>>>>>> (synchronously) to the target as well".
>>>>>>
>>>>>> Hmm, understood. But IMHO our proposed behavior is better in general.
>>>>>> Do you think it's a problem to change the spec now?
>>>>>> If yes, I'll resend with an option.
>>>>>
>>>>> Well, the thing is that I'd find it weird if write-blocking weren't
>>>>> blocking in all cases. And in my opinion, it makes more sense for
>>>>> active mirror if all writes actively contribute to convergence.
>>>>
>>>> Why? What is the benefit of that?
>>>> What does "all writes actively contribute to convergence" mean for the
>>>> user?
>>>
>>> One thing I wonder about is whether it's really guaranteed that the
>>> background job will run under full I/O load, and how often it runs.
>>>
>>> I fear that with your model, the background job might starve and the
>>> mirror may take a very long time.
>>
>> Hmmm. I think the mirror job runs in the same context as guest writes and
>> goes through the same I/O API. Why would it starve? (I understand that my
>> words are not evidence...)
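The counting argument above ("no write ever sets a new dirty bit; the background iteration only clears them") can be illustrated with a toy simulation. Everything here is a hypothetical model, not QEMU code; mirror_iteration() and guest_write() merely mimic the bitmap effect of the two paths:

```c
#include <assert.h>
#include <stdbool.h>

#define CHUNKS 64

static bool dirty[CHUNKS]; /* toy dirty bitmap, one flag per chunk */

/* One background mirror iteration: copy out (clear) one dirty chunk. */
static void mirror_iteration(void)
{
    for (int i = 0; i < CHUNKS; i++) {
        if (dirty[i]) {
            dirty[i] = false;
            return;
        }
    }
}

/*
 * A guest write to one chunk.  Under the proposed write-blocking
 * behavior, a clean chunk is written through synchronously (stays
 * clean) and a dirty chunk is skipped (stays dirty): no new dirty bit
 * is ever set.  Under passive mirroring, every write sets the bit.
 */
static void guest_write(int chunk, bool write_blocking)
{
    if (!write_blocking) {
        dirty[chunk] = true;
    }
}

static int dirty_count(void)
{
    int n = 0;
    for (int i = 0; i < CHUNKS; i++) {
        n += dirty[i];
    }
    return n;
}
```

Under this invariant the dirty count is non-increasing, so after at most one background pass over the initially dirty chunks the mirror has converged, whatever the write load; with passive mirroring, by contrast, each write can re-dirty a chunk.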
>
> I thought that maybe if the disk is being read and written all the time,
> there'd be no time for the mirror coroutine.
>
>>> It won't diverge, but it also won't really converge.
>>
>> But the same holds for the current behavior: the guest is not guaranteed
>> to write to all parts of the disk, and in most scenarios it doesn't. So if
>> the mirror job starves because of a huge guest I/O load, we will not
>> converge anyway.
>>
>> So, the background process is a necessary thing for convergence anyway.
>
> Good point. That convinces me.
>
>>> The advantage of letting all writes block is that even under full I/O
>>> load, the mirror job will progress at a steady pace.
>>>
>>>> I think for the user there may be the following criteria:
>>>>
>>>> 1. Guaranteed convergence, with any guest write load.
>>>>    Both the current and my proposed variants are OK here.
>>>>
>>>> 2. Less impact on the guest.
>>>>    Obviously my proposed variant is better.
>>>>
>>>> 3. Total time of mirroring.
>>>>    The current variant may be a bit better, but I don't think that
>>>>    unaligned tails have a significant impact.
>>>>
>>>> ===
>>>>
>>>> So, assume I want [1]+[2]. And possibly
>>>> 2.2: Even less impact on the guest: ignore not only unaligned tails if
>>>> they are already dirty, but the full synchronous mirror operation if the
>>>> area is already dirty.
>>>>
>>>> What should I call this? Should it be a separate mode, or an option for
>>>> write-blocking?
>>>
>>> I don't know whether it makes sense to add a separate mode or a separate
>>> option just for this difference. I don't think anyone would choose the
>>> non-default option.
>>
>> At least Virtuozzo will choose it :)
>
> But if Virtuozzo knows exactly what to choose, then we can just implement
> only that behavior.
Then, welcome to this series :)

Still, [2.2] is arguable: if we skip some dirty areas in the synchronous copy,
the guest will be a bit happier, but we miss the chance to spare the background
mirror the read part of that copy. But this may be postponed.

-- 
Best regards,
Vladimir