Re: [Qemu-devel] migration: broken ram_save_pending

Alexey Kardashevskiy Wed, 05 Feb 2014 19:12:40 -0800

On 02/06/2014 03:45 AM, Paolo Bonzini wrote:
> On 05/02/2014 17:42, Dr. David Alan Gilbert wrote:
>> Because:
>>     * the code is still running and keeps redirtying a small handful of
>>       pages
>>     * but because we've underestimated our available bandwidth we never stop
>>       it and just throw those pages across immediately
> 
> Ok, I thought Alexey was saying we are not redirtying that handful of pages.



Every iteration we read the dirty map from KVM and send all dirty pages
across the stream.


> And in turn, this is because the max downtime we have is too low
> (especially for the default 32 MB/sec bandwidth; that's also pretty
> low).


My understanding now is that in order to finish migration QEMU waits for
the first 100ms window (BUFFER_DELAY) of continuously low traffic, but
because those pages get dirty every time we read the dirty map, we transfer
more in these 100ms than we are actually allowed (>32MB/s, i.e. >3.2MB per
100ms). So we transfer-transfer-transfer, detect that we transferred too
much, do delay() and if max_size (calculated from the actual transfer rate
and the downtime) for the next iteration happens (by luck) to be larger than
those 96 pages (uncompressed), we finish.
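
To make the arithmetic concrete, here is a rough sketch of the check as I
understand it. It is not the actual QEMU code: the helper names
(window_budget, max_size_for, can_complete) are made up for illustration,
and it assumes that migration completes once the remaining data is smaller
than what fits into the allowed downtime at the measured rate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BUFFER_DELAY_MS 100   /* length of one rate-limit window, in ms */

/* Bytes the rate limiter lets through in one BUFFER_DELAY window. */
static uint64_t window_budget(uint64_t limit_bytes_per_sec)
{
    return limit_bytes_per_sec * BUFFER_DELAY_MS / 1000;
}

/*
 * Bytes that could be sent within the allowed downtime, extrapolated
 * from what was actually transferred during the last measurement.
 */
static uint64_t max_size_for(uint64_t sent_bytes, uint64_t elapsed_ms,
                             uint64_t max_downtime_ms)
{
    assert(elapsed_ms > 0);
    return sent_bytes * max_downtime_ms / elapsed_ms;
}

/* Assumed rule: finish only when the remainder fits into max_size. */
static bool can_complete(uint64_t pending_bytes, uint64_t max_size)
{
    return pending_bytes < max_size;
}
```

With a limit of 32000000 bytes/s, window_budget() gives 3200000 bytes per
100ms window. If the pending estimate counts every dirty page at full size
while the measured transfer was small (because most pages went out
compressed), max_size stays below the estimate and can_complete() keeps
returning false.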

Increasing the speed and/or the downtime will help, but we would not need
that if migration did not assume that all 96 pages have to be sent in full
and instead had some smart way to detect that many of them are empty (and
so are sent compressed).

Literally, move is_zero_range() from ram_save_block() to
migration_bitmap_sync() and store this bit in some new pages_zero_map, for
example. But does it make a lot of sense?
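
Roughly what I have in mind, as a standalone sketch rather than a patch:
everything except the pages_zero_map name is hypothetical here
(page_is_zero, sync_and_estimate, the 4096-byte page size and the 16-byte
cost of a zero-page marker are all assumptions), and the real dirty bitmap
is replaced with a plain bool array to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES      4096u  /* assumed page size */
#define ZERO_PAGE_WIRE_BYTES 16u    /* assumed cost of a zero-page marker */

/* True when every byte of the page is zero. */
static bool page_is_zero(const uint8_t *page)
{
    for (size_t i = 0; i < PAGE_SIZE_BYTES; i++) {
        if (page[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Walk the dirty pages once at sync time, remember which of them are
 * zero in pages_zero_map, and return an estimate of the bytes that
 * still have to go over the wire.
 */
static uint64_t sync_and_estimate(const uint8_t *ram, size_t npages,
                                  const bool *dirty, bool *pages_zero_map)
{
    uint64_t pending = 0;

    assert(ram && dirty && pages_zero_map);
    for (size_t i = 0; i < npages; i++) {
        if (!dirty[i]) {
            pages_zero_map[i] = false;
            continue;
        }
        pages_zero_map[i] = page_is_zero(ram + i * PAGE_SIZE_BYTES);
        pending += pages_zero_map[i] ? ZERO_PAGE_WIRE_BYTES
                                     : PAGE_SIZE_BYTES;
    }
    return pending;
}
```

The pending estimate then reflects what will really be sent, at the price
of scanning every dirty page during the sync instead of during the send.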


-- 
Alexey
