On 04/10/2023 20:33, Peter Xu wrote:
> On Tue, Sep 26, 2023 at 05:18:41PM +0100, Joao Martins wrote:
>> Right now, migration statistics either print downtime or expected
>> downtime depending on migration completing of in progress. Also in the
>> beginning of migration by printing the downtime limit as expected
>> downtime, when estimation is not available.
>>
>> The pending_size is private in migration iteration and not necessarily
>> accessible outside. Given the non-determinism of the switchover cost, it
>> can be useful to understand if the downtime was far off from the one
>> detected by the migration algoritm, thus print the resultant downtime
>> alongside its estimation.
>>
>> Signed-off-by: Joao Martins <joao.m.mart...@oracle.com>
>> ---
>> migration/migration.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index dec6c88fbff9..f08f65b4b1c3 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -943,6 +943,10 @@ static void populate_time_info(MigrationInfo *info,
>> MigrationState *s)
>> if (s->state == MIGRATION_STATUS_COMPLETED) {
>> info->has_total_time = true;
>> info->total_time = s->total_time;
>> + if (s->expected_downtime) {
>> + info->has_expected_downtime = true;
>> + info->expected_downtime = s->expected_downtime;
>> + }
>
> There's another chunk right below that will also show
> expected_downtime.. How about we merge them to be clear?
>
Definitly
> IIUC the current patch will not display expected_downtime during postcopy,
> which makes sense. But it'll pop up again after postcopy completes... so
> not ideal either. If so sounds easier to just show it as long as we have a
> value, and the user can ignore it.
>
Yes.
> @@ -913,7 +913,9 @@ static void populate_time_info(MigrationInfo *info,
> MigrationState *s)
> if (migrate_show_downtime(s)) {
> info->has_downtime = true;
> info->downtime = s->downtime;
> - } else {
> + }
> +
> + if (s->expected_downtime) {
> info->has_expected_downtime = true;
> info->expected_downtime = s->expected_downtime;
> }
>
> IIUC currently expected_downtime for postcopy makes less sense.
I think it makes sense, you still need to switchover to destination. Knowing how
much you miss is useful? Albeit compared to the rest of the algorithm is less
critical than say in precopy.
> Maybe one
> day we can make it reflect reality, by taking more things into account
> (besides dirty RAM rate).
>
>> } else {
>> info->has_total_time = true;
>> info->total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) -
>> @@ -2844,6 +2848,10 @@ static MigIterateState
>> migration_iteration_run(MigrationState *s)
>>
>> if ((!pending_size || pending_size < s->threshold_size) &&
>> can_switchover) {
>> trace_migration_thread_low_pending(pending_size);
>> + if (s->threshold_size) {
>> + s->expected_downtime = (pending_size *
>> s->parameters.downtime_limit) /
>> + s->threshold_size;
>> + }
>
> I had a feeling that you did the calculation to avoid accessing ->mbps. :)
>
It was oversight on my end
> I'd suggest we move this into migration_completion(), and use ->mbps
> (before the other avail-switchover-bandwidth patch lands). It's just that
> using the bandwidth value seems more straightforward.
It is better that way, and things get consolidated.
> Or maybe I missed
> something tricky?
>
Nah, just oversight :)