Re: [Qemu-devel] Migrating decrementer

Mark Cave-Ayland Tue, 26 Jan 2016 14:33:03 -0800

On 25/01/16 11:10, David Gibson wrote:

> Um.. so the migration duration is a complete red herring, regardless
> of the units.
> 
> Remember, we only ever compute the guest timebase value at the moment
> the guest requests it - actually maintaining a current timebase value
> makes sense in hardware, but would be nuts in software.
> 
> The timebase is a function of real, wall-clock time, and the migration
> destination has a notion of wall-clock time without reference to the
> source.
> 
> So what you need to transmit for migration is enough information to
> compute the guest timebase from real-time - essentially just an offset
> between real-time and the timebase.
> 
> The guest can potentially observe the migration duration as a jump in
> timebase values, but qemu doesn't need to do any calculations with it.


Thanks for more pointers - I think I'm slowly getting there. My current
thoughts are that the basic migration algorithm is doing the right thing
in that it works out the difference in host ticks between source and
destination.

I have a slight query with this section of code though:

    migration_duration_tb = muldiv64(migration_duration_ns, freq,
                                     NANOSECONDS_PER_SECOND);

This is not technically correct on TCG x86, since there the timebase is
the x86 TSC, which runs somewhere in the GHz range, whereas freq is
hard-coded to 16MHz. However this doesn't seem to matter because the
timebase adjustment is limited to a maximum of 1s. Why is the adjustment
capped at all if the timebase is supposed to be free running, as you
mentioned in a previous email?
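
To illustrate the size of the mismatch, here is a minimal standalone
sketch of the ns-to-ticks scaling. It is not QEMU's muldiv64();
muldiv64_sketch() and ns_to_ticks() are hypothetical names, the 128-bit
intermediate relies on a GCC/Clang extension, and the 2GHz figure is
only an assumed stand-in for "a TSC somewhere in the GHz range":

```c
#include <stdint.h>

#define NS_PER_SEC 1000000000u

/* Hypothetical helper: a * b / c using a 128-bit intermediate so the
 * product cannot overflow (unsigned __int128 is a GCC/Clang extension).
 * c must be non-zero. */
static uint64_t muldiv64_sketch(uint64_t a, uint32_t b, uint32_t c)
{
    return (uint64_t)(((unsigned __int128)a * b) / c);
}

/* Convert a duration in nanoseconds to ticks of a counter running at
 * freq_hz. */
static uint64_t ns_to_ticks(uint64_t duration_ns, uint32_t freq_hz)
{
    return muldiv64_sketch(duration_ns, freq_hz, NS_PER_SEC);
}
```

With these definitions, ns_to_ticks(1000000000, 16000000) gives 16000000
ticks for the 1s cap, while the assumed 2GHz counter advances
2000000000 ticks over the same second.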

AFAICT the main problem on TCG x86 is that post-migration the timebase
calculated by cpu_ppc_get_tb() is incorrect:

uint64_t cpu_ppc_get_tb(ppc_tb_t *tb_env, uint64_t vmclk, int64_t tb_offset)
{
    /* TB time in tb periods */
    return muldiv64(vmclk, tb_env->tb_freq, get_ticks_per_sec()) +
                    tb_offset;
}

For a typical savevm/loadvm pair I see something like this:

savevm:

tb->guest_timebase = 26281306490558
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) = 7040725511

loadvm:

cpu_get_host_ticks() = 26289847005259
tb_off_adj = -8540514701
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) = 7040725511
cpu_ppc_get_tb() = -15785159386

But as cpu_ppc_get_tb() uses QEMU_CLOCK_VIRTUAL for vmclk we end up with
a negative number for the timebase since the virtual clock is dwarfed by
the number of TSC ticks calculated for tb_off_adj. This will work on a
PPC host though since cpu_get_host_ticks() is also derived from the
timebase.
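
The arithmetic behind those figures can be sketched as below. The
constants are copied from the log above; everything else is an
assumption made for illustration: tb_freq is taken to be 16MHz, the
offset passed in is taken to be tb_off_adj, and the helper names are
hypothetical. It therefore shows why the result goes negative rather
than reproducing the exact cpu_ppc_get_tb() value logged:

```c
#include <stdint.h>

/* Values copied from the savevm/loadvm log. */
#define GUEST_TIMEBASE  26281306490558LL
#define HOST_TICKS      26289847005259LL
#define VMCLK_NS        7040725511LL

/* Assumption: the log does not state tb_freq; 16MHz is used here. */
#define ASSUMED_TB_FREQ 16000000LL
#define NS_PER_SEC      1000000000LL

/* Hypothetical helper: offset that maps host ticks to the saved guest
 * timebase. */
static int64_t tb_offset_adjustment(int64_t guest_tb, int64_t host_ticks)
{
    return guest_tb - host_ticks;
}

/* Hypothetical helper: same shape as cpu_ppc_get_tb(), but in plain
 * signed 64-bit arithmetic. The product fits for the inputs used here
 * (about 1.1e17); it is not safe for arbitrary values. */
static int64_t tb_from_vmclk(int64_t vmclk_ns, int64_t tb_freq,
                             int64_t tb_offset)
{
    return vmclk_ns * tb_freq / NS_PER_SEC + tb_offset;
}
```

tb_offset_adjustment(GUEST_TIMEBASE, HOST_TICKS) gives -8540514701,
matching the logged tb_off_adj. The scaled virtual clock is only
112651608 ticks under the 16MHz assumption, so adding that offset
leaves the sum at -8427863093, which is still negative.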

Another question I have concerns cpu_ppc_load_tbl():

uint64_t cpu_ppc_load_tbl (CPUPPCState *env)
{
    ppc_tb_t *tb_env = env->tb_env;
    uint64_t tb;

    if (kvm_enabled()) {
        return env->spr[SPR_TBL];
    }

    tb = cpu_ppc_get_tb(tb_env, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
                        tb_env->tb_offset);
    LOG_TB("%s: tb %016" PRIx64 "\n", __func__, tb);

    return tb;
}

Compared with cpu_ppc_load_tbu(), it returns uint64_t rather than
uint32_t and doesn't appear to mask the timebase value down to its
bottom 32 bits?
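
For reference, this is the kind of split I would have expected, as a
standalone sketch rather than the QEMU code itself (tb_lower() and
tb_upper() are hypothetical names):

```c
#include <stdint.h>

/* Lower 32 bits of the 64-bit timebase (what TBL would hold). */
static uint32_t tb_lower(uint64_t tb)
{
    return (uint32_t)(tb & 0xFFFFFFFFu);
}

/* Upper 32 bits of the 64-bit timebase (what TBU would hold). */
static uint32_t tb_upper(uint64_t tb)
{
    return (uint32_t)(tb >> 32);
}
```

For tb = 0x0000123456789ABC this gives tb_lower() = 0x56789ABC and
tb_upper() = 0x00001234.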


ATB,

Mark.
