On 07/09/2013 09:20 AM, Vinzenz Feenstra wrote:
Hi,

While addressing a change for the migration progress timeout (http://gerrit.ovirt.org/#/c/16602/) I happened to notice that there are currently some strange things implemented which don't really make sense to me, and it seems they have lost their original intent and functionality over time.

I'd like to explain here what is currently happening in the code in the three locations where the migration_timeout value is used.

1. Migration progress timeout (this seems to be working fine)
The progress is checked via Vm._dom.jobInfo() every 'migration_monitor_interval' seconds (default: 10 seconds). If dataRemaining is lower than the last remembered value, the current timestamp is taken and remembered.

If there is no progress (no lower value of dataRemaining) and the time since the last remembered timestamp exceeds 'migration_timeout', the migration job is aborted and the MigrationMonitorThread is stopped.

Note: as part of Changeset 16602 migration_timeout has been replaced with migration_progress_timeout in the code.
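
To make sure we're talking about the same logic, here is a minimal sketch of how I read the progress check; the function shape, the jobInfo() field index and the default values are my own simplifications/assumptions, not the literal MigrationMonitorThread code:

    import time

    def monitor_progress(dom, stop_event,
                         monitor_interval=10,     # migration_monitor_interval
                         progress_timeout=150):   # migration_(progress_)timeout
        # Abort the migration job if dataRemaining stops shrinking for too long.
        last_data_remaining = None
        last_progress_time = time.time()
        while True:
            stop_event.wait(monitor_interval)
            if stop_event.isSet():
                break
            job_info = dom.jobInfo()              # virDomainGetJobInfo()
            data_remaining = job_info[5]          # the dataRemaining field
            now = time.time()
            if last_data_remaining is None or data_remaining < last_data_remaining:
                # Progress was made: remember the new low-water mark and when we saw it.
                last_data_remaining = data_remaining
                last_progress_time = now
            elif now - last_progress_time > progress_timeout:
                # No progress for longer than the timeout: abort the job and stop monitoring.
                dom.abortJob()
                break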

2. Migration destination timeout

    In _waitForIncomingMigrationFinish:

We wait 'migration_timeout' seconds on the _incomingMigrationFinished event. If the wait times out, we simply try to create a new NotifyingVirDomain instance by looking up the virDomainPtr instance via libvirt.lookupByUUIDString, which will only fail if the domain is not yet defined in libvirt.

Without even checking whether _incomingMigrationFinished is still unset, it then calls _domDependentInit, which initializes the internal device information and fires up the statistics thread, and it also calls _dom.resume().

In my opinion, it should check whether _incomingMigrationFinished is still unset before calling _domDependentInit, because libvirt.lookupByUUIDString can return the domain even if the migration is not finished.

Note: I deliberately ignored the other scenario, where _incomingMigrationFinished gets set first; it is not as problematic as the one mentioned above.
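
To check that I read the destination-side flow correctly, here is a rough sketch of what happens today plus the extra check I am suggesting; the helper names (lookup_domain, dom_dependent_init), the error handling and the default timeout value are placeholders of mine, not the actual vm.py code:

    def wait_for_incoming_migration_finish(finished_event, lookup_domain,
                                           dom_dependent_init,
                                           migration_timeout=300):  # default assumed
        # Wait up to 'migration_timeout' seconds for the end-of-migration event.
        finished_event.wait(migration_timeout)
        # Current behaviour as I read it: go straight to the lookup, which only
        # fails if the domain is not yet defined in libvirt.
        dom = lookup_domain()            # wraps libvirt's lookupByUUIDString()
        # Suggested extra check: the lookup can succeed while the migration is
        # still in flight, so bail out instead of resuming a half-migrated VM.
        if not finished_event.isSet():
            raise RuntimeError('incoming migration still not finished')
        dom_dependent_init(dom)          # init device info, start the stats thread
        dom.resume()
        return dom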

3. MigrationDowntimeThread

    _migrationTimeout is defined as follows:
    def _migrationTimeout(self):
        timeout = config.getint('vars', 'migration_timeout')
        mem = int(self.conf['memSize'])
        if mem > 2048:
            # scale the timeout up for guests with more than 2048 MB of memory
            timeout = timeout * mem / 2048
        return timeout

The MigrationDowntimeThread is initialized with the value returned by _migrationTimeout, divided by 2:
    t = MigrationDowntimeThread(int(self._downtime), self._migrationTimeout() / 2)
The second argument is called 'wait' there.

In the MigrationDowntimeThread.run method there is a loop which is executed 'migration_downtime_steps' (10) times. Each iteration first waits wait / migration_downtime_steps seconds, then increases the downtime by another tenth (1 / migration_downtime_steps) of the requested downtime and passes it to _dom.migrateSetMaxDowntime.
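
For reference, this is how I read MigrationDowntimeThread, written as a self-contained sketch; the constructor signature and names are simplified, and the real code in vm.py differs in the details:

    import threading

    class MigrationDowntimeThread(threading.Thread):
        def __init__(self, dom, downtime, wait, steps=10):  # steps = migration_downtime_steps
            super(MigrationDowntimeThread, self).__init__()
            self._dom = dom
            self._downtime = downtime   # the downtime requested for this migration
            self._wait = wait           # _migrationTimeout() / 2
            self._steps = steps
            self._stop = threading.Event()

        def run(self):
            for i in range(self._steps):
                # Sleep wait / migration_downtime_steps seconds between steps.
                self._stop.wait(self._wait / self._steps)
                if self._stop.isSet():
                    break
                # Raise the tolerated downtime by another 1/steps of the target.
                self._dom.migrateSetMaxDowntime(
                    self._downtime * (i + 1) / self._steps, 0)

        def cancel(self):
            self._stop.set()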

Note: There are two things that confuse me about this one:
1. The migrateSetMaxDowntime documentation says: "Sets maximum tolerable time for which the domain is allowed to be paused at the end of live migration."
       Why does this need to be increased gradually over time?
2. Why does the MigrationDowntimeThread run for only half of the time returned by _migrationTimeout?
Here's my understanding:
As we know, a smaller migration downtime has less impact on the guest's applications, but it is harder to achieve, especially for a guest with a high memory workload. So I think the intention of MigrationDowntimeThread.run is to start with a small downtime and gradually increase it if the migration can't be finished.

As for the 2nd question, I guess it just wants to leave enough time for the last try with the biggest downtime.
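
To put some numbers on it (the values are made up purely for illustration, not the real defaults): assume migration_timeout = 300, memSize = 8192 and a requested downtime of 50. Then _migrationTimeout() returns 300 * 8192 / 2048 = 1200, wait = 1200 / 2 = 600, each of the 10 steps sleeps 60 seconds and raises the max downtime to 5, 10, ..., 50, and the other 600 seconds of the scaled timeout remain for the migration to converge with that final, largest value.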




-----------------------------

Now, could someone please look over my description and tell me whether the behaviour I have described here is the intended behaviour? To me it seems that, with patchset 16602, the 1st part described above makes sense. The value could probably be even lower than the currently suggested 150s.

Parts 2 and 3, however, don't seem to behave as intended; at least that's how it looks to me.

Please enlighten me if this is the way it was intended. :-)
I have some ideas regarding these things which could be part of the continuation of the refactoring work in vm.py, but first I would like to understand the original intent. Thanks.

