Hi Michael, 

What you have committed is not correct. It is basically the same as what was 
there before, and it does not solve the problem. You changed my implementation 
of prepSourceVm to:

def prepSourceVm(self, instance):
                instance.state = InstanceState.MigratePrep

but it has to be as follows:

def prepSourceVm(self, vmId):
                instance = self.getInstance(vmId)
                instance.state = InstanceState.MigratePrep

The problem was that the CM instance was not in sync with the NM instance. 
The VM instance is stored in the CM and, independently, in the NM. The CM 
instance is updated by the stateTransition call; to update the NM instance, 
you first have to retrieve it with a self.getInstance(vmId) call.
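
To illustrate the difference between the two versions, here is a minimal, self-contained sketch. It is not Tashi code: NodeManagerSketch, prepSourceVmByCopy and prepSourceVmById are hypothetical names, and copy.deepcopy only stands in for the assumption that the instance arriving over RPC is a separate object from the one the NM stores.

```python
import copy


class InstanceState:
    # Simplified stand-ins for the real state constants.
    Running = "Running"
    MigratePrep = "MigratePrep"


class Instance:
    def __init__(self, vmId, state):
        self.vmId = vmId
        self.state = state


class NodeManagerSketch:
    """Hypothetical stand-in for the NM's own instance table."""

    def __init__(self):
        self.instances = {}

    def getInstance(self, vmId):
        return self.instances[vmId]

    def prepSourceVmByCopy(self, instance):
        # Mutates only the object passed in by the caller.
        instance.state = InstanceState.MigratePrep

    def prepSourceVmById(self, vmId):
        # Looks up the NM's own record first, then updates it.
        instance = self.getInstance(vmId)
        instance.state = InstanceState.MigratePrep


nm = NodeManagerSketch()
nm.instances[7] = Instance(7, InstanceState.Running)

# Assumption: the CM's object reaches the NM as an independent copy.
cm_copy = copy.deepcopy(nm.instances[7])

nm.prepSourceVmByCopy(cm_copy)
state_after_copy_call = nm.getInstance(7).state

nm.prepSourceVmById(7)
state_after_id_call = nm.getInstance(7).state

print(state_after_copy_call)
print(state_after_id_call)
```

In this sketch the first call leaves the NM's stored record untouched, while the second call changes it.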

Migrations on our cluster failed many times because the error surfaced 
later, in migrateVm: that method expected the instance to be in the 
MigrateTrans state, but registerNodeManager had set it back to Running 
(after reporting that the state was not as expected: oldInstance was in the 
Running state while instance was in the MigrateTrans state).
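
The failure mode can be sketched as a guarded state transition. This is only an illustration under assumptions: the stateTransition function below is a simplified, hypothetical version written for this example (the real method lives in clustermanagerservice.py and may differ), and the final target state is chosen arbitrarily.

```python
class InstanceState:
    # Simplified stand-ins for the real state constants.
    Running = "Running"
    MigratePrep = "MigratePrep"
    MigrateTrans = "MigrateTrans"


class Instance:
    def __init__(self, vmId, state):
        self.vmId = vmId
        self.state = state


def stateTransition(instance, expected, new):
    # Hypothetical guard: refuse the transition if the current state
    # is not the one the caller expects.
    if instance.state != expected:
        raise ValueError("expected %s, found %s" % (expected, instance.state))
    instance.state = new


inst = Instance(7, InstanceState.Running)
stateTransition(inst, InstanceState.Running, InstanceState.MigratePrep)
stateTransition(inst, InstanceState.MigratePrep, InstanceState.MigrateTrans)

# Stand-in for an unsynced NM record causing the state to be reset.
inst.state = InstanceState.Running

failed = False
try:
    # A later step that expects MigrateTrans now fails.
    stateTransition(inst, InstanceState.MigrateTrans, InstanceState.Running)
except ValueError as e:
    failed = True
    print("migration step failed: %s" % e)
```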

However, with this patch applied, migrations on our cluster work fine.

Best,
Miha


----- Original Message -----
From: "Michael Stroucken" <[email protected]>
To: [email protected]
Sent: Wednesday, November 24, 2010 10:40:50 PM
Subject: Re: KVM migrations

Miha Stopar wrote:
> Dear all,
>
> I don't know if anybody else experienced problems when executing KVM live 
> migrations with Tashi, but I experienced some errors related to VM states. I 
> think the problem lies inside prepReceiveVm method, which is inside 
> nodemanagerservice.py and is called by CM.
>
> The prepReceiveVm method sets the instance state to MigratePrep 
> ("instance.state = InstanceState.MigratePrep"), which does not make sense, 
> because the state of this instance was already set to MigratePrep inside 
> clustermanagerservice.py migrateVm method (see 
> "self.stateTransition(instance, InstanceState.Running, 
> InstanceState.MigratePrep)"). So the source VM state is not updated, and 
> this causes an exception to be thrown every time inside the stateTransition 
> method.
>
Hi Miha,

Thanks for your suggestion. I have applied it to the code as follows:

Author: stroucki
Date: Wed Nov 24 21:37:33 2010
New Revision: 1038840

URL: http://svn.apache.org/viewvc?rev=1038840&view=rev
Log:
Implement Miha Stopar's changes to set the state of the VM on the source
machine of a migration to MigratePrep.

Modified:
    incubator/tashi/trunk/src/tashi/clustermanager/clustermanagerservice.py
    incubator/tashi/trunk/src/tashi/nodemanager/nodemanagerservice.py
    incubator/tashi/trunk/src/tashi/rpycservices/rpycservices.py

Modified: 
incubator/tashi/trunk/src/tashi/clustermanager/clustermanagerservice.py
URL: 
http://svn.apache.org/viewvc/incubator/tashi/trunk/src/tashi/clustermanager/clustermanagerservice.py?rev=1038840&r1=1038839&r2=1038840&view=diff
==============================================================================
--- incubator/tashi/trunk/src/tashi/clustermanager/clustermanagerservice.py 
(original)
+++ incubator/tashi/trunk/src/tashi/clustermanager/clustermanagerservice.py Wed 
Nov 24 21:37:33 2010
@@ -253,6 +253,9 @@ class ClusterManagerService(object):
                self.data.releaseInstance(instance)
                try:
                        # Prepare the target
+                       self.log.info("migrateVm: Calling prepSourceVm on source host %s" % sourceHost.name)
+                       self.proxy[sourceHost.name].prepSourceVm(instance)
+                       self.log.info("migrateVm: Calling prepReceiveVm on target host %s" % targetHost.name)
                        cookie = self.proxy[targetHost.name].prepReceiveVm(instance, sourceHost)
                except Exception, e:
                        self.log.exception('prepReceiveVm failed')

Modified: incubator/tashi/trunk/src/tashi/nodemanager/nodemanagerservice.py
URL: 
http://svn.apache.org/viewvc/incubator/tashi/trunk/src/tashi/nodemanager/nodemanagerservice.py?rev=1038840&r1=1038839&r2=1038840&view=diff
==============================================================================
--- incubator/tashi/trunk/src/tashi/nodemanager/nodemanagerservice.py (original)
+++ incubator/tashi/trunk/src/tashi/nodemanager/nodemanagerservice.py Wed Nov 
24 21:37:33 2010
@@ -198,10 +198,12 @@ class NodeManagerService(object):
                return instance.vmId
        
        def prepReceiveVm(self, instance, source):
-               instance.state = InstanceState.MigratePrep
                instance.vmId = -1
                transportCookie = self.vmm.prepReceiveVm(instance, source.name)
                return transportCookie
+
+       def prepSourceVm(self, instance):
+               instance.state = InstanceState.MigratePrep
        
        def migrateVmHelper(self, instance, target, transportCookie):
                self.vmm.migrateVm(instance.vmId, target.name, transportCookie)

Modified: incubator/tashi/trunk/src/tashi/rpycservices/rpycservices.py
URL: 
http://svn.apache.org/viewvc/incubator/tashi/trunk/src/tashi/rpycservices/rpycservices.py?rev=1038840&r1=1038839&r2=1038840&view=diff
==============================================================================
--- incubator/tashi/trunk/src/tashi/rpycservices/rpycservices.py (original)
+++ incubator/tashi/trunk/src/tashi/rpycservices/rpycservices.py Wed Nov 24 
21:37:33 2010
@@ -3,7 +3,7 @@ from tashi.rpycservices.rpyctypes import
 import cPickle
 
 clusterManagerRPCs = ['createVm', 'shutdownVm', 'destroyVm', 'suspendVm', 
'resumeVm', 'migrateVm', 'pauseVm', 'unpauseVm', 'getHosts', 'getNetworks', 
'getUsers', 'getInstances', 'vmmSpecificCall', 'registerNodeManager', 
'vmUpdate', 'activateVm']
-nodeManagerRPCs = ['instantiateVm', 'shutdownVm', 'destroyVm', 'suspendVm', 
'resumeVm', 'prepReceiveVm', 'migrateVm', 'receiveVm', 'pauseVm', 'unpauseVm', 
'getVmInfo', 'listVms', 'vmmSpecificCall', 'getHostInfo']
+nodeManagerRPCs = ['instantiateVm', 'shutdownVm', 'destroyVm', 'suspendVm', 
'resumeVm', 'prepReceiveVm', 'prepSourceVm', 'migrateVm', 'receiveVm', 
'pauseVm', 'unpauseVm', 'getVmInfo', 'listVms', 'vmmSpecificCall', 
'getHostInfo']
 
 def clean(args):
        """Cleans the object so cPickle can be used."""


Greetings,
Michael.

