Vincent Vuong created CLOUDSTACK-7827:
-----------------------------------------

             Summary: storage migration timeout, loss of data
                 Key: CLOUDSTACK-7827
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7827
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
    Affects Versions: 4.4.1
         Environment: CentOS 6.5, XenServer 6.2 with latest patches, CloudStack 4.4.1
            Reporter: Vincent Vuong
            Priority: Critical


If a volume migration does not complete before the CloudStack timeout is 
reached, the VM cannot be started again once it has been stopped.  We have 
observed this behavior with CloudStack 4.1 through 4.4.  Data loss will occur 
if the admin stops the VM before identifying the new VHD chain.  Steps to 
reproduce:

1)      Start a storage migration on a running VM that will take longer than 
the CloudStack timeout value.
2)      The storage migration fails, with CloudStack reporting “Host timed 
out”, but XenServer continues with the volume migration.
3)      After XenServer completes the volume migration, it deletes the 
original VHD chain.  The volume “PATH” in the CloudStack database is not 
updated to point to the new VHD chain.
4)      The VM cannot be started after being stopped.  Once the VM has 
stopped, there is no way to find out what the new VHD chain is.
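
The failure mode in the steps above can be illustrated with a toy model.  This 
is only a sketch, not CloudStack or XenServer code: the names hypervisor_copy, 
migrate, db_path and actual_path are made up for illustration, and the 
durations are fractions of a second rather than the real timeout.

```python
import threading
import time


def hypervisor_copy(state, duration):
    # Stand-in for the hypervisor-side copy: it runs to completion
    # whether or not the caller is still waiting for it.
    time.sleep(duration)
    state["actual_path"] = "new-vhd-chain"  # the original chain is gone


def migrate(timeout, duration):
    # Both values start out pointing at the original chain.
    state = {"db_path": "old-vhd-chain", "actual_path": "old-vhd-chain"}
    worker = threading.Thread(target=hypervisor_copy, args=(state, duration))
    worker.start()

    worker.join(timeout)  # the management side waits at most `timeout` seconds
    timed_out = worker.is_alive()
    if not timed_out:
        # Normal case: the database record follows the new location.
        state["db_path"] = state["actual_path"]

    worker.join()  # the copy finishes regardless of the timeout
    return timed_out, state


if __name__ == "__main__":
    print(migrate(timeout=0.05, duration=0.2))
```

When the copy outlasts the timeout, the model ends with db_path still naming 
the old chain while actual_path names the new one, which is the inconsistency 
described in step 3.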

Workaround:
1)      While the VM is still running, run the following command to find the 
new VHD file name:  xe vbd-list vm-uuid=
2)      Stop the VM, copy the VHD chain back to the original primary storage, 
and update the volume “PATH” in the CloudStack database with the new VHD chain.
3)      Start the VM.
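
For the first recovery step above, the xe output can be reduced to the 
attached VDI UUIDs with a small parser.  This is a rough sketch: the SAMPLE 
text is illustrative rather than captured from a real host, the function names 
are hypothetical, and it assumes the usual "key ( RO): value" record layout 
with a blank line between records.

```python
def parse_xe_records(text):
    """Split xe list output into one dict per blank-line-separated record."""
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            if current:
                records.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        # Drop the access marker such as "( RO)" or "( RW)" from the key.
        name = key.split("(")[0].strip()
        current[name] = value.strip()
    if current:
        records.append(current)
    return records


def attached_vdi_uuids(text):
    """Return the vdi-uuid of every record that has a real VDI attached."""
    return [
        r["vdi-uuid"]
        for r in parse_xe_records(text)
        if r.get("vdi-uuid") and not r["vdi-uuid"].startswith("<")
    ]


# Illustrative sample only; the UUID values are placeholders.
SAMPLE = """\
uuid ( RO)             : vbd-uuid-1
          vm-uuid ( RO): vm-uuid-1
         vdi-uuid ( RO): vdi-uuid-1
           device ( RO): xvda

uuid ( RO)             : vbd-uuid-2
          vm-uuid ( RO): vm-uuid-1
         vdi-uuid ( RO): <not in database>
           device ( RO): xvdd
"""

if __name__ == "__main__":
    print(attached_vdi_uuids(SAMPLE))
```

Recording these values before the VM is stopped is what makes step 2 possible 
later.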

2014-11-01 21:16:56,887 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-3:ctx-80290066 job-174/job-175 ctx-c104adfc) copy failed
com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:4, com.cloud.exception.OperationTimedoutException: Commands 1959910262836298211 to Host 4 timed out after 3600
        at org.apache.cloudstack.storage.RemoteHostEndPoint.sendMessage(RemoteHostEndPoint.java:133)
        at org.apache.cloudstack.storage.motion.AncientDataMotionStrategy.migrateVolumeToPool(AncientDataMotionStrategy.java:383)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
