Hi Laine,

Thanks for your reply. I think I have found a different problem.

I tried what you suggested: suspend the domain with "virsh suspend <domain>" before saving it, then, after restoring it with "virsh restore <image-file>", resume it with "virsh resume <domain>". When I restored the domain on another host, the restore still failed with the error I mentioned before:

error: Failed to restore domain from testRes.dat
error: operation failed: failed to start VM

However, I can now work around the problem in an "ugly" way: before restoring the domain on another host, I read the whole suspend image from start to end on that host. The code that does the read is very simple:

    /* Read the whole suspend image once; the data itself is discarded. */
    if ((fd = open(suspendImage, O_RDONLY)) < 0) {
        goto error;
    }
    while ((size = read(fd, buf, MAXLINELEN))) {
        if (size == -1) {
            goto error;
        }
    }
    close(fd);

After this code has run, restoring the domain on that host succeeds!

This workaround (reading the suspend image completely on the target host before restoring the domain there) has worked every time in my environment so far. I am not sure why it works. Maybe the read triggers a refresh of the NFS cache, so that the complete suspend image becomes accessible on the target host, but I don't know.

Regards,
Qian

2010/5/12 Laine Stump <la...@laine.org>

> On 05/11/2010 04:40 AM, Zhang Qian wrote:
>
>> Hi,
>>
>> I have two KVM host: h1 and h2, both of them mount an NFS directory as a
>> shared storage.
>> I can save (virsh save <domain> <file>) a domain in h1 to a state file in
>> the shared storage successfully, but failed to restore it from h2 with the
>> following error message:
>> # virsh restore testRes.dat
>> error: Failed to restore domain from testRes.dat
>> error: operation failed: failed to start VM
>>
>> I can always restore it from h1, but sometimes works for h2 (wait for a
>> while, then "virsh restore" command may succeed in h2). I guess the state
>> file generated by "virsh save" command is not intact from h2 point view, may
>> be cause by the cache of NFS server?
>>
>
> There is a race condition in qemu when restarting a domain - it is possible
> for qemu to start the CPU before the domain image has been read from the
> file (this is regardless of where the file is stored). This may or may not
> be your problem (the error condition I saw due to this race was different
> from what you are seeing). It is easy to test for though - before saving
> your domain, first suspend it with "virsh suspend <domain>", then save it;
> after you've restored the domain with "virsh restore <image-file>", resume
> the domain with "virsh resume <domain>". If the domain successfully resumes,
> your problem was the race I describe. If not, you have found a different
> problem.
>
> I'm interested to know if this solves your problem.
>
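
The read-through workaround described above can also be written as a small standalone helper. This is only a sketch under stated assumptions: the name read_whole_file and the 64 KiB buffer size are chosen here for illustration and do not come from the original code. It makes no claim about why reading the file helps; it simply reads the image from start to end, discards the data, and retries if read() is interrupted by a signal.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and buffer size are assumptions):
 * read `path` from start to end and discard the data.
 * Returns the number of bytes read, or -1 on any error.
 */
long long read_whole_file(const char *path)
{
    char buf[65536];
    long long total = 0;
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    for (;;) {
        n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            total += n;          /* count the bytes, drop the contents */
            continue;
        }
        if (n == 0)
            break;               /* end of file reached */
        if (errno == EINTR)
            continue;            /* interrupted by a signal, retry */
        close(fd);               /* real read error: do not leak the fd */
        return -1;
    }

    if (close(fd) < 0)
        return -1;
    return total;
}
```

Calling read_whole_file("testRes.dat") on the target host before running "virsh restore testRes.dat" mirrors the sequence described above. Comparing the returned byte count with the file size seen on the saving host would be one way to check whether the whole image is visible on the target.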
_______________________________________________
libvirt-users mailing list
libvirt-users@redhat.com
https://www.redhat.com/mailman/listinfo/libvirt-users