I've created shared storage on both hosts, but live migration still gives an error:

[2013-06-26 13:21:19 2721] DEBUG (XendCheckpoint:305) [xc_restore]: /usr/lib/xen-4.1/bin/xc_restore 18 4 1 2 0 0 0 0
[2013-06-26 13:24:29 2721] INFO (XendCheckpoint:423) xc: error: Failed to pin batch of 1024 page tables (22 = Invalid argument): Internal error
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:3071) XendDomainInfo.destroy: domid=4
[2013-06-26 13:24:30 2721] ERROR (XendDomainInfo:3085) XendDomainInfo.destroy: domain destruction failed.
Traceback (most recent call last):
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendDomainInfo.py", line 3078, in destroy
    xc.domain_pause(self.domid)
Error: (3, 'No such process')
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:2406) No device model
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:2408) Releasing devices
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:2414) Removing vif/0
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:1276) XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:2414) Removing vkbd/0
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:1276) XendDomainInfo.destroyDevice: deviceClass = vkbd, device = vkbd/0
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:2414) Removing console/0
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:1276) XendDomainInfo.destroyDevice: deviceClass = console, device = console/0
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:2414) Removing vbd/51712
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:1276) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/51712
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:2414) Removing vbd/51728
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:1276) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/51728
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:2414) Removing vfb/0
[2013-06-26 13:24:30 2721] DEBUG (XendDomainInfo:1276) XendDomainInfo.destroyDevice: deviceClass = vfb, device = vfb/0
[2013-06-26 13:24:30 2721] INFO (XendDomain:1126) Domain one-33 (c80f42e8-c47e-8f96-a26c-0f98b966167b) deleted.
[2013-06-26 13:24:30 2721] ERROR (XendCheckpoint:357) /usr/lib/xen-4.1/bin/xc_restore 18 4 1 2 0 0 0 0 failed
Traceback (most recent call last):
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendCheckpoint.py", line 309, in restore
    forkHelper(cmd, fd, handler.handler, True)
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendCheckpoint.py", line 411, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen-4.1/bin/xc_restore 18 4 1 2 0 0 0 0 failed
[2013-06-26 13:24:30 2721] ERROR (XendDomain:1194) Restore failed
Traceback (most recent call last):
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendDomain.py", line 1178, in domain_restore_fd
    dominfo = XendCheckpoint.restore(self, fd, paused=paused, relocating=relocating)
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendCheckpoint.py", line 358, in restore
    raise exn
XendError: /usr/lib/xen-4.1/bin/xc_restore 18 4 1 2 0 0 0 0 failed


The shared storage is MooseFS, mounted on both hosts, and permissions are OK, but live migration still fails. Cold migration doesn't work either.. I've posted my logs to the xen-users list, but got no response :(
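
For reference, a minimal sketch of the kind of check that can be run on each host to confirm the datastore really is mounted and writable there (the path is just the datastore root from the logs; adjust for your setup):

#!/usr/bin/env python
# Rough sanity check: is the shared datastore visible and writable on
# this host? Run it on both hosts as the oneadmin user. The path below
# is only an example taken from the logs in this thread.
import os
import sys

DATASTORE = "/var/lib/one/datastores/0"

if not os.path.isdir(DATASTORE):
    sys.exit("%s is not a directory - datastore not mounted?" % DATASTORE)
if not os.access(DATASTORE, os.R_OK | os.W_OK | os.X_OK):
    sys.exit("no read/write access to %s - check permissions" % DATASTORE)

print("%s looks mounted and writable" % DATASTORE)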

Help!
J

On 06/24/2013 12:52 PM, Javier Fontan wrote:
I cannot find a reason why cold migration is not working.

Yes, live migration only works with shared storage.

On Thu, Jun 13, 2013 at 11:03 AM, Jacek Jarosiewicz
<[email protected]> wrote:
both hosts are exactly the same software-wise (same versions of OS, same
distributions, same versions of opennebula, same versions of xen).

processors are different though, one host has Intel Xeon E5430, and the
other has Intel Core i5 760.

so live migration can be done only with shared storage?

J


On 06/13/2013 10:14 AM, Javier Fontan wrote:

In live migration nobody copies the image; it needs to reside in a
shared filesystem mounted on both hosts.

The cold migration problem is a bit trickier, as it suspends the VM,
copies everything, and starts it again on the new host. Can you check
that both hosts have the exact same version of Xen? Check also that
the processors are the same.
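
A rough Python 2 sketch of what to compare on each host, for example (it assumes the xm toolstack is installed and a Linux /proc/cpuinfo; purely illustrative):

#!/usr/bin/env python
# Print the Xen version and CPU model on this host so the output can be
# diffed between the two hosts. Assumes "xm info" is available.
import subprocess

def xen_version():
    out = subprocess.Popen(["xm", "info"], stdout=subprocess.PIPE).communicate()[0]
    info = {}
    for line in out.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            info[key.strip()] = value.strip()
    # xm info reports the version as xen_major / xen_minor / xen_extra
    return "%s.%s%s" % (info["xen_major"], info["xen_minor"], info["xen_extra"])

def cpu_model():
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("model name"):
                return line.split(":", 1)[1].strip()
    return "unknown"

print("xen version: %s" % xen_version())
print("cpu model  : %s" % cpu_model())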


On Thu, Jun 13, 2013 at 9:17 AM, Jacek Jarosiewicz <[email protected]>
wrote:

Hi,

No, it's not a persistent disk. It's just a regular OS image.
Yes - it doesn't get copied to the other nebula host. But it seems like it doesn't even try to copy the image. The live migration error shows almost immediately, and the VM keeps running on the original host.

I'm not entirely sure if it's nebula's job to copy the image, or if it's Xen's job..?

And the other - cold migration - doesn't work either.. :(
It copies the image and the checkpoint file to the other host, but then when it tries to boot the VM I get the error below..

Cheers,
J


On 12.06.2013 18:29, Javier Fontan wrote:


It looks like it cannot find an image file:

VmError: Device 51712 (vbd) could not be connected.
/var/lib/one//datastores/0/28/disk.0 does not exist.

Is that image a persistent disk? If so, is it located in a
shared datastore that is not mounted on that host?
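
For example, a quick check that could be run on the destination host (the path is copied from the error above; only a sketch):

#!/usr/bin/env python
# Does the destination host actually see the disk image the restore is
# looking for? The path is taken from the VmError above.
import os

disk = "/var/lib/one//datastores/0/28/disk.0"

print("exists:   %s" % os.path.exists(disk))
print("readable: %s" % os.access(disk, os.R_OK))
if os.path.islink(disk):
    print("symlink -> %s" % os.path.realpath(disk))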

Cheers

On Wed, Jun 12, 2013 at 3:13 PM, Jacek Jarosiewicz
<[email protected]>
wrote:


Hi,

I have a problem with migrating VMs between hosts. Both cold and live
migration.

Cold migration log is:
Wed Jun 12 12:32:24 2013 [LCM][I]: New VM state is RUNNING
Wed Jun 12 12:32:41 2013 [VMM][I]: ExitCode: 0
Wed Jun 12 12:39:56 2013 [LCM][I]: New VM state is SAVE_MIGRATE
Wed Jun 12 12:40:40 2013 [VMM][I]: ExitCode: 0
Wed Jun 12 12:40:40 2013 [VMM][I]: Successfully execute virtualization driver operation: save.
Wed Jun 12 12:40:40 2013 [VMM][I]: ExitCode: 0
Wed Jun 12 12:40:40 2013 [VMM][I]: Successfully execute network driver operation: clean.
Wed Jun 12 12:40:40 2013 [LCM][I]: New VM state is PROLOG_MIGRATE
Wed Jun 12 12:40:40 2013 [TM][I]: ExitCode: 0
Wed Jun 12 12:41:18 2013 [LCM][E]: monitor_done_action, VM in a wrong state
Wed Jun 12 12:46:29 2013 [LCM][E]: monitor_done_action, VM in a wrong state
Wed Jun 12 12:51:40 2013 [LCM][E]: monitor_done_action, VM in a wrong state
Wed Jun 12 12:56:09 2013 [TM][I]: mv: Moving nebula1:/var/lib/one/datastores/0/29 to nebula0:/var/lib/one/datastores/0/29
Wed Jun 12 12:56:09 2013 [TM][I]: ExitCode: 0
Wed Jun 12 12:56:09 2013 [LCM][I]: New VM state is BOOT
Wed Jun 12 12:56:09 2013 [VMM][I]: ExitCode: 0
Wed Jun 12 12:56:09 2013 [VMM][I]: Successfully execute network driver operation: pre.
Wed Jun 12 12:56:32 2013 [VMM][I]: Command execution fail: /var/tmp/one/vmm/xen4/restore /var/lib/one//datastores/0/29/checkpoint nebula0 29 nebula0
Wed Jun 12 12:56:32 2013 [VMM][E]: restore: Command "sudo /usr/sbin/xm restore /var/lib/one//datastores/0/29/checkpoint" failed: Error: /usr/lib/xen-4.1/bin/xc_restore 23 12 1 2 0 0 0 0 failed
Wed Jun 12 12:56:32 2013 [VMM][E]: Could not restore from /var/lib/one//datastores/0/29/checkpoint
Wed Jun 12 12:56:32 2013 [VMM][I]: ExitCode: 1
Wed Jun 12 12:56:32 2013 [VMM][I]: Failed to execute virtualization driver operation: restore.
Wed Jun 12 12:56:32 2013 [VMM][E]: Error restoring VM: Could not restore from /var/lib/one//datastores/0/29/checkpoint
Wed Jun 12 12:56:33 2013 [DiM][I]: New VM state is FAILED

and in xend.log I see:

[2013-06-12 12:56:32 24698] ERROR (XendCheckpoint:357) /usr/lib/xen-4.1/bin/xc_restore 23 12 1 2 0 0 0 0 failed
Traceback (most recent call last):
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendCheckpoint.py", line 309, in restore
    forkHelper(cmd, fd, handler.handler, True)
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendCheckpoint.py", line 411, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen-4.1/bin/xc_restore 23 12 1 2 0 0 0 0 failed
[2013-06-12 12:56:32 24698] ERROR (XendDomain:1194) Restore failed
Traceback (most recent call last):
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendDomain.py", line 1178, in domain_restore_fd
    dominfo = XendCheckpoint.restore(self, fd, paused=paused, relocating=relocating)
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendCheckpoint.py", line 358, in restore
    raise exn
XendError: /usr/lib/xen-4.1/bin/xc_restore 23 12 1 2 0 0 0 0 failed


..and with live migration I see:

Wed Jun 12 12:27:16 2013 [LCM][I]: New VM state is RUNNING
Wed Jun 12 12:27:32 2013 [VMM][I]: ExitCode: 0
Wed Jun 12 13:34:26 2013 [LCM][I]: New VM state is MIGRATE
Wed Jun 12 13:34:26 2013 [VMM][I]: ExitCode: 0
Wed Jun 12 13:34:26 2013 [VMM][I]: Successfully execute transfer manager driver operation: tm_premigrate.
Wed Jun 12 13:34:26 2013 [VMM][I]: ExitCode: 0
Wed Jun 12 13:34:26 2013 [VMM][I]: Successfully execute network driver operation: pre.
Wed Jun 12 13:37:34 2013 [VMM][I]: ExitCode: 0
Wed Jun 12 13:37:34 2013 [VMM][I]: Successfully execute virtualization driver operation: migrate.
Wed Jun 12 13:37:34 2013 [VMM][I]: ExitCode: 0
Wed Jun 12 13:37:34 2013 [VMM][I]: Successfully execute network driver operation: clean.
Wed Jun 12 13:37:34 2013 [VMM][I]: ExitCode: 0
Wed Jun 12 13:37:34 2013 [VMM][I]: Successfully execute network driver operation: post.
Wed Jun 12 13:37:34 2013 [VMM][I]: ExitCode: 0
Wed Jun 12 13:37:34 2013 [VMM][I]: Successfully execute transfer manager driver operation: tm_postmigrate.
Wed Jun 12 13:37:35 2013 [LCM][I]: New VM state is RUNNING

and in xend.log:

[2013-06-12 13:37:39 9651] ERROR (XendCheckpoint:357) Device 51712 (vbd) could not be connected. /var/lib/one//datastores/0/28/disk.0 does not exist.
Traceback (most recent call last):
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendCheckpoint.py", line 346, in restore
    dominfo.waitForDevices() # Wait for backends to set up
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendDomainInfo.py", line 1237, in waitForDevices
    self.getDeviceController(devclass).waitForDevices()
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/server/DevController.py", line 140, in waitForDevices
    return map(self.waitForDevice, self.deviceIDs())
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/server/DevController.py", line 165, in waitForDevice
    "%s" % (devid, self.deviceClass, err))
VmError: Device 51712 (vbd) could not be connected. /var/lib/one//datastores/0/28/disk.0 does not exist.
[2013-06-12 13:37:39 9651] ERROR (XendDomain:1194) Restore failed
Traceback (most recent call last):
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendDomain.py", line 1178, in domain_restore_fd
    dominfo = XendCheckpoint.restore(self, fd, paused=paused, relocating=relocating)
  File "/usr/lib/xen-4.1/bin/../lib/python/xen/xend/XendCheckpoint.py", line 358, in restore
    raise exn
VmError: Device 51712 (vbd) could not be connected. /var/lib/one//datastores/0/28/disk.0 does not exist.

any help would be appreciated..

Cheers,
J

--
Jacek Jarosiewicz
_______________________________________________
Users mailing list
[email protected]
http://lists.opennebula.org/listinfo.cgi/users-opennebula.org