Continuing with the 3.6 nightly builds testing...

While hosted-engine-setup was adding the host to the newly created cluster, VDSM crashed, probably because the gluster engine storage disappeared, as in BZ 1201355 [1].

Facts:
- the engine storage (/rhev/data-center/mnt/...) was unmounted during this process
- another mount of the same volume was still mounted after the VDSM crash (so maybe the problem is not related to gluster)
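
For anyone trying to reproduce this, the mount state described in the facts above can be checked with a few lines of Python. This is only a rough sketch: the function name mounts_under is made up for illustration, and the prefix to search for is whatever mount path applies on your host.

```python
def mounts_under(mounts_text, prefix):
    """Return (source, mount_point) pairs whose mount point starts with prefix.

    mounts_text is expected in /proc/mounts format: one mount per line,
    whitespace-separated, source first and mount point second.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        source, mount_point = fields[0], fields[1]
        if mount_point.startswith(prefix):
            found.append((source, mount_point))
    return found
```

On a host it could be fed with the contents of /proc/mounts, before and after "hosted-engine --connect-storage", to see which mounts of the volume are present at each point.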

After doing a "hosted-engine --connect-storage", the volume is mounted again.
Now, when trying to restart VDSM, the VM fails to start with an "invalid lockspace" error:

Thread-46::ERROR::2015-03-26 19:24:31,843::vm::1237::vm.Vm::(_startUnderlyingVm) vmId=`191045ac-79e4-4ce8-aad7-52cc9af313c5`::The vm start process failed
    Traceback (most recent call last):
      File "/usr/share/vdsm/virt/vm.py", line 1185, in _startUnderlyingVm
        self._run()
      File "/usr/share/vdsm/virt/vm.py", line 2253, in _run
        self._connection.createXML(domxml, flags),
      File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 126, in wrapper
        ret = f(*args, **kwargs)
      File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3427, in createXML
        if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
    libvirtError: Failed to acquire lock: No space left on device
Thread-46::INFO::2015-03-26 19:24:31,844::vm::1709::vm.Vm::(setDownStatus) vmId=`191045ac-79e4-4ce8-aad7-52cc9af313c5`::Changed state to Down: Failed to acquire lock: No space left on device (code=1)
Thread-46::DEBUG::2015-03-26 19:24:31,844::vmchannels::214::vds::(unregister) Delete fileno 60 from listener.
VM Channels Listener::DEBUG::2015-03-26 19:24:32,346::vmchannels::121::vds::(_do_del_channels) fileno 60 was removed from listener.

In sanlock.log we have:

    2015-03-26 19:24:30+0000 7589 [752]: cmd 9 target pid 9559 not found
    2015-03-26 19:24:31+0000 7589 [764]: r7 cmd_acquire 2,8,9559 invalid lockspace found -1 failed 935819904 name 7ba46e75-51af-4648-becc-5a469cb8e9c2

(All 3 lease files are present)
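
In case it helps with triage, here is a small sketch for pulling the failing acquire requests out of sanlock.log. The regular expression is modelled only on the single line quoted above, so the exact message layout is an assumption, as is reading the last number in "2,8,9559" as the pid (it matches the "target pid 9559" line) and the value after "name" as the lockspace name. The helper name invalid_lockspaces is made up.

```python
import re

# Pattern derived from the one sanlock.log line in this report; other
# sanlock versions may format the message differently (assumption).
ACQUIRE_FAILED = re.compile(
    r"cmd_acquire \d+,\d+,(?P<pid>\d+) invalid lockspace "
    r"found (?P<found>-?\d+) failed (?P<failed>\d+) name (?P<name>\S+)"
)


def invalid_lockspaces(log_text):
    """Return (pid, name) for every 'invalid lockspace' acquire failure."""
    results = []
    for line in log_text.splitlines():
        match = ACQUIRE_FAILED.search(line)
        if match:
            results.append((int(match.group("pid")), match.group("name")))
    return results
```

Running it over the whole log would show whether the same name fails on every attempt or whether several lockspaces are affected.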

This problem is similar to BZ 1201355 reported by Sandro [1].

Regarding the hosted-engine VM not being resumed after restarting VDSM, please see [2] and [3] (a duplicate). I confirmed that QEMU does not reopen the file descriptors when resuming a paused VM, which explains those issues.

Now, how can I fix the "invalid lockspace"?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1201355
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1172905
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1058300
_______________________________________________
Devel mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/devel