Hello,
  I found the following problem:
  A vm's status file may be left over in /var/run/libvirt/qemu in some 
situations, such as a host reboot. When a vm's status file is left over, 
persistent but inactive vms can be lost by libvirtd after it is restarted. 
You can reproduce the problem as follows:
  1. Create a vm and start it with the commands: virsh define vm-xml and 
virsh start vm-name.
  2. Stop libvirtd with the command: service libvirtd stop.
  3. Kill the qemu process related to the vm, so that the vm's status file is 
left over.
  4. Start libvirtd.
  After starting the libvirtd service, we find that the vm has been lost by 
libvirtd when running the command "virsh list --all". 
We expect the vm to be shown in the shut off state instead. Is that correct?

The reason for the problem is:
  During libvirtd startup, it first loads the status files of vms under 
/var/run/libvirt/qemu, creates a virDomainObj for each vm and adds it to the 
driver->domains list.
  Then it creates a thread to connect to the related qemu process for each 
virDomainObj in the domains list. Because the qemu process has been killed, 
connecting to qemu fails. When connecting to qemu fails, the connection-thread 
does the following:
  1. Check whether vm->persistent is 1.
  2. If vm->persistent is not 1, qemuDomainRemoveInactive() is called to 
remove the virDomainObj.
  3. The following call sequence then occurs: qemuDomainRemoveInactive() 
-->virDomainObjListRemove()-->virHashRemoveEntry(). Around 
virHashRemoveEntry(), domlist and dom are locked and unlocked in sequence.
  The problem with the above steps is that vm->persistent may already have 
been set to 1 by the libvirtd main-thread by the time the connection-thread 
calls virHashRemoveEntry() to remove the dom. That is, a persistent 
virDomainObj is removed during libvirtd startup.
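
  To make the interleaving concrete, here is a minimal sketch that replays it 
step by step in a single thread. It is not libvirt code: struct toy_dom and 
all toy_* functions are hypothetical stand-ins for virDomainObj, the 
driver->domains list and the two threads, and no real locking is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for virDomainObj plus its membership in the
 * driver->domains list. This is NOT the real libvirt type. */
struct toy_dom {
    int persistent;  /* mirrors vm->persistent */
    bool in_list;    /* true while the object is still in the list */
};

/* Connection-thread, first half: decide whether the domain should be
 * removed. The decision is taken without holding any lock. */
static bool toy_check_transient(const struct toy_dom *dom)
{
    return dom->persistent != 1;
}

/* Main-thread: marks the domain as persistent. */
static void toy_mark_persistent(struct toy_dom *dom)
{
    dom->persistent = 1;
}

/* Connection-thread, second half: act on the decision taken earlier,
 * without looking at dom->persistent again. */
static void toy_remove_if(struct toy_dom *dom, bool decided_to_remove)
{
    if (decided_to_remove)
        dom->in_list = false;
}

/* Replays the problematic ordering and returns true when a domain that
 * is persistent at the end has nevertheless been removed. */
bool toy_replay_race(void)
{
    struct toy_dom dom = { .persistent = 0, .in_list = true };

    assert(dom.in_list);                     /* loaded from status file */
    bool stale = toy_check_transient(&dom);  /* connection-thread checks */
    toy_mark_persistent(&dom);               /* main-thread runs in between */
    toy_remove_if(&dom, stale);              /* connection-thread removes */

    return dom.persistent == 1 && !dom.in_list;
}
```

  With this ordering toy_replay_race() returns true, i.e. the persistent 
domain is gone from the list. In the real daemon the ordering depends on 
thread scheduling, so the loss would not necessarily happen on every start.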

There are two ways to resolve the above problem:
  1. Expand the locking range of virDomainObj and virDomainObjList: lock the 
virDomainObj and virDomainObjList objects in the connection-thread before 
checking vm->persistent.
  2. Check vm->persistent again before calling virHashRemoveEntry().
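
  A rough sketch of the idea behind both options: the check of the persistent 
flag and the removal happen inside the same critical section, so the 
main-thread cannot change the flag in between. This assumes a single pthread 
mutex per object; struct sketch_dom and the sketch_* functions are 
hypothetical, and the real lock ordering between virDomainObjList and 
virDomainObj is not modelled here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for virDomainObj; NOT the real libvirt type. */
struct sketch_dom {
    pthread_mutex_t lock;  /* stands in for the object lock */
    int persistent;        /* mirrors vm->persistent */
    bool in_list;          /* true while the object is in the list */
};

void sketch_dom_init(struct sketch_dom *dom)
{
    pthread_mutex_init(&dom->lock, NULL);
    dom->persistent = 0;
    dom->in_list = true;
}

/* Main-thread: marks the domain persistent while holding the lock. */
void sketch_mark_persistent(struct sketch_dom *dom)
{
    pthread_mutex_lock(&dom->lock);
    dom->persistent = 1;
    pthread_mutex_unlock(&dom->lock);
}

/* Connection-thread: check and remove under one lock, so the flag is
 * evaluated at removal time. Returns true if the domain was removed. */
bool sketch_remove_if_transient(struct sketch_dom *dom)
{
    bool removed = false;

    pthread_mutex_lock(&dom->lock);
    if (dom->persistent != 1 && dom->in_list) {
        dom->in_list = false;
        removed = true;
    }
    pthread_mutex_unlock(&dom->lock);

    return removed;
}

/* Replays the ordering where the main-thread marks the domain as
 * persistent first; returns true if the domain survives. */
bool sketch_replay_fixed(void)
{
    struct sketch_dom dom;
    sketch_dom_init(&dom);

    sketch_mark_persistent(&dom);                    /* main-thread */
    bool removed = sketch_remove_if_transient(&dom); /* connection-thread */
    bool survived = !removed && dom.in_list;

    pthread_mutex_destroy(&dom.lock);
    return survived;
}
```

  A domain that is still transient when sketch_remove_if_transient() runs is 
removed as before; only the case where the flag changed in between behaves 
differently.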

  Do you agree that what is described above is a problem? Which of the two 
ways listed above is more suitable to resolve it, or is there a better idea? 
Any suggestions?

Best Regards,
-WangYufei
