Hi,

During a bulk VM deletion, 3 out of 9 VMs were not actually deleted from the hypervisor host (they are still running). This is evident when comparing 'onevm list' with 'virsh list': one-198, one-201 and one-202 still show up in 'virsh list' but no longer in 'onevm list':

   -bash-4.1$ onevm list | grep myhost
   192 userx oneadmin one-192 runn 5 2G myhost 02 17:56:45
   193 userx oneadmin one-193 runn 5 2G myhost 02 17:42:58
   194 userx oneadmin one-194 runn 1 2G myhost 00 20:17:20


   [root@myhost ~]# virsh list
     Id    Name                           State
   ----------------------------------------------------
     5     one-192                        running
     6     one-193                        running
     7     one-194                        running
     11    one-198                        running
     14    one-201                        running
     15    one-202                        running
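The comparison above can be scripted. The following is only a minimal sketch, assuming the column layouts shown in the two listings (VM name in column 4 and host in column 8 of 'onevm list', domain name in column 2 of 'virsh list'); the function names are made up for illustration and are not part of OpenNebula or libvirt.

```shell
#!/bin/bash
# Sketch: list libvirt domains on a host that OpenNebula no longer knows about.
# Field positions are assumptions taken from the listings in this mail.

# Read `onevm list` output on stdin; print the VM names (column 4)
# of all VMs placed on the given host (column 8), sorted.
one_names() {
    awk -v host="$1" '$8 == host { print $4 }' | sort
}

# Read `virsh list` output on stdin; skip the two header lines and
# print the domain names (column 2), sorted.
virsh_names() {
    awk 'NR > 2 && NF >= 3 { print $2 }' | sort
}

# Print the lines that appear only in the second sorted file,
# i.e. domains still running that are missing from the first list.
orphans() {
    comm -13 "$1" "$2"
}
```

Possible usage, assuming the functions are sourced on the front-end: run `onevm list | one_names myhost > one.txt`, then `ssh root@myhost virsh list | virsh_names > virsh.txt`, then `orphans one.txt virsh.txt`.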

I used Sunstone: from the 'Virtual Machines' tab I selected the 9 VMs and pressed the 'Delete' button (top right).

Looking at the logs, we see that the command that was supposed to shut down the VM failed, but the VMM ignored the failure:

   Fri Feb 22 10:27:06 2013 [VMM][D]: Monitor Information:
        CPU   : 3
        Memory: 2097152
        Net_TX: 215229
        Net_RX: 5664872
   Fri Feb 22 10:35:34 2013 [DiM][I]: New VM state is DONE
   Fri Feb 22 10:35:34 2013 [VMM][W]: Ignored: LOG I 201 Driver command for 201 cancelled
   *Fri Feb 22 10:35:35 2013 [VMM][W]: Ignored: LOG I 201 Command execution fail: /var/tmp/one/vmm/kvm/cancel one-201 myhost 201 myhost*
   Fri Feb 22 10:35:35 2013 [VMM][W]: Ignored: LOG I 201 ssh_exchange_identification: Connection closed by remote host
   Fri Feb 22 10:35:35 2013 [VMM][W]: Ignored: LOG I 201 ExitSSHCode: 255
   Fri Feb 22 10:35:35 2013 [VMM][W]: Ignored: LOG E 201 Error connecting to myhost
   Fri Feb 22 10:35:35 2013 [VMM][W]: Ignored: LOG I 201 Failed to execute virtualization driver operation: cancel.
   Fri Feb 22 10:35:35 2013 [VMM][W]: Ignored: CANCEL FAILURE 201 Error connecting to myhost
   Fri Feb 22 10:35:35 2013 [TM][W]: Ignored: LOG I 201 tm_delete.sh: HK Deleting myhost /var/lib/one/local/201/images
   Fri Feb 22 10:35:35 2013 [TM][W]: Ignored: LOG I 201 tm_delete.sh: Executed "ssh myhost rm -rf /var/lib/one/local/201/images".
   Fri Feb 22 10:35:35 2013 [TM][W]: Ignored: LOG I 201 ExitCode: 0
   Fri Feb 22 10:35:35 2013 [TM][W]: Ignored: TRANSFER SUCCESS 201 -


The image deletion (tm_delete.sh), on the other hand, actually succeeded:

   [root@myhost ~]# ll /var/lib/one/local/201
   total 0

This looks like a bug in the VMM component; it should not be ignoring failures...

IMHO the proper ONE behavior in this case would be to treat the 'delete' operation as failed and therefore not remove the image. It could then check the actual VM status or leave the VM in an ERROR state.
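To illustrate the behavior I have in mind, here is a minimal sketch. It is not how ONE 3.2 works and not its driver API; 'guarded_delete' and the two commands passed to it are hypothetical placeholders for the cancel and image-removal steps.

```shell
#!/bin/bash
# Sketch of the proposed ordering: only remove the images if the cancel
# step succeeded. All names here are hypothetical, not OpenNebula code.

# $1: command that cancels the VM, $2: command that removes its images,
# $3: VM id passed to both commands.
guarded_delete() {
    local cancel=$1 cleanup=$2 vm=$3
    local rc=0

    # Capture the exit status of the cancel step instead of ignoring it.
    "$cancel" "$vm" || rc=$?

    if [ "$rc" -ne 0 ]; then
        # Cancel failed: report it and leave the images untouched so the
        # VM can be checked or left in an ERROR state.
        echo "cancel failed for VM $vm (exit $rc); images left in place" >&2
        return "$rc"
    fi

    # Cancel succeeded: now it is safe to remove the images.
    "$cleanup" "$vm"
}
```

With this ordering, the failed ssh connection in the log above (exit code 255) would stop the sequence before the 'rm -rf' step.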

Is there any way to prevent the VMM from ignoring errors in the scripts it calls? I am using ONE 3.2.

Thanks,

--
Gerard Bernabeu
FermiCloud and FermiGrid Services at Fermilab
Phone (+1) 630-840-6509

_______________________________________________
Users mailing list
Users@lists.opennebula.org
http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
