Re: [one-users] Automatic recover from hardware failure
Hi,

I experienced this issue in my production environment. The OpenNebula network settings were reset (the network was down for a few minutes), so ONE thought that all the nodes were offline or in error. Because we had set “redeploy” for hosts in the error state, ONE initiated the redeployment. The moment ONE’s network came back up, this caused intermittent network issues and read-only mode on the VMs.

I hope this will be addressed in the new solution as well.

--
Regards,
Anandharaj

From: Users [mailto:users-boun...@lists.opennebula.org] On Behalf Of Robert Foote
Sent: Monday, September 01, 2014 11:50 PM
To: 'Carlos Martín Sánchez'; 'Sander Klein'
Cc: 'users'
Subject: Re: [one-users] Automatic recover from hardware failure

I hope this particular feature is coming soon, as this is a standard ‘cloud’ feature for High Availability, and automatic failover.

Robert Foote
bpsNode
www.bpsnode.com

From: Users [mailto:users-boun...@lists.opennebula.org] On Behalf Of Carlos Martín Sánchez
Sent: Monday, September 01, 2014 10:46 AM
To: Sander Klein
Cc: users
Subject: Re: [one-users] Automatic recover from hardware failure

Hi,

On Thu, Aug 14, 2014 at 8:36 PM, Sander Klein roe...@roedie.nl wrote:

>> The delete-recreate action will create a new VM from the original template. This means that the VM will have a new ID, IP, and clean disks. We are working on a new mechanism for hosts with shared storage, to migrate the failed VM to a new host keeping the current IPs, disk state, etc.
>
> Is there an ETA on this new mechanism? I just ran into the same problem today in my test setup.

I can't say when it will be ready. It's not a patch that you can apply, the changes are made in the core and the scheduler, so you will need to wait for the next OpenNebula release.

Regards.
--
Carlos Martín, MSc
Project Engineer
OpenNebula - Flexible Enterprise Cloud Made Simple
www.OpenNebula.org | cmar...@opennebula.org | @OpenNebula

___
Users mailing list
Users@lists.opennebula.org
http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
Re: [one-users] Automatic recover from hardware failure
Hi,

On Thu, Aug 14, 2014 at 8:36 PM, Sander Klein roe...@roedie.nl wrote:

>> The delete-recreate action will create a new VM from the original template. This means that the VM will have a new ID, IP, and clean disks. We are working on a new mechanism for hosts with shared storage, to migrate the failed VM to a new host keeping the current IPs, disk state, etc.
>
> Is there an ETA on this new mechanism? I just ran into the same problem today in my test setup.

I can't say when it will be ready. It's not a patch that you can apply, the changes are made in the core and the scheduler, so you will need to wait for the next OpenNebula release.

Regards.
--
Carlos Martín, MSc
Project Engineer
OpenNebula - Flexible Enterprise Cloud Made Simple
www.OpenNebula.org | cmar...@opennebula.org | @OpenNebula
Re: [one-users] Automatic recover from hardware failure
I hope this particular feature is coming soon, as this is a standard ‘cloud’ feature for High Availability, and automatic failover.

Robert Foote
bpsNode
www.bpsnode.com

From: Users [mailto:users-boun...@lists.opennebula.org] On Behalf Of Carlos Martín Sánchez
Sent: Monday, September 01, 2014 10:46 AM
To: Sander Klein
Cc: users
Subject: Re: [one-users] Automatic recover from hardware failure

Hi,

On Thu, Aug 14, 2014 at 8:36 PM, Sander Klein roe...@roedie.nl wrote:

>> The delete-recreate action will create a new VM from the original template. This means that the VM will have a new ID, IP, and clean disks. We are working on a new mechanism for hosts with shared storage, to migrate the failed VM to a new host keeping the current IPs, disk state, etc.
>
> Is there an ETA on this new mechanism? I just ran into the same problem today in my test setup.

I can't say when it will be ready. It's not a patch that you can apply, the changes are made in the core and the scheduler, so you will need to wait for the next OpenNebula release.

Regards.
--
Carlos Martín, MSc
Project Engineer
OpenNebula - Flexible Enterprise Cloud Made Simple
www.OpenNebula.org | cmar...@opennebula.org | @OpenNebula
Re: [one-users] Automatic recover from hardware failure
Sander,

See https://github.com/dignus/one/commit/02a5a65eae48d58edaa414af94802bc74e5b1e24 for the fix you can apply yourself before 4.8.1 is released.

On Thu, Aug 14, 2014 at 8:36 PM, Sander Klein roe...@roedie.nl wrote:

> Hi,
>
> On 14.08.2014 16:20, Carlos Martín Sánchez wrote:
>
>> Hi,
>>
>> On Thu, Aug 14, 2014 at 1:07 PM, Johan Kooijman m...@johankooijman.com wrote:
>>
>>> Hi all,
>>>
>>> I'm almost finished setting up the POC for ONE so far. One last piece I can't really figure out is how to recreate VMs on a different host in case a host fails, like hardware off the grid. How would I do such a thing?
>>
>> You can enable the fault tolerance hook in oned.conf [1]. This hook will perform the onevm delete --recreate action on the VMs of the failed host. To avoid false positives because of network connectivity issues, use the -p flag of the hook.
>>
>> The delete-recreate action will create a new VM from the original template. This means that the VM will have a new ID, IP, and clean disks. We are working on a new mechanism for hosts with shared storage, to migrate the failed VM to a new host keeping the current IPs, disk state, etc.
>
> Is there an ETA on this new mechanism? I just ran into the same problem today in my test setup.
>
> Greets,
> Sander

--
Met vriendelijke groeten / With kind regards,
Johan Kooijman
[one-users] Automatic recover from hardware failure
Hi all,

I'm almost finished setting up the POC for ONE so far. One last piece I can't really figure out is how to recreate VMs on a different host in case a host fails, like hardware off the grid. How would I do such a thing?

--
Met vriendelijke groeten / With kind regards,
Johan Kooijman
Re: [one-users] Automatic recover from hardware failure
Hi,

On Thu, Aug 14, 2014 at 1:07 PM, Johan Kooijman m...@johankooijman.com wrote:

> Hi all,
>
> I'm almost finished setting up the POC for ONE so far. One last piece I can't really figure out is how to recreate VMs on a different host in case a host fails, like hardware off the grid. How would I do such a thing?

You can enable the fault tolerance hook in oned.conf [1]. This hook will perform the onevm delete --recreate action on the VMs of the failed host. To avoid false positives because of network connectivity issues, use the -p flag of the hook.

The delete-recreate action will create a new VM from the original template. This means that the VM will have a new ID, IP, and clean disks. We are working on a new mechanism for hosts with shared storage, to migrate the failed VM to a new host keeping the current IPs, disk state, etc.

Regards

[1] http://docs.opennebula.org/4.8/advanced_administration/high_availability/ftguide.html

--
Carlos Martín, MSc
Project Engineer
OpenNebula - Flexible Enterprise Cloud Made Simple
www.OpenNebula.org | cmar...@opennebula.org | @OpenNebula
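The idea behind waiting before acting on a host error can be sketched in a few lines of Ruby. This is only an illustration of the general approach (re-check the host for a number of monitoring cycles and act only if it never recovers), not the way host_error.rb implements its -p option. The function name `still_failed_after?`, the `probe` callable, and the state strings are all hypothetical stand-ins for a real monitoring query.

```ruby
# Hypothetical sketch: decide whether a host is still failed after waiting
# a number of monitoring cycles. `probe` is any callable returning the
# current host state as a String; it stands in for a real monitoring query.
def still_failed_after?(cycles, interval_seconds, probe)
  cycles.times do
    sleep(interval_seconds)
    # Any non-error state means the earlier failure was a false positive.
    return false unless probe.call == "ERROR"
  end
  true
end

# Simulated host that recovers on the second check (a brief network outage).
states = %w[ERROR MONITORED]
flapping = -> { states.shift || "MONITORED" }

# Simulated host that never comes back.
dead = -> { "ERROR" }

# Interval 0 keeps the demonstration instant; a real check would wait
# roughly one monitoring interval between probes.
recovered_result = still_failed_after?(2, 0, flapping)
dead_result = still_failed_after?(2, 0, dead)
```

With this kind of guard, VMs would be recreated only for the host that stays in error for every check. An outage that lasts longer than the whole waiting window would still look like a real failure.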
Re: [one-users] Automatic recover from hardware failure
Carlos,

Thanks for the reply. When I put that hook into oned, I get this:

oneadmin@admin:~$ /var/lib/one/remotes//hooks/ft/host_error.rb 0 -r -p 2
/var/lib/one/remotes//hooks/ft/host_error.rb:97:in `main': undefined method `each' for #<String:0x0001e02148> (NoMethodError)

Something wrong on my end? Sounds like a coding thingy?

HOST_HOOK = [
    name      = "error",
    on        = "ERROR",
    command   = "ft/host_error.rb",
    arguments = "$HID -r -p 2",
    remote    = "no" ]

ONE 4.8 on Ubuntu 14.04.

On Thu, Aug 14, 2014 at 4:20 PM, Carlos Martín Sánchez cmar...@opennebula.org wrote:

> Hi,
>
> On Thu, Aug 14, 2014 at 1:07 PM, Johan Kooijman m...@johankooijman.com wrote:
>
>> Hi all,
>>
>> I'm almost finished setting up the POC for ONE so far. One last piece I can't really figure out is how to recreate VMs on a different host in case a host fails, like hardware off the grid. How would I do such a thing?
>
> You can enable the fault tolerance hook in oned.conf [1]. This hook will perform the onevm delete --recreate action on the VMs of the failed host. To avoid false positives because of network connectivity issues, use the -p flag of the hook.
>
> The delete-recreate action will create a new VM from the original template. This means that the VM will have a new ID, IP, and clean disks. We are working on a new mechanism for hosts with shared storage, to migrate the failed VM to a new host keeping the current IPs, disk state, etc.
>
> Regards
>
> [1] http://docs.opennebula.org/4.8/advanced_administration/high_availability/ftguide.html
>
> --
> Carlos Martín, MSc
> Project Engineer
> OpenNebula - Flexible Enterprise Cloud Made Simple
> www.OpenNebula.org | cmar...@opennebula.org | @OpenNebula

--
Met vriendelijke groeten / With kind regards,
Johan Kooijman

T +31(0) 6 43 44 45 27
E m...@johankooijman.com
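For readers who hit the same trace: the sketch below shows, in general terms, how this class of error arises in Ruby. It is not the code at line 97 of host_error.rb, which is not quoted in this thread. Since Ruby 1.9, String no longer responds to `each`, so code that iterates over a value that is sometimes a single String and sometimes an Array fails in the String case. The helper name `each_vm_id` and the sample IDs are hypothetical.

```ruby
# Hypothetical helper, not taken from host_error.rb: normalizes a value
# that may be nil, a single String, or an Array before iterating over it.
def each_vm_id(ids)
  # Kernel#Array wraps a lone String in a one-element Array,
  # returns an Array unchanged, and turns nil into [].
  Array(ids).each { |id| yield id }
end

# Calling each directly on a String raises NoMethodError on Ruby >= 1.9.
def string_each_fails?
  "42".each { |id| id }
  false
rescue NoMethodError
  true
end

collected = []
each_vm_id("42") { |id| collected << id }      # single value
each_vm_id(%w[7 8]) { |id| collected << id }   # list of values
each_vm_id(nil) { |id| collected << id }       # nothing to iterate
```

Wrapping the value with `Array(...)` is one common way to make such a loop tolerate all three shapes. Whether that matches the actual fix for this hook is not something this sketch claims.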
Re: [one-users] Automatic recover from hardware failure
Hi Johan,

Yes, it looks like a bug related to the -p option. We'll take a look:

http://dev.opennebula.org/issues/3153

Regards.
--
Carlos Martín, MSc
Project Engineer
OpenNebula - Flexible Enterprise Cloud Made Simple
www.OpenNebula.org | cmar...@opennebula.org | @OpenNebula

On Thu, Aug 14, 2014 at 5:28 PM, Johan Kooijman m...@johankooijman.com wrote:

> Carlos,
>
> Thanks for the reply. When I put that hook into oned, I get this:
>
> oneadmin@admin:~$ /var/lib/one/remotes//hooks/ft/host_error.rb 0 -r -p 2
> /var/lib/one/remotes//hooks/ft/host_error.rb:97:in `main': undefined method `each' for #<String:0x0001e02148> (NoMethodError)
>
> Something wrong on my end? Sounds like a coding thingy?
>
> HOST_HOOK = [
>     name      = "error",
>     on        = "ERROR",
>     command   = "ft/host_error.rb",
>     arguments = "$HID -r -p 2",
>     remote    = "no" ]
>
> ONE 4.8 on Ubuntu 14.04.
>
> On Thu, Aug 14, 2014 at 4:20 PM, Carlos Martín Sánchez cmar...@opennebula.org wrote:
>
>> Hi,
>>
>> On Thu, Aug 14, 2014 at 1:07 PM, Johan Kooijman m...@johankooijman.com wrote:
>>
>>> Hi all,
>>>
>>> I'm almost finished setting up the POC for ONE so far. One last piece I can't really figure out is how to recreate VMs on a different host in case a host fails, like hardware off the grid. How would I do such a thing?
>>
>> You can enable the fault tolerance hook in oned.conf [1]. This hook will perform the onevm delete --recreate action on the VMs of the failed host. To avoid false positives because of network connectivity issues, use the -p flag of the hook.
>>
>> The delete-recreate action will create a new VM from the original template. This means that the VM will have a new ID, IP, and clean disks. We are working on a new mechanism for hosts with shared storage, to migrate the failed VM to a new host keeping the current IPs, disk state, etc.
>>
>> Regards
>>
>> [1] http://docs.opennebula.org/4.8/advanced_administration/high_availability/ftguide.html
>>
>> --
>> Carlos Martín, MSc
>> Project Engineer
>> OpenNebula - Flexible Enterprise Cloud Made Simple
>> www.OpenNebula.org | cmar...@opennebula.org | @OpenNebula
>
> --
> Met vriendelijke groeten / With kind regards,
> Johan Kooijman
>
> T +31(0) 6 43 44 45 27
> E m...@johankooijman.com
Re: [one-users] Automatic recover from hardware failure
Hi,

On 14.08.2014 16:20, Carlos Martín Sánchez wrote:

> Hi,
>
> On Thu, Aug 14, 2014 at 1:07 PM, Johan Kooijman m...@johankooijman.com wrote:
>
>> Hi all,
>>
>> I'm almost finished setting up the POC for ONE so far. One last piece I can't really figure out is how to recreate VMs on a different host in case a host fails, like hardware off the grid. How would I do such a thing?
>
> You can enable the fault tolerance hook in oned.conf [1]. This hook will perform the onevm delete --recreate action on the VMs of the failed host. To avoid false positives because of network connectivity issues, use the -p flag of the hook.
>
> The delete-recreate action will create a new VM from the original template. This means that the VM will have a new ID, IP, and clean disks. We are working on a new mechanism for hosts with shared storage, to migrate the failed VM to a new host keeping the current IPs, disk state, etc.

Is there an ETA on this new mechanism? I just ran into the same problem today in my test setup.

Greets,

Sander