Re: [ovirt-users] [Users] HA
Hi All,

Any news about this? A VDSM hook or anything? Thanks!

Kind regards

2014-04-09 9:37 GMT+02:00 Omer Frenkel ofren...@redhat.com:

- Original Message -
From: Koen Vanoppen vanoppen.k...@gmail.com
To: users@ovirt.org
Sent: Tuesday, April 8, 2014 3:41:02 PM
Subject: Re: [Users] HA

Or in other words, the SPM and the VM should move almost immediately after the storage connections on the hypervisor are gone. I know, maybe I'm asking too much, but we would be very happy :-) :-).

So, a sketch: two hosts, Mercury1 (SPM) and Mercury2. Mercury1 loses both fibre connections, goes non-operational, and the VM goes into paused state and stays that way until I manually reboot the host so it fences. What I would like is that when Mercury1 loses both fibre connections, it is fenced immediately so the VMs are moved almost instantly as well... If this is possible... :-)

Kind regards and thanks for all the help!

Michal, is there a vdsm hook for a VM moving to paused? If so, you could send KILL to it, and the engine will identify that the VM was killed and is HA, so it will be restarted. There is no need to reboot the host; it will stay non-operational until storage is fixed.

2014-04-08 14:26 GMT+02:00 Koen Vanoppen vanoppen.k...@gmail.com:

Ok, thanks already for all the help. I adapted some things for a quicker response:

engine-config --get FenceQuietTimeBetweenOperationsInSec --> 180
engine-config --set FenceQuietTimeBetweenOperationsInSec=60
engine-config --get StorageDomainFalureTimeoutInMinutes --> 180
engine-config --set StorageDomainFalureTimeoutInMinutes=1
engine-config --get SpmCommandFailOverRetries --> 5
engine-config --set SpmCommandFailOverRetries
engine-config --get SPMFailOverAttempts --> 3
engine-config --set SPMFailOverAttempts=1
engine-config --get NumberOfFailedRunsOnVds --> 3
engine-config --set NumberOfFailedRunsOnVds=1
engine-config --get vdsTimeout --> 180
engine-config --set vdsTimeout=30
engine-config --get VDSAttemptsToResetCount --> 2
engine-config --set VDSAttemptsToResetCount=1
engine-config --get TimeoutToResetVdsInSeconds --> 60
engine-config --set TimeoutToResetVdsInSeconds=30

The result is that when the VM is not running on the SPM, it migrates before going into paused state. But when we tried it with the VM running on the SPM, it goes into paused state (for safety reasons, I know ;-) ) and stays there until the host gets MANUALLY fenced by rebooting it. So now my question is: how can I make the hypervisor fence (so it reboots, so the VM is moved) more quickly?

Kind regards,
Koen

2014-04-04 16:28 GMT+02:00 Koen Vanoppen vanoppen.k...@gmail.com:

Yes, that's true. But I was driving... So I'll just forward it then :-). I have already adjusted the timeout. It was set to 5 minutes before it would time out. It is now set to 2 minutes.

On Apr 4, 2014 4:14 PM, David Van Zeebroeck david.van.zeebro...@brusselsairport.be wrote:

I have them too. But normally the fencing should have worked, as I read it. So something went wrong somewhere, by the looks of it.

From: Koen Vanoppen [mailto: vanoppen.k...@gmail.com ]
Sent: Friday, 4 April 2014 16:07
To: David Van Zeebroeck
Subject: Fwd: Re: [Users] HA

David Van Zeebroeck
Product Manager Unix Infrastructure
Information Communication Technology
Brussels Airport Company
T +32 (0)2 753 66 24
M +32 (0)497 02 17 31
david.van.zeebro...@brusselsairport.be
www.brusselsairport.be

-- Forwarded message --
From: Michal Skrivanek michal.skriva...@redhat.com
Date: Apr 4, 2014 3:39 PM
Subject: Re: [Users] HA
To: Koen Vanoppen vanoppen.k...@gmail.com
Cc: ovirt-users Users users@ovirt.org

On 4 Apr 2014, at 15:14, Sander Grendelman wrote:

Do you have power management configured? Was the failed host fenced/rebooted?

On Fri, Apr 4, 2014 at 2:21 PM, Koen Vanoppen vanoppen.k...@gmail.com wrote:

So... Is a fully automatic migration of the VM to another hypervisor possible in case the storage connection fails? How can we make this happen? Because for the moment, when we tested the situation, they stayed in paused state.
(Test situation:
* Unplug the 2 fibre cables from the hypervisor
* VMs go into paused state
* VMs stayed in paused state until the failure was solved

as said before, it's not safe, hence we (try to) not migrate them. They only get paused when they actually access the storage, which may not always be the case. I.e. the storage connection is severed, the host is deemed NonOperational and VMs are getting migrated from it, then some of them will succeed if they didn't access that bad storage … the paused VMs will remain (mostly, it can still
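
The get/set pairs quoted in this message can be kept in one table so the previous values are recorded next to the new ones. What follows is only a minimal sketch in Python: the helper name `build_commands` and the `TUNING` table are hypothetical, the values are the ones reported in the thread, and nothing is executed (the lines are only printed). `SpmCommandFailOverRetries` is left out because the message gives no new value for it.

```python
# Hypothetical table: key -> (value reported by --get, value passed to --set).
# The numbers are copied from the message above, not from any documentation.
TUNING = {
    "FenceQuietTimeBetweenOperationsInSec": (180, 60),
    "StorageDomainFalureTimeoutInMinutes": (180, 1),
    "SPMFailOverAttempts": (3, 1),
    "NumberOfFailedRunsOnVds": (3, 1),
    "vdsTimeout": (180, 30),
    "VDSAttemptsToResetCount": (2, 1),
    "TimeoutToResetVdsInSeconds": (60, 30),
}


def build_commands(tuning):
    """Return the engine-config lines for each key; does not run them."""
    lines = []
    for key, (old, new) in tuning.items():
        # Record the old value as a shell comment so a rollback is easy.
        lines.append(f"engine-config --get {key}  # reported {old}")
        lines.append(f"engine-config --set {key}={new}")
    return lines


if __name__ == "__main__":
    print("\n".join(build_commands(TUNING)))
```

Keeping the old value beside each key makes it straightforward to revert a setting if the shorter timeouts turn out to cause spurious fencing.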
Re: [ovirt-users] [Users] HA
On 11 Apr 2014, at 09:00, Koen Vanoppen wrote:

Hi All, Any news about this? A VDSM hook or anything? Thanks! Kind regards

2014-04-09 9:37 GMT+02:00 Omer Frenkel ofren...@redhat.com:

- Original Message -
From: Koen Vanoppen vanoppen.k...@gmail.com
To: users@ovirt.org
Sent: Tuesday, April 8, 2014 3:41:02 PM
Subject: Re: [Users] HA

Or in other words, the SPM and the VM should move almost immediately after the storage connections on the hypervisor are gone. I know, maybe I'm asking too much, but we would be very happy :-) :-).

So, a sketch: two hosts, Mercury1 (SPM) and Mercury2. Mercury1 loses both fibre connections, goes non-operational, and the VM goes into paused state and stays that way until I manually reboot the host so it fences. What I would like is that when Mercury1 loses both fibre connections, it is fenced immediately so the VMs are moved almost instantly as well... If this is possible... :-)

Kind regards and thanks for all the help!

Michal, is there a vdsm hook for a VM moving to paused? If so, you could send KILL to it, and the engine will identify that the VM was killed and is HA, so it will be restarted. There is no need to reboot the host; it will stay non-operational until storage is fixed.

You have to differentiate: if only the VMs are paused, then yes, you can do anything (you can also change the error reporting policy so that the VM is not paused). But if the host becomes non-operational, then it simply doesn't work; vdsm got stuck somewhere (often in getting block device stats). A proper power management config should fence it.

Thanks,
michal

2014-04-08 14:26 GMT+02:00 Koen Vanoppen vanoppen.k...@gmail.com:

Ok, thanks already for all the help.
Re: [ovirt-users] compatibility relationship between datacenter, ovirt and cluster
You should look at the feature page. If you do not fully upgrade cpu/data center, you simply continue to work with 3.3 features.

On 04/10/2014 08:50 PM, Tamer Lima wrote:

Hi,

Yesterday my oVirt was 3.3, and my data center and cluster (compatibility version) were aligned with oVirt 3.3. Today my oVirt is 3.4, and my data center and cluster (compatibility version) remain at 3.3 (with the option to change to 3.4 enabled).

Browsing the oVirt admin page I see 2 occurrences of the oVirt version:
datacenter tab = 3.3
cluster tab = 3.3

I would like to understand what all these versions mean, since the same version number is used for a lot of important things, and how my oVirt works/behaves when the versions differ.

All my questions together:
What does a data center at version 3.3 (or lower) mean when oVirt is 3.4?
What does a cluster at version 3.3 mean when oVirt is 3.4?
What does changing the compatibility version of the data center mean?
What does changing the compatibility version of the cluster mean?

Thanks

-- Dafna Ron

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Re-add a node
Please add the host-deploy log from /var/log/ovirt-engine/host-deploy/

On 04/11/2014 12:11 AM, James James wrote:

Hi,

I don't know if the subject is explicit enough, but I have a problem and I hope to find some help here. I had two hosts in my cluster (node1 and node2). I had to reinstall node1 due to some network problems. In the engine, node1 now appears as Not Responsive and I can't remove it from the engine UI. Now node1 is back and I want to add it to the cluster, but I can't. I've got this error (vdsm.log):

BindingXMLRPC::ERROR::2014-04-11 01:04:42,622::BindingXMLRPC::81::vds::(threaded_start) xml-rpc handler exception
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 77, in threaded_start
    self.server.handle_request()
  File "/usr/lib64/python2.6/SocketServer.py", line 278, in handle_request
    self._handle_request_noblock()
  File "/usr/lib64/python2.6/SocketServer.py", line 288, in _handle_request_noblock
    request, client_address = self.get_request()
  File "/usr/lib64/python2.6/SocketServer.py", line 456, in get_request
    return self.socket.accept()
  File "/usr/lib64/python2.6/site-packages/vdsm/SecureXMLRPCServer.py", line 136, in accept
    raise SSL.SSLError("%s, client %s" % (e, address[0]))
SSLError: sslv3 alert certificate unknown, client 192.168.1.100

192.168.1.100 is the engine's address.

Can somebody help me?

-- Dafna Ron
Re: [ovirt-users] compatibility relationship between datacenter, ovirt and cluster
On Fri, Apr 11, 2014 at 10:09 AM, Dafna Ron d...@redhat.com wrote:

you should look at the feature page. if you do not fully upgrade cpu/data center you simply continue to work with 3.3 features

For oVirt I only found the 3.0 features page (adapted from RHEV itself) here:
http://www.ovirt.org/OVirt_3.0_Feature_Guide
It is quite generic.

I also found release notes for 3.x, but not yet for 3.4, at least from the page
http://www.ovirt.org/Documentation
Can anyone add this link to it?
http://www.ovirt.org/OVirt_3.4_Release_Notes
I can do it, but I'm not a maintainer, and this is the main page... Let me know if you want me to do it.

You can also follow the RHEV documents, though their versions are not completely aligned with oVirt versions.

To get information about cluster and data center compatibility versions:
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.3/html/Administration_Guide/sect-Post-upgrade_Tasks.html

Regarding the features:
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.3/html/Administration_Guide/sect-Post-upgrade_Tasks.html#Features_Requiring_a_Comparability_Upgrade_to_Red_Hat_Enterprise_Virtualization_3.3

There are similar pages for the RHEV 3.4 beta, but I don't know how consistent they are at this time:
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.4-Beta/html/Administration_Guide/sect-Post-upgrade_Tasks.html
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.4-Beta/html/Administration_Guide/Features_Requiring_a_Compatibility_Upgrade_to_Red_Hat_Enterprise_Virtualization_3.4.html

HIH,
Gianluca
Re: [ovirt-users] Re-add a node
The log contains information about the first node1 installation:
http://pastebin.com/mZSb2wmD

2014-04-11 10:12 GMT+02:00 Dafna Ron d...@redhat.com:

please add the host-deploy log from /var/log/ovirt-engine/host-deploy/
Re: [ovirt-users] Re-add a node
- Original Message -
From: James James jre...@gmail.com
To: d...@redhat.com
Cc: users users@ovirt.org
Sent: Friday, April 11, 2014 11:48:51 AM
Subject: Re: [ovirt-users] Re-add a node

the log contains information about the first node1 installation http://pastebin.com/mZSb2wmD

Is it a different/reinstalled engine trying to communicate with that node?
Re: [ovirt-users] Re-add a node
The engine is the same, but the node (node1) has been reinstalled ...

2014-04-11 10:48 GMT+02:00 James James jre...@gmail.com:

the log contains information about the first node1 installation http://pastebin.com/mZSb2wmD
Re: [ovirt-users] Re-add a node
- Original Message -
From: James James jre...@gmail.com
To: d...@redhat.com
Cc: users users@ovirt.org
Sent: Friday, April 11, 2014 12:04:23 PM
Subject: Re: [ovirt-users] Re-add a node

The engine is the same but the node (node1) has been reinstalled ...

So you need to re-add it to the engine. Delete the old node on the engine side and use the node user interface to add it to the engine.
Re: [ovirt-users] Re-add a node
Can you put the host in maintenance?

On 04/11/2014 10:43 AM, James James wrote:

I can't delete the old node because it is in Non Responsive state. The remove button is still greyed out. In the engine.log I've got this:

2014-04-11 11:40:45,911 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-88) Command GetCapabilitiesVDSCommand(HostName = node1, HostId = 36fb6df3-c2c2-4133-86ac-fe50b99ee2e3, vds=Host[node1]) execution failed. Exception: VDSNetworkException: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
2014-04-11 11:40:48,943 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-12) Command GetCapabilitiesVDSCommand(HostName = node1, HostId = 36fb6df3-c2c2-4133-86ac-fe50b99ee2e3, vds=Host[node1]) execution failed. Exception: VDSNetworkException: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

2014-04-11 11:06 GMT+02:00 Alon Bar-Lev alo...@redhat.com:

- Original Message -
From: James James jre...@gmail.com
To: d...@redhat.com
Cc: users users@ovirt.org
Sent: Friday, April 11, 2014 12:04:23 PM
Subject: Re: [ovirt-users] Re-add a node

The engine is the same but the node (node1) has been reinstalled ...

So you need to re-add it to the engine. Delete the old node on the engine side and use the node user interface to add it to the engine.
-- Dafna Ron
Re: [ovirt-users] [Spice-devel] [Users] 2 virtual monitors for Fedora guest
On Wed, 2014-04-09 at 14:15 +0300, Itamar Heim wrote:
On 04/09/2014 01:57 PM, René Koch wrote:
On 04/09/2014 11:24 AM, René Koch wrote:

Thanks a lot for testing. Too bad that multiple monitors didn't work for you either. I'll test RHEL next - maybe this works better than Fedora...

I just tested CentOS 6.5 with the GNOME desktop, and 2 monitors aren't working there either. I can see 3 vdagent processes running in CentOS...

adding spice-devel

A RHEL 6.5 host doesn't have the bits needed to support multi-monitor with the qxl kms driver. This is planned to be fixed in 6.6. Experimental builds are here:
http://people.redhat.com/ghoffman/bz1075139/

HTH,
Gerd
[ovirt-users] Fwd: [Users] HA
The power management is configured correctly. And as long as the host that loses its storage isn't the SPM, there is no problem. If I can make it work so that, when the VM is paused, it gets switched off and (the HA way) reboots itself, I'm perfectly happy :-).

Kind regards,

-- Forwarded message --
From: Koen Vanoppen vanoppen.k...@gmail.com
Date: 2014-04-11 14:47 GMT+02:00
Subject: Re: [ovirt-users] [Users] HA
To: Michal Skrivanek michal.skriva...@redhat.com
Re: [ovirt-users] Re-add a node
"Confirm host has been rebooted" should release the SPM :)

On 04/11/2014 01:23 PM, James James wrote:

2014-04-11 14:08 GMT+02:00 Dafna Ron d...@redhat.com:

Dafna: James, please reply to the users list as well as to me, so that other people can participate as well :)

James: Oops ... I will do that.

Dafna: Did you try to press "confirm host has been rebooted" (right click)?

James: Yes, but same problem. node1 cannot be put in maintenance mode. node1 is SPM. I will make node1 release the SPM resource.

On 04/11/2014 12:41 PM, James James wrote:

2014-04-11 13:16 GMT+02:00 Dafna Ron d...@redhat.com:

Dafna: What is the error message you get when you try to put the host in maintenance?

James: I have this message:

Error while executing action: Cannot switch Host to Maintenance mode. Host is Storage Pool Manager and is in Non Responsive state.
- If power management is configured, engine will try to fence the host automatically.
- Otherwise, either bring the node back up, or release the SPM resource. To do so, verify that the node is really down by right clicking on the host and confirm that the node was shutdown manually.

Dafna: Are there any running VMs reported?

James: No, there is no VM running on this host.

Dafna: Try to press the "confirm host has been rebooted" button and then see if you can put the host in maintenance. If that fails, select the host; in the general tab you will get the re-install link.

James: I am running oVirt 3.4.0-1. I don't know where the re-install link is; I can't see it.

Dafna: Try to re-install; when the install fails, the host should change status to failed installation.

On 04/11/2014 12:09 PM, James James wrote:

No, I can't.

2014-04-11 12:12 GMT+02:00 Dafna Ron d...@redhat.com:

can you put the host in maintenance?
On 04/11/2014 10:43 AM, James James wrote:
I can't delete the old node because it is in Non Responsive state. The remove button is still blank. In the engine.log I've got this log:
2014-04-11 11:40:45,911 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-88) Command GetCapabilitiesVDSCommand(HostName = node1, HostId = 36fb6df3-c2c2-4133-86ac-fe50b99ee2e3, vds=Host[node1]) execution failed. Exception: VDSNetworkException: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
2014-04-11 11:40:48,943 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-12) Command GetCapabilitiesVDSCommand(HostName = node1, HostId = 36fb6df3-c2c2-4133-86ac-fe50b99ee2e3, vds=Host[node1]) execution failed. Exception: VDSNetworkException: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
2014-04-11 11:06 GMT+02:00 Alon Bar-Lev alo...@redhat.com:
- Original Message - From: James James jre...@gmail.com
Re: [ovirt-users] compatibility relationship between datacenter, ovirt and cluster
On 04/10/2014 10:50 PM, Tamer Lima wrote:
Hi, yesterday my ovirt was 3.3, and my datacenter and cluster (compatibility version) were aligned with ovirt 3.3. Today my ovirt is 3.4, and my datacenter and cluster (compatibility version) remain 3.3 (with the option enabled to change to 3.4). Browsing the ovirt admin page I see 2 occurrences of the ovirt version: datacenter tab = 3.3, cluster tab = 3.3. I would like to understand what all these versions mean, since the same version applies to a lot of important things, and how my ovirt works/behaves using different versions. All my doubts together:
What does a datacenter at version 3.3 (or lower) mean when ovirt is 3.4?
What does a cluster at version 3.3 mean when ovirt is 3.4?
What does changing the compatibility version for the datacenter mean?
What does changing the compatibility version for the cluster mean?

Gianluca replied with links for specific features. I'll reply on the general concept:
- a host can be upgraded to a specific 3.x version. It can be in any 3.y cluster which is <= 3.x. Once the host is upgraded, 3.x host-level features can be used.
- a cluster can be upgraded to 3.x once all active hosts in it are at least 3.x. Once the cluster is upgraded, 3.x cluster-level features can be used.
- a data center can be upgraded to 3.x once all clusters in it are at least 3.x. Once the DC is upgraded, 3.x DC-level features can be used.
I don't think we have it clearly documented per feature whether it is host level, cluster level or DC level. When in doubt, ask...
___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
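The three rules above can be read as a simple "lowest member wins" check. The following is only a sketch of that reading, not engine code; the function names and the tuple representation of versions are assumptions made for illustration.

```python
from typing import Iterable, Tuple

Version = Tuple[int, int]  # e.g. (3, 4) for "3.4"; representation is an assumption


def host_fits_cluster(host_version: Version, cluster_level: Version) -> bool:
    """A host at 3.x may sit in any cluster whose level is <= 3.x."""
    return cluster_level <= host_version


def max_cluster_level(active_host_versions: Iterable[Version]) -> Version:
    """Highest level a cluster can be raised to: the lowest active-host version."""
    versions = list(active_host_versions)
    if not versions:
        # The thread does not say what happens with no active hosts;
        # refusing here is a choice made for this sketch.
        raise ValueError("no active hosts in cluster")
    return min(versions)


def max_datacenter_level(cluster_levels: Iterable[Version]) -> Version:
    """Highest level a data center can be raised to: the lowest cluster level."""
    levels = list(cluster_levels)
    if not levels:
        raise ValueError("no clusters in data center")
    return min(levels)
```

For example, a cluster with one 3.4 host and one 3.3 host stays at 3.3 at most until the second host is upgraded, and a 3.4 host can keep running in that 3.3 cluster.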
Re: [ovirt-users] off topic = newbie with bugzilla, gerrit and ovirt VDSM 4.14
On 04/10/2014 11:04 PM, Tamer Lima wrote:
hi, I have now ovirt 3.4 with vdsm 4.14
danken - the bug you referenced is fixed in 3.3, yet recurs in 3.4?
Well, my question is not exactly how to solve a specific problem. In fact, what I want is to learn how to apply the corrections proposed on bugzilla, gerrit, etc. Or is this only for ovirt internal developers? thanks
we'll be happy if you try. this would be the starting point. if anything is not clear, ask, and we should probably add it there: http://www.ovirt.org/Develop
tamer
On Thu, Apr 10, 2014 at 8:28 AM, Dan Kenigsberg dan...@redhat.com wrote:
On Wed, Apr 09, 2014 at 11:06:44AM -0300, Tamer Lima wrote:
Hi, I have a problem with an automatic yum vdsm upgrade (4.13 to 4.14) on ovirt 3.3: Host x.x.x.x is installed with VDSM version (4.14) and cannot join cluster Default which is compatible with VDSM versions [4.13,4.9,4.11,4.12,4.10]. I am searching for the solution, and one of them (from bz Bug 1083008) points to the gerrit site (http://gerrit.ovirt.org/#/c/23456/). I do not have any experience in solving bugs, nor with bz, gerrit or github. How do I get started in this world and solve the ovirt problem? Another thing: does downgrading or removing the vdsm package remove the installed virtual machines? I know ovirt 3.4 is the new release, but I want to learn how to solve some bugs when a new release has not been launched yet; (of course I am afraid to upgrade to ovirt 3.4 too :) )
Which exact version of Engine do you have? We have a recurrent bug of Engine not honoring Vdsm's supportedENGINEs. (Cf. Bug 1016461 - [vdsm] engine fails to add host with vdsm version 4.13.0) The simplest hack is to add 4.14 to SupportedVDSMVersions in Engine's database. Eli and Yair could give the exact command line for that. I believe that there's work to fix this properly on Engine's side. Dan.
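To make the error message above concrete: the engine appears to compare the host's VDSM major.minor against a configured list. The snippet below is a rough illustration of that kind of check and of why adding 4.14 to the list makes the host acceptable. It is not the engine's actual code, and `can_join_cluster` is a hypothetical name.

```python
def can_join_cluster(host_vdsm_version: str, supported_versions: str) -> bool:
    """Return True if the host's major.minor VDSM version is in the supported list.

    supported_versions is a comma-separated string such as the one quoted in
    the error message; this is a sketch, not the engine implementation.
    """
    major_minor = ".".join(host_vdsm_version.split(".")[:2])
    allowed = {v.strip() for v in supported_versions.split(",")}
    return major_minor in allowed


# List as quoted in the error message from the thread.
before = "4.13,4.9,4.11,4.12,4.10"
# The "simplest hack" from the thread: append 4.14 to the list.
after = before + ",4.14"
```

With the list from the error message a 4.14 host is rejected; with 4.14 appended it passes, which is the effect the hack is after.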
Re: [ovirt-users] vdsm dependency rsyslog?
On 04/10/2014 02:36 PM, Sven Kieske wrote:
On 10.04.2014 12:04, Dan Kenigsberg wrote:
Could you present (here, or on bugzilla) your use case for another syslog service? Which one? Or do you want to turn it off completely?
I'll do both (here and on BZ). Use case: well, to utilize existing log infrastructure which doesn't use rsyslog, perhaps? e.g. syslog-ng. Furthermore: vdsm's logging is configured via /etc/vdsm/logger.conf and uses the python module logging: https://docs.python.org/2/library/logging.html This module is pretty cool; it allows you to redirect your logging basically to wherever you like, including syslog services or files, whatever. So why restrict this builtin feature by adding a hard dependency on a specific logging service? I know it's done because it's built on red hat linux where rsyslog is the default, and I don't want to say rsyslog is bad, it's just not used by everybody. As I understand it, there is work being done to port vdsm and engine to different distros like debian/ubuntu and gentoo. If you really want to port your software, keep it portable. This means in the first place: don't introduce dependencies which aren't absolutely necessary.
wouldn't that just be a dependency change for the package for the other distros? it's not like they will use the rhel spec anyway?
This one isn't really necessary. I can report to you that e.g. syslog-ng handles vdsm logs well. ;) HTH
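Sven's point that the destination is purely a matter of configuration can be shown with the standard library alone. The sketch below uses the same INI layout that `logging.config.fileConfig()` reads; it is not vdsm's real logger.conf, and the handler name `dest`, the formatter name `plain` and the temporary log path are made up for the example.

```python
import io
import logging
import logging.config
import os
import tempfile

# Only the [handler_dest] section names a destination; swapping its class
# and args is enough to send the same records elsewhere.
CONFIG_TEMPLATE = """
[loggers]
keys=root

[handlers]
keys=dest

[formatters]
keys=plain

[logger_root]
level=INFO
handlers=dest

[handler_dest]
class=FileHandler
level=INFO
formatter=plain
args=({path!r},)

[formatter_plain]
format=%(name)s %(levelname)s %(message)s
"""


def configure_logging(path):
    """Load the INI-style config above, pointing the handler at `path`."""
    config = CONFIG_TEMPLATE.format(path=path)
    logging.config.fileConfig(io.StringIO(config), disable_existing_loggers=False)


log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
configure_logging(log_path)
logging.getLogger("demo").info("storage check ok")

with open(log_path) as handle:
    logged = handle.read()
```

Nothing in the calling code changes when the handler section is edited, which is the portability argument made above.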
Re: [ovirt-users] Re-add a node
Nothing new. node1 can't release SPM ... :(
2014-04-11 14:49 GMT+02:00 Dafna Ron d...@redhat.com:
"confirm host has been rebooted" should release the SPM :)
On 04/11/2014 01:23 PM, James James wrote:
2014-04-11 14:08 GMT+02:00 Dafna Ron d...@redhat.com:
James, please reply to the users list as well as to me so that other people can participate as well :)
Oops ... I will do that.
Did you try to press "confirm host has been rebooted" (right click)?
Yes, but same problem. node1 cannot be put in maintenance mode. node1 is SPM. I will make node1 release the SPM resource.
On 04/11/2014 12:41 PM, James James wrote:
2014-04-11 13:16 GMT+02:00 Dafna Ron d...@redhat.com:
What is the error message you get when you try to put the host in maintenance?
I have this message: Error while executing action: Cannot switch Host to Maintenance mode. Host is Storage Pool Manager and is in Non Responsive state. - If power management is configured, engine will try to fence the host automatically. - Otherwise, either bring the node back up, or release the SPM resource. To do so, verify that the node is really down by right clicking on the host and confirm that the node was shutdown manually.
Are there any running VMs reported?
No, there is no VM running on this host.
Try to press the "confirm host has been rebooted" button and then see if you can put the host in maintenance. If that fails, select the host; in the General tab you will get the re-install link.
I am running ovirt 3.4.0-1. I don't know where the re-install link is; I can't see it.
Try to re-install; when the install fails, the host should change status to failed installation.
Re: [ovirt-users] [Users] HA
On 11 Apr 2014, at 14:47, Koen Vanoppen wrote:
The power management is configured correctly. And as long as the host that loses its storage isn't the SPM, there is no problem.
ah, I see
If I can make it work so that, when the VM is paused, it gets switched off and (HA-way) reboots itself, I'm perfectly happy :-).
I'm not entirely sure that the after_vm_pause() hook gets invoked in this case. It was not intended for involuntary pause... but give it a try! :) Otherwise... well, you can always do a periodic query... not very effective though. Thanks, michal
Kind regards,
-- Forwarded message -- From: Koen Vanoppen vanoppen.k...@gmail.com Date: 2014-04-11 14:47 GMT+02:00 Subject: Re: [ovirt-users] [Users] HA To: Michal Skrivanek michal.skriva...@redhat.com
The power management is configured correctly. And as long as the host that loses its storage isn't the SPM, there is no problem. If I can make it work so that, when the VM is paused, it gets switched off and (HA-way) reboots itself, I'm perfectly happy :-). Kind regards,
2014-04-11 9:37 GMT+02:00 Michal Skrivanek michal.skriva...@redhat.com:
On 11 Apr 2014, at 09:00, Koen Vanoppen wrote:
Hi All, any news about this? VDSM hook or anything? Thanx! Kind regards
2014-04-09 9:37 GMT+02:00 Omer Frenkel ofren...@redhat.com:
- Original Message - From: Koen Vanoppen vanoppen.k...@gmail.com To: users@ovirt.org Sent: Tuesday, April 8, 2014 3:41:02 PM Subject: Re: [Users] HA
Or in other words, the SPM and the VM should move almost immediately after the storage connections on the hypervisor are gone. I know, I'm asking too much maybe, but we would be very happy :-) :-). So, a sketch: Mercury1 (SPM), Mercury2. Mercury1 loses both fibre connections -- it goes non-operational and the VM goes into paused state and stays that way until I manually reboot the host so it fences. What I would like is that when Mercury1 loses both fibre connections, it fences immediately so the VMs are also moved almost instantly... If this is possible...
:-) Kind regards and thanks for all the help!
Michal, is there a vdsm hook for vm moved to pause? if so, you could send KILL to it, and engine will identify vm was killed+HA, so it will be restarted, and no need to reboot the host, it will stay in non-operational until storage is fixed.
you have to differentiate - if only the VMs would be paused, yes, you can do anything (also change the err reporting policy to not pause the VM) but if the host becomes non-operational then it simply doesn't work, vdsm got stuck somewhere (often in get blk device stats). proper power management config should fence it. Thanks, michal
2014-04-08 14:26 GMT+02:00 Koen Vanoppen vanoppen.k...@gmail.com :
Ok, thanks already for all the help. I adapted some things for quicker response:
engine-config --get FenceQuietTimeBetweenOperationsInSec --> 180
engine-config --set FenceQuietTimeBetweenOperationsInSec=60
engine-config --get StorageDomainFalureTimeoutInMinutes --> 180
engine-config --set StorageDomainFalureTimeoutInMinutes=1
engine-config --get SpmCommandFailOverRetries --> 5
engine-config --set SpmCommandFailOverRetries
engine-config --get SPMFailOverAttempts --> 3
engine-config --set SPMFailOverAttempts=1
engine-config --get NumberOfFailedRunsOnVds --> 3
engine-config --set NumberOfFailedRunsOnVds=1
engine-config --get vdsTimeout --> 180
engine-config --set vdsTimeout=30
engine-config --get VDSAttemptsToResetCount --> 2
engine-config --set VDSAttemptsToResetCount=1
engine-config --get TimeoutToResetVdsInSeconds --> 60
engine-config --set TimeoutToResetVdsInSeconds=30
Now the result of this is that when the VM is not running on the SPM, it will migrate before going into pause mode. But when we tried it with the VM running on the SPM, it goes into paused mode (for safety reasons, I know ;-) ) and stays there until the host gets MANUALLY fenced by rebooting it. So now my question is... How can I make the hypervisor fence (so it reboots, so the VM is moved) quicker?
Kind regards, Koen
2014-04-04 16:28 GMT+02:00 Koen Vanoppen vanoppen.k...@gmail.com :
Yes, that's true. But I was driving... So I'm just forwarding it then :-). I have already adjusted the timeout. It was set to 5 min before it would time out. It is now set to 2 min.
On Apr 4, 2014 4:14 PM, David Van Zeebroeck david.van.zeebro...@brusselsairport.be wrote:
I have them too. But normally the fencing should have worked, as I read it. So something went wrong somewhere, by the looks of it.
From: Koen Vanoppen [mailto: vanoppen.k...@gmail.com ] Sent: Friday, 4 April 2014 16:07 To: David Van Zeebroeck Subject: Fwd: Re: [Users] HA
David Van Zeebroeck Product Manager Unix Infrastructure Information
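Michal's distinction earlier in this message (paused VM on a healthy host versus a host that has gone non-operational) could drive the "periodic query" he mentions. The function below only sketches that decision; the status strings and the name `recovery_action` are invented for illustration and do not correspond to any vdsm or engine API.

```python
def recovery_action(host_status: str, vm_status: str, vm_is_ha: bool) -> str:
    """Decide what a periodic check might do, following the reasoning in the thread.

    All status strings here are hypothetical placeholders.
    """
    # Host-level failure: vdsm may be stuck, so per-VM actions are unreliable
    # and properly configured power management should fence the host.
    if host_status in ("non_operational", "non_responsive"):
        return "fence_host"
    # Only the VM is paused: killing it lets the engine restart it as an HA VM.
    if vm_status == "paused" and vm_is_ha:
        return "kill_vm"
    return "none"
```

Whether killing the VM actually triggers an HA restart in a given setup is something to verify on a test host first, as the thread itself leaves that open.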
Re: [ovirt-users] Re-add a node
What do you think about: https://raw.github.com/dougsland/misc-rhev/master/engine_force_remove_Host.py
2014-04-11 15:10 GMT+02:00 James James jre...@gmail.com:
Nothing new. node1 can't release SPM ... :(
2014-04-11 14:49 GMT+02:00 Dafna Ron d...@redhat.com:
"confirm host has been rebooted" should release the SPM :)
On 04/11/2014 01:23 PM, James James wrote:
2014-04-11 14:08 GMT+02:00 Dafna Ron d...@redhat.com:
James, please reply to the users list as well as to me so that other people can participate as well :)
Oops ... I will do that.
Did you try to press "confirm host has been rebooted" (right click)?
Yes, but same problem. node1 cannot be put in maintenance mode. node1 is SPM. I will make node1 release the SPM resource.
On 04/11/2014 12:41 PM, James James wrote:
2014-04-11 13:16 GMT+02:00 Dafna Ron d...@redhat.com:
What is the error message you get when you try to put the host in maintenance?
I have this message: Error while executing action: Cannot switch Host to Maintenance mode. Host is Storage Pool Manager and is in Non Responsive state. - If power management is configured, engine will try to fence the host automatically. - Otherwise, either bring the node back up, or release the SPM resource. To do so, verify that the node is really down by right clicking on the host and confirm that the node was shutdown manually.
Are there any running VMs reported?
No, there is no VM running on this host.
Try to press the "confirm host has been rebooted" button and then see if you can put the host in maintenance. If that fails, select the host; in the General tab you will get the re-install link.
I am running ovirt 3.4.0-1. I don't know where the re-install link is; I can't see it.
Try to re-install; when the install fails, the host should change status to failed installation.
Re: [ovirt-users] Re-add a node
On 04/11/2014 04:10 PM, James James wrote:
Nothing new. node1 can't release SPM ... :(
allon/federico - thoughts? confirm host shutdown should release SPM for a non-responsive node
2014-04-11 14:49 GMT+02:00 Dafna Ron d...@redhat.com:
"confirm host has been rebooted" should release the SPM :)
On 04/11/2014 01:23 PM, James James wrote:
2014-04-11 14:08 GMT+02:00 Dafna Ron d...@redhat.com:
James, please reply to the users list as well as to me so that other people can participate as well :)
Oops ... I will do that.
Did you try to press "confirm host has been rebooted" (right click)?
Yes, but same problem. node1 cannot be put in maintenance mode. node1 is SPM. I will make node1 release the SPM resource.
On 04/11/2014 12:41 PM, James James wrote:
2014-04-11 13:16 GMT+02:00 Dafna Ron d...@redhat.com:
What is the error message you get when you try to put the host in maintenance?
I have this message: Error while executing action: Cannot switch Host to Maintenance mode. Host is Storage Pool Manager and is in Non Responsive state. - If power management is configured, engine will try to fence the host automatically. - Otherwise, either bring the node back up, or release the SPM resource. To do so, verify that the node is really down by right clicking on the host and confirm that the node was shutdown manually.
Are there any running VMs reported?
No, there is no VM running on this host.
Try to press the "confirm host has been rebooted" button and then see if you can put the host in maintenance. If that fails, select the host; in the General tab you will get the re-install link.
I am running ovirt 3.4.0-1. I don't know where the re-install link is; I can't see it.
Re: [ovirt-users] Re-add a node
I think there might be a chance it's related to issues we were seeing with the engine cache. James, would you mind testing that? A restart of the ovirt-engine process should clear the cache. Once you restart, log in to the webadmin, and if the host is still SPM and non-responsive, can you try "confirm host has been rebooted" again? If that still fails, can you please attach the engine log? Thanks. Dafna
On 04/11/2014 02:13 PM, Itamar Heim wrote:
On 04/11/2014 04:10 PM, James James wrote:
Nothing new. node1 can't release SPM ... :(
allon/federico - thoughts? confirm host shutdown should release SPM for a non-responsive node
2014-04-11 14:49 GMT+02:00 Dafna Ron d...@redhat.com:
"confirm host has been rebooted" should release the SPM :)
On 04/11/2014 01:23 PM, James James wrote:
2014-04-11 14:08 GMT+02:00 Dafna Ron d...@redhat.com:
James, please reply to the users list as well as to me so that other people can participate as well :)
Oops ... I will do that.
Did you try to press "confirm host has been rebooted" (right click)?
Yes, but same problem. node1 cannot be put in maintenance mode. node1 is SPM. I will make node1 release the SPM resource.
On 04/11/2014 12:41 PM, James James wrote:
2014-04-11 13:16 GMT+02:00 Dafna Ron d...@redhat.com:
What is the error message you get when you try to put the host in maintenance?
I have this message: Error while executing action: Cannot switch Host to Maintenance mode. Host is Storage Pool Manager and is in Non Responsive state. - If power management is configured, engine will try to fence the host automatically. - Otherwise, either bring the node back up, or release the SPM resource. To do so, verify that the node is really down by right clicking on the host and confirm that the node was shutdown manually.
Re: [ovirt-users] vdsm dependency rsyslog?
Well, as I already elaborated (with more detail in the BZ, though; Bug 1083100 - vdsm depends on rsyslog): vdsm crashes if /dev/log is not there. The devs were a little inaccurate and added the dependency on rsyslog, but the real dependency is on /dev/log, which is also satisfied by e.g. syslog-ng. So if someone wants to use syslog-ng on rhel, maybe because it is already used in other infrastructure, you still have to install rsyslogd for no use?
On 11.04.2014 15:10, Itamar Heim wrote:
wouldn't that just be a dependency change for the package for the other distros? it's not like they will use the rhel spec anyway?
-- Regards, Sven Kieske, Systemadministrator, Mittwald CM Service GmbH Co. KG, Königsberger Straße 6, 32339 Espelkamp T: +49-5772-293-100 F: +49-5772-293-333 https://www.mittwald.de Geschäftsführer: Robert Meyer St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen
Re: [ovirt-users] Re-add a node
Can you attach the engine log and the vdsm log from the second host? The SPM cannot be released because the master storage domain is not visible from the second host. Dafna
On 04/11/2014 02:38 PM, James James wrote:
I tried to follow Dafna's advice. I restarted my engine to clear the cache, but now I am facing a new problem. node1 is the SPM and I have this error message: Manual fence did not revoke the selected SPM (node1) since the master storage domain was not active or could not use another host for the fence operation.
2014-04-11 15:16 GMT+02:00 Dafna Ron d...@redhat.com:
I think there might be a chance it's related to issues we were seeing with the engine cache. James, would you mind testing that? A restart of the ovirt-engine process should clear the cache. Once you restart, log in to the webadmin, and if the host is still SPM and non-responsive, can you try "confirm host has been rebooted" again? If that still fails, can you please attach the engine log? Thanks. Dafna
On 04/11/2014 02:13 PM, Itamar Heim wrote:
On 04/11/2014 04:10 PM, James James wrote:
Nothing new. node1 can't release SPM ... :(
allon/federico - thoughts? confirm host shutdown should release SPM for a non-responsive node
2014-04-11 14:49 GMT+02:00 Dafna Ron d...@redhat.com:
"confirm host has been rebooted" should release the SPM :)
On 04/11/2014 01:23 PM, James James wrote:
2014-04-11 14:08 GMT+02:00 Dafna Ron d...@redhat.com:
James, please reply to the users list as well as to me so that other people can participate as well :)
Oops ... I will do that.
Did you try to press "confirm host has been rebooted" (right click)?
Yes, but same problem. node1 cannot be put in maintenance mode. node1 is SPM. I will make node1 release the SPM resource.
Re: [ovirt-users] Re-add a node
Hi, currently you need an additional operational host in the same cluster as the host that needs to be fenced. In other words: ovirt can't fence a host without another host in the same cluster. There's a BZ for getting this limitation out of the way: https://bugzilla.redhat.com/show_bug.cgi?id=1053434 There seems to be another BZ for getting this out of the way, but I don't know where. Eli Mesika wrote: there is already a RFE for letting the engine be the proxy.
On 11.04.2014 15:38, James James wrote:
I tried to follow Dafna's advice. I restarted my engine to clear the cache, but now I am facing a new problem. node1 is the SPM and I have this error message: Manual fence did not revoke the selected SPM (node1) since the master storage domain was not active or could not use another host for the fence operation.
-- Regards, Sven Kieske
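The limitation Sven describes amounts to "a fence needs a proxy: some other host that is up and in the same cluster". A minimal sketch of that selection follows; the dictionary fields and the function name are assumptions for the example, not the engine's data model.

```python
from typing import Dict, List, Optional


def pick_fence_proxy(hosts: List[Dict[str, str]], target_name: str) -> Optional[str]:
    """Return the name of a host that could act as fence proxy, or None.

    A candidate must be a different host, in the same cluster as the target,
    and operational ("up"). Field names are hypothetical.
    """
    target = next((h for h in hosts if h["name"] == target_name), None)
    if target is None:
        return None
    for host in hosts:
        if (
            host["name"] != target_name
            and host["cluster"] == target["cluster"]
            and host["status"] == "up"
        ):
            return host["name"]
    return None  # no proxy available, so the target cannot be fenced


single_host = [{"name": "node1", "cluster": "Default", "status": "non_responsive"}]
two_hosts = single_host + [{"name": "node2", "cluster": "Default", "status": "up"}]
```

With only node1 in the cluster there is no candidate, which is the case the referenced BZ wants to remove by letting the engine itself act as proxy.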
Re: [ovirt-users] off topic = newbie with bugzilla, gerrit and ovirt VDSM 4.14
On Fri, Apr 11, 2014 at 04:06:48PM +0300, Itamar Heim wrote:
On 04/10/2014 11:04 PM, Tamer Lima wrote:
hi, I have now ovirt 3.4 with vdsm 4.14
danken - the bug you referenced is fixed in 3.3, yet recurs in 3.4?
I have very little explanation. The bug was hidden by an ad-hoc patch, adding 4.13 to SupportedVDSMVersions. I hope that Yaniv, Yair or Eli have better knowledge about the real fix, of honoring vdsm's supportedENGINEs again.
Well, my question is not exactly how to solve a specific problem. In fact, what I want is to learn how to apply the corrections proposed on bugzilla, gerrit, etc. Or is this only for ovirt internal developers? thanks
we'll be happy if you try. this would be the starting point. if anything is not clear, ask, and we should probably add it there: http://www.ovirt.org/Develop
Re: [ovirt-users] vdsm dependency rsyslog?
----- Original Message -----
From: Sven Kieske s.kie...@mittwald.de
To: Itamar Heim ih...@redhat.com, Dan Kenigsberg dan...@redhat.com, ybron...@redhat.com
Cc: Users@ovirt.org List Users@ovirt.org, Alon Bar-Lev alo...@redhat.com
Sent: Friday, April 11, 2014 4:20:32 PM
Subject: Re: [ovirt-users] vdsm dependency rsyslog?

Well, as I already elaborated (with more detail in the BZ, though; Bug 1083100 - vdsm depends on rsyslog): vdsm crashes if /dev/log is not there.

It should not crash if /dev/log is not available. Having syslog or /dev/log should be optional.

The devs were a little inaccurate and added the dependency on rsyslog, but the real dependency is on /dev/log, which is also satisfied by e.g. syslog-ng.

I do not see that this is a standard, nor that rsyslog provides this.

So if someone wants to use syslog-ng on RHEL, maybe because it is already used in other infrastructure, they still have to install rsyslogd for no use?

The problem is that we are usually part software, part integration: we configure other components when we install ours, which creates tight coupling with what we can support. However, in this case I do not see that we try to configure rsyslog, so rsyslog should be completely optional. If we want an integration-type solution, rsyslog or any other logger should be installed and enabled using host-deploy.

On 11.04.2014 15:10, Itamar Heim wrote: Wouldn't that just be a dependency change for the package for the other distros? It's not like they will use the RHEL spec anyway?

-- Kind regards, Sven Kieske, Systemadministrator, Mittwald CM Service GmbH & Co. KG
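The point made in this thread, that a daemon should keep running when /dev/log is missing, can be illustrated with a small Python sketch. This is not vdsm's actual logging setup; `make_log_handler` is a hypothetical helper, and falling back to stderr is just one possible choice.

```python
import logging
import logging.handlers
import os
import sys


def make_log_handler(syslog_socket="/dev/log"):
    """Return a syslog handler if the socket exists, else a stderr handler.

    Hypothetical helper for illustration only, not taken from vdsm.
    """
    if os.path.exists(syslog_socket):
        try:
            # Any syslog daemon (rsyslog, syslog-ng, ...) may own this socket.
            return logging.handlers.SysLogHandler(address=syslog_socket)
        except OSError:
            # The path exists but is not a usable syslog socket.
            pass
    # No syslog socket available: keep running and log to stderr instead.
    return logging.StreamHandler(sys.stderr)


logger = logging.getLogger("example")
logger.addHandler(make_log_handler())
logger.warning("logging is up, with or without /dev/log")
```

With a check like this, syslog becomes an optional integration rather than a hard requirement, which matches the argument that the package dependency on rsyslog could be dropped.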
[ovirt-users] SPM error
I have an error trying to bring the master DC back online. After several reboots, no luck. I took the other cluster members offline to try to troubleshoot. The remaining host is constantly in contention with itself for SPM.

ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-40) [38d400ea] IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: SpmStart failed
Re: [ovirt-users] compatibility relationship between datacenter, ovirt and cluster
OK, Itamar, thanks. Now I understand how things work. But I have a question, given that my ovirt-engine is at version 3.4 while the DC and cluster are both configured at version 3.3:

- Does this mean that the 3.4 features are available, but will only work once I update the DC and cluster to version 3.4? (This is what I understand from Dafna Ron's response.) So, to really obtain the benefits of the new version (3.4), I need to run the commands (yum update ovirt-engine-setup, engine-setup) and also raise the compatibility version of the datacenter, cluster and host to 3.4.
- And what do I get if I only update ovirt-engine to version 3.4 and do not update the DC, cluster or hosts? I noticed changes in the oVirt web admin interface: improved icons and maybe new fields.

On Fri, Apr 11, 2014 at 9:56 AM, Itamar Heim ih...@redhat.com wrote: On 04/10/2014 10:50 PM, Tamer Lima wrote:

Hi, yesterday my oVirt was 3.3, and my datacenter and cluster (compatibility version) were aligned with oVirt 3.3. Today my oVirt is 3.4, and my datacenter and cluster (compatibility version) remain at 3.3 (with the option to change to 3.4 enabled). Browsing the oVirt admin page I see 2 occurrences of the oVirt version: datacenter tab = 3.3, cluster tab = 3.3. I would like to understand what all these versions mean, the same version for a lot of important things, and how my oVirt works/behaves using different versions.

All my doubts together: What does a datacenter at version 3.3 (or lower) mean when oVirt is 3.4? What does a cluster at version 3.3 mean when oVirt is 3.4? What does changing the compatibility version of the datacenter mean? What does changing the compatibility version of the cluster mean?

Gianluca replied with links for specific features. I'll reply on the general concept:
- A host can be upgraded to a specific 3.x version. It can be in any 3.y cluster which is <= 3.x. Once the host is upgraded, 3.x host-level features can be used.
- A cluster can be upgraded to 3.x once all active hosts in it are at least 3.x. Once the cluster is upgraded, 3.x cluster-level features can be used.
- A data center can be upgraded to 3.x once all clusters in it are at least 3.x. Once the DC is upgraded, 3.x DC-level features can be used.

I don't think we have it clearly documented per feature whether it is host level, cluster level or DC level. When in doubt, ask...
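The host / cluster / data center rules from this thread can be sketched as a few version comparisons. This is only an illustration and not oVirt engine code: the function names are hypothetical, versions are assumed to be plain "major.minor" strings, and the host rule is read as "cluster level at or below the host's version".

```python
from typing import Iterable, Tuple

Version = Tuple[int, int]  # e.g. (3, 4) for "3.4"


def parse_version(text: str) -> Version:
    """Turn "3.4" into (3, 4) so versions compare numerically."""
    major, minor = text.split(".")
    return int(major), int(minor)


def host_fits_cluster(host_version: str, cluster_level: str) -> bool:
    """A 3.x host can sit in any 3.y cluster whose level is at or below 3.x."""
    return parse_version(cluster_level) <= parse_version(host_version)


def max_cluster_level(active_host_versions: Iterable[str]) -> Version:
    """A cluster can go no higher than its lowest active host."""
    return min(parse_version(v) for v in active_host_versions)


def max_dc_level(cluster_levels: Iterable[str]) -> Version:
    """A data center can go no higher than its lowest cluster."""
    return min(parse_version(v) for v in cluster_levels)


# Engine at 3.4, one host still at 3.3: the cluster has to stay at 3.3.
print(max_cluster_level(["3.4", "3.3"]))
# A 3.4 host is fine in a 3.3 cluster, but not the other way round.
print(host_fits_cluster("3.4", "3.3"), host_fits_cluster("3.3", "3.4"))
```

Read this way, upgrading only the engine changes nothing at the cluster or DC level until every active host, then every cluster, has reached the new version.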
Re: [ovirt-users] SPM error
Anyone?

----- Original Message -----
From: Maurice James mja...@media-node.com
To: users@ovirt.org
Sent: Friday, April 11, 2014 2:05:17 PM
Subject: [ovirt-users] SPM error

I have an error trying to bring the master DC back online. After several reboots, no luck. I took the other cluster members offline to try to troubleshoot. The remaining host is constantly in contention with itself for SPM.

ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-40) [38d400ea] IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: SpmStart failed
Re: [ovirt-users] SPM error
On 4/11/2014 2:05 PM, Maurice James wrote:

I have an error trying to bring the master DC back online. After several reboots, no luck. I took the other cluster members offline to try to troubleshoot. The remaining host is constantly in contention with itself for SPM.

ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-40) [38d400ea] IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: SpmStart failed

I'm no expert, but the last time I beat my head on that rock, something was wrong with my sanlock storage. YMMV

Ted Miller
Elkhart, IN, USA
Re: [ovirt-users] SPM error
How did you fix it?

-------- Original message --------
From: Ted Miller tmil...@hcjb.org
Date: 04/11/2014 6:00 PM (GMT-05:00)
To: users@ovirt.org
Subject: Re: [ovirt-users] SPM error

On 4/11/2014 2:05 PM, Maurice James wrote:

I have an error trying to bring the master DC back online. After several reboots, no luck. I took the other cluster members offline to try to troubleshoot. The remaining host is constantly in contention with itself for SPM.

ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-40) [38d400ea] IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: SpmStart failed

I'm no expert, but the last time I beat my head on that rock, something was wrong with my sanlock storage. YMMV

Ted Miller
Elkhart, IN, USA
Re: [ovirt-users] SPM error
Nooo.

-------- Original message --------
From: Ted Miller tmil...@hcjb.org
Date: 04/11/2014 7:08 PM (GMT-05:00)
To: Maurice James mja...@media-node.com
Subject: Re: [ovirt-users] SPM error

I didn't, really. I did something wrong along the way, and ended up having to rebuild the engine and hosts. (My problems were due to a glusterfs split-brain.)

Ted Miller

On 4/11/2014 6:03 PM, Maurice James wrote:

How did you fix it?

-------- Original message --------
From: Ted Miller tmil...@hcjb.org
Date: 04/11/2014 6:00 PM (GMT-05:00)
To: users@ovirt.org
Subject: Re: [ovirt-users] SPM error

On 4/11/2014 2:05 PM, Maurice James wrote:

I have an error trying to bring the master DC back online. After several reboots, no luck. I took the other cluster members offline to try to troubleshoot. The remaining host is constantly in contention with itself for SPM.

ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-40) [38d400ea] IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: SpmStart failed

I'm no expert, but the last time I beat my head on that rock, something was wrong with my sanlock storage. YMMV

Ted Miller
Elkhart, IN, USA

-- Ted Miller, Design Engineer, HCJB Global Technology Center, a ministry of Reach Beyond